Everything depends on how the test is set up, which malware samples are used, etc.
I've seen quite a few discussions about the "real world" applicability of various tests. Of course, most of the time, I just do my best to follow the conversation.
Short version, I pay attention to the various tests done by AV Comparatives and one or two others. Main thing is whether the app is consistently among the top 4 or 5, as well as its number of false positives.
Then it comes down to questions of how well one of them fits my needs and style.
Of course, then there is layering in other defenses as well.