6 Ways Statistics Can Fool You


Statistics are powerful tools. They are based on actual data, so they should be reliable, right? Unfortunately, despite being among the most persuasive tools in research, statistics can also be misused and steer your decisions the wrong way. Here are six ways statistics can fool you.

1. Misleading graphs
2. Correlation without causation
3. Simpson's paradox
4. Flawed data collection
5. Small sample sizes
6. P-hacking

1. Misleading Graphs

Look at the two graphs below:

Those two graphs look vastly different at first glance: one shows a trend growing slowly, the other quickly. Yet they plot the same data. The second graph simply truncates the vertical axis, showing only the upper part of the first graph, so the differences look exaggerated. If you don't read the axes carefully, you may not notice that the trend is rising more slowly than it appears.
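A quick back-of-the-envelope calculation shows how axis truncation exaggerates a difference. The values and the cropped baseline below are made-up numbers for illustration:

```python
# Made-up values for illustration: the same pair of numbers shown on a
# full axis versus an axis cropped at an assumed baseline of 95.
old, new = 100.0, 110.0
baseline = 95.0

full_ratio = new / old                               # what the data says: 1.1x
cropped_ratio = (new - baseline) / (old - baseline)  # what the cropped chart shows: 3x

print(round(full_ratio, 2), round(cropped_ratio, 2))
```

A 10% increase looks like a tripling once the chart starts at 95 instead of 0.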

2. Correlation Without Causation

No matter how strong a correlation is, it doesn't mean one factor causes the other. There are three main possible explanations:

• One factor causes the other
• Other factors affect both of the factors
• The two factors are unrelated, and the correlation is pure coincidence

The first explanation is the obvious one. For instance, failing to update your software directly raises your chance of falling victim to a cyberattack, at least when the attack doesn't involve actively scanning for vulnerable computers. In that case, you can be fairly sure the cause-and-effect relationship flows in only one direction.

The second explanation is that some other factor influences both outcomes at once. Returning to the cyberattack example, suppose this time the attacker scans the network for vulnerable machines. The scanning now sits between the two factors: the attack executes once a vulnerable system is found, which increases the chance that any outdated system gets compromised. The relationship no longer flows in a single direction.

When you read a report saying one factor is correlated with another, also consider the possibility that it is pure luck. In any finite dataset, two variables are almost never exactly uncorrelated, so some correlation, however weak, will always appear by chance.

Therefore, whenever you encounter a correlation in a report, think carefully about the two factors. Are they plausibly related? If they seem unrelated or only weakly related, don't let the correlation sway your decisions too much.

3. Simpson's Paradox

A significant problem in statistics is Simpson's paradox, where opposite trends can appear when you group the same data in different ways.

WARNING: THE DATA BELOW IS FICTIONAL. DO NOT USE IT FOR ANY PURPOSE OTHER THAN ILLUSTRATING SIMPSON'S PARADOX.

For instance, suppose your team wants to compare how effectively Virus A and Virus B survive in backups. To find out, you deliberately infect 2,000 computers that contain no personal information and have no Internet access: 1,000 with Virus A and 1,000 with Virus B.

You then divide the computers into two groups: one with Operating System A installed and one with Operating System B. However, something goes wrong, and the group sizes end up unequal.

When the results come in, you find that Virus A survives in backups 36% of the time, while Virus B survives only 32% of the time. Virus A is more powerful, isn't it? You might say so, until you break the results down by operating system and discover that, because of the unequal group sizes, Virus B may actually be the stronger one:

• If Virus A infects Operating System A, it will survive through backups 50% of the time.
• If Virus B infects Operating System A, it will survive through backups 52% of the time.
• If Virus A infects Operating System B, it will survive through backups about 26.67% of the time.
• If Virus B infects Operating System B, it will survive through backups about 29.78% of the time.

The flaw in this fictional study may seem extreme, but Simpson's paradox, in which grouping data in different ways can reverse a trend, occurs in real life. When do unequal sample sizes arise? Easily, whenever research relies on surveys rather than controlled tests.
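The fictional numbers above can be checked directly. The per-group counts below (400/600 machines for Virus A, 100/900 for Virus B) are assumptions chosen to reproduce the article's percentages:

```python
# Per-OS counts (survived_in_backup, infected), assumed so that the rates
# match the article's fictional percentages.
results = {
    "Virus A": {"OS A": (200, 400), "OS B": (160, 600)},
    "Virus B": {"OS A": (52, 100), "OS B": (268, 900)},
}

overall = {}
for virus, by_os in results.items():
    survived = sum(s for s, _ in by_os.values())
    infected = sum(n for _, n in by_os.values())
    overall[virus] = survived / infected
    print(f"{virus}: {100 * overall[virus]:.0f}% overall")
    for os_name, (s, n) in by_os.items():
        print(f"  {os_name}: {100 * s / n:.2f}%")

# Virus B wins inside every OS group, yet loses on the combined numbers.
```

Because most Virus A machines ran the forgiving OS A while most Virus B machines ran the hostile OS B, the aggregate comparison flips the per-group result.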

4. Flawed Data Collection

Sometimes research produces inaccurate results because the data is collected in a flawed or biased way. For instance, if you ask a question with an apparent "default" answer (one option looks more socially acceptable than the others), you bias the results, because respondents will lean toward that option.

Furthermore, if you come across a report that says something like "We surveyed 7,500 men and 1,500 women," be suspicious. That split is fine if the target population really does contain far more men than women, but if the population is roughly balanced, you probably shouldn't trust the study: the sample is biased, and the researchers may be trying to amplify or even invert the results.
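Here is a small sketch of how such an imbalance skews a naive average. The approval rates per group are assumed numbers; reweighting to the true 50/50 population mix removes the skew:

```python
# Assumed true approval rates per group; the population is 50/50, but the
# sample is skewed 7,500 men to 1,500 women, as in the suspicious survey.
approval = {"men": 0.60, "women": 0.40}
sample = {"men": 7500, "women": 1500}

naive = sum(approval[g] * n for g, n in sample.items()) / sum(sample.values())
weighted = sum(approval[g] * 0.5 for g in approval)  # reweight to 50/50 population

print(round(naive, 3), round(weighted, 3))  # the naive average overstates approval
```

The unweighted average lands near 57% even though the balanced population sits at exactly 50%.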

Moreover, we've all learned that keeping an experiment fair is essential. All initial conditions except the one being tested must be held as constant as possible; otherwise the result will be biased and may differ from what a fair experiment would show.

5. Small Sample Sizes

Another data-collection mistake is using a sample that is too small. What if you surveyed only 6 people to decide whether customers like a product? You would get no meaningful result, because the error is enormous: one survey might find that 5 out of 6 people like the product, while a repeat survey finds only 1 out of 6.

Never trust a study with a tiny sample when a much larger one was feasible. For instance, if the target population has only 3,000 people, why not run a census covering every individual in the group? Be suspicious of studies that report only "7 out of 10" or similar phrases, because such a phrase tells you nothing about the actual sample size. It could be 7 out of 10, 70 out of 100, 700 out of 1,000, or 4,900 out of 7,000.
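A simulation makes the point concrete. Below, repeated polls are drawn from a population where exactly half the people like the product; the tiny polls swing wildly while the larger ones barely move:

```python
import random

random.seed(0)

def poll(n):
    # Simulated survey of n people drawn from a population where exactly
    # half like the product.
    return sum(random.random() < 0.5 for _ in range(n)) / n

small_polls = [poll(6) for _ in range(1000)]
large_polls = [poll(600) for _ in range(1000)]

print("spread across n=6 polls:  ", round(max(small_polls) - min(small_polls), 3))
print("spread across n=600 polls:", round(max(large_polls) - min(large_polls), 3))
```

With n = 6, individual polls range all the way from nearly 0% to nearly 100% approval; with n = 600, every poll stays close to the true 50%.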

6. P-hacking

The p-value is the probability of observing data at least as extreme as what was actually seen, assuming there is no real relationship between the variables studied (the null hypothesis). Conventionally, a p-value of 0.05 or below is treated as statistically significant and reported as a positive result.

Researchers may strongly prefer positive results over negative ones because positive results are more likely to be published and go viral, so they may manipulate their analysis until the p-value drops below 0.05. This practice is called p-hacking, and there is more than one way to do it.

One way is to split the data into many groups and calculate a p-value for each. Although any single unrelated comparison has only a 5% chance of reaching significance, with enough groups the odds that at least one does become high. Split the data into just 36 groups, and there is about an 84% chance (1 − 0.95³⁶ ≈ 0.84) that at least one group reaches statistical significance, regardless of whether the variables are related.
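The 84% figure is easy to verify, both analytically and by simulating 36 unrelated "tests" that each come out significant 5% of the time by pure chance:

```python
import random

random.seed(1)

# Each of 36 groups is compared against an unrelated variable; any single
# test is "significant" (p <= 0.05) purely by chance 5% of the time.
analytic = 1 - 0.95 ** 36  # chance that at least one of 36 tests fires

trials = 10_000
hits = sum(
    any(random.random() < 0.05 for _ in range(36))
    for _ in range(trials)
)

print(round(analytic, 3), round(hits / trials, 3))
```

Both the closed-form value and the simulation land near 0.84: with 36 shots at a 5% target, "at least one hit" is the expected outcome, not a surprise.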

Furthermore, researchers can monitor the p-value continuously as data comes in and stop collecting the moment statistical significance is reached. That way, they prevent the p-value from drifting back above the threshold and conclude that the variables are related.
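This "optional stopping" can also be simulated. In the sketch below (a toy fair-coin test using a normal approximation), the null hypothesis is true by construction, yet peeking at the p-value after every batch pushes the false-positive rate well above the nominal 5%:

```python
import math
import random

random.seed(2)

def peeking_test(batches, batch_size):
    # Flip a fair coin; after every batch, compute a two-sided p-value for
    # "the coin is biased" and stop as soon as it looks significant. The
    # coin IS fair, so returning True is always a false positive.
    heads = total = 0
    for _ in range(batches):
        heads += sum(random.random() < 0.5 for _ in range(batch_size))
        total += batch_size
        z = (heads - total / 2) / math.sqrt(total / 4)
        p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
        if p < 0.05:
            return True
    return False

trials = 2_000
rate = sum(peeking_test(20, 10) for _ in range(trials)) / trials
print(round(rate, 3))  # well above the nominal 0.05
```

Checking once at the end would produce false positives about 5% of the time; checking twenty times along the way multiplies that rate severalfold.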

Conclusion

Statistics are persuasive, but they aren't foolproof. So don't let them fool you: examine how a report was produced rather than just its headline results. It takes more time, but interpreting statistics correctly is as essential as knowing how to spot a scam.