We all know the saying "correlation doesn't imply causation," but I keep seeing this mistake everywhere - in news articles, social media, even some academic papers.
I'm collecting correlation vs causation examples to use in my teaching. The classic ones are ice cream sales and drowning deaths, or stork populations and birth rates.
But what are some more modern or surprising examples you've come across? Especially ones where the correlation seems really convincing but turns out to be completely spurious?
One of my favorite modern correlation vs causation examples is the relationship between ice cream sales and shark attacks. They're strongly correlated because both increase in summer.
But obviously buying ice cream doesn't cause shark attacks. The real cause is more people swimming in the ocean when it's warm.
I see this kind of thing in business all the time. Like "companies with more Twitter followers are more profitable." Maybe, or maybe both are caused by being well-known brands. The correlation doesn't tell us which way the causation goes, or if there's a third factor causing both.
There was a study showing a correlation between countries that eat more chocolate and have more Nobel laureates. The media had a field day with "chocolate makes you smarter!"
But obviously, wealthier countries can afford more chocolate and also invest more in education and research. The correlation is probably spurious.
What's dangerous is when these correlations get reported as causation in news articles. People start changing their behavior based on bad science. I've seen this with all kinds of health studies - "coffee causes cancer" one year, "coffee prevents cancer" the next.
In education research, there's a famous correlation: kids with bigger feet read better. Obviously foot size doesn't cause reading ability.
The real explanation is age. Older kids have bigger feet and also read better. Age is the confounding variable.
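If you want something students can run, here's a rough sketch of the foot size example. All the numbers (age range, slopes, noise levels) are made up for illustration, not taken from any real study. It generates foot size and reading score so that both depend only on age, then compares the raw correlation with the correlation left over after age is regressed out of both.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def residuals(ys, xs):
    """What is left of ys after a least-squares straight-line fit on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical data: every coefficient below is an assumption.
ages = [rng.uniform(5, 12) for _ in range(2000)]       # years
foot = [12 + 1.2 * a + rng.gauss(0, 1) for a in ages]  # cm, depends on age only
reading = [10 * a + rng.gauss(0, 8) for a in ages]     # score, depends on age only

raw = pearson(foot, reading)
adjusted = pearson(residuals(foot, ages), residuals(reading, ages))

print(f"raw correlation:          {raw:.2f}")
print(f"after adjusting for age:  {adjusted:.2f}")
```

The raw correlation comes out strongly positive, while the age-adjusted one sits near zero, which is the pattern you'd expect when age is doing all the work. In real data you rarely get to measure every confounder this cleanly, so treat it as a teaching toy.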
This is why randomized controlled trials are so important. Because treatments are assigned at random, confounders end up balanced across the groups on average, so a difference in outcomes can be attributed to the treatment. With observational data alone, you mostly get correlations, and those can be misleading due to confounding variables.
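To make the random assignment point concrete, here's a small simulation sketch. Everything in it is hypothetical: a hidden "health" variable drives the outcome, and the treatment has zero true effect by construction. In the observational version, healthier people are more likely to take the treatment; in the randomized version, a coin flip decides.

```python
import random
from statistics import mean

def mean_diff(treated, outcomes):
    """Mean outcome of the treated group minus mean outcome of the untreated group."""
    t = [y for flag, y in zip(treated, outcomes) if flag]
    c = [y for flag, y in zip(treated, outcomes) if not flag]
    return mean(t) - mean(c)

rng = random.Random(1)  # fixed seed so the run is repeatable
n = 20000

# Hidden confounder (made-up "health" score).
health = [rng.gauss(0, 1) for _ in range(n)]

# The outcome depends on health only: the treatment's true effect is zero.
outcome = [2.0 * h + rng.gauss(0, 1) for h in health]

# Observational setting: healthier people tend to choose the treatment.
observed = [h + rng.gauss(0, 1) > 0 for h in health]

# Randomized setting: a coin flip, unrelated to health.
randomized = [rng.random() < 0.5 for _ in range(n)]

obs_diff = mean_diff(observed, outcome)
rct_diff = mean_diff(randomized, outcome)

print(f"observational difference: {obs_diff:.2f}")
print(f"randomized difference:    {rct_diff:.2f}")
```

The observational comparison shows a large "effect" that is entirely due to health, while the randomized comparison lands close to zero, the true effect. The setup is deliberately extreme so the contrast is obvious in class.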
The problem is that RCTs are expensive and sometimes unethical, so we often have to rely on observational studies. But we need to be much more careful about interpreting them.