I keep hearing about Simpson's paradox in statistics but I'm having trouble wrapping my head around it. The basic idea seems to be that a trend appears in different groups of data but disappears or reverses when these groups are combined.
I've read the textbook definitions but they always use abstract examples. Has anyone encountered a real-world situation where Simpson's paradox actually happened? Something from business, medicine, or social sciences would be really helpful.
What is it about this paradox that always trips people up?
Oh man, I actually saw Simpson's paradox in action at my last job. We were analyzing website conversion rates for two different marketing campaigns. Campaign A had an 8% conversion rate overall, Campaign B had 10%. So obviously B was better, right?
But when we broke it down by traffic source (paid search vs. organic), something weird happened. Campaign A had 5% conversion on paid and 15% on organic. Campaign B had 4% on paid and 12% on organic. So Campaign B was actually worse in both segments, but because it got way more organic traffic (which converts better), it looked better overall.
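If it helps to see the arithmetic, here is a rough sketch. The segment conversion rates are the ones quoted above; the visit counts are made up (I don't have the real traffic numbers), chosen only to show how a different paid/organic mix can put B ahead overall even though A wins in each segment.

```python
# Segment conversion rates as quoted in the post.
rates = {
    "A": {"paid": 0.05, "organic": 0.15},
    "B": {"paid": 0.04, "organic": 0.12},
}

# HYPOTHETICAL visit counts: A is mostly paid traffic, B is mostly organic.
visits = {
    "A": {"paid": 700, "organic": 300},
    "B": {"paid": 250, "organic": 750},
}

def overall_rate(campaign):
    """Visit-weighted average of the segment conversion rates."""
    conversions = sum(
        rates[campaign][seg] * visits[campaign][seg] for seg in visits[campaign]
    )
    return conversions / sum(visits[campaign].values())

for campaign in ("A", "B"):
    print(campaign, f"{overall_rate(campaign):.1%}")
# A 8.0%
# B 10.0%
```

The overall rate is just a weighted average of the segment rates, so whichever campaign has more of its traffic in the high-converting segment gets pulled upward, regardless of how it performs within each segment.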
My boss almost made the wrong decision based on the aggregated data. This is why stratified analysis matters so much.
Medical studies are full of Simpson's paradox examples. There was this famous case with kidney stone treatments. When you looked at all patients combined, Treatment A seemed better than Treatment B. But when you separated by stone size (small vs large), Treatment B was actually better for both groups!
The paradox happened because Treatment A was used more often for small stones (which are easier to treat) and Treatment B for large stones (harder to treat). So the success rates were confounded by stone size.
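For anyone who wants to check the numbers, below is a small sketch using the counts usually quoted for this study (Charig et al., 1986). I'm quoting them from memory, so treat them as an assumption and verify against the paper before reusing them. I've used the procedure names instead of A/B; in terms of the description above, percutaneous nephrolithotomy plays the role of Treatment A (mostly small stones, looks better overall) and open surgery the role of Treatment B (better within each stone size).

```python
# (successes, patients) per treatment and stone size.
# ASSUMPTION: counts as commonly cited from Charig et al. (1986), from memory.
data = {
    "open surgery": {"small": (81, 87), "large": (192, 263)},
    "percutaneous nephrolithotomy": {"small": (234, 270), "large": (55, 80)},
}

def success_rate(cells):
    """Pooled success rate over a collection of (successes, patients) pairs."""
    successes = sum(s for s, _ in cells)
    patients = sum(n for _, n in cells)
    return successes / patients

for treatment, by_size in data.items():
    small = success_rate([by_size["small"]])
    large = success_rate([by_size["large"]])
    overall = success_rate(by_size.values())
    print(f"{treatment}: small {small:.0%}, large {large:.0%}, overall {overall:.0%}")
# open surgery: small 93%, large 73%, overall 78%
# percutaneous nephrolithotomy: small 87%, large 69%, overall 83%
```

Note how lopsided the group sizes are: 270 of the 350 percutaneous cases were small stones, versus 87 of the 350 open surgery cases. That imbalance is what drives the reversal.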
This is one of those paradoxes that really shows why you can't just trust aggregate numbers. You need to understand the underlying structure of the data.
I work in education and we see this all the time with school performance data. A district might show improving test scores overall, but when you break it down by demographic groups, every single group is actually declining.
The improvement comes from changing demographics - more students from higher-performing backgrounds entering the district. It creates this illusion of improvement that disappears when you do proper disaggregated analysis.
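One way to make that concrete is to split the change in the overall average into a "within-group" part and a "composition" part. The sketch below uses entirely invented numbers (two groups, two years, made-up shares and mean scores), so it only illustrates the mechanism, not any real district.

```python
# HYPOTHETICAL data: enrollment share and mean test score per group and year.
year1 = {"group_x": {"share": 0.4, "score": 80.0},
         "group_y": {"share": 0.6, "score": 60.0}}
year2 = {"group_x": {"share": 0.7, "score": 78.0},
         "group_y": {"share": 0.3, "score": 58.0}}

def overall(year):
    """District-wide mean: share-weighted average of the group means."""
    return sum(g["share"] * g["score"] for g in year.values())

# Effect of score changes inside each group, holding year-1 shares fixed.
within = sum(
    year1[k]["share"] * (year2[k]["score"] - year1[k]["score"]) for k in year1
)
# Effect of the enrollment mix shifting, evaluated at year-2 scores.
composition = sum(
    (year2[k]["share"] - year1[k]["share"]) * year2[k]["score"] for k in year1
)

print(f"overall change:     {overall(year2) - overall(year1):+.1f}")
print(f"within-group part:  {within:+.1f}")
print(f"composition part:   {composition:+.1f}")
```

With these made-up inputs every group drops by 2 points, yet the district average rises by 4, because the shift toward the higher-scoring group contributes +6. The two parts always add up to the overall change, which makes this a handy sanity check on any aggregate trend.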
What's scary is how many policy decisions get made based on these aggregate numbers without anyone checking for Simpson's paradox.