A Texan fires randomly at a barn, then paints a bullseye around the tightest cluster of holes and claims to be a sharpshooter. This is how we fool ourselves into seeing patterns where none exist.
One of these patterns is truly random. The other has a subtle structure. Can you tell which is which? (Hint: It's harder than you think!)
Journalists often report "cancer clusters"—areas where cancer rates seem unusually high. But random data ALWAYS has clusters. Press the button to generate a new random distribution of cases.
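What the button does can be sketched in a few lines of Python. This is a minimal illustration, not the page's actual code: the function name `scatter_cases`, the 10 x 10 map, and the 200 cases are all assumptions chosen for the example. Every case lands in a uniformly random cell, so any pile-up you see is pure chance.

```python
import random

def scatter_cases(n_cases=200, grid=10, seed=0):
    """Drop n_cases uniformly at random onto a grid x grid map; return counts per cell."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = [[0] * grid for _ in range(grid)]
    for _ in range(n_cases):
        counts[rng.randrange(grid)][rng.randrange(grid)] += 1
    return counts

counts = scatter_cases()

# Print the map: each number is the case count in one cell.
for row in counts:
    print(" ".join(f"{c:2d}" for c in row))

flat = [c for row in counts for c in row]
print("average per cell:", sum(flat) / len(flat))
print("busiest cell:", max(flat))
print("cells with no cases:", flat.count(0))
```

Change the seed to "press the button" again. The busiest cell will usually sit well above the average, even though no cell is any riskier than another.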
Nostradamus's vague quatrains, written in 1555, are "matched" to modern events AFTER they happen. With enough text and liberal interpretation, anything fits.
Technical analysts find "head and shoulders" and "cup and handle" patterns even in random price movements. Evidence that these patterns predict better than chance is weak.
Test enough variables and something will be "statistically significant" (p < 0.05) by luck alone. If you run 20 independent tests where no real effect exists, expect 1 false positive on average, and the chance of at least one is about 64%!
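The arithmetic behind the 20-tests warning can be checked directly. The sketch below assumes the tests are independent and that no real effect exists in any of them, so each p-value is uniform between 0 and 1. The names `familywise_error_rate` and `simulate` are made up for this example.

```python
import random

ALPHA = 0.05
N_TESTS = 20

def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across independent true-null tests."""
    return 1 - (1 - alpha) ** n_tests

def simulate(alpha, n_tests, n_experiments=10_000, seed=1):
    """Estimate the same chance by brute force."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_experiments):
        # Assumption: under a true null, each p-value is uniform on [0, 1].
        if any(rng.random() < alpha for _ in range(n_tests)):
            hits += 1
    return hits / n_experiments

expected_false_positives = ALPHA * N_TESTS
fwer = familywise_error_rate(ALPHA, N_TESTS)
simulated = simulate(ALPHA, N_TESTS)

print(f"expected false positives: {expected_false_positives:.2f}")
print(f"chance of at least one (formula): {fwer:.3f}")
print(f"chance of at least one (simulated): {simulated:.3f}")
```

The formula and the simulation should agree closely: with 20 shots at the barn, hitting "significance" somewhere is more likely than not.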
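Basketball players seem to go on "hot streaks," but the classic 1985 study by Gilovich, Vallone, and Tversky found that each shot is largely independent of the last. Later reanalyses point to a small real effect, far weaker than fans believe. We see runs in randomness and assume meaning.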
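To see how streaky pure chance looks, here is a small sketch of an imaginary shooter whose every shot is an independent coin flip. The 50% make rate, the 100 shots per "season," and the function names are assumptions for illustration only.

```python
import random

def longest_streak(shots):
    """Length of the longest run of consecutive makes (True values)."""
    best = current = 0
    for made in shots:
        current = current + 1 if made else 0
        best = max(best, current)
    return best

def average_longest_streak(n_shots=100, p_make=0.5, n_seasons=2000, seed=2):
    """Average longest streak over many simulated seasons of independent shots."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_seasons):
        shots = [rng.random() < p_make for _ in range(n_shots)]
        total += longest_streak(shots)
    return total / n_seasons

avg = average_longest_streak()
print(f"average longest streak in 100 independent shots: {avg:.1f}")
```

No shot here knows anything about the one before it, yet long runs of makes turn up in almost every season.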
Decide WHAT you're looking for BEFORE examining the data. Paint the target before you shoot!
If testing many hypotheses, adjust your significance threshold. The Bonferroni correction divides α by the number of tests: with α = 0.05 and 20 tests, each result must reach p < 0.0025.
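A minimal sketch of the Bonferroni correction follows. The p-values are invented for the example, and the helper name `bonferroni` is hypothetical; a result counts as significant only if its p-value is at or below α divided by the number of tests.

```python
def bonferroni(p_values, alpha=0.05):
    """Return the corrected threshold and a significant/not flag for each p-value."""
    if not p_values:
        raise ValueError("need at least one p-value")
    threshold = alpha / len(p_values)
    return threshold, [p <= threshold for p in p_values]

# Made-up p-values from five hypothetical tests.
p_values = [0.001, 0.02, 0.04, 0.30, 0.75]
threshold, significant = bonferroni(p_values)

print(f"corrected threshold: {threshold:.3f}")
for p, keep in zip(p_values, significant):
    print(f"p = {p:.3f} -> {'significant' if keep else 'not significant'}")
```

Three of these five p-values fall below 0.05, but only one survives the correction. Bonferroni is deliberately strict: it trades some missed real effects for far fewer false alarms.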
A pattern found once might be chance. Confirm it in new, independent data.
How often would this pattern appear by chance? Random data IS clustered—that's expected!
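One way to answer that question is to let the computer fire at the barn thousands of times. The sketch below assumes a map of 100 equal cells holding 200 cases, so the average is 2 per cell, and asks how often pure chance produces a cell with at least 6 cases (triple the average). All numbers and names here are illustrative assumptions.

```python
import random

def busiest_cell(n_cases, n_cells, rng):
    """Scatter cases uniformly at random and return the largest count in any cell."""
    counts = [0] * n_cells
    for _ in range(n_cases):
        counts[rng.randrange(n_cells)] += 1
    return max(counts)

def chance_of_cluster(observed_max, n_cases=200, n_cells=100, n_maps=2000, seed=3):
    """Share of random maps whose busiest cell is at least as extreme as observed."""
    rng = random.Random(seed)
    hits = sum(
        busiest_cell(n_cases, n_cells, rng) >= observed_max
        for _ in range(n_maps)
    )
    return hits / n_maps

share = chance_of_cluster(observed_max=6)
print(f"random maps with a cell at triple the average or more: {share:.0%}")
```

If most random maps contain a "cluster" like the one you found, the cluster alone is not evidence of anything.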