Hi,
Between 2000 and 2012, the correlation coefficient between the UK divorce rate and the number of movies released by Disney was 92.5%.
Fitting a linear regression model to the data, we can infer that each additional movie released by Disney is associated with an average increase of around 1,422 divorces.
And if we wanted to, we could use that relationship to create a "story".
For example, here's one created by Claude.AI:
"Disney movie releases may increase financial stress on families, leading to marital tension and higher divorce rates as couples struggle with the costs of theatre trips and merchandise."
This explanation sounds quite plausible - except for the small detail that it's almost certainly not true. More likely, the relationship is spurious and no causality exists at all. But the result is data-driven.
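To see how easily this can happen, here is a minimal sketch in Python. It does not use the actual divorce or Disney figures. The two series below are made up, generated independently of each other, and share nothing except an upward trend over 13 "years". The function names, drift values and noise levels are all assumptions chosen purely for illustration.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def ols_slope(xs, ys):
    """Least-squares slope from regressing ys on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    return cov / var_x


def trending_series(rng, n, drift, noise_sd):
    """Hypothetical series: a steady trend plus independent random noise."""
    return [drift * t + rng.gauss(0, noise_sd) for t in range(n)]


rng = random.Random(0)  # fixed seed so the run is repeatable
n_years = 13            # e.g. 2000 to 2012 inclusive

# Synthetic data only: neither series has any causal link to the other.
series_a = trending_series(rng, n_years, drift=1.0, noise_sd=0.5)
series_b = trending_series(rng, n_years, drift=2.0, noise_sd=0.5)

r = pearson_r(series_a, series_b)
slope = ols_slope(series_a, series_b)
print(f"correlation: {r:.3f}, regression slope: {slope:.3f}")
```

Both numbers come out looking impressive, yet by construction neither series influences the other. The shared trend over time does all the work.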
Here's the thing...
Organisations love to think of themselves as being data-driven, but many don't understand what that really means.
It's easy to construct a story around a set of carefully selected data points to support your agenda. But this is no different from decision-making based on your gut.
For data-driven decisions to be credible, they need to be based on experimental findings that objectively show cause and effect.
Without experimental proof, you're no longer doing data science - you're doing data pseudoscience.
Talk again soon,
Dr Genevieve Hayes.