Hi,
There are four key assumptions underpinning a linear regression model:
- Linearity;
- Normality;
- Independence; and
- Homoscedasticity.
Yet, these are just the explicit assumptions. Anytime you fit a linear regression model, you also make certain implicit assumptions - possibly without even realising it.
For example, you are implicitly assuming:
- All relevant features are included in the model;
- The features are measured without error;
- Any data used to fit the model is representative of what you are trying to predict; and
- The relationship between the features and the response is correctly specified.
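To make one of these concrete, here is a minimal sketch of a check on the third assumption, that your fitting data is representative. It only asks what share of the rows you want to predict fall outside the range each feature covered in the fitting data. The function name and the "age" and "income" features are made up for illustration, and a range check is just one rough proxy for representativeness, not a full test of it.

```python
def share_outside_training_range(train_rows, new_rows):
    """For each feature, return the share of new rows outside the training min/max."""
    report = {}
    for name in train_rows[0]:
        lowest = min(row[name] for row in train_rows)
        highest = max(row[name] for row in train_rows)
        outside = sum(1 for row in new_rows if not lowest <= row[name] <= highest)
        report[name] = outside / len(new_rows)
    return report


# Hypothetical data: two illustrative features, not taken from any real dataset.
train = [{"age": 20, "income": 30}, {"age": 40, "income": 50}]
new = [{"age": 25, "income": 60}, {"age": 45, "income": 40}]

report = share_outside_training_range(train, new)
print(report)
```

A large share for any feature suggests the model is being asked to extrapolate beyond anything it saw when it was fitted.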
It's common practice when fitting a linear regression to check the validity of the explicit assumptions. But how often have you stopped to check your implicit assumptions also hold true?
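For reference, the explicit checks usually come down to looking at the residuals. The sketch below is one rough way to do that for a single-feature model using only Python's standard library. The function names, the summary statistics chosen and the simulated data are my own assumptions for illustration, and in practice you would more likely use residual plots or a statistics package.

```python
import random
import statistics


def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    var_a = sum((ai - ma) ** 2 for ai in a)
    var_b = sum((bi - mb) ** 2 for bi in b)
    return cov / (var_a * var_b) ** 0.5


def residual_checks(x, y):
    """Fit y = intercept + slope * x by least squares and summarise the residuals."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    fitted = [intercept + slope * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    n = len(resid)
    sum_sq = sum(r * r for r in resid)
    sd = (sum_sq / n) ** 0.5
    return {
        "slope": slope,
        "intercept": intercept,
        # Linearity: residuals should not track the squared (centred) feature.
        "curvature": pearson(resid, [(xi - mx) ** 2 for xi in x]),
        # Normality: sample skewness of the residuals, near 0 if symmetric.
        "skewness": sum((r / sd) ** 3 for r in resid) / n,
        # Independence: Durbin-Watson statistic, near 2 when successive
        # residuals are uncorrelated (only meaningful if rows are ordered).
        "durbin_watson": sum((resid[i] - resid[i - 1]) ** 2 for i in range(1, n)) / sum_sq,
        # Homoscedasticity: residual size should not trend with the fitted value.
        "spread_trend": pearson([abs(r) for r in resid], fitted),
    }


# Simulated data (an assumption for this example): a true straight line plus noise.
random.seed(0)
x = [i / 10 for i in range(200)]
y = [2 + 3 * xi + random.gauss(0, 1) for xi in x]

checks = residual_checks(x, y)
for name, value in checks.items():
    print(f"{name}: {value:.3f}")
```

None of these numbers says anything about the implicit assumptions, which is why those need a separate look.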
Here's the thing...
It's not just models we make assumptions about.
Next time things aren't turning out the way you thought they would - in data science or in life - it might be a sign you need to stop and check your assumptions.
Talk again soon,
Dr Genevieve Hayes.