Why does multicollinearity result in poor estimates of coefficients in linear regression?

In matrix form, the vector of coefficient estimates is given by β̂ = (X′X)⁻¹X′Y, where X is the design matrix whose rows correspond to observations and whose columns correspond to features, and Y is the vector of target values.
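
As a rough illustration, the closed-form estimate can be computed directly with NumPy. The data and coefficient values below are synthetic and chosen purely for demonstration:

```python
import numpy as np

# Minimal sketch: solving the normal equations for the OLS estimate on made-up data.
rng = np.random.default_rng(0)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # intercept + 2 features
true_beta = np.array([1.0, 2.0, -0.5])
Y = X @ true_beta + rng.normal(scale=0.1, size=n)

# beta_hat = (X'X)^{-1} X'Y; np.linalg.solve is preferred over forming the inverse explicitly
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
print(beta_hat)  # close to [1.0, 2.0, -0.5]
```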

Because X′X must be inverted, the computation fails outright when the matrix is singular, which occurs under perfect multicollinearity, for example when one feature is an exact linear function of the others. Even when the multicollinearity is not perfect and estimates can still be computed, the resulting coefficients have inflated standard errors, so a genuinely relevant feature is less likely to be found significant based on its p-value. The estimates are also highly sensitive to small changes in the model: a predictor's estimated effect size can shift drastically when a single other variable is added to or removed from the equation.
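
The inflation of standard errors can be seen in a small simulation. The sketch below (synthetic data, arbitrary parameter values) fits the same regression once with an independent second predictor and once with a nearly collinear one, and compares the resulting standard errors:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

def ols_with_std_errors(X, Y):
    """Return OLS coefficient estimates and their standard errors."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ Y
    resid = Y - X @ beta_hat
    sigma2 = resid @ resid / (len(Y) - X.shape[1])  # residual variance estimate
    return beta_hat, np.sqrt(sigma2 * np.diag(XtX_inv))

x1 = rng.normal(size=n)
x2_indep = rng.normal(size=n)                     # uncorrelated with x1
x2_collin = x1 + rng.normal(scale=0.05, size=n)   # nearly a copy of x1
y = 1.0 + 2.0 * x1 + rng.normal(size=n)           # x2 has no true effect

for label, x2 in [("independent", x2_indep), ("collinear", x2_collin)]:
    X = np.column_stack([np.ones(n), x1, x2])
    beta_hat, se = ols_with_std_errors(X, y)
    print(label, "coef:", np.round(beta_hat, 2), "se:", np.round(se, 2))
# The standard errors on x1 and x2 are far larger in the collinear case,
# and the estimated effect of x1 is no longer pinned near its true value of 2.
```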
