The OLS model assumes that all the independent variables are independent of each other. This assumption is easy to test for a particular sample of data with simple correlation coefficients. Correlation, like much in statistics, is a matter of degree: a little is not good, and a lot is terrible.
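As a minimal sketch of that check, the simple correlation coefficient between each pair of independent variables can be computed directly. The data and the names `x1`, `x2`, `x3` and `pearson_r` below are hypothetical, chosen only for illustration: `x2` is built to track `x1` closely, while `x3` is not.

```python
from math import sqrt

def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical sample: x2 is roughly 2 * x1, x3 is unrelated to x1.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]
x3 = [5.0, 1.0, 4.0, 2.0, 6.0, 3.0]

r_12 = pearson_r(x1, x2)  # close to 1: strong collinearity
r_13 = pearson_r(x1, x3)  # close to 0: little collinearity

print(f"r(x1, x2) = {r_12:.4f}")
print(f"r(x1, x3) = {r_13:.4f}")
```

A pairwise check like this only detects collinearity between two variables at a time; with more than two independent variables, a variable could still be close to a linear combination of several others.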

The goal of the regression technique is to tease out the independent impacts of each of a set of independent variables on some hypothesized dependent variable. If two independent variables are interrelated, that is, correlated, then we cannot isolate the effects on Y of one from the other. In the extreme case where $x_1$ is a linear combination of $x_2$, so that the correlation between them equals one, both variables move in identical ways with Y. In this case it is impossible to determine which variable is the true cause of the effect on Y. (If the two variables were actually perfectly correlated, then mathematically no regression results could be calculated.)
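The extreme case can be illustrated with a short sketch. Assuming hypothetical data in which one variable is exactly twice the other, several quite different sets of coefficients produce identical fitted values, so the data give no basis for choosing among them. The function `predict` and the coefficient values are illustrative only.

```python
# Hypothetical perfectly collinear regressors: x2 is exactly 2 * x1.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0 * v for v in x1]

def predict(b0, b1, b2):
    """Fitted values b0 + b1*x1 + b2*x2 for the sample above."""
    return [b0 + b1 * a + b2 * b for a, b in zip(x1, x2)]

# Three different attributions of the same total effect on Y.
fit_a = predict(1.0, 4.0, 0.0)  # all of the effect assigned to x1
fit_b = predict(1.0, 0.0, 2.0)  # all of the effect assigned to x2
fit_c = predict(1.0, 2.0, 1.0)  # effect split between the two

print(fit_a == fit_b == fit_c)  # the three fits cannot be told apart
```

Because every such split fits the sample equally well, least squares has no unique solution here, which is the sense in which no regression results can be calculated.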

The normal equations for the coefficients show the effects of multicollinearity on the coefficients.

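$$b_1 = \frac{s_y \left( r_{x_1 y} - r_{x_1 x_2}\, r_{x_2 y} \right)}{s_{x_1} \left( 1 - r_{x_1 x_2}^2 \right)}$$

$$b_2 = \frac{s_y \left( r_{x_2 y} - r_{x_1 x_2}\, r_{x_1 y} \right)}{s_{x_2} \left( 1 - r_{x_1 x_2}^2 \right)}$$

$$b_0 = \bar{y} - b_1 \bar{x}_1 - b_2 \bar{x}_2$$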
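The three formulas above can be sketched in code for the two-variable case. The helper names (`mean`, `sample_sd`, `corr`, `two_regressor_ols`) and the data are hypothetical; Y is constructed without error as 2 + 3·x1 − x2, so the formulas should recover those coefficients up to rounding.

```python
from math import sqrt

def mean(v):
    return sum(v) / len(v)

def sample_sd(v):
    """Sample standard deviation (divisor n - 1)."""
    m = mean(v)
    return sqrt(sum((a - m) ** 2 for a in v) / (len(v) - 1))

def corr(u, v):
    """Sample correlation coefficient between u and v."""
    mu, mv = mean(u), mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)
    return cov / (sample_sd(u) * sample_sd(v))

def two_regressor_ols(y, x1, x2):
    """Coefficients b0, b1, b2 from the correlation form of the formulas."""
    r_1y, r_2y, r_12 = corr(x1, y), corr(x2, y), corr(x1, x2)
    denom = 1.0 - r_12 ** 2  # equals zero under perfect collinearity
    b1 = sample_sd(y) * (r_1y - r_12 * r_2y) / (sample_sd(x1) * denom)
    b2 = sample_sd(y) * (r_2y - r_12 * r_1y) / (sample_sd(x2) * denom)
    b0 = mean(y) - b1 * mean(x1) - b2 * mean(x2)
    return b0, b1, b2

# Hypothetical data: y = 2 + 3*x1 - 1*x2 exactly, with no error term.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [5.0, 1.0, 4.0, 2.0, 6.0, 3.0]
y = [2.0 + 3.0 * a - 1.0 * b for a, b in zip(x1, x2)]

b0, b1, b2 = two_regressor_ols(y, x1, x2)
print(f"b0 = {b0:.6f}, b1 = {b1:.6f}, b2 = {b2:.6f}")
```

Note that `denom` is the shared term 1 − r² in both slope formulas; if the two independent variables were perfectly correlated, it would be zero and the division would fail.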

The squared correlation between $x_1$ and $x_2$, $r_{x_1 x_2}^2$, appears in the denominator of the estimating formulas for both $b_1$ and $b_2$. If the assumption of independence holds, then this term is zero, and the correlation has no effect on the coefficients. On the other hand, as the correlation between the two independent variables increases, the denominator decreases, and thus the estimate of the coefficient increases. The correlation has the same effect on the coefficients of both variables. In essence, each variable is “taking” part of the effect on Y that should be attributed to the collinear variable. This results in biased estimates.

Multicollinearity has a further deleterious impact on the OLS estimates. The correlation between the two independent variables also shows up in the formulas for the estimated variances of the coefficients.

$$s_{b_1}^2 = \frac{s_e^2}{(n - 1)\, s_{x_1}^2 \left( 1 - r_{x_1 x_2}^2 \right)}$$

$$s_{b_2}^2 = \frac{s_e^2}{(n - 1)\, s_{x_2}^2 \left( 1 - r_{x_1 x_2}^2 \right)}$$

Here again we see the correlation between $x_1$ and $x_2$ in the denominator of the variance estimates for both coefficients. If the correlation is zero, as assumed in the regression model, then the formula collapses to the familiar ratio of the variance of the errors to the variance of the relevant independent variable. If, however, the two independent variables are correlated, then the variance of the estimate of the coefficient increases. This results in a smaller t-value for the test of hypothesis of the coefficient. In short, multicollinearity can lead us to fail to reject the null hypothesis that the X variable has no impact on Y when in fact X does have a statistically significant impact on Y. Said another way, the large standard errors of the estimated coefficients created by multicollinearity suggest statistical insignificance even when the hypothesized relationship is strong.
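A short sketch shows how quickly the variance grows with the correlation. The values of the error variance, sample size, and variance of the independent variable are hypothetical; the function `coef_variance` simply evaluates the variance formula for one coefficient. The ratio of the variance to its value at zero correlation is 1 / (1 − r²), often called the variance inflation factor, and the t-value shrinks by the square root of (1 − r²) for a given coefficient estimate.

```python
def coef_variance(s_e2, n, s_x2, r12):
    """Estimated variance of one slope coefficient in a two-regressor model."""
    return s_e2 / ((n - 1) * s_x2 * (1.0 - r12 ** 2))

# Hypothetical inputs: error variance, sample size, variance of x.
s_e2, n, s_x2 = 4.0, 51, 2.0

baseline = coef_variance(s_e2, n, s_x2, 0.0)  # no collinearity

for r12 in (0.0, 0.5, 0.9, 0.99):
    var_b = coef_variance(s_e2, n, s_x2, r12)
    inflation = var_b / baseline          # equals 1 / (1 - r12**2)
    t_factor = (1.0 - r12 ** 2) ** 0.5    # multiplier on the t-value
    print(f"r = {r12:4.2f}  variance = {var_b:.4f}  "
          f"inflation = {inflation:6.2f}  t multiplier = {t_factor:.3f}")
```

The comparison holds the coefficient estimate and the error variance fixed, which is an assumption made here only to isolate the effect of the correlation on the standard error.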

How good is the equation?

In the last section we concerned ourselves with testing the hypothesis that the dependent variable did indeed depend upon the hypothesized independent variable or variables. It may be that we find an independent variable that has some effect on the dependent variable, but it may not be the only one, and it may not even be the most important one. Remember that the error term was placed in the model to capture the effects of any missing independent variables. It follows that the error term may be used to give a measure of the "goodness" of the equation taken as a whole in explaining the variation of the dependent variable, Y.

Source:  OpenStax, Introductory statistics. OpenStax CNX. Aug 09, 2016 Download for free at http://legacy.cnx.org/content/col11776/1.26