Heteroscedasticity Notes

Heteroscedasticity occurs when the variance of the error term is not constant across observations. This violates the assumption of homoscedasticity in ordinary least squares (OLS) regression. Heteroscedasticity can arise when the values of an independent variable are more extreme, due to measurement error, model misspecification, or differences between subpopulations. While OLS estimates remain unbiased and consistent with heteroscedasticity, they are no longer best linear unbiased estimators and test statistics like t and F may not be reliable. The Breusch-Pagan and White tests can detect the presence of heteroscedasticity.

Uploaded by

Denise Myka Tan

Heteroscedasticity

What heteroskedasticity is. Recall that OLS makes the assumption that V(εj) = σ² for all j. That is, the
variance of the error term is constant (homoskedasticity). If the error terms do not have constant variance,
they are said to be heteroskedastic. The term means “differing variance” and comes from the Greek
“hetero” ('different') and “skedasis” ('dispersion').

When heteroskedasticity might occur.

1) Errors may increase as the value of an IV increases. For example, consider a model in which annual
family income is the IV and annual family expenditures on vacations is the DV. Families with low incomes
will spend relatively little on vacations, and the variation in expenditures across such families will be
small. But for families with large incomes, the amount of discretionary income will be higher. The mean
amount spent on vacations will be higher, and there will also be greater variability among such families,
resulting in heteroskedasticity. Note that, in this example, a high family income is a necessary but not
sufficient condition for large vacation expenditures. Any time a high value of an IV is a necessary but not
sufficient condition for an observation to have a high value on a DV, heteroskedasticity is likely. Similar
examples: error terms associated with very large firms might have larger variances than those associated
with smaller firms; sales of larger firms might be more volatile than sales of smaller firms.

2) Errors may also increase as the values of an IV become more extreme in either direction, e.g. with
attitudes that range from extremely negative to extremely positive. This will produce something that looks
like an hourglass shape.

3) Measurement error can cause heteroskedasticity. Some respondents might provide more accurate
responses than others. (Note that this problem arises from the violation of another assumption, that
variables are measured without error.)

4) Other model misspecifications can produce heteroskedasticity. For example, it may be that
instead of using Y, you should be using the log of Y. Instead of using X, maybe you should be using X2, or
both X and X2 . Important variables may be omitted from the model. If the model were correctly specified,
you might find that the patterns of heteroskedasticity disappeared.

5) Heteroskedasticity can also occur if there are subpopulation differences or other interaction effects
(e.g. the effect of income on expenditures differs for whites and blacks). (Again, the problem arises from
violation of the assumption that no such differences exist or have already been incorporated into the
model.) For example, suppose that Z stands for three different populations. At low values of X, the
regression lines for each population are very close to each other. As X gets bigger, the regression lines
get further and further apart, which means the residual values will also get further and further apart.
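Case 1 above can be sketched with a small simulation. This is a purely hypothetical illustration (assuming Python with numpy, not part of the notes' Gretl session): the error spread is made proportional to income, as in the vacation-spending example, and the residual spread is then compared for low- and high-income families.

```python
import numpy as np

# Hypothetical data, mimicking example 1: vacation spending whose error
# standard deviation grows with family income.
rng = np.random.default_rng(42)
n = 500
income = rng.uniform(20, 200, n)           # annual income, in $1000s (illustrative)
eps = rng.normal(0, 0.05 * income)         # error sd proportional to income
spending = 0.5 + 0.02 * income + eps       # assumed true relationship

# Fit OLS and compare the residual spread across income groups
X = np.column_stack([np.ones(n), income])
beta, *_ = np.linalg.lstsq(X, spending, rcond=None)
resid = spending - X @ beta
low = resid[income < 60]
high = resid[income > 160]
print(low.std(), high.std())               # high-income residuals are more dispersed
```

Plotting `resid` against `income` for such data would show the familiar widening funnel shape.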

Consequences of heteroscedasticity

1) Heteroscedasticity does not alter the unbiasedness and consistency properties of OLS estimators.
2) But OLS estimators are no longer of minimum variance or efficient. That is, they are NOT best linear
unbiased estimators (BLUE); they are simply linear unbiased estimators (LUE).
3) As a result, the t and F tests based on the standard assumptions of the CLRM may not be reliable, resulting in
erroneous conclusions.
4) In the presence of heteroscedasticity, the BLUE estimators are provided by the method of weighted least
squares (WLS).

In the presence of heteroscedasticity: unbiased – yes; consistent – yes; efficient – no.

Data: abortion rates in the US.


A priori, we would expect ABR to be negatively related to religion, price, laws, picket, education, and positively
related to fund and income. We assume the error term satisfies the standard classical assumptions, including
the assumption of homoscedasticity.


- For every 1 dollar increase in price, there is a corresponding 0.04236 decrease in the abortion rate.
- States with restrictive laws have an abortion rate 0.8731 lower than states without such laws.
- For every 1% increase in the religious population of a state, there is a corresponding 0.02007 increase in the abortion rate.
- For every 1% increase in picketing, there is a corresponding 0.1168 decrease in the abortion rate.

As these results show, on the basis of the t statistic, price, income, and picket are statistically significant at the
10% or lower level of significance, whereas the other variables are not statistically significant, although some of
them (laws and education) have the correct signs. But remember that if there is heteroscedasticity, the estimated
t values may not be reliable.

The R² value shows that 58% of the variation in the abortion rate is explained by the model. The F statistic,
which tests the hypothesis that all the slope coefficients are simultaneously zero, clearly rejects this
hypothesis, for its value of 8.199 is highly significant; its p value is practically zero. Again, keep in mind that the F
statistic may not be reliable if there is heteroscedasticity.

As noted, a commonly encountered problem in cross-sectional data is the problem of heteroscedasticity. In our
example, because of the diversity of the states we suspect heteroscedasticity.

VISUAL TEST FOR HETEROSCEDASTICITY:

From the Gretl output, SAVE the fitted values and the squared residuals. Then, in the main Gretl window, plot
them (fitted values on the X axis, squared residuals on the Y axis).

It seems that there is a systematic relationship between the squared residuals and the estimated values of the
abortion rate, which can be checked by some formal tests of heteroscedasticity.

DETECTION:
Breusch-Pagan – For the BP test, the null assumes homoskedasticity. So if p_val < 0.05 (or your chosen
alpha), you reject the null and infer the presence of heteroskedasticity; if p_val > 0.05 (or your chosen
alpha), you fail to reject the null and conclude there may not be heteroskedasticity.
Note: A weakness of the BP test is that it assumes the heteroskedasticity is a linear function of the
independent variables. Failing to find evidence of heteroskedasticity with the BP doesn’t rule out a
nonlinear relationship between the independent variable(s) and the error variance.

Regress e² on all the IVs.

H0: Error terms are homoscedastic


p-value = .0166, therefore REJECT H0. Heteroscedastic.

NOTE: same output if we do TESTS – Hetero – BP test
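The BP steps above can be sketched by hand (a minimal numpy illustration on simulated heteroskedastic data, not the abortion data set itself; numpy is assumed to be available). The LM statistic is n·R² from the auxiliary regression of e² on the IVs, chi-square with as many df as there are IVs under H0:

```python
import numpy as np

# Simulated data with error sd proportional to x, so H0 should be rejected
rng = np.random.default_rng(0)
n = 400
x = rng.uniform(1, 10, n)
y = 1 + 2 * x + rng.normal(0, 0.5 * x)

# Step 1: OLS, save squared residuals
X = np.column_stack([np.ones(n), x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
e2 = (y - X @ b) ** 2

# Step 2: auxiliary regression of e^2 on the IVs; LM = n * R^2
g, *_ = np.linalg.lstsq(X, e2, rcond=None)
r2 = 1 - ((e2 - X @ g) ** 2).sum() / ((e2 - e2.mean()) ** 2).sum()
lm = n * r2
print(lm)   # compare with the 5% chi-square critical value, 3.841 for 1 df
```

With one IV, a single LM statistic above 3.841 leads to rejecting homoskedasticity, mirroring the p = .0166 rejection in the notes.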

White’s test: the White test provides a flexible functional form that’s useful for identifying nearly any pattern
of heteroskedasticity. It allows the independent variables to have a nonlinear and interactive effect on the
error variance, which is why it is the most commonly used test for heteroskedasticity.
Regress e² on: all regressors, their squares, and their pairwise cross products.

Complete White test: note we do not reject H0 (homoscedastic). The chi-square statistic is sensitive to having many IVs.
The null hypothesis for White’s test is that the variances of the errors are equal. In math terms, that’s:
H0: σi² = σ² for all i.
The alternative hypothesis (the one you’re testing) is that the variances are not equal:
H1: σi² ≠ σ² for some i.
The only difference between the Breusch–Pagan test and White’s test is that the BP auxiliary regression
doesn’t include the cross terms or the squared variables.
Other than that, the steps are exactly the same.
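The White auxiliary regression can be sketched the same way (numpy assumed; a simulated two-IV model for illustration, since with two regressors there is exactly one cross product). The auxiliary regression has 5 df here: two levels, two squares, one cross product.

```python
import numpy as np

# Simulated two-regressor model with error sd proportional to x1
rng = np.random.default_rng(1)
n = 400
x1 = rng.uniform(1, 10, n)
x2 = rng.uniform(1, 10, n)
y = 1 + 2 * x1 - x2 + rng.normal(0, 0.4 * x1)

# OLS residuals
X = np.column_stack([np.ones(n), x1, x2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
e2 = (y - X @ b) ** 2

# White auxiliary regressors: levels, squares, pairwise cross product
Z = np.column_stack([np.ones(n), x1, x2, x1**2, x2**2, x1 * x2])
g, *_ = np.linalg.lstsq(Z, e2, rcond=None)
r2 = 1 - ((e2 - Z @ g) ** 2).sum() / ((e2 - e2.mean()) ** 2).sum()
lm = n * r2
print(lm)   # chi-square with 5 df under H0; the 5% critical value is 11.07
```

Note how fast the df grow: with the seven regressors of the abortion model, the levels, squares, and cross products already consume the 33 df mentioned later in the notes.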

When to use the White test?

If your data set has many explanatory variables, the test may be challenging to calculate. Unless you have
a specific reason for running the White test (i.e. you need your independent variables to have an
interactive, nonlinear effect on the variance), you should use the simpler Breusch-Pagan. White’s test is an
asymptotic test, which means it’s meant to be used on large samples. For smaller samples, interpret the
results with caution. One issue with White’s test is that it can return a significant result even if the
variances of the errors are equal. This happens because the model is a general one, and may pick up other
issues in your data (although it won’t specify what those issues are!). According to Richard Williams, one
of the reasons the test is more general is because of added terms that test for more types of
heteroscedasticity, like adding squares of the regressors (i.e. the independent variables) to try and
identify nonlinear shapes like an hourglass.

In the test above, p = 0.5667 >.05, hence we do not reject H0, thus we infer homoscedastic error terms. As
this exercise shows, White’s chi-square test is sensitive to whether we add or drop the squared and cross-
product terms from the auxiliary regression. Remember that the White test is a large sample test. Thus,
when we include the regressors and their squared and cross-product terms, which results in a loss of 33
df, the results of the auxiliary regression are likely to be very sensitive, which is the case here.

White’s test with squares of the IVs only and no cross products: still not significant (p-value = 0.123 > .05),
thus we do NOT reject H0.
White’s test with ONLY the fitted value and the squared fitted value as regressors (a proxy for all the other
IVs, their squares, and their cross products):

To avoid the loss of so many degrees of freedom, White’s test can be shortened by regressing the
squared residuals on the fitted value of the regressand and its square. That is, we regress:
ei² = α1 + α2·fitted(Abortion)i + α3·fitted(Abortion)i² + vi

Now we see the p-value is significant, thus we reject H0.
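The shortened version only changes the auxiliary regression; a numpy sketch on simulated data (the abortion regression itself is not reproduced here) looks like this, with 2 df regardless of how many IVs the model has:

```python
import numpy as np

# Simulated heteroskedastic data
rng = np.random.default_rng(2)
n = 400
x = rng.uniform(1, 10, n)
y = 1 + 2 * x + rng.normal(0, 0.5 * x)

# OLS: save fitted values and squared residuals
X = np.column_stack([np.ones(n), x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
yhat = X @ b
e2 = (y - yhat) ** 2

# Shortened White: e^2 = a1 + a2*yhat + a3*yhat^2 + v
Z = np.column_stack([np.ones(n), yhat, yhat**2])
g, *_ = np.linalg.lstsq(Z, e2, rcond=None)
r2 = 1 - ((e2 - Z @ g) ** 2).sum() / ((e2 - e2.mean()) ** 2).sum()
print(n * r2)   # chi-square with 2 df under H0; the 5% critical value is 5.991
```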

REMEDIES
Knowing the consequences of heteroscedasticity, it may be necessary to seek remedial measures. The problem
here is that we do not know the true heteroscedastic variances, σi², for they are rarely observed. If we could
observe them, then we could obtain BLUE estimators by dividing each observation by the (heteroscedastic) σi
and estimating the transformed model by OLS. This method of estimation is known as the method of weighted
least squares (WLS). Unfortunately, the true σi² is rarely known. Then what is the solution?

1) If the true error variance is proportional to the square of one of the regressors, we can divide both
sides of the equation by that variable and run the transformed regression. Suppose, in the original
equation, the error variance is proportional to the square of income. We therefore divide both sides of the
equation by the income variable and estimate this regression. We then subject this regression to
heteroscedasticity tests, such as the BP and White tests. If these tests indicate that there is no evidence
of heteroscedasticity, we may then assume that the transformed error term is homoscedastic.

2) If the true error variance is proportional to one of the regressors, we can use the so-called square-root
transformation: we divide both sides of (5.1) by the square root of the chosen regressor. We then
estimate the transformed regression and subject it to heteroscedasticity tests. If these tests are
satisfactory, we may rely on this regression.
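Remedy 2 can be sketched as follows (numpy assumed; simulated data with a true slope of 2 and error variance proportional to x). Dividing every term, including the constant, by √x makes the transformed error variance constant:

```python
import numpy as np

# Simulated data: Var(eps_i) proportional to x_i
rng = np.random.default_rng(3)
n = 500
x = rng.uniform(1, 10, n)
y = 1 + 2 * x + rng.normal(0, np.sqrt(x))

# Square-root transformation: divide both sides by sqrt(x)
w = np.sqrt(x)
y_t = y / w
X_t = np.column_stack([1 / w, x / w])      # transformed constant and regressor
b_t, *_ = np.linalg.lstsq(X_t, y_t, rcond=None)
print(b_t)   # estimates of the original intercept and slope
```

The coefficients in `b_t` are still estimates of the original intercept and slope, because both sides of the equation were scaled by the same factor; this transformed OLS is exactly WLS with weights 1/x.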

AD HOC Transformations

3) Divide the DV by its FITTED VALUE – abortion / fitted(abortion).

Note that we do these transformations to get rid of heteroscedasticity. If you then run White’s or BP tests
here, the errors are still heteroscedastic.

4) Logarithmic transformation – take the log of the regressand (abortion). (Note: you can only take logs of
positive numbers.)

LOG of Regressand Output

The reason for this is that the log transformation compresses the scales in which the variables are
measured, thereby reducing a tenfold difference between two values to a twofold difference.

For example, the number 80 is 10 times the number 8, but ln 80 (= 4.3820) is only about twice as large as
ln 8 (= 2.0794).
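A quick check of the compression claim, using only the standard library:

```python
import math

# 80 is ten times 8, but its log is only about twice as large
print(math.log(80))                 # natural log of 80, about 4.382
print(math.log(8))                  # natural log of 8, about 2.079
print(math.log(80) / math.log(8))   # ratio is roughly 2, not 10
```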

The one caveat about using the log transformation is that we can take logs of positive numbers only.

Regressing the log of the abortion rate on the variables included in Eq. (5.1), we obtain the following results.
Qualitatively these results are similar to those given in Table 5.1, in that the price, income, and picket variables
are statistically significant. However, the interpretation of the regression coefficients is different from that in
Table 5.1. The various slope coefficients measure semi-elasticities – that is, the relative change in the abortion
rate for a unit change in the value of the regressor. Thus the price coefficient of –0.003 means that if price goes
up by a dollar, the relative change in the abortion rate is –0.003, or about –0.3%. All other coefficients are to be
interpreted similarly.

When this regression was subjected to the Breusch–Pagan and White’s tests (without squared and cross-product
terms), it was found that this regression did not suffer from heteroscedasticity. Again, this result should be
accepted cautiously, for our “sample” of 51 observations may not be large enough. This conclusion raises an
important point about heteroscedasticity tests: if one or more of these tests indicates a problem of
heteroscedasticity, it may not be heteroscedasticity per se but a model specification error.
ROBUST standard errors (valid in LARGE samples)
If there is heteroscedasticity, re-estimate the model with the Robust option checked.

Still, we reject H0. (Sample is small, only 51).
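What the Robust option computes can be sketched with the White-style sandwich estimator, (XᵀX)⁻¹ Xᵀ diag(e²) X (XᵀX)⁻¹. This is an HC0-style illustration in numpy on simulated data; Gretl's default robust option applies a small-sample correction to a formula of this kind:

```python
import numpy as np

# Simulated heteroskedastic data
rng = np.random.default_rng(4)
n = 300
x = rng.uniform(1, 10, n)
y = 1 + 2 * x + rng.normal(0, 0.5 * x)

# OLS coefficients and residuals
X = np.column_stack([np.ones(n), x])
XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
e = y - X @ b

# Sandwich covariance: (X'X)^-1 X' diag(e^2) X (X'X)^-1
meat = X.T @ (X * (e ** 2)[:, None])
cov_robust = XtX_inv @ meat @ XtX_inv
print(np.sqrt(np.diag(cov_robust)))   # heteroskedasticity-robust standard errors
```

The coefficients are unchanged; only the standard errors (and hence the t statistics) are corrected, which is why robust inference remains a large-sample argument.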

REVISIT the WAGE FUNCTION and CHECK FOR HETEROSCEDASTICITY. Then correct for it if present.

Notes on Weighted Least Squares (WLS) – used when the assumption of constant variance is violated.

Estimation is different in WLS vs. OLS. In OLS, we don’t have the W terms:
OLS: β = (XᵀX)⁻¹ (XᵀY)
WLS: β = (XᵀWX)⁻¹ (XᵀWY), where W is the diagonal matrix of weights.
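The matrix formula β = (XᵀWX)⁻¹XᵀWY can be sketched directly in numpy (simulated data; with W = I it collapses to the OLS formula):

```python
import numpy as np

# Simulated data with error variance proportional to x
rng = np.random.default_rng(5)
n = 100
x = rng.uniform(1, 10, n)
y = 1 + 2 * x + rng.normal(0, np.sqrt(x))

X = np.column_stack([np.ones(n), x])
W = np.diag(1 / x)                     # weights = 1 / variance (up to scale)

# WLS: beta = (X'WX)^-1 X'WY; OLS: beta = (X'X)^-1 X'Y
b_wls = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
b_ols = np.linalg.solve(X.T @ X, X.T @ y)
print(b_wls, b_ols)
```

This WLS solution is numerically identical to running OLS on the √x-transformed equation from the remedies section, which is the sense in which WLS "is" transformed OLS.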
JULY 30 ABORTION

BP – squared residual (usq2)
