Please note that the information provided in this section (panel data misspecification tests) is only for those of you who are interested in exploring further. You are not responsible for these tests in your 6036ECN exam.
In this section, we will continue with the examples from the previous section and test for Autocorrelation and Heteroscedasticity.
The European countries gasoline consumption data are obtained from Abay Mulatu's 316ECN Applied Econometrics lecture material, Coventry University.
We will be using the all-country sample of the gasoline demand data.
We start by loading the required libraries.
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.5.1 ✔ tibble 3.2.1
✔ lubridate 1.9.4 ✔ tidyr 1.3.1
✔ purrr 1.0.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(Hmisc) # add labels to variables
Attaching package: 'Hmisc'
The following objects are masked from 'package:dplyr':
src, summarize
The following objects are masked from 'package:base':
format.pval, units
#library(ggplot2)
#library(dplyr) # for data manipulation
library(plm) # to estimate linear panel data models
Attaching package: 'plm'
The following objects are masked from 'package:dplyr':
between, lag, lead
#library(fastDummies) # create dummies based on categorical (factor) variable
13.1 Testing for Autocorrelation
13.1.1 Task 1
Using the gasoline-demand-all-countries.csv data, estimate a one-way fixed effects gasoline demand model.
13.1.1.1 Guidance
This is a replication of what we have done in the previous section.
df <- read.csv("~/Desktop/R-workshops/assets/data/gasoline-demand-all-countries.csv", stringsAsFactors = TRUE)
#View(df)
# label variables
label(df$L_gas_cons_pcar) <- "Logarithm of gasoline consumption per car"
label(df$L_income_pc) <- "Logarithm of real income per capita"
label(df$L_gas_price) <- "Logarithm of real gasoline price per gallon"
label(df$L_cars_pc) <- "Logarithm of number of cars per capita"
We estimate a fixed effects model below:
# Panel Data Fixed Effects Model
fixed_1 <- plm(L_gas_cons_pcar ~ L_income_pc + L_gas_price + L_cars_pc, data = df, index = c("country", "year"), model = "within")
summary(fixed_1)
Oneway (individual) effect Within Model
Call:
plm(formula = L_gas_cons_pcar ~ L_income_pc + L_gas_price + L_cars_pc,
data = df, model = "within", index = c("country", "year"))
Balanced Panel: n = 18, T = 19, N = 342
Residuals:
Min. 1st Qu. Median 3rd Qu. Max.
-0.378774 -0.039758 0.004650 0.045412 0.362856
Coefficients:
Estimate Std. Error t-value Pr(>|t|)
L_income_pc 0.662250 0.073386 9.0242 < 2.2e-16 ***
L_gas_price -0.321702 0.044099 -7.2950 2.355e-12 ***
L_cars_pc -0.640483 0.029679 -21.5804 < 2.2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Total Sum of Squares: 17.061
Residual Sum of Squares: 2.7365
R-Squared: 0.8396
Adj. R-Squared: 0.82961
F-statistic: 560.093 on 3 and 321 DF, p-value: < 2.22e-16
13.1.2 Task 2
Test for the existence of Autocorrelation.
13.1.2.1 Guidance
Wooldridge Test for AR(1) Errors in FE Panel Models.
Can be used for both short and long panels.
Applies to fixed effects models only.
Tests for first-order autocorrelation, i.e. AR(1) errors.
pwartest(fixed_1)
Wooldridge's test for serial correlation in FE panels
data: fixed_1
F = 212, df1 = 1, df2 = 322, p-value < 2.2e-16
alternative hypothesis: serial correlation
The p-value is far below 0.05, so we reject the null hypothesis of no serial correlation: there is evidence of autocorrelation of order 1.
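The object returned by pwartest is a standard "htest" list, so the statistic and p-value can be extracted for reporting rather than read off the printout. The sketch below is self-contained and therefore assumes plm's bundled Gasoline panel (18 countries, 19 years) as a stand-in for the CSV file used above; its variable names (lgaspcar, lincomep, lrpmg, lcarpcap) and the object name fe_demo are not the ones used in this section.

```r
library(plm)

# Assumption: plm's bundled Gasoline panel stands in for the workshop CSV
data("Gasoline", package = "plm")

# One-way (individual) fixed effects model, as in Task 1
fe_demo <- plm(lgaspcar ~ lincomep + lrpmg + lcarpcap,
               data = Gasoline, index = c("country", "year"),
               model = "within")

# pwartest() returns an object of class "htest"
wt <- pwartest(fe_demo)

wt$statistic  # F statistic
wt$parameter  # degrees of freedom
wt$p.value    # p-value

# TRUE if the null of no first-order serial correlation is rejected at 5%
reject_h0 <- wt$p.value < 0.05
reject_h0
```

With your own data, replace fe_demo with fixed_1.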
Breusch-Godfrey Test for Panel Models
Suitable for long panels.
Lets you choose the order of serial correlation to test for.
Can be used for both fixed effects and random effects models.
pbgtest(fixed_1)
Breusch-Godfrey/Wooldridge test for serial correlation in panel models
data: L_gas_cons_pcar ~ L_income_pc + L_gas_price + L_cars_pc
chisq = 185.08, df = 19, p-value < 2.2e-16
alternative hypothesis: serial correlation in idiosyncratic errors
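The test rejects the null hypothesis of no serial correlation. Note that, by default, pbgtest sets the order to the number of time periods (hence df = 19 here), so this is a joint test up to order 19 rather than a test of order 1 only.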
Suppose we want to test for autocorrelation up to order 3:
pbgtest(fixed_1, order = 3)
Breusch-Godfrey/Wooldridge test for serial correlation in panel models
data: L_gas_cons_pcar ~ L_income_pc + L_gas_price + L_cars_pc
chisq = 172.76, df = 3, p-value < 2.2e-16
alternative hypothesis: serial correlation in idiosyncratic errors
There is autocorrelation up to order 3.
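To compare several orders at once, the order argument can be varied in a loop. This is only a sketch: it assumes plm's bundled Gasoline panel as a stand-in for the CSV file, and the names fe_demo, orders and bg_pvalues are illustrative.

```r
library(plm)

# Assumption: plm's bundled Gasoline panel stands in for the workshop CSV
data("Gasoline", package = "plm")

fe_demo <- plm(lgaspcar ~ lincomep + lrpmg + lcarpcap,
               data = Gasoline, index = c("country", "year"),
               model = "within")

# Run the panel Breusch-Godfrey test for orders 1, 2 and 3
orders <- 1:3
bg_pvalues <- sapply(orders, function(k) {
  as.numeric(pbgtest(fe_demo, order = k)$p.value)
})

# One row per order tested
data.frame(order = orders, p_value = bg_pvalues)
```

Each row reports a joint test of no serial correlation up to the given order, so a small p-value at order 3 does not by itself say which of the three lags drives the rejection.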
13.1.3 Task 3
Estimate a random effects model
13.1.3.1 Guidance
We estimated this model in the previous section. Here is the random effects version of the fixed effects model above.
# Panel Data Random Effects Model
random_1 <- plm(L_gas_cons_pcar ~ L_income_pc + L_gas_price + L_cars_pc, data = df, index = c("country", "year"), model = "random")
summary(random_1)
Oneway (individual) effect Random Effect Model
(Swamy-Arora's transformation)
Call:
plm(formula = L_gas_cons_pcar ~ L_income_pc + L_gas_price + L_cars_pc,
data = df, model = "random", index = c("country", "year"))
Balanced Panel: n = 18, T = 19, N = 342
Effects:
var std.dev share
idiosyncratic 0.008525 0.092330 0.182
individual 0.038238 0.195545 0.818
theta: 0.8923
Residuals:
Min. 1st Qu. Median 3rd Qu. Max.
-0.3977058 -0.0520350 0.0050877 0.0582288 0.3763726
Coefficients:
Estimate Std. Error z-value Pr(>|z|)
(Intercept) 1.996698 0.184326 10.8324 < 2.2e-16 ***
L_income_pc 0.554986 0.059128 9.3861 < 2.2e-16 ***
L_gas_price -0.420389 0.039978 -10.5155 < 2.2e-16 ***
L_cars_pc -0.606840 0.025515 -23.7836 < 2.2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Total Sum of Squares: 18.054
Residual Sum of Squares: 3.0817
R-Squared: 0.82931
Adj. R-Squared: 0.8278
Chisq: 1642.2 on 3 DF, p-value: < 2.22e-16
13.1.4 Task 4
Test for autocorrelation in the above estimated random effects model.
13.1.4.1 Guidance
We can start by applying the Breusch-Godfrey test for panel models.
pbgtest(random_1)
Breusch-Godfrey/Wooldridge test for serial correlation in panel models
data: L_gas_cons_pcar ~ L_income_pc + L_gas_price + L_cars_pc
chisq = 193.42, df = 19, p-value < 2.2e-16
alternative hypothesis: serial correlation in idiosyncratic errors
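The test rejects the null hypothesis of no serial correlation. As before, the default order equals the number of time periods (df = 19), so this is a joint test up to order 19.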
Baltagi and Li Serial Dependence Test for Random Effect Models
The test can be used with random effects models only.
pbltest(random_1)
Baltagi and Li two-sided LM test
data: formula(x$formula)
chisq = 225.16, df = 1, p-value < 2.2e-16
alternative hypothesis: AR(1)/MA(1) errors in RE panel model
This test confirms the result of the previous one: the null hypothesis of no serial correlation is rejected in favour of AR(1) or MA(1) errors.
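pbltest also takes an alternative argument, so a one-sided version of the test can be run alongside the default two-sided one. The sketch below assumes plm's bundled Gasoline panel as a stand-in for the CSV file; re_demo, bl_two and bl_one are illustrative names, and reading the one-sided alternative as positive serial correlation is my interpretation of the plm help page.

```r
library(plm)

# Assumption: plm's bundled Gasoline panel stands in for the workshop CSV
data("Gasoline", package = "plm")

# Random effects model, as in Task 3
re_demo <- plm(lgaspcar ~ lincomep + lrpmg + lcarpcap,
               data = Gasoline, index = c("country", "year"),
               model = "random")

# Default: two-sided LM test (chi-squared version)
bl_two <- pbltest(re_demo)

# One-sided version (normal version of the test)
bl_one <- pbltest(re_demo, alternative = "onesided")

bl_two$p.value
bl_one$p.value
```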
13.2 Heteroscedasticity
13.2.1 Task 5
Test for the existence of heteroscedasticity in each of the models estimated above.
Report the fixed and random effects estimation results with autocorrelation and heteroscedasticity robust standard errors.
13.2.2.1 Guidance
The existence of autocorrelation and heteroscedasticity affects the standard error estimates. Although the coefficient estimates remain unbiased, the standard errors are biased, and hence all tests we perform on the model will be misleading. We may either try to resolve these issues by changing our model or estimation strategy, or compute heteroscedasticity- and autocorrelation-adjusted standard errors.
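One way to obtain such adjusted standard errors is to pass a robust covariance matrix from plm's vcovHC to lmtest::coeftest. The following is a minimal sketch, not the only possible setup: it assumes plm's bundled Gasoline panel as a stand-in for the CSV file, uses the Arellano method clustered by country, and the choice of type = "HC1" is an illustrative assumption.

```r
library(plm)

# Assumption: plm's bundled Gasoline panel stands in for the workshop CSV
data("Gasoline", package = "plm")

fe_demo <- plm(lgaspcar ~ lincomep + lrpmg + lcarpcap,
               data = Gasoline, index = c("country", "year"),
               model = "within")

# Arellano-type covariance matrix, clustered by group (country);
# robust to heteroscedasticity and serial correlation within countries
vc_fe <- vcovHC(fe_demo, method = "arellano", type = "HC1", cluster = "group")

# Coefficient table re-computed with the robust standard errors
robust_fe <- lmtest::coeftest(fe_demo, vcov. = vc_fe)
robust_fe
```

The coefficient estimates are unchanged; only the standard errors, test statistics and p-values differ from the summary() output. The same two calls can be applied to a random effects model.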
Calculation of robust standard errors is common in applied econometrics. Since the models presented here are panel data models, we will use the plm package's vcovHC function. Below is the information from R help about the arguments that this function can take: