9  Cointegration: Engle-Granger Test

library(AER)
library(tseries)
library(vars) # to select optimal lag length
library(urca)

9.1 Example: Pepper Price

We continue from the previous section, where we concluded that the white and black pepper prices are \(I(1)\).

In this section, we test whether they have a cointegrating relationship.


There are two conditions for cointegration:

  1. The time series are integrated of the same order.
  2. A linear combination of the variables is stationary.

In the previous section, we established that the series are integrated of the same order. The next task tests whether a linear combination of them is stationary.
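To make the two conditions concrete, here is a minimal sketch on simulated data (not the pepper series). The names `trend`, `x`, `y` and `combo`, and the coefficient 2, are assumptions chosen for illustration. Both series inherit the same random-walk trend, so each is I(1), yet the combination `y - 2 * x` removes the trend and leaves only stationary noise.

```r
# Simulated illustration only; none of these objects come from PepperPrice
set.seed(123)
n <- 300

trend <- cumsum(rnorm(n))   # shared stochastic trend (a random walk, so I(1))
e_x   <- rnorm(n)           # stationary noise in x
e_y   <- rnorm(n)           # stationary noise in y

x <- trend + e_x            # I(1): inherits the random walk
y <- 5 + 2 * trend + e_y    # I(1): inherits the same random walk

# The linear combination y - 2x cancels the trend, leaving 5 + e_y - 2 * e_x
combo <- y - 2 * x

c(var_y = var(y), var_combo = var(combo))
```

The variance of `combo` stays bounded because it no longer contains the random walk, whereas the variance of `y` grows with the sample length.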

data("PepperPrice")
colnames(PepperPrice) <- c("black_pepper", "white_pepper")

9.1.1 Task 1

Regress white_pepper on black_pepper and obtain the residuals.

9.1.1.1 Guidance

# regression of white_pepper on black_pepper
m_1 <- lm(white_pepper ~ black_pepper, as.data.frame(PepperPrice))
# Obtain residuals
resid_m1 <- residuals(m_1)

We need to perform a unit root test on the residuals. The test regression should not include a constant term, because residuals from a regression with an intercept have mean zero by construction. The adf.test() function we used above does not have an option to remove the constant from the test regression, so we turn to the ur.df() function from the urca package.
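One reason the constant can be dropped is that residuals from a regression with an intercept average to zero by construction. A minimal check on simulated data (the objects `x`, `y` and `fit` are assumptions for illustration, not the pepper series):

```r
# Simulated illustration only
set.seed(42)
x <- cumsum(rnorm(100))          # a random-walk regressor
y <- 2 + 0.8 * x + rnorm(100)    # dependent variable with an intercept

fit <- lm(y ~ x)                 # regression includes an intercept

# Mean of the residuals: zero up to floating-point error
resid_mean <- mean(residuals(fit))
resid_mean
```

Because the residual series is already centred on zero, a constant in the unit root test regression would have nothing to pick up.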

library(urca)
adf_no_constant <- ur.df(resid_m1, type = "none", lags = 0)
summary(adf_no_constant)

############################################### 
# Augmented Dickey-Fuller Test Unit Root Test # 
############################################### 

Test regression none 


Call:
lm(formula = z.diff ~ z.lag.1 - 1)

Residuals:
     Min       1Q   Median       3Q      Max 
-1316.48   -79.12    -4.80    74.85  1087.12 

Coefficients:
        Estimate Std. Error t value Pr(>|t|)    
z.lag.1 -0.11587    0.02916  -3.973 9.11e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 208.1 on 269 degrees of freedom
Multiple R-squared:  0.05543,   Adjusted R-squared:  0.05192 
F-statistic: 15.79 on 1 and 269 DF,  p-value: 9.111e-05


Value of test-statistic is: -3.9733 

Critical values for test statistics: 
      1pct  5pct 10pct
tau1 -2.58 -1.95 -1.62

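The test statistic (-3.97) is below the 1% critical value (-2.58), so the null of a unit root is rejected and the residuals of model m_1 are stationary. Note that the critical values printed by ur.df() are the standard Dickey-Fuller ones. Because the residuals come from an estimated regression, the Engle-Granger test strictly calls for more negative critical values (roughly -3.34 at the 5% level for two variables). The statistic here is below that threshold as well. This implies a cointegrating relationship between white_pepper and black_pepper prices, so the regression in levels is not spurious and can be estimated without differencing.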
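The Call line in the output, z.diff ~ z.lag.1 - 1, shows that the test is an ordinary regression of the differenced series on its own lag, without an intercept. The sketch below reproduces that regression by hand in base R. It uses a simulated stationary series `z` as a stand-in for resid_m1 (an assumption, so the block runs on its own); the names `z_diff`, `z_lag_1`, `df_fit` and `df_stat` are also illustrative.

```r
# Simulated stand-in for the residual series (assumption, not the pepper data)
set.seed(1)
z <- as.numeric(arima.sim(model = list(ar = 0.5), n = 200))

z_diff  <- diff(z)           # z_t - z_{t-1}
z_lag_1 <- z[-length(z)]     # z_{t-1}, aligned with z_diff

# Dickey-Fuller regression with no constant and no lagged differences
df_fit <- lm(z_diff ~ z_lag_1 - 1)

# The test statistic is the t value on the lagged level
df_stat <- coef(summary(df_fit))["z_lag_1", "t value"]
df_stat
```

The t value is then compared with Dickey-Fuller critical values rather than the usual t distribution, which is why the p-value printed by lm() should not be used for the test decision.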

9.1.2 Task 2

Estimate the Error Correction Mechanism (ECM) between the two types of pepper prices and interpret your results.

9.1.2.1 Guidance

The ECM takes the form of a first-differenced regression of the two variables that also includes the lagged residual from the levels regression estimated above.

First, let’s convert the residuals we obtained above to a ts (time-series) object, matching the dates of the PepperPrice object.

# Convert residuals we obtained above to a ts object (match ts object PepperPrice)
resid_m1 <- ts(resid_m1, start = start(PepperPrice),
               frequency = frequency(PepperPrice))

We then create the difference and lag variables that we need for an Error Correction Mechanism:

\[ \Delta white\_pepper_t = \alpha_0 + \alpha_1 \Delta black\_pepper_t + \alpha_2 resid\_m1_{t-1}+\epsilon_t \]

d_white <- diff(PepperPrice[, "white_pepper"])  # First difference of white_pepper
d_black <- diff(PepperPrice[, "black_pepper"])  # First difference of black_pepper
lag_resid <- stats::lag(resid_m1, -1)  # Lagged residuals (one period back)

Each of the above is a separate ts object. We will bind them into a single object and save it as PepperPrice_2.

PepperPrice_2 <- cbind(d_white, d_black, lag_resid)
colnames(PepperPrice_2) <- c("d_white", "d_black", "lag_resid") # assign column names
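The binding step relies on how ts objects align on their time index. A small sketch with a made-up five-point monthly series (`x`, `d_x`, `lag_x` and `both` are hypothetical names) shows that stats::lag(x, -1) stamps each value one period later, and that cbind() pads the non-overlapping period with NA:

```r
# Toy monthly series, values invented for illustration
x <- ts(c(10, 12, 15, 11, 14), start = c(2000, 1), frequency = 12)

d_x   <- diff(x)             # defined for periods 2 to 5
lag_x <- stats::lag(x, -1)   # x_{t-1} stamped at period t; spans periods 2 to 6

# cbind() on ts objects takes the union of the time ranges and fills gaps with NA
both <- cbind(d_x, lag_x)
colnames(both) <- c("d_x", "lag_x")
both

# Rows usable in a regression (no NA in any column)
n_complete <- sum(complete.cases(both))
```

The last row has a lagged value but no difference, so a regression on these columns drops it. The same mechanism accounts for the single observation deleted due to missingness in the ECM output.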

We can now run our ECM:

ecm <- lm(d_white ~ d_black + lag_resid, data = as.data.frame(PepperPrice_2)) 
summary(ecm)

Call:
lm(formula = d_white ~ d_black + lag_resid, data = as.data.frame(PepperPrice_2))

Residuals:
     Min       1Q   Median       3Q      Max 
-1059.66   -62.79   -15.21    61.56  1092.60 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  4.60426   11.65811   0.395    0.693    
d_black      0.72575    0.07993   9.080  < 2e-16 ***
lag_resid   -0.12866    0.02689  -4.784 2.84e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 191.5 on 267 degrees of freedom
  (1 observation deleted due to missingness)
Multiple R-squared:  0.2949,    Adjusted R-squared:  0.2896 
F-statistic: 55.83 on 2 and 267 DF,  p-value: < 2.2e-16

In the above output, we are interested in the lag_resid term. The coefficient of this term needs to be negative for a valid error correction mechanism. We see that the coefficient is negative and statistically significant. The estimate of -0.12866 shows that, following a deviation from the long-run equilibrium between the white pepper and black pepper prices, 12.9% of the deviation is corrected in the next period. We have monthly data, so 12.9% of the deviation from the long-run equilibrium is corrected in the next month. At this speed, about half of a deviation is eliminated in roughly five months and about 80% within a year, so the system returns most of the way to its equilibrium state in less than a year.
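The adjustment speed can be turned into a rough timeline. The sketch below assumes that a deviation shrinks geometrically by the factor (1 + coefficient) each month and ignores the short-run dynamics from d_black, which is a simplification. The names `alpha_2`, `remaining` and `half_life` are illustrative.

```r
# lag_resid coefficient copied from the ECM output above
alpha_2 <- -0.12866

# Share of an initial deviation still present after 1 to 12 months,
# assuming geometric decay at rate (1 + alpha_2) per month
remaining <- (1 + alpha_2)^(1:12)
round(remaining, 3)

# Number of months needed for half of the deviation to disappear
half_life <- log(0.5) / log(1 + alpha_2)
half_life
```

Under this simplification the half-life is about five months, and a little under a fifth of the original deviation remains after twelve months.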