Non-stationarity and unit root processes¶
import warnings
warnings.simplefilter("ignore", FutureWarning)

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.api as sm
from numpy.random import default_rng
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.arima_process import arma_acovf, ArmaProcess
from statsmodels.tsa.deterministic import DeterministicProcess

sns.set_theme()
sns.set_context("notebook")

gen = default_rng(1)
Why do we need stationarity?¶
stationarity is a form of constancy of the properties of the process (mean, variance, autocovariances)
needed in order to be able to learn something about those properties
Many macroeconomic time series are non-stationary
#df = pd.read_excel('nama_10_gdp__custom_2373104_page_spreadsheet.xlsx',
# sheet_name='Sheet 1',
# skiprows=range(9), usecols=[0,1], names=['year', 'GDP'], index_col=0, parse_dates=True,
# ).loc['1995':'2021']
#df = df.astype('float')
#data = np.log(df['GDP'])
#title='Real GDP, Euro Area'
#ylabel='natural log of millions of euro'
#fig = data.plot(figsize=(10,3), title=title,ylabel=ylabel)
#plt.savefig('images/rGDPea.png')
ARMA(p, q)¶
in lag operator notation
\[ \alpha(L) z_t = \beta(L) \varepsilon_t, \qquad \alpha(L) = 1 - \alpha_1 L - \cdots - \alpha_p L^p, \quad \beta(L) = 1 + \beta_1 L + \cdots + \beta_q L^q \]
ARMA(p, q) is stationary if the AR(p) part is stationary, i.e. if all roots of
\[ \alpha(x) = 1 - \alpha_1 x - \cdots - \alpha_p x^p = 0 \]
are outside the unit circle (\(|x|>1\))
a non-stationary process with one root \(x=1\) and the other roots \(|x|>1\) is called a unit root process
if \(z_t\) is a unit root process, \(\Delta z_t = z_t - z_{t-1} = (1-L) z_t\) is stationary
since \(\alpha(L)\) then factors as \(\alpha(L) = \tilde{\alpha}(L)(1-L)\), where all roots of \(\tilde{\alpha}(x) = 0\) are outside the unit circle (\(|x|>1\)), the AR part of \(\Delta z_t\) is the stationary \(\tilde{\alpha}(L)\) (a quick numerical check of the root condition is sketched below)
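A minimal numerical check of the root condition (a sketch; the coefficient values below are illustrative, not from the text):
ar_test = np.r_[1, -0.5, -0.3]  # alpha(L) = 1 - 0.5L - 0.3L^2
ma_test = np.r_[1, 0.4]         # beta(L) = 1 + 0.4L
proc = ArmaProcess(ar_test, ma_test)
print(proc.arroots)                      # roots of alpha(x) = 0
print(np.all(np.abs(proc.arroots) > 1))  # all outside the unit circle?
print(proc.isstationary)                 # statsmodels' built-in check

# a unit root case: alpha(L) = (1 - L)(1 - 0.5L) = 1 - 1.5L + 0.5L^2
unit = ArmaProcess(np.r_[1, -1.5, 0.5], ma_test)
print(unit.arroots)       # roots 1.0 and 2.0: one root on the unit circle
print(unit.isstationary)  # False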
a non-stationary process whose first difference is stationary is called integrated of order one (\(I(1)\))
why “integrated”: with \(u_t = \Delta z_t\),
\[ z_t = z_0 + \sum_{j=1}^{t} u_j \]
\(z_t\) is the sum (integral) of a process (is an “integrated” process) starting from some initialization \(z_0\)
\(\sum_{j=1}^{t} u_j\) is referred to as a stochastic trend
\(\Delta z_t\) is \(I(0)\) (stationary, no need to difference)
a non-stationary process which requires double differencing to be stationary is called integrated of order two (\(I(2)\))
if \(z_t\) is \(I(2)\) then \(\Delta^2 z_t = (1-L)^2 z_t\) is \(I(0)\)
etc. for \(I(d)\):
if \(z_t\) is \(I(d)\) then \(\Delta^d z_t\) is \(I(0)\)
\(d\) is the order of integration
if we difference a stationary process, we get another stationary process. However, no differencing was required to achieve stationarity
a stationary process whose cumulative sum is also stationary is called overdifferenced, and is denoted \(I(-1)\)
example: \(z_t = \Delta \varepsilon_t = (1-L)\varepsilon_t\)
\(z_t\) is stationary, but so is its cumulative sum \(\varepsilon_t\)
\(z_t\) is \(I(-1)\)
the MA part \((1-L)\) has a unit root, i.e. is non-invertible
another example: the first difference \(\Delta \nu_t\) of any stationary (\(I(0)\)) process \(\nu_t\)
is \(I(-1)\) (the simulation below illustrates these orders of integration)
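A minimal simulation of these orders of integration (illustrative, with standard normal \(u_t\)): cumulating white noise produces an \(I(1)\) stochastic trend, differencing once recovers an \(I(0)\) series, and differencing again overdifferences:
u = gen.normal(size=1000)

zi = np.cumsum(u)    # z_t = z_0 + sum of u_j: I(1), a stochastic trend
dzi = np.diff(zi)    # Delta z_t = u_t: I(0), stationary
ddzi = np.diff(dzi)  # Delta^2 z_t = (1 - L)u_t: I(-1), overdifferenced

# the variance of the level grows with t; that of the difference does not
print(zi[:500].var(), zi[500:].var())
print(dzi[:500].var(), dzi[500:].var())
# lag-1 autocorrelation of the overdifferenced series is near -0.5,
# the signature of a non-invertible MA(1) with a unit root
print(np.corrcoef(ddzi[:-1], ddzi[1:])[0, 1])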
linear time trend (deterministic trend):
\[ z_t = \delta_0 + \delta_1 t + \nu_t, \qquad \nu_t \text{ stationary} \]
\(z_t\) is called a trend-stationary process
the trend is deterministic in the sense that it is non-random
more generally:
\[ z_t = \mathbf{x}_t' \boldsymbol{\delta} + \nu_t \]
where \(\mathbf{x}_t\) are known constants and \(\nu_t\) is stationary
Gaussian MLE¶
stacking observations as \(\boldsymbol z = \boldsymbol X \boldsymbol\delta + \boldsymbol\nu\), and, if \(\nu_t\) is a stationary Gaussian time series process with autocovariances \(\gamma_j\),
we have
\[ \boldsymbol z \sim N\left(\boldsymbol X \boldsymbol\delta,\; \boldsymbol\Omega\right) \]
where \(\boldsymbol\Omega\) is the \(T \times T\) matrix with \(\Omega_{t,s} = \gamma_{|t-s|}\) and the \(t\)-th row of \(\boldsymbol X\) is \(\mathbf{x}_t'\)
for example, if \(z_t = \delta_0 + \delta_1 t + \nu_t\), the \(t\)-th row of \(\boldsymbol X\) is \([1, t]\)
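The formula above can be checked numerically. A minimal sketch (my construction, not from the text): build \(\boldsymbol\Omega\) from the ARMA autocovariances via arma_acovf and compute the GLS estimator \(\hat{\boldsymbol\delta} = (\boldsymbol X'\boldsymbol\Omega^{-1}\boldsymbol X)^{-1}\boldsymbol X'\boldsymbol\Omega^{-1}\boldsymbol z\), using the same ARMA(1,1) parameters as the simulation in the next section:
from scipy.linalg import toeplitz

Tg = 500
Xg = DeterministicProcess(pd.RangeIndex(0, Tg), constant=True, order=1).in_sample().values
nug = ArmaProcess(np.r_[1, -.8], np.r_[1, .1]).generate_sample(Tg, distrvs=gen.normal)
zg = Xg @ np.array([3, .1]) + nug

# Omega: T x T Toeplitz matrix of the autocovariances of nu_t
Omega = toeplitz(arma_acovf(np.r_[1, -.8], np.r_[1, .1], nobs=Tg))

Oinv_X = np.linalg.solve(Omega, Xg)  # Omega^{-1} X
delta_gls = np.linalg.solve(Xg.T @ Oinv_X, Oinv_X.T @ zg)
print(delta_gls)  # should be close to [3, 0.1]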
Models with time trend in Python¶
Generate \(\boldsymbol X\) for \(z_t = \delta_0 + \delta_1 t + \nu_t\):
a \(T \times 2\) matrix with \(t\)-th row given by \([1, t]\)
from statsmodels.tsa.deterministic import DeterministicProcess
T = 500
index = pd.RangeIndex(0, T)
det_proc = DeterministicProcess(index, constant=True, order=1)  # constant and a first-order (linear) trend
X = det_proc.in_sample()
X.head()
   const  trend
0    1.0    1.0
1    1.0    2.0
2    1.0    3.0
3    1.0    4.0
4    1.0    5.0
# constant and time trend part
delta = np.array([3, .1])
exog = X.values@delta
\(\nu_t\) as an ARMA(1,1) process
alpha = np.array([.8])
beta = np.array([0.1])
ar = np.r_[1, -alpha] # coefficient on z(t) and z(t-1)
ma = np.r_[1, beta] # coefficients on e(t) and e(t-1)
arma11_process = ArmaProcess(ar, ma)
nu = arma11_process.generate_sample(T, distrvs=gen.normal)
z = exog + nu
Option 1: estimate, indicating the constant and time trend with trend="ct"
arma_model = ARIMA(z, order=(1, 0, 1), trend="ct")
arma_results = arma_model.fit()
print(arma_results.summary())
SARIMAX Results
==============================================================================
Dep. Variable: y No. Observations: 500
Model: ARIMA(1, 0, 1) Log Likelihood -733.673
Date: Mon, 28 Mar 2022 AIC 1477.345
Time: 10:48:25 BIC 1498.418
Sample: 0 HQIC 1485.614
- 500
Covariance Type: opg
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
const 3.0500 0.512 5.954 0.000 2.046 4.054
x1 0.0992 0.002 57.573 0.000 0.096 0.103
ar.L1 0.7971 0.035 22.981 0.000 0.729 0.865
ma.L1 0.0650 0.057 1.142 0.253 -0.047 0.176
sigma2 1.1054 0.073 15.235 0.000 0.963 1.248
===================================================================================
Ljung-Box (L1) (Q): 0.00 Jarque-Bera (JB): 0.28
Prob(Q): 1.00 Prob(JB): 0.87
Heteroskedasticity (H): 1.20 Skew: -0.01
Prob(H) (two-sided): 0.24 Kurtosis: 2.89
===================================================================================
Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
Option 2: estimate, providing the constant and time trend as exogenous regressors (trend="n")
arma_model2 = ARIMA(z, exog=X, order=(1, 0, 1), trend="n")
arma_results2 = arma_model2.fit()
print(arma_results2.summary())
SARIMAX Results
==============================================================================
Dep. Variable: y No. Observations: 500
Model: ARIMA(1, 0, 1) Log Likelihood -733.673
Date: Mon, 28 Mar 2022 AIC 1477.345
Time: 10:48:29 BIC 1498.418
Sample: 0 HQIC 1485.614
- 500
Covariance Type: opg
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
const 3.0500 0.512 5.954 0.000 2.046 4.054
trend 0.0992 0.002 57.573 0.000 0.096 0.103
ar.L1 0.7971 0.035 22.981 0.000 0.729 0.865
ma.L1 0.0650 0.057 1.142 0.253 -0.047 0.176
sigma2 1.1054 0.073 15.235 0.000 0.963 1.248
===================================================================================
Ljung-Box (L1) (Q): 0.00 Jarque-Bera (JB): 0.28
Prob(Q): 1.00 Prob(JB): 0.87
Heteroskedasticity (H): 1.20 Skew: -0.01
Prob(H) (two-sided): 0.24 Kurtosis: 2.89
===================================================================================
Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
Difference-stationary vs trend-stationary¶
in \(z_t = \delta_0 + \delta_1 t + \nu_t\):
trend-stationary if \(\nu_t\) is stationary
difference-stationary if \(\nu_t\) is a unit root process
\(\nu_t\) is a unit root process if in
\[ \alpha(L) \nu_t = \beta(L) \varepsilon_t \]
one of the roots of \(\alpha(x) = 0\) is \(x=1\), and the other roots are \(|x|>1\)
then \(\alpha(L) = \tilde{\alpha}(L)(1-L)\), where all \(p\) roots of \(\tilde{\alpha}(x) = 0\) are \(|x|>1\)
\(\Delta \nu_t = (1-L)\nu_t\) is a stationary ARMA(p, q) process
Difference-stationary process¶
with \(z_t = \delta_0 + \delta_1 t + \nu_t\) and \(\nu_t\) a unit root process:
\(\nu_t\) is an ARIMA(p, 1, q) process
\(z_t\) is an ARIMAX(p, 1, q) process (ARIMA with an exogenous part: constant and time trend)
\(\Delta z_t\) is an ARMAX(p, q) process (ARMA with an exogenous part: the constant \(\delta_1\))
model \(z_t\) as a regression model with ARIMA errors
Difference between trend- and difference-stationary models¶
example: AR(1) errors around a linear trend
\[ z_t = \delta_0 + \delta_1 t + \nu_t, \qquad \nu_t = \alpha \nu_{t-1} + \varepsilon_t \]
at \(t+h\)
\[ z_{t+h} = \delta_0 + \delta_1 (t+h) + \alpha^h \nu_t + \sum_{j=0}^{h-1} \alpha^j \varepsilon_{t+h-j} \]
forecast at \(t\):
\[ \mathrm{E}_t\, z_{t+h} = \delta_0 + \delta_1 (t+h) + \alpha^h \nu_t \]
if \(|\alpha|<1\), as \(h\rightarrow \infty\), \(\alpha^h \rightarrow 0\)
For trend-stationary processes:
the long-run forecast is the unconditional mean of \(z_{t+h}\) (mean reversion)
the long-run forecast is independent of \(z_t\)
shocks to \(z_t\) have temporary impact (transitory)
if \(\alpha=1\),
\[ \mathrm{E}_t\, z_{t+h} = z_t + \delta_1 h \]
For difference-stationary processes:
the value of \(z_t\) has a permanent effect on all future forecasts
shocks to \(z_t\) have permanent effects
\(z_t\) is expected to grow by \(\delta_1\) every period
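A numerical illustration of these two cases (a sketch with made-up values \(\delta_0=3\), \(\delta_1=0.1\), \(\alpha=0.8\), and a current shock \(\nu_t=2.5\)):
delta0, delta1, alpha_ = 3.0, 0.1, 0.8
t, h = 100, np.arange(1, 51)
nu_t = 2.5                        # current deviation from the trend line
z_t = delta0 + delta1 * t + nu_t  # current value

# trend-stationary: E_t z_{t+h} = delta0 + delta1 (t+h) + alpha^h nu_t
fc_ts = delta0 + delta1 * (t + h) + alpha_**h * nu_t
# difference-stationary (alpha = 1): E_t z_{t+h} = z_t + delta1 h
fc_ds = z_t + delta1 * h

trend_line = delta0 + delta1 * (t + h)
print(fc_ts[-1] - trend_line[-1])  # ~0: the shock has died out
print(fc_ds[-1] - trend_line[-1])  # 2.5: the shock is permanent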
Martingale process¶
if for \(\{z_t\}_{-\infty}^{\infty}\)
\[ \mathrm{E}\left(z_{t+1} \mid z_t, z_{t-1}, \ldots\right) = z_t \]
\(z_t\) is a martingale process, and \(\Delta z_t\) is a martingale difference process
unlike a random walk, a martingale process allows for heteroskedasticity or conditional heteroskedasticity
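A sketch of such a process (illustrative ARCH(1)-type volatility, my parameter choices): the increments are conditionally heteroskedastic, yet \(\mathrm{E}(z_{t+1}\mid z_t, z_{t-1},\ldots) = z_t\) still holds:
Tm = 1000
e = gen.normal(size=Tm)
dz = np.zeros(Tm)
for t in range(1, Tm):
    sigma_t = np.sqrt(0.5 + 0.5 * dz[t - 1] ** 2)  # depends only on the past
    dz[t] = sigma_t * e[t]  # E_{t-1}(dz_t) = 0: a martingale difference
z_mart = np.cumsum(dz)      # E_t(z_{t+1}) = z_t: a martingale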
Beveridge-Nelson decomposition¶
Every difference-stationary process can be written as a sum of a random walk and a stationary component
the effect of innovations (shocks) can be decomposed into permanent and transitory effects
ARIMA(p,1,q):
\[ \alpha(L)\left(\Delta z_t - \delta_1\right) = \beta(L) \varepsilon_t \]
\(\alpha(L)\) is invertible. Why? (because \(\Delta z_t\) is stationary, all roots of \(\alpha(x)=0\) are outside the unit circle)
equivalently (by invertibility of \(\alpha(L)\))
\[ \Delta z_t = \delta_1 + \phi(L) \varepsilon_t, \qquad \phi(L) = \alpha(L)^{-1}\beta(L) \]
(a bit of magic here…)
Since 1 is a root of the polynomial \(\phi(L) - \phi(1)\), it can be written as
\[ \phi(L) - \phi(1) = (1-L)\phi^*(L), \qquad \phi^*_j = -\sum_{i=j+1}^{\infty} \phi_i \]
and
\[ \Delta z_t = \delta_1 + \phi(1)\varepsilon_t + (1-L)\phi^*(L)\varepsilon_t \]
denote \(z_t^p = (1-L)^{-1}\delta_1 + (1-L)^{-1}\phi(1)\varepsilon_t\)
then,
\[ z_t = z_t^p + z_t^s, \qquad z_t^s = \phi^*(L)\varepsilon_t \]
Therefore
\[ \Delta z_t^p = \delta_1 + \phi(1)\varepsilon_t \]
i.e. \(z_t^p\) is a random walk with drift
\(z^p_t\) - permanent component (trend)
\(z^s_t\) - transitory component (cycle)
\(\varepsilon_t\) has a permanent impact on \(z_t\) through \(z_t^p\)
\(\varepsilon_t\) has a transitory impact on \(z_t\) through \(z_t^s\)
relative importance of permanent/transitory impact depends on the value of \(\phi(1)\)
example ARIMA(0,1,1):
\[ \Delta z_t = \delta_1 + \varepsilon_t + \beta_1 \varepsilon_{t-1}, \qquad \phi(1) = 1 + \beta_1, \qquad z_t^s = -\beta_1 \varepsilon_t \]
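A numerical check of the ARIMA(0,1,1) case (a sketch with illustrative values \(\delta_1 = 0.1\), \(\beta_1 = 0.4\)); the decomposition is exact up to floating-point error:
Tb, delta1, beta1 = 500, 0.1, 0.4
e = np.r_[0.0, gen.normal(size=Tb)]  # e[0] = 0 initializes the MA term

dz = delta1 + e[1:] + beta1 * e[:-1]           # Delta z_t
z_bn = np.cumsum(dz)
z_p = np.cumsum(delta1 + (1 + beta1) * e[1:])  # random walk with drift
z_s = -beta1 * e[1:]                           # stationary (transitory) part

print(np.max(np.abs(z_bn - (z_p + z_s))))  # ~0: z_t = z_t^p + z_t^s exactly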
Random walk with a drift model:¶
\[ z_t = \delta_1 + z_{t-1} + \varepsilon_t \]
Beveridge-Nelson decomposition¶
Every difference-stationary process can be written as a sum of a random walk and a stationary component
many macro variables are well represented by ARIMA(p,1,q) models
every ARIMA(p,1,q) model has a random walk stochastic trend
the growth in macro variables can be characterized by stochastic trends
Forecasting integrated variables¶
if \(z_t\) is \(I(1)\), we estimate and forecast using \(\Delta z_t\)
from the forecasts of \(\Delta z_{t+i}\), \(i=1, 2, \cdots, h\), the level forecast is recovered by cumulating:
\[ \hat{z}_{t+h} = z_t + \sum_{i=1}^{h} \widehat{\Delta z}_{t+i} \]
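A minimal sketch of both routes on a simulated random walk with drift (the specifics are illustrative): model \(\Delta z_t\) and cumulate the forecasts, or let ARIMA with \(d=1\) handle the differencing:
zrw = np.cumsum(0.1 + gen.normal(size=300))  # simulated random walk with drift

# Option A: ARMA with a constant on the differences, then cumulate
res_d = ARIMA(np.diff(zrw), order=(1, 0, 0), trend="c").fit()
fc_levels = zrw[-1] + np.cumsum(res_d.forecast(steps=8))

# Option B: let ARIMA difference internally (d=1; trend="t" in levels
# corresponds to a constant drift in the differences)
res_l = ARIMA(zrw, order=(1, 1, 0), trend="t").fit()

print(fc_levels)
print(res_l.forecast(steps=8))  # should be very close to Option A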
Detecting stochastic trends by unit root test¶
Unit root tests are one-sided tests:
\(H_0\): unit root
\(H_1\): stationary
Augmented Dickey–Fuller test¶
Consider an AR(2) process:
\[ z_t = \alpha_1 z_{t-1} + \alpha_2 z_{t-2} + \varepsilon_t \]
re-write as
\[ \Delta z_t = \rho z_{t-1} + \gamma_1 \Delta z_{t-1} + \varepsilon_t \]
and
\[ \rho = \alpha_1 + \alpha_2 - 1, \qquad \gamma_1 = -\alpha_2 \]
For an AR(2) process, unit root means that \(x=1\) is a solution to
\[ 1 - \alpha_1 x - \alpha_2 x^2 = 0 \]
i.e.
\[ 1 - \alpha_1 - \alpha_2 = 0, \quad \text{i.e.} \quad \rho = 0 \]
Therefore, testing for unit root is equivalent to testing that \(\rho=0\) in the re-written equation.
In a general AR(p) model for \(z_t\), with
\[ \Delta z_t = \rho z_{t-1} + \sum_{j=1}^{p-1} \gamma_j \Delta z_{t-j} + \varepsilon_t \qquad (5) \]
testing for unit root is equivalent to testing that \(\rho=0\)
in (5) the \(H_1\) is that \(z_t\) is a stationary mean-0 AR(p) process
to test for unit root against \(H_1\) that \(z_t\) is stationary around a constant (\(\neq 0\)) mean, estimate
\[ \Delta z_t = c + \rho z_{t-1} + \sum_{j=1}^{p-1} \gamma_j \Delta z_{t-j} + \varepsilon_t \]
to test for unit root against \(H_1\) that \(z_t\) is stationary around a deterministic linear time trend (trend-stationary), estimate
\[ \Delta z_t = c + \delta t + \rho z_{t-1} + \sum_{j=1}^{p-1} \gamma_j \Delta z_{t-j} + \varepsilon_t \]
appropriate for series that exhibit growth over the long run
The test for unit root is a test of \(\rho=0\) (unit root) against the alternative of \(\rho<0\) (stationarity)
The asymptotic distribution of the test statistic for \(\rho\) is non-standard.
The critical values of the test statistic are obtained by Monte Carlo simulations, and depend on whether a constant and time trend are included (see the sketch below)
The null hypothesis of a unit root is rejected when the value of the test statistic is below the critical value.
If the null hypothesis cannot be rejected, \(z_t\) should be differenced prior to estimation
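A small Monte Carlo in the spirit of how these critical values are obtained (a sketch for the no-constant, no-trend case): simulate random walks, regress \(\Delta z_t\) on \(z_{t-1}\), and inspect the quantiles of the \(t\)-statistic of \(\rho\):
Tmc, reps = 250, 2000
tstats = np.empty(reps)
for r in range(reps):
    zsim = np.cumsum(gen.normal(size=Tmc))        # random walk: rho = 0 holds
    res = sm.OLS(np.diff(zsim), zsim[:-1]).fit()  # Delta z_t on z_{t-1}
    tstats[r] = res.tvalues[0]

# well to the left of the normal quantiles (-2.33, -1.65, -1.28)
print(np.quantile(tstats, [0.01, 0.05, 0.10]))  # roughly -2.6, -1.9, -1.6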
ADF test in Python¶
df = pd.read_stata(r"C:\Users\eeu227\Documents\PROJECTS\econ108-repos\econ108-practice\EconometricsData\FRED-QD\FRED-QD.dta")
rgdp = df[['time', 'gdpc1']].set_index('time')['1961q1':].rename(columns={'gdpc1': 'rGDP'})
rgdp = np.log(rgdp)
df = pd.read_stata(r"C:\Users\eeu227\Documents\PROJECTS\econ108-repos\econ108-practice\EconometricsData\FRED-MD\FRED-MD.dta")
unemp = df[['time', 'unrate']].set_index('time')['1961-01':]
df = pd.read_stata(r"C:\Users\eeu227\Documents\PROJECTS\econ108-repos\econ108-practice\EconometricsData\FRED-MD\FRED-MD.dta")
cpi = np.log(df[['time', 'cpiaucsl']].set_index('time')['1961-01':])
dcpi = (cpi.diff(12)*100).dropna()
Real GDP¶
#fig = rgdp.plot(figsize=(12,4), title='log Real GDP, US', legend=False)
#plt.savefig('images/rGDPus.png')
from statsmodels.tsa.stattools import adfuller
def get_adfresults(dftest):
dfoutput = pd.Series(
dftest[0:4],
index=[
"Test Statistic",
"p-value",
"#Lags Used",
"Number of Observations Used",
],
)
for key, value in dftest[4].items():
dfoutput[f"Critical Value ({key})"] = value
return dfoutput
ADF with constant and trend, 4 lags
dftest = adfuller(rgdp, maxlag=4, autolag=None, regression='ct')
dftest
(-2.0141544954585036,
0.5937024444875985,
4,
223,
{'1%': -3.9999506167815206,
'5%': -3.4303636888103926,
'10%': -3.138725564735756})
get_adfresults(dftest)
Test Statistic -2.014154
p-value 0.593702
#Lags Used 4.000000
Number of Observations Used 223.000000
Critical Value (1%) -3.999951
Critical Value (5%) -3.430364
Critical Value (10%) -3.138726
dtype: float64
Automatically determine optimal number of lags using AIC
dftest = adfuller(rgdp, autolag="AIC", regression='ct')
get_adfresults(dftest)
Test Statistic -2.157741
p-value 0.513651
#Lags Used 2.000000
Number of Observations Used 225.000000
Critical Value (1%) -3.999579
Critical Value (5%) -3.430185
Critical Value (10%) -3.138621
dtype: float64
Unemployment Rate¶
#fig = unemp.plot(figsize=(12,4), title='Unemployment rate, US', legend=False)
#plt.savefig('images/UNus.png')
dftest = adfuller(unemp, autolag='AIC', regression='c')
get_adfresults(dftest)
Test Statistic -2.996063
p-value 0.035264
#Lags Used 12.000000
Number of Observations Used 671.000000
Critical Value (1%) -3.440133
Critical Value (5%) -2.865857
Critical Value (10%) -2.569069
dtype: float64
Consumer Price Inflation¶
#fig = dcpi.plot(figsize=(12,4), title='Inflation rate, US', legend=False)
#plt.savefig('images/dCPIus.png')
dftest = adfuller(dcpi, autolag='AIC', regression='c')
get_adfresults(dftest)
Test Statistic -2.804916
p-value 0.057575
#Lags Used 15.000000
Number of Observations Used 656.000000
Critical Value (1%) -3.440358
Critical Value (5%) -2.865956
Critical Value (10%) -2.569122
dtype: float64
ADF test using the arch package¶
from arch.unitroot import ADF
adf = ADF(rgdp, trend="ct")
adf.summary()
Test Statistic   -2.158
P-value           0.514
Lags                  2
Trend: Constant and Linear Time Trend
Critical Values: -4.00 (1%), -3.43 (5%), -3.14 (10%)
Null Hypothesis: The process contains a unit root.
Alternative Hypothesis: The process is weakly stationary.
Examine the regression results¶
reg_res = adf.regression
print(reg_res.summary())
OLS Regression Results
==============================================================================
Dep. Variable: y R-squared: 0.176
Model: OLS Adj. R-squared: 0.161
Method: Least Squares F-statistic: 11.76
Date: Mon, 28 Mar 2022 Prob (F-statistic): 1.13e-08
Time: 16:43:14 Log-Likelihood: 786.27
No. Observations: 225 AIC: -1563.
Df Residuals: 220 BIC: -1545.
Df Model: 4
Covariance Type: nonrobust
==============================================================================
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
Level.L1 -0.0217 0.010 -2.158 0.032 -0.041 -0.002
Diff.L1 0.2365 0.065 3.611 0.000 0.107 0.366
Diff.L2 0.1917 0.066 2.923 0.004 0.062 0.321
const 0.1848 0.083 2.231 0.027 0.022 0.348
trend 0.0001 7.5e-05 1.935 0.054 -2.67e-06 0.000
==============================================================================
Omnibus: 17.613 Durbin-Watson: 1.994
Prob(Omnibus): 0.000 Jarque-Bera (JB): 56.070
Skew: -0.116 Prob(JB): 6.68e-13
Kurtosis: 5.435 Cond. No. 2.21e+04
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 2.21e+04. This might indicate that there are
strong multicollinearity or other numerical problems.
Examine the regression residuals¶
resids = pd.DataFrame(reg_res.resid)
resids.index = rgdp.index[3:]
resids.columns = ["resids"]
fig = resids.plot(figsize=(12,4))
fig, ax = plt.subplots(figsize=(12,4))
fig = sm.graphics.tsa.plot_acf(reg_res.resid, lags=23, ax=ax)
ax.set_ylim([-0.3, 1.1])
(-0.3, 1.1)