Multivariate time series
Definition: A time series process is a sequence of random vectors indexed by time:
\[
\{\mathbf{z}_t: t = \ldots, -2, -1, 0, 1, 2, \ldots \} = \{\mathbf{z}_t \}_{t=-\infty}^{\infty} \tag{1}
\]
where \(\mathbf{z}_t\) is an \(n\)-dimensional random vector, \(n \geq 1\)
Stationarity
Definition: The process \(\{\mathbf{z}_t\}_{t=-\infty}^{\infty}\) is covariance stationary if the first two moments of the joint distribution exist and are time-invariant. In general, the mean and autocovariance are
\[\begin{split}
\operatorname{E}\mathbf{z}_t = \mathbf{\mu}_t = \begin{bmatrix} \mu_{1t} \\ \mu_{2t} \\\vdots \\\mu_{nt} \end{bmatrix}
\end{split}\]
\[\begin{split}
\begin{align}
\operatorname{cov}(\mathbf{z}_t, \mathbf{z}_{t-k}) & = \operatorname{E}(\mathbf{z}_t - \mathbf{\mu}_t)(\mathbf{z}_{t-k} - \mathbf{\mu}_{t-k} )'\\\\ & = \Gamma(t, t-k) =
\begin{pmatrix}
\gamma_{11}(t, t-k) & \cdots & \gamma_{1n}(t, t-k)\\
\vdots & \ddots & \vdots\\
\gamma_{n1}(t, t-k) & \cdots & \gamma_{nn}(t, t-k)\\
\end{pmatrix}
\end{align}
\end{split}\]
Covariance stationarity requires that
\[\begin{split}
\begin{align}
\mathbf{\mu}_t & = \mathbf{\mu}
\\
\Gamma(t, t-k) & = \Gamma(k)
\end{align}
\end{split}\]
are not functions of \(t\)
Note:
\[
\Gamma(k) = \operatorname{cov}(\mathbf{z}_t, \mathbf{z}_{t-k}) = \operatorname{cov}(\mathbf{z}_{t+k}, \mathbf{z}_{t}) \neq
\operatorname{cov}(\mathbf{z}_{t}, \mathbf{z}_{t+k}) = \Gamma(-k)
\]
\(\Gamma(k)\) is not symmetric (unless \(k=0\))
but since \(\operatorname{cov}(\mathbf{z}_{t+k}, \mathbf{z}_{t})' = \operatorname{cov}(\mathbf{z}_{t}, \mathbf{z}_{t+k})\)
\[
\Gamma(k) = \Gamma(-k)'
\]
This follows from (setting \(\mathbf{\mu}=0\) w.l.o.g.)
\[\Gamma(k) =\operatorname{cov}(\mathbf{z}_{t+k}, \mathbf{z}_{t}) = \operatorname{E}(\mathbf{z}_{t+k} \mathbf{z}_{t}')= \operatorname{E}(\mathbf{z}_{t}\mathbf{z}_{t+k}')' = \left(\operatorname{cov}(\mathbf{z}_{t}, \mathbf{z}_{t+k}) \right)'=\Gamma(-k)'
\]
If
\[\begin{split}\mathbf{Z}_T = \begin{bmatrix}\mathbf{z}_{1}\\ \mathbf{z}_{2}\\ \vdots\\ \mathbf{z}_{T}\end{bmatrix}\end{split}\]
then
\[\begin{split}\operatorname{E}(\mathbf{Z}_T) = \begin{bmatrix}\mathbf{\mu}\\ \mathbf{\mu}\\ \vdots\\ \mathbf{\mu}\end{bmatrix},\;\;\;\;\;
\operatorname{cov}(\mathbf{Z}_T) =
\begin{pmatrix}
\Gamma(0) & \Gamma(1)' & \cdots & \Gamma(T-1)'\\
\Gamma(1) & \Gamma(0) & \cdots & \Gamma(T-2)'\\
\vdots & \vdots & \ddots & \vdots\\
\Gamma(T-1) & \Gamma(T-2) & \cdots & \Gamma(0)\\
\end{pmatrix} \;\;\; (\text{symmetric block Toeplitz matrix})
\end{split}\]
If the process is Gaussian,
\[ \mathbf{Z}_T \sim \mathcal{N} \left(\boldsymbol \mu, \boldsymbol \Sigma \right) \]
i.e.
\[\begin{split} \begin{bmatrix}\mathbf{z}_{1}\\ \mathbf{z}_{2}\\ \vdots\\ \mathbf{z}_{T}\end{bmatrix}
\sim \mathcal{N} \left(
\begin{bmatrix}\mathbf{\mu}\\ \mathbf{\mu}\\ \vdots\\ \mathbf{\mu}\end{bmatrix},
\begin{pmatrix}
\Gamma(0) & \Gamma(1)' & \cdots & \Gamma(T-1)'\\
\Gamma(1) & \Gamma(0) & \cdots & \Gamma(T-2)'\\
\vdots & \vdots & \ddots & \vdots\\
\Gamma(T-1) & \Gamma(T-2) & \cdots & \Gamma(0)\\
\end{pmatrix}
\right)
\end{split}\]
Number of unique parameters
\[\begin{split}
\begin{align}
\Gamma(0) & : n(n+1)/2 \\
&+\\
\Gamma(1) & : n^2 \\
&+\\
& \vdots \\
&+\\
\Gamma(T-1) & : n^2
\end{align}
\end{split}\]
with \(n = 5\), \(T = 200\): \(15 + 199 \times 25 = 4990\) parameters
Time series models allow us to represent temporal (inter)dependence parsimoniously: by imposing restrictions they reduce the number of unique parameters, making estimation feasible.
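The parameter count above can be checked with a quick computation (a minimal sketch; the function name is ours):

```python
# Unique second-moment parameters of an unrestricted covariance-stationary
# n-dimensional process observed for T periods.
def n_unique_autocov_params(n, T):
    # Gamma(0) is symmetric: n(n+1)/2 unique elements;
    # Gamma(1), ..., Gamma(T-1) are unrestricted: n^2 elements each
    return n * (n + 1) // 2 + (T - 1) * n ** 2

print(n_unique_autocov_params(5, 200))  # 4990
```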
VARMA(p, q) model
\[\begin{split}
\begin{align}
A(L) \mathbf{z}_t &= B(L) \boldsymbol \varepsilon_t, \;\;\;\;\; \boldsymbol \varepsilon_t \sim \operatorname{WN} \left( 0, \;\mathbf{\Sigma}\right) \;\;\; (\text{vector white noise, i.e. } \operatorname{E}(\boldsymbol \varepsilon_t \boldsymbol \varepsilon_{t-k}') = 0 \text{ for } k \neq 0)\\\\
A(L) &= I - A_1 L - \cdots - A_p L^p \\
B(L) &= I + B_1 L + \cdots + B_q L^q \\
\end{align}
\end{split}\]
i.e. the innovations have zero auto- and cross-autocorrelations at all leads and lags
\[\begin{split}
A_i =
\begin{pmatrix}
a_{11, i} & \cdots & a_{1n, i}\\
\vdots & \ddots & \vdots\\
a_{n1, i} & \cdots & a_{nn, i}\\
\end{pmatrix}, \;\;\;
B_j =
\begin{pmatrix}
b_{11, j} & \cdots & b_{1n, j}\\
\vdots & \ddots & \vdots\\
b_{n1, j} & \cdots & b_{nn, j}\\
\end{pmatrix}
\end{split}\]
VAR(p) model
\[\begin{split}
\begin{align}
\mathbf{z}_t = A_1 \mathbf{z}_{t-1} + \cdots + A_p \mathbf{z}_{t-p} + \boldsymbol \varepsilon_t, \;\;\;\;\; \boldsymbol \varepsilon_t \sim \operatorname{WN} \left( 0, \;\mathbf{\Sigma}\right)\\\\
\end{align}
\end{split}\]
\[\begin{split}
\begin{align}
A(L) \mathbf{z}_t &= \boldsymbol \varepsilon_t, \;\;\;\;\; \boldsymbol \varepsilon_t \sim \operatorname{WN} \left( 0, \;\mathbf{\Sigma}\right)\\\\
A(L) = I & - A_1 L - \cdots - A_p L^p \\
\end{align}
\end{split}\]
\(\mathbf{\Sigma}_{ij}\) captures all contemporaneous (time \(t\)) relationships between \(z_i\) and \(z_j\)
\(\{A_k\}_{ij}\) captures all dynamic interactions between \(z_{it}\) and \(z_{j, t-k}\)
Stationarity
A VAR(p) process is stationary if all roots of the equation
\[| A(x)| = | I - A_1 x - \cdots - A_p x^p|=0\]
are outside the unit circle (\(|x|>1\))
VAR(1) representation of a VAR(p) process
\[
\mathbf{Z}_{t} = \boldsymbol \Phi \mathbf{Z}_{t-1} + \boldsymbol E_t
\]
\[\begin{split}
\underset{\mathbf{Z}_t}{\underbrace{\left[\begin{array}{c}
\mathbf{z}_{t}\\
\mathbf{z}_{t-1}\\
\vdots\\
\mathbf{z}_{t-p+1}
\end{array}\right]}}
=
\underset{ \boldsymbol \Phi}{\underbrace{\left[\begin{array}{cccccccc}
A_1 & A_2 & \cdots & A_{p-1} & A_p\\
I & 0 & \cdots & 0 & 0\\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & I & 0
\end{array}\right]}}
\underset{\mathbf{Z}_{t-1}}{\underbrace{\left[\begin{array}{c}
\mathbf{z}_{t-1}\\
\mathbf{z}_{t-2}\\
\vdots\\
\mathbf{z}_{t-p}
\end{array}\right]}}
+
\underset{ \boldsymbol E_{t}}{\underbrace{\left[\begin{array}{c}
\varepsilon_{t}\\
0\\
\vdots\\
0
\end{array}\right]}}
\end{split}\]
\(\mathbf{Z}_{t}\) is stationary if all roots \(x\) of
\[|\boldsymbol I - \boldsymbol \Phi x|=0\]
are outside the unit circle (\(|x|>1\)), which is equivalent to all solutions of
\[|\boldsymbol I \lambda - \boldsymbol \Phi|=0\]
being \(|\lambda|<1\), i.e. all eigenvalues of \(\boldsymbol \Phi\) being less than 1 in absolute value.
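The eigenvalue condition above is easy to check numerically. A minimal sketch (function names and the bivariate VAR(2) coefficients are ours, chosen for illustration):

```python
import numpy as np

def companion(A_list):
    """Stack VAR(p) coefficient matrices [A_1, ..., A_p] into the
    (np x np) companion matrix Phi of the VAR(1) representation."""
    n, p = A_list[0].shape[0], len(A_list)
    Phi = np.zeros((n * p, n * p))
    Phi[:n, :] = np.hstack(A_list)
    Phi[n:, :-n] = np.eye(n * (p - 1))
    return Phi

def is_stationary(A_list):
    # stationary iff all eigenvalues of Phi lie strictly inside the unit circle
    return bool(np.all(np.abs(np.linalg.eigvals(companion(A_list))) < 1))

# hypothetical bivariate VAR(2)
A1 = np.array([[0.5, 0.1], [0.0, 0.4]])
A2 = np.array([[0.2, 0.0], [0.1, 0.1]])
print(is_stationary([A1, A2]))  # True
```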
Eigenvalue decomposition
from
\[ A V = V \Lambda \]
we have
\[\begin{split}
\begin{align}
A & = V \Lambda V^{-1}\\
A^2 & = A A = V \Lambda V^{-1} V \Lambda V^{-1} = V \Lambda^2 V^{-1}\\
A^h & = V \Lambda^h V^{-1}\\
\end{align}
\end{split}\]
if for all eigenvalues \(|\lambda_i| < 1\),
\[ \Lambda^h \longrightarrow 0 \;\; \text{ and } \;\; A^h \longrightarrow 0 \;\;\; \text{as } h \longrightarrow \infty\]
VAR(1) process
\[
\mathbf{z}_{t} = A \mathbf{z}_{t-1} + \boldsymbol \varepsilon_{t}
\]
From VAR parameters to moments of \(\mathbf{z}_t\)
\[\begin{split}
\begin{align}
\operatorname{E} \mathbf{z}_{t} &= A \operatorname{E} \mathbf{z}_{t-1} + \operatorname{E} \boldsymbol \varepsilon_{t}\\
\mathbf{\mu} & = A \mathbf{\mu} \;\; \Rightarrow \;\; \mathbf{\mu} = 0
\end{align}
\end{split}\]
(by stationarity \(\operatorname{E} \mathbf{z}_{t} = \operatorname{E} \mathbf{z}_{t-1} = \mathbf{\mu}\), and \(I - A\) is invertible since no eigenvalue of \(A\) equals 1)
\[\mathbf{z}_{t} = A \mathbf{z}_{t-1} + \boldsymbol \varepsilon_{t} \;\; \Rightarrow \;\; \mathbf{z}_{t}\mathbf{z}_{t}^{\prime} = \left(A \mathbf{z}_{t-1} + \boldsymbol \varepsilon_{t}\right)\left(A \mathbf{z}_{t-1} + \boldsymbol \varepsilon_{t}\right)^{\prime}\]
Taking expectations, since \(\operatorname{E}(\mathbf{z}_{t-1} \boldsymbol \varepsilon_{t}') = 0 \)
\[
\operatorname{E}(\mathbf{z}_{t} \mathbf{z}'_{t}) = A \operatorname{E}(\mathbf{z}_{t-1} \mathbf{z}^{\prime}_{t-1}) A' + \operatorname{E}( \varepsilon_{t} \varepsilon_{t}')
\]
or
\[\begin{split}
\begin{align}
\Gamma(0) =\; & A \Gamma(0) A ' + \Sigma\\\\
\;\;\; &\text{and}\\\\
\operatorname{vec}\left(\Gamma(0) \right) =\; & \left( I - A \otimes A \right)^{-1} \operatorname{vec}( \Sigma)\\
\end{align}
\end{split}\]
\[
\operatorname{E}(\mathbf{z}_{t} \mathbf{z}'_{t-1}) = A \operatorname{E}(\mathbf{z}_{t-1} \mathbf{z}^{\prime}_{t-1}) + \operatorname{E}( \varepsilon_{t} \mathbf{z}_{t-1}')
\]
\[ \Gamma(1) = A \Gamma(0) \]
\[ \Gamma(2) = A \Gamma(1) = A^2 \Gamma(0) \]
\[ \Gamma(h) = A^h \Gamma(0) \]
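The vec formula and the recursion \(\Gamma(h) = A^h \Gamma(0)\) can be sketched directly (the function name and the parameter values are ours; `order="F"` implements column-stacking, which is what \(\operatorname{vec}\) means here):

```python
import numpy as np

# Moments of a VAR(1) from its parameters:
# vec(Gamma(0)) = (I - A kron A)^{-1} vec(Sigma), then Gamma(h) = A^h Gamma(0).
def var1_autocov(A, Sigma, h_max):
    n = A.shape[0]
    vec_g0 = np.linalg.solve(np.eye(n**2) - np.kron(A, A),
                             Sigma.reshape(-1, order="F"))  # column-stacking vec
    G0 = vec_g0.reshape(n, n, order="F")
    gammas = [G0]
    for _ in range(h_max):
        gammas.append(A @ gammas[-1])
    return gammas

A = np.array([[0.5, 0.1], [0.2, 0.3]])
Sigma = np.array([[1.0, 0.3], [0.3, 1.0]])
G = var1_autocov(A, Sigma, 2)
print(np.allclose(G[0], A @ G[0] @ A.T + Sigma))  # Gamma(0) solves the Lyapunov equation
```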
From moments of \(\mathbf{z}_t\) to VAR parameters
\[\begin{split}
\begin{align}
\Gamma(1) = A \Gamma(0) \;\;\; & \Rightarrow \;\;\; A = \Gamma(1)\Gamma(0)^{-1}
\\\\
\Gamma(0) = A \Gamma(0) A ' + \Sigma \;\;\; & \Rightarrow \;\;\; \Sigma = \Gamma(0)-\Gamma(1)\Gamma(0)^{-1} \Gamma(1)'
\end{align}
\end{split}\]
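The inverse mapping can be verified with a round-trip check (a sketch; the hypothetical parameters and function name are ours):

```python
import numpy as np

# Recover VAR(1) parameters from the first two autocovariances.
def var1_from_moments(G0, G1):
    A = G1 @ np.linalg.inv(G0)   # A = Gamma(1) Gamma(0)^{-1}
    Sigma = G0 - A @ G1.T        # = Gamma(0) - Gamma(1) Gamma(0)^{-1} Gamma(1)'
    return A, Sigma

# generate Gamma(0), Gamma(1) from known parameters, then recover them
A = np.array([[0.5, 0.1], [0.2, 0.3]])
Sigma = np.array([[1.0, 0.3], [0.3, 1.0]])
n = A.shape[0]
G0 = np.linalg.solve(np.eye(n**2) - np.kron(A, A),
                     Sigma.reshape(-1, order="F")).reshape(n, n, order="F")
G1 = A @ G0
A_hat, Sigma_hat = var1_from_moments(G0, G1)
print(np.allclose(A_hat, A), np.allclose(Sigma_hat, Sigma))  # True True
```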
Non-zero mean \(\operatorname{E} \mathbf{z}_t = \mathbf{\mu} \neq 0\)
\[\begin{split}
\begin{align}
\mathbf{z}_{t} & = \boldsymbol a_0 + A \mathbf{z}_{t-1} + \varepsilon_{t}\\\\
\mathbf{\mu} & = \boldsymbol a_0 + A \mathbf{\mu}\\\\
\mathbf{\mu} & = \left( I - A \right)^{-1} \boldsymbol a_0
\end{align}
\end{split}\]
and the demeaned process \(\bar{\mathbf{z}}_t = \mathbf{z}_t - \mathbf{\mu}\) satisfies
\[\operatorname{E} \bar{\mathbf{z}}_t = \operatorname{E} (\mathbf{z}_t- \mathbf{\mu}) = 0\]
Note: for the moments of a VAR(p) model, use the VAR(1) representation and apply the selection matrix
\[\boldsymbol s = \left[ I, 0, \cdots, 0 \right] \]
to obtain the autocovariances of \(\mathbf{z}_{t}\) from the autocovariances of \(\mathbf{Z}_{t}\)
since
\[ \mathbf{z}_{t} = \boldsymbol s \mathbf{Z}_{t}\]
\( \operatorname{E} \mathbf{z}_{t} = \boldsymbol s \operatorname{E} \mathbf{Z}_{t}\)
\( \operatorname{var} (\mathbf{z}_{t}) = \boldsymbol s \operatorname{var} (\mathbf{Z}_{t}) \boldsymbol s'\)
\( \operatorname{cov} (\mathbf{z}_{t}, \mathbf{z}_{t-k}) = \boldsymbol s \operatorname{cov} (\mathbf{Z}_{t}, \mathbf{Z}_{t-k}) \boldsymbol s'\)
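The companion-form route to VAR(p) autocovariances can be sketched as follows (the function name and the bivariate VAR(2) parameters are ours; the selection step is taking the top-left \(n \times n\) block):

```python
import numpy as np

# Autocovariances of a VAR(p) via its VAR(1) companion form:
# solve the Lyapunov equation for Z_t, then select the top-left n x n blocks.
def varp_autocov(A_list, Sigma, h_max):
    n, p = A_list[0].shape[0], len(A_list)
    m = n * p
    Phi = np.zeros((m, m))
    Phi[:n, :] = np.hstack(A_list)
    Phi[n:, :-n] = np.eye(n * (p - 1))
    Sig_E = np.zeros((m, m))
    Sig_E[:n, :n] = Sigma                      # cov of E_t = [eps_t', 0, ..., 0]'
    G0_big = np.linalg.solve(np.eye(m**2) - np.kron(Phi, Phi),
                             Sig_E.reshape(-1, order="F")).reshape(m, m, order="F")
    gammas, Gh_big = [G0_big[:n, :n]], G0_big
    for _ in range(h_max):
        Gh_big = Phi @ Gh_big                  # Gamma_Z(h) = Phi Gamma_Z(h-1)
        gammas.append(Gh_big[:n, :n])
    return gammas

# hypothetical bivariate VAR(2); check the first Yule-Walker equation
A1 = np.array([[0.5, 0.1], [0.0, 0.4]])
A2 = np.array([[0.2, 0.0], [0.1, 0.1]])
Sigma = np.eye(2)
G = varp_autocov([A1, A2], Sigma, 2)
print(np.allclose(G[1], A1 @ G[0] + A2 @ G[1].T))  # Gamma(1) = A1 Gamma(0) + A2 Gamma(1)'
```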
Estimation
We can write the VAR(p)
\[
\begin{align}
\mathbf{z}_t = \boldsymbol a_0 + A_1 \mathbf{z}_{t-1} + \cdots + A_p \mathbf{z}_{t-p} + \boldsymbol \varepsilon_t, \;\;\;\;\; \boldsymbol \varepsilon_t \sim \operatorname{WN} \left( 0, \;\mathbf{\Sigma}\right)
\end{align}
\]
as
\[\mathbf{z}_t = \boldsymbol A \boldsymbol x_{t-1} + \boldsymbol \varepsilon_t \]
where \(\boldsymbol A = [\boldsymbol a_0, A_1, \cdots, A_p ]\), and \(\boldsymbol x_{t-1} = [1, \mathbf{z}_{t-1}', \cdots, \mathbf{z}_{t-p}' ]'\)
Assume a pre-sample \(\mathbf{z}_{0}, \mathbf{z}_{-1}, \cdots, \mathbf{z}_{-p+1}\) is given (alternatively, re-define \(T\)).
Then, we have
\[\boldsymbol Z = \boldsymbol A \boldsymbol X + \boldsymbol U \]
where
\(\boldsymbol Z = [\mathbf{z}_{1}, \cdots, \mathbf{z}_T] \) is \(n \times T\)
\(\boldsymbol X = [\mathbf{x}_{0}, \cdots, \mathbf{x}_{T-1}] \) is \((np+1) \times T\)
\(\boldsymbol U = [\boldsymbol \varepsilon_{1}, \cdots, \boldsymbol \varepsilon_T] \) is \(n \times T\)
OLS estimation
\[\begin{split}
\begin{align}
\hat{\boldsymbol A} &= \boldsymbol Z \boldsymbol X' (\boldsymbol X \boldsymbol X')^{-1} \\\\
\hat{\boldsymbol \Sigma} & = \frac{1}{T - np - 1} \hat{\boldsymbol U} \hat{\boldsymbol U}' \\\\
\hat{\boldsymbol U} & = \boldsymbol Z - \hat{\boldsymbol A} \boldsymbol X
\end{align}
\end{split}\]
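The OLS formulas above can be exercised on simulated data (a sketch; all parameter values are hypothetical and chosen for illustration):

```python
import numpy as np

# Simulate a bivariate VAR(1) with intercept and estimate it by OLS,
# as in Z = A X + U above.
rng = np.random.default_rng(0)
n, p, T = 2, 1, 5000
a0 = np.array([0.5, -0.2])
A1 = np.array([[0.5, 0.1], [0.2, 0.3]])

z = np.zeros((n, T + 1))                      # z[:, 0] is the pre-sample value
for t in range(1, T + 1):
    z[:, t] = a0 + A1 @ z[:, t - 1] + rng.standard_normal(n)

Z = z[:, 1:]                                  # n x T
X = np.vstack([np.ones(T), z[:, :-1]])        # (np + 1) x T, x_{t-1} = [1, z_{t-1}']'
A_hat = Z @ X.T @ np.linalg.inv(X @ X.T)      # [a0_hat, A1_hat], n x (np + 1)
U_hat = Z - A_hat @ X
Sigma_hat = U_hat @ U_hat.T / (T - n * p - 1)
print(np.round(A_hat, 2))
```

With \(T = 5000\) the estimates land close to the true \(\boldsymbol a_0\) and \(A_1\), and \(\hat{\boldsymbol\Sigma}\) close to the identity used for the innovations.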
Asymptotic distribution
\[
\operatorname{vec}\left(\hat{\boldsymbol A} \right) \overset{a}{\sim} \mathcal{N} \left( \operatorname{vec}\left(\boldsymbol A \right),\; (\boldsymbol X \boldsymbol X')^{-1} \otimes \hat{\boldsymbol \Sigma} \right)
\]
Notes:
the rows of \(\boldsymbol A\) can be estimated with OLS equation by equation
also equivalent to conditional MLE, assuming that \(\boldsymbol \varepsilon_t \sim \mathcal{N} \left( 0, \;\mathbf{\Sigma}\right)\)
\(\hat{\boldsymbol A}\) has a small-sample bias, which can be corrected for analytically (when the VAR has only an intercept) or using bootstrap (when a deterministic trend is included)
Choice of \(p\)
define a set of candidate models by selecting \(p_{min}\) and \(p_{max}\)
estimate each one and compute IC(p)
pick the model with the lowest IC(p) value
Most commonly used ICs:
\[\begin{split}
\begin{align}
AIC &= \operatorname{ln}|\hat{\boldsymbol \Sigma}^{ml}(p)| + \frac{2}{T}(pn^2 + n) \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; \text{Akaike’s Information Criterion} \\
BIC &= \operatorname{ln}|\hat{\boldsymbol \Sigma}^{ml}(p)| + \frac{\operatorname{ln}(T)}{T}(pn^2 + n) \;\;\;\;\;\;\;\;\; \text{Bayesian Information Criterion}
\end{align}
\end{split}\]
Notes
For ICs to be comparable for different \(p\), the sample has to be the same (set \(t = p_{max}+1,\cdots, T\))
\(\hat{\boldsymbol \Sigma}^{ml}(p) = \frac{1}{T} \hat{\boldsymbol U} \hat{\boldsymbol U}' = \frac{T-np-1}{T} \hat{\boldsymbol \Sigma}^{ols}(p)\)
typically, \(p_{max}=12\) for monthly and \(p_{max}=4\) for quarterly data
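The two criteria are simple functions of the ML covariance estimate; a sketch (function names are ours, `Sigma_ml` is the \(1/T\) estimate from the note above):

```python
import numpy as np

def aic(Sigma_ml, p, n, T):
    # log-determinant fit term plus 2/T penalty per parameter
    return np.log(np.linalg.det(Sigma_ml)) + 2 / T * (p * n**2 + n)

def bic(Sigma_ml, p, n, T):
    # same fit term, ln(T)/T penalty per parameter
    return np.log(np.linalg.det(Sigma_ml)) + np.log(T) / T * (p * n**2 + n)

# BIC penalizes extra lags more heavily than AIC as soon as ln(T) > 2
print(bic(np.eye(2), 2, 2, 100) > aic(np.eye(2), 2, 2, 100))  # True
```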
Forecasting
\[\begin{split}
\begin{align}
\mathbf{z}_t &= A \mathbf{z}_{t-1} + \boldsymbol \varepsilon_t, \\\\
\mathbf{z}_{t+1} &= A \mathbf{z}_{t} + \boldsymbol \varepsilon_{t+1}\\\\
\mathbf{z}_{t+2} &= A^2 \mathbf{z}_{t} + A \boldsymbol \varepsilon_{t+1}+ \boldsymbol \varepsilon_{t+2}\\
&\;\vdots \\
\mathbf{z}_{t+h} &= A^h \mathbf{z}_{t} + A^{h-1} \boldsymbol \varepsilon_{t+1} + \cdots + \boldsymbol \varepsilon_{t+h}
\end{align}
\end{split}\]
Optimal forecast given information at \(T\):
\[\begin{split}
\begin{align}
\operatorname{E}(\mathbf{z}_{T+1} | \mathbf{z}_{T} ) & = A \mathbf{z}_{T}\\
\operatorname{E}(\mathbf{z}_{T+h} | \mathbf{z}_{T} ) & = \boldsymbol A^h \mathbf{z}_{T}
\end{align}
\end{split}\]
Optimal forecast given information at \(T+1\):
\[\begin{split}
\begin{align}
\operatorname{E}(\mathbf{z}_{T+h} | \mathbf{z}_{T+1} ) & = \boldsymbol A^{h-1} \mathbf{z}_{T+1}\\
& = \boldsymbol A^{h-1} \left( A\mathbf{z}_{T} + \boldsymbol \varepsilon_{T+1}\right)\\
& = \boldsymbol A^{h}\mathbf{z}_{T} + A^{h-1} \boldsymbol \varepsilon_{T+1} \\
& = \operatorname{E}(\mathbf{z}_{T+h} | \mathbf{z}_{T} ) + A^{h-1} (\mathbf{z}_{T+1} - A \mathbf{z}_{T} )\\
& = \operatorname{E}(\mathbf{z}_{T+h} | \mathbf{z}_{T} ) + A^{h-1} (\mathbf{z}_{T+1} - \operatorname{E}(\mathbf{z}_{T+1} | \mathbf{z}_{T} ) )\\
\end{align}
\end{split}\]
Optimal forecast update:
\[
\operatorname{E}(\mathbf{z}_{T+h} | \mathbf{z}_{T+1} ) - \operatorname{E}(\mathbf{z}_{T+h} | \mathbf{z}_{T} ) = A^{h-1} \underbrace{(\mathbf{z}_{T+1} - \operatorname{E}(\mathbf{z}_{T+1} | \mathbf{z}_{T} ) )}_{\text{1-step ahead forecast error}}
\]
\(h\)-step-ahead forecast error:
\[
\mathbf{z}_{T+h} - \operatorname{E}(\mathbf{z}_{T+h} | \mathbf{z}_{T} ) = A^{h-1} \boldsymbol \varepsilon_{T+1} + \cdots + A \boldsymbol \varepsilon_{T+h-1} + \boldsymbol \varepsilon_{T+h}
\]
with
\[\begin{split}
\begin{align}
\operatorname{E} \left(\mathbf{z}_{T+h} - \operatorname{E}(\mathbf{z}_{T+h} | \mathbf{z}_{T} )\right) &= 0 \\
\operatorname{cov} \left(\mathbf{z}_{T+h} - \operatorname{E}(\mathbf{z}_{T+h} | \mathbf{z}_{T} )\right) &= A^{h-1} \Sigma (A^{h-1})' + \cdots + A \Sigma A' + \Sigma
\end{align}
\end{split}\]
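The point forecast and forecast-error covariance can be computed directly (a sketch; the function name and parameter values are ours):

```python
import numpy as np

# h-step forecast and forecast-error covariance for a zero-mean VAR(1).
def var1_forecast(A, Sigma, z_T, h):
    point = np.linalg.matrix_power(A, h) @ z_T        # E(z_{T+h} | z_T) = A^h z_T
    mse = sum(np.linalg.matrix_power(A, i) @ Sigma @ np.linalg.matrix_power(A, i).T
              for i in range(h))                      # sum_{i=0}^{h-1} A^i Sigma (A^i)'
    return point, mse

A = np.array([[0.5, 0.1], [0.2, 0.3]])
fc, mse = var1_forecast(A, np.eye(2), np.array([1.0, -1.0]), 2)
print(fc)   # [0.19 0.05]
```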
Note: as in the univariate case, we can write VAR\((p)\) as VMA\((\infty)\). For VAR(1)
\[\begin{split}\begin{align}
\mathbf{z}_t &= A(L)^{-1} \boldsymbol \varepsilon_t\\
&= \varepsilon_t + A\varepsilon_{t-1} + A^2\varepsilon_{t-2} + \cdots
\end{align} \end{split}\]
and
\[
\operatorname{cov} (\mathbf{z}_{t}) = \Gamma(0) = \Sigma + A \Sigma A' + A^2 \Sigma (A^2)' + \cdots
\]
As \(h \longrightarrow \infty\)
\[\begin{split}
\begin{align}
\operatorname{E}(\mathbf{z}_{T+h} | \mathbf{z}_{T} ) &\longrightarrow 0 \;\;\;(\text{unconditional mean of } \mathbf{z}_{t} ) \\
\operatorname{cov} \left(\mathbf{z}_{T+h} - \operatorname{E}(\mathbf{z}_{T+h} | \mathbf{z}_{T} )\right) & \longrightarrow \Gamma(0) \;\;\;(\text{unconditional covariance of } \mathbf{z}_{t} )
\end{align}
\end{split}\]
For a VAR(p), use the VAR(1) representation
\[
\mathbf{Z}_{t} = \boldsymbol \Phi \mathbf{Z}_{t-1} + \boldsymbol E_t
\]
\[\operatorname{E}(\mathbf{z}_{T+h} | \mathbf{Z}_{T} ) = \boldsymbol s \operatorname{E}(\mathbf{Z}_{T+h} | \mathbf{Z}_{T} ) \]
\[
\operatorname{cov} \left(\mathbf{z}_{T+h} - \operatorname{E}(\mathbf{z}_{T+h} | \mathbf{Z}_{T} )\right) = \boldsymbol s\operatorname{cov} \left(\mathbf{Z}_{T+h} - \operatorname{E}(\mathbf{Z}_{T+h} | \mathbf{Z}_{T} )\right)\boldsymbol s'
\]
Impulse response functions
From the VMA representation of the VAR(1) model
\[\begin{align}
\mathbf{z}_{t+h} = \mathbf{\varepsilon}_{t+h} + A \mathbf{\varepsilon}_{t+h-1} + A^2 \mathbf{\varepsilon}_{t+h-2} + \cdots + A^h \mathbf{\varepsilon}_{t} + \cdots
\end{align}\]
we have
\[
\frac{\partial \mathbf{z}_{t+h} }{\partial \mathbf{\varepsilon}_{t}} = A^h
\]
Note that \(\mathbf{\varepsilon}_{t}\) is a vector
typically, we want to know the effect of a shock on a variable (for example monetary policy on inflation)
here \(\operatorname{cov}(\mathbf{\varepsilon}_t) = \Sigma\), i.e. \(\varepsilon_{it}\) and \(\varepsilon_{jt}\) are correlated
the elements of \(\mathbf{\varepsilon}_t\) are statistical innovations (residuals, forecast errors), not structural shocks
Orthogonalized impulse response functions
Since \(\Sigma\) is a positive definite matrix, there exists a matrix \(B_0\) such that
\[B_0^{-1} (B_0^{-1})^{\prime} = \Sigma \;\;\; \Rightarrow \;\;\; B_0\Sigma B_0' = I\]
Then
\[
\mathbf{u}_t = B_0 \mathbf{\varepsilon}_t \;\;\; \Rightarrow \;\;\; \mathbf{u}_t \sim \operatorname{WN} \left( 0, \;I\right)
\]
Using \(\mathbf{\varepsilon}_t = B_0^{-1} \mathbf{u}_t\) in the MA representation
\[\begin{split}\begin{align}
\mathbf{z}_{t+h} & = \mathbf{\varepsilon}_{t+h} + A \mathbf{\varepsilon}_{t+h-1} + A^2 \mathbf{\varepsilon}_{t+h-2} + \cdots + A^h \mathbf{\varepsilon}_{t} + \cdots \\\\
& = B_0^{-1} \mathbf{u}_{t+h} + A B_0^{-1} \mathbf{u}_{t+h-1} + A^2 B_0^{-1} \mathbf{u}_{t+h-2} + \cdots + A^h B_0^{-1} \mathbf{u}_{t} + \cdots
\end{align} \end{split}\]
and therefore
\[
\frac{\partial \mathbf{z}_{t+h} }{\partial \mathbf{u}_{t}} = A^h B_0^{-1} \equiv \Psi_h
\]
Since \(u_{it}\) and \(u_{jt}\) are uncorrelated, the \((k,l)\) element of \(\Psi_h\) gives the response of \(z_{k,t+h}\) to a one-standard-deviation shock to \(u_{l,t}\)
\[
\frac{\partial z_{k,t+h} }{\partial u_{lt}} = \psi_{kl,h}
\]
and the \(l\)-th column of \(\Psi_h\) gives the response of \(\mathbf{z}_{t+h}\) to a one-standard-deviation shock to \(u_{l,t}\)
\[
\frac{\partial \mathbf{z}_{t+h} }{\partial u_{lt}} = \boldsymbol \psi_{l,h}
\]
\(\boldsymbol \psi_{l,h}\) is the \(l\)-th column of \(A^h B_0^{-1}\)
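The mapping \(\Psi_h = A^h B_0^{-1}\) is a one-liner once \(B_0^{-1}\) is chosen; here a lower-triangular Cholesky factor of \(\Sigma\) is used as one valid choice (a sketch; the function name and parameter values are ours):

```python
import numpy as np

# Orthogonalized IRFs Psi_h = A^h B0^{-1} for a VAR(1), with B0^{-1}
# taken to be the lower Cholesky factor of Sigma.
def oirf(A, Sigma, h_max):
    B0_inv = np.linalg.cholesky(Sigma)       # B0_inv @ B0_inv.T == Sigma
    return [np.linalg.matrix_power(A, h) @ B0_inv for h in range(h_max + 1)]

A = np.array([[0.5, 0.1], [0.2, 0.3]])
Sigma = np.array([[1.0, 0.3], [0.3, 1.0]])
Psi = oirf(A, Sigma, 5)
# column l of Psi[h] is the response of z_{t+h} to a one-std-dev shock u_{l,t};
# on impact (h = 0) the responses are given by B0^{-1} itself
print(np.allclose(Psi[0] @ Psi[0].T, Sigma))  # True
```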
From VAR(p) to structural VAR(p) (and vice versa)
\[\begin{split}
\begin{align}
\mathbf{z}_t & = \boldsymbol a_0 + A_1 \mathbf{z}_{t-1} + \cdots + A_p \mathbf{z}_{t-p} + \boldsymbol \varepsilon_t, \;\;\;\;\; \boldsymbol \varepsilon_t \sim \operatorname{WN} \left( 0, \;\mathbf{\Sigma}\right)\\\\
\mathbf{z}_t & = \boldsymbol a_0 + A_1 \mathbf{z}_{t-1} + \cdots + A_p \mathbf{z}_{t-p} + B_0^{-1}\boldsymbol u_t\;\;\;\;\; (\text{using } \mathbf{u}_t = B_0 \mathbf{\varepsilon}_t) \\
& \downarrow\\
B_0\mathbf{z}_t & = B_0\boldsymbol a_0 + B_0 A_1 \mathbf{z}_{t-1} + \cdots + B_0 A_p \mathbf{z}_{t-p} + \mathbf{u}_t, \;\;\;\;\; (\text{pre-multiply by } B_0)\\\\
B_0\mathbf{z}_t & = \boldsymbol b_0 + B_1 \mathbf{z}_{t-1} + \cdots + B_p \mathbf{z}_{t-p} + \mathbf{u}_t, \;\;\;\;\; \boldsymbol u_t \sim \operatorname{WN} \left( 0, \;I\right)
\end{align}
\end{split}\]
\(B_0\) captures contemporaneous (time \(t\)) interactions among variables
\(\mathbf{u}_t\) are orthogonal shocks: \(u_{i, t}\) only affects \(z_{i, t}\) contemporaneously
impulse responses
\[
\frac{\partial \mathbf{z}_{t+h} }{\partial u_{it}}
\]
Identification
We can estimate the reduced-form coefficients \(\boldsymbol a_0\), \(A_1, \ldots, A_p\) and \(\Sigma\)
and compute the reduced-form MA representation of \(\mathbf{z}_t\)
but to compute impulse responses to structural shocks we need \(B_0^{-1}\)
\[B_0^{-1} (B_0^{-1})^{\prime} = \Sigma \]
by symmetry \(\Sigma\) has \(n(n+1)/2\) unique elements \(\Rightarrow n(n+1)/2\) equations, but \(B_0\) has \(n^2\) unknown elements, so \(n(n-1)/2\) additional restrictions are needed
Types of identifying restrictions
Short-run restrictions: zero restrictions on elements of \(B_0^{-1}\) (the impact effects of the shocks) or of \(B_0\) (the contemporaneous relationships among the variables):
\[\begin{split}
\left[\begin{array}{c}
\varepsilon_{1, t}\\
\varepsilon_{2, t}\\
\vdots\\
\varepsilon_{n, t}
\end{array}\right]
=
\underset{ B_0^{-1}}{\underbrace{\left[\begin{array}{cccccccc}
b^{1,1}_0 & b^{1,2}_0 & \cdots & b^{1,n}_0\\
b^{2,1}_0 & b^{2,2}_0 & \cdots & b^{2,n}_0\\
\vdots & \vdots & \ddots & \vdots \\
b^{n,1}_0 & b^{n,2}_0 & \cdots & b^{n,n}_0\\
\end{array}\right]}}
\left[\begin{array}{c}
u_{1, t}\\
u_{2, t}\\
\vdots\\
u_{n, t}
\end{array}\right]
\end{split}\]
\[\begin{split}
\underset{B_0}{\underbrace{\left[\begin{array}{cccccccc}
b_{0,11} & b_{0,12} & \cdots & b_{0,1n}\\
b_{0,21} & b_{0,22} & \cdots & b_{0,2n}\\
\vdots & \vdots & \ddots & \vdots \\
b_{0,n1} & b_{0,n2} & \cdots & b_{0,nn}\\
\end{array}\right]}}
\left[\begin{array}{c}
z_{1, t}\\
z_{2, t}\\
\vdots\\
z_{n, t}
\end{array}\right]
= \cdots +
\left[\begin{array}{c}
u_{1, t}\\
u_{2, t}\\
\vdots\\
u_{n, t}
\end{array}\right]
\end{split}\]
Long-run restrictions:
MA representation of \(\mathbf{z}_t\)
\[\begin{split}\begin{align}
\mathbf{z}_{t+h} & = \mathbf{\varepsilon}_{t+h} + A \mathbf{\varepsilon}_{t+h-1} + A^2 \mathbf{\varepsilon}_{t+h-2} + \cdots + A^h \mathbf{\varepsilon}_{t} + \cdots \\\\
& = B_0^{-1} \mathbf{u}_{t+h} + A B_0^{-1} \mathbf{u}_{t+h-1} + A^2 B_0^{-1} \mathbf{u}_{t+h-2} + \cdots + A^h B_0^{-1} \mathbf{u}_{t} + \cdots
\end{align} \end{split}\]
The cumulative impulse responses of shocks in \(t\) on \(\mathbf{z}_{t}\), \(\mathbf{z}_{t+1}\), … are given by
\[
(I + A + A^2 + A^3 + \cdots ) B_0^{-1} = A(1)^{-1} B_0^{-1}
\]
if \(\mathbf{z}_{t}\) contains growth rates (e.g. of GDP), the cumulative response gives the permanent effect on the level
common long-run restrictions: some shocks have no permanent effect on some variables (e.g. nominal shocks on real variables)
Sign restrictions:
If \(B_0\) is such that
\[B_0^{-1} (B_0^{-1})^{\prime} = \Sigma \]
then for any orthogonal matrix \(Q\) (\( QQ^{\prime} =Q^{\prime}Q = I \))
we have
\[(B_0^{-1} Q) (B_0^{-1} Q)^{\prime} = \Sigma \]
There are infinitely many such matrices.
find the set of solutions that satisfy sign restrictions implied by theory (e.g. a monetary policy shock raises \(i_t\) and lowers \(\pi_t\) and \(y_t\))
i.e. find all matrices \(Q\) such that \(B_0^{-1} = P Q\) satisfies those restrictions, where \(P\) is the Cholesky factor of \(\Sigma\) and \(Q\) is an orthogonal matrix:
\[\Sigma = P P^{\prime}\]
There are different ways to generate \(Q\)
For example, for \(n=2\) candidates \(Q\) can be generated using
\[\begin{split} Q = \left[\begin{array}{cc}
\cos(\theta) & -\sin(\theta)\\
\sin(\theta) & \cos(\theta) \\
\end{array}\right]\end{split}\]
with \(\theta \in [0, 2 \pi)\)
One can also use the QR decomposition of a random matrix \(H\) with \(H_{ij} \sim N(0, 1)\)
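The QR route to a random orthogonal \(Q\) can be sketched as follows (the function name is ours; normalizing with the signs of \(\operatorname{diag}(R)\) is a standard device to make the draw uniform over orthogonal matrices):

```python
import numpy as np

# Draw a random orthogonal matrix Q via the QR decomposition of a Gaussian matrix.
def random_orthogonal(n, rng):
    H = rng.standard_normal((n, n))          # H_ij ~ N(0, 1)
    Q, R = np.linalg.qr(H)
    return Q @ np.diag(np.sign(np.diag(R)))  # fix signs so the draw is uniform

rng = np.random.default_rng(0)
Q = random_orthogonal(3, rng)
print(np.allclose(Q @ Q.T, np.eye(3)))  # True
```

Each such draw gives a candidate \(B_0^{-1} = P Q\), which is kept if the implied impulse responses satisfy the sign restrictions.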