Maximum likelihood estimation

Time series model

  • a specification of the joint distribution of {z𝑡}

p(z;θ)

definition: Likelihood function

L(θ|z)=p(z;θ)

Note

The likelihood function is identical in functional form to the PDF of z, p(z;θ), but is interpreted as a function of θ, for a given value of z, rather than as a function of z for a given value of θ.

definition: Log-likelihood function

ℓ(θ|z) = log L(θ|z)

The maximum likelihood estimator (MLE)

θ^ = argmax_θ L(θ|z) = argmax_θ ℓ(θ|z)

Rationale for MLE

For a given θ, the value of p(z;θ)dz evaluated at the observed sample z gives the probability of observing a sample in a small neighborhood of the actual z under that value of θ. Any value of θ other than the MLE θ^ corresponds to a pdf that assigns a lower probability to observing such a sample. Therefore, θ^ is the value of θ most supported by the observed data.
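As a concrete sketch of the idea, the log-likelihood can be maximized numerically for an assumed i.i.d. Gaussian sample (the simplest special case of p(z;θ), with no serial dependence); all names and values below are our own choices, not part of the notes:

```python
import numpy as np

# Assumed model: z_t ~ i.i.d. N(μ, σ²); we estimate μ by maximizing
# the log-likelihood over a grid, holding σ at its closed-form MLE.
rng = np.random.default_rng(0)
z = rng.normal(loc=2.0, scale=1.5, size=500)   # the observed sample

def log_likelihood(mu, sigma, z):
    # ℓ(θ|z) for the assumed Gaussian model
    return np.sum(-0.5*np.log(2*np.pi*sigma**2) - 0.5*((z - mu)/sigma)**2)

sigma_hat = z.std()                    # Gaussian MLE of σ (ddof=0)
grid = np.linspace(0.0, 4.0, 2001)     # candidate values of μ
ll = np.array([log_likelihood(m, sigma_hat, z) for m in grid])
mu_hat = grid[ll.argmax()]             # value of μ most supported by z
```

For this particular model the maximizer has the closed form μ̂ = z̄, which the grid search recovers up to the grid spacing; in general no closed form exists and a numerical optimizer replaces the grid.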

Note

Difference between ML estimator and ML estimate:

  • estimator: θ^ as a function of a generic sample z

  • estimate: the value θ^ at a particular sample z

Score

S_T(θ) = ∂ℓ(θ|z)/∂θ
  • describes the steepness of log-likelihood function

  • MLE θ^ solves

S_T(θ^) = 0
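As an illustration, the score can be approximated by finite differences and checked at the MLE. This is a sketch under an assumed Gaussian-mean model with known σ (where the MLE of μ is the sample average); all names are ours:

```python
import numpy as np

# Assumed model: z_t ~ i.i.d. N(μ, σ²) with σ known.
rng = np.random.default_rng(1)
sigma = 1.0
z = rng.normal(loc=0.5, scale=sigma, size=200)

def loglik(mu):
    # ℓ(μ|z) for the assumed model
    return np.sum(-0.5*np.log(2*np.pi*sigma**2) - 0.5*((z - mu)/sigma)**2)

def score(mu, h=1e-5):
    # central-difference approximation of S_T(μ) = ∂ℓ(μ|z)/∂μ
    return (loglik(mu + h) - loglik(mu - h)) / (2*h)

mu_hat = z.mean()   # closed-form MLE of μ in this model
# the score is ≈0 at μ̂, positive below it (ℓ rising), negative above it
```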

Observed Fisher information

I_T(θ^) = −∂S_T(θ)/∂θ′ |θ=θ^ = −∂²ℓ(θ|z)/∂θ∂θ′ |θ=θ^
  • describes the curvature of the log-likelihood function at the maximum θ^

  • measures how much information about θ we have at the MLE.
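A numeric sketch of the observed information, again under an assumed Gaussian-mean model with known σ so that the answer is available in closed form (I_T = T/σ² here); names and values are ours:

```python
import numpy as np

rng = np.random.default_rng(2)
sigma = 2.0
z = rng.normal(loc=1.0, scale=sigma, size=400)   # assumed i.i.d. N(μ, σ²)

def loglik(mu):
    # ℓ(μ|z) for the assumed model, σ known
    return np.sum(-0.5*np.log(2*np.pi*sigma**2) - 0.5*((z - mu)/sigma)**2)

def observed_info(mu, h=1e-4):
    # I_T(μ) = -∂²ℓ/∂μ², approximated by a central second difference
    return -(loglik(mu + h) - 2*loglik(mu) + loglik(mu - h)) / h**2

mu_hat = z.mean()                 # MLE of μ in this model
info = observed_info(mu_hat)      # analytically T/σ² = 400/4 = 100
se = 1 / np.sqrt(info)            # large-sample standard error of μ̂
```

Sharper curvature (larger `info`) means a more peaked log-likelihood and hence a smaller standard error.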

Expected Fisher information

𝓘_T(θ) = E[S_T(θ) S_T(θ)′]
𝓘_T(θ) = −E[∂S_T(θ)/∂θ′] = E[I_T(θ)]
  • expected curvature of the log-likelihood function

  • measures how much information about θ we can expect to have
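The equality of the two expressions for the expected information (the information matrix equality) can be checked by Monte Carlo. This sketch assumes an i.i.d. Gaussian model with known variance, where −E[∂S_T/∂μ] = T/σ² in closed form; everything here is our own construction:

```python
import numpy as np

rng = np.random.default_rng(3)
T, mu0, sigma = 50, 0.0, 1.0   # sample size and true parameter values

def score(z, mu):
    # S_T(μ) = Σ_t (z_t - μ)/σ² for the assumed model
    return np.sum(z - mu) / sigma**2

# draw many samples at the true θ₀ and average the squared score
scores = np.array([score(rng.normal(mu0, sigma, T), mu0)
                   for _ in range(20000)])
outer = np.mean(scores**2)   # Monte Carlo estimate of E[S_T S_T']
hess = T / sigma**2          # -E[∂S_T/∂μ], known exactly for this model
```

Note also that E[S_T(θ0)] = 0: on average the log-likelihood is flat at the true parameter.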

Consistency and asymptotic normality of MLE

Assumption: z is a draw from p(z;θ0), where θ0 is the true value of θ

  • θ^ is a consistent estimator of θ0

θ^ → θ0 (in probability) as T → ∞
  • θ^ is asymptotically normally distributed

√T (θ^ − θ0) → N(0, 𝓘(θ0)⁻¹) (in distribution)

where

𝓘(θ) = lim_{T→∞} (1/T) 𝓘_T(θ)

Equivalently, in large samples,

θ^ ∼a N(θ0, (1/T) 𝓘(θ0)⁻¹)
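A Monte Carlo illustration of the asymptotic normality result, for an assumed Gaussian-mean model where the MLE is the sample average and 𝓘(θ0) = 1/σ²; all names and values are ours:

```python
import numpy as np

rng = np.random.default_rng(4)
T, mu0, sigma = 200, 1.0, 2.0   # sample size and true parameters

# 10,000 replications: each row is a sample of size T, the MLE is its mean
mu_hats = rng.normal(mu0, sigma, size=(10000, T)).mean(axis=1)
centered = np.sqrt(T) * (mu_hats - mu0)   # √T(θ̂ - θ₀) across replications

var_mc = centered.var()   # should be close to 𝓘(θ₀)⁻¹ = σ² = 4
```

The simulated distribution of √T(θ̂ − θ0) is centered at zero with variance near 𝓘(θ0)⁻¹, as the limit theorem predicts.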