Bayesian approach

  • Suppose \(\mathbf{z} = [\mathbf{x}^{\prime}, \; \mathbf{y}^{\prime}]^{\prime}\) is a random vector with joint density \(p(\mathbf{z})\)

  • Suppose we observe a sample realization of \(\mathbf{x}\). How does that change our knowledge of \(\mathbf{y}\)?

  • before observing the sample \(\mathbf{x}\), our knowledge of \(\mathbf{y}\) is given by

\[ p(\mathbf{y}) = \int_{\mathbf{x}} p(\mathbf{z}) \text{d} \mathbf{x} \]
  • after observing \(\mathbf{x}\), our knowledge of \(\mathbf{y}\) is given by

\[ p(\mathbf{y} | \mathbf{x}) = \frac{p(\mathbf{z})}{p(\mathbf{x})} = \frac{p(\mathbf{x} | \mathbf{y}) p(\mathbf{y}) }{p(\mathbf{x})} \]
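  • for example, if \(\mathbf{z}\) is Gaussian with mean \([\boldsymbol \mu_x^{\prime}, \; \boldsymbol \mu_y^{\prime}]^{\prime}\) and covariance blocks \(\mathbf{\Sigma}_{x}\), \(\mathbf{\Sigma}_{x y}\), \(\mathbf{\Sigma}_{y x}\), \(\mathbf{\Sigma}_{y}\):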

\[ p(\mathbf{y} | \mathbf{x}) = \mathcal{N}\left(\boldsymbol \mu_y + \mathbf{\Sigma}_{y x} \mathbf{\Sigma}^{-1}_{x} (\mathbf{x} - \boldsymbol \mu_x), \; \mathbf{\Sigma}_{y} - \mathbf{\Sigma}_{y x}\mathbf{\Sigma}_{x}^{-1}\mathbf{\Sigma}_{x y}\right) \]
  • in Bayesian parlance

    • \(p(\mathbf{y})\) is the prior distribution

    • \(p(\mathbf{y} | \mathbf{x})\) is the posterior distribution

    • \(p(\mathbf{x} | \mathbf{y})\) is the likelihood

  • typically, \(\mathbf{y}\) represents the parameters of a model, in which case we write \(\mathbf{\theta}\) instead

  • also, the prior and the likelihood are specified separately

  • as a result, the posterior is generally not available in closed form

  • simulation methods are used instead to sample from it
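The Gaussian case above is one where conditioning can be done in closed form, so no simulation is needed. The following is a minimal sketch of that formula in NumPy; the function name `gaussian_conditional` and all numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np

def gaussian_conditional(mu_x, mu_y, S_x, S_y, S_yx, x):
    """Mean and covariance of y | x when z = [x', y']' is jointly Gaussian."""
    # gain = S_yx S_x^{-1}, computed with a linear solve instead of an
    # explicit inverse (valid because S_x is symmetric)
    gain = np.linalg.solve(S_x, S_yx.T).T
    cond_mean = mu_y + gain @ (x - mu_x)
    # S_xy is the transpose of S_yx
    cond_cov = S_y - gain @ S_yx.T
    return cond_mean, cond_cov

# Hypothetical scalar example: one observed x, one unobserved y
mu_x = np.array([0.0])
mu_y = np.array([1.0])
S_x = np.array([[2.0]])
S_y = np.array([[1.0]])
S_yx = np.array([[0.5]])
x_obs = np.array([2.0])

cond_mean, cond_cov = gaussian_conditional(mu_x, mu_y, S_x, S_y, S_yx, x_obs)
# conditional mean 1.5, conditional variance 0.875
```

Observing \(\mathbf{x}\) shifts the mean of \(\mathbf{y}\) in proportion to the covariance between the two blocks, and the conditional variance is never larger than the prior variance.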

Since \(p(\mathbf{x})\) does not depend on \(\theta\), the posterior is proportional to the product of the likelihood and the prior:

\[ p(\mathbf{\theta} | \mathbf{x}) = \frac{p(\mathbf{x} | \mathbf{\theta}) p(\mathbf{\theta}) }{p(\mathbf{x})} \propto p(\mathbf{x} | \mathbf{\theta}) p(\mathbf{\theta}) \]
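To illustrate the proportionality, the sketch below evaluates likelihood times prior on a grid for a scalar parameter and normalizes numerically, so \(p(\mathbf{x})\) is never computed analytically. The model (Bernoulli trials with a Beta(2, 2) prior) and the data are assumptions made for this example only; this particular combination happens to have a known Beta(9, 5) posterior, which serves as a check.

```python
import numpy as np

# Hypothetical data: 7 successes in 10 Bernoulli trials
n_trials, n_success = 10, 7

# Grid over the parameter theta in [0, 1]
theta = np.linspace(0.0, 1.0, 2001)

# Assumed prior: Beta(2, 2), written without its normalizing constant
prior = theta * (1.0 - theta)

# Likelihood of the observed successes and failures
likelihood = theta**n_success * (1.0 - theta) ** (n_trials - n_success)

# Unnormalized posterior: likelihood times prior
unnorm = likelihood * prior

# Normalize on the grid; this stands in for dividing by p(x)
weights = unnorm / unnorm.sum()

# Posterior moments approximated on the grid
post_mean = np.sum(theta * weights)
post_var = np.sum((theta - post_mean) ** 2 * weights)
```

A grid works only for very low-dimensional \(\theta\); with more parameters the number of grid points grows too quickly, which is one motivation for sampling methods.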

Goal of Bayesian inference: characterize the distribution of \(\theta\), given

  • the data \(\mathbf{x}\)

  • the prior knowledge about \(\theta\)

  • the model, embodied by the likelihood function

Here, "characterize" means estimating the moments of the posterior from sample draws taken from it.
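As one possible illustration, the sketch below draws from a posterior with a random-walk Metropolis sampler and estimates the posterior mean and variance from the draws. The sampler is only one of many simulation methods and is not prescribed by the text above; the model (\(x_i \sim \mathcal{N}(\theta, 1)\) with prior \(\theta \sim \mathcal{N}(0, 10^2)\)), the simulated data, and the tuning values are all assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 50 draws from N(1.5, 1)
x = rng.normal(loc=1.5, scale=1.0, size=50)

def log_prior(theta):
    # Assumed prior: theta ~ N(0, 10^2), up to an additive constant
    return -0.5 * (theta / 10.0) ** 2

def log_likelihood(theta):
    # Assumed model: x_i ~ N(theta, 1), up to an additive constant
    return -0.5 * np.sum((x - theta) ** 2)

def log_post(theta):
    # Unnormalized log posterior: log p(x | theta) + log p(theta)
    return log_likelihood(theta) + log_prior(theta)

def metropolis(log_target, theta0, n_draws, step, rng):
    """Random-walk Metropolis sampler for a scalar parameter."""
    draws = np.empty(n_draws)
    theta, lp = theta0, log_target(theta0)
    for i in range(n_draws):
        proposal = theta + step * rng.normal()
        lp_prop = log_target(proposal)
        # Accept with probability min(1, ratio of posteriors);
        # p(x) cancels in the ratio, so it is never needed
        if np.log(rng.uniform()) < lp_prop - lp:
            theta, lp = proposal, lp_prop
        draws[i] = theta
    return draws

draws = metropolis(log_post, theta0=0.0, n_draws=20000, step=0.3, rng=rng)
kept = draws[2000:]  # discard an initial burn-in segment

# Moments of the posterior estimated from the draws
post_mean = kept.mean()
post_var = kept.var()
```

For this Gaussian model the exact posterior is known (precision \(n + 1/100\), mean \(\sum_i x_i / (n + 1/100)\)), so the sample moments can be compared against it; in models without a closed form, the draws are the only route to these quantities.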