Parameter Learning with Complete Data

Motivation: How do we estimate the unknown parameter $\theta$ of a DGM/UGM $p(x_1, x_2, …, x_M|\theta)$ from fully observed data?
Given: a set of $N$ independent and identically distributed (i.i.d.) complete observations of each random variable $X$: $x_{1,1}, …, x_{1,N}, …, x_{M,1}, …, x_{M,N}$.

  • Maximum Likelihood Estimate (MLE)
  • Maximum a Posteriori (MAP)

DGM: MLE & MAP

Special Cases: Single Random Variable DGM

graph TD;
    parameters-->X
    theta-->X_observation_1;
    theta-->X_observation_2;
    theta-->...;
    theta-->X_observation_N;

theta is the parameter we want to learn from the N observations of the variable X; the observations are not connected to each other, only to theta.

Continuous: Univariate Normal Distribution

Fit a univariate normal distribution model to a set of scalar data $X: x_1, x_2, …, x_N$.
The goal is to find the parameter $\theta = (\mu, \sigma^2)$.

  • Maximum Likelihood Estimation (MLE)
  • Maximum a Posteriori (MAP)
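
As a minimal sketch of the two estimators for the normal case: the MLE is the sample mean together with the biased sample variance (dividing by $N$). The MAP estimate needs a prior, which these notes do not specify, so the code below assumes a Normal prior $\mathcal{N}(\mu_0, \tau^2)$ on the mean and treats $\sigma^2$ as known. The function names and the arguments `mu0` and `tau2` are hypothetical, chosen only for illustration.

```python
from statistics import fmean


def normal_mle(xs):
    """MLE of (mu, sigma^2) for i.i.d. scalar data under a univariate normal."""
    n = len(xs)
    mu = fmean(xs)
    # The MLE of the variance divides by N (not N - 1), so it is biased.
    var = sum((x - mu) ** 2 for x in xs) / n
    return mu, var


def normal_mean_map(xs, sigma2, mu0, tau2):
    """MAP estimate of mu only.

    Assumptions (not from the notes): sigma2 is known, and the prior on mu
    is Normal(mu0, tau2). The posterior is then Normal, and its mode is a
    precision-weighted average of the prior mean and the data.
    """
    n = len(xs)
    return (sigma2 * mu0 + tau2 * sum(xs)) / (sigma2 + n * tau2)


data = [2.0, 4.0, 4.0, 6.0]
mu_hat, var_hat = normal_mle(data)
mu_map = normal_mean_map(data, sigma2=2.0, mu0=0.0, tau2=1.0)
```

With a prior mean of 0, the MAP estimate is pulled from the sample mean toward 0; as $N$ grows or $\tau^2$ increases, it approaches the MLE.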

Discrete: Univariate Categorical Distribution

  • Maximum Likelihood Estimation (MLE)
  • Maximum a Posteriori (MAP)
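
For the categorical case, a small sketch of both estimators may help. The MLE is the relative frequency $\theta_k = N_k / N$. For MAP, the notes do not name a prior, so the code assumes a Dirichlet prior with parameters $\alpha_k \ge 1$, which gives $\theta_k = (N_k + \alpha_k - 1) / (N + \sum_j \alpha_j - K)$. The function names and the `alpha` argument are hypothetical.

```python
from collections import Counter


def categorical_mle(xs, states):
    """MLE: relative frequency of each state in the observations."""
    counts = Counter(xs)
    n = len(xs)
    return {k: counts[k] / n for k in states}


def categorical_map(xs, states, alpha):
    """MAP under an assumed Dirichlet(alpha) prior with every alpha[k] >= 1.

    The posterior is Dirichlet(N_k + alpha_k), whose mode is
    (N_k + alpha_k - 1) / (N + sum(alpha) - K).
    """
    counts = Counter(xs)
    denom = len(xs) + sum(alpha[k] for k in states) - len(states)
    return {k: (counts[k] + alpha[k] - 1) / denom for k in states}


states = ["a", "b", "c"]
data = ["a", "a", "b", "a"]
theta_mle = categorical_mle(data, states)
theta_map = categorical_map(data, states, {k: 2.0 for k in states})
```

In this toy data, state "c" is never observed, so the MLE assigns it probability zero, while the assumed prior keeps the MAP estimate strictly positive.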

General Cases:

Discrete

Maximum Log-Likelihood

Maximum A Posteriori (MAP)

Continuous

Maximum Log-Likelihood

Maximum A Posteriori (MAP)

MRF: stochastic maximum likelihood & iterative proportional fitting

Stochastic Maximum Likelihood

Iterative Proportional Fitting (IPF)

CRF: stochastic gradient descent

Stochastic Gradient Descent

Maximum A Posteriori (MAP)