
The maximum likelihood estimator (MLE)

The maximum likelihood estimator is one of the most widely used estimators in statistics. In this article, we introduce this estimator and study its properties.

In a typical inference task, we have some data $x = (x_1, \dots, x_n)$ that we wish to understand better. The statistical approach is to model the source of these data as a random variable $X$ whose outcomes are produced with joint probability $p(x; \theta)$, where $\theta \in \Theta$ is an unknown parameter.

Definitions

A maximum likelihood estimator for $\theta$ is an estimator that maximizes the probability of producing the sample we observed.

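Definition: likelihood
The likelihood is the probability $p(x; \theta)$ seen as a function of $\theta$:
$$L(\theta) = p(x; \theta)$$
Definition: MLE
When the likelihood admits a unique global maximum, the MLE is:
$$\hat{\theta}_{\mathrm{MLE}} = \arg\max_{\theta \in \Theta} L(\theta)$$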

In practice, we often maximize the log-likelihood instead of the likelihood. Since $\log$ is an increasing function, this yields the same maximizer.

The log-likelihood is denoted $\ell$:
$$\ell(\theta) = \log L(\theta)$$

Remarks:

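  • the likelihood is not the probability of $\theta$: it is the probability of the observed sample, viewed as a function of $\theta$;
  • maximizing the probability of $\theta$ given the data is called “maximum a posteriori estimation”.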
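
As a small illustration of these definitions, the sketch below maximizes a log-likelihood numerically and compares the result with the known closed-form answer. It assumes a Bernoulli model and a made-up sample of 0/1 values (neither comes from the article); for that model the MLE is the sample mean, and a brute-force grid search should land on the same value.

```python
import math

def bernoulli_log_likelihood(theta, sample):
    """Log-likelihood of an i.i.d. Bernoulli(theta) sample of 0/1 values."""
    successes = sum(sample)
    failures = len(sample) - successes
    return successes * math.log(theta) + failures * math.log(1.0 - theta)

# Hypothetical observed sample: 6 successes out of 8 trials.
sample = [1, 0, 1, 1, 0, 1, 1, 1]

# Closed-form MLE for the Bernoulli model: the sample mean.
closed_form_mle = sum(sample) / len(sample)

# Brute-force maximization over a grid of candidate values in (0, 1).
grid = [i / 1000 for i in range(1, 1000)]
grid_mle = max(grid, key=lambda t: bernoulli_log_likelihood(t, sample))

print("closed form:", closed_form_mle)
print("grid search:", grid_mle)
```

A grid search is only practical for a one-dimensional parameter; it is used here because it makes the "argmax of the log-likelihood" definition explicit without relying on any optimization library.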

Estimator performance

As explained in our primer on estimators, we first want to know if the MLE is consistent.

Consistency

Under some regularity conditions on the density $p(x; \theta)$, the MLE is a consistent estimator, for instance:

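  • when $\Theta \subseteq \mathbb{R}$ and the log-likelihood $\ell$ is concave;
  • when $\Theta \subseteq \mathbb{R}$ and the log-likelihood $\ell$ is continuously differentiable;
  • when $p$ is from a $k$-parameter exponential family.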
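
Consistency can be observed empirically. The following sketch assumes an exponential model with a rate of 2 (an arbitrary choice for the simulation, not taken from the article), for which the MLE of the rate is the inverse of the sample mean. The estimation error should tend to shrink as the sample size grows, although it need not decrease at every single step.

```python
import random

def exponential_rate_mle(sample):
    """MLE of the rate of an exponential distribution: 1 / sample mean."""
    return len(sample) / sum(sample)

true_rate = 2.0          # assumed "unknown" parameter for the simulation
rng = random.Random(0)   # fixed seed so the run is reproducible

errors = {}
for n in (10, 100, 1_000, 10_000, 100_000):
    sample = [rng.expovariate(true_rate) for _ in range(n)]
    errors[n] = abs(exponential_rate_mle(sample) - true_rate)
    print(f"n = {n:>6}: |estimate - true rate| = {errors[n]:.4f}")
```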

Asymptotic performance

Assuming an i.i.d. sample and under sufficient regularity of the distribution $p$, the MLE has excellent asymptotic properties:

Theorem
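For i.i.d. samples with sufficient regularity and assuming consistency, the asymptotic distribution of the MLE is:
$$\sqrt{n}\,\left(\hat{\theta}_{\mathrm{MLE}} - \theta\right) \xrightarrow{d} \mathcal{N}\left(0,\ I(\theta)^{-1}\right)$$

Where:
$$I(\theta) = \mathbb{E}_\theta\left[\left(\frac{\partial}{\partial \theta} \log p(X_1; \theta)\right)^2\right]$$
is the Fisher information of a single observation $X_1$.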

So, for large sample sizes $n$, the MLE:

  • is approximately normally distributed;
  • is approximately unbiased;
  • approximately achieves the Cramér-Rao lower bound.
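
These three properties can be checked by simulation. The sketch below again assumes an exponential model with rate 2 (an illustrative choice), whose single-observation Fisher information is 1 / rate², so the scaled error √n · (MLE − rate) should have a mean close to 0 and a variance close to rate² = 4. The sample size and number of replications are arbitrary.

```python
import math
import random
import statistics

true_rate = 2.0        # assumed parameter; Fisher information is 1 / true_rate**2
n = 500                # sample size of each simulated experiment
replications = 2_000   # number of independent experiments
rng = random.Random(1)

scaled_errors = []
for _ in range(replications):
    total = sum(rng.expovariate(true_rate) for _ in range(n))
    mle = n / total    # MLE of the rate: 1 / sample mean
    scaled_errors.append(math.sqrt(n) * (mle - true_rate))

empirical_mean = statistics.mean(scaled_errors)
empirical_variance = statistics.variance(scaled_errors)
inverse_fisher_information = true_rate ** 2

print(f"mean of scaled errors:     {empirical_mean:.3f} (theory: about 0)")
print(f"variance of scaled errors: {empirical_variance:.3f} "
      f"(theory: about {inverse_fisher_information:.3f})")
```

A histogram of `scaled_errors` would show the bell shape as well; it is left out to keep the example free of plotting dependencies.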

…What else?

What are those regularity conditions?

  • $\Theta$ is an open subset of $\mathbb{R}$ (so that it always makes sense for an estimator to have a symmetric distribution around $\theta$).
  • The support of $p(\cdot\,; \theta)$ is independent of $\theta$ (so that we can interchange integration and differentiation).
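  • $p(x; \theta)$ is three times differentiable with respect to $\theta$.
  • $\mathbb{E}_\theta\left[\frac{\partial}{\partial \theta} \log p(X; \theta)\right] = 0$ and $\mathbb{E}_\theta\left[\frac{\partial^2}{\partial \theta^2} \log p(X; \theta)\right] = -I(\theta)$.
  • $I(\theta) > 0$.
  • $\exists\, \varepsilon > 0$ and $M$ such that $\mathbb{E}_\theta[M(X)] < \infty$ and:
$$\left|\frac{\partial^3}{\partial \theta^3} \log p(x; \theta')\right| \le M(x) \quad \text{for all } |\theta' - \theta| < \varepsilon$$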
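
To see why such conditions matter, consider a model that violates the support condition. The sketch below assumes a Uniform(0, θ) model with θ = 3 (a standard textbook counterexample, not taken from the article), whose MLE is the sample maximum. The estimate never exceeds θ, so its distribution cannot be symmetric around θ, and the error is of order 1/n rather than 1/√n: the quantity n · (θ − MLE) has a mean close to θ instead of the error being approximately normal.

```python
import random
import statistics

theta = 3.0            # assumed true parameter of the Uniform(0, theta) model
n = 1_000              # sample size of each simulated experiment
replications = 2_000   # number of independent experiments
rng = random.Random(2)

# For Uniform(0, theta), the likelihood is maximized at the sample maximum.
estimates = [
    max(rng.uniform(0.0, theta) for _ in range(n))
    for _ in range(replications)
]

# The MLE always sits at or below theta: the error is one-sided.
never_overshoots = all(estimate <= theta for estimate in estimates)

# The error shrinks like 1/n, so it is rescaled by n (not sqrt(n)).
mean_scaled_gap = statistics.mean(n * (theta - estimate) for estimate in estimates)

print("MLE never exceeds theta:", never_overshoots)
print(f"mean of n * (theta - MLE): {mean_scaled_gap:.3f} (theory: about {theta})")
```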

Other properties

The MLE is equivariant, which is very convenient in practice.

Proposition: Equivariance of the MLE
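MLEs are equivariant: let $g$ be a bijection. If $\hat{\theta}$ is the MLE of $\theta$, then $g(\hat{\theta})$ is the MLE of $g(\theta)$:
$$\widehat{g(\theta)}_{\mathrm{MLE}} = g\left(\hat{\theta}_{\mathrm{MLE}}\right)$$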
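
Equivariance can be checked numerically. The sketch below assumes an exponential sample and two parametrizations of the same model: by rate, and by mean, linked by the bijection g(rate) = 1 / rate. Each log-likelihood is maximized independently with a simple golden-section search (a helper written for this example, not a library routine), and the two maximizers should satisfy mean = 1 / rate up to numerical tolerance.

```python
import math
import random

def golden_section_argmax(f, lo, hi, tol=1e-8):
    """Locate the maximizer of a unimodal function f on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if f(c) > f(d):
            b = d   # the maximum lies in [a, d]
        else:
            a = c   # the maximum lies in [c, b]
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return (a + b) / 2.0

rng = random.Random(3)
sample = [rng.expovariate(2.0) for _ in range(200)]  # assumed true rate: 2
n, total = len(sample), sum(sample)

def log_likelihood_rate(rate):
    """Exponential log-likelihood, parametrized by the rate."""
    return n * math.log(rate) - rate * total

def log_likelihood_mean(mean):
    """Same model, parametrized by the mean = 1 / rate."""
    return -n * math.log(mean) - total / mean

rate_mle = golden_section_argmax(log_likelihood_rate, 1e-3, 50.0)
mean_mle = golden_section_argmax(log_likelihood_mean, 1e-3, 50.0)

print(f"MLE of the rate: {rate_mle:.6f}")
print(f"MLE of the mean: {mean_mle:.6f}")
print(f"1 / MLE of rate: {1.0 / rate_mle:.6f}")
```

In practice this means the model can be fitted in whichever parametrization is most convenient, and the estimate transformed afterwards.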