# OLS regressions in simple terms

A least-squares regression, often called ordinary least squares (OLS), is a linear regression model that uses the mean squared-error loss function (MSE loss).

This article is a sequel to our previous article, Linear regression in simple terms. We will use the same notation and pick up where we left off.

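The loss function is the MSE loss. For a dataset of $n$ examples with observed outputs $y_i$ and model predictions $\hat{y}_i$, it is defined by:

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left(\hat{y}_i - y_i\right)^2$$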

The best vector of parameters can be found by minimizing this loss function using an optimization algorithm, or using the analytic solution that we will introduce now.

## Finding the parameters

To express the analytic solution, we need vector notation. For a dataset $\sets$, let $\mx_\sets$ be the corresponding design matrix and $\vy_\sets$ the output vector.

With this notation, the predictions for the set $\sets$ are collected in the predicted vector $\hat{\vy}_\sets$:

$$\hat{\vy}_\sets = \mx_\sets \vw$$

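and the loss function is written:

$$\mathrm{MSE}_\sets(\vw) = \frac{1}{n} \left\lVert \mx_\sets \vw - \vy_\sets \right\rVert_2^2$$

where $n$ is the number of examples in $\sets$.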
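
To make the vector form concrete, here is a minimal NumPy sketch of the predicted vector and the MSE loss. The function names `predict` and `mse_loss` and the toy dataset are our own illustrative assumptions, and we assume the design matrix carries a leading column of ones for the intercept.

```python
import numpy as np


def predict(X, w):
    """Predicted vector: one prediction per row of the design matrix X."""
    return X @ w


def mse_loss(X, y, w):
    """Mean squared error between the predictions X @ w and the outputs y."""
    residuals = predict(X, w) - y
    return float(residuals @ residuals) / len(y)


# Hypothetical toy dataset: a column of ones (intercept) plus one feature.
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
y = np.array([1.0, 3.0, 5.0])

w_exact = np.array([1.0, 2.0])  # y = 1 + 2x reproduces the data exactly
w_other = np.array([0.0, 2.0])  # every prediction is off by one

loss_exact = mse_loss(X, y, w_exact)
loss_other = mse_loss(X, y, w_other)
```

With these values the first parameter vector gives a loss of zero, while the second gives a loss of one, since each of the three residuals equals minus one.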

To shorten the notation, let $\mx = \mx_{\trainset}$ and $\vy = \vy_{\trainset}$ be the design matrix and output vector for our training set. Provided $\mx^{\top}\mx$ is invertible, the vector $\hat{\vw}$ that minimizes this loss is:

$$\hat{\vw} = \left(\mx^{\top}\mx\right)^{-1} \mx^{\top} \vy$$
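
As an illustration, the analytic solution can be sketched in a few lines of NumPy. The function name `ols_fit` and the small training set are hypothetical, chosen only for this example, and we again assume the design matrix includes a column of ones for the intercept.

```python
import numpy as np


def ols_fit(X, y):
    """Solve the normal equations (X^T X) w = X^T y for w.

    Assumes X^T X is invertible, as required by the analytic solution.
    """
    return np.linalg.solve(X.T @ X, X.T @ y)


# Hypothetical training set: intercept column plus one feature.
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
y = np.array([1.1, 2.9, 5.2, 6.8])

w_hat = ols_fit(X, y)

# Cross-check against NumPy's general least-squares solver.
w_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
```

The sketch solves the linear system rather than forming the inverse of $\mx^{\top}\mx$ explicitly, which gives the same result when the matrix is invertible and is usually the more numerically stable choice. On this toy data both approaches should agree on an intercept close to 1.09 and a slope close to 1.94.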