Machine-learning 5 minutes

# What is regularization?

Regularization is a semi-automated method to manage overfitting. The core idea is to penalize model complexity, discouraging the model from fitting noise in the training data.

Regularization improves a model's performance by decreasing its variance at the cost of a small increase in bias; when tuned well, the variance reduction outweighs the added bias.

## General shape

The general shape of a regularized loss function $\l_\text{reg}$ is the following:

$$\l_\text{reg}(\sets, \vw) = \l(\sets, \vw) + \lambda \, R(\vw)$$

Where $\l$ is the unregularized loss, $R$ is the regularizer that measures model complexity, $\sets$ is the dataset on which we compute the loss and $\vw$ is the vector of parameters for our model.
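As a concrete instance of this shape, here is a minimal NumPy sketch (the names are illustrative) of a squared-error loss with a squared $L_2$ penalty as the regularizer:

```python
import numpy as np

def regularized_loss(X, y, w, lam):
    """Squared-error loss plus an L2 penalty: l(S, w) + lam * ||w||^2."""
    residuals = X @ w - y               # prediction errors over the dataset
    data_loss = np.sum(residuals ** 2)  # unregularized loss l(S, w)
    penalty = lam * np.sum(w ** 2)      # regularizer R(w) = ||w||^2
    return data_loss + penalty

# toy example: two samples, two features
X = np.array([[1.0, 2.0], [3.0, 4.0]])
y = np.array([1.0, 2.0])
w = np.array([0.5, 0.0])
print(regularized_loss(X, y, w, lam=1.0))  # → 0.75 (data loss 0.5 + penalty 0.25)
```

Note that the two terms pull in opposite directions: the data loss wants weights that fit the data, while the penalty wants small weights.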

## The hyper-parameter

The hyper-parameter $\lambda$ controls the strength of the regularization. When $\lambda = 0$ the regularizer vanishes and we recover the unregularized solution. When $\lambda \to \infty$, the penalty dominates and we get an intercept-only model: every penalized weight is driven to zero.
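This shrinkage can be observed directly with the closed-form ridge solution $(X^\top X + \lambda I)^{-1} X^\top y$ (a NumPy sketch on synthetic data; `lam` plays the role of $\lambda$):

```python
import numpy as np

def ridge_weights(X, y, lam):
    """Closed-form ridge solution: (X^T X + lam I)^{-1} X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=50)

# the weight norm shrinks toward zero as lambda grows
for lam in (0.0, 1.0, 100.0):
    print(lam, np.linalg.norm(ridge_weights(X, y, lam)))
```

At `lam=0.0` this is plain least squares; as `lam` grows the norm of the weight vector decreases monotonically toward zero.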

TODO graph the curve as $\lambda$ varies.

## A word of caution

It is important to normalize the features before applying regularization. The penalty treats all weights on the same scale, so features measured in different units would otherwise be penalized inconsistently, yielding incoherent regularization behavior.

## Common regularizers

### The L2 norm

An ordinary least squares regression with $L_2$ regularization, where the regularizer is the squared norm of the weights, is called ridge regression.
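A minimal sketch, assuming scikit-learn is available (in its API, `alpha` plays the role of the $\lambda$ hyper-parameter):

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.0]) + rng.normal(scale=0.1, size=100)

# alpha is scikit-learn's name for the regularization strength
model = Ridge(alpha=1.0).fit(X, y)
print(model.coef_)  # all weights shrunk a little; none exactly zero
```

The $L_2$ penalty shrinks every weight toward zero but does not zero any of them out exactly.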

### The L1 norm

An ordinary least squares regression with $L_1$ regularization, where the regularizer is the sum of the absolute values of the weights, is called lasso regression.
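A practical property of the $L_1$ penalty is that it tends to drive some weights exactly to zero, effectively performing feature selection. A minimal sketch, assuming scikit-learn is available:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 5))
# only the first two features actually matter
y = X @ np.array([2.0, -1.0, 0.0, 0.0, 0.0]) + rng.normal(scale=0.1, size=100)

model = Lasso(alpha=0.5).fit(X, y)
print(model.coef_)  # irrelevant features are driven exactly to zero
```

Contrast this with ridge regression, which shrinks the irrelevant weights toward zero but keeps them nonzero.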