## Introduction to PAC Learning

What is “learning”, and do we have a formal model for it? I’ve decided to dive into the theoretical underpinnings of machine learning, so here’s a quick introduction to...

## What is ridge regression?

A ridge regression is an OLS regression that uses L2-regularization.
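To make the effect of the L2 penalty concrete, here is a minimal sketch for the simplest possible case: one input feature and no intercept. The function name `ridge_1d` and the convention of adding `lam * w**2` to the sum (not the mean) of squared errors are assumptions made for this illustration.

```python
def ridge_1d(xs, ys, lam):
    """Weight w minimizing sum((y - w*x)**2) + lam * w**2 (one feature, no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    # Setting the derivative to zero gives w = sxy / (sxx + lam).
    return sxy / (sxx + lam)


xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

w_ols = ridge_1d(xs, ys, 0.0)     # lam = 0 recovers the plain OLS solution
w_ridge = ridge_1d(xs, ys, 14.0)  # a positive lam shrinks the weight toward 0
```

With this toy data the unpenalized weight is 2, and the penalty pulls it down to 1.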

## The effect of L2-regularization

In this article, we discuss the impact of L2-regularization on the estimated parameters of a linear model.

## What is regularization?

Regularization is a semi-automated method to manage overfitting. The core idea is to avoid overfitting by penalizing model complexity.

## Quote of the day

The problem of fitting a model to data differs from the problem of finding patterns that generalize to new data.

## Underfitting and overfitting

In this article, we define underfitting and overfitting.

## What is a polynomial regression?

A polynomial regression is a linear regression where the input vectors have been preprocessed using polynomial basis expansion.

## Polynomial basis expansion

Polynomial basis expansion, also called polynomial feature augmentation, is a preprocessing step in machine learning. It consists of adding powers of the input’s components to the input vector.
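As a rough illustration, the sketch below appends the powers 2 up to `degree` of each component to the input vector. The helper name `polynomial_expand` is hypothetical, and the sketch deliberately leaves out cross terms and a constant term, which some library implementations also add.

```python
def polynomial_expand(x, degree):
    """Return x followed by the powers 2..degree of each of its components."""
    expanded = list(x)
    for p in range(2, degree + 1):
        expanded.extend(v ** p for v in x)
    return expanded


features = polynomial_expand([2.0, 3.0], 3)
# features == [2.0, 3.0, 4.0, 9.0, 8.0, 27.0]
```

A linear regression fitted on `features` instead of the raw input is then a polynomial regression in the original variables.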

## How to assess an OLS regression?

We’ve just fitted an OLS regression to our training set. How do we assess whether it was a good model to use? We will answer this question from the point of view...

## The bias-variance-noise decomposition

The MSE loss is attractive because the expected error in prediction can be explained by the bias and the variance of the model, together with the variance of the noise. This is...

## The bias-variance decomposition

The MSE loss is attractive because the expected error in estimation can be explained by the bias and the variance of the model. This is called the bias-variance...

## OLS regressions in simple terms

A least-squares regression, often called ordinary least squares (OLS), is a linear regression model that uses the mean squared-error loss function (MSE loss).

## OLS regressions from the probabilistic viewpoint

We will show that the loss function used by ordinary least-squares (OLS) stems from the statistical theory of maximum likelihood estimation applied to the normal distribution.

## Vector notation for linear regressions

A linear regression attempts to estimate an output value using a linear function. Such functions can be expressed concisely using vector notation. In this article, we define...

## Linear regressions, the probabilistic viewpoint

General formulation

## Linear regressions in simple terms

A linear regression is a model used to predict the value of a (continuous) variable.

## Why there is more to classification than discrete regression

In a classification problem, the dataset consists of pairs of input vectors and discrete labels:

## What is logistic regression?

Logistic regression is a generalized linear model tailored to classification. In this article, we introduce this regression and explain its origin.

## What is a generalized linear model?

To understand what a generalized linear model does, let’s look back at linear models.

## Regression with squared error loss

In this article we study the solution to a regression with squared error loss. We start with the theoretical formulation before tackling the problem in practice.

## Understanding and solving the normal equations

The normal equations arise in several branches of mathematics, from statistics to geometry. In this article, we discuss how they emerge and how to solve them.

## What is stochastic gradient descent (SGD)?

Stochastic gradient descent is an algorithm that tries to find the minimum of a function expressed as a sum of component functions. It does so by choosing a...
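The sketch below shows one common formulation, assumed here for illustration: at each step a single component function is drawn uniformly at random and the parameter moves against that component's gradient. The toy objective (a sum of squared distances to five points), the step-size schedule, and all names are assumptions made for this example.

```python
import random


def sgd(component_grads, w0, n_steps, seed=0):
    """Minimize a sum of component functions using one random component per step."""
    rng = random.Random(seed)
    w = w0
    for t in range(1, n_steps + 1):
        grad = rng.choice(component_grads)  # pick one component at random
        step = 1.0 / (2 * t)                # decaying step size (an assumed schedule)
        w = w - step * grad(w)
    return w


# Toy objective: f(w) = sum over a of (w - a)**2, minimized at the mean of the points.
points = [1.0, 2.0, 3.0, 4.0, 5.0]
component_grads = [lambda w, a=a: 2 * (w - a) for a in points]

w_hat = sgd(component_grads, w0=0.0, n_steps=10000)
```

Because each step sees only one component, `w_hat` is a noisy estimate that lands near the true minimizer (3.0 here) without matching it exactly.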

## What is gradient descent (GD)?

Gradient descent is an optimization algorithm that tries to find the minimum of a function by repeatedly stepping in the direction opposite to its gradient.
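A minimal sketch of the idea for a function of one variable, assuming a fixed step size and a fixed number of iterations (both are arbitrary choices made for this illustration, as is the function name):

```python
def gradient_descent(grad, x0, step=0.1, n_steps=100):
    """Repeatedly move against the gradient, starting from x0."""
    x = x0
    for _ in range(n_steps):
        x = x - step * grad(x)
    return x


# f(x) = (x - 3)**2 has gradient 2 * (x - 3) and its minimum at x = 3.
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
```

On this example each update multiplies the distance to the minimum by 0.8, so `x_min` ends up very close to 3.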

## Why do we care about convexity?

In machine learning, the best parameters for a model are chosen so as to minimize the training objective. Strictly convex functions are particularly interesting because they have a...