Machine learning · 5 minutes

# Vector notation for linear regressions

A linear regression attempts to estimate an output value using a linear function of the inputs. Such functions can be expressed concisely using vector notation. In this article, we define the design matrix and the output vector for a linear regression.

Let $\sets$ be our dataset, made of $\sn$ records $(\vx_\si, \sy_\si)$, where $\vx_\si = (\sx_{\si1}, \dotsc, \sx_{\si\sd})$ is an input vector and $\sy_\si$ is an output value:

$$\sets = \{ (\vx_1, \sy_1), \dotsc, (\vx_\sn, \sy_\sn) \}$$

Our goal is to approximate $\sy_\si$ with a linear function of $\vx_\si$. Let $\vw = (\sw_0, \dotsc, \sw_\sd)$ denote the parameters of the linear function. We want:

$$\sy_\si \approx \sw_0 + \sum_{\sj=1}^{\sd} \sw_\sj \sx_{\si\sj}$$

The first step towards vector notation is to set $\sx_{\si0} = 1$. The approximation becomes:

$$\sy_\si \approx \sw_0 \sx_{\si0} + \sum_{\sj=1}^{\sd} \sw_\sj \sx_{\si\sj}$$

And we can start the sum symbol $\sum$ at $0$:

$$\sy_\si \approx \sum_{\sj=0}^{\sd} \sw_\sj \sx_{\si\sj}$$
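As a quick numerical sketch (with made-up values for $\vx_\si$ and $\vw$, assuming $\sd = 3$), prepending the constant $1$ turns the intercept-plus-sum expression into a single dot product:

```python
import numpy as np

# Hypothetical record with d = 3 features and parameters w = (w0, ..., w3).
x_i = np.array([2.0, -1.0, 0.5])        # input vector x_i
w = np.array([1.0, 0.3, -0.2, 2.0])     # w0 = 1.0 plays the role of the intercept

x_i_aug = np.concatenate(([1.0], x_i))  # set x_{i0} = 1

# The sum starting at 0 is exactly a dot product over the augmented vector.
y_hat = np.sum(w * x_i_aug)
assert np.isclose(y_hat, w @ x_i_aug)
```

This is why the trick is worth the small bookkeeping cost: every record is handled by the same dot product, with no special case for the intercept.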

If we write $\vy = (\sy_1, \dotsc, \sy_\sn)$, we can stack all those approximations for $\si = 1, \dotsc, \sn$ into a vector equation:

$$\vy \approx \begin{pmatrix} \sum_{\sj=0}^{\sd} \sw_\sj \sx_{1\sj} \\ \vdots \\ \sum_{\sj=0}^{\sd} \sw_\sj \sx_{\sn\sj} \end{pmatrix}$$

The expression above is exactly the definition of a matrix product. Let $\mx$ be the design matrix, defined as follows:

$$\mx = \begin{pmatrix} \sx_{10} & \sx_{11} & \dotsb & \sx_{1\sd} \\ \vdots & \vdots & & \vdots \\ \sx_{\sn0} & \sx_{\sn1} & \dotsb & \sx_{\sn\sd} \end{pmatrix}$$

It is the matrix whose $\si$-th row is the vector $(1, \vx_\si)$:

$$\mx = \begin{pmatrix} 1 & \sx_{11} & \dotsb & \sx_{1\sd} \\ \vdots & \vdots & & \vdots \\ 1 & \sx_{\sn1} & \dotsb & \sx_{\sn\sd} \end{pmatrix}$$
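In NumPy, a design matrix of this shape can be built by prepending a column of ones to the raw inputs. This is a sketch with a hypothetical dataset of $\sn = 4$ records and $\sd = 2$ features:

```python
import numpy as np

# Hypothetical raw inputs: n = 4 records, d = 2 features each.
X_raw = np.array([[1.0, 2.0],
                  [0.5, -1.0],
                  [3.0, 0.0],
                  [-2.0, 4.0]])

# Design matrix: prepend a column of ones so that row i is (1, x_i).
X = np.column_stack([np.ones(len(X_raw)), X_raw])
print(X.shape)  # n rows, d + 1 columns
```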

The approximation can be written as a matrix product:

$$\vy \approx \mx \vw$$

This is the vector notation for a linear regression. The vector $\vy$ is called the output vector and the matrix $\mx$ is the design matrix ($\vw$ is the vector of parameters).
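To check numerically that the matrix-product form agrees with the per-record sums, here is a small sketch with randomly generated data (the sizes $\sn = 5$, $\sd = 3$ are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 3

# Design matrix: first column of ones, then n random input vectors.
X = np.column_stack([np.ones(n), rng.normal(size=(n, d))])
w = rng.normal(size=d + 1)  # parameter vector (w0, ..., wd)

y_hat = X @ w  # all n approximations in one matrix product

# Same result as computing sum_j w_j x_ij record by record.
y_loop = np.array([sum(w[j] * X[i, j] for j in range(d + 1)) for i in range(n)])
assert np.allclose(y_hat, y_loop)
```

The matrix form is not just notation: delegating the loop to a single matrix product is also how linear-algebra libraries compute these predictions efficiently.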