December 2013 – The Research Kitchen

Here’s a simple and intuitive way of looking at the geometry of a least squares regression:

Take the bottom left point in the triangle below as the origin O. For the linear model:

$$ Y=X\beta + \epsilon $$

Both $Y$ and $X\beta$ are vectors, and the residual vector $\epsilon$ is the difference between them. The standard least squares technique uses the squared length of the residual, $\epsilon^T\epsilon = (Y-X\beta)^T(Y-X\beta)$, as the error measure to be minimised, and this leads to the calculation of the $\beta$ coefficient vector.
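The minimisation step can be written out explicitly. This is a sketch that assumes $X$ has full column rank, so that $X^TX$ is invertible, and it writes $\hat{\beta}$ for the minimising coefficient vector (a notation introduced here for illustration):

$$
\begin{aligned}
S(\beta) &= (Y-X\beta)^T(Y-X\beta) \\
\frac{\partial S}{\partial \beta} &= -2X^T(Y-X\beta) = 0 \\
X^TX\hat{\beta} &= X^TY \\
\hat{\beta} &= \left(X^TX\right)^{-1}X^TY
\end{aligned}
$$

The second line can be read as $X^T(Y-X\hat{\beta}) = 0$: at the minimum, the residual vector is orthogonal to every column of $X$.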

Geometrically, the beta coefficients calculated by the least squares regression minimise the squared length of the error vector. The minimum is reached when $X\beta$ is the orthogonal projection of $Y$ onto the column space of $X$, i.e. when the error vector is perpendicular to $X\beta$, which turns (O, $Y$, $X\beta$) into a right-angled triangle.

The projection of $Y$ onto the column space of $X$ is done using the projection matrix $P$, which (assuming $X^TX$ is invertible) is defined as

$$ P = X\left(X^{T}X\right)^{-1}X^{T} $$

So, writing $\hat{\beta}$ for the least squares estimate of $\beta$, the fitted values are $X\hat{\beta} = \hat{Y} = PY$.
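The projection can be checked numerically. The following is a minimal sketch using NumPy on simulated data; the sample size, the seed, the "true" coefficients and all variable names are assumptions made for illustration, not part of the original post.

```python
import numpy as np

# Simulated data (assumed for illustration): intercept plus two regressors.
rng = np.random.default_rng(0)
n, p = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta_true = np.array([1.0, 2.0, -0.5])
Y = X @ beta_true + rng.normal(scale=0.3, size=n)

# Projection matrix P = X (X^T X)^{-1} X^T, formed with a linear solve
# rather than an explicit matrix inverse.
P = X @ np.linalg.solve(X.T @ X, X.T)
Y_hat = P @ Y

# Least squares coefficients from NumPy's solver, for comparison.
beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)

# P Y matches the fitted values X beta_hat.
assert np.allclose(Y_hat, X @ beta_hat)

# P is symmetric and idempotent, as an orthogonal projection should be.
assert np.allclose(P, P.T)
assert np.allclose(P @ P, P)

# The residual is perpendicular to every column of X.
residual = Y - Y_hat
assert np.allclose(X.T @ residual, 0.0, atol=1e-8)
```

Forming the full $n \times n$ matrix $P$ is only sensible for small examples like this one; in practice a solver such as `np.linalg.lstsq` gives the fitted values without building $P$.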

Using the Pythagorean theorem:

$$ Y^TY = \hat{Y}^T\hat{Y} + (Y-\hat{Y})^T(Y-\hat{Y}) $$

In other words, the total sum of squares = sum of squares due to regression + residual sum of squares. This is a fundamental part of analysis of variance techniques.
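The decomposition is easy to verify numerically. This is a minimal sketch with NumPy on simulated data; the data-generating values and variable names are assumptions for illustration. Note that the identity above uses uncentred sums of squares; the last few lines check the mean-centred version, which also holds when the model includes an intercept column, as it does here.

```python
import numpy as np

# Simulated data (assumed for illustration): intercept plus one regressor.
rng = np.random.default_rng(1)
n = 40
X = np.column_stack([np.ones(n), rng.normal(size=n)])
Y = 0.5 + 1.5 * X[:, 1] + rng.normal(scale=0.2, size=n)

# Fitted values and residuals from an ordinary least squares fit.
beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
Y_hat = X @ beta_hat
residual = Y - Y_hat

# Squared lengths of the three sides of the right-angled triangle.
total_ss = Y @ Y
regression_ss = Y_hat @ Y_hat
residual_ss = residual @ residual

# Pythagoras: |Y|^2 = |Y_hat|^2 + |Y - Y_hat|^2
assert np.isclose(total_ss, regression_ss + residual_ss)

# Mean-centred version (valid here because X contains an intercept column).
y_bar = Y.mean()
centred_total_ss = np.sum((Y - y_bar) ** 2)
centred_regression_ss = np.sum((Y_hat - y_bar) ** 2)
assert np.isclose(centred_total_ss, centred_regression_ss + residual_ss)
```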