Linear Algebra: X Least Squares Approximations


Consider a matrix system \(A \vec{x} = \vec{b}\) with an \(m \times n\) matrix \(A\) and an \(m\)-dimensional vector \(\vec{b}\).



Consistent Systems

A system is called consistent if there is at least one \(n\)-dimensional vector solution \(\vec{x}\). This means that \(\vec{b}\) is in the column space of \(A\):


\[ \vec{b} \in \mathrm{Col}(A) \]


A system is called inconsistent if there is no \(n\)-dimensional vector solution \(\vec{x}\). This means that \(\vec{b}\) is not in the column space of \(A\):


\[ \vec{b} \notin \mathrm{Col}(A) \]


If a system is consistent, let \(\vec{x}_p\) be a particular solution such that:


\[ A\vec{x}_p = \vec{b} \]


Then every solution has the form:


\[ \vec{x} = \vec{x}_p + \vec{x}_h, \quad \text{where} \quad \vec{x}_h \in \mathrm{Null}(A), \;\; A\vec{x}_h = \vec{0} \]
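As a quick numerical illustration (a NumPy sketch; the matrix and vectors below are invented for the example), adding any null-space vector to a particular solution still solves the system:

```python
import numpy as np

# Example 2x3 consistent system A x = b with a nontrivial null space.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
b = np.array([6.0, 15.0])

# A particular solution x_p (lstsq returns one since the system is consistent).
x_p, *_ = np.linalg.lstsq(A, b, rcond=None)

# A vector in Null(A): A x_h = 0.
x_h = np.array([1.0, -2.0, 1.0])
assert np.allclose(A @ x_h, 0)

# x_p + t * x_h solves A x = b for every scalar t.
for t in (0.0, 1.0, -3.5):
    assert np.allclose(A @ (x_p + t * x_h), b)
```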



Overdetermined Systems

A system is called overdetermined if \(m > n\). This means that:


\[ \mathrm{rank}(A) \leq \min\{m,n\} = n \]


And so the left null space of \(A\) is nontrivial:


\[ \dim(\mathrm{Null}(A^\top)) = m - \mathrm{rank}(A) > 0 \]


Therefore, the rows of \(A\) are linearly dependent, with at least \(m - \mathrm{rank}(A)\) dependencies. An overdetermined system is consistent only if \(\vec{b}\) satisfies those same dependencies, i.e., \(\vec{b}\) is orthogonal to \(\mathrm{Null}(A^\top)\).



Underdetermined Systems

A system is called underdetermined if \(n > m\). This means that:


\[ \mathrm{rank}(A) \leq \min\{m,n\} = m \]


And so the null space of \(A\) is nontrivial:


\[ \dim(\mathrm{Null}(A)) = n - \mathrm{rank}(A) > 0 \]


Therefore, if an underdetermined system is consistent, then there are $n - \mathrm{rank}(A)$ free variables, so there exist infinitely many solutions.
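This can be checked numerically (a NumPy sketch with a made-up $2 \times 4$ example): the rank gives the number of free variables, and the SVD supplies an orthonormal basis for $\mathrm{Null}(A)$:

```python
import numpy as np

# Underdetermined example: 2 equations, 4 unknowns (values chosen arbitrarily).
A = np.array([[1.0, 0.0, 2.0, 1.0],
              [0.0, 1.0, 1.0, 3.0]])
m, n = A.shape
r = np.linalg.matrix_rank(A)
assert n - r == 2          # two free variables

# The last n - r right singular vectors form an orthonormal basis for Null(A).
_, _, Vt = np.linalg.svd(A)
N = Vt[r:].T               # shape (n, n - r)
assert np.allclose(A @ N, 0)
```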



Least Squares Projections

Suppose that the system $A \vec{x} = \vec{b}$ is inconsistent, so $\vec{b} \notin \mathrm{Col}(A)$. However, since $\mathrm{Col}(A) = \mathrm{span}\{\vec{c}_1, \ldots, \vec{c}_n \}$, where $\vec{c}_i$ is the $i$-th column of $A$, we can consider projecting $\vec{b}$ onto each $\vec{c}_i$ using a projection matrix:

$$\mathrm{Proj}_{\vec{c}_i}(\vec{b}) = \frac{\vec{c}_i\vec{c}_i^\top}{\vec{c}_i^\top\vec{c}_i} \vec{b}$$
However, these individual projections overlap unless the columns of $A$ are orthogonal, so their sum does not generally equal the projection of $\vec{b}$ onto $\mathrm{Col}(A)$. Instead, we define the least squares approximation $\vec{x}^*$ such that:

$$ A \vec{x}^* = \mathrm{Proj}_{\mathrm{Col}(A)}(\vec{b}) $$
with residual (error) vector:

$$ \vec{r} = \vec{b} - A\vec{x}^* $$
which is orthogonal to all of the column vectors of $A$:

$$ \vec{c}_1^\top \vec{r} = 0, \quad \ldots, \quad \vec{c}_n^\top \vec{r} = 0 $$
Therefore:

$$ A^\top \vec{r} = \vec{0} \quad \Rightarrow \quad A^\top (\vec{b} - A\vec{x}^*) = \vec{0} $$

$$ \Rightarrow \quad A^\top \vec{b} = A^\top A\vec{x}^* $$

$$ \Rightarrow \quad \vec{x}^* = (A^\top A)^{-1} A^\top \vec{b} $$
provided that $A$ has full column rank, so that $A^\top A$ is invertible.
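A minimal NumPy sketch of the normal equations (the data are invented; in practice one solves the system rather than forming the explicit inverse):

```python
import numpy as np

# Overdetermined example: fit a line y ≈ c0 + c1 * t to made-up data.
t = np.array([0.0, 1.0, 2.0, 3.0])
b = np.array([1.0, 2.9, 5.1, 6.9])
A = np.column_stack([np.ones_like(t), t])   # m x n design matrix, m > n

# Normal equations: A^T A x* = A^T b (solve, rather than invert explicitly).
x_star = np.linalg.solve(A.T @ A, A.T @ b)

# Agrees with NumPy's built-in least squares routine.
x_ref, *_ = np.linalg.lstsq(A, b, rcond=None)
assert np.allclose(x_star, x_ref)

# The residual is orthogonal to Col(A): A^T r = 0.
r = b - A @ x_star
assert np.allclose(A.T @ r, 0)
```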
The least squares projection of $\vec{b}$ onto $\mathrm{Col}(A)$ is thus

$$ \mathrm{Proj}_{\mathrm{Col}(A)}(\vec{b}) = A \vec{x}^* = A (A^\top A)^{-1} A^\top \vec{b} $$
Here $P = A (A^\top A)^{-1} A^\top$ is called the least squares projection matrix, which satisfies:

$$ \mathrm{rank}(P) = \mathrm{rank}(A) $$

Idempotent Property

The least squares projection matrix $P$ is idempotent, meaning that $P^2=P$:

$$ P^2 = [A (A^\top A)^{-1} A^\top][A (A^\top A)^{-1} A^\top] $$

$$ = A (A^\top A)^{-1} (A^\top A) (A^\top A)^{-1} A^\top = A (A^\top A)^{-1} A^\top = P $$


Symmetric Property

The least squares projection matrix $P$ is symmetric, meaning that $P^\top=P$:

$$ P^\top = [A (A^\top A)^{-1} A^\top]^\top $$
$$ = A [(A^\top A)^{-1}]^\top A^\top $$
$$ = A [(A^{-1}(A^\top)^{-1}]^\top A^\top $$
$$ = A [[(A^\top)^{-1}]^\top[A^{-1}]^\top A^\top $$
$$ = A [[A^{-1}][A^\top]^{-1} A^\top $$
$$ = A (A^\top A)^{-1} A^\top = P $$
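Both properties, along with the rank identity above, are easy to verify numerically (a NumPy sketch; the matrix is random, chosen so that $A$ has full column rank):

```python
import numpy as np

# Random m x n matrix with m > n; full column rank with probability 1.
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))

# Least squares projection matrix P = A (A^T A)^{-1} A^T.
P = A @ np.linalg.inv(A.T @ A) @ A.T

assert np.allclose(P @ P, P)    # idempotent: P^2 = P
assert np.allclose(P.T, P)      # symmetric: P^T = P
assert np.linalg.matrix_rank(P) == np.linalg.matrix_rank(A)  # rank(P) = rank(A)
```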

QR Decomposition

If \( m \geq n \), then \( A \) can be decomposed as

$$ A = QR $$
where $Q$ is an $m \times n$ matrix with orthonormal columns ($Q^\top Q = I$) and $R$ is an $n \times n$ upper triangular matrix. Then,

$$ Q^\top \vec{r} = \vec{0} \quad \Rightarrow \quad Q^\top (\vec{b} - QR\vec{x}^*) = \vec{0} $$

$$ \Rightarrow \quad Q^\top Q R \vec{x}^* = Q^\top \vec{b} \quad \Rightarrow \quad R \vec{x}^* = Q^\top \vec{b} $$
Since $R$ is upper triangular, we can solve for the least squares approximation $\vec{x}^*$ by back substitution (again assuming $A$ has full column rank, so that $R$ is invertible).
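The QR route can be sketched in NumPy (data made up; the back substitution loop is written out explicitly to show the triangular solve):

```python
import numpy as np

# Overdetermined system A x ≈ b with m = 3 > n = 2 (invented data).
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 2.0, 4.0])

Q, R = np.linalg.qr(A)    # reduced QR: Q is m x n, R is n x n upper triangular
y = Q.T @ b               # right-hand side of R x* = Q^T b

# Back substitution on the upper triangular R, from the last row up.
n = R.shape[0]
x = np.zeros(n)
for i in range(n - 1, -1, -1):
    x[i] = (y[i] - R[i, i + 1:] @ x[i + 1:]) / R[i, i]

# Matches NumPy's least squares solution.
assert np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0])
```

The QR approach avoids forming $A^\top A$, whose condition number is the square of that of $A$, so it is the numerically preferred method.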
