A Course on Numerical Linear Algebra¶
Instructor: Deniz Bilman
Textbook: Numerical Linear Algebra, by Trefethen and Bau.
Lecture 6: Projectors¶
A projector is a square matrix $P$ that satisfies
$$ P^2=P $$Such a matrix is also said to be idempotent. This definition includes both orthogonal projectors and nonorthogonal projectors.
Note that if $v\in\operatorname{range}(P)$, then $P v = v$. This is true because $v\in\operatorname{range}(P)$ means that $v=P x$ for some $x$ and $P v= P^2 x = P x = v$.
What about the remainder after projecting $v$ by a projector $P$? Let $v$ be arbitrary, but with $Pv \neq v$. We can draw the line from $v$ to $Pv$, which is in the direction of $Pv -v$. This should be the direction from which we are shining the light on $v$ (and $Pv$ is the "shadow"). Note that $Pv-v$ satisfies:
$$ P(Pv -v) = P^2 v - P v = 0. $$This shows that $Pv -v \in\operatorname{null}(P)$.
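To make this concrete, here is a minimal numerical sketch (assuming NumPy is available) using a simple nonorthogonal projector in $\mathbb{C}^2$; it checks idempotency and that $Pv-v$ lies in $\operatorname{null}(P)$.

```python
import numpy as np

P = np.array([[1.0, 1.0],
              [0.0, 0.0]])  # idempotent but not Hermitian: an oblique projector

print(np.allclose(P @ P, P))             # True: P^2 = P
v = np.array([2.0, 3.0])
print(np.allclose(P @ (P @ v - v), 0))   # True: Pv - v lies in null(P)
```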
Complementary Projectors¶
The consideration above motivates us to investigate $\mathbb{I}-P$. If $P$ is a projector, then $\mathbb{I}-P$ is also a projector. To see why:
$$ (\mathbb{I}-P)^2=\mathbb{I}-2 P+P^2=\mathbb{I}-P $$The matrix $\mathbb{I}-P$ is called the complementary projector to $P$.
It is easy to see that $\operatorname{range}(\mathbb{I}-P)\subseteq\operatorname{null}(P)$. More is true: if $P x = 0$, then $x = x - Px = (\mathbb{I}-P)x$, so $x\in\operatorname{range}(\mathbb{I}-P)$. Therefore:
$$ \operatorname{range}(\mathbb{I}-P)=\operatorname{null}(P). $$
We can write $P = \mathbb{I} -(\mathbb{I} - P)$ to obtain the complementary fact:
$$ \operatorname{null}(\mathbb{I}-P)=\operatorname{range}(P). $$
Finally, suppose that $v\in \operatorname{range}(P)\cap \operatorname{null}(P)$. Then $v\in \operatorname{null}(\mathbb{I}-P)$ by the identity above. This yields $v = v - 0 = v- Pv = (\mathbb{I}-P)v = 0$. Therefore,
$$ \operatorname{range}(P) \cap \operatorname{null}(P)=\{0\} . $$So a projector separates $\mathbb{C}^m$ into two subspaces that intersect only at $0$.
How about the converse question?
Let $S_1$ and $S_2$ be two subspaces of $\mathbb{C}^m$ such that $S_1 \cap S_2=\{0\}$ and $S_1+S_2 = \mathbb{C}^m$. Such a pair is said to be a pair of complementary subspaces. Then there exists a projector $P$ such that $\operatorname{range}(P) = S_1$ and $\operatorname{null}(P) = S_2$. This projector and its complement can be seen as the unique solution to the following problem:
Given $v$, find vectors $v_1 \in S_1$ and $v_2 \in S_2$ such that $v_1+v_2=v$.
The projection $P v$ gives $v_1$, and the complementary projection $(\mathbb{I}-P) v$ gives $v_2$.
Question: Why does such a projection exist given the subspaces and why is it unique?
Since $S_1+S_2 = \mathbb{C}^m$, for any given $v\in\mathbb{C}^m$ there exist vectors $v_1 \in S_1$ and $v_2 \in S_2$ such that $v_1+v_2 = v$. This pair $v_1$, $v_2$ is unique. To see this, suppose that $u_1\in S_1$ and $u_2 \in S_2$ are another such pair: $v=u_1+u_2$. Then $u_1-v_1 \in S_1$ and $u_2-v_2 \in S_2$, and we have
$$ 0 = v-v =(u_1-v_1) + (u_2-v_2). $$This implies that $u_1-v_1 = -(u_2-v_2)$. Therefore, $(u_2-v_2) \in S_1$ as well. Since $S_1\cap S_2 = \{0\}$, it follows that $u_2=v_2$. Similarly, $u_1=v_1$.
With uniqueness of this decomposition established, we can define the map $v\mapsto P v$ by assigning $Pv := v_1$. Considering $v_1 \in \mathbb{C}^m$, we have the unique decomposition $v_1 = v_1 + 0$, where $v_1\in S_1$ and $0\in S_2$. Therefore $P^2 v = P v_1 = v_1 = P v$ and the map is idempotent. It is a short exercise to show that $v\mapsto Pv$ is also linear. With these two properties, the map is a projector and we can take $P$ to be its matrix representation.
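As a sketch of this construction (NumPy assumed; the basis matrices `B1` and `B2` are illustrative choices), one can build $P$ from bases of $S_1$ and $S_2$ by expressing $v$ in the combined basis and keeping only the $S_1$ coordinates:

```python
import numpy as np

rng = np.random.default_rng(0)
m, k = 5, 2
B1 = rng.standard_normal((m, k))      # columns: a basis for S1
B2 = rng.standard_normal((m, m - k))  # columns: a basis for S2

B = np.hstack([B1, B2])               # invertible when S1 and S2 are complementary
E = np.zeros((m, m))
E[:k, :k] = np.eye(k)                 # keeps the S1 coordinates, zeroes the S2 coordinates
P = B @ E @ np.linalg.inv(B)          # v = B c  ->  P v = B (c_1, ..., c_k, 0, ..., 0)

print(np.allclose(P @ P, P))     # idempotent
print(np.allclose(P @ B1, B1))   # fixes S1, so range(P) = S1
print(np.allclose(P @ B2, 0))    # annihilates S2, so null(P) = S2
```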
Everything we have covered so far is for general projectors. We now consider more special projectors.
Orthogonal Projectors¶
An orthogonal projector is one that projects onto a subspace $S_1$ along a space $S_2$, where $S_1$ and $S_2$ are orthogonal complements of each other.
Warning: orthogonal projectors are not orthogonal matrices!
An equivalent algebraic definition is the following:
An orthogonal projector is a projector that satisfies $P^*=P$.
These two definitions are equivalent:
Proposition:¶
A projection $P$ satisfies $P^* = P$ if and only if $(\operatorname{null}(P))^{\perp} = \operatorname{range}(P)$.
Proof:¶
($\Longrightarrow$) Assume $P^* = P$ for the projection matrix $P$. Let $u\in \operatorname{null}(P)$ and $v\in \operatorname{range}(P)$. Observe: $$ u^* v = \left((\mathbb{I}-P) w \right)^* P x $$ for some $w, x\in\mathbb{C}^m$. Then $$ u^* v = w^* (\mathbb{I}-P)^* P x = w^* (\mathbb{I}-P) P x = w^*(P x - P x) = 0. $$ This only shows that $\operatorname{null}(P)$ and $\operatorname{range}(P)$ are orthogonal spaces. But we also know that $\operatorname{null}(P)+\operatorname{range}(P)=\mathbb{C}^m$. Therefore, $(\operatorname{null}(P))^{\perp} = \operatorname{range}(P)$.
($\Longleftarrow$) Assume that $(\operatorname{null}(P))^{\perp} = \operatorname{range}(P)$ for the projection matrix $P$. Let $x,y\in\mathbb{C}^m$ be arbitrary. We write $x = P x + (\mathbb{I}-P)x$ and note that $(\mathbb{I}-P)x$ is orthogonal to $P y \in \operatorname{range}(P)$. Then \begin{align*} x^* P y &= (P x + (\mathbb{I}-P)x)^* P y\\ &= (P x)^* P y\\ &= (P x)^* (P y + (\mathbb{I}-P)y )\\ &= (P x)^* y\\ &= x^* P^* y. \end{align*} We know that $P^*=P$ if and only if $x^* P y = x^* P^* y$ for all $x,y\in\mathbb{C}^m$, so we are done. $\blacksquare$
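A small numerical illustration of the proposition (a sketch, assuming NumPy): an orthogonal projector onto $\operatorname{span}\{e_1\}$ is Hermitian, while the oblique projector onto the same subspace along $\operatorname{span}\{(1,-1)\}$ is not.

```python
import numpy as np

P_orth = np.array([[1.0, 0.0],
                   [0.0, 0.0]])  # projects onto span{e1} along span{e2}: orthogonal
P_obl  = np.array([[1.0, 1.0],
                   [0.0, 0.0]])  # projects onto span{e1} along span{(1,-1)}: oblique

print(np.allclose(P_orth.T, P_orth))  # True:  Hermitian; null(P) orthogonal to range(P)
print(np.allclose(P_obl.T, P_obl))    # False: not Hermitian; null(P) not orthogonal to range(P)
```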
Proposition:¶
A nontrivial orthogonal projection $P$ has $\| P \|_2 = 1$.
Proof:¶
Let $u\neq 0$ be in the range of $P$ (such a $u$ exists since $P$ is nontrivial), so that $Pu = u$. Then by definition of the norm:
$$ \| P \|_2 = \sup_{\|x\|_2=1} \| P x \|_2 \geq \frac{\| P u \|_2}{\| u \|_2} = 1 $$For arbitrary $v\in\mathbb{C}^m$, $v\neq 0$, again write $v = P v + (\mathbb{I}-P)v$. Then using $P^*=P$, it follows that
$$ \|v\|_2^2 = \|P v\|_2^2 + \|(\mathbb{I}-P)v\|_2^2. $$Then we have
$$ \frac{\| P v \|_2}{\| v \|_2} = \frac{\| P v \|_2}{\sqrt{\|P v\|_2^2 + \|(\mathbb{I}-P)v\|_2^2}} \leq 1. $$We have arrived at
$$ 1\leq \| P \|_2 \leq 1 \implies \| P \|_2 =1. $$Exercise: Show that $\|A P+A(\mathbb{I}-P)\|_F^2=\|A P\|_F^2+\|A(\mathbb{I}-P)\|_F^2$.
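Numerically, the norm statement can be checked as follows (a sketch, assuming NumPy): the orthogonal projector has 2-norm exactly $1$, while an oblique projector onto the same subspace can have a larger norm.

```python
import numpy as np

P_orth = np.array([[1.0, 0.0],
                   [0.0, 0.0]])   # orthogonal projector
P_obl  = np.array([[1.0, 1.0],
                   [0.0, 0.0]])   # oblique projector onto the same subspace

print(np.linalg.norm(P_orth, 2))  # 1.0
print(np.linalg.norm(P_obl, 2))   # sqrt(2) ~ 1.414: an oblique projector can have norm > 1
```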
Note that given an orthogonal projection $P\in\mathbb{C}^{m\times m}$, we can construct an SVD for it as follows. Let $\left\{q_1, \ldots, q_n\right\}$ be an orthonormal basis for $S_1=\operatorname{range}(P)$ and let $\left\{q_{n+1}, \ldots, q_m\right\}$ be an orthonormal basis for $S_2=\operatorname{null}(P)$. Since $S_2 = S_1^{\perp}$, the matrix $Q$ whose $j$-th column is $q_j$ is unitary. Then
$$ [PQ]_{\colon,j} = \begin{cases} q_j,&\quad 1\leq j \leq n,\\ 0,&\quad n+1\leq j \leq m. \end{cases} $$This implies that
$$ Q^*PQ = \begin{bmatrix} 1 & & & & \\ & \ddots & & & \\ & & 1 & & \\ & & & 0 & \\ & & & & \ddots \end{bmatrix}=:\Sigma $$Thus,
$$ P = Q \Sigma Q^* $$is an SVD for $P$.
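Here is a quick numerical check of this structure (a sketch, NumPy assumed): take any unitary $Q$, form the orthogonal projector onto the span of its first $n$ columns, and verify that $Q^* P Q$ is the diagonal matrix of ones and zeros above.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 5, 2
Q, _ = np.linalg.qr(rng.standard_normal((m, m)))  # a unitary (here real orthogonal) matrix
P = Q[:, :n] @ Q[:, :n].conj().T                  # orthogonal projector onto span{q_1, ..., q_n}

Sigma = Q.conj().T @ P @ Q
print(np.round(Sigma, 12))                        # diag(1, 1, 0, 0, 0)
print(np.linalg.svd(P, compute_uv=False))         # singular values: 1, 1, 0, 0, 0
```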
Constructing Orthogonal Projectors¶
First, an observation: We can switch to using the thin (reduced) version of the SVD above since the diagonal $m\times m$ matrix $\Sigma$ has trailing zeros on the diagonal. If we truncate those $m-n$ zeros to obtain $\hat{\Sigma}=\mathbb{I}_{n\times n}$ and remove the associated columns from $Q$, we end up with the $m\times n$ matrix: $$ \hat{Q} := \begin{bmatrix} q_1 & q_2 & \cdots & q_n \end{bmatrix}. $$ This results in the quite simple expression: $$ P = \hat{Q} \hat{Q}^*. $$
We have seen that any $v\in\mathbb{C}^m$ can be written as
$$ v = r + \sum_{i=1}^n\left(q_i q_i^*\right) v, $$where $r$ is orthogonal to $\operatorname{range}(\hat{Q})$; this is the decomposition of $v$ into a component in the column space of $\hat{Q}$ plus a component in the orthogonal complement. Thus the map $$ v \mapsto \sum_{i=1}^n\left(q_i q_i^*\right) v $$ is an orthogonal projector onto $\operatorname{range}(\hat{Q})$. In other words, the product $\hat{Q} \hat{Q}^*$ is always an orthogonal projector onto the column space of $\hat{Q}$, regardless of how $\hat{Q}$ was obtained, as long as its columns are orthonormal.
So, the matrix $\hat{Q}$ above need not come from an SVD.
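For instance (a sketch, NumPy assumed), a $\hat{Q}$ obtained from a reduced QR factorization works just as well:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 3))
Qhat, _ = np.linalg.qr(A)          # reduced QR: the columns of Qhat are orthonormal
P = Qhat @ Qhat.conj().T

print(np.allclose(P @ P, P))       # idempotent
print(np.allclose(P.conj().T, P))  # Hermitian, hence an orthogonal projector
print(np.allclose(P @ A, A))       # fixes range(A) = range(Qhat)
```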
In general, let $X\subseteq\mathbb{C}^m$ be an arbitrary (nontrivial) subspace of dimension $r$ and construct an orthonormal basis $\{q_1,q_2,\ldots,q_r\}$ for $X$. Then construct the $m\times m$ matrix $$ Q := \begin{bmatrix}q_1 & q_2 & \cdots & q_r & \underbrace{0_{m\times 1} \cdots 0_{m\times 1}}_{\text{may be omitted}}\end{bmatrix} $$
Then the orthogonal projection onto the space $X$ is given by
$$ P = Q Q^*. $$Note that we have $P^* = P$ as we should. Also:
$$ P^2 = Q Q^*Q Q^* = Q \begin{bmatrix} 1 & & & & \\ & \ddots & & & \\ & & 1 & & \\ & & & 0 & \\ & & & & \ddots \end{bmatrix} Q^* = Q Q^* = P. $$
Rank-One Projections¶
A special case is the rank-one orthogonal projection. Such a projector projects onto the span of a single unit vector $q\in\mathbb{C}^m$ and is given by
$$ P = q q^*. $$Its complement is given by $\mathbb{I}-qq^*$.
Neither the rank-one projection nor its complement is a full-rank matrix. But we observe:
$$ (\mathbb{I} - 2 q q^*)^* (\mathbb{I} - 2 q q^*) = (\mathbb{I} - 2 P)^*(\mathbb{I} - 2 P) = (\mathbb{I} - 2 P)(\mathbb{I} - 2 P) = \mathbb{I} -4P + 4P^2 = \mathbb{I} -4P + 4P = \mathbb{I}. $$One can similarly show that $(\mathbb{I} - 2 q q^*) (\mathbb{I} - 2 q q^*)^* = \mathbb{I}$ as well. Therefore $(\mathbb{I} - 2 q q^*)$ is a unitary matrix.
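A short numerical sketch (NumPy assumed) of a rank-one projector and the unitarity of $\mathbb{I}-2qq^*$:

```python
import numpy as np

rng = np.random.default_rng(3)
q = rng.standard_normal(5)
q = q / np.linalg.norm(q)      # a unit vector

P = np.outer(q, q)             # rank-one orthogonal projector onto span{q}
F = np.eye(5) - 2 * P          # the matrix I - 2 q q^*

print(np.linalg.matrix_rank(P))                # 1
print(np.allclose(P @ P, P))                   # idempotent
print(np.allclose(F.conj().T @ F, np.eye(5)))  # True: I - 2 q q^* is unitary
```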
So far we have assumed that $q$ is a unit vector. For arbitrary nonzero vectors $a$, the analogous formulas are
$$ P_a=\frac{a a^*}{a^* a}\qquad\text{and}\qquad P_{\perp a}=\mathbb{I}-\frac{a a^*}{a^* a}. $$
Projection with an Arbitrary Basis¶
An orthogonal projector onto a subspace of $\mathbb{C}^m$ can also be constructed beginning with an arbitrary basis, not necessarily orthogonal.
Suppose that the subspace is spanned by the linearly independent vectors $\left\{a_1, \ldots, a_n\right\}$, and let $A$ be the $m \times n$ matrix whose $j$ th column is $a_j$.
The orthogonal projection $y$ of $v\in\mathbb{C}^m$ onto the column space of $A$ lies in $\operatorname{range}(A)$, and $y-v$ must be orthogonal to $\operatorname{range}(A)$. In other words,
$$ a_j^* (y-v) = 0\qquad\text{for every}\quad j=1,2,\ldots,n. $$Since $y=Ax$ for some $x\in \mathbb{C}^{n}$, this reads
$$ a_j^* (Ax-v) = 0\qquad\text{for every}\quad j=1,2,\ldots,n. $$This is equivalent to
$$ A^* (Ax -v) = 0 \iff A^* A x = A^* v. $$Note that $A$ has full rank since its columns are linearly independent. It can be shown (exercise!) that $A^* A$ is nonsingular. Therefore,
$$ x = (A^* A)^{-1} A^* v. $$But recall that the projection $y$ of $v$ onto the column space of $A$ is $y=Ax$. So,
$$ y = A (A^* A)^{-1} A^* v $$So the desired projection matrix is $P=A (A^* A)^{-1} A^*$.
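As a sketch (NumPy assumed; in practice one solves the normal equations or uses a QR factorization rather than forming $(A^* A)^{-1}$ explicitly):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((6, 3))   # linearly independent columns (with probability 1)
v = rng.standard_normal(6)

# P = A (A^*A)^{-1} A^*, computed by solving the normal equations
P = A @ np.linalg.solve(A.conj().T @ A, A.conj().T)
y = P @ v

print(np.allclose(P @ P, P), np.allclose(P.conj().T, P))  # idempotent and Hermitian
print(np.allclose(A.conj().T @ (y - v), 0))               # y - v is orthogonal to range(A)
```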
Remark:¶
The formula $P=A (A^* A)^{-1} A^*$ is a multidimensional generalization of the formula $P_a=\frac{a a^*}{a^* a}$.
Remark:¶
In the orthonormal case, the central factor $(A^* A)^{-1}$ in the formula $P=A (A^* A)^{-1} A^*$ collapses since $A=\hat{Q}$ and we are left with $P=\hat{Q} \hat{Q}^*$.
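A quick numerical confirmation of this remark (a sketch, NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(5)
Qhat, _ = np.linalg.qr(rng.standard_normal((6, 3)))   # orthonormal columns

P_general = Qhat @ np.linalg.solve(Qhat.conj().T @ Qhat, Qhat.conj().T)
P_simple  = Qhat @ Qhat.conj().T

print(np.allclose(Qhat.conj().T @ Qhat, np.eye(3)))   # A^*A = I for orthonormal columns
print(np.allclose(P_general, P_simple))               # the two formulas agree
```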