A Course on Numerical Linear Algebra
Instructor: Deniz Bilman
Textbook: Numerical Linear Algebra, by Trefethen and Bau.
Lecture 8: Gram-Schmidt Orthogonalization
We have already mentioned that the classical Gram-Schmidt iteration is numerically unstable.
At each step, the classical Gram-Schmidt iteration computes only one orthogonal projection:
Let $P_j$ denote the projection $$ P_j = \mathbb{I} - \hat{Q}_{j-1} \hat{Q}_{j-1}^*,\qquad \hat{Q}_{j-1} := \begin{bmatrix} q_1 & q_2 & \cdots & q_{j-1} \end{bmatrix}, $$ for orthonormal columns $q_1, q_2,\ldots$. At each step, we compute $v_j := P_j a_j$ for $a_j \in \mathbb{C}^m$. Here $P_j$ is a projector of rank $m-(j-1)$: it projects $a_j$ onto the orthogonal complement of the span of all the previous $q_k$'s. In particular, $P_1=\mathbb{I}$.
using LinearAlgebra
m = 8
j = 4
# Quick way to get a random Qj with j orthonormal columns.
Qj = Matrix(qr(rand(m, j)).Q)
Qj |> display
# Form the projector onto the orthogonal complement of the columns of Qj
Pj = Matrix(I, m, m) - Qj * Qj'
Pj |> display
# Note that the singular values of Pj are all 1s or 0s (here: four of each).
σ = svd(Pj).S
8×4 Matrix{Float64}:
 -0.213047   -0.202534   -0.245694   -0.447233
 -0.151908   -0.519449   -0.211781    0.146222
 -0.0896999  -0.579809   -0.285335    0.043158
 -0.429645   -0.0615816   0.0251773  -0.436842
 -0.676201    0.285642   -0.340633    0.173449
 -0.34281    -0.157327    0.771063   -0.260595
 -0.398918    0.230798    0.11058     0.484822
 -0.0706536  -0.435399    0.299895    0.502854
8×8 Matrix{Float64}:
  0.653208   -0.124208   -0.187345    …  -0.032       0.205753    0.19534
 -0.124208    0.640865   -0.381546        0.0676021   0.0118156  -0.246917
 -0.187345   -0.381546    0.572497        0.109289    0.108664   -0.194917
 -0.293191   -0.0280471  -0.0482074      -0.290227    0.0518259   0.154948
 -0.0923295  -0.051846    0.000282129     0.12098    -0.382099    0.0915265
 -0.032       0.0676021   0.109289    …   0.195282   -0.0593648  -0.192917
  0.205753    0.0118156   0.108664       -0.0593648   0.540316   -0.204653
  0.19534    -0.246917   -0.194917       -0.192917   -0.204653    0.462637
8-element Vector{Float64}:
 1.0000000000000002
 1.0000000000000002
 1.0
 1.0
 4.505279385120622e-16
 3.9003634431939383e-16
 3.487356037184413e-16
 1.1660350749431453e-16
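A quick sanity check on the projector computed above (this snippet is ours, reusing the Qj and Pj from the cells above): an orthogonal projector is Hermitian and idempotent, and since Qj here has $j$ columns, Pj has rank $m-j$.
# Pj is an orthogonal projector: Hermitian, idempotent, of rank m - j.
@show rank(Pj)          # expect m - j = 4
@show norm(Pj^2 - Pj)   # idempotency: ≈ machine epsilon
@show norm(Pj' - Pj)    # symmetry: ≈ machine epsilon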
Modified Gram-Schmidt Iteration
We can express the projection $P_j$ as the product of the complement projectors $P_{\perp q} := \mathbb{I} - q q^*$:
$$ P_j = P_{\perp q_{j-1}} \cdots P_{\perp q_2} P_{\perp q_1}. $$
# Initialize the product to be the identity matrix
Xj = Matrix(I,m,m)
# Take the products to form the Pj as given above.
for k = 1:j
    # Left-multiply so the factors accumulate in the order shown in the formula.
    Xj = (Matrix(I, m, m) - Qj[:, k] * Qj[:, k]') * Xj
end
norm(Xj-Pj)
2.1698319048434277e-16
This works simply because the $q_k$'s are orthonormal: expand the product and check that every cross-term vanishes.
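For instance, with two factors, $$ P_{\perp q_2} P_{\perp q_1} = (\mathbb{I} - q_2 q_2^*)(\mathbb{I} - q_1 q_1^*) = \mathbb{I} - q_1 q_1^* - q_2 q_2^* + q_2 (q_2^* q_1) q_1^* = \mathbb{I} - q_1 q_1^* - q_2 q_2^*, $$ since $q_2^* q_1 = 0$. The same cancellation occurs for every pair of factors.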
The modified Gram-Schmidt iteration (MGS) consists, in principle, of the same computations as the classical Gram-Schmidt iteration (CGS), but it performs them in a slightly different order.
In CGS we orthogonalize a column $a_j$ against the columns $q_k$ ($k<j$) by projecting the original column $a_j$ onto each $q_k$ and subtracting:
\begin{align*} v_j &:= a_j - \sum_{k=1}^{j-1} (q_k^* a_j)q_k,\\ q_j &:= \frac{v_j}{\|v_j\|_2}. \end{align*}
In MGS, as soon as $q_k$ is computed, we immediately subtract the component along $q_k$ from all of the remaining columns $a_i$ to the right of column $k$ ($i>k$).
This means that the orthogonalization of column $a_j$ against $q_k$ is carried out by subtracting the component along $q_k$ from the vector $v_j^{(k)}$, where
\begin{align*} v_{j}^{(1)}&:=a_j,\\ v_{j}^{(2)}&:=P_{\perp q_1}v_{j}^{(1)} = v_{j}^{(1)} - \left(q_1^* v_{j}^{(1)}\right)q_1, \\ v_{j}^{(3)}&:=P_{\perp q_2}v_{j}^{(2)} = v_{j}^{(2)} - \left(q_2^* v_{j}^{(2)}\right)q_2, \\ &\;\;\vdots \\ v_j := v_{j}^{(j)}&:=P_{\perp q_{j-1}}v_{j}^{(j-1)} = v_{j}^{(j-1)} - \left(q_{j-1}^* v_{j}^{(j-1)}\right)q_{j-1}. \end{align*}
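Here is a minimal sketch of this recursion for a single vector, reusing the Qj computed earlier (the vector a below is just a random example of ours):
# Orthogonalize a vector a against the columns of Qj, one projector at a time (MGS order).
a = rand(m)
v = copy(a)
for k in 1:size(Qj, 2)
    v -= (Qj[:, k]' * v) * Qj[:, k]   # apply P_{⊥ q_k}
end
norm(Qj' * v)   # ≈ 0: v is now orthogonal to every column of Qj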
Why is the roundoff error smaller in MGS?
Note that $q_k$ carries an error with components in $\operatorname{span}(\{q_1,q_2,\ldots, q_{k-1}\})$, and $a_j$ may have large components in $\operatorname{span}(\{q_1,q_2,\ldots, q_{k-1}\})$. By design, however, in MGS the vector $v_j^{(k)}$ has only error-size components in $\operatorname{span}(\{q_1,q_2,\ldots, q_{k-1}\})$. Therefore the inner product $q_k^* a_j$ used in CGS results in larger errors than the inner product $q_k^* v_j^{(k)}$ used in MGS.
Let's see this in action.
# Implementing Classical Gram-Schmidt (CGS)
function classical_gram_schmidt(A)
    m, n = size(A)
    Q = zeros(m, n)
    R = zeros(n, n)
    R[1, 1] = norm(A[:, 1])
    Q[:, 1] = A[:, 1] / R[1, 1]
    for j = 2:n
        # Project a_j onto all the previously computed q's at once, then subtract.
        R[1:j-1, j] = Q[:, 1:j-1]' * A[:, j]
        v = A[:, j] - Q[:, 1:j-1] * R[1:j-1, j]
        R[j, j] = norm(v)
        Q[:, j] = v / R[j, j]
    end
    # R is assembled as well, but only Q is needed for the checks below.
    return Q
end
A = round.(10*rand(8,8))
Qcgs = classical_gram_schmidt(A)
# Just check orthogonality
m = size(Qcgs, 1)
Matrix(I, m, m)-Qcgs*Qcgs'
8×8 Matrix{Float64}:
  1.19904e-14  -4.2805e-15   -3.28134e-15  …  -3.89261e-15  -3.64241e-15
 -4.2805e-15   -6.88338e-15  -4.74382e-15      6.76964e-16  -3.67212e-15
 -3.28134e-15  -4.74382e-15  -3.10862e-15      1.0424e-15   -8.6227e-16
 -2.97893e-15  -1.4617e-16   -8.12949e-16      1.09602e-15   1.02259e-15
  7.31927e-15   6.41607e-15   5.29492e-15     -2.09818e-15   2.12093e-15
  7.62787e-15  -1.53469e-15  -1.87155e-15  …  -2.74933e-15  -1.49055e-15
 -3.89261e-15   6.76964e-16   1.0424e-15       1.33227e-15   1.07084e-15
 -3.64241e-15  -3.67212e-15  -8.6227e-16       1.07084e-15  -4.44089e-16
# Modified Gram-Schmidt (MGS)
function modified_gram_schmidt(A)
m, n = size(A)
Q = copy(A)
for k in 1:n
Q[:, k] /= norm(Q[:, k]) # Normalize
for j in k+1:n
Q[:, j] -= (Q[:, k]' * Q[:, j]) * Q[:, k] # Orthogonalization step
end
end
return Q
end
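As an aside, recording the coefficients in the same loop yields the full QR factorization. Here is a minimal sketch of that variant (the name modified_gram_schmidt_qr is ours):
function modified_gram_schmidt_qr(A)
    m, n = size(A)
    Q = float(copy(A))
    R = zeros(n, n)
    for k in 1:n
        R[k, k] = norm(Q[:, k])
        Q[:, k] /= R[k, k]                 # Normalize
        for j in k+1:n
            R[k, j] = Q[:, k]' * Q[:, j]   # Coefficient along q_k
            Q[:, j] -= R[k, j] * Q[:, k]   # Orthogonalize the remaining columns
        end
    end
    return Q, R
end
# Usage: Q, R = modified_gram_schmidt_qr(A); norm(A - Q*R) should be at machine-precision level.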
A = round.(10*rand(8,8))
Qmgs = modified_gram_schmidt(A)
# Just check orthogonality
m = size(Qmgs, 1)
Matrix(I, m, m)-Qmgs*Qmgs'
8×8 Matrix{Float64}:
  0.0           1.70104e-16   5.26442e-16  …  -3.24987e-16   5.1595e-16
  1.70104e-16   7.77156e-16  -3.24783e-16      4.42414e-18   1.00732e-15
  5.26442e-16  -3.24783e-16   1.11022e-16      2.68324e-16  -5.42544e-16
  5.24078e-16   2.70703e-16  -4.0924e-18       2.76845e-16   4.70332e-16
  4.3916e-16    1.46378e-16  -3.08433e-16      2.29891e-16  -3.24684e-16
  5.83465e-16  -1.15728e-15  -2.75138e-16  …   4.24259e-16  -1.80613e-15
 -3.24987e-16   4.42414e-18   2.68324e-16     -2.22045e-16   5.43335e-16
  5.1595e-16    1.00732e-15  -5.42544e-16      5.43335e-16   8.88178e-16
We want $\mathbb{I} - Q Q^* = 0$. Already for this well-conditioned random matrix, CGS loses about two more digits of orthogonality than MGS; the gap becomes dramatic for ill-conditioned inputs. We will check this using Hilbert matrices (homework: read about them).
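For reference, the $n\times n$ Hilbert matrix has entries $$ H_{ij} = \frac{1}{i+j-1}, \qquad 1 \le i,j \le n. $$ Hilbert matrices are symmetric positive definite but extremely ill-conditioned, which makes them a good stress test; the helper below adds $0.001\,\mathbb{I}$ as a regularization so that the columns remain numerically linearly independent.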
# Let's compare for matrices of growing sizes.
function hilbert_matrix_reg(n)
H = [1.0/(i+j-1) for i=1:n, j=1:n] + 0.001 * I
return H
end
function eye(n)
Matrix{Float64}(I, n, n)
end
hilbert_matrix_reg(10)
err_cgs = Float64[]
err_mgs = Float64[]
p_end = 10
for p=1:p_end
H = hilbert_matrix_reg(2^p)
id = eye(2^p)
Qcgs = classical_gram_schmidt(H)
Qmgs = modified_gram_schmidt(H)
    push!(err_cgs, norm(id - Qcgs*Qcgs', Inf))
    push!(err_mgs, norm(id - Qmgs*Qmgs', Inf))
end
using Plots
mech = eps(Float64)  # machine epsilon, for reference in the plot
plist = [2^p for p=1:p_end]
plot(plist, err_cgs, label="CGS Error", lw=2, marker=:o, ylabel="Error", xlabel="n", yscale=:log10)
plot!(plist, err_mgs, label="MGS Error", lw=2, marker=:s, linestyle=:dash)
hline!([mech], label="eps(Float64)", linestyle=:dot, color=:gray)
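To inspect the same numbers without a plotting backend, one can also print them (a convenience snippet of ours):
# Print the recorded errors side by side.
for (n, ec, em) in zip(plist, err_cgs, err_mgs)
    println("n = $n:  CGS error = $ec,  MGS error = $em")
end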