A Course on Numerical Linear Algebra¶
Instructor: Deniz Bilman
Textbook: Numerical Linear Algebra, by Trefethen and Bau.
Lecture 2: Orthogonal Vectors and Matrices¶
Orthogonality is a key notion underlying many algorithms developed since the 1960s.
Adjoint¶
For a complex scalar $z$, its complex conjugate is denoted by $z^*$ (or sometimes by $\bar{z}$). For a real scalar $z$, we have $z=z^*$.
For a matrix $A$, we denote by $\bar{A}$ the entry-wise complex conjugate. The Hermitian conjugate or adjoint of a matrix $A\in\mathbb{C}^{m\times n}$ is denoted by $A^*$. $A^*$ is simply the composition of the transpose with entry-wise complex conjugation:
$$ A^* = (\bar{A})^{\top} = \overline{(A^{\top})}. $$
Therefore, for real matrices the adjoint is just the transpose.
For matrices with compatible dimensions
$$ (A B)^* = B^* A^*. $$
A = rand(2,3) + 1im*rand(2,3)
2×3 Matrix{ComplexF64}:
 0.272235+0.704558im  0.407994+0.164823im   0.586908+0.422778im
 0.975727+0.124496im  0.461125+0.0969386im  0.410563+0.871246im
# Get the Hermitian conjugate (adjoint) with '
A'
3×2 adjoint(::Matrix{ComplexF64}) with eltype ComplexF64:
 0.272235-0.704558im  0.975727-0.124496im
 0.407994-0.164823im  0.461125-0.0969386im
 0.586908-0.422778im  0.410563-0.871246im
# To get just the transpose (without complex conjugation), use transpose
transpose(A)
3×2 transpose(::Matrix{ComplexF64}) with eltype ComplexF64:
 0.272235+0.704558im  0.975727+0.124496im
 0.407994+0.164823im  0.461125+0.0969386im
 0.586908+0.422778im  0.410563+0.871246im
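As a quick numerical check of the product rule for adjoints above (a sketch; B is a hypothetical random matrix introduced just for this check):
# Verify (AB)* = B*A* numerically, using the matrix A from above
B = rand(3,2) + 1im*rand(3,2)
(A*B)' ≈ B'*A'   # true up to rounding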
If $A = A^*$, then the matrix $A$ is called a Hermitian matrix. By definition, a Hermitian matrix must be square.
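As a small illustration (a sketch; the names M and H are hypothetical, introduced just for this check), the Hermitian part of any square matrix is Hermitian:
using LinearAlgebra
# Symmetrize a random square matrix into its Hermitian part
M = rand(3,3) + 1im*rand(3,3)
H = (M + M')/2
ishermitian(H)   # true: H equals its own adjoint H'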
Inner Product¶
The inner product of two column vectors $x,y\in\mathbb{C}^m$ is $$ x^* y = \sum_{i=1}^m x_i^* y_i. $$ This operation is conjugate-linear in the first argument and linear in the second (sesquilinear): $$ \begin{aligned} \left(x_1+x_2\right)^* y & =x_1^* y+x_2^* y \\ x^*\left(y_1+y_2\right) & =x^* y_1+x^* y_2 \\ (\alpha x)^*(\beta y) & ={\alpha}^* \beta x^* y \end{aligned} $$
The availability of an inner product lets one compute the Euclidean length of a vector: $$ \|x\|_2 := \sqrt{x^* x} = \left(\sum_{i=1}^m x_i^* x_i\right)^{1/2} = \left(\sum_{i=1}^m |x_i|^2\right)^{1/2}. $$
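For instance (a sketch with a hypothetical vector x), the length computed from the inner product agrees with the built-in 2-norm:
using LinearAlgebra
x = [3.0 + 4.0im, 0.0, 1.0]   # |x_i|^2 entries: 25, 0, 1
sqrt(real(x'*x))              # Euclidean length from the inner product: √26
norm(x)                       # built-in 2-norm gives the same value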
The inner product also lets us define the (cosine of the) angle $\alpha$ between two vectors:
$$ \cos(\alpha) = \frac{x^* y}{\|x\|_2 \|y\|_2}. $$
using LinearAlgebra
u = [1.0-2im ; 4. ; 3im]
v = [2.0 ; 1.0+3im ; 2-1im]
u'*v|>display
dot(u,v)|>display
3.0 + 10.0im
3.0 + 10.0im
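For real vectors, the angle formula above recovers the familiar geometric angle; a minimal sketch with hypothetical vectors x and y:
# Angle between two real vectors
x = [1.0, 0.0, 1.0]
y = [1.0, 1.0, 0.0]
cosα = dot(x, y)/(norm(x)*norm(y))   # = 1/2
acos(cosα)                           # = π/3 ≈ 1.0472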
Orthogonal Vectors¶
A pair of vectors $x$ and $y$ are orthogonal if $x^* y=0$. If $x$ and $y$ are real, this means they lie at right angles to each other in $\mathbb{R}^m$.
Two sets of vectors $X$ and $Y$ are orthogonal (also stated " $X$ is orthogonal to $Y$ ") if every $x \in X$ is orthogonal to every $y \in Y$.
A set of nonzero vectors $S$ is orthogonal if its elements are pairwise orthogonal, i.e., if for $x, y \in S$:
$$ x \neq y \quad \implies \quad x^* y=0.$$
A set of vectors $S$ is orthonormal if it is orthogonal and, in addition, every $x \in S$ has $\|x\|_2=1$.
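A small concrete example (hypothetical vectors q1, q2, q3) of an orthonormal set in $\mathbb{C}^3$:
q1 = [1.0, 1.0im, 0.0]/sqrt(2)
q2 = [1.0, -1.0im, 0.0]/sqrt(2)
q3 = [0.0, 0.0, 1.0]
q1'*q2, q1'*q3, q2'*q3          # pairwise inner products all vanish
norm(q1), norm(q2), norm(q3)    # each vector has unit 2-norm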
Theorem¶
The vectors in an orthogonal set $S$ are linearly independent.
Proof¶
Done in class.
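For reference, a sketch of the standard argument: if $v_1,\dots,v_n \in S$ satisfy $\sum_{j=1}^n c_j v_j = 0$, take the inner product of both sides with any $v_k$; orthogonality kills every term except one, $$ 0 = v_k^* \sum_{j=1}^n c_j v_j = c_k\, v_k^* v_k = c_k \|v_k\|_2^2, $$ and since $v_k \neq 0$, each $c_k = 0$.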
Components of a Vector¶
Important idea: Inner products can be used to decompose arbitrary vectors into orthogonal components.
Consider an orthonormal set of $n$ vectors in $\mathbb{C}^m$ and place them as columns of a matrix $Q\in\mathbb{C}^{m\times n}$.
Let $u\in\mathbb{C}^m$ be an arbitrary vector (same dimension as the columns of $Q$). Note that $Q^* \in \mathbb{C}^{n\times m}$, so the product $Q^* u$ collects the inner product of each column of $Q$ with $u$:
$$ Q^* u = \begin{bmatrix} Q_{:,1}^* u \\ Q_{:,2}^* u \\ \vdots \\ Q_{:,n}^* u \end{bmatrix} =: c. $$
Use these inner products as a coefficient vector $c\in \mathbb{C}^n$ to compute a vector $v$ in the column space (range) of $Q$:
$$ v := Q c $$
Exercise: Show that $ r:= u - v $ is orthogonal to the columns of $Q$.
# Don't worry why we create the matrix Q in the following way for now. We will learn this later.
using LinearAlgebra
Q = qr(rand(5,3)).Q
Q|>display
# This is a memory-efficient representation that avoids explicitly storing Q as a dense matrix, particularly for large matrices.
Q = LinearAlgebra.QRCompactWYQ{Float64, Matrix{Float64}, Matrix{Float64}}([-1.5042237024117984 -1.5233319076509289 -1.0141581357481837; 0.38947696375229135 -0.42537298775290633 -0.26122184831609024; 0.5713993434650161 -0.16916686893302924 0.32382032757517315; 0.5170394749538952 -0.24292887973169003 0.8631803582871568; 0.36260581080287513 -0.3075144449099935 -0.33531887643998015], [1.0655287912013849 -0.10041481682582896 -1.057882763469496; 2.1290610246e-314 1.6917654107274496 0.5022745870539044; 2.129381772e-314 2.129722456e-314 1.0767049564752504])
# To display Q as a matrix, do:
Q = Matrix(Q)
5×3 Matrix{Float64}:
 -0.0655288  -0.314584   0.432053
 -0.414999   -0.814289  -0.0478091
 -0.608842    0.106438   0.206724
 -0.55092     0.248326  -0.653509
 -0.386367    0.406172   0.584154
println("Columns of Q form an orthonormal set:")
Q'*Q
Columns of Q form an orthonormal set:
3×3 Matrix{Float64}:
  1.0          -3.30715e-18  -3.98049e-18
 -3.30715e-18   1.0           5.44146e-17
 -3.98049e-18   5.44146e-17   1.0
u = rand(5,1);
c = Q'*u
v = Q*c
r = u-v
5×1 Matrix{Float64}:
  0.3679180077413129
 -0.09948492266489017
 -0.1571513261632806
  0.20307405090837333
  0.002535581538666294
Q'*r
3×1 Matrix{Float64}:
 -4.6365479381439955e-17
  1.7057111506049572e-16
  2.0089179091650116e-16
A useful way to see this procedure: We have decomposed the arbitrary vector $u$ into the sum of two orthogonal vectors $u = r + v$, where $v$ lies in $\operatorname{range}(Q)$ and $r$ lies in the orthogonal complement of $\operatorname{range}(Q)$.
v'*r
1×1 Matrix{Float64}:
 -9.655327741791721e-17
Unitary Matrices¶
A square matrix $Q \in \mathbb{C}^{m \times m}$ is unitary (in the real case, we also say orthogonal) if $Q^*=Q^{-1}$, i.e., if $Q^* Q=I$.
In terms of the columns of $Q$, we have that $Q_{:,j}^* Q_{:,k} = \delta_{jk}$, where $\delta_{jk}$ is the Kronecker delta.
Fact: The columns of a unitary matrix $Q \in \mathbb{C}^{m \times m}$ form an orthonormal basis of $\mathbb{C}^m$.
# We construct a 5x5 unitary random matrix again in the for-now mysterious way
Q = qr(rand(5,5)+1im*rand(5,5)).Q
Q = Matrix(Q)
5×5 Matrix{ComplexF64}:
 -0.413598-0.405449im   -0.0751956+0.129571im  …   0.109026-0.405464im
 -0.427271-0.0658705im   -0.287504+0.58702im       0.294001+0.304138im
 -0.281873-0.433991im     0.447914-0.31221im      -0.122117+0.398274im
 -0.223619-0.0343334im    0.479359-0.036455im      0.157314+0.278769im
 -0.375099-0.134032im   -0.0407248-0.139396im     -0.39323-0.462765im
abs.(inv(Q) - Q')
5×5 Matrix{Float64}:
 2.22045e-16  1.47523e-16  1.11022e-16  5.72196e-17  7.85046e-17
 4.2367e-16   4.57757e-16  5.55112e-17  5.72196e-17  1.24127e-16
 2.16778e-16  1.88758e-16  3.34221e-16  1.38778e-16  1.61841e-16
 1.11022e-16  3.388e-16    1.44389e-16  1.00074e-16  5.55112e-17
 3.97399e-16  2.98937e-16  3.33356e-16  1.14439e-16  2.98937e-16
Recall the decomposition above. We know $\operatorname{rank}(Q)=m$ since the columns are linearly independent (by the theorem above). So an arbitrary vector $u\in\mathbb{C}^m$ lies in the column space $\operatorname{range}(Q)=\mathbb{C}^m$. Therefore, in the decomposition $u=r+v$ above, we have that $v=u$ and $r=0$. Let's verify:
c = Q'*u
v = Q*c
r = u - v
5×1 Matrix{ComplexF64}:
 -2.220446049250313e-16 - 3.877973513332126e-17im
 1.1102230246251565e-16 + 4.147879758000911e-17im
 -5.551115123125783e-17 - 1.6207264158802483e-17im
                    0.0 + 3.8369045680730934e-17im
 -1.0408340855860843e-16 - 3.492531187816855e-17im
Multiplication by a unitary matrix preserves the geometry (angles) and also the length: since $Q^* Q = I$, we have $$ (Q x)^* (Q y) = x^* Q^* Q y = x^* y $$ for any $x,y\in\mathbb{C}^m$. Taking $y=x$ gives $$ \| Q x \|_2 = \|x\|_2. $$
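A quick numerical check of both identities, reusing the unitary Q constructed above (x and y are hypothetical random vectors):
x = rand(5) + 1im*rand(5)
y = rand(5) + 1im*rand(5)
(Q*x)'*(Q*y) ≈ x'*y   # inner products (hence angles) are preserved
norm(Q*x) ≈ norm(x)   # lengths are preserved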