A Course on Numerical Linear Algebra
Instructor: Deniz Bilman
Textbook: Numerical Linear Algebra, by Trefethen and Bau.
Lecture 28: QR Algorithm without Shifts
The QR algorithm as a method to compute the eigenvalues of a square matrix dates back to the early 1960s (John Francis and Vera Kublanovskaya). In its simplest form, it goes like this:
Algorithm 28.1: "Pure" QR Algorithm:
- $A^{(0)}:=A$
- for $k=1,2,\ldots$
- $Q^{(k)} R^{(k)} = A^{(k-1)}$ (Obtain the QR factorization)
- $A^{(k)} = R^{(k)} Q^{(k)}$ (Recombine in reverse order)
A termination criterion can be based on a threshold for the Frobenius norm of the strictly lower-triangular part of $A^{(k)}$, which is the part that must vanish in the limit.
Note that this procedure preserves eigenvalues of $A$ because
$$ \begin{aligned} A^{(k-1)} = Q^{(k)} R^{(k)} &\iff R^{(k)} = (Q^{(k)})^* A^{(k-1)} \\ &\iff R^{(k)} Q^{(k)} = (Q^{(k)})^* A^{(k-1)} Q^{(k)} \iff A^{(k)} = (Q^{(k)})^* A^{(k-1)} Q^{(k)} \end{aligned} $$
which shows that $A^{(k)}=R^{(k)} Q^{(k)}$ is unitarily equivalent to $A^{(k-1)}$. Therefore, they have the same eigenvalues.
Under suitable assumptions, this simple algorithm converges to a Schur form of the matrix $A$: an upper-triangular matrix if $A$ is arbitrary, and a diagonal matrix if $A$ is Hermitian. In either case, the eigenvalues of $A$ end up on the diagonal of the limiting matrix.
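Here is a minimal Julia sketch of Algorithm 28.1; the function name `pure_qr`, the tolerance, and the iteration cap are our own illustrative choices, and the stopping rule is the Frobenius-norm criterion just mentioned.

using LinearAlgebra

# A minimal sketch of the "pure" QR algorithm (Algorithm 28.1).
# Iterates A ← R*Q and stops once the strictly lower-triangular
# part of the iterate is small in the Frobenius norm.
function pure_qr(A::AbstractMatrix; tol=1e-10, maxiter=10_000)
    Ak = Matrix(float.(A))        # work on a copy
    for k in 1:maxiter
        Q, R = qr(Ak)             # QR factorization of A^(k-1)
        Ak = R * Matrix(Q)        # recombine in reverse order
        if norm(tril(Ak, -1)) < tol   # termination criterion
            return Ak, k
        end
    end
    return Ak, maxiter
end

# Example: a symmetric matrix, so the iterates approach a diagonal matrix
B = [2.0 1.0 0.0; 1.0 3.0 1.0; 0.0 1.0 4.0]
Ak, iters = pure_qr(B)
sort(diag(Ak)) ≈ sort(eigvals(Symmetric(B)))   # diagonal ≈ eigenvalues

The animations below visualize exactly this iteration on larger random matrices.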
using Random, LinearAlgebra, Plots
# Set random seed for reproducibility
Random.seed!(1234)
# Generate a 300x300 random matrix
A = randn(300, 300)
# Number of iterations for the QR algorithm
num_iters = 100
# Create an animation object
anim = Animation()
for i in 1:num_iters
    Q, R = qr(A)    # QR factorization of the current iterate
    A = R * Q       # recombine in reverse order
    # Generate heatmap with fixed color limits
    heatmap(A, color=:acton, clim=(0, 1), title="Iteration $i",
            axis=true, colorbar=true, yflip=true)
    frame(anim)     # save the current frame
end
# Save as an animation
gif(anim, "qr_algorithm_convergence.gif", fps=10)
┌ Info: Saved animation to /Users/denizbilman/Library/CloudStorage/Dropbox/numerical-linear-algebra/notebooks/qr_algorithm_convergence.gif
└ @ Plots /Users/denizbilman/.julia/packages/Plots/dIjan/src/animation.jl:156
using Random, LinearAlgebra, Plots
# Set random seed for reproducibility
Random.seed!(1234)
# Generate a 300x300 random matrix
A = randn(300, 300)
# Hermitian (symmetric positive semidefinite) matrix now:
A = A' * A
# Number of iterations for the QR algorithm
num_iters = 100
# Create an animation object
anim = Animation()
for i in 1:num_iters
    Q, R = qr(A)    # QR factorization of the current iterate
    A = R * Q       # recombine in reverse order
    # Generate heatmap with fixed color limits
    heatmap(A, color=:acton, clim=(0, 1), title="Iteration $i",
            axis=true, colorbar=true, yflip=true)
    frame(anim)     # save the current frame
end
# Save as an animation
gif(anim, "qr_algorithm_convergence-hermitian.gif", fps=10)
┌ Info: Saved animation to /Users/denizbilman/Library/CloudStorage/Dropbox/numerical-linear-algebra/notebooks/qr_algorithm_convergence-hermitian.gif
└ @ Plots /Users/denizbilman/.julia/packages/Plots/dIjan/src/animation.jl:156
We saw in the last lecture that Householder triangularization cannot be adapted to compute the Schur factorization directly: driving a matrix all the way to triangular form by similarity transformations simply does not work, and we had to settle for Hessenberg form. That Hessenberg reduction, however, turns out to be a quite powerful transformation when employed as the initial step of an iteration.
Like the Rayleigh quotient iteration, the QR algorithm for real symmetric matrices converges cubically.
To achieve this performance, however, the algorithm as presented above must be modified.
Several modifications need to be made; the first is the introduction of "shifts".
Algorithm 28.2: "Practical" QR Algorithm
- $\left(Q^{(0)}\right)^\top A Q^{(0)}=A^{(0)} \qquad (A^{(0)}$ is a tridiagonalization of $A$).
- for $k=1,2,\ldots$
- Pick a shift $\mu^{(k)} $ (e.g., choose $\mu^{(k)}=A_{m m}^{(k-1)}$).
- $Q^{(k)} R^{(k)}=A^{(k-1)}-\mu^{(k)} I$ (QR factorization of $A^{(k-1)}-\mu^{(k)} I$).
- $A^{(k)}=R^{(k)} Q^{(k)}+\mu^{(k)} I $ (Recombine factors in reverse order).
- If any off-diagonal element $A_{j, j+1}^{(k)}$ is sufficiently close to zero,
- set $A_{j, j+1}=A_{j+1, j}=0$ to obtain
- $\left[\begin{array}{cc}A_1 & 0 \\ 0 & A_2\end{array}\right]=A^{(k)}$
- and now apply the QR algorithm to $A_1$ and $A_2$.
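To make the structure concrete, here is a minimal Julia sketch of this shifted iteration with deflation for a real symmetric matrix. The function name `shifted_qr_eigvals`, the tolerance, and the recursive handling of the blocks $A_1$ and $A_2$ are our own illustrative choices. Note also that the simple shift $\mu^{(k)}=A^{(k-1)}_{mm}$ used here can fail for certain symmetric matrices; practical codes use the Wilkinson shift instead (Lecture 29 of [Trefethen & Bau]).

using LinearAlgebra

# A sketch of Algorithm 28.2 for a real symmetric matrix.
function shifted_qr_eigvals(A::AbstractMatrix; tol=1e-12, maxiter=1000)
    m = size(A, 1)
    m == 1 && return [A[1, 1]]
    # Phase 1: reduce to tridiagonal form (this is A^(0)):
    T = Matrix(hessenberg(Symmetric(Matrix(float.(A)))).H)
    for k in 1:maxiter
        μ = T[m, m]                  # shift μ^(k) = A^(k-1)_mm
        Q, R = qr(T - μ * I)         # QR factorization of A^(k-1) - μI
        T = R * Matrix(Q) + μ * I    # recombine factors in reverse order
        # Deflation: look for a negligible off-diagonal entry
        for j in 1:m-1
            if abs(T[j+1, j]) < tol * (abs(T[j, j]) + abs(T[j+1, j+1]))
                T[j+1, j] = T[j, j+1] = 0
                # Recurse on the diagonal blocks A₁ and A₂:
                return vcat(shifted_qr_eigvals(T[1:j, 1:j]; tol=tol, maxiter=maxiter),
                            shifted_qr_eigvals(T[j+1:m, j+1:m]; tol=tol, maxiter=maxiter))
            end
        end
    end
    return diag(T)    # fallback if no deflation occurred
end

# Example usage on a random symmetric matrix:
A = randn(6, 6); A = A + A'
sort(shifted_qr_eigvals(A)) ≈ sort(eigvals(Symmetric(A)))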
This QR algorithm with well-chosen shifts has been the standard method for computing all the eigenvalues of a matrix since the early 1960s. Only in the 1990s did a competitor emerge, the divide-and-conquer algorithm described in Lecture 30 of [Trefethen & Bau].
Simultaneous Iteration
Here the idea is to perform power iteration simultaneously on a set of linearly independent vectors.
$A$ is $m\times m$.
Pick $n$ linearly independent vectors $v^{(0)}_1, v^{(0)}_2, \ldots, v^{(0)}_n$, $n<m$.
We expect that, as $k \rightarrow \infty$ (under suitable assumptions), $A^k v_1^{(0)}$ converges in direction to the eigenvector corresponding to the largest eigenvalue of $A$ in absolute value.
We also expect that the space $\left\langle A^k v_1^{(0)}, \ldots, A^k v_n^{(0)}\right\rangle$ should converge (again under suitable assumptions) to the space $\left\langle q_1, \ldots, q_n\right\rangle$ spanned by the eigenvectors $q_1, \ldots, q_n$ of $A$ corresponding to the $n$ largest eigenvalues in absolute value.
We assume that the first $n+1$ eigenvalues of $A$ are distinct in absolute value:
$$\left|\lambda_1\right|>\left|\lambda_2\right|>\cdots>\left|\lambda_n\right|>\left|\lambda_{n+1}\right| \geq\left|\lambda_{n+2}\right| \geq \cdots \geq\left|\lambda_m\right|.$$
We denote the iterates by
$$ v^{(k)}_\ell = A^k v^{(0)}_\ell,\qquad \ell = 1 , 2,\ldots,n, $$
and write
$$ V^{(k)}=A^k V^{(0)}=\left[\begin{array}{l|l|l} v_1^{(k)} & \cdots & v_n^{(k)} \end{array}\right] . $$
Since our interest is in the column space of $V^{(k)}$, let us extract a well-behaved basis for this space by computing a reduced QR factorization of $V^{(k)}$ :
$$ \hat{Q}^{(k)} \hat{R}^{(k)}=V^{(k)} . $$
Here $\hat{Q}^{(k)}$ and $\hat{R}^{(k)}$ have dimensions $m \times n$ and $n \times n$, respectively. It seems plausible that as $k \rightarrow \infty$, under suitable assumptions, the successive columns of $\hat{Q}^{(k)}$ should converge to the eigenvectors $\pm q_1, \pm q_2, \ldots, \pm q_n$.
This scheme does go through in principle, that is, in exact arithmetic. In practice, however, it is unstable: as $k \rightarrow \infty$, the vectors $v_1^{(k)}, \ldots, v_n^{(k)}$ all converge to multiples of the same dominant eigenvector $q_1$ of $A$. Thus, although the space they span, $\left\langle v_1^{(k)}, \ldots, v_n^{(k)}\right\rangle$, converges to something useful, these vectors constitute a highly ill-conditioned basis of that space.
If we actually carried out simultaneous iteration in floating point arithmetic as just described, the desired information would quickly be lost to rounding errors.
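A quick numerical illustration, using a contrived diagonal matrix with known, well-separated eigenvalues (the sizes and iteration counts are arbitrary choices): the condition number of the naive basis $A^k V^{(0)}$ grows geometrically with $k$, quickly approaching $1/\epsilon_{\text{machine}}$ in double precision.

using LinearAlgebra, Random

Random.seed!(1234)
# Contrived matrix with known, well-separated eigenvalues:
A = Diagonal([2.0, 1.0, 0.5, 0.25, 0.125])
V = randn(5, 3)                  # n = 3 starting vectors
for k in 4:4:24
    # Condition number of the naive basis A^k V^(0):
    println("k = $k:  cond(A^k V) = ", cond(A^k * V))
end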
How to get ourselves out of this trouble?
We orthonormalize at each step rather than once and for all. We do not construct $V^{(k)}$ as defined above, but a different sequence of matrices $Z^{(k)}$ with the same column spaces.
Algorithm 28.3: Simultaneous Iteration
- Pick $\hat{Q}^{(0)} \in \mathbb{R}^{m \times n}$ with orthonormal columns.
- for $k=1,2,\ldots$
- $Z=A \hat{Q}^{(k-1)}$
- $\hat{Q}^{(k)} \hat{R}^{(k)}=Z$ (reduced QR factorization of $Z$)
From the form of this algorithm, it is clear that the column spaces of $\hat{Q}^{(k)}$ and $Z^{(k)}$ are the same, both being equal to the column space of $A^k \hat{Q}^{(0)}$. Thus, mathematically speaking, this new formulation of simultaneous iteration converges under the same conditions as the old one.
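A direct Julia transcription of Algorithm 28.3 might look as follows; the function name `simultaneous_iteration`, the random starting matrix, and the fixed iteration count are our own choices.

using LinearAlgebra

# A sketch of Algorithm 28.3: re-orthonormalize at every step.
function simultaneous_iteration(A::AbstractMatrix, n::Integer; iters=500)
    m = size(A, 1)
    Qhat = Matrix(qr(randn(m, n)).Q)   # random m×n start with orthonormal columns
    for k in 1:iters
        Z = A * Qhat                   # apply A to the current basis
        Qhat = Matrix(qr(Z).Q)         # reduced QR factorization of Z
    end
    return Qhat
end

# Example: the columns of Qhat approximate (up to sign) the eigenvectors of a
# symmetric matrix associated with its n largest-magnitude eigenvalues.
A = randn(50, 50); A = A + A'
Qhat = simultaneous_iteration(A, 3)
diag(Qhat' * A * Qhat)   # Rayleigh quotients ≈ the three dominant eigenvalues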
The basic assumptions are:
$$ \left|\lambda_1\right|>\left|\lambda_2\right|>\cdots>\left|\lambda_n\right|>\left|\lambda_{n+1}\right| \geq\left|\lambda_{n+2}\right| \geq \cdots \geq\left|\lambda_m\right| . $$
and
$$ \text{All the leading principal submatrices of } \hat{Q}^\top V^{(0)} \text{ are nonsingular,} $$
where $\hat{Q}=\left[\begin{array}{l|l|l} q_1 & \cdots & q_n \end{array}\right]$ is the matrix of dominant eigenvectors and $V^{(0)}=\hat{Q}^{(0)}$ is the starting matrix.
When $n=m$ and $\hat{Q}^{(0)}=I$, simultaneous iteration turns out to be algebraically equivalent to the QR algorithm without shifts.
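This equivalence can be checked numerically. In the sketch below, each QR factorization is normalized so that $R$ has positive diagonal entries, which is the convention under which the two iterations produce literally the same matrices; the helper `posqr` and the iteration count are our own choices.

using LinearAlgebra

# QR factorization normalized so that diag(R) > 0 (the convention
# assumed for the exact algebraic equivalence).
function posqr(M)
    F = qr(M)
    Q, R = Matrix(F.Q), Matrix(F.R)
    S = Diagonal(sign.(diag(R)))    # flip signs; sign(0) does not occur generically
    return Q * S, S * R
end

A = randn(6, 6); A = A + A'          # symmetric test matrix
m = size(A, 1)
Ak = copy(A)                         # QR algorithm iterate A^(k)
Qhat = Matrix{Float64}(I, m, m)      # simultaneous iteration with n = m, Q̂^(0) = I
for k in 1:20
    Q, R = posqr(Ak)                 # one step of the unshifted QR algorithm
    Ak = R * Q
    Qhat, _ = posqr(A * Qhat)        # one step of simultaneous iteration
end
norm(Ak - Qhat' * A * Qhat)          # ≈ 0: the two iterations coincide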