A Course on Numerical Linear Algebra¶

Instructor: Deniz Bilman

Textbook: Numerical Linear Algebra, by Trefethen and Bau.

Lecture 26: Reduction to Hessenberg or Tridiagonal Form¶

In [37]:
using LinearAlgebra
A = rand(0:20, 6, 6)
6×6 Matrix{Int64}:
 14   2  18  16   8   0
  1  11   8   9  16   1
  4   0  13   8   0   4
 16  11   5  19  20  13
  5  16  19  14  18  17
 15  18  16   6  20  15
In [38]:
# This is what you'd get when the iterative algorithm 
# computing the Schur factorization converges
(T,Q) = schur(A);
T
6×6 Matrix{Float64}:
 65.9173   6.91374  12.9358    11.9099    -2.30041    16.867
  0.0     16.0483   -0.639618   4.87063    8.26544     0.519912
  0.0      0.0      -0.216094  12.2684     5.40949     0.550169
  0.0      0.0      -7.56057   -0.216094   7.5363     -0.393075
  0.0      0.0       0.0        0.0        0.174992  -13.2969
  0.0      0.0       0.0        0.0        0.0         8.29157

The Schur form of a matrix always exists and the eigenvalues sit at the diagonal of $T$ and stare at us.

In [39]:
Λ = eigvals(A)
6-element Vector{ComplexF64}:
 -0.216094247712258 - 9.630996572139273im
 -0.216094247712258 + 9.630996572139273im
 0.1749915319448782 + 0.0im
  8.291565276575604 + 0.0im
 16.048344415238855 + 0.0im
  65.91728727166523 + 0.0im

In case the eigenvalues are complex-valued, there is a way to handle the process with real arithmetic, and the resulting matrix $T$ becomes quasi-triangular, with complex eigenvalues being represented as the eigenvalues of $2\times 2$ blocks on its diagonal. This choice can be made for speed purposes.

In [40]:
T[3:4,3:4]|>display
a=(T[3:4,3:4])[1,1];
b=(T[3:4,3:4])[1,2];
c=(T[3:4,3:4])[2,1];
d=(T[3:4,3:4])[2,2];
2×2 Matrix{Float64}:
 -0.216094  12.2684
 -7.56057   -0.216094
In [41]:
b*c < 0.
true
In [43]:
a + 1im*sqrt(-b*c)
-0.216094247712258 + 9.630996572139274im
In [44]:
eigvals(T[3:4,3:4])
2-element Vector{ComplexF64}:
 -0.216094247712258 - 9.630996572139273im
 -0.216094247712258 + 9.630996572139273im

OK. But we first transform to an upper-Heisenberg matrix (all entries below the first subdiagonal are zero.). That is the first phase:

$$ A\mapsto H $$

In [45]:
F = hessenberg(A);

H = F.H
6×6 UpperHessenberg{Float64, Matrix{Float64}}:
  14.0     -16.179    -8.56849  -13.9849     9.78923  -4.62762
 -22.8692   42.3939   32.2153    -2.97531  -12.8806   -3.95445
    ⋅       23.3027   17.3303     2.51689  -11.8056   -0.682365
    ⋅         ⋅      -14.1685     4.40003    1.00031   0.82663
    ⋅         ⋅         ⋅        -6.97764    3.60084  -2.71927
    ⋅         ⋅         ⋅          ⋅        11.8614    8.27492
In [46]:
# The unitary transformation that gets us to H
Matrix(F.Q)
6×6 Matrix{Float64}:
 1.0   0.0         0.0         0.0        0.0          0.0
 0.0  -0.0437269  -0.449615    0.65065    0.508113    -0.338246
 0.0  -0.174908   -0.13215    -0.55967    0.756414     0.257975
 0.0  -0.699631    0.0906378  -0.245188  -0.103325    -0.656893
 0.0  -0.218635   -0.842596   -0.161143  -0.398657     0.239452
 0.0  -0.655904    0.249399    0.421117   0.00751394   0.574625

How does this work?

One may think of unitary triangularization via Householder reflectors. But that is not a good idea. See why for yourself: remember, we will produce $Q_1^* A Q_1$. Does that result in the desired sparsity?

What is a better idea?

Do not use a unitary matrix that acts on all the rows of $A$. Instead select a Householder reflector $Q_1^*$ that leaves the first row unchanged.

Then $Q_1^* A$ will form linear combinations of rows $2,\ldots,m$ and introduce zeros in the first column in rows $3,\ldots,m$.

Now take $Q_1$ and multiply $(Q_1^* A)$ on the right to obtain $Q_1^* A Q_1$. This will leave the first column of $Q_1^* A$ unchanged! So we will not ruin our work (on introducing zeros).

This idea is repeated to introduce zeros into subsequent columns. For example, the second Householder reflector, $Q_2$, leaves the first and second rows and columns unchanged.

After repeating this process $m-2$ times, we have a product in Hessenberg form as desided.

$$ \underbrace{Q_{m-2}^* \cdots Q_2^* Q_1^*}_{Q^*} A \underbrace{Q_1 Q_2 \cdots Q_{m-2}}_Q=H . $$

lec-26-Hessenberg.png

Algorithm 26.1 Householder Reduction to Hessenberg Form¶

  • for $k=1$ to $m-2$
    • $x=A_{k+1: m, k}$
    • $v_k=\operatorname{sign}\left(x_1\right)\|x\|_2 e_1+x$
    • $v_k=v_k /\left\|v_k\right\|_2$
    • $A_{k+1: m, k: m}=A_{k+1: m, k: m}-2 v_k\left(v_k^* A_{k+1: m, k: m}\right)$
    • $A_{1: m, k+1: m}=A_{1: m, k+1: m}-2\left(A_{1: m, k+1: m} v_k\right) v_k^*$

Operation Count¶

$$ \text { Work for Hessenberg reduction: } \quad \sim \frac{10}{3} m^3 \text { flops. } $$

Note that $A= Q H Q$ is almost a Schur factorization! One just needs to transform the subdiagonal to $0$. Recall the example output:

In [47]:
H
6×6 UpperHessenberg{Float64, Matrix{Float64}}:
  14.0     -16.179    -8.56849  -13.9849     9.78923  -4.62762
 -22.8692   42.3939   32.2153    -2.97531  -12.8806   -3.95445
    ⋅       23.3027   17.3303     2.51689  -11.8056   -0.682365
    ⋅         ⋅      -14.1685     4.40003    1.00031   0.82663
    ⋅         ⋅         ⋅        -6.97764    3.60084  -2.71927
    ⋅         ⋅         ⋅          ⋅        11.8614    8.27492

The Hermitian Case: Reduction to Tridiagonal From¶

If $A$ is hermitian, the algorithm just described will reduce $A$ to tridiagonal form (at least, in the absence of rounding errors). This is easy to see: since $A$ is hermitian, $Q^* A Q$ is also hermitian, and any hermitian Hessenberg matrix is tridiagonal.

$$ \text { Work for tridiagonal reduction: } \quad \sim \frac{4}{3} m^3 \text { flops. } $$

In [49]:
# Symmetrize A and obtain B
B = A+A'
6×6 Matrix{Int64}:
 28   3  22  32  13  15
  3  22   8  20  32  19
 22   8  26  13  19  20
 32  20  13  38  34  19
 13  32  19  34  36  37
 15  19  20  19  37  30
In [53]:
# H for B is also Hermitian, therefore tridiagonal (banded matrix).
hessenberg(B)
Hessenberg{Float64, UpperHessenberg{Float64, Matrix{Float64}}, Matrix{Float64}, Vector{Float64}, Bool}
Q factor: 6×6 LinearAlgebra.HessenbergQ{Float64, Matrix{Float64}, Vector{Float64}, false}
H factor:
6×6 UpperHessenberg{Float64, Matrix{Float64}}:
  28.0    -43.715     4.18911e-15  -1.92664e-15   5.40675e-16  -4.4296e-15
 -43.715   92.6007  -50.2489       -1.7034e-14   -6.1369e-15   -2.55385e-15
    ⋅     -50.2489   43.1229       -6.92658       5.22226e-16   3.66116e-15
    ⋅        ⋅       -6.92658      -5.80194       6.96338       4.35548e-15
    ⋅        ⋅         ⋅            6.96338      13.778         8.95241
    ⋅        ⋅         ⋅             ⋅            8.95241       8.30033

Stability¶

Theorem:¶

Let the Hessenberg reduction $A=Q H Q^*$ of a matrix $A \in$ $\mathbb{C}^{m \times m}$ be computed by Algorithm 26.1 on a computer satisfying the axioms A and B (from Lecture 13), and let $\tilde{Q}$ and $\tilde{H}$ the computed factors (again taking $\tilde{Q}$ to be exactly unitary - see Lecture 16). Then we have

$$ \tilde{Q} \tilde{H} \tilde{Q}^*=A+\delta A, \quad \frac{\|\delta A\|}{\|A\|}=O\left(\epsilon_{\text {M}}\right) $$

for some $\delta A \in \mathbb{C}^{m \times m}$.