problems into still smaller problems and eventually get to triangular form by a finite algorithm. This would imply the existence of a formula for the zeros of a general nth degree polynomial, in violation of Galois theory.

The ability to reduce a matrix to upper Hessenberg form is quite useful. If the QR algorithm (2) is initiated with a matrix A0 that is upper Hessenberg, then (except in some special cases that can be avoided) all iterates Aj are upper Hessenberg. This results in much more economical iterations, as both the QR decomposition and the subsequent RQ multiplication can be done in many fewer operations in the Hessenberg case. Thus an initial reduction to upper Hessenberg form is well worth the extra expense. In fact, with Francis's algorithm, the Hessenberg form is absolutely essential.

4. FRANCIS'S ALGORITHM. Suppose we want to find the eigenvalues of some matrix A ∈ R^{n×n}. We continue to focus on the real case; the extension to complex matrices is straightforward. We know from Theorem 2 that we can reduce A to upper Hessenberg form, so let us assume from the outset that A is upper Hessenberg. We can assume, moreover, that A is properly upper Hessenberg. This means that all of the subdiagonal entries of A are nonzero: a_{j+1,j} ≠ 0 for j = 1, 2, . . . , n − 1. Indeed, if A is not properly upper Hessenberg, say a_{k+1,k} = 0, then A has the block-triangular form

\[
A = \begin{bmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{bmatrix},
\]

where A11 is k × k. Thus the eigenvalue problem for A reduces to eigenvalue problems for the smaller upper Hessenberg matrices A11 and A22. If either of these is not properly upper Hessenberg, we can break it apart further until we are finally left with proper upper Hessenberg matrices. We will assume, therefore, that A is properly upper Hessenberg from the outset.

An iteration of Francis's algorithm of degree m begins by picking m shifts ρ1, . . . , ρm. In principle m can be any positive integer, but in practice it should be fairly small. Francis took m = 2. The rationale for shift selection will be explained later. There are many reasonable ways to choose shifts, but the simplest is to take ρ1, . . . , ρm to be the eigenvalues of the m × m submatrix in the lower right-hand corner of A. This is an easy computation if m = 2. If m > 1, this shifting strategy can produce complex shifts, but they always occur in conjugate pairs. Whether we use this or some other strategy, we will always insist that it have this property: when the matrix is real, complex shifts must be produced in conjugate pairs. Now let

\[
p(A) = (A - \rho_m I) \cdots (A - \rho_1 I).
\]

We do not actually compute p(A), as this would be too expensive. Francis's algorithm just needs the first column

\[
x = p(A)e_1,
\]

which is easily computed by m successive matrix-vector multiplications. The computation is especially cheap because A is upper Hessenberg. It is easy to check that (A − ρ1 I)e1 has nonzero entries in only its first two positions, (A − ρ2 I)(A − ρ1 I)e1 has nonzero entries in only its first three positions, and so on. As a consequence, x has nonzero entries in only its first m + 1 positions.
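To make this step concrete, here is a small Python/NumPy sketch of the computation just described. It is only an illustration: the function name first_column_of_p and the default m = 2 are my own choices, not from the text. The shifts are taken to be the eigenvalues of the trailing m × m submatrix of A, and x = p(A)e1 is formed by m successive matrix-vector multiplications.

```python
import numpy as np

def first_column_of_p(A, m=2):
    """Sketch: x = p(A) e1 = (A - rho_m I) ... (A - rho_1 I) e1 for an
    upper Hessenberg matrix A, with the shifts rho_1, ..., rho_m taken
    to be the eigenvalues of the trailing m x m submatrix of A."""
    n = A.shape[0]
    # The shifts may form complex conjugate pairs (e.g., when m = 2),
    # so carry out the intermediate arithmetic in complex numbers.
    shifts = np.linalg.eigvals(A[n - m:, n - m:])
    x = np.zeros(n, dtype=complex)
    x[0] = 1.0                      # x starts out as e1
    for rho in shifts:
        x = A @ x - rho * x         # x <- (A - rho I) x
    # Because the shifts occur in conjugate pairs, p(A) is real, so the
    # imaginary parts cancel (up to roundoff).
    return x.real
```

Since A is upper Hessenberg, each multiplication extends the nonzero part of the vector by one entry, so the returned x is nonzero only in its first m + 1 positions, in agreement with the count above.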
The entire computation of x depends on only the first m columns of A (and the shifts) and requires negligible computational effort if m is small.

Since the complex shifts occur in conjugate pairs, p(A) is real. Therefore x is real. Let x̃ ∈ R^{m+1} denote the vector consisting of the first m + 1 entries of x (the nonzero part). Theorem 1 guarantees that there is an (m + 1) × (m + 1) reflector Q̃0 such that Q̃0x̃ = αe1, where α = ±‖x̃‖2. Let Q0 be an n × n reflector given by

\[
Q_0 = \begin{bmatrix} \widetilde{Q}_0 & 0 \\ 0 & I \end{bmatrix}. \tag{4}
\]

Clearly Q0x = αe1 and, since Q0 is an involution, Q0e1 = α⁻¹x. Thus the first column of Q0 is proportional to x.

Now use Q0 to perform a similarity transformation: A ↦ Q0⁻¹AQ0 = Q0AQ0. This disturbs the Hessenberg form but only slightly. Because Q0 has the form (4), the transformation A ↦ Q0A affects only the first m + 1 rows. Typically the first m + 1 rows are completely filled in. The transformation Q0A ↦ Q0AQ0 affects only the first m + 1 columns. Since a_{m+2,m+1} ≠ 0, row m + 2 gets filled in by this transformation. Below row m + 2, we have a big block of zeros, and these remain zero. The total effect is that there is a bulge in the Hessenberg form. In the case m = 3 and n = 7, the matrix Q0AQ0 looks like

\[
\begin{bmatrix}
\ast & \ast & \ast & \ast & \ast & \ast & \ast \\
\ast & \ast & \ast & \ast & \ast & \ast & \ast \\
\ast & \ast & \ast & \ast & \ast & \ast & \ast \\
\ast & \ast & \ast & \ast & \ast & \ast & \ast \\
\ast & \ast & \ast & \ast & \ast & \ast & \ast \\
     &      &      &      & \ast & \ast & \ast \\
     &      &      &      &      & \ast & \ast
\end{bmatrix}.
\]

In this 7 × 7 matrix the bulge looks huge. If you envision, say, the 100 × 100 case (keeping m = 3), you will see that the matrix is almost upper Hessenberg with a tiny bulge at the top.

The rest of the Francis iteration consists of returning this matrix to upper Hessenberg form by the algorithm sketched in the proof of Theorem 2. This begins with a transformation Q1 that acts only on rows 2 through n and creates the desired zeros in the first column. Since the first column already consists of zeros after row m + 2, the scope of Q1 can be restricted a lot further in this case. It needs to act on rows 2 through m + 2 and create zeros in positions (3, 1), . . . , (m + 2, 1). Applying Q1 on the left, we get a matrix Q1Q0AQ0 that has the Hessenberg form restored in the first column. Completing the similarity transformation, we multiply by Q1 on the right. This recombines columns 2 through m + 2. Since a_{m+3,m+2} ≠ 0, additional nonzero entries are created in positions (m + 3, 2), . . . , (m + 3, m + 1). In the case m = 3 and n = 7 the matrix Q1Q0AQ0Q1 looks like

\[
\begin{bmatrix}
\ast & \ast & \ast & \ast & \ast & \ast & \ast \\
\ast & \ast & \ast & \ast & \ast & \ast & \ast \\
     & \ast & \ast & \ast & \ast & \ast & \ast \\
     & \ast & \ast & \ast & \ast & \ast & \ast \\
     & \ast & \ast & \ast & \ast & \ast & \ast \\
     & \ast & \ast & \ast & \ast & \ast & \ast \\
     &      &      &      &      & \ast & \ast
\end{bmatrix}.
\]
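As a concrete, deliberately unoptimized illustration of the steps described in this section, here is a hedged Python/NumPy sketch of one complete Francis iteration of degree m; the helper names reflector and francis_iteration are my own, not from the text. It forms x as above, builds the reflector Q0 from the leading m + 1 entries of x, performs the bulge-creating similarity transformation, and then returns the matrix to upper Hessenberg form with reflectors of small scope; the first pass of the chasing loop is the transformation Q1 just described.

```python
import numpy as np

def reflector(v):
    """Householder reflector Q (symmetric, orthogonal, an involution)
    with Q v = alpha * e1, alpha = +/-||v||_2 (sign chosen to avoid
    cancellation)."""
    v = np.asarray(v, dtype=float)
    alpha = -np.linalg.norm(v) if v[0] >= 0 else np.linalg.norm(v)
    u = v.copy()
    u[0] -= alpha
    norm_u = np.linalg.norm(u)
    if norm_u == 0.0:                   # v is already a multiple of e1
        return np.eye(len(v))
    u /= norm_u
    return np.eye(len(v)) - 2.0 * np.outer(u, u)

def francis_iteration(A, m=2):
    """Sketch of one Francis iteration of degree m on a properly upper
    Hessenberg matrix A (dense arithmetic, for illustration only)."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    # x = p(A) e1, with shifts from the trailing m x m block (see the
    # earlier sketch); only its first m + 1 entries are nonzero.
    shifts = np.linalg.eigvals(A[n - m:, n - m:])
    x = np.zeros(n, dtype=complex)
    x[0] = 1.0
    for rho in shifts:
        x = A @ x - rho * x
    # Q0 acts on rows/columns 1 through m + 1 and maps x to alpha * e1.
    Q0 = reflector(x[:m + 1].real)
    A[:m + 1, :] = Q0 @ A[:m + 1, :]    # rows 1..m+1 fill in
    A[:, :m + 1] = A[:, :m + 1] @ Q0    # columns 1..m+1; row m+2 fills in
    # Chase the bulge back to Hessenberg form.  While processing column
    # j (1-based), the reflector acts on rows j+1 through min(j+m+1, n)
    # and zeros the entries of column j below the subdiagonal; the j = 1
    # step is the Q1 of the text.
    for j in range(1, n - 1):
        lo, hi = j, min(j + m + 1, n)   # 0-based row range [lo, hi)
        Qj = reflector(A[lo:hi, j - 1])
        A[lo:hi, :] = Qj @ A[lo:hi, :]
        A[:, lo:hi] = A[:, lo:hi] @ Qj
        A[lo + 1:hi, j - 1] = 0.0       # clean out roundoff
    return A
```

Each reflector here has order at most m + 1 and mixes only a few adjacent rows and columns, so for small m one full Francis iteration on a Hessenberg matrix costs O(n²) arithmetic operations; this is one reason the initial reduction to Hessenberg form is so valuable.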