IV.21 Numerical Analysis
Lloyd N. Trefethen

1 The Need for Numerical Computation

Everyone knows that when scientists and engineers need numerical answers to mathematical problems, they turn to computers. Nevertheless, there is a widespread misconception about this process.

The power of numbers has been extraordinary. It is often noted that the scientific revolution was set in motion when Galileo and others made it a principle that everything must be measured. Numerical measurements led to physical laws expressed mathematically, and, in the remarkable cycle whose fruits are all around us, finer measurements led to refined laws, which in turn led to better technology and still finer measurements. The day has long since passed when an advance in the physical sciences could be achieved, or a significant engineering product developed, without numerical mathematics.

Computers certainly play a part in this story, yet there is a misunderstanding about what their role is. Many people imagine that scientists and mathematicians generate formulas, and then, by inserting numbers into these formulas, computers grind out the necessary results. The reality is nothing like this. What really goes on is a far more interesting process of execution of algorithms. In most cases the job could not be done even in principle by formulas, for most mathematical problems cannot be solved by a finite sequence of elementary operations. What happens instead is that fast algorithms quickly converge to "approximate" answers that are accurate to three or ten digits of precision, or a hundred. For a scientific or engineering application, such an answer may be as good as exact.

We can illustrate the complexities of exact versus approximate solutions by an elementary example. Suppose we have one polynomial of degree 4,

    p(z) = c_0 + c_1 z + c_2 z^2 + c_3 z^3 + c_4 z^4,

and another of degree 5,

    q(z) = d_0 + d_1 z + d_2 z^2 + d_3 z^3 + d_4 z^4 + d_5 z^5.

It is well-known that there is an explicit formula that expresses the roots of p in terms of radicals (discovered by Ferrari around 1540), but no such formula for the roots of q (as shown by Ruffini and abel [VI.33] more than 250 years later; see the insolubility of the quintic [V.21] for more details). Thus, in a certain philosophical sense the root-finding problems for p and q are utterly different. Yet in practice they hardly differ at all.
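To make this last point concrete, here is a minimal sketch (not from the original article) assuming Python with NumPy and arbitrary illustrative coefficients; numpy.roots takes coefficients ordered from the highest power down to the constant term, and the same call handles the quartic and the quintic identically:

    import numpy as np

    # Hypothetical example coefficients; the article's c_i and d_i are left general.
    p_coeffs = [3.0, -1.0, 2.0, 5.0, -7.0]         # a degree-4 polynomial p
    q_coeffs = [1.0, 0.0, -6.0, 2.0, 8.0, -3.0]    # a degree-5 polynomial q

    print(np.roots(p_coeffs))   # roots of p to roughly sixteen digits
    print(np.roots(q_coeffs))   # roots of q, by exactly the same call

In fact numpy.roots uses no radicals for either polynomial: it computes the eigenvalues of a companion matrix, an iterative computation of the kind discussed later in this article.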
If a scientist or a mathematician wants to know the roots of one of these polynomials, he or she will turn to a computer and get an answer to sixteen digits of precision in less than a millisecond. Did the computer use an explicit formula? In the case of q, the answer is certainly no, but what about p? Maybe, maybe not. Most of the time, the user neither knows nor cares, and probably not one mathematician in a hundred could write down formulas for the roots of p from memory.

Here are three more examples of problems that can be solved in principle by a finite sequence of elementary operations, like finding the roots of p.

(i) Linear equations: solve a system of n linear equations in n unknowns.
(ii) Linear programming: minimize a linear function of n variables subject to m linear constraints.
(iii) Traveling salesman problem: find the shortest tour between n cities.

And here are five that, like finding the roots of q, cannot generally be solved in this manner.

(iv) Find an eigenvalue [I.3 §4.3] of an n × n matrix.
(v) Minimize a function of several variables.
(vi) Evaluate an integral.
(vii) Solve an ordinary differential equation (ODE).
(viii) Solve a partial differential equation (PDE).

Can we conclude that (i)–(iii) will be easier than (iv)–(viii) in practice? Absolutely not. Problem (iii) is usually very hard indeed if n is, say, in the hundreds or thousands. Problems (vi) and (vii) are usually rather easy, at least if the integral is in one dimension. Problems (i) and (iv) are of almost exactly the same difficulty: easy when n is small, like 100, and often very hard when n is large, like 1,000,000. In fact, in these matters philosophy is such a poor guide to practice that, for each of the three problems (i)–(iii), when n and m are large one often ignores the exact solution and uses approximate (but fast!) methods instead.

Numerical analysis is the study of algorithms for solving the problems of continuous mathematics, by which we mean problems involving real or complex variables. (This definition includes problems like linear programming and the traveling salesman problem posed over the real numbers, but not their discrete analogues.) In the remainder of this article we shall review some of its main branches, past accomplishments, and possible future trends.

2 A Brief History

Throughout history, leading mathematicians have been involved with scientific applications, and in many cases this has led to the discovery of numerical algorithms still in use today. gauss [VI.26], as usual, is an outstanding example. Among many other contributions, he made crucial advances in least-squares data fitting (1795), systems of linear equations (1809), and numerical quadrature (1814), as well as inventing the fast fourier transform [III.26] (1805), though the last did not become widely known until its rediscovery by Cooley and Tukey in 1965.

Around 1900, the numerical side of mathematics started to become less conspicuous in the activities of research mathematicians. This was a consequence of the growth of mathematics generally and of great advances in fields in which, for technical reasons, mathematical rigor had to be the heart of the matter. For example, many advances of the early twentieth century sprang from mathematicians' new ability to reason rigorously about infinity, a subject relatively far from numerical calculation.

A generation passed, and in the 1940s the computer was invented. From this moment numerical mathematics began to explode, but now mainly in the hands of specialists. New journals were founded such as Mathematics of Computation (1943) and Numerische Mathematik (1959). The revolution was sparked by hardware, but it included mathematical and algorithmic developments that had nothing to do with hardware. In the half-century from the 1950s, machines sped up by a factor of around 10^9, but so did the best algorithms known for some problems, generating a combined increase in speed of almost incomprehensible scale.

Half a century on, numerical analysis has grown into one of the largest branches of mathematics, the specialty of thousands of researchers who publish in dozens of mathematical journals as well as applications journals across the sciences and engineering. Thanks to the efforts of these people going back many decades, and thanks to ever more powerful computers, we have reached a point where most of the classical mathematical problems of the physical sciences can be solved numerically to high accuracy. Most of the algorithms that make this possible were invented since 1950.
Numerical analysis is built on a strong foundation: the mathematical subject of approximation theory. This field encompasses classical questions of interpolation, series expansions, and harmonic analysis [IV.11] associated with newton [VI.14], fourier [VI.25], Gauss, and others; semiclassical problems of polynomial and rational minimax approximation associated with names such as chebyshev [VI.45] and Bernstein; and major newer topics, including splines, radial basis functions, and wavelets [VII.3]. We shall not have space to address these subjects, but in almost every area of numerical analysis it is a fact that, sooner or later, the discussion comes down to approximation theory.

3 Machine Arithmetic and Rounding Errors

It is well-known that computers cannot represent real or complex numbers exactly. A quotient like 1/7 evaluated on a computer, for example, will normally yield an inexact result. (It would be different if we designed machines to work in base 7!) Computers approximate real numbers by a system of floating-point arithmetic, in which each number is represented in a digital equivalent of scientific notation, so that the scale does not matter unless the number is so huge or tiny as to cause overflow or underflow. Floating-point arithmetic was invented by Konrad Zuse in Berlin in the 1930s, and
by the end of the 1950s it was standard across the computer industry. Until the 1980s, different computers had widely different arithmetic properties. Then, in 1985, after years of discussion, the IEEE (Institute of Electrical and Electronics Engineers) standard for binary floating-point arithmetic was adopted, or IEEE arithmetic for short. This standard has subsequently become nearly universal on processors of many kinds. An IEEE (double precision) real number consists of a 64-bit word divided into 53 bits for a signed fraction in base 2 and 11 bits for a signed exponent. Since 2^-53 ≈ 1.1 × 10^-16, IEEE numbers represent the numbers of the real line to a relative accuracy of about 16 digits. Since 2^(±2^10) ≈ 10^(±308), this system works for numbers up to about 10^308 and down to about 10^-308.

Computers do not merely represent numbers, of course; they perform operations on them such as addition, subtraction, multiplication, and division, and more complicated results are obtained from sequences of these elementary operations. In floating-point arithmetic, the computed result of each elementary operation is almost exactly correct in the following sense: if "∗" is one of these four operations in its ideal form and "⊛" is the same operation as realized on the computer, then for any floating-point numbers x and y, assuming that there is no underflow or overflow,

    x ⊛ y = (x ∗ y)(1 + ε).

Here ε is a very small quantity, no greater in absolute value than a number known as machine epsilon, denoted by ε_mach, that measures the accuracy of the computer. In the IEEE system, ε_mach = 2^-53 ≈ 1.1 × 10^-16.

Thus, on a computer, the interval [1, 2], for example, is approximated by about 10^16 numbers. It is interesting to compare the fineness of this discretization with that of the discretizations of physics. In a handful of solid or liquid or a balloonful of gas, the number of atoms or molecules in a line from one point to another is on the order of 10^8 (the cube root of Avogadro's number). Such a system behaves enough like a continuum to justify our definitions of physical quantities such as density, pressure, stress, strain, and temperature. Computer arithmetic, however, is more than a million times finer than this. Another comparison with physics concerns the precision to which fundamental constants are known, such as (roughly) 4 digits for the gravitational constant G, 7 digits for Planck's constant h and the elementary charge e, and 12 digits for the ratio µ_e/µ_B of the magnetic moment of the electron to the Bohr magneton. At present, almost nothing in physics is known to more than 12 or 13 digits of accuracy. Thus IEEE numbers are orders of magnitude more precise than any number in science. (Of course, purely mathematical quantities like π are another matter.)

In two senses, then, floating-point arithmetic is far closer to its ideal than is physics. It is a curious phenomenon that, nevertheless, it is floating-point arithmetic rather than the laws of physics that is widely regarded as an ugly and dangerous compromise. Numerical analysts themselves are partly to blame for this perception. In the 1950s and 1960s, the founding fathers of the field discovered that inexact arithmetic can be a source of danger, causing errors in results that "ought" to be right. The source of such problems is numerical instability: that is, the amplification of rounding errors from microscopic to macroscopic scale by certain modes of computation.
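As a small illustration of these magnitudes (a sketch added here, not from the original article, assuming Python with NumPy), one can check the rounding model above for a single multiplication; Python's fractions module gives the exact product of two machine numbers, and note that NumPy reports the spacing 2^-52 of doubles near 1, which is twice the unit roundoff ε_mach = 2^-53 used in the text:

    import random
    from fractions import Fraction

    import numpy as np

    eps_mach = 2.0 ** -53                 # unit roundoff for IEEE double precision
    print(np.finfo(float).eps)            # 2.220446049250313e-16 = 2**-52, spacing of doubles near 1
    print(1.0 + 2.0 ** -53 == 1.0)        # True: a relative perturbation of eps_mach vanishes next to 1

    # Check x (computed*) y = (x * y)(1 + eps) with |eps| <= eps_mach for one random product.
    x, y = random.uniform(0.5, 2.0), random.uniform(0.5, 2.0)
    computed = x * y                      # the operation as realized on the computer
    exact = Fraction(x) * Fraction(y)     # exact rational product of the same two doubles
    rel_err = abs(Fraction(computed) - exact) / exact
    print(float(rel_err) <= eps_mach)     # True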
These men, including von neumann [VI.91], Wilkinson, Forsythe, and Henrici, took great pains to publicize the risks of careless reliance on machine arithmetic. These risks are very real, but the message was communicated all too successfully, leading to the current widespread impression that the main business of numerical analysis is coping with rounding errors. In fact, the main business of numerical analysis is designing algorithms that converge quickly; rounding-error analysis, while often a part of the discussion, is rarely the central issue. If rounding errors vanished, 90% of numerical analysis would remain.

4 Numerical Linear Algebra

Linear algebra became a standard topic in undergraduate mathematics curriculums in the 1950s and 1960s, and has remained there ever since. There are several reasons for this, but I think one is at the bottom of it: the importance of linear algebra has exploded since the arrival of computers.

The starting point of this subject is Gaussian elimination, a procedure that can solve n linear equations in n unknowns using on the order of n^3 arithmetic operations. Equivalently, it solves equations of the form Ax = b, where A is an n × n matrix and x and b are column vectors of size n. Gaussian elimination is invoked on computers around the world almost every time a system of linear equations is solved. Even if n is as large as 1000, the time required is well under a second on a typical 2008 desktop machine. The idea of
elimination was first discovered by Chinese scholars about 2000 years ago, and more recent contributors include lagrange [VI.22], Gauss, and jacobi [VI.35]. The modern way of describing such algorithms, however, was apparently introduced as late as the 1930s.

Suppose that, say, α times the first row of A is subtracted from the second row. This operation can be interpreted as the multiplication of A on the left by the lower-triangular matrix M_1 consisting of the identity with the additional nonzero entry m_21 = −α. Further analogous row operations correspond to further multiplications on the left by lower-triangular matrices M_j. If k steps convert A to an upper-triangular matrix U, then we have MA = U with M = M_k ··· M_2 M_1, or, upon setting L = M^-1,

    A = LU.

Here L is unit lower-triangular, that is, lower-triangular with all its diagonal entries equal to 1. Since U represents the target structure and L encodes the operations carried out to get there, we can say that Gaussian elimination is a process of lower-triangular upper-triangularization.

Many other algorithms of numerical linear algebra are also based on writing a matrix as a product of matrices that have special properties. To borrow a phrase from biology, we may say that this field has a central dogma:

    algorithms ←→ matrix factorizations.

In this framework we can quickly describe the next algorithm that needs to be considered. Not every matrix has an LU factorization; a 2 × 2 counterexample is the matrix

    A = ( 0  1 )
        ( 1  0 ).

Soon after computers came into use it was observed that even for matrices that do have LU factorizations, the pure form of Gaussian elimination is unstable, amplifying rounding errors by potentially large amounts. Stability can be achieved by interchanging rows during the elimination in order to bring maximal entries to the diagonal, a process known as pivoting. Since pivoting acts on rows, it again corresponds to a multiplication of A by other matrices on the left. The matrix factorization corresponding to Gaussian elimination with pivoting is

    PA = LU,

where U is upper-triangular, L is unit lower-triangular, and P is a permutation matrix, i.e., the identity matrix with permuted rows. If the permutations are chosen to bring the largest entry below the diagonal in column k to the (k, k) position before the kth elimination step, then L has the additional property |l_ij| ≤ 1 for all i and j.

The discovery of pivoting came quickly, but its theoretical analysis has proved astonishingly hard. In practice, pivoting makes Gaussian elimination almost perfectly stable, and it is routinely done by almost all computer programs that need to solve linear systems of equations. Yet it was realized in around 1960 by Wilkinson and others that for certain exceptional matrices, Gaussian elimination is still unstable, even with pivoting. The lack of an explanation of this discrepancy represents an embarrassing gap at the heart of numerical analysis. Experiments suggest that the fraction of matrices (for example, among random matrices with independent normally distributed entries) for which Gaussian elimination amplifies rounding errors by a factor greater than ρ n^(1/2) is in a certain sense exponentially small as a function of ρ as ρ → ∞, where n is the dimension, but a theorem to this effect has never been proved.
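A minimal sketch of this factorization (not from the original article, written in Python/NumPy rather than calling a library routine) performs Gaussian elimination with partial pivoting on a random matrix and checks both the identity PA = LU and the bound |l_ij| ≤ 1 on the multipliers:

    import numpy as np

    def lu_partial_pivoting(A):
        """Gaussian elimination with partial pivoting: returns P, L, U with P @ A = L @ U."""
        n = A.shape[0]
        U = A.astype(float)          # will be reduced to upper-triangular form
        L = np.eye(n)
        P = np.eye(n)
        for k in range(n - 1):
            # Pivot: swap row k with the row holding the largest |entry| in column k at or below the diagonal.
            i = k + int(np.argmax(np.abs(U[k:, k])))
            U[[k, i], k:] = U[[i, k], k:]
            L[[k, i], :k] = L[[i, k], :k]
            P[[k, i], :] = P[[i, k], :]
            # Eliminate below the diagonal; pivoting guarantees each multiplier satisfies |L[j, k]| <= 1.
            for j in range(k + 1, n):
                L[j, k] = U[j, k] / U[k, k]
                U[j, k:] = U[j, k:] - L[j, k] * U[k, k:]
        return P, L, U

    A = np.random.randn(5, 5)
    P, L, U = lu_partial_pivoting(A)
    print(np.allclose(P @ A, L @ U))   # True, up to rounding
    print(np.abs(L).max() <= 1.0)      # True: the multipliers are bounded by 1

In practice one would call an optimized library routine rather than these Python loops; the sketch is only meant to show the structure of the factorization.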
Meanwhile, beginning in the late 1950s, the field of numerical linear algebra expanded in another direction: the use of algorithms based on orthogonal [III.50 §3] or unitary [III.50 §3] matrices, that is, real matrices with Q^-1 = Q^T or complex ones with Q^-1 = Q^*, where Q^* denotes the conjugate transpose. The starting point of such developments is the idea of QR factorization. If A is an m × n matrix with m ≥ n, a QR factorization of A is a product

    A = QR,

where Q has orthonormal columns and R is upper-triangular. One can interpret this formula as a matrix expression of the familiar idea of Gram–Schmidt orthogonalization, in which the columns q_1, q_2, ... of Q are determined one after another. These column operations correspond to multiplication of A on the right by elementary upper-triangular matrices. One could say that the Gram–Schmidt algorithm aims for Q and gets R as a by-product, and is thus a process of triangular orthogonalization.

A big event was when Householder showed in 1958 that a dual strategy of orthogonal triangularization is more effective for many purposes. In this approach, by applying a succession of elementary matrix operations each of which reflects R^m across a hyperplane, one reduces A to upper-triangular form via orthogonal operations: one aims at R and gets Q
as a by-product. The Householder method turns out to be more stable numerically, because orthogonal operations preserve norms and thus do not amplify the rounding errors introduced at each step.

From the QR factorization sprang a rich collection of linear algebra algorithms in the 1960s. The QR factorization can be used by itself to solve least-squares problems and construct orthonormal bases. More remarkable is its use as a step in other algorithms. In particular, one of the central problems of numerical linear algebra is the determination of the eigenvalues and eigenvectors of a square matrix A. If A has a complete set of eigenvectors, then by forming a matrix X whose columns are these eigenvectors and a diagonal matrix D whose diagonal entries are the corresponding eigenvalues, we obtain AX = XD, and hence, since X is nonsingular,

    A = XDX^-1,

the eigenvalue decomposition. In the special case in which A is hermitian [III.50 §3], a complete set of orthonormal eigenvectors always exists, giving

    A = QDQ^*,

where Q is unitary. The standard algorithm for computing these factorizations was developed in the early 1960s by Francis, Kublanovskaya, and Wilkinson: the QR algorithm. Because polynomials of degree 5 or more cannot be solved by a formula, we know that eigenvalues cannot generally be computed in closed form. The QR algorithm is therefore necessarily an iterative one, involving a sequence of QR factorizations that is in principle infinite. Nevertheless, its convergence is extraordinarily rapid. In the symmetric case, for a typical matrix A, the QR algorithm converges cubically, in the sense that at each step the number of correct digits in one of the eigenvalue–eigenvector pairs approximately triples.

The QR algorithm is one of the great triumphs of numerical analysis, and its impact through widely used software products has been enormous. Algorithms and analysis based on it led in the 1960s to computer codes in Algol and Fortran and later to the software library EISPACK ("Eigensystem Package") and its descendant LAPACK. The same methods have also been incorporated in general-purpose numerical libraries such as the NAG, IMSL, and Numerical Recipes collections, and in problem-solving environments such as MATLAB, Maple, and Mathematica. These developments have been so successful that the computation of matrix eigenvalues long ago became a "black box" operation for virtually every scientist, with nobody but a few specialists knowing the details of how it is done. A curious related story is that EISPACK's relative LINPACK for solving linear systems of equations took on an unexpected function: it became the original basis for the benchmarks that all computer manufacturers run to test the speed of their computers. If a supercomputer is lucky enough to make the TOP500 list, updated twice a year since 1993, it is because of its prowess in solving certain matrix problems Ax = b of dimensions ranging from 100 into the millions.

The eigenvalue decomposition is familiar to all mathematicians, but the development of numerical linear algebra has also brought its younger cousin onto the scene: the singular value decomposition (SVD). The SVD was discovered by Beltrami, jordan [VI.52], and sylvester [VI.42] in the late nineteenth century, and made famous by Golub and other numerical analysts beginning in around 1965.
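Before the SVD itself is defined below, here is a toy sketch (not from the original article, in Python/NumPy) of the iteration at the core of the QR algorithm in its bare, unshifted form A ← RQ where A = QR; practical implementations first reduce A to Hessenberg or tridiagonal form and use shifts to accelerate convergence, but the plain loop already shows how repeated QR factorizations reveal eigenvalues:

    import numpy as np

    rng = np.random.default_rng(0)        # fixed seed, arbitrary example matrix
    B = rng.standard_normal((6, 6))
    A0 = B + B.T                          # a random symmetric test matrix
    A = A0.copy()

    for _ in range(300):
        Q, R = np.linalg.qr(A)            # one QR factorization per step
        A = R @ Q                         # R Q = Q^T A Q, an orthogonal similarity, so eigenvalues are preserved

    print(np.sort(np.diag(A)))            # diagonal entries approach the eigenvalues
    print(np.linalg.eigvalsh(A0))         # reference eigenvalues (ascending) for comparison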
If A is an m × n matrix with m ≥ n, an SVD of A is a factorization

    A = UΣV^*,

where U is m × n with orthonormal columns, V is n × n and unitary, and Σ is diagonal with diagonal entries σ_1 ≥ σ_2 ≥ ··· ≥ σ_n ≥ 0. One could compute the SVD by relating it to the eigenvalue problems for AA^* and A^*A, but this proves numerically unstable; a better approach is to use a variant of the QR algorithm that does not square A. Computing the SVD is the standard route to determining the norm [III.62] ‖A‖ = σ_1 (here ‖·‖ is the hilbert space [III.37] or "2" norm), the norm of the inverse ‖A^-1‖ = 1/σ_n in the case where A is square and nonsingular, or their product, known as the condition number,

    κ(A) = ‖A‖ ‖A^-1‖ = σ_1/σ_n.

It is also a step in an extraordinary variety of further computational problems including rank-deficient least-squares, computation of ranges and nullspaces, determination of ranks, "total least-squares," low-rank approximation, and determination of angles between subspaces.

All the discussion above concerns "classical" numerical linear algebra, born in the period 1950–75. The ensuing quarter-century brought in a whole new set of tools: methods for large-scale problems based on Krylov subspace iterations. The idea of these iterations is as follows. Suppose a linear algebra problem is given