The single central idea in inverse theory is the prescription

$$ \text{minimize:}\quad A + \lambda B \tag{18.4.12} $$

for various values of $0 < \lambda < \infty$ along the so-called trade-off curve (see Figure 18.4.1), and then to settle on a "best" value of λ by one or another criterion, ranging from fairly objective (e.g., making $\chi^2 = N$) to entirely subjective. Successful methods, several of which we will now describe, differ as to their choices of A and B, as to whether the prescription (18.4.12) yields linear or nonlinear equations, as to their recommended method for selecting a final λ, and as to their practicality for computer-intensive two-dimensional problems like image processing.

They also differ as to the philosophical baggage that they (or rather, their proponents) carry. We have thus far avoided the word "Bayesian." (Courts have consistently held that academic license does not extend to shouting "Bayesian" in a crowded lecture hall.) But it is hard, nor have we any wish, to disguise the fact that B has something to do with a priori expectation, or knowledge, of a solution, while A has something to do with a posteriori knowledge. The constant λ adjudicates a delicate compromise between the two. Some inverse methods have acquired a more Bayesian stamp than others, but we think that this is purely an accident of history. An outsider looking only at the equations that are actually solved, and not at the accompanying philosophical justifications, would have a difficult time separating the so-called Bayesian methods from the so-called empirical ones, we think.

The next three sections discuss three different approaches to the problem of inversion, which have had considerable success in different fields. All three fit within the general framework that we have outlined, but they are quite different in detail and in implementation.

CITED REFERENCES AND FURTHER READING:
Craig, I.J.D., and Brown, J.C. 1986, Inverse Problems in Astronomy (Bristol, U.K.: Adam Hilger).
Twomey, S. 1977, Introduction to the Mathematics of Inversion in Remote Sensing and Indirect Measurements (Amsterdam: Elsevier).
Tikhonov, A.N., and Arsenin, V.Y. 1977, Solutions of Ill-Posed Problems (New York: Wiley).
Tikhonov, A.N., and Goncharsky, A.V. (eds.) 1987, Ill-Posed Problems in the Natural Sciences (Moscow: MIR).
Parker, R.L. 1977, Annual Review of Earth and Planetary Science, vol. 5, pp. 35–64.
Frieden, B.R. 1975, in Picture Processing and Digital Filtering, T.S. Huang, ed. (New York: Springer-Verlag).
Tarantola, A. 1987, Inverse Problem Theory (Amsterdam: Elsevier).
Baumeister, J. 1987, Stable Solution of Inverse Problems (Braunschweig, Germany: Friedr. Vieweg & Sohn) [mathematically oriented].
Titterington, D.M. 1985, Astronomy and Astrophysics, vol. 144, pp. 381–387.
Jeffrey, W., and Rosner, R. 1986, Astrophysical Journal, vol. 310, pp. 463–472.
18.5 Linear Regularization Methods

What we will call linear regularization is also called the Phillips-Twomey method [1,2], the constrained linear inversion method [3], the method of regularization [4], and Tikhonov-Miller regularization [5–7]. (It probably has other names also, since it is so obviously a good idea.)
In its simplest form, the method is an immediate generalization of zeroth-order regularization (equation 18.4.11, above). As before, the functional A is taken to be the χ² deviation, equation (18.4.9), but the functional B is replaced by more sophisticated measures of smoothness that derive from first or higher derivatives.

For example, suppose that your a priori belief is that a credible u(x) is not too different from a constant. Then a reasonable functional to minimize is

$$ B \propto \int [u'(x)]^2\,dx \propto \sum_{\mu=1}^{M-1} [u_\mu - u_{\mu+1}]^2 \tag{18.5.1} $$

since it is nonnegative and equal to zero only when u(x) is constant. Here $u_\mu \equiv u(x_\mu)$, and the second equality (proportionality) assumes that the $x_\mu$'s are uniformly spaced. We can write the second form of B as

$$ B = |\mathbf{B} \cdot \mathbf{u}|^2 = \mathbf{u} \cdot (\mathbf{B}^T \cdot \mathbf{B}) \cdot \mathbf{u} \equiv \mathbf{u} \cdot \mathbf{H} \cdot \mathbf{u} \tag{18.5.2} $$

where u is the vector of components $u_\mu$, $\mu = 1,\ldots,M$, and B is the $(M-1) \times M$ first difference matrix

$$ \mathbf{B} = \begin{pmatrix}
-1 & 1 & 0 & 0 & \cdots & 0 & 0 & 0 \\
0 & -1 & 1 & 0 & \cdots & 0 & 0 & 0 \\
\vdots & & \ddots & \ddots & & & & \vdots \\
0 & 0 & 0 & 0 & \cdots & -1 & 1 & 0 \\
0 & 0 & 0 & 0 & \cdots & 0 & -1 & 1
\end{pmatrix} \tag{18.5.3} $$

and H is the $M \times M$ matrix

$$ \mathbf{H} = \mathbf{B}^T \cdot \mathbf{B} = \begin{pmatrix}
1 & -1 & 0 & 0 & \cdots & 0 & 0 \\
-1 & 2 & -1 & 0 & \cdots & 0 & 0 \\
0 & -1 & 2 & -1 & \cdots & 0 & 0 \\
\vdots & & \ddots & \ddots & \ddots & & \vdots \\
0 & 0 & \cdots & -1 & 2 & -1 & 0 \\
0 & 0 & \cdots & 0 & -1 & 2 & -1 \\
0 & 0 & \cdots & 0 & 0 & -1 & 1
\end{pmatrix} \tag{18.5.4} $$

Note that B has one fewer row than column. It follows that the symmetric H is degenerate; it has exactly one zero eigenvalue, corresponding to the value of a constant function, any one of which makes B exactly zero.

If, just as in §15.4, we write

$$ A_{i\mu} \equiv R_{i\mu}/\sigma_i \qquad b_i \equiv c_i/\sigma_i \tag{18.5.5} $$

then, using equation (18.4.9), the minimization principle (18.4.12) is

$$ \text{minimize:}\quad A + \lambda B = |\mathbf{A} \cdot \mathbf{u} - \mathbf{b}|^2 + \lambda\,\mathbf{u} \cdot \mathbf{H} \cdot \mathbf{u} \tag{18.5.6} $$

This can readily be reduced to a linear set of normal equations, just as in §15.4: The components $u_\mu$ of the solution satisfy the set of M equations in M unknowns,

$$ \sum_\rho \left[ \sum_i A_{i\mu} A_{i\rho} + \lambda H_{\mu\rho} \right] u_\rho = \sum_i A_{i\mu} b_i \qquad \mu = 1, 2, \ldots, M \tag{18.5.7} $$

or, in vector notation,

$$ (\mathbf{A}^T \cdot \mathbf{A} + \lambda \mathbf{H}) \cdot \mathbf{u} = \mathbf{A}^T \cdot \mathbf{b} \tag{18.5.8} $$

Equations (18.5.7) or (18.5.8) can be solved by the standard techniques of Chapter 2, e.g., LU decomposition. The usual warnings about normal equations being ill-conditioned do not apply, since the whole purpose of the λ term is to cure that same ill-conditioning. Note, however, that the λ term by itself is ill-conditioned, since it does not select a preferred constant value. You hope your data can at least do that!
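For readers who want to see equations (18.5.3)–(18.5.8) in executable form, here is a minimal sketch, in Python with NumPy rather than this book's C routines. The function name and the identity-matrix test problem are our own, and a dense solve is assumed to be affordable (small M):

```python
import numpy as np

def solve_linear_regularization(A, b, lam):
    """Solve (A^T.A + lam*H).u = A^T.b, eq. (18.5.8), with H = B^T.B
    built from the first-difference matrix B of eq. (18.5.3)."""
    M = A.shape[1]
    # (M-1) x M first difference matrix: each row of np.diff(I, axis=0)
    # is a row of the form [ ... -1 1 ... ], exactly eq. (18.5.3).
    B = np.diff(np.eye(M), axis=0)
    H = B.T @ B                       # eq. (18.5.4): tridiagonal 1, 2, ..., 2, 1
    lhs = A.T @ A + lam * H           # normal-equations matrix
    rhs = A.T @ b
    return np.linalg.solve(lhs, rhs)  # LU decomposition under the hood

# Tiny smoothing example: noisy direct measurements of a flat signal,
# so R is the identity and A = R/sigma, b = c/sigma per eq. (18.5.5).
rng = np.random.default_rng(0)
M, sigma = 50, 0.1
c = 1.0 + sigma * rng.standard_normal(M)
A = np.eye(M) / sigma
b = c / sigma
u = solve_linear_regularization(A, b, lam=10.0)
```

In this toy problem, λ → 0 reproduces the noisy data c exactly (A is proportional to the identity), while large λ pulls u toward the constant that the data favor.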
Although inversion of the matrix $(\mathbf{A}^T \cdot \mathbf{A} + \lambda \mathbf{H})$ is not generally the best way to solve for u, let us digress to write the solution to equation (18.5.8) schematically as

$$ \mathbf{u} = \left( \frac{1}{\mathbf{A}^T \cdot \mathbf{A} + \lambda \mathbf{H}} \cdot \mathbf{A}^T \cdot \mathbf{A} \right) \cdot \mathbf{A}^{-1} \cdot \mathbf{b} \qquad \text{(schematic only!)} \tag{18.5.9} $$

where the identity matrix in the form $\mathbf{A} \cdot \mathbf{A}^{-1}$ has been inserted. This is schematic not only because the matrix inverse is fancifully written as a denominator, but also because, in general, the inverse matrix $\mathbf{A}^{-1}$ does not exist. However, it is illuminating to compare equation (18.5.9) with equation (13.3.6) for optimal or Wiener filtering, or with equation (13.6.6) for general linear prediction. One sees that $\mathbf{A}^T \cdot \mathbf{A}$ plays the role of $S^2$, the signal power or autocorrelation, while $\lambda \mathbf{H}$ plays the role of $N^2$, the noise power or autocorrelation. The term in parentheses in equation (18.5.9) is something like an optimal filter, whose effect is to pass the ill-posed inverse $\mathbf{A}^{-1} \cdot \mathbf{b}$ through unmodified when $\mathbf{A}^T \cdot \mathbf{A}$ is sufficiently large, but to suppress it when $\mathbf{A}^T \cdot \mathbf{A}$ is small.

The above choices of B and H are only the simplest in an obvious sequence of derivatives. If your a priori belief is that a linear function is a good approximation to u(x), then minimize

$$ B \propto \int [u''(x)]^2\,dx \propto \sum_{\mu=1}^{M-2} [-u_\mu + 2u_{\mu+1} - u_{\mu+2}]^2 \tag{18.5.10} $$

implying

$$ \mathbf{B} = \begin{pmatrix}
-1 & 2 & -1 & 0 & 0 & \cdots & 0 & 0 \\
0 & -1 & 2 & -1 & 0 & \cdots & 0 & 0 \\
\vdots & & \ddots & \ddots & \ddots & & & \vdots \\
0 & 0 & \cdots & 0 & -1 & 2 & -1 & 0 \\
0 & 0 & \cdots & 0 & 0 & -1 & 2 & -1
\end{pmatrix} \tag{18.5.11} $$

and

$$ \mathbf{H} = \mathbf{B}^T \cdot \mathbf{B} = \begin{pmatrix}
1 & -2 & 1 & 0 & 0 & \cdots & 0 & 0 \\
-2 & 5 & -4 & 1 & 0 & \cdots & 0 & 0 \\
1 & -4 & 6 & -4 & 1 & \cdots & 0 & 0 \\
0 & 1 & -4 & 6 & -4 & \ddots & & \vdots \\
\vdots & & \ddots & \ddots & \ddots & \ddots & \ddots & \\
0 & \cdots & 0 & 1 & -4 & 6 & -4 & 1 \\
0 & \cdots & 0 & 0 & 1 & -4 & 5 & -2 \\
0 & \cdots & 0 & 0 & 0 & 1 & -2 & 1
\end{pmatrix} \tag{18.5.12} $$

This H has two zero eigenvalues, corresponding to the two undetermined parameters of a linear function.
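The pattern is easy to check numerically: the order-n difference operator annihilates all polynomials of degree less than n, so its H acquires n zero eigenvalues. A quick NumPy verification (our own zero tolerance; n = 3 is the third-difference operator discussed next):

```python
import numpy as np

# Difference operators of increasing order: n = 1 penalizes departure from
# a constant (18.5.3), n = 2 from a line (18.5.11), n = 3 from a quadratic.
# np.diff of the identity gives the (M-n) x M difference matrix directly.
M = 8
for n in (1, 2, 3):
    B = np.diff(np.eye(M), n=n, axis=0)   # rows hold the binomial stencil
    H = B.T @ B
    eigvals = np.linalg.eigvalsh(H)
    n_zero = np.sum(np.abs(eigvals) < 1e-10)
    print(n, n_zero)  # expect n zero eigenvalues: polynomials of degree < n
```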
If your a priori belief is that a quadratic function is preferable, then minimize

$$ B \propto \int [u'''(x)]^2\,dx \propto \sum_{\mu=1}^{M-3} [-u_\mu + 3u_{\mu+1} - 3u_{\mu+2} + u_{\mu+3}]^2 \tag{18.5.13} $$

with

$$ \mathbf{B} = \begin{pmatrix}
-1 & 3 & -3 & 1 & 0 & 0 & \cdots & 0 & 0 \\
0 & -1 & 3 & -3 & 1 & 0 & \cdots & 0 & 0 \\
\vdots & & \ddots & \ddots & \ddots & \ddots & & & \vdots \\
0 & 0 & \cdots & 0 & -1 & 3 & -3 & 1 & 0 \\
0 & 0 & \cdots & 0 & 0 & -1 & 3 & -3 & 1
\end{pmatrix} \tag{18.5.14} $$

and now

$$ \mathbf{H} = \begin{pmatrix}
1 & -3 & 3 & -1 & 0 & 0 & \cdots & & & 0 \\
-3 & 10 & -12 & 6 & -1 & 0 & \cdots & & & 0 \\
3 & -12 & 19 & -15 & 6 & -1 & \cdots & & & 0 \\
-1 & 6 & -15 & 20 & -15 & 6 & -1 & \cdots & & 0 \\
0 & -1 & 6 & -15 & 20 & -15 & 6 & -1 & \cdots & 0 \\
\vdots & & \ddots & \ddots & \ddots & \ddots & \ddots & \ddots & & \vdots \\
0 & \cdots & -1 & 6 & -15 & 20 & -15 & 6 & -1 & 0 \\
0 & \cdots & 0 & -1 & 6 & -15 & 20 & -15 & 6 & -1 \\
0 & \cdots & & 0 & -1 & 6 & -15 & 19 & -12 & 3 \\
0 & \cdots & & & 0 & -1 & 6 & -12 & 10 & -3 \\
0 & \cdots & & & & 0 & -1 & 3 & -3 & 1
\end{pmatrix} \tag{18.5.15} $$

(We'll leave the calculation of cubics and above to the compulsive reader.)

Notice that you can regularize with "closeness to a differential equation," if you want. Just pick B to be the appropriate sum of finite-difference operators (the coefficients can depend on x), and calculate $\mathbf{H} = \mathbf{B}^T \cdot \mathbf{B}$. You don't need to know the values of your boundary conditions, since B can have fewer rows than columns, as above; hopefully, your data will determine them. Of course, if you do know some boundary conditions, you can build these into B too.

With all the proportionality signs above, you may have lost track of what actual value of λ to try first. A simple trick for at least getting "on the map" is to first try

$$ \lambda = \mathrm{Tr}(\mathbf{A}^T \cdot \mathbf{A}) / \mathrm{Tr}(\mathbf{H}) \tag{18.5.16} $$

where Tr is the trace of the matrix (sum of diagonal components). This choice will tend to make the two parts of the minimization have comparable weights, and you can adjust from there.

As for what is the "correct" value of λ, an objective criterion, if you know your errors $\sigma_i$ with reasonable accuracy, is to make χ² (that is, $|\mathbf{A} \cdot \mathbf{u} - \mathbf{b}|^2$) equal to N, the number of measurements. We remarked above on the twin acceptable choices $N \pm (2N)^{1/2}$. A subjective criterion is to pick any value that you like in the range $0 < \lambda < \infty$, depending on your relative degree of belief in the a priori and a posteriori evidence. (Yes, people actually do that. Don't blame us.)
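In code, the trick (18.5.16) and the χ² = N criterion combine naturally. The bisection in log λ below is our own pedestrian search, not a prescription from this book; it assumes that χ² grows monotonically with λ (true for this quadratic functional) and that the data can be fit to χ² < N at small λ:

```python
import numpy as np

def chi_squared(A, b, H, lam):
    """chi^2 = |A.u - b|^2 at the regularized solution for this lambda."""
    u = np.linalg.solve(A.T @ A + lam * H, A.T @ b)
    r = A @ u - b
    return r @ r

def choose_lambda(A, b, H, tol=1e-3):
    """Find lambda with chi^2(lambda) ~= N, starting from eq. (18.5.16).
    Assumes chi^2 < N is attainable at small lambda; otherwise the
    bracketing loop below would not terminate."""
    N = len(b)
    lam = np.trace(A.T @ A) / np.trace(H)   # eq. (18.5.16): get "on the map"
    lo = hi = lam
    while chi_squared(A, b, H, lo) > N:     # shrink until chi^2 < N
        lo /= 10.0
    while chi_squared(A, b, H, hi) < N:     # grow until chi^2 > N
        hi *= 10.0
    while hi / lo > 1.0 + tol:              # bisect in log(lambda)
        mid = np.sqrt(lo * hi)
        if chi_squared(A, b, H, mid) < N:
            lo = mid
        else:
            hi = mid
    return np.sqrt(lo * hi)
```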
Two-Dimensional Problems and Iterative Methods

Up to now our notation has been indicative of a one-dimensional problem, finding u(x) or $u_\mu = u(x_\mu)$. However, all of the discussion easily generalizes to the problem of estimating a two-dimensional set of unknowns $u_{\mu\kappa}$, $\mu = 1,\ldots,M$, $\kappa = 1,\ldots,K$, corresponding, say, to the pixel intensities of a measured image. In this case, equation (18.5.8) is still the one we want to solve.

In image processing, it is usual to have the same number of input pixels in a measured "raw" or "dirty" image as desired "clean" pixels in the processed output image, so the matrices R and A (equation 18.5.5) are square and of size $MK \times MK$. A is typically much too large to represent as a full matrix, but often it is either (i) sparse, with coefficients blurring an underlying pixel (i, j) only into measurements $(i \pm \text{few},\, j \pm \text{few})$, or (ii) translationally invariant, so that $A_{(i,j)(\mu,\nu)} = A(i-\mu,\, j-\nu)$. Both of these situations lead to tractable problems.

In the case of translational invariance, fast Fourier transforms (FFTs) are the obvious method of choice. The general linear relation between underlying function and measured values (18.4.7) now becomes a discrete convolution like equation (13.1.1). If k denotes a two-dimensional wave-vector, then the two-dimensional FFT takes us back and forth between the transform pairs

$$ A(i-\mu,\, j-\nu) \Longleftrightarrow A(\mathbf{k}) \qquad b_{(i,j)} \Longleftrightarrow b(\mathbf{k}) \qquad u_{(i,j)} \Longleftrightarrow u(\mathbf{k}) \tag{18.5.17} $$

We also need a regularization or smoothing operator B and the derived $\mathbf{H} = \mathbf{B}^T \cdot \mathbf{B}$. One popular choice for B is the five-point finite-difference approximation of the Laplacian operator, that is, the difference between the value of each point and the average of its four Cartesian neighbors. In Fourier space, this choice implies

$$ \begin{aligned} B(\mathbf{k}) &\propto \sin^2(\pi k_1/M)\,\sin^2(\pi k_2/K) \\ H(\mathbf{k}) &\propto \sin^4(\pi k_1/M)\,\sin^4(\pi k_2/K) \end{aligned} \tag{18.5.18} $$

In Fourier space, equation (18.5.7) is merely algebraic, with solution

$$ u(\mathbf{k}) = \frac{A^*(\mathbf{k})\, b(\mathbf{k})}{|A(\mathbf{k})|^2 + \lambda H(\mathbf{k})} \tag{18.5.19} $$

where asterisk denotes complex conjugation. You can make use of the FFT routines for real data in §12.5.

Turn now to the case where A is not translationally invariant. Direct solution of (18.5.8) is now hopeless, since the matrix A is just too large. We need some kind of iterative scheme.

One way to proceed is to use the full machinery of the conjugate gradient method in §10.6 to find the minimum of A + λB, equation (18.5.6). Of the various methods in Chapter 10, conjugate gradient is the unique best choice because (i) it does not require storage of a Hessian matrix, which would be infeasible here,
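Returning to the translationally invariant case, equation (18.5.19) is only a few lines of code once the FFTs are in hand. Here is a sketch with NumPy's FFTs standing in for the real-data routines of §12.5; the H(k) is the choice of equation (18.5.18), and the kernel's transform A(k) is assumed nonzero wherever H(k) vanishes (in particular at k = 0), so no denominator is zero:

```python
import numpy as np

def deconvolve_regularized(dirty, kernel, lam):
    """Fourier-space solution of eq. (18.5.19):
    u(k) = A*(k) b(k) / (|A(k)|^2 + lam*H(k)).
    'kernel' is the point-spread function A(i-mu, j-nu) sampled on the
    same M x K grid as the dirty image."""
    M, K = dirty.shape
    A_k = np.fft.rfft2(kernel)
    b_k = np.fft.rfft2(dirty)
    # H(k) from eq. (18.5.18); rfft2 keeps only half of the second axis.
    k1 = np.arange(M)[:, None]
    k2 = np.arange(K // 2 + 1)[None, :]
    H_k = np.sin(np.pi * k1 / M) ** 4 * np.sin(np.pi * k2 / K) ** 4
    u_k = np.conj(A_k) * b_k / (np.abs(A_k) ** 2 + lam * H_k)
    return np.fft.irfft2(u_k, s=(M, K))
```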