Lecture Note 2: Solution of nonlinear equations
Xiaoqun Zhang
Shanghai Jiao Tong University
Last updated: September 21, 2012
1.1 Solution of nonlinear equations in one variable

• Chapter 2 in the textbook.
• Given a function f(x), if f(p) = 0, then we say that p is a "zero" or a "root" of f(x).
• Solve f(x) = 0 in [a, b].
  – f(x) = x^2 − 4x + 4 = 0: we can find the exact solution by hand.
  – f(x) = 3x − e^x = 0: we need an algorithm to approximate the solution. This motivates the bisection method.
• Intermediate Value Theorem (special form): If f(x) is continuous on [a, b] and f(a)f(b) < 0, then there exists a p ∈ [a, b] such that f(p) = 0.

  [Figure 1.1: Intermediate Value Theorem — the graph of y = f(x) on [a, b], with f(a) and f(b) of opposite signs and a root p between a and b.]

• Example of the bisection method: Find a zero of f(x) = 3x − e^x on the interval [1, 2].

  Step 1 f(1) = 3 − e ≈ 0.282 and f(2) = 6 − e^2 ≈ −1.389, so f(1)f(2) < 0 and there is a root in [1, 2].
  Step 2 Divide [1, 2] into two equal sections (by its midpoint): [1, 1.5] and [1.5, 2]. Where is the zero?
  Step 3 f(1.5) = 3 · 1.5 − e^{1.5} ≈ 0.018. Since f(1.5)f(2) < 0, the root is in [1.5, 2].
  Step 4 Divide [1.5, 2] into two equal sections (by its midpoint): [1.5, 1.75] and [1.75, 2]. f(1.75) ≈ −0.505 and f(1.5)f(1.75) < 0, so the root is in [1.5, 1.75].
  Step 5 Divide [1.5, 1.75] into two equal sections (by its midpoint): [1.5, 1.625] and [1.625, 1.75]. f(1.625) ≈ −0.203 and f(1.5)f(1.625) < 0, so the root is in [1.5, 1.625].
  Step 6 Take the midpoint of the interval as the guess of p:
         p = (1.5 + 1.625)/2 = 1.5625.
  The true solution is p ≈ 1.5121, so we already have an approximate root with 2 correct digits!
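The hand computations in Steps 1–5 can be checked quickly; the following is a minimal Python sketch (the function and interval are taken from the example above, the rest is illustrative).

```python
import math

def f(x):
    # f(x) = 3x - e^x from the example above
    return 3 * x - math.exp(x)

# Reproduce the sign checks from Steps 1-5 of the bisection example on [1, 2].
for x in (1.0, 2.0, 1.5, 1.75, 1.625):
    print(f"f({x}) = {f(x):+.3f}")

# Expected output (matching the values in the lecture):
# f(1.0) = +0.282
# f(2.0) = -1.389
# f(1.5) = +0.018
# f(1.75) = -0.505
# f(1.625) = -0.203
```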
The Algorithm

• Problem: find a number p ∈ [a, b] such that f(p) = 0.
• Assumptions.
  1. f(x) is continuous on [a, b].
  2. f(x) has opposite signs at a and b, i.e., f(a)f(b) < 0.
  Some remarks on the assumptions.
  – Here we use "<", not "≤": we exclude f(a)f(b) = 0 because in that case either a or b is already a root.
  – Note that these are exactly the assumptions of the Intermediate Value Theorem; recall that theorem.
• Algorithm (a Python sketch follows below).
  – Set [a_1, b_1] = [a, b] and p_1 = (a + b)/2.
  – Set n = 1.
  (⋆) If p_n is close enough to p, then we stop; otherwise, we cut [a_n, b_n] into two parts, [a_n, p_n] and [p_n, b_n].
      ∗ If f(a_n)f(p_n) < 0, then p is in [a_n, p_n]. Set [a_{n+1}, b_{n+1}] = [a_n, p_n].
      ∗ If f(a_n)f(p_n) > 0, then f(p_n)f(b_n) < 0, so p is in [p_n, b_n]. Set [a_{n+1}, b_{n+1}] = [p_n, b_n].
  – Set p_{n+1} = (a_{n+1} + b_{n+1})/2, increase n by 1, and go to step (⋆).
• The algorithm produces a sequence of intervals [a_n, b_n] satisfying
  1. p ∈ [a_n, b_n];
  2. the interval length satisfies (b_{n+1} − a_{n+1}) = (1/2)(b_n − a_n);
  3. p_n = (a_n + b_n)/2.
• One issue remains: how do we judge whether p_n is close enough to p? Stopping criteria!
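Here is a minimal Python sketch of the bisection algorithm above. The function name `bisection`, the tolerance `tol`, and the maximum-iteration safeguard `max_iter` are illustrative choices rather than part of the lecture's pseudocode; the stopping test used here (interval half-length below a tolerance) is only one of the possibilities discussed in the next section.

```python
import math

def bisection(f, a, b, tol=1e-8, max_iter=100):
    """Bisection method: assumes f is continuous on [a, b] and f(a)f(b) < 0."""
    if f(a) * f(b) >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for n in range(1, max_iter + 1):
        p = (a + b) / 2                  # p_n = (a_n + b_n)/2
        if (b - a) / 2 < tol or f(p) == 0:
            return p, n                  # p_n is close enough to p
        if f(a) * f(p) < 0:
            b = p                        # root lies in [a_n, p_n]
        else:
            a = p                        # root lies in [p_n, b_n]
    return (a + b) / 2, max_iter         # give up after max_iter iterations

# Example from the lecture: f(x) = 3x - e^x on [1, 2].
p, n = bisection(lambda x: 3 * x - math.exp(x), 1.0, 2.0, tol=1e-6)
print(p, n)   # roughly 1.512134..., reached after about 20 iterations
```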
Stopping criteria

(I) Select a maximum number of iterations N. We stop the iteration after N iterations.
(II) Select a small tolerance ε > 0. We stop the iteration whenever one of the following is met.
  1. The error is small enough:
     (a) absolute error, |p − p_n| < ε;
     (b) relative error, |p − p_n|/|p| < ε, or |p − p_n|/|p − p_1| < ε, etc.
     But we do not know p before we find it!
  2. The residual is small enough:
     (a) absolute, |f(p_n)| < ε;
     (b) relative, |f(p_n)|/|f(p_0)| < ε, etc.
     But a small residual |f(p_n)| does not imply that the error |p_n − p| is small.
     Example: f(x) = (x − 1)^{10}, p = 1, and p_i = 1 + 1/i. Then f(p_2) ≈ 9.8 × 10^{−4}, but |p_2 − p| = 0.5.
  3. The difference between successive iterates is small enough:
     (a) absolute, |p_n − p_{n−1}| < ε;
     (b) relative, |p_n − p_{n−1}|/|p_n| < ε, with p_n ≠ 0.
     But |p_n − p_{n−1}| → 0 does not imply that p_n converges. Counterexample: p_n = Σ_{i=1}^n (1/i); then |p_n − p_{n−1}| → 0 but p_n diverges. (Both of these pitfalls are illustrated numerically after this list.)
• Which one is best? It depends!
• For safety (to avoid infinite loops), we use: when (I) OR (II) is met, we stop. Of course, the algorithm may not have reached the desired accuracy when it stops because (I) is met.
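A small Python check of the two pitfalls above (a minimal sketch; the loop bound is an arbitrary illustrative choice):

```python
# Pitfall 1: a small residual does not imply a small error.
# f(x) = (x - 1)**10 with root p = 1 and p_2 = 1.5.
p2 = 1.5
print((p2 - 1) ** 10)   # about 9.77e-4, yet |p_2 - p| = 0.5

# Pitfall 2: |p_n - p_{n-1}| -> 0 does not imply convergence.
# Partial sums of the harmonic series: differences shrink, but the sums grow without bound.
s = 0.0
for i in range(1, 10**6 + 1):
    s += 1.0 / i
print(s)                # about 14.39 and still growing (the series diverges)
```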
Convergence and Error estimate

• Theorem 1 (Theorem 2.1 in the textbook). Suppose
  – f ∈ C[a, b];
  – f(a)f(b) < 0.
  Then the sequence p_n generated by the bisection method satisfies
      |p_n − p| ≤ (b − a)/2^n,
  where p is a zero of f in (a, b).

  Proof. The interval length is halved at every step:
      b_n − a_n = (1/2)(b_{n−1} − a_{n−1})
                = (1/2)·(1/2)(b_{n−2} − a_{n−2}) = (1/2^2)(b_{n−2} − a_{n−2})
                = ...
                = (1/2^{n−1})(b_1 − a_1) = (1/2^{n−1})(b − a).                  (1.1)
  Since p ∈ [a_n, b_n] and p_n = (a_n + b_n)/2, we have |p_n − p| ≤ (1/2)(b_n − a_n) = (b − a)/2^n.  □

• Corollary 1 (Corollary 1.2 in the textbook). lim_{n→+∞} p_n = p.

• For the bisection method we can therefore stop once the guaranteed bound on the absolute error satisfies (b − a)/2^n < ε, which ensures |p − p_n| < ε.

• Example: Determine the number of iterations necessary to solve 3x − e^x = 0 on [1, 2] with accuracy 10^{−3}. We have
      |p_n − p| ≤ (b − a)/2^n = 1/2^n,
  so it suffices to solve 1/2^n ≤ 10^{−3}. Taking logarithms (base 2) on both sides gives
      −n ≤ −3 log_2 10,   so   n ≥ 3 log_2 10 ≈ 9.97.
  We need at least 10 iterations!

Summary of the bisection method

• Pros
  1. Simple and easy to implement.
  2. One function evaluation per iteration.
  3. No knowledge of the derivative is needed.
  4. Convergence is guaranteed (only continuity and a sign change are needed).
• Cons
  1. We have to find an interval [a, b] such that f(a)f(b) < 0.
  2. The error bound depends on b − a and decreases slowly.
  3. It is hard to generalize to systems of nonlinear equations in several variables.

1.2 Fixed point iteration

1.2.1 Fixed point

• Definition: p is a fixed point of g if g(p) = p.
• Finding a root of f(x) is equivalent to finding a fixed point of a suitably chosen g (illustrated numerically below), e.g.,
      g(x) := x − f(x),   g(x) = x + 3f(x),   g(x) = x − h(x)f(x) with h(x) ≠ 0,   ...
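As a quick illustration of this equivalence, here is a minimal Python sketch using the running example f(x) = 3x − e^x and the particular choice g(x) = x − f(x); the numerical root value used below is only an approximation.

```python
import math

def f(x):
    return 3 * x - math.exp(x)          # the running example

def g(x):
    return x - f(x)                     # one possible fixed-point reformulation

p = 1.5121345                           # approximate root of f on [1, 2]
print(f(p))        # close to 0: p is (nearly) a root of f
print(g(p) - p)    # equals -f(p), so also close to 0: p is (nearly) a fixed point of g
```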