1.2 Round-off Errors and Computer Arithmetic 19

The exponential part of the number is, therefore, 2^(1027−1023) = 2^4. The final 52 bits specify that the mantissa is

    f = 1·(1/2)^1 + 1·(1/2)^3 + 1·(1/2)^4 + 1·(1/2)^5 + 1·(1/2)^8 + 1·(1/2)^12.

As a consequence, this machine number precisely represents the decimal number

    (−1)^s 2^(c−1023)(1 + f) = (−1)^0 · 2^(1027−1023) (1 + 1/2 + 1/8 + 1/16 + 1/32 + 1/256 + 1/4096) = 27.56640625.

However, the next smallest machine number is

    0 10000000011 1011100100001111111111111111111111111111111111111111,

and the next largest machine number is

    0 10000000011 1011100100010000000000000000000000000000000000000001.

This means that our original machine number represents not only 27.56640625, but also half of the real numbers that are between 27.56640625 and the next smallest machine number, as well as half the numbers between 27.56640625 and the next largest machine number. To be precise, it represents any real number in the interval

    [27.5664062499999982236431605997495353221893310546875,
     27.5664062500000017763568394002504646778106689453125).

The smallest normalized positive number that can be represented has s = 0, c = 1, and f = 0 and is equivalent to

    2^−1022 · (1 + 0) ≈ 0.22251 × 10^−307,

and the largest has s = 0, c = 2046, and f = 1 − 2^−52 and is equivalent to

    2^1023 · (2 − 2^−52) ≈ 0.17977 × 10^309.

Numbers occurring in calculations that have a magnitude less than 2^−1022 · (1 + 0) result in underflow and are generally set to zero. Numbers greater than 2^1023 · (2 − 2^−52) result in overflow and typically cause the computations to stop (unless the program has been designed to detect this occurrence). Note that there are two representations for the number zero: a positive 0 when s = 0, c = 0, and f = 0, and a negative 0 when s = 1, c = 0, and f = 0.

Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part.
Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
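The 64-bit pattern worked through above can be checked directly in Python (our language choice; the text itself uses Maple later in this section). This sketch decodes the bit string with the standard struct module and compares the normalized extremes against sys.float_info:

```python
import struct
import sys

# The machine number from the text: sign bit 0, characteristic 10000000011,
# and mantissa 101110010001 followed by 40 zeros (52 bits in all).
bits = "0" + "10000000011" + "101110010001" + "0" * 40
value = struct.unpack(">d", int(bits, 2).to_bytes(8, "big"))[0]
print(value)  # 27.56640625

# The normalized extremes quoted in the text.
print(sys.float_info.min)  # 2^-1022              ≈ 0.22251 × 10^-307
print(sys.float_info.max)  # 2^1023 · (2 - 2^-52) ≈ 0.17977 × 10^309
```

Since 27.56640625 is itself a machine number, the decoded value is exact, not an approximation.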
20 CHAPTER 1 Mathematical Preliminaries and Error Analysis

Decimal Machine Numbers

The use of binary digits tends to conceal the computational difficulties that occur when a finite collection of machine numbers is used to represent all the real numbers. To examine these problems, we will use more familiar decimal numbers instead of binary representation. Specifically, we assume that machine numbers are represented in the normalized decimal floating-point form

    ±0.d1d2…dk × 10^n,  1 ≤ d1 ≤ 9, and 0 ≤ di ≤ 9, for each i = 2, …, k.

Numbers of this form are called k-digit decimal machine numbers.

Any positive real number within the numerical range of the machine can be normalized to the form

    y = 0.d1d2…dk dk+1 dk+2 … × 10^n.

The floating-point form of y, denoted fl(y), is obtained by terminating the mantissa of y at k decimal digits. (The error that results from replacing a number with its floating-point form is called round-off error, regardless of whether the rounding or chopping method is used.) There are two common ways of performing this termination. One method, called chopping, is to simply chop off the digits dk+1 dk+2 …. This produces the floating-point form

    fl(y) = 0.d1d2…dk × 10^n.

The other method, called rounding, adds 5 × 10^(n−(k+1)) to y and then chops the result to obtain a number of the form

    fl(y) = 0.δ1δ2…δk × 10^n.

For rounding, when dk+1 ≥ 5, we add 1 to dk to obtain fl(y); that is, we round up. When dk+1 < 5, we simply chop off all but the first k digits; so we round down. If we round down, then δi = di, for each i = 1, 2, …, k. However, if we round up, the digits (and even the exponent) might change.

Example 1  Determine the five-digit (a) chopping and (b) rounding values of the irrational number π.

Solution  The number π has an infinite decimal expansion of the form π = 3.14159265…. Written in normalized decimal form, we have

    π = 0.314159265… × 10^1.
(a) The floating-point form of π using five-digit chopping is

    fl(π) = 0.31415 × 10^1 = 3.1415.

(b) The sixth digit of the decimal expansion of π is a 9, so the floating-point form of π using five-digit rounding is

    fl(π) = (0.31415 + 0.00001) × 10^1 = 3.1416.

The following definition describes two methods for measuring approximation errors. (The relative error is generally a better measure of accuracy than the absolute error because it takes into consideration the size of the number being approximated.)

Definition 1.15  Suppose that p∗ is an approximation to p. The absolute error is |p − p∗|, and the relative error is |p − p∗| / |p|, provided that p ≠ 0.

Consider the absolute and relative errors in representing p by p∗ in the following example.
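Example 1 can be reproduced with Python's decimal module (an illustrative sketch; the language choice is ours, not the text's). A decimal Context whose prec counts significant digits gives k-digit chopping with ROUND_DOWN and k-digit rounding with ROUND_HALF_UP:

```python
from decimal import Decimal, Context, ROUND_DOWN, ROUND_HALF_UP
import math

pi = Decimal(math.pi)  # far more digits than a five-digit experiment needs

chop5 = Context(prec=5, rounding=ROUND_DOWN)      # five-digit chopping
round5 = Context(prec=5, rounding=ROUND_HALF_UP)  # five-digit rounding

print(chop5.plus(pi))   # 3.1415
print(round5.plus(pi))  # 3.1416
```

Both results agree with parts (a) and (b) of the solution above.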
Example 2  Determine the absolute and relative errors when approximating p by p∗ when

(a) p = 0.3000 × 10^1 and p∗ = 0.3100 × 10^1;
(b) p = 0.3000 × 10^−3 and p∗ = 0.3100 × 10^−3;
(c) p = 0.3000 × 10^4 and p∗ = 0.3100 × 10^4.

Solution
(a) For p = 0.3000 × 10^1 and p∗ = 0.3100 × 10^1 the absolute error is 0.1, and the relative error is 0.3333 × 10^−1.
(b) For p = 0.3000 × 10^−3 and p∗ = 0.3100 × 10^−3 the absolute error is 0.1 × 10^−4, and the relative error is 0.3333 × 10^−1.
(c) For p = 0.3000 × 10^4 and p∗ = 0.3100 × 10^4, the absolute error is 0.1 × 10^3, and the relative error is again 0.3333 × 10^−1.

This example shows that the same relative error, 0.3333 × 10^−1, occurs for widely varying absolute errors. As a measure of accuracy, the absolute error can be misleading and the relative error more meaningful, because the relative error takes into consideration the size of the value. We often cannot find an accurate value for the true error in an approximation. Instead we find a bound for the error, which gives us a "worst-case" error.

The following definition uses relative error to give a measure of significant digits of accuracy for an approximation. (The term significant digits is often used to loosely describe the number of decimal digits that appear to be accurate. The definition is more precise, and provides a continuous concept.)

Definition 1.16  The number p∗ is said to approximate p to t significant digits (or figures) if t is the largest nonnegative integer for which

    |p − p∗| / |p| ≤ 5 × 10^−t.

Table 1.1 illustrates the continuous nature of significant digits by listing, for the various values of p, the least upper bound of |p − p∗|, denoted max |p − p∗|, when p∗ agrees with p to four significant digits.

Table 1.1
    p             0.1      0.5      100   1000  5000  9990   10000
    max |p − p∗|  0.00005  0.00025  0.05  0.5   2.5   4.995  5.
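Definition 1.16 makes the entries of Table 1.1 easy to regenerate: for p∗ agreeing with p to t significant digits, the least upper bound of |p − p∗| is 5 × 10^−t · |p|. A short sketch (the function name is ours):

```python
def max_abs_error(p, t=4):
    # Least upper bound of |p - p*| when p* agrees with p to t significant
    # digits (Definition 1.16): |p - p*| / |p| <= 5 × 10^(-t).
    return 5 * 10 ** (-t) * abs(p)

# Regenerate the Table 1.1 row of least upper bounds.
for p in [0.1, 0.5, 100, 1000, 5000, 9990, 10000]:
    print(p, max_abs_error(p))
```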
Returning to the machine representation of numbers, we see that the floating-point representation fl(y) for the number y has the relative error

    |(y − fl(y)) / y|.

If k decimal digits and chopping are used for the machine representation of

    y = 0.d1d2…dk dk+1 … × 10^n,
then

    |(y − fl(y)) / y| = |(0.d1d2…dk dk+1… × 10^n − 0.d1d2…dk × 10^n) / (0.d1d2… × 10^n)|
                      = |(0.dk+1dk+2… × 10^(n−k)) / (0.d1d2… × 10^n)|
                      = (0.dk+1dk+2… / 0.d1d2…) × 10^−k.

Since d1 ≠ 0, the minimal value of the denominator is 0.1. The numerator is bounded above by 1. As a consequence,

    |(y − fl(y)) / y| ≤ (1 / 0.1) × 10^−k = 10^−k+1.

In a similar manner, a bound for the relative error when using k-digit rounding arithmetic is 0.5 × 10^−k+1. (See Exercise 24.)

Note that the bounds for the relative error using k-digit arithmetic are independent of the number being represented. This result is due to the manner in which the machine numbers are distributed along the real line. Because of the exponential form of the characteristic, the same number of decimal machine numbers is used to represent each of the intervals [0.1, 1], [1, 10], and [10, 100]. In fact, within the limits of the machine, the number of decimal machine numbers in [10^n, 10^(n+1)] is constant for all integers n.

Finite-Digit Arithmetic

In addition to inaccurate representation of numbers, the arithmetic performed in a computer is not exact. The arithmetic involves manipulating binary digits by various shifting, or logical, operations. Since the actual mechanics of these operations are not pertinent to this presentation, we shall devise our own approximation to computer arithmetic. Although our arithmetic will not give the exact picture, it suffices to explain the problems that occur. (For an explanation of the manipulations actually involved, the reader is urged to consult more technically oriented computer science texts, such as [Ma], Computer System Architecture.)

Assume that the floating-point representations fl(x) and fl(y) are given for the real numbers x and y and that the symbols ⊕, ⊖, ⊗, ⊘ represent machine addition, subtraction, multiplication, and division operations, respectively.
We will assume a finite-digit arithmetic given by

    x ⊕ y = fl(fl(x) + fl(y)),    x ⊗ y = fl(fl(x) × fl(y)),
    x ⊖ y = fl(fl(x) − fl(y)),    x ⊘ y = fl(fl(x) ÷ fl(y)).

This arithmetic corresponds to performing exact arithmetic on the floating-point representations of x and y and then converting the exact result to its finite-digit floating-point representation.

Rounding arithmetic is easily implemented in Maple. For example, the command

    Digits := 5

causes all arithmetic to be rounded to 5 digits. To ensure that Maple uses approximate rather than exact arithmetic, we use the evalf command. For example, if x = π and y = √2 then

    evalf(x); evalf(y)

produces 3.1416 and 1.4142, respectively. Then fl(fl(x) + fl(y)) is performed using 5-digit rounding arithmetic with

    evalf(evalf(x) + evalf(y))
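The same five-digit rounding arithmetic can be sketched in Python's decimal module instead of Maple (the translation is ours): a Context with prec=5 rounds every operation to five significant digits, so ctx.add(fl(x), fl(y)) plays the role of fl(fl(x) + fl(y)).

```python
from decimal import Decimal, Context, ROUND_HALF_UP
import math

ctx = Context(prec=5, rounding=ROUND_HALF_UP)  # five-digit rounding

def fl(value):
    # Round a number to five significant decimal digits.
    return ctx.plus(Decimal(value))

x, y = fl(math.pi), fl(math.sqrt(2))
print(x, y)        # 3.1416 1.4142
s = ctx.add(x, y)  # x ⊕ y = fl(fl(x) + fl(y))
print(s)           # 4.5558
```

The sum matches the Maple result quoted next in the text.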
which gives 4.5558. Implementing finite-digit chopping arithmetic is more difficult and requires a sequence of steps or a procedure. Exercise 27 explores this problem.

Example 3  Suppose that x = 5/7 and y = 1/3. Use five-digit chopping for calculating x + y, x − y, x × y, and x ÷ y.

Solution  Note that x = 5/7 = 0.714285… and y = 1/3 = 0.33333… implies that the five-digit chopping values of x and y are

    fl(x) = 0.71428 × 10^0 and fl(y) = 0.33333 × 10^0.

Thus

    x ⊕ y = fl(fl(x) + fl(y)) = fl(0.71428 × 10^0 + 0.33333 × 10^0)
          = fl(1.04761 × 10^0) = 0.10476 × 10^1.

The true value is x + y = 5/7 + 1/3 = 22/21, so we have

    Absolute Error = |22/21 − 0.10476 × 10^1| = 0.190 × 10^−4

and

    Relative Error = (0.190 × 10^−4) / (22/21) = 0.182 × 10^−4.

Table 1.2 lists the values of this and the other calculations.

Table 1.2
    Operation   Result           Actual value   Absolute error   Relative error
    x ⊕ y       0.10476 × 10^1   22/21          0.190 × 10^−4    0.182 × 10^−4
    x ⊖ y       0.38095 × 10^0   8/21           0.238 × 10^−5    0.625 × 10^−5
    x ⊗ y       0.23809 × 10^0   5/21           0.524 × 10^−5    0.220 × 10^−4
    x ⊘ y       0.21428 × 10^1   15/7           0.571 × 10^−4    0.267 × 10^−4

The maximum relative error for the operations in Example 3 is 0.267 × 10^−4, so the arithmetic produces satisfactory five-digit results. This is not the case in the following example.

Example 4  Suppose that in addition to x = 5/7 and y = 1/3 we have

    u = 0.714251,  v = 98765.9,  and  w = 0.111111 × 10^−4,

so that

    fl(u) = 0.71425 × 10^0,  fl(v) = 0.98765 × 10^5,  and  fl(w) = 0.11111 × 10^−4.

Determine the five-digit chopping values of x ⊖ u, (x ⊖ u) ⊘ w, (x ⊖ u) ⊗ v, and u ⊕ v.
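The Result column of Table 1.2 can be reproduced with the same decimal-module idea, switched to ROUND_DOWN so that every operation chops to five significant digits (again a sketch in Python rather than the text's Maple, where chopping is harder to express):

```python
from decimal import Decimal, Context, ROUND_DOWN

ctx = Context(prec=5, rounding=ROUND_DOWN)  # five-digit chopping

def fl(value):
    # Chop a number to five significant decimal digits.
    return ctx.plus(Decimal(value))

x = fl(Decimal(5) / Decimal(7))  # 0.71428
y = fl(Decimal(1) / Decimal(3))  # 0.33333

print(ctx.add(x, y))       # 1.0476  (x ⊕ y = 0.10476 × 10^1)
print(ctx.subtract(x, y))  # 0.38095 (x ⊖ y)
print(ctx.multiply(x, y))  # 0.23809 (x ⊗ y)
print(ctx.divide(x, y))    # 2.1428  (x ⊘ y = 0.21428 × 10^1)
```

Each exact operation on the chopped operands is itself chopped by the context, which is precisely the x ⊕ y = fl(fl(x) + fl(y)) model defined above.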