1 Errors in numerical analysis
Michael Beer, Engineering Mathematics

Round-off Errors

Error source
● limited number of digits of machine numbers

Representation of machine numbers (16-bit computer)
● base-2 system (binary), 16-bit computer (word length = 16 bits)
● integer representation
  bit layout: sign | $2^{14}$ | $2^{13}$ | ... | $2^{1}$ | $2^{0}$
  » range of numbers covered: $-((2^{15}-1)+1), \ldots, +(2^{15}-1)$, i.e. $-32{,}768, \ldots, +32{,}767$
  » numbers beyond either end of this range cause overflow
● floating-point representation
  » according to standard IEEE 754 (IEEE = Institute of Electrical and Electronics Engineers)
  » (number) = (sign) · (mantissa) · (base)^(exponent)
  » base = 2, mantissa $= 1 + \sum_{k=1}^{p} a_k \cdot 2^{-k}$, $a_k \in \{0, 1\}$
  » restriction: mantissa $\in [1, 2)$
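The two representations above can be tried out with a short Python sketch. It is only an illustration under stated assumptions: the helper names `int_range`, `fits`, `decompose` and `mantissa_bits` are hypothetical and not part of the lecture, and `decompose` relies on Python floats being IEEE 754 double precision numbers, which holds on common platforms.

```python
import math

WORD_LENGTH = 16  # bits, as in the 16-bit example of the slide


def int_range(bits):
    """Smallest and largest signed integer for the given word length."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def fits(n, bits=WORD_LENGTH):
    """True if n can be stored without overflow."""
    lo, hi = int_range(bits)
    return lo <= n <= hi


def decompose(x):
    """Split a nonzero finite float into (sign, mantissa, exponent)
    such that x = sign * mantissa * 2**exponent and mantissa is in [1, 2)."""
    sign = -1 if x < 0 else 1
    m, e = math.frexp(abs(x))  # abs(x) = m * 2**e with m in [0.5, 1)
    return sign, 2.0 * m, e - 1


def mantissa_bits(mantissa, p):
    """First p binary digits a_1..a_p of mantissa = 1 + sum a_k * 2**-k."""
    frac = mantissa - 1.0
    bits = []
    for _ in range(p):
        frac *= 2.0
        bit = int(frac)  # the integer part is the next binary digit
        bits.append(bit)
        frac -= bit
    return bits
```

For example, `int_range(16)` returns `(-32768, 32767)`, and `decompose(-6.5)` returns `(-1, 1.625, 2)` because $-6.5 = -1 \cdot 1.625 \cdot 2^{2}$. The mantissa digits `mantissa_bits(1.625, 4)` are `[1, 0, 1, 0]`, i.e. $1.625 = 1 + 2^{-1} + 2^{-3}$.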
Representation of machine numbers (cont'd)
● floating-point representation (cont'd)
  bit layout: sign (1 bit) | signed exponent (r bits) | mantissa (p bits)
  » single precision format
    size: 32 bits; r = 8, p = 23
    gap between two consecutive mantissa numbers: $\varepsilon = 2^{-23} \approx 1.192 \cdot 10^{-7}$
    precision: 6 to 7 decimal digits
    range of numbers covered (approx.): $\pm 2^{-126}, \ldots, \pm 2^{+128}$, i.e. $\pm 1.175 \cdot 10^{-38}, \ldots, \pm 3.403 \cdot 10^{+38}$ (in addition to zero)
    underflow below the lower bound, overflow above the upper bound
  » double precision format
    size: 64 bits; r = 11, p = 52
    gap: $\varepsilon = 2^{-52} \approx 2.220 \cdot 10^{-16}$
    precision: 15 to 16 decimal digits
    range of numbers covered (approx.): $\pm 2^{-1022}, \ldots, \pm 2^{+1024}$, i.e. $\pm 2.225 \cdot 10^{-308}, \ldots, \pm 1.798 \cdot 10^{+308}$ (in addition to zero)
    underflow below the lower bound, overflow above the upper bound
  » quad precision format (under construction in IEEE 754r); size: 128 bits
Characteristics of machine numbers
● discrete, finite, countable set $\mathbb{D}$ of available numerical representations; combinatorial problem: $|\mathbb{D}| \le 2^{64} \approx 1.845 \cdot 10^{19}$
  » compare: the set $\mathbb{R}$ of real numbers is connected, infinite and uncountable; $|\mathbb{D}| \ll |\mathbb{R}|$
  » exact representation of a true real number is an exceptional case!
  » each individual computer operation $f[\cdot]$ is performed with approximated numbers $x_a$ and yields an approximated result $y_a$:
    $y = y_a(f[x_{a1}(x_{t1}), x_{a2}(x_{t2})])$ with $x_{t1}, x_{t2} \in \mathbb{R}$ and $x_{a1}, x_{a2}, y_a \in \mathbb{D}$
  » transformation from $\mathbb{R}$ to $\mathbb{D}$ leads to errors

Definition: round-off error
● difference between the true real number $x_t$ and its computer approximation $x_a$:
  $E_{ro} = x_t - x_a(x_t)$ with $x_t \in \mathbb{R}$ and $x_a(x_t) \in \mathbb{D}$ (form of quantisation error)
  » $E_{ro}$ increases with the magnitude of $x_t$ (constant $\varepsilon$ but increasing exponent)
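The definition $E_{ro} = x_t - x_a(x_t)$ can be evaluated exactly for rational $x_t$. The sketch below is an illustration, not part of the lecture: the function name `round_off_error` is hypothetical, the "true" numbers are restricted to rationals held as `fractions.Fraction`, and $\mathbb{D}$ is assumed to be the set of IEEE 754 double precision numbers behind Python's `float`.

```python
from fractions import Fraction


def round_off_error(x_true):
    """E_ro = x_t - x_a(x_t), evaluated exactly in rational arithmetic."""
    x_approx = float(x_true)             # nearest double precision number
    return x_true - Fraction(x_approx)   # Fraction(float) is an exact conversion


# 1/2 is a machine number, 1/10 is not.
e_half = round_off_error(Fraction(1, 2))
e_small = round_off_error(Fraction(1, 10))

# Same mantissa, exponent increased by 40: the error grows by the factor 2**40.
e_large = round_off_error(Fraction(1, 10) * 2 ** 40)

# The error relative to x_t stays bounded by half the mantissa gap, 2**-53.
rel_small = abs(e_small) / Fraction(1, 10)
rel_large = abs(e_large) / (Fraction(1, 10) * 2 ** 40)

# Each operation works on approximated inputs and yields an approximated result.
sum_is_exact = (0.1 + 0.2 == 0.3)
```

Here `e_half` is zero, `e_small` is nonzero, and `e_large == e_small * 2**40`, which matches the statement that $E_{ro}$ grows with the exponent while $\varepsilon$ stays constant. `sum_is_exact` is `False`, a small instance of an operation $f[\cdot]$ carried out on approximated numbers.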