§9.1 Quantization Process and Error
• Two basic types of binary representations of data: (1) fixed-point and (2) floating-point formats
• Various problems can arise in the digital implementation of arithmetic operations involving binary data
• These problems are caused by the finite wordlength of the registers that store the data and the results of arithmetic operations
• For example, in fixed-point arithmetic the product of two b-bit numbers is 2b bits long and has to be quantized to b bits to fit the prescribed wordlength of the registers
• In fixed-point arithmetic, an addition can produce a sum that exceeds the register wordlength, causing an overflow
• In floating-point arithmetic, there is effectively no overflow because of the large dynamic range, but the results of both addition and multiplication may have to be quantized
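The two fixed-point effects above can be checked with exact rational arithmetic. The following is a minimal sketch, not a model of any particular processor: the wordlength `B = 4`, the operand values, and the helper names `to_fraction` and `truncate` are all assumptions chosen for illustration, and truncation is only one of several possible quantization methods.

```python
from fractions import Fraction

B = 4  # assumed number of fractional bits (sign bit not counted)

def to_fraction(k: int, bits: int) -> Fraction:
    """Interpret the integer k as the fixed-point fraction k / 2**bits."""
    return Fraction(k, 2 ** bits)

def truncate(x: Fraction, bits: int) -> Fraction:
    """Quantize x to `bits` fractional bits by discarding the lower bits."""
    return Fraction((x.numerator * 2 ** bits) // x.denominator, 2 ** bits)

a = to_fraction(11, B)  # 0.1011 in binary = 0.6875
c = to_fraction(13, B)  # 0.1101 in binary = 0.8125

# Multiplication: the exact product needs 2B fractional bits.
product = a * c                      # 143/256
product_q = truncate(product, B)     # quantized back to B bits
product_error = product_q - product  # quantization error

# Addition: the exact sum falls outside the fractional range [-1, 1).
total = a + c
overflow = not (-1 <= total < 1)

print(product, product_q, product_error, total, overflow)
```

Here the product 143/256 has a denominator of 2^8, so it cannot be stored in 4 fractional bits without error, while the sum 3/2 cannot be stored as a fraction at all.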
• In both fixed-point and floating-point formats, a negative number can be represented in one of three different forms (sign-magnitude, ones'-complement, or two's-complement)
• Analysis of the various quantization effects on the performance of a digital filter depends on (1) the data format (fixed-point or floating-point), (2) the type of representation of negative numbers, (3) the type of quantization, and (4) the digital filter structure implementing the transfer function
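As an illustration of the three forms for a negative fixed-point fraction, the sketch below builds the (b+1)-bit patterns for -(magnitude / 2^b). The function name `encode_negative_forms` and the example value are hypothetical, and the sketch assumes the three forms are sign-magnitude, ones'-complement, and two's-complement.

```python
def encode_negative_forms(magnitude: int, b: int) -> dict:
    """Return (b+1)-bit patterns for -(magnitude / 2**b) in three forms.

    `magnitude` is the integer value of the b fractional bits and must
    satisfy 0 < magnitude < 2**b.
    """
    if not 0 < magnitude < 2 ** b:
        raise ValueError("magnitude must fit in b bits and be nonzero")
    width = b + 1
    mask = (1 << width) - 1

    sign_magnitude = (1 << b) | magnitude         # set the sign bit only
    ones_complement = ~magnitude & mask           # complement every bit
    twos_complement = (ones_complement + 1) & mask  # add one LSB

    def bits(v: int) -> str:
        return format(v, f"0{width}b")

    return {
        "sign_magnitude": bits(sign_magnitude),
        "ones_complement": bits(ones_complement),
        "twos_complement": bits(twos_complement),
    }

# -0.375 with b = 3 fractional bits: the magnitude bits are 011.
forms = encode_negative_forms(3, 3)
print(forms)
```

The first bit of each pattern is the sign bit. The same value maps to three different bit patterns, which is why the analysis of quantization effects depends on the chosen representation.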
• Since the number of possible combinations of the type of arithmetic, type of quantization method, and digital filter structure is very large, quantization effects are discussed only for some selected practical cases
• The analysis presented can be extended easily to other cases
• In DSP applications, it is common practice to represent the data either as a fixed-point fraction or as a floating-point binary number with the mantissa as a binary fraction
• Assume the available wordlength is (b+1) bits, with the most significant bit (MSB) representing the sign
• Consider the data to be a (b+1)-bit fixed-point fraction
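To make the (b+1)-bit fixed-point fraction concrete, the sketch below maps a real value onto the nearest representable level with step 2^-b. It is only an illustration under stated assumptions: the function name `quantize_round` is hypothetical, rounding is chosen as the quantization method, and the saturation limits assume a two's-complement range of [-1, 1 - 2^-b].

```python
import math

def quantize_round(x: float, b: int) -> float:
    """Round x to the nearest multiple of 2**-b, saturating at the
    assumed two's-complement limits -1 and 1 - 2**-b."""
    step = 2.0 ** -b
    level = math.floor(x / step + 0.5)              # nearest level
    level = max(-(2 ** b), min(2 ** b - 1, level))  # stay within b+1 bits
    return level * step

b = 3                      # 1 sign bit + 3 fractional bits
x = 0.3
xq = quantize_round(x, b)  # nearest multiple of 1/8
error = xq - x             # quantization error

print(xq, error)
print(quantize_round(0.99, b))  # saturates at the largest positive level
```

With rounding, the magnitude of the error for in-range inputs does not exceed half a step, that is 2^-(b+1). Inputs beyond the largest level are clipped, which produces a larger error.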