FIGURE 17.3 This image demonstrates the effects of quantization error. The upper left image is a coronary artery image with 8 bits (256 levels or shades of gray) per pixel. The upper right image has 4 bits/pixel (16 levels). The lower left image has 3 bits/pixel (8 levels). The lower right image has 2 bits/pixel (4 levels). Note the false contouring in the images as the number of possible levels in the pixel representation is reduced. This false contouring is the quantization error; as the number of levels increases, the quantization error decreases because fewer mistakes are made in representing the gray levels and less false contouring appears in the picture.

One usually needs at least 6 bits (64 gray levels) to represent an image adequately. Higher-quality imaging systems use 8 bits (256 levels) or even as many as 10 bits (1024 levels) per pixel. In most applications, the human observer cannot distinguish quantization error when there are more than 256 levels. (Many times the number of gray levels is cited in bytes. One byte is 8 bits; i.e., high-quality monochrome digital imaging systems use one byte per pixel.)

One of the problems briefly mentioned previously is the large number of pixels needed to represent an image, which translates into a large amount of digital data needed for the representation. A 512 × 512 image with 8 bits/pixel (1 byte/pixel) of gray level representation requires 2,097,152 bits of computer data to describe it. A typical computer file that contains 1000 words usually requires only about 56,000 bits to describe it. The 512 × 512 image is 37 times larger! (A picture is truly worth more than 1000 words.) This data requirement is one of the major problems with digital imaging, given that the storage of digital images in a computer file system is expensive. Perhaps another example will demonstrate this problem. Many computers and word processing systems have the capability of transmitting information over telephone lines to other systems at data rates of 2400 bits per second. At this speed it would require nearly 15 minutes to transmit a 512 × 512 image! Moving objects are imaged digitally by taking digital snapshots of them, i.e., digital video. True digital imaging would acquire about 30 images/s to capture all the important motion in a scene. At 30 images/s, with each image sampled at 512 × 512 and with 8 bits/pixel, the system must handle 62,914,560 bits/s. Only very expensive acquisition systems are capable of handling these large data rates.

The greatest advantage of digital images is that they can be processed on a computer. Any type of operation that one can do on a computer can be done to a digital image. Recall that a digital image is just a (huge) matrix of numbers. Digital image processing is the process of using a computer to extract useful information from this matrix. Processing that cannot be done optically or with analog systems (such as early video systems) can easily be done on computers. The disadvantage is that a large amount of data needs to be processed, and on some small computer systems this can take a long time (hours). We shall examine image processing in more detail in the next subsection and discuss some of the computer hardware issues in a later chapter.
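The storage and transmission figures above follow directly from the pixel counts, and the false contouring of Fig. 17.3 is simply the result of keeping fewer bits per pixel. The following minimal Python/NumPy sketch reproduces both points; the synthetic gray-level ramp used here is only a stand-in for a real 512 × 512 coronary image and is an assumption of this example.

```python
import numpy as np

# Synthetic 512 x 512, 8-bit "image" (a smooth gray-level ramp) standing in for a real picture.
image = np.tile(np.arange(512, dtype=np.uint16) // 2, (512, 1)).astype(np.uint8)

def requantize(img, bits):
    """Keep only the `bits` most significant bits of an 8-bit image.
    Coarser gray-level steps are what produce the false contouring of Fig. 17.3."""
    step = 2 ** (8 - bits)
    return (img // step) * step

for bits in (8, 4, 3, 2):
    levels = len(np.unique(requantize(image, bits)))
    print(f"{bits} bits/pixel -> {levels} gray levels")

# Data-volume arithmetic from the text.
bits_per_image = 512 * 512 * 8                 # 2,097,152 bits per image
print(bits_per_image / 56_000)                 # ~37 times the size of a 1000-word text file
print(bits_per_image / 2_400 / 60)             # ~14.6 minutes to send at 2400 bits/s
print(30 * bits_per_image)                     # 62,914,560 bits/s for 30 images/s digital video
```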
Point Operations

Perhaps the simplest image processing operation is that of modifying the values of individual pixels in an image. These operations are commonly known as point operations. A point operation might be used to highlight certain regions in an image. Suppose one wished to know where all the pixels in a certain gray level region were spatially located in the image. One would modify all those pixel values to 0 (black) or 255 (white) such that the observer could see where they were located.

Another example of a point operation is contrast enhancement or contrast stretching. The pixel values in a particular image may occupy only a small region of the gray level distribution. For instance, the pixels in an image may only take on values between 0 and 63, when they could nominally take on values between 0 and 255. This is sometimes caused by the way the image was digitized and/or by the type of transducer used. When this image is examined on a CRT display the contrast looks washed out. A simple point operation that multiplies each pixel value in the image by four will increase the apparent contrast in the image; the new image now has gray values between 0 and 252. This operation is shown in Fig. 17.4. Possibly the most widely used point operation in medical imaging is pseudo-coloring. In this point operation all the pixels in the image with a particular gray value are assigned a color. Various schemes have been proposed for appropriate pseudo-color tables that assign the gray values to colors. It should be mentioned that point operations are often cascaded, i.e., an image undergoes contrast enhancement and then pseudo-coloring.

FIGURE 17.4 Contrast stretching. The image on the left has gray values between 0 and 63, causing the contrast to look washed out. The image on the right has been contrast enhanced by multiplying the gray levels by four.

The operations described above can be thought of as operations (or algorithms) that modify the range of the gray levels of the pixels. An important feature that describes a great deal about an image is the histogram of the pixel values. A histogram is a table that lists how many pixels in an image take on a particular gray value. These data are often plotted as a function of the gray value. Point operations are also known as histogram modification or histogram stretching. The contrast enhancement operation shown in Fig. 17.4 modifies the histogram of the resultant image by stretching the gray values from a range of 0–63 to a range of 0–252. Some point operations are such that the resulting histogram of the processed image has a particular shape. A popular form of histogram modification is known as histogram equalization, whereby the pixels are modified such that the histogram of the processed image is almost flat, i.e., all the pixel values occur equally often.

It is impossible to list all possible types of point operations; however, the important thing to remember is that these operations process one pixel at a time by modifying the pixel based only on its gray level value and not on where it is located spatially (i.e., its location in the pixel matrix). These operations are performed to enhance the image, make it easier to see certain structures or regions in the image, or to force a particular shape onto the histogram of the image. They are also used as initial operations in a more complicated image processing algorithm.
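Because point operations act on each pixel independently, they reduce to elementwise array arithmetic. The sketch below is a minimal illustration, assuming NumPy and using a random 0–63 array in place of a digitized image; it shows the Fig. 17.4 contrast stretch, a gray-level highlighting operation, the histogram, and histogram equalization.

```python
import numpy as np

img = np.random.randint(0, 64, size=(512, 512), dtype=np.uint8)  # washed-out image, levels 0-63

# Contrast stretching as in Fig. 17.4: multiply every pixel by four (new range 0-252).
stretched = (img.astype(np.uint16) * 4).clip(0, 255).astype(np.uint8)

# Highlighting: set every pixel in a chosen gray-level band to white, everything else to black.
lo, hi = 20, 40                                    # hypothetical band of interest
highlight = np.where((img >= lo) & (img <= hi), 255, 0).astype(np.uint8)

# Histogram: how many pixels take on each gray value.
hist = np.bincount(img.ravel(), minlength=256)

# Histogram equalization: remap gray levels through the normalized cumulative histogram
# so that the histogram of the processed image is approximately flat.
cdf = hist.cumsum() / img.size
equalized = (255 * cdf[img]).astype(np.uint8)
```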
Image Enhancement

Image enhancement is the use of image processing algorithms to remove certain types of distortion in an image. The image is enhanced by removing noise, making the edge structures in the image stand out, or any other operation that makes the image look better.¹ The point operations discussed above are generally considered to be enhancement operations.

Enhancement also includes operations that use groups of pixels and the spatial location of the pixels in the image. The most widely used algorithms for enhancement are based on pixel functions that are known as window operations. A window operation performed on an image is nothing more than the process of examining the pixels in a certain region of the image, called the window region, and computing some type of mathematical function derived from the pixels in the window. In most cases the windows are square or rectangular, although other shapes have been used. After the operation is performed, the result of the computation is placed in the center pixel of the window. Suppose, for example, that a 3 × 3 pixel window has been extracted from the image. The values of the pixels in the window, labeled a1, a2, . . ., a9, are used to compute a new pixel value that replaces the value of a5, and the window is then moved to a new center location until all the pixels in the original image have been processed.

As an example of a window operation, suppose we computed the average value of the pixels in the window. This operation is known as smoothing and will tend to reduce noise in the image, but unfortunately it will also tend to blur edge structures in the image. Another window operation often used is the computation of a linear weighted sum of the pixel values. Let a′5 be the new pixel value that will replace a5 in the original image. We then form

$a_5' = \sum_{i=1}^{9} \alpha_i a_i$    (17.1)

where the αi's are any real numbers. For the simple smoothing operation described above we set αi = 1/9 for all i. By changing the values of the αi weights, one can perform different types of enhancement operations on an image. Any window operation that can be described by Eq. 17.1 is known as a linear window operation or convolution operator. If some of the αi coefficients take on negative values, one can enhance the appearance of edge structures in the image.

It is also possible to compute a nonlinear function of the pixels in the window. One of the more powerful nonlinear window operations is median filtering. In this operation all the pixels in the window are sorted by magnitude and the middle, or median, pixel is obtained. The median pixel is then used to replace a5. The median filter is used to remove noise from an image and at the same time preserve the edge structure in the image. More recently there has been a great deal of interest in morphological operators. These are also nonlinear window operations that can be used to extract or enhance shape information in an image.

In the preceding discussion, all of the window operations were described on 3 × 3 windows. Current research in window operations is directed at using larger window sizes, e.g., 9 × 9, 13 × 13, or 21 × 21. The philosophy in this work is that small window sizes use only local information, and what one really needs is information that is more global in nature.

¹Image enhancement is often confused with image restoration. Image enhancement is the ad hoc application of various processing algorithms to enhance the appearance of the image. Image restoration is the application of algorithms that use knowledge of the degradation process to enhance or restore the image, e.g., deconvolution algorithms used to remove the effect of the aperture point spread function in blurred images. A discussion of image restoration is beyond the scope of this section.
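As a concrete illustration of Eq. 17.1 and of median filtering, the sketch below assumes SciPy's ndimage module and uses a random array in place of a real image; the 3 × 3 smoothing window uses αi = 1/9, and a window with some negative weights enhances edge structures.

```python
import numpy as np
from scipy import ndimage

img = np.random.randint(0, 256, size=(256, 256)).astype(float)  # hypothetical test image

# Linear window operation, Eq. 17.1: each output pixel is a weighted sum of the
# 3 x 3 neighborhood of the corresponding input pixel.
smooth_weights = np.full((3, 3), 1.0 / 9.0)         # alpha_i = 1/9 -> smoothing (blurs edges)
edge_weights = np.array([[ 0, -1,  0],
                         [-1,  4, -1],
                         [ 0, -1,  0]], float)       # negative weights enhance edge structure

smoothed = ndimage.convolve(img, smooth_weights, mode="nearest")
edges = ndimage.convolve(img, edge_weights, mode="nearest")

# Nonlinear window operation: a 3 x 3 median filter, which suppresses impulsive noise
# while preserving edges better than smoothing does.
median = ndimage.median_filter(img, size=3)
```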
Digital Image Compression

Image compression refers to the task of reducing the amount of data required to store or transmit a digital image. As discussed earlier, in its natural form, a digital image comprises an array of numbers. Each such number is the sampled value of the image at a pixel (picture element) location.
These numbers are represented with finite precision using a fixed number of bits. Until recently, the dominant image size was 512 × 512 pixels with 8 bits, or 1 byte, per pixel. The total storage size for such an image is 512² ≈ 0.25 × 10⁶ bytes, or 0.25 Mbytes. When digital image processing first emerged in the 1960s, this was considered to be a formidable amount of data, and so interest in developing ways to reduce this storage requirement arose immediately. Since that time, image compression has continued to be an active area of research. The recent emergence of standards for image coding algorithms and the commercial availability of very large scale integration (VLSI) chips that implement image coding algorithms is indicative of the present maturity of the field, although research activity continues apace.

With declining memory costs and increasing transmission bandwidths, 0.25 Mbytes is no longer considered to be the large amount of data that it once was. This might suggest that the need for image compression is not as great as previously. Unfortunately (or fortunately, depending on one's point of view), this is not the case, because our appetite for image data has also grown enormously over the years. The old 512 × 512 pixels × 1 byte per pixel "standard" was a consequence of the spatial and gray scale resolution of sensors and displays that were commonly available until recently. At this time, displays with more than 10³ × 10³ pixels and 24 bits/pixel to allow full color representation (8 bits each for red, green, and blue) are becoming commonplace. Thus, our 0.25-Mbyte standard image size has grown to 3 Mbytes. This is just the tip of the iceberg, however. For example, in desktop printing applications, a 4-color (cyan, magenta, yellow, and black) image of an 8.5 × 11 in. page sampled at 600 dots per in. requires 134 Mbytes. In remote sensing applications, a typical hyperspectral image contains terrain irradiance measurements in each of 200 10-nm-wide spectral bands at 25-m intervals on the ground. Each measurement is recorded with 12-bit precision. Such data are acquired from aircraft or satellite and are used in agriculture, forestry, and other fields concerned with management of natural resources. Storage of these data from just a 10 × 10 km² area requires 4800 Mbytes.

Figure 17.5 shows the essential components of an image compression system. At the system input, the image is encoded into its compressed form by the image coder. The compressed image may then be subjected to further digital processing, such as error control coding, encryption, or multiplexing with other data sources, before being used to modulate the analog signal that is actually transmitted through the channel or stored in a storage medium. At the system output, the image is processed step by step to undo each of the operations that was performed on it at the system input. At the final step, the image is decoded into its original uncompressed form by the image decoder. Because of the role of the image encoder and decoder in an image compression system, image coding is often used as a synonym for image compression. If the reconstructed image is identical to the original image, the compression is said to be lossless. Otherwise, it is lossy.

FIGURE 17.5 Overview of an image compression system.
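The lossless/lossy distinction can be demonstrated with a trivial round trip. The sketch below is only illustrative: it uses Python's general-purpose zlib codec as a stand-in for a real image coder, and because the test image here is pure white noise there is no redundancy to exploit, so the "compressed" representation is actually no smaller, a point taken up in the next paragraph.

```python
import zlib
import numpy as np

img = np.random.randint(0, 256, size=(512, 512), dtype=np.uint8)  # hypothetical 0.25-Mbyte image

compressed = zlib.compress(img.tobytes(), 9)                      # lossless encoder (stand-in)
restored = np.frombuffer(zlib.decompress(compressed), dtype=np.uint8).reshape(img.shape)

print("compressed size:", len(compressed), "bytes of", img.size, "original bytes")
print("lossless:", np.array_equal(img, restored))                 # True: reconstruction is bit-exact
```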
Image compression algorithms depend for their success on two separate factors: redundancy and irrelevancy. Redundancy refers to the fact that each pixel in an image does not take on all possible values with equal probability, and the value that it does take on is not independent of that of the other pixels in the image. If this were not true, the image would appear as a white noise pattern such as that seen when a television receiver is tuned to an unused channel. From an information-theoretic point of view, such an image contains the maximum amount of information.
From the point of view of a human or machine interpreter, however, it contains no information at all. Irrelevancy refers to the fact that not all the information in the image is required for its intended application. First, under typical viewing conditions, it is possible to remove some of the information in an image without producing a change that is perceptible to a human observer. This is because of the limited ability of the human viewer to detect small changes in luminance over a large area, or larger changes in luminance over a very small area, especially in the presence of detail that may mask these changes. Second, even though some degradation in image quality may be observed as a result of image compression, the degradation may not be objectionable for a particular application, such as teleconferencing. Third, the degradation introduced by image compression may not interfere with the ability of a human or machine to extract the information from the image that is important for a particular application. Lossless compression algorithms can only exploit redundancy, whereas lossy methods may exploit both redundancy and irrelevancy.

A myriad of approaches have been proposed for image compression. To bring some semblance of order to the field, it is helpful to identify those key elements that provide a reasonably accurate description of most encoding algorithms. These are shown in Fig. 17.6.

FIGURE 17.6 Key elements of an image encoder.

The first step is feature extraction. Here the image is partitioned into N × N blocks of pixels. Within each block, a feature vector is computed which is used to represent all the pixels within that block. If the feature vector provides a complete description of the block, i.e., the block of pixel values can be determined exactly from the feature vector, then the feature is suitable for use in a lossless compression algorithm. Otherwise, the algorithm will be lossy. For the simplest feature vector, we let the block size N = 1 and take the pixel values themselves to be the features. Another important example for N = 1 is to let the feature be the error in the prediction of the pixel value based on the values of neighboring pixels that have already been encoded and, hence, whose values would be known at the decoder. This feature forms the basis for predictive encoding, of which differential pulse-code modulation (DPCM) is a special case. For larger block sizes, the most important example is to compute a two-dimensional (2-D) Fourier-like transform of the block of pixels and to use the N² transform coefficients as the feature vector. The widely used Joint Photographic Experts Group (JPEG) standard image coder is based on the discrete cosine transform (DCT) with a block size of N = 8. In all of the foregoing examples, the block of pixel values can be reconstructed exactly from the feature vector; in the last example, the inverse DCT is used. Hence, all these features may form the basis for a lossless compression algorithm.
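A sketch of the transform-feature idea, assuming SciPy's FFT module and an image whose dimensions are multiples of the block size: each 8 × 8 block is mapped to its 64 DCT coefficients, and the inverse transform recovers the block exactly (up to floating-point round-off), which is why this feature can support lossless coding.

```python
import numpy as np
from scipy.fft import dctn, idctn

img = np.random.randint(0, 256, size=(512, 512)).astype(float)   # hypothetical image, size a multiple of 8
N = 8                                                              # JPEG-style block size

# Feature extraction: the feature vector for each N x N block is its 2-D DCT coefficient block.
blocks = img.reshape(512 // N, N, 512 // N, N).swapaxes(1, 2)      # shape (64, 64, 8, 8)
features = dctn(blocks, axes=(-2, -1), norm="ortho")

# The transform is invertible, so this feature gives a complete description of each block.
reconstructed = idctn(features, axes=(-2, -1), norm="ortho")
print(np.allclose(blocks, reconstructed))                          # True (up to round-off)
```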
An example of a feature vector that does not provide a complete description of the pixel block is a vector consisting of the mean and variance of the pixels within the block and an N × N binary mask indicating whether or not each pixel exceeds the mean. From this vector, we can only reconstruct an approximation to the original pixel block, one which has the same mean and variance as the original. This feature is the basis for the lossy block truncation coding algorithm. Ideally, the feature vector should be chosen to provide as nonredundant a representation of the image as possible and to separate those aspects of the image that are relevant to the viewer from those that are irrelevant.

The second step in image encoding is vector quantization. This is essentially a clustering step in which we partition the feature space into cells, each of which will be represented by a single prototype feature vector. Since all feature vectors belonging to a given cell are mapped to the same prototype, the quantization process is irreversible and, hence, cannot be used as part of a lossless compression algorithm. Figure 17.7 shows an example for a two-dimensional feature space. Each dot corresponds to one feature vector from the image. Each X signifies the prototype used to represent all the feature vectors contained within its quantization cell, the boundary of which is indicated by the dashed lines. Despite the simplicity with which vector quantization may be described, the implementation of a vector quantizer is a computationally complex task unless some structure is imposed on it. The clustering is based on minimizing the distortion between the original and quantized feature vectors, averaged over the entire image. The distortion measure can be chosen to account for the relative sensitivity of the human viewer to different kinds of degradation. In one dimension, the vector quantizer reduces to the Lloyd-Max scalar quantizer.
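In the spirit of Fig. 17.7, the sketch below builds a small vector quantizer with NumPy alone, assuming synthetic 2-D feature vectors and a squared-error distortion measure; the codebook of prototypes is refined by alternating nearest-prototype assignment and centroid updates (the generalized Lloyd, or k-means, procedure).

```python
import numpy as np

rng = np.random.default_rng(0)
features = rng.normal(size=(5000, 2))             # hypothetical 2-D feature vectors (one per block)
K = 8                                             # number of prototypes, i.e., quantization cells

# Initialize the codebook with K feature vectors chosen at random.
codebook = features[rng.choice(len(features), size=K, replace=False)]

for _ in range(20):                               # generalized Lloyd (k-means) iterations
    # Assignment: map each feature vector to its nearest prototype (squared-error distortion).
    dist = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    labels = dist.argmin(axis=1)
    # Update: move each prototype to the centroid of the vectors assigned to it.
    for k in range(K):
        if np.any(labels == k):
            codebook[k] = features[labels == k].mean(axis=0)

# Quantize: every feature vector is replaced by the prototype of its cell.
labels = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2).argmin(axis=1)
quantized = codebook[labels]
print("average distortion:", ((features - quantized) ** 2).sum(axis=1).mean())
```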