ISO/IEC JTC1/SC29/WG1 N1816
July 2000

ISO/IEC JTC1/SC29/WG1 (ITU-T SG8)
Coding of Still Pictures
JBIG: Joint Bi-level Image Experts Group
JPEG: Joint Photographic Experts Group

TITLE: An analytical study of JPEG 2000 functionalities. Paper to be published in the Proceedings of SPIE, vol. 4115, of the 45th annual SPIE meeting, Applications of Digital Image Processing XXIII.
SOURCE: Diego Santa Cruz, Touradj Ebrahimi, Joel Askelof, Mathias Larsson and Charilaos Christopoulos. Diego.SantaCruz@epfl.ch, Touradj.Ebrahimi@epfl.ch, Joel.Askelof@era.ericsson.se, Mathias.Larsson@era.ericsson.se, Charilaos.Christopoulos@era.ericsson.se
PROJECT: JPEG 2000
STATUS: Information
REQUESTED ACTION:
DISTRIBUTION: WG1 delegates, WG1 website and reflectors
Contact: ISO/IEC JTC1/SC29/WG1 Convener - Dr. Daniel Lee, Hewlett-Packard Company, 11000 Wolfe Road, MS 42U0, Cupertino, CA 95014. Tel: +1 408 447 4160, Fax: +1 408 447 2842, E-mail: daniel_lee@
To be published in Proceedings of SPIE Vol. 4115. See http://ltswww.epfl.ch/~dsanta/ for the final reference.

JPEG 2000 still image coding versus other standards

D. Santa-Cruz (a), T. Ebrahimi (a), J. Askelöf (b), M. Larsson (b) and C. A. Christopoulos (b)
(a) Signal Processing Laboratory, Swiss Federal Institute of Technology, CH-1015 Lausanne, Switzerland. E-mail: {Diego.SantaCruz, Touradj.Ebrahimi}@epfl.ch
(b) Ericsson Research, Corporate Unit, S-164 Stockholm, Sweden. E-mail: {Joel.Askelof, Mathias.Larsson, Charilaos.Christopoulos}@era.ericsson.se

ABSTRACT

JPEG 2000, the new ISO/ITU-T standard for still image coding, is about to be finished. Other new standards have been recently introduced, namely JPEG-LS and MPEG-4 VTC. This paper compares the set of features offered by JPEG 2000, and how well they are fulfilled, versus JPEG-LS and MPEG-4 VTC, as well as the older but widely used JPEG and the more recent PNG. The study concentrates on compression efficiency and functionality set, while addressing other aspects such as complexity. Lossless compression efficiency as well as the fixed and progressive lossy rate-distortion behaviors are evaluated. Robustness to transmission errors, Region of Interest coding and complexity are also discussed. The principles behind each algorithm are briefly described. The results show that the choice of the “best” standard depends strongly on the application at hand, but that JPEG 2000 supports the widest set of features among the evaluated standards, while providing superior rate-distortion performance in most cases.

Keywords: image coding, standards, wavelets, DWT, DCT, JPEG, JPEG-LS, JPEG 2000, MPEG-4, PNG

1. INTRODUCTION

Three years have passed since the call for proposals [1] for the next ISO/ITU-T standard for compression of still images, JPEG 2000, was issued. JPEG 2000 Part I (the core system) is now in its final stage before becoming an International Standard (IS). It was promoted to Final Committee Draft (FCD) [2] in March 2000 and will reach IS status by the end of the same year.

A great effort has been made to deliver a new standard for today's and tomorrow's applications, by providing features absent from previous standards, but also by providing higher efficiency for features that exist in others. Now that the new standard is nearing finalization, a natural question is: what features does JPEG 2000 offer, and how well are they fulfilled when compared to other standards offering the same features? This paper aims at providing an answer to this simple but somewhat complex question. Section 2 provides a brief overview of the techniques compared, with special attention to new features of JPEG 2000 such as Region of Interest (ROI) coding. Section 3 explains the comparison methodology employed for the results shown in section 4, and conclusions are drawn in section 5.

2. OVERVIEW OF STILL IMAGE CODING STANDARDS

For the purpose of this study we compare the coding algorithm in the JPEG 2000 standard to the following three standards: JPEG [3], MPEG-4 Visual Texture Coding (VTC) [4] and JPEG-LS [5]. In addition, we also include PNG [6]. The reasons behind this choice are as follows. JPEG is one of the most popular coding techniques in imaging applications ranging from the Internet to digital photography. Both MPEG-4 VTC and JPEG-LS are very recent standards that are starting to appear in various applications. It is only logical to compare the set of features offered by the JPEG 2000 standard not only to those offered by a popular but older standard (JPEG), but also to those offered by the most recent ones using newer state-of-the-art technologies. Although PNG is not formally a standard and is not based on state-of-the-art techniques, it is becoming increasingly popular for Internet-based applications. PNG is also undergoing standardization by ISO/IEC JTC1/SC24 and will eventually become ISO/IEC international standard 15948.
2.1. JPEG

This is the very well known ISO/ITU-T standard created in the late 1980s. There are several modes defined for JPEG [3], including baseline, lossless, progressive and hierarchical. The baseline mode is the most popular one and supports lossy coding only. The lossless mode is not popular but provides for lossless coding, although it does not support lossy coding.

In the baseline mode, the image is divided into 8x8 blocks and each of these is transformed with the DCT. The transformed blocks are quantized with a uniform scalar quantizer, zig-zag scanned and entropy coded with Huffman coding. The quantization step size for each of the 64 DCT coefficients is specified in a quantization table, which remains the same for all blocks. The DC coefficients of all blocks are coded separately, using a predictive scheme. Hereafter we refer to this mode simply as JPEG.

The lossless mode is based on a completely different algorithm, which uses a predictive scheme. The prediction is based on the nearest three causal neighbors and seven different predictors are defined (the same one is used for all samples). The prediction error is entropy coded with Huffman coding. Hereafter we refer to this mode as L-JPEG.

The progressive and hierarchical modes of JPEG are both lossy and differ from the baseline mode only in the way the DCT coefficients are coded or computed, respectively. They allow the reconstruction of a lower quality or lower resolution version of the image, respectively, by partial decoding of the compressed bitstream. Progressive mode encodes the quantized coefficients by a mixture of spectral selection and successive approximation, while hierarchical mode uses a pyramidal approach to compute the DCT coefficients in a multi-resolution way.

2.2. MPEG-4 VTC

MPEG-4 Visual Texture Coding (VTC) is the algorithm used in MPEG-4 [4] to compress visual textures and still images, which are then used in photo-realistic 3D models, animated meshes, etc., or as simple still images. It is based on the discrete wavelet transform (DWT), scalar quantization, zero-tree coding and arithmetic coding. The DWT is dyadic and uses a Daubechies (9,3) tap biorthogonal filter.

The quantization is scalar and can be of three types: single (SQ), multiple (MQ) and bi-level (BQ). With SQ each wavelet coefficient is quantized once, and the produced bitstream is not SNR scalable. With MQ a coarse quantizer is used and this information is coded. A finer quantizer is then applied to the resulting quantization error and the new information is coded. This process can be repeated several times, resulting in limited SNR scalability. BQ is essentially like SQ, but the information is sent by bitplanes, providing general SNR scalability.

Two scanning modes are available: tree-depth (TD), the standard zero-tree scanning, and band-by-band (BB). Only the latter provides for resolution scalability. The produced bitstream is resolution scalable at first, if BB scanning is used, and then SNR scalable within each resolution level, if MQ or BQ is used.

A unique feature of MPEG-4 VTC is the capability to code arbitrarily shaped objects. This is accomplished by means of a shape adaptive DWT and MPEG-4’s shape coding. Several objects can be encoded separately, possibly at different qualities, and then composited at the decoder to obtain the final decoded image. On the other hand, MPEG-4 VTC does not support lossless coding.

2.3. JPEG-LS

JPEG-LS [5] is the latest ISO/ITU-T standard for lossless coding of still images. It also provides for “near-lossless” compression. Part-I, the baseline system, is based on adaptive prediction, context modeling and Golomb coding. In addition, it features a flat region detector so that flat regions can be encoded as run-lengths.
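The prediction and Golomb coding stages named above can be illustrated with a small sketch. It is not the JPEG-LS reference algorithm: it shows only the median edge detector used as the fixed predictor in the JPEG-LS baseline, a folding of signed residuals to non-negative integers, and a power-of-two Golomb code. Context modeling, bias correction, run mode and code-length limiting are omitted, the parameter k is passed by hand rather than adapted per context, and the function names and the unary bit convention (ones terminated by a zero) are assumptions made for illustration.

```python
def med_predict(a, b, c):
    """Median edge detector; a = left, b = above, c = upper-left neighbor."""
    if c >= max(a, b):
        return min(a, b)
    if c <= min(a, b):
        return max(a, b)
    return a + b - c


def map_residual(e):
    """Fold a signed residual to a non-negative integer (0, -1, 1, -2 -> 0, 1, 2, 3)."""
    return 2 * e if e >= 0 else -2 * e - 1


def golomb_rice_encode(n, k):
    """Golomb code with divisor 2**k: unary quotient, then a k-bit remainder."""
    quotient = n >> k
    bits = "1" * quotient + "0"  # unary part (bit convention is an assumption)
    if k > 0:
        bits += format(n & ((1 << k) - 1), "0{}b".format(k))
    return bits


# Example: sample value 18 with neighbors left=10, above=20, upper-left=15.
prediction = med_predict(10, 20, 15)
residual = 18 - prediction
codeword = golomb_rice_encode(map_residual(residual), 1)
```

Small residuals, which dominate in smooth regions, receive short codewords, which is what makes this family of codes attractive for a low-complexity lossless coder.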
Near-lossless compression is achieved by allowing a fixed maximum sample error. Part-II will introduce extensions such as an arithmetic coder, but is still under preparation. This algorithm was designed for low complexity while providing high lossless compression ratios. However, it does not provide support for scalability, error resilience or any such functionality.

2.4. PNG

Portable Network Graphics (PNG) [6] is a W3C recommendation for coding of still images which has been elaborated as a patent-free replacement for GIF, while incorporating more features than the latter. It is based on a predictive scheme and entropy coding. The prediction is done on the three nearest causal neighbors and there are five predictors that can be selected on a line-by-line basis. The entropy coding uses the Deflate algorithm of the popular Zip file compression utility
which is based on LZ77 coupled with Huffman coding. PNG is capable of lossless compression only and supports gray scale, paletted color and true color, an optional alpha plane, interlacing and other features.

2.5. JPEG 2000

JPEG 2000 [2], as noted previously, is the next ISO/ITU-T standard for still image coding. In the following, we restrict the description to Part I of the standard, which defines the core system. Part II will provide various extensions for specific applications, but is still in preparation.

JPEG 2000 is based on the discrete wavelet transform (DWT), scalar quantization, context modeling, arithmetic coding and post-compression rate allocation. The DWT is dyadic and can be performed with either the reversible Le Gall (5,3) taps filter [9], which provides for lossless coding, or the non-reversible Daubechies (9,7) taps biorthogonal one [10], which provides higher compression but does not support lossless coding. The quantizer follows an embedded dead-zone scalar approach and is independent for each sub-band. Each sub-band is divided into rectangular blocks (called code-blocks in JPEG 2000), typically 64x64, and entropy coded using context modeling and bit-plane arithmetic coding. The coded data is organized in so-called layers, which are quality levels, using the post-compression rate allocation, and output to the code-stream in packets. The generated code-stream is parseable and can be resolution, layer (i.e. SNR), position or component progressive, or any combination thereof.

JPEG 2000 also supports a number of functionalities, many of which are inherent in the algorithm itself. One example is random access, which is possible because of the independent coding of the code-blocks and the packetized structure of the codestream. Another such functionality is the possibility to encode images with arbitrarily shaped Regions of Interest (ROI) [11].
The fact that the sub-bands are encoded bitplane by bitplane makes it possible to select regions of the image that will precede the rest of the image in the codestream. This is done by scaling the sub-band samples so that the bitplanes encoded first contain only ROI information and the following bitplanes contain only background information. The only thing the decoder needs to receive is the factor by which the samples were scaled. The decoder can then invert the scaling based only on the amplitude of the samples. Other supported functionalities are error resilience, random access, multicomponent images, palettized color, compressed domain lossless flipping and simple rotation, to mention a few.

3. COMPARISON METHODOLOGY

Although one of the major, and often only, concerns in coding techniques has been that of compression efficiency, it is not the only factor that determines the choice of a particular algorithm for an application. Most applications also require features other than simple compression efficiency from a coding algorithm. These are often referred to as functionalities. Examples of such functionalities are the ability to distribute quality in a non-uniform fashion across the image (e.g., ROI), or resiliency to residual transmission errors that occur in mobile channels.

In this paper we report on compression efficiency, since it is still one of the top priorities in many imaging products, but we also devote attention to complexity and functionalities. In the next section we summarize the results of the study as far as the considered functionalities are concerned.

3.1. Compression efficiency

Compression efficiency is measured for lossless and lossy compression. For lossless coding it is simply measured by the achieved compression ratio for each one of the test images. For lossy coding the root mean square error (RMSE) is used, as well as the corresponding peak signal to noise ratio (PSNR), defined as

\[ \mathrm{PSNR} = -20 \log_{10} \frac{\mathrm{RMSE}}{2^{b} - 1} \]

where b is the bit depth of the original image.
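As a concrete reading of the definition above, the following sketch computes RMSE and PSNR for two equally sized sample sequences. The function names and the flat-list image representation are assumptions made for illustration; they are not taken from the software used in this study.

```python
import math


def rmse(original, decoded):
    """Root mean square error between two equally sized sample sequences."""
    if len(original) != len(decoded) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((o - d) ** 2 for o, d in zip(original, decoded))
    return math.sqrt(squared / len(original))


def psnr(original, decoded, bit_depth=8):
    """PSNR in dB: -20 * log10(RMSE / (2**b - 1)), b being the bit depth."""
    err = rmse(original, decoded)
    if err == 0:
        return math.inf  # identical images, e.g. after lossless coding
    peak = 2 ** bit_depth - 1
    return -20 * math.log10(err / peak)


# Example: every sample is off by 2, so the RMSE is exactly 2.
original = [10, 20, 30, 40]
decoded = [12, 18, 32, 38]
error = rmse(original, decoded)
quality_db = psnr(original, decoded, bit_depth=8)
```

For an image coded with an ROI, the same two functions can be applied once to the samples inside the ROI mask and once to all samples of the image.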
Although RMSE and PSNR are known to not always faithfully represent visual quality, they are the only established, well-known objective measures that work reasonably well across a wide range of compression ratios. For images encoded with a Region of Interest (ROI) the RMSE, as well as the corresponding PSNR, are calculated both for the ROI and for the entire image.

3.2. Complexity

Evaluating complexity is a difficult issue, with no well-defined measure. It means different things for different applications. It can be memory bandwidth, total working memory, number of CPU cycles, number of hardware gates, etc. Furthermore,
these numbers are very dependent on the optimization, targeted applications and other factors of the different implementations. As a rough indication of complexity we provide the run times of the different algorithms on a Linux-based PC. This only gives an appreciation of the involved complexity.

3.3. Functionalities

Comparing how well different functionalities are fulfilled in the different standards is also a difficult issue. In the next section we provide a functionality matrix that indicates the set of supported features in each standard and an appreciation of how well they are fulfilled. Although in most cases this appreciation is based on the other results presented here, in some others it is based on the capabilities provided by the different algorithms.

4. RESULTS

The algorithms have been evaluated with seven images from the JPEG 2000 test set, covering various types of imagery. The images “bike” (2048x2560) and “cafe” (2048x2560) are natural, “cmpnd1” (512x768) and “chart” (1688x2347) are compound documents consisting of text, photographs and computer graphics, “aerial2” (2048x2048) is an aerial photograph, “target” (512x512) is a computer generated image and “us” (512x448) an ultrasound scan. All these images have a depth of 8 bits per pixel.

The results have been generated on a PC with a 550 MHz Pentium(TM) III processor, 512 kB of cache and 512 MB of RAM under Linux 2.2.12. The software implementations used for coding the images are: the JPEG 2000 Verification Model (VM) 6.1 (ISO/IEC JTC1/SC29/WG1 N 1580), the MPEG-4 MoMuSys VM of Aug. 1999 (ISO/IEC JTC1/SC29/WG11 N 2805), the Independent JPEG Group JPEG implementation (http://www.ijg.org/), version 6b, the SPMG JPEG-LS implementation of the University of British Columbia (http://spmg.ece.ubc.ca/), version 2.2, the Lossless JPEG codec of Cornell University (ftp://ftp.cs.cornell.edu/pub/multimed), version 1.0, and the libpng implementation of PNG (ftp://ftp.uu.net/graphics/png), version 1.0.3.

4.1. Lossless compression

Table 1 summarizes the lossless compression efficiency of lossless JPEG (L-JPEG), JPEG-LS, PNG and JPEG 2000 for all the test images. For JPEG 2000 the reversible DWT filter, referred to as JPEG 2000R, has been used. In the case of L-JPEG, optimized Huffman tables and the predictor yielding the best compression performance have been used for each image. For PNG the maximum compression setting has been used, while for JPEG-LS the default options were chosen. MPEG-4 VTC is not considered, as it does not provide a lossless functionality.

Table 1. Lossless compression ratios

Image     JPEG 2000R   JPEG-LS   L-JPEG   PNG
bike      1.77         1.84      1.61     1.66
cafe      1.49         1.57      1.36     1.44
cmpnd1    3.77         6.44      3.23     6.02
chart     2.60         2.82      2.00     2.41
aerial2   1.47         1.51      1.43     1.48
target    3.76         3.66      2.59     8.70
us        2.63         3.04      2.41     2.94
average   2.50         2.98      2.09     3.52

It can be seen that in almost all cases the best performance is obtained by JPEG-LS. JPEG 2000 provides, in most cases, competitive compression ratios with the added benefit of scalability. PNG performance is similar to that of JPEG 2000. As for lossless JPEG, it does not perform as well as the other, more recent, standards. One notable exception to the general trend is the “target” image, which contains mostly patches of constant gray level as well as gradients. For this type of image, PNG provides the best results, probably because of the use of LZ77. Another exception is the “cmpnd1” image, in which JPEG-LS and PNG achieve much larger compression ratios. This image contains for the most part black text on a