Preface

With the widespread adoption of technologies such as digital television, Internet streaming video and DVD-Video, video compression has become an essential component of broadcast and entertainment media. The success of digital TV and DVD-Video is based upon the 10-year-old MPEG-2 standard, a technology that has proved its effectiveness but is now looking distinctly old-fashioned. It is clear that the time is right to replace MPEG-2 video compression with a more effective and efficient technology that can take advantage of recent progress in processing power.

For some time there has been a running debate about which technology should take up MPEG-2’s mantle. The leading contenders are the International Standards known as MPEG-4 Visual and H.264.

This book aims to provide a clear, practical and unbiased guide to these two standards to enable developers, engineers, researchers and students to understand and apply them effectively. Video and image compression is a complex and extensive subject and this book keeps an unapologetically limited focus, concentrating on the standards themselves (and in the case of MPEG-4 Visual, on the elements of the standard that support coding of ‘real world’ video material) and on video coding concepts that directly underpin the standards. The book takes an application-based approach and places particular emphasis on tools and features that are helpful in practical applications, in order to provide practical and useful assistance to developers and adopters of these standards.

I am grateful to a number of people who helped to shape the content of this book. I received many helpful comments and requests from readers of my book Video Codec Design. Particular thanks are due to Gary Sullivan for taking the time to provide helpful and detailed comments, corrections and advice and for kindly agreeing to write a Foreword; to Harvey Hanna (Impact Labs Inc), Yafan Zhao (The Robert Gordon University) and Aitor Garay for reading and commenting on sections of this book during its development; to members of the Joint Video Team for clarifying many of the details of H.264; to the editorial team at John Wiley & Sons (and especially to the ever-helpful, patient and supportive Kathryn Sharples); to Phyllis for her constant support; and finally to Freya and Hugh for patiently waiting for the long-promised trip to Storybook Glen!

I very much hope that you will find this book enjoyable, readable and above all useful.

Further resources and links are available at my website, http://www.vcodex.com/. I always appreciate feedback, comments and suggestions from readers and you will find contact details at this website.

Iain Richardson
Glossary

4:2:0 (sampling)
    Sampling method: chrominance components have half the horizontal and vertical resolution of luminance component
4:2:2 (sampling)
    Sampling method: chrominance components have half the horizontal resolution of luminance component
4:4:4 (sampling)
    Sampling method: chrominance components have same resolution as luminance component
arithmetic coding
    Coding method to reduce redundancy
artefact
    Visual distortion in an image
ASO
    Arbitrary Slice Order, in which slices may be coded out of raster sequence
BAB
    Binary Alpha Block, indicates the boundaries of a region (MPEG-4 Visual)
BAP
    Body Animation Parameters
Block
    Region of macroblock (8 × 8 or 4 × 4) for transform purposes
block matching
    Motion estimation carried out on rectangular picture areas
blocking
    Square or rectangular distortion areas in an image
B-picture (slice)
    Coded picture (slice) predicted using bidirectional motion compensation
CABAC
    Context-based Adaptive Binary Arithmetic Coding
CAE
    Context-based Arithmetic Encoding
CAVLC
    Context Adaptive Variable Length Coding
chrominance
    Colour difference component
CIF
    Common Intermediate Format, a colour image format
CODEC
    COder / DECoder pair
colour space
    Method of representing colour images
DCT
    Discrete Cosine Transform
Direct prediction
    A coding mode in which no motion vector is transmitted
DPCM
    Differential Pulse Code Modulation
DSCQS
    Double Stimulus Continuous Quality Scale, a scale and method for subjective quality measurement
DWT
    Discrete Wavelet Transform
entropy coding
    Coding method to reduce redundancy
error concealment
    Post-processing of a decoded image to remove or reduce visible error effects
Exp-Golomb
    Exponential Golomb variable length codes
FAP
    Facial Animation Parameters
FBA
    Face and Body Animation
FGS
    Fine Granular Scalability
field
    Odd- or even-numbered lines from an interlaced video sequence
flowgraph
    Pictorial representation of a transform algorithm (or the algorithm itself)
FMO
    Flexible Macroblock Order, in which macroblocks may be coded out of raster sequence
Full Search
    A motion estimation algorithm
GMC
    Global Motion Compensation, motion compensation applied to a complete coded object (MPEG-4 Visual)
GOP
    Group Of Pictures, a set of coded video images
H.261
    A video coding standard
H.263
    A video coding standard
H.264
    A video coding standard
HDTV
    High Definition Television
Huffman coding
    Coding method to reduce redundancy
HVS
    Human Visual System, the system by which humans perceive and interpret visual images
hybrid (CODEC)
    CODEC model featuring motion compensation and transform
IEC
    International Electrotechnical Commission, a standards body
Inter (coding)
    Coding of video frames using temporal prediction or compensation
interlaced (video)
    Video data represented as a series of fields
intra (coding)
    Coding of video frames without temporal prediction
I-picture (slice)
    Picture (or slice) coded without reference to any other frame
ISO
    International Organization for Standardization, a standards body
ITU
    International Telecommunication Union, a standards body
JPEG
    Joint Photographic Experts Group, a committee of ISO (also an image coding standard)
JPEG2000
    An image coding standard
latency
    Delay through a communication system
Level
    A set of conformance parameters (applied to a Profile)
loop filter
    Spatial filter placed within encoding or decoding feedback loop
Macroblock
    Region of frame coded as a unit (usually 16 × 16 pixels in the original frame)
Macroblock partition
    Region of macroblock with its own motion vector (H.264)
Macroblock sub-partition
    Region of macroblock with its own motion vector (H.264)
media processor
    Processor with features specific to multimedia coding and processing
motion compensation
    Prediction of a video frame with modelling of motion
motion estimation
    Estimation of relative motion between two or more video frames
motion vector
    Vector indicating a displaced block or region to be used for motion compensation
MPEG
    Moving Picture Experts Group, a committee of ISO/IEC
MPEG-1
    A multimedia coding standard
MPEG-2
    A multimedia coding standard
MPEG-4
    A multimedia coding standard
NAL
    Network Abstraction Layer
objective quality
    Visual quality measured by algorithm(s)
OBMC
    Overlapped Block Motion Compensation
Picture (coded)
    Coded (compressed) video frame
P-picture (slice)
    Coded picture (or slice) using motion-compensated prediction from one reference frame
profile
    A set of functional capabilities (of a video CODEC)
progressive (video)
    Video data represented as a series of complete frames
PSNR
    Peak Signal to Noise Ratio, an objective quality measure
QCIF
    Quarter Common Intermediate Format
quantise
    Reduce the precision of a scalar or vector quantity
rate control
    Control of bit rate of encoded video signal
rate–distortion
    Measure of CODEC performance (distortion at a range of coded bit rates)
RBSP
    Raw Byte Sequence Payload
RGB
    Red/Green/Blue colour space
ringing (artefacts)
    ‘Ripple’-like artefacts around sharp edges in a decoded image
RTP
    Real-time Transport Protocol, a transport protocol for real-time data
RVLC
    Reversible Variable Length Code
scalable coding
    Coding a signal into a number of layers
SI slice
    Intra-coded slice used for switching between coded bitstreams (H.264)
slice
    A region of a coded picture
SNHC
    Synthetic Natural Hybrid Coding
SP slice
    Inter-coded slice used for switching between coded bitstreams (H.264)
sprite
    Texture region that may be incorporated in a series of decoded frames (MPEG-4 Visual)
statistical redundancy
    Redundancy due to the statistical distribution of data
studio quality
    Lossless or near-lossless video quality
subjective quality
    Visual quality as perceived by human observer(s)
subjective redundancy
    Redundancy due to components of the data that are subjectively insignificant
sub-pixel (motion compensation)
    Motion-compensated prediction from a reference area that may be formed by interpolating between integer-valued pixel positions
test model
    A software model and document that describe a reference implementation of a video coding standard
Texture
    Image or residual data
Tree-structured motion compensation
    Motion compensation featuring a flexible hierarchy of partition sizes (H.264)
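
The 4:2:0, 4:2:2 and 4:4:4 sampling entries in this glossary can be made concrete with a short calculation. The following is a minimal Python sketch, not taken from either standard: the function names are hypothetical, the frame dimensions are assumed to be even, and the CIF luminance resolution of 352 × 288 is supplied here as an example rather than stated in the glossary.

```python
# Illustrative sketch only: function names are hypothetical and frame
# dimensions are assumed to be even so that halving is exact.

def chroma_dimensions(width, height, sampling):
    """Return (width, height) of ONE chrominance component for a frame
    whose luminance component is width x height samples."""
    if sampling == "4:4:4":
        # Chrominance has the same resolution as luminance.
        return width, height
    if sampling == "4:2:2":
        # Chrominance has half the horizontal resolution of luminance.
        return width // 2, height
    if sampling == "4:2:0":
        # Chrominance has half the horizontal and vertical resolution.
        return width // 2, height // 2
    raise ValueError("unknown sampling format: " + sampling)


def samples_per_frame(width, height, sampling):
    """Total samples in one frame: one luminance component plus two
    chrominance components."""
    cw, ch = chroma_dimensions(width, height, sampling)
    return width * height + 2 * cw * ch


if __name__ == "__main__":
    # CIF luminance resolution (352 x 288) used as the example frame size.
    for fmt in ("4:2:0", "4:2:2", "4:4:4"):
        print(fmt, chroma_dimensions(352, 288, fmt),
              samples_per_frame(352, 288, fmt))
```

For a CIF frame, 4:2:0 sampling needs half as many samples as 4:4:4, which is one reason it is attractive for consumer video applications.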