CONTENTS

5 MPEG-4 Visual 99
  5.1 Introduction 99
  5.2 Overview of MPEG-4 Visual (Natural Video Coding) 100
    5.2.1 Features 100
    5.2.2 Tools, Objects, Profiles and Levels 100
    5.2.3 Video Objects 103
  5.3 Coding Rectangular Frames 104
    5.3.1 Input and Output Video Format 106
    5.3.2 The Simple Profile 106
    5.3.3 The Advanced Simple Profile 115
    5.3.4 The Advanced Real Time Simple Profile 121
  5.4 Coding Arbitrary-shaped Regions 122
    5.4.1 The Core Profile 124
    5.4.2 The Main Profile 133
    5.4.3 The Advanced Coding Efficiency Profile 138
    5.4.4 The N-bit Profile 141
  5.5 Scalable Video Coding 142
    5.5.1 Spatial Scalability 142
    5.5.2 Temporal Scalability 144
    5.5.3 Fine Granular Scalability 145
    5.5.4 The Simple Scalable Profile 148
    5.5.5 The Core Scalable Profile 148
    5.5.6 The Fine Granular Scalability Profile 149
  5.6 Texture Coding 149
    5.6.1 The Scalable Texture Profile 152
    5.6.2 The Advanced Scalable Texture Profile 152
  5.7 Coding Studio-quality Video 153
    5.7.1 The Simple Studio Profile 153
    5.7.2 The Core Studio Profile 155
  5.8 Coding Synthetic Visual Scenes 155
    5.8.1 Animated 2D and 3D Mesh Coding 155
    5.8.2 Face and Body Animation 156
  5.9 Conclusions 156
  5.10 References 156

6 H.264/MPEG-4 Part 10 159
  6.1 Introduction 159
    6.1.1 Terminology 159
  6.2 The H.264 CODEC 160
  6.3 H.264 structure 162
    6.3.1 Profiles and Levels 162
    6.3.2 Video Format 162
    6.3.3 Coded Data Format 163
    6.3.4 Reference Pictures 163
    6.3.5 Slices 164
    6.3.6 Macroblocks 164
  6.4 The Baseline Profile 165
    6.4.1 Overview 165
    6.4.2 Reference Picture Management 166
    6.4.3 Slices 167
    6.4.4 Macroblock Prediction 169
    6.4.5 Inter Prediction 170
    6.4.6 Intra Prediction 177
    6.4.7 Deblocking Filter 184
    6.4.8 Transform and Quantisation 187
    6.4.9 4 × 4 Luma DC Coefficient Transform and Quantisation (16 × 16 Intra-mode Only) 194
    6.4.10 2 × 2 Chroma DC Coefficient Transform and Quantisation 195
    6.4.11 The Complete Transform, Quantisation, Rescaling and Inverse Transform Process 196
    6.4.12 Reordering 198
    6.4.13 Entropy Coding 198
  6.5 The Main Profile 207
    6.5.1 B Slices 207
    6.5.2 Weighted Prediction 211
    6.5.3 Interlaced Video 212
    6.5.4 Context-based Adaptive Binary Arithmetic Coding (CABAC) 212
  6.6 The Extended Profile 216
    6.6.1 SP and SI slices 216
    6.6.2 Data Partitioned Slices 220
  6.7 Transport of H.264 220
  6.8 Conclusions 222
  6.9 References 222

7 Design and Performance 225
  7.1 Introduction 225
  7.2 Functional Design 225
    7.2.1 Segmentation 226
    7.2.2 Motion Estimation 226
    7.2.3 DCT/IDCT 234
    7.2.4 Wavelet Transform 238
    7.2.5 Quantise/Rescale 238
    7.2.6 Entropy Coding 238
  7.3 Input and Output 241
    7.3.1 Interfacing 241
    7.3.2 Pre-processing 242
    7.3.3 Post-processing 243
  7.4 Performance 246
    7.4.1 Criteria 246
    7.4.2 Subjective Performance 247
    7.4.3 Rate–distortion Performance 251
    7.4.4 Computational Performance 254
    7.4.5 Performance Optimisation 255
  7.5 Rate control 256
  7.6 Transport and Storage 262
    7.6.1 Transport Mechanisms 262
    7.6.2 File Formats 263
    7.6.3 Coding and Transport Issues 264
  7.7 Conclusions 265
  7.8 References 265

8 Applications and Directions 269
  8.1 Introduction 269
  8.2 Applications 269
  8.3 Platforms 270
  8.4 Choosing a CODEC 270
  8.5 Commercial issues 272
    8.5.1 Open Standards? 273
    8.5.2 Licensing MPEG-4 Visual and H.264 274
    8.5.3 Capturing the Market 274
  8.6 Future Directions 275
  8.7 Conclusions 276
  8.8 References 276

Bibliography 277
Index 279
About the Author

Iain Richardson is a lecturer and researcher at The Robert Gordon University, Aberdeen, Scotland. He was awarded the degrees of MEng (Heriot-Watt University) and PhD (The Robert Gordon University) in 1990 and 1999 respectively. He has been actively involved in research and development of video compression systems since 1993 and is the author of over 40 journal and conference papers and two previous books. He leads the Image Communication Technology Research Group at The Robert Gordon University and advises a number of companies on video compression technology issues.