VIDEO FORMATS AND QUALITY

Figure 2.1 Still image from natural video scene

Figure 2.2 Spatial and temporal sampling of a video sequence (labels: spatial samples, temporal samples)

2.3 CAPTURE

A natural visual scene is spatially and temporally continuous. Representing a visual scene in digital form involves sampling the real scene spatially (usually on a rectangular grid in the video image plane) and temporally (as a series of still frames or components of frames sampled at regular intervals in time) (Figure 2.2). Digital video is the representation of a sampled video scene in digital form. Each spatio-temporal sample (picture element or pixel) is represented as a number or set of numbers that describes the brightness (luminance) and colour of the sample.
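The sampling process described above can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from the text: the `scene` function stands in for a continuous natural scene, `sample_video` is a hypothetical helper, and quantising each sample to one of 256 levels is simply one plausible way of representing a sample "as a number".

```python
import math


def scene(x, y, t):
    """Hypothetical continuous scene (an assumption, not from the text).

    Returns a brightness in [0, 1] for a position (x, y) in the unit
    square and a time t in seconds: a diagonal pattern that drifts.
    """
    return 0.5 + 0.5 * math.sin(2 * math.pi * (x + y + 0.5 * t))


def sample_video(width, height, frame_rate, duration_s, levels=256):
    """Sample the scene on a width x height grid at regular time intervals."""
    num_frames = int(round(frame_rate * duration_s))
    frames = []
    for n in range(num_frames):
        t = n / frame_rate  # temporal sampling instant of frame n
        frame = []
        for row in range(height):
            y = row / height  # vertical position of this row of samples
            frame.append([
                # Spatial sample, quantised to an integer in 0..levels-1.
                min(levels - 1, int(scene(col / width, y, t) * levels))
                for col in range(width)
            ])
        frames.append(frame)
    return frames


# A deliberately tiny example: an 8 x 6 grid sampled 25 times per second.
frames = sample_video(width=8, height=6, frame_rate=25, duration_s=0.2)
print(len(frames), "frames of", len(frames[0]), "rows x",
      len(frames[0][0]), "columns")
```

Increasing `width` and `height` corresponds to a finer spatial sampling grid, and increasing `frame_rate` corresponds to a higher temporal sampling rate; both increase the number of samples that must be stored.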
Figure 2.3 Image with 2 sampling grids

To obtain a 2D sampled image, a camera focuses a 2D projection of the video scene onto a sensor, such as an array of Charge Coupled Devices (CCD array). In the case of colour image capture, each colour component is separately filtered and projected onto a CCD array (see Section 2.4).

2.3.1 Spatial Sampling

The output of a CCD array is an analogue video signal, a varying electrical signal that represents a video image. Sampling the signal at a point in time produces a sampled image or frame that has defined values at a set of sampling points. The most common format for a sampled image is a rectangle with the sampling points positioned on a square or rectangular grid. Figure 2.3 shows a continuous-tone frame with two different sampling grids superimposed upon it. Sampling occurs at each of the intersection points on the grid and the sampled image may be reconstructed by representing each sample as a square picture element (pixel). The visual quality of the image is influenced by the number of sampling points. Choosing a ‘coarse’ sampling grid (the black grid in Figure 2.3) produces a low-resolution sampled image (Figure 2.4) whilst increasing the number of sampling points slightly (the grey grid in Figure 2.3) increases the resolution of the sampled image (Figure 2.5).

2.3.2 Temporal Sampling

A moving video image is captured by taking a rectangular ‘snapshot’ of the signal at periodic time intervals. Playing back the series of frames produces the appearance of motion. A higher temporal sampling rate (frame rate) gives apparently smoother motion in the video scene but requires more samples to be captured and stored. Frame rates below 10 frames per second are sometimes used for very low bit-rate video communications (because the amount of data
Figure 2.4 Image sampled at coarse resolution (black sampling grid)

Figure 2.5 Image sampled at slightly finer resolution (grey sampling grid)
is relatively small) but motion is clearly jerky and unnatural at this rate. Between 10 and 20 frames per second is more typical for low bit-rate video communications; the image is smoother but jerky motion may be visible in fast-moving parts of the sequence. Sampling at 25 or 30 complete frames per second is standard for television pictures (with interlacing to improve the appearance of motion, see below); 50 or 60 frames per second produces smooth apparent motion (at the expense of a very high data rate).

2.3.3 Frames and Fields

A video signal may be sampled as a series of complete frames (progressive sampling) or as a sequence of interlaced fields (interlaced sampling). In an interlaced video sequence, half of the data in a frame (one field) is sampled at each temporal sampling interval. A field consists of either the odd-numbered or even-numbered lines within a complete video frame and an interlaced video sequence (Figure 2.6) contains a series of fields, each representing half of the information in a complete video frame (e.g. Figure 2.7 and Figure 2.8). The advantage of this sampling method is that it is possible to send twice as many fields per second as the number of frames in an equivalent progressive sequence with the same data rate, giving the appearance of smoother motion. For example, a PAL video sequence consists of 50 fields per second and, when played back, motion can appear smoother than in an equivalent progressive video sequence containing 25 frames per second.

Figure 2.6 Interlaced video sequence (alternating top field, bottom field, top field, bottom field)

2.4 COLOUR SPACES

Most digital video applications rely on the display of colour video and so need a mechanism to capture and represent colour information. A monochrome image (e.g. Figure 2.1) requires just one number to indicate the brightness or luminance of each spatial sample.
Colour images, on the other hand, require at least three numbers per pixel position to represent colour accurately. The method chosen to represent brightness (luminance or luma) and colour is described as a colour space.
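The arithmetic behind Sections 2.3.2 to 2.4 can be checked with a short Python sketch. It is a sketch under stated assumptions rather than a definitive model: `split_fields` and `samples_per_second` are hypothetical helpers, the 720 x 576 picture size is chosen purely for illustration, and treating the even-indexed rows (counting from zero) as the top field is a convention assumed here, not one stated in the text.

```python
def split_fields(frame):
    """Split a frame (a list of rows) into two interlaced fields.

    Assumption: the top field holds rows 0, 2, 4, ... and the bottom
    field holds rows 1, 3, 5, ... (rows counted from zero).
    """
    return frame[0::2], frame[1::2]


def samples_per_second(width, height, pictures_per_second,
                       numbers_per_pixel=1, interlaced=False):
    """Raw (uncompressed) count of numbers generated per second.

    For interlaced sampling each picture is a field containing half
    of the lines of a complete frame.
    """
    lines_per_picture = height // 2 if interlaced else height
    return width * lines_per_picture * pictures_per_second * numbers_per_pixel


# A tiny 4-line frame: each row is labelled with its line index.
top, bottom = split_fields([[0] * 3, [1] * 3, [2] * 3, [3] * 3])
print("top field rows:", top)
print("bottom field rows:", bottom)

# Illustrative picture size (an assumption, not taken from the text).
WIDTH, HEIGHT = 720, 576

progressive_25 = samples_per_second(WIDTH, HEIGHT, 25)
interlaced_50 = samples_per_second(WIDTH, HEIGHT, 50, interlaced=True)
progressive_50 = samples_per_second(WIDTH, HEIGHT, 50)
colour_25 = samples_per_second(WIDTH, HEIGHT, 25, numbers_per_pixel=3)

print("25 frames/s, monochrome:", progressive_25)
print("50 fields/s, monochrome:", interlaced_50)
print("50 frames/s, monochrome:", progressive_50)
print("25 frames/s, three numbers per pixel:", colour_25)
```

Under these assumptions, 50 fields per second produces the same raw sample count as 25 complete frames per second, which is the point made in Section 2.3.3, whereas 50 complete frames per second doubles it. Using three numbers per pixel instead of one triples the count for a monochrome sequence of the same size and frame rate.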
Figure 2.7 Top field

Figure 2.8 Bottom field

2.4.1 RGB

In the RGB colour space, a colour image sample is represented with three numbers that indicate the relative proportions of Red, Green and Blue (the three additive primary colours of light). Any colour can be created by combining red, green and blue in varying proportions. Figure 2.9 shows the red, green and blue components of a colour image: the red component consists of all the red samples, the green component contains all the green samples and the blue component contains the blue samples. The person on the right is wearing a blue sweater and so this appears ‘brighter’ in the blue component, whereas the red waistcoat of the figure on the left