Multimedia IGDS MSc Exam 2000 SOLUTIONS
Setter: ADM   Checker: ACJ
Answer 3 Questions out of 4. Time Allowed: 2 Hours.

1. (a) Why is data compression, including file compression, highly desirable for Multimedia activities?

Multimedia files are very large, so for storage, file transfer etc. file sizes need to be reduced. Text and other files may also be encoded/compressed for email and other applications.

2 MARKS --- BOOKWORK

(b) Briefly explain, clearly identifying the differences between them, how entropy coding and transform coding techniques work for data compression. Illustrate your answer with a simple example of each type.

Compression can be categorised in two broad ways:

Lossless Compression -- where data is compressed and can be reconstituted (uncompressed) without loss of detail or information. These are also referred to as bit-preserving or reversible compression systems.

Lossy Compression -- where the aim is to obtain the best possible fidelity for a given bit-rate, or to minimise the bit-rate to achieve a given fidelity measure. Video and audio compression techniques are most suited to this form of compression.

Lossless compression frequently involves some form of entropy encoding and is based on information-theoretic techniques. Lossy compression uses source encoding techniques that may involve transform encoding, differential encoding or vector quantisation.

ENTROPY METHODS:
The entropy of an information source S is defined as:

H(S) = SUM_i ( P_i * log2(1/P_i) )

where P_i is the probability that symbol S_i in S will occur. log2(1/P_i) indicates the amount of information contained in S_i, i.e., the number of bits needed to code S_i.

Encoding for the Shannon-Fano Algorithm: a top-down approach

1. Sort symbols according to their frequencies/probabilities, e.g., ABCDE.
2. Recursively divide into two parts, each with approximately the same number of counts.

(The Huffman algorithm, indicated below, is also a valid answer.)

A simple transform coding example

A simple transform encoding procedure may be described by the following steps for a 2x2 block of monochrome pixels:

1. Take the top left pixel as the base value for the block, pixel A.
2. Calculate three other transformed values by taking the difference between the respective pixels and pixel A, i.e. B-A, C-A, D-A.
3. Store the base pixel and the differences as the values of the transform.

Given the above we can easily form the forward transform:

X0 = A, X1 = B - A, X2 = C - A, X3 = D - A

and the inverse transform is trivial:

A = X0, B = X1 + X0, C = X2 + X0, D = X3 + X0

The above transform scheme may be used to compress data by exploiting redundancy in the data: any redundancy in the data has been transformed into the difference values X1..X3, so we can compress the data by using fewer bits to represent the differences. I.e. if we use 8 bits per pixel then the 2x2 block uses 32 bits. If we keep 8 bits for the base pixel, X0, and assign 4 bits for each difference, then we only use 20 bits, an average of 5 bits/pixel, which is better than the original 8 bits/pixel.

7 MARKS --- BOOKWORK
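As a supplementary illustration of the transform example above, here is a minimal Python sketch (the pixel values are hypothetical, and the 4-bit difference budget assumes each difference fits the signed range -8..7):

def forward_transform(A, B, C, D):
    # X0 = base pixel; X1..X3 = differences from the base
    return A, B - A, C - A, D - A

def inverse_transform(X0, X1, X2, X3):
    # Trivial inverse: add the base pixel back on.
    return X0, X0 + X1, X0 + X2, X0 + X3

block = (120, 125, 119, 122)        # hypothetical 2x2 block: pixels A, B, C, D
coeffs = forward_transform(*block)  # (120, 5, -1, 2)
assert inverse_transform(*coeffs) == block
# Bit cost: 8 (base) + 3 * 4 (differences) = 20 bits, versus 4 * 8 = 32 bits raw.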
(c) (i) Show how you would use Huffman coding to encode the following set of tokens:

BABACACADADABBCBABEBEDDABEEEBB

How is this message transmitted when encoded?

The Huffman algorithm is now briefly summarised:

1. Initialization: Put all nodes in an OPEN list, keep it sorted at all times (e.g., ABCDE).
2. Repeat until the OPEN list has only one node left:
   (a) From OPEN pick two nodes having the lowest frequencies/probabilities, create a parent node of them.
   (b) Assign the sum of the children's frequencies/probabilities to the parent node and insert it into OPEN.
   (c) Assign code 0, 1 to the two branches of the tree, and delete the children from OPEN.

Symbol   Count   OPEN(1)   OPEN(2)   OPEN(3)
A        8                           18
B        10                          -
C        3       7         12
D        4       -
E        5                 -
Total    30

(A count in an OPEN column is the parent node created at that merge step; "-" marks the other node merged into it. The final merge, 12 + 18 = 30, gives the root.)
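The construction above can be checked with a minimal Python sketch, using heapq as the sorted OPEN list. (Tie-breaking may differ from the hand-worked table, so individual codes can come out with 0s and 1s swapped, but the code lengths, and hence the total bit count, are the same.)

import heapq
from collections import Counter

def huffman_codes(message):
    counts = Counter(message)   # A:8, B:10, C:3, D:4, E:5 for the message above
    # OPEN entries are (count, tie_breaker, tree); a tree is a symbol or a (left, right) pair.
    open_list = [(n, i, sym) for i, (sym, n) in enumerate(counts.items())]
    heapq.heapify(open_list)
    next_id = len(open_list)
    while len(open_list) > 1:
        # Pick the two nodes with the lowest counts and create a parent node of them.
        n1, _, t1 = heapq.heappop(open_list)
        n2, _, t2 = heapq.heappop(open_list)
        heapq.heappush(open_list, (n1 + n2, next_id, (t1, t2)))
        next_id += 1
    codes = {}
    def walk(tree, code):
        if isinstance(tree, tuple):   # internal node: 0 = left branch, 1 = right branch
            walk(tree[0], code + "0")
            walk(tree[1], code + "1")
        else:
            codes[tree] = code
    walk(open_list[0][2], "")
    return counts, codes

counts, codes = huffman_codes("BABACACADADABBCBABEBEDDABEEEBB")
print(codes)                                           # code book (up to 0/1 tie-breaks)
print(sum(counts[s] * len(codes[s]) for s in counts))  # total bits for the coded message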
Finished Huffman Tree:

P4(30)
 |-- 0 --> P2(12)
 |          |-- 0 --> E(5)
 |          |-- 1 --> P1(7)
 |                     |-- 0 --> C(3)
 |                     |-- 1 --> D(4)
 |-- 1 --> P3(18)
            |-- 0 --> A(8)
            |-- 1 --> B(10)

Symbol   Code
A        10
B        11
C        010
D        011
E        00

How is this message transmitted when encoded? Send the code book and then the bit code for each symbol.

7 MARKS --- UNSEEN

(ii) How many bits are needed to transfer this coded message and what is its entropy?

Symbol   Count   Code length   Subtotal (bits)
A        8       2             16
B        10      2             20
C        3       3             9
D        4       3             12
E        5       2             10
Total number of bits (excluding code book) = 16 + 20 + 9 + 12 + 10 = 67

Average number of bits per symbol = 67/30 ≈ 2.23. This is the average code length; the entropy it approaches is SUM_i P_i log2(1/P_i) = (8/30)log2(30/8) + (10/30)log2(3) + (3/30)log2(10) + (4/30)log2(30/4) + (5/30)log2(6) ≈ 2.19 bits/symbol.

8 MARKS --- UNSEEN

(iii) What amendments are required to this coding technique if data is generated live or is otherwise not wholly available? Show how you could use this modified scheme by appending the tokens ADADA to the end of the above message.

An adaptive method is needed. Basic idea (encoding):

initialize_model();
while ((c = getc(input)) != EOF) {
    encode(c, output);
    update_model(c);
}

So encode the appended tokens with the adapted model: A = 01, D = 0000.

So the added stream is: 01 0000 01 0000 01, i.e. 01000001000001
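A minimal runnable sketch of this adaptive idea, in which the model is simply rebuilt from the updated counts after each symbol. (Real adaptive Huffman coders such as FGK or Vitter's algorithm update the tree incrementally instead of rebuilding it, and the exact bit patterns depend on the variant and its tie-breaking, so this sketch need not reproduce the A = 01, D = 0000 codes quoted above. The decoder stays in sync by applying the same update_model step after each decoded symbol.)

import heapq

def codes_from_counts(counts):
    # Rebuild a Huffman code table from the current symbol counts
    # (the same heap-based construction as in the sketch for part (i)).
    heap = [(n, i, sym) for i, (sym, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        n1, _, t1 = heapq.heappop(heap)
        n2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (n1 + n2, next_id, (t1, t2)))
        next_id += 1
    codes = {}
    def walk(tree, code):
        if isinstance(tree, tuple):
            walk(tree[0], code + "0")
            walk(tree[1], code + "1")
        else:
            codes[tree] = code or "0"   # degenerate single-symbol model
    walk(heap[0][2], "")
    return codes

def encode_adaptive(message, counts):
    # The while / encode / update_model loop from the pseudocode above.
    out = []
    for c in message:
        out.append(codes_from_counts(counts)[c])   # encode(c, output)
        counts[c] = counts.get(c, 0) + 1           # update_model(c)
    return "".join(out)

# Model already trained on the 30-token message, then ADADA is appended:
counts = {"A": 8, "B": 10, "C": 3, "D": 4, "E": 5}
print(encode_adaptive("ADADA", counts))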