62 Point-to-Point Protocols and Links Chap.2 The CRC polynomial c(D)is a function of the information polynomial s(D),de- fined in terms of a generator polynomial g(D);this is a polynomial of degree L with binary coefficients that specifies the particular CRC code to be used. g(D)=Dk+9L-1D-1+…+91D+1 (2.17) For a given g(D),the mapping from the information polynomial to the CRC polynomial c(D)is given by s(D)DL c(D)=Remainder (2.18) g(D) The polynomial division above is just ordinary long division of one polynomial by another,except that the coefficients are restricted to be binary and the arithmetic on coefficients is performed modulo 2.Thus,for example,(1+1)modulo 2 =0 and (0-1) modulo 2 =1.Note that subtraction using modulo 2 arithmetic is the same as addition. As an example of the operation in Eq.(2.18), D2+D D3+D2+1)D5+ D D3+D+ D2 D+D3+D2 D4+D3+D D2+D=Remainder Since g(D)is a polynomial of degree at most L.the remainder is of degree at most L-1.If the degree of c(D)is less than L-1,the corresponding leading coefficients CL-1,...,in Eq.(2.18)are taken as zero. This long division can be implemented easily in hardware by the feedback shift register circuit shown in Fig.2.16.By comparing the circuit with the long division above,it can be seen that the successive contents of the shift register cells are just the coefficients of the partial remainders in the long division.In practice,CRCs are usually calculated by VLSI chips,which often perform the other functions of DLC as well. Figure 2.16 Shift register circuit for dividing polynomials and finding the remainder. Each rectangle indicates a storage cell for a single bit and the preceding circles denote modulo 2 adders.The large circles at the top indicate multiplication by the value of gi Initially,the register is loaded with the first L bits of s(D)with s at the right.On each clock pulse,a new bit of s(D)comes in at the left and the register reads in the corresponding modulo 2 sum of feedback plus the contents of the previous stage.After K shifts,the switch at the right moves to the horizontal position and the CRC is read out
62 Point-to-Point Protocols and Links Chap. 2 The CRC polynomial c(D) is a function of the infonnation polynomial seD), defined in tenns of a generator polynomial g(D); this is a polynomial of degree L with binary coefficients that specifies the particular CRC code to be used. g(D) = D L + gL_1DL - 1+ ... + glD + I (2.17) For a given g(D), the mapping from the infonnation polynomial to the CRC polynomial c(D) is given by c(D) = Remainder [S(D)D L ] (2.18) g(D) The polynomial division above is just ordinary long division of one polynomial by another, except that the coefficients are restricted to be binary and the arithmetic on coefficients is perfonned modulo 2. Thus, for example, (l + I) modulo 2 = 0 and (0 - I) modulo 2 = 1. Note that subtraction using modulo 2 arithmetic is the same as addition. As an example of the operation in Eq. (2.18), D 3+ D 2 + I ) D S + D 3 D S + D 4 + D 2 D 4 +D3+D2 D 4 +D3 + D D 2 + D= Remainder Since g(D) is a polynomial of degree at most L, the remainder is of degree at most L - 1. If the degree of c(D) is less than L - I, the corresponding leading coefficients CL-I, ..., in Eg. (2.18) are taken as zero. This long division can be implemented easily in hardware by the feedback shift register circuit shown in Fig. 2.16. By comparing the circuit with the long division above, it can be seen that the successive contents of the shift register cells are just the coefficients of the partial remainders in the long division. In practice, CRCs are usually calculated by VLSI chips, which often perfonn the other functions of DLC as well. Figure 2.16 Shift register circuit for dividing polynomials and finding the remainder. Each rectangle indicates a storage cell for a single bit and the preceding circles denote modulo 2 adders. The large circles at the top indicate multiplication by the value of gi. Initially, the register is loaded with the first L bits of s(D) with s K _I at the right. On each clock pulse, a new bit of s(D) comes in at the left and the register reads in the corresponding modulo 2 sum of feedback plus the contents of the previous stage. After K shifts, the switch at the right moves to the horizontal position and the CRC is read out
Sec.2.3 Error Detection 63 Let z(D)be the quotient resulting from dividing s(D)D by g(D).Then,c(D) can be represented as s(D)D=g(D)2(D)+c(D) (2.19) Subtracting c(D)(modulo 2)from both sides of this equation and recognizing that modulo 2 subtraction and addition are the same,we obtain x(D)=s(D)D+c(D)=g(D)z(D) (2.20) Thus,all code words are divisible by g(D),and all polynomials divisible by g(D)are code words.It has not yet been shown that the mapping from s(D)to (D)corresponds to a parity check code.This is demonstrated in Problem 2.15 but is not necessary for the subsequent development. Now suppose that c(D)is transmitted and that the received sequence is repre- sented by a polynomial y(D),where (D)and y(D)differ because of the errors on the communication link.If the error sequence is represented as a polynomial e(D),then y(D)=(D)+e(D),where,as throughout this section,means modulo 2 addition; each error in the frame corresponds to a nonzero coefficient in e(D)[i.e.,a coefficient in which y(D)and x(D)differ].At the receiver,Remainder[y(D)/g(D)]can be calculated by essentially the same circuit as that above.Since it has been shown that x(D)is divisible by g(D), Remainder y(D) (D) D -Remainder 19(D) (2.21) If no errors occur,then e(D)=0 and the remainder above will be 0.The rule followed by the receiver is to decide that the frame is error-free if this remainder is 0 and to decide that there are errors otherwise.When errors actually occur [i.e.,e(D)0], the receiver fails to detect the errors only if this remainder is 0;this occurs only if e(D) is itself some code word.In other words,e(D)0 is undetectable if and only if e(D)=g(D)z(D) (2.22) for some nonzero polynomial z(D).We now explore the conditions under which unde- tected errors can occur. First,suppose that a single error occurs,say e=1,so that e(D)=Di.Since g(D)has at least two nonzero terms(i.e.,D and 1),g(D)a(D)must also have at least two nonzero terms for any nonzero z(D)(see Problem 2.13).Thus g(D)a(D)cannot equal D';since this is true for all i,all single errors are detectable.By the same type of argument,since the highest-order and lowest-order terms in g(D)(i.e.,DL and 1, respectively)differ by L,the highest-order and lowest-order terms in g(D)z(D)differ by at least L for all nonzero 2(D).Thus,if e(D)is a code word,the burst length of the errors is at least L+1 (the +1 arises from the definition of burst length as the number of positions from the first error to the last error inclusive). Next,suppose that a double error occurs,say in positions iand j,so that e(D)=D+D)=D(D-3+1),i>j (2.23)
Sec. 2.3 Error Detection 63 Let z(D) be the quotient resulting from dividing s(D)DL by g(D). Then, c(D) can be represented as s(D)DL = g(D)z(D) + c(D) (2.19) Subtracting c(D) (modulo 2) from both sides of this equation and recognizing that modulo 2 subtraction and addition are the same, we obtain x(D) = s(D)DL + c(D) = g(D)z(D) (2.20) (2.21) Thus, all code words are divisible by g(D), and all polynomials divisible by g(D) are code words. It has not yet been shown that the mapping from s(D) to x(D) corresponds to a parity check code. This is demonstrated in Problem 2.15 but is not necessary for the subsequent development. Now suppose that x(D) is transmitted and that the received sequence is represented by a polynomial y(D), where x(D) and y(D) differ because of the errors on the communication link. If the error sequence is represented as a polynomial e(D), then y(D) = x(D) + e(D), where, as throughout this section, + means modulo 2 addition; each error in the frame corresponds to a nonzero coefficient in e(D) [i.e., a coefficient in which y(D) and x(D) differ]. At the receiver, Remainder[y(D)jg(D)] can be calculated by essentially the same circuit as that above. Since it has been shown that x(D) is divisible by g(D), Remainder [~~~n = Remainder [;~~~] If no errors occur, then e(D) = 0 and the remainder above will be O. The rule followed by the receiver is to decide that the frame is error- free if this remainder is 0 and to decide that there are errors otherwise. When errors actually occur [i.e., e(D) i:- 0], the receiver fails to detect the errors only if this remainder is 0; this occurs only if e(D) is itself some code word. In other words, e(D) i:- 0 is undetectable if and only if e(D) = g(D)z(D) (2.22) for some nonzero polynomial z(D). We now explore the conditions under which undetected errors can occur. First, suppose that a single error occurs, say ei = 1, so that e(D) = D i . Since g(D) has at least two nonzero terms (i.e., D L and 1), g(D)z(D) must also have at least two nonzero terms for any nonzero z(D) (see Problem 2.13). Thus g(D)z(D) cannot equal D i ; since this is true for all i, all single errors are detectable. By the same type of argument, since the highest-order and lowest-order terms in g(D) (i.e., D L and 1, respectively) differ by L, the highest-order and lowest-order terms in g(D)z(D) differ by at least L for all nonzero z(D). Thus, if e(D) is a code word, the burst length of the errors is at least L + 1 (the +1 arises from the definition of burst length as the number of positions from the first error to the last error inclusive). Next, suppose that a double error occurs, say in positions i and j, so that (2.23)
64 Point-to-Point Protocols and Links Chap.2 From the argument above,D is not divisible by g(D)or by any factor of g(D);thus, e(D)fails to be detected only if D+1 is divisible by g(D).For any binary polynomial g(D)of degree L,there is some smallest n for which D+I is divisible by g(D).It is known from the theory of finite fields that this smallest n can be no larger than 2-1;moreover,for all L>0,there are special L-degree polynomials,called primitive polynomials,such that this smallest n is equal to 2-1.Thus,if g(D)is chosen to be such a primitive polynomial of degree L,and if the frame length is restricted to be at most 24-1,then D+1 cannot be divisible by g(D);thus,all double errors are detected. In practice,the generator polynomial g(D)is usually chosen to be the product of a primitive polynomial of degree L-I times the polynomial D+1.It is shown in Problem 2.14 that a polynomial e(D)is divisible by D+1 if and only if e(D)contains an even number of nonzero coefficients.This ensures that all odd numbers of errors are detected, and the primitive polynomial ensures that all double errors are detected (as long as the block length is less than 24).Thus,any code of this form has a minimum distance of at least 4,a burst-detecting capability of at least L,and a probability of failing to detect errors in completely random strings of 2-.There are two standard CRCs with length L=16(denoted CRC-16 and CRC-CCITT).Each of these CRCs is the product of D+1 times a primitive(L-1)-degree polynomial,and thus both have the foregoing properties. There is also a standard CRC with L =32.It is a 32-degree primitive polynomial,and has been shown to have a minimum distance of 5 for block lengths less than 3007 and 4 for block lengths less than 12,145 [FKL86].These polynomials are as follows: g(D)=D16+D15 +D2+1 for CRC-16 9(D D16+D2+D5 +1 for CRC-CCITT g(D)=D32+D26+D23+D22+D16+D2+D1+ DI0+D8+D?+D5+D4+D2+D'+1 2.4 ARQ:RETRANSMISSION STRATEGIES The general concept of automatic repeat request(ARQ)is to detect frames with errors at the receiving DLC module and then to request the transmitting DLC module to repeat the information in those erroneous frames.Error detection was discussed in the preceding section,and the problem of requesting retransmissions is treated in this section.There are two quite different aspects of retransmission algorithms or protocols.The first is that of correctness:Does the protocol succeed in releasing each packet,once and only once, without errors,from the receiving DLC?The second is that of efficiency:How much of the bit-transmitting capability of the bit pipe is wasted by unnecessary waiting and by sending unnecessary retransmissions?First,several classes of protocols are developed and shown to be correct (in a sense to be defined more precisely later).Later,the effect that the various parameters in these classes have on efficiency is considered
64 Point-to-Point Protocols and Links Chap. 2 r From the argument above, Dj is not divisible by g(D) or by any factor of g(D); thus, e(D) fails to be detected only if Di-j +I is divisible by g(D). For any binary polynomial g(D) of degree L, there is some smallest n for which D n + I is divisible by g(D). It is known from the theory of finite fields that this smallest n can be no larger than 2L - I; moreover, for all L > 0, there are special L-degree polynomials, called primitive polynomials, such that this smallest n is equal to 2L - 1. Thus, if g(D) is chosen to be such a primitive polynomial of degree L, and if the frame length is restricted to be at most 2L - I, then Di-j + I cannot be divisible by g(D); thus, all double errors are detected. In practice, the generator polynomial g(D) is usually chosen to be the product of a primitive polynomial of degree L - I times the polynomial D +I. It is shown in Problem 2.14 that a polynomial e(D) is divisible by D + I if and only if e(D) contains an even number of nonzero coefficients. This ensures that all odd numbers of errors are detected, and the primitive polynomial ensures that all double errors are detected (as long as the block length is less than 2L -I). Thus, any code of this form has a minimum distance of at least 4, a burst-detecting capability of at least L, and a probability of failing to detect errors in completely random strings of 2~L. There are two standard CRCs with length L = 16 (denoted CRC-16 and CRC-CCITT). Each of these CRCs is the product of D + I times a primitive (L-l)- degree polynomial, and thus both have the foregoing properties. There is also a standard CRC with L = 32. It is a 32-degree primitive polynomial, and has been shown to have a minimum distance of 5 for block lengths less than 3007 and 4 for block lengths less than 12,145 [FKL86]. These polynomials are as follows: g(D) = D I6 + DIS + D 2 + I g(D = D 16 + D I2 + D S + I for CRC-16 for CRC-CCITT g(D) = D 32 + D 26 + D 23 + D 22 + D 16 + D I2 + D ll+ D 10 + D 8 + D 7 + D S + D 4 + D 2 + D 1 + I 2.4 ARQ: RETRANSMISSION STRATEGIES The general concept of automatic repeat request (ARQ) is to detect frames with errors at the receiving DLC module and then to request the transmitting DLC module to repeat the information in those erroneous frames. Error detection was discussed in the preceding section, and the problem of requesting retransmissions is treated in this section. There are two quite different aspects of retransmission algorithms or protocols. The first is that of correctness: Does the protocol succeed in releasing each packet, once and only once, without errors, from the receiving DLC? The second is that of efficiency: How much of the bit-transmitting capability of the bit pipe is wasted by unnecessary waiting and by sending unnecessary retransmissions? First, several classes of protocols are developed and shown to be correct (in a sense to be defined more precisely later). Later, the effect that the various parameters in these classes have on efficiency is considered
Sec.2.4 ARQ:Retransmission Strategies 65 Recall from Fig.2.2 that packets enter the DLC layer from the network layer.The DLC module appends a header and trailer to each packet to form a frame,and the frames are transmitted on the virtual bit pipe (i.e.,are sent to the physical layer for transmis- sion).When errors are detected in a frame,a new frame containing the old packet is transmitted.Thus,the first transmitted frame might contain the first packet,the next frame the second packet,the third frame a repetition of the first packet,and so forth. When a packet is repeated,the frame header and trailer might or might not be the same as in the earlier version. Since framing will not be discussed until the next section,we continue to assume that the receiving DLC knows when frames start and end;thus a CRC (or any other technique)may be used for detecting errors.We also assume,somewhat unrealistically, that all frames containing transmission errors are detected.The reason for this is that we want to prove that ARQ works correctly except when errors are undetected.This is the best that can be hoped for,since error detection cannot work with perfect reliability and bounded delay;in particular,any code word can be changed into another code word by some string of transmission errors.This can cause erroneous data to leave the DLC or,perhaps worse,can cause some control bits to be changed.In what follows,we refer to frames without transmission errors as error-free frames and those with transmission errors as error frames.We are assuming,then,that the receiver can always disinguish error frames from error-free frames. Finally,we need some assumptions about the bit pipes over which these frames are traveling.The reason for these assumptions will be clearer when framing is studied; in effect,these assumptions will allow us to relax the assumption that the receiving DLC has framing information.Assume first that each transmitted frame is delayed by an arbitrary and variable time before arriving at the receiver,and assume that some frames might be "lost"and never arrive.Those frames that arrive,however,are assumed to arrive in the same order as transmitted,with or without errors.Figure 2.17 illustrates this behavior. Frame departure times at A 2 3 4 5 Node A Node B Frame arrival times at 8 Figure 2.17 Model of frame transmissions:frame 2 is lost and never arrives;frame 4 contains errors;the frames have variable transmission delay.but those that arrive do so in the order transmitted.The rectangles at the top of the figure indicate the duration over which the frame is being transmitted.Each arrow indicates the propagation of a frame from A to B,and the arrowhead indicates the time at which the frame is completely received:at this time the CRC can be recomputed and the frame accepted or not
Sec. 2.4 ARQ: Retransmission Strategies 65 Recall from Fig. 2.2 that packets enter the DLC layer from the network layer. The OLC module appends a header and trailer to each packet to form a frame, and the frames are transmitted on the virtual bit pipe (i.e., are sent to the physical layer for transmission). When errors are detected in a frame, a new frame containing the old packet is transmitted. Thus, the first transmitted frame might contain the first packet, the next frame the second packet, the third frame a repetition of the first packet, and so forth. When a packet is repeated, the frame header and trailer might or might not be the same as in the earlier version. Since framing will not be discussed until the next section, we continue to assume that the receiving OLC knows when frames start and end; thus a CRC (or any other technique) may be used for detecting errors. We also assume, somewhat unrealistically, that all frames containing transmission errors are detected. The reason for this is that we want to prove that ARQ works correctly except when errors are undetected. This is the best that can be hoped for, since error detection cannot work with perfect reliability and bounded delay; in particular, any code word can be changed into another code word by some string of transmission errors. This can cause erroneous data to leave the DLC or, perhaps worse, can cause some control bits to be changed. In what follows, we refer to frames without transmission errors as error-free frames and those with transmission errors as error frames. We are assuming, then, that the receiver can always disinguish error frames from error-free frames. Finally, we need some assumptions about the bit pipes over which these frames are traveling. The reason for these assumptions will be clearer when framing is studied; in effect, these assumptions will allow us to relax the assumption that the receiving OLC has framing information. Assume first that each transmitted frame is delayed by an arbitrary and variable time before arriving at the receiver, and assume that some frames might be "lost" and never arrive. Those frames that arrive, however, are assumed to arrive in the same order as transmitted, with or without errors. Figure 2.17 illustrates this behavior. Frame departure times at A NodeA NodeB Frame arrival times at B t--· Figure 2.17 Model of frame transmissions: frame 2 is lost and never arrives; frame 4 contains errors; the frames have variable transmission delay, but those that arrive do so in the order transmitted. The rectangles at the top of the figure indicate the duration over which the frame is being transmitted. Each arrow indicates the propagation of a frame from A to B, and the arrowhead indicates the time at which the frame is completely received; at this time the CRC can be recomputed and the frame accepted or not
66 Point-to-Point Protocols and Links Chap.2 2.4.1 Stop-and-Wait ARQ The simplest type of retransmission protocol is called stop-and-wait.The basic idea is to ensure that each packet has been received correctly before initiating transmission of the next packet.Thus,in transmitting packets from point A to B,the first packet is transmitted in the first frame,and then the sending DLC waits.If the frame is error-free, B sends an acknowledgment (called an ack)back to A;if the frame is an error frame, B sends a negative acknowledgment (called a nak)back to A.Since errors can occur from B to A as well as from A to B,the ack or nak is protected with a CRC. If an error-free frame is received at B,and the corresponding ack frame to A is error-free,then A can start to send the next packet in a new frame.Alternatively, detected errors can occur either in the transmission of the frame or the return ack or nak, and in either case A resends the old packet in a new frame.Finally,if either the frame or the return ack or nak is lost,A must eventually time-out and resend the old packet. Figure 2.18 illustrates a potential malfunction in such a strategy.Since delays are arbitrary,it is possible for node A to time-out and resend a packet when the first transmission and/or the corresponding ack is abnormally delayed.If B receives both transmissions of the given packet correctly,B has no way of knowing whether the second transmission is a new packet or a repetition of the old packet.One might think that B could simply compare the packets to resolve this issue,but as far as the DLC layer is concerned,packets are arbitrary bit strings and the first and second packets could be identical;it would be a violation of the principle of layering for the DLC layer to rely on higher layers to ensure that successive packets are different. The simplest solution to this problem is for the sending DLC module (at A)to use a sequence number in the frame header to identify successive packets.Unfortunately, even the use of sequence numbers from A to B is not quite enough to ensure correct operation.The problem is that acks can get lost on the return channel,and thus when B gets the same packet correctly twice in a row,it has to send a new ack for the second reception (see Fig.2.19).After transmitting the packet twice but receiving only one ack, 0 0 Time at A Node A Ack Node B Time at B- Packet 0 or 1? Packet 0 Figure 2.18 The trouble with unnumbered packets.If the transmitter at A times-out and sends packet 0 twice,the receiver at B cannot tell whether the second frame is a retransmission of packet 0 or the first transmission of packet I
66 2.4.1 Stop-and-Wait ARQ Point-to-Point Protocols and Links Chap. 2 The simplest type of retransmission protocol is called stop-and-wait. The basic idea is to ensure that each packet has been received correctly before initiating transmission of the next packet. Thus, in transmitting packets from point A to B, the first packet is transmitted in the first frame, and then the sending DLC waits. If the frame is error-free, B sends an acknowledgment (called an ack) back to A; if the frame is an error frame, B sends a negative acknowledgment (called a nak) back to A. Since errors can occur from B to A as well as from A to B, the ack or nak is protected with aCRe. If an error-free frame is received at B, and the corresponding ack frame to A is error-free, then A can start to send the next packet in a new frame. Alternatively, detected errors can occur either in the transmission of the frame or the return ack or nak, and in either case A resends the old packet in a new frame. Finally, if either the frame or the return ack or nak is lost, A must eventually time-out and resend the old packet. Figure 2.18 illustrates a potential malfunction in such a strategy. Since delays are arbitrary, it is possible for node A to time-out and resend a packet when the first transmission and/or the corresponding ack is abnormally delayed. If B receives both transmissions of the given packet correctly, B has no way of knowing whether the second transmission is a new packet or a repetition of the old packet. One might think that B could simply compare the packets to resolve this issue, but as far as the DLC layer is concerned, packets are arbitrary bit strings and the first and second packets could be identical; it would be a violation of the principle of layering for the DLC layer to rely on higher layers to ensure that successive packets are different. The simplest solution to this problem is for the sending DLC module (at A) to use a sequence number in the frame header to identify successive packets. Unfortunately, even the use of sequence numbers from A to B is not quite enough to ensure correct operation. The problem is that acks can get lost on the return channel, and thus when B gets the same packet correctly twice in a row, it has to send a new ack for the second reception (see Fig. 2.19). After transmitting the packet twice but receiving only one ack, Node A Node B TimeatA-- TimeatB-- Packet 0 Packet 0 or 1? Figure 2.18 The trouble with unnumbered packets. If the transmitter at A times-out and sends packet 0 twice. the receiver at B cannot tell whether the second frame is a retransmission of packet 0 or the first transmission of packet I