ao=inputo,input,...,input; a=inputs,input9,...inputs; a15={inp120,inpl121,…,inpl127}. The pattern can be extended to longer sequences (i.e.,for 192-and 256-bit keys),so that,in general, an=inputsn,inputsn+1,...,inputsn+7. (3.2) Taking Sections 3.2 and 3.3 together,Fig.2 shows how bits within each byte are numbered. Input bit sequence 01234678902B45678920222 23 Byte number Bit numbers in byte 76543210765432107654 2 Figure 2.Indices for Bytes and Bits 3.4 The State Internally,the AES algorithm's operations are performed on a two-dimensional array of bytes called the State.The State consists of four rows of bytes,each containing Nb bytes,where Nb is the block length divided by 32.In the State array denoted by the symbol s,each individual byte has two indices,with its row number r in the range 0<r<4 and its column number c in the range 0sc<Nb.This allows an individual byte of the State to be referred to as either sr.c or s[r,c].For this standard,Nb=4,i.e.,0sc<4 (also see Sec.6.3). At the start of the Cipher and Inverse Cipher described in Sec.5,the input-the array of bytes ino,in,...ins-is copied into the State array as illustrated in Fig.3.The Cipher or Inverse Cipher operations are then conducted on this State array,after which its final value is copied to the output-the array of bytes outo,out,...outis. input bytes State array output bytes ino 1n4 ins in12 S0,0S0,1 S0,2 50,3 outo out4 outs out12 in ins ing iny3 → S1,0S1,1 S12 S1,3 out outs outg out13 in In6 inio in14 S2.0 S2,1 S22 S2,3 out out6 out14 1n3 1 m11 inis S3.0S3,1 S32 S3,3 out out outu out1s Figure 3.State array input and output. Hence,at the beginning of the Cipher or Inverse Cipher,the input array,in,is copied to the State array according to the scheme: s[r,c]=in[r 4c] for0≤r<4and0≤c<Wb, (3.3) 9
9 a0 = {input0, input1, …, input7}; a1 = {input8, input9, …, input15}; M a15 = {input120, input121, …, input127}. The pattern can be extended to longer sequences (i.e., for 192- and 256-bit keys), so that, in general, an = {input8n, input8n+1, …, input8n+7}. (3.2) Taking Sections 3.2 and 3.3 together, Fig. 2 shows how bits within each byte are numbered. Input bit sequence 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 … Byte number 0 1 2 … Bit numbers in byte 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 … Figure 2. Indices for Bytes and Bits. 3.4 The State Internally, the AES algorithm’s operations are performed on a two-dimensional array of bytes called the State. The State consists of four rows of bytes, each containing Nb bytes, where Nb is the block length divided by 32. In the State array denoted by the symbol s, each individual byte has two indices, with its row number r in the range 0 £ r < 4 and its column number c in the range 0 £ c < Nb. This allows an individual byte of the State to be referred to as either sr,c or s[r,c]. For this standard, Nb=4, i.e., 0 £ c < 4 (also see Sec. 6.3). At the start of the Cipher and Inverse Cipher described in Sec. 5, the input – the array of bytes in0, in1, … in15 – is copied into the State array as illustrated in Fig. 3. The Cipher or Inverse Cipher operations are then conducted on this State array, after which its final value is copied to the output – the array of bytes out0, out1, … out15. input bytes State array output bytes in0 in4 in8 in12 s0,0 s0,1 s0,2 s0,3 out0 out4 out8 out12 in1 in5 in9 in13 s1,0 s1,1 s1,2 s1,3 out1 out5 out9 out13 in2 in6 in10 in14 s2,0 s2,1 s2,2 s2,3 out2 out6 out10 out14 in3 in7 in11 in15 ‡ s3,0 s3,1 s3,2 s3,3 ‡ out3 out7 out11 out15 Figure 3. State array input and output. Hence, at the beginning of the Cipher or Inverse Cipher, the input array, in, is copied to the State array according to the scheme: s[r, c] = in[r + 4c] for 0 £ r < 4 and 0 £ c < Nb, (3.3)
and at the end of the Cipher and Inverse Cipher,the State is copied to the output array out as follows: outir +4c]=s[r,c] for0≤r<4and0≤c<Nb. (3.4) 3.5 The State as an Array of Columns The four bytes in each column of the State array form 32-bit words,where the row number r provides an index for the four bytes within each word.The state can hence be interpreted as a one-dimensional array of 32 bit words(columns),wo...w3,where the column number c provides an index into this array.Hence,for the example in Fig.3,the State can be considered as an array of four words,as follows: 1W0=S0,0S1,0S2,0S3,0 1w2=S0,2S1,2S2,2S3,2 w1=S0,1S1,1S2,1S3,1 W3=S0,3S1,3S2,3S3,3. (3.5) 4. Mathematical Preliminaries All bytes in the AES algorithm are interpreted as finite field elements using the notation introduced in Sec.3.2.Finite field elements can be added and multiplied,but these operations are different from those used for numbers.The following subsections introduce the basic mathematical concepts needed for Sec.5. 4.1 Addition The addition of two elements in a finite field is achieved by "adding"the coefficients for the corresponding powers in the polynomials for the two elements.The addition is performed with the XOR operation(denoted by⊕)-i.e.,modulo2-so that 1⊕1=0,1⊕0=1,and0⊕0=0. Consequently,subtraction of polynomials is identical to addition of polynomials. Alternatively,addition of finite field elements can be described as the modulo 2 addition of corresponding bits in the byte.For two bytes (adasaasazaao)and (bbbsbab3b2bbol,the sum is {cc6cscc3C2cco},where each c=a⊕bi(i.e,c=a⊕b,c6=a6⊕bo,co=a⊕bo) For example,the following expressions are equivalent to one another: (x6+x4+x2+x+1)+(x?+x+1)=x?+x6+x4+x2 (polynomial notation); {01010111}⊕{10000011}={11010100} (binary notation); {57}©{83}={d4} (hexadecimal notation). 4.2 Multiplication In the polynomial representation,multiplication in GF(2)(denoted by.)corresponds with the multiplication of polynomials modulo an irreducible polynomial of degree 8.A polynomial is irreducible if its only divisors are one and itself.For the AES algorithm,this irreducible polynomial is m(x)=x8+x4+x3+x+1, (4.1) 10
10 and at the end of the Cipher and Inverse Cipher, the State is copied to the output array out as follows: out[r + 4c] = s[r, c] for 0 £ r < 4 and 0 £ c < Nb. (3.4) 3.5 The State as an Array of Columns The four bytes in each column of the State array form 32-bit words, where the row number r provides an index for the four bytes within each word. The state can hence be interpreted as a one-dimensional array of 32 bit words (columns), w0...w3, where the column number c provides an index into this array. Hence, for the example in Fig. 3, the State can be considered as an array of four words, as follows: w0 = s0,0 s1,0 s2,0 s3,0 w2 = s0,2 s1,2 s2,2 s3,2 w1 = s0,1 s1,1 s2,1 s3,1 w3 = s0,3 s1,3 s2,3 s3,3 . (3.5) 4. Mathematical Preliminaries All bytes in the AES algorithm are interpreted as finite field elements using the notation introduced in Sec. 3.2. Finite field elements can be added and multiplied, but these operations are different from those used for numbers. The following subsections introduce the basic mathematical concepts needed for Sec. 5. 4.1 Addition The addition of two elements in a finite field is achieved by “adding” the coefficients for the corresponding powers in the polynomials for the two elements. The addition is performed with the XOR operation (denoted by Å ) - i.e., modulo 2 - so that 1Å1 = 0 , 1Å 0 = 1, and 0 Å 0 = 0 . Consequently, subtraction of polynomials is identical to addition of polynomials. Alternatively, addition of finite field elements can be described as the modulo 2 addition of corresponding bits in the byte. For two bytes {a7a6a5a4a3a2a1a0} and {b7b6b5b4b3b2b1b0}, the sum is {c7c6c5c4c3c2c1c0}, where each ci = ai Å bi (i.e., c7 = a7 Å b7, c6 = a6 Å b6, ...c0 = a0 Å b0). For example, the following expressions are equivalent to one another: ( 1) 6 4 2 x + x + x + x + + ( 1) 7 x + x + = 7 6 4 2 x + x + x + x (polynomial notation); {01010111} Å {10000011} = {11010100} (binary notation); {57} Å {83} = {d4} (hexadecimal notation). 4.2 Multiplication In the polynomial representation, multiplication in GF(28 ) (denoted by ·) corresponds with the multiplication of polynomials modulo an irreducible polynomial of degree 8. A polynomial is irreducible if its only divisors are one and itself. For the AES algorithm, this irreducible polynomial is ( ) 1 8 4 3 m x = x + x + x + x + , (4.1)
or (01){1b}in hexadecimal notation. For example,(57)83)={c1),because (x6+x4+x2+x+1)(x7+x+1)= xB+x"+x9+x8+x7+ x7+x3+x3+x2+x+ x6+x4+x2+x+1 xB+x1+x9+x8+x6+x3+x4+x3+1 and x3+x+x+x+x+x+x+x3+1 modulo (x+x+x+x+1) =x7+x6+1. The modular reduction by m(x)ensures that the result will be a binary polynomial of degree less than 8,and thus can be represented by a byte.Unlike addition,there is no simple operation at the byte level that corresponds to this multiplication. The multiplication defined above is associative,and the element {01 is the multiplicative identity.For any non-zero binary polynomial b(x)of degree less than 8,the multiplicative inverse of b(x),denoted b"(x),can be found as follows:the extended Euclidean algorithm [7]is used to compute polynomials a(x)and c(x)such that b(x)a(x)+m(x)c(x)=1. (4.2) Hence,a(x).b(x)mod m(x)=1,which means b-(x)=a(x)mod m(x) (4.3) Moreover,for any a(x),b(x)and c(x)in the field,it holds that a(x)·(b(x)+c(x)=a(x)·b(x)+a(x)·c(x). It follows that the set of 256 possible byte values,with XOR used as addition and the multiplication defined as above,has the structure of the finite field GF(25). 4.2.1 Multiplication by x Multiplying the binary polynomial defined in equation(3.1)with the polynomial x results in b7x8+b6x7+b5x6+b4xs+b3x+b2x3+b1x2+box. (4.4) The result x.b(x)is obtained by reducing the above result modulo m(x),as defined in equation (4.1).If b7=0,the result is already in reduced form.If b7=1,the reduction is accomplished by subtracting (i.e.,XORing)the polynomial m(x).It follows that multiplication by x(i.e., (00000010)or (02))can be implemented at the byte level as a left shift and a subsequent conditional bitwise XOR with (1b).This operation on bytes is denoted by xtime(). Multiplication by higher powers ofx can be implemented by repeated application of xtime(). By adding intermediate results,multiplication by any constant can be implemented. For example,(57).13)={fe)because 11
11 or {01}{1b} in hexadecimal notation. For example, {57} · {83} = {c1}, because ( 1) 6 4 2 x + x + x + x + ( 1) 7 x + x + = + + + + + 13 11 9 8 7 x x x x x x + x + x + x + x + 7 5 3 2 1 6 4 2 x + x + x + x + = 1 13 11 9 8 6 5 4 3 x + x + x + x + x + x + x + x + and 1 13 11 9 8 6 5 4 3 x + x + x + x + x + x + x + x + modulo ( 1 8 4 3 x + x + x + x + ) = 1 7 6 x + x + . The modular reduction by m(x) ensures that the result will be a binary polynomial of degree less than 8, and thus can be represented by a byte. Unlike addition, there is no simple operation at the byte level that corresponds to this multiplication. The multiplication defined above is associative, and the element {01} is the multiplicative identity. For any non-zero binary polynomial b(x) of degree less than 8, the multiplicative inverse of b(x), denoted b -1(x), can be found as follows: the extended Euclidean algorithm [7] is used to compute polynomials a(x) and c(x) such that b(x)a(x) + m(x)c(x) = 1. (4.2) Hence, a(x) · b(x) mod m(x) = 1, which means ( ) ( ) mod ( ) 1 b x = a x m x - . (4.3) Moreover, for any a(x), b(x) and c(x) in the field, it holds that a(x) · (b(x) + c(x)) = a(x) · b(x) + a(x) · c(x) . It follows that the set of 256 possible byte values, with XOR used as addition and the multiplication defined as above, has the structure of the finite field GF(28 ). 4.2.1 Multiplication by x Multiplying the binary polynomial defined in equation (3.1) with the polynomial x results in b x b x b x b x b x b x b x b x0 2 1 3 2 4 3 5 4 6 5 7 6 8 7 + + + + + + + . (4.4) The result x · b(x)is obtained by reducing the above result modulo m(x), as defined in equation (4.1). If b7 = 0, the result is already in reduced form. If b7 = 1, the reduction is accomplished by subtracting (i.e., XORing) the polynomial m(x). It follows that multiplication by x (i.e., {00000010} or {02}) can be implemented at the byte level as a left shift and a subsequent conditional bitwise XOR with {1b}. This operation on bytes is denoted by xtime(). Multiplication by higher powers of x can be implemented by repeated application of xtime(). By adding intermediate results, multiplication by any constant can be implemented. For example, {57} · {13} = {fe} because
{57}·{02}=xtime({57})={ae} {57}·{04}=xtime({ae})={47} {57}·{08}=xtime({47})={8e} {57}·{10}=xtime({8e})={07}, thus, {57}·{13}={57}·({01}⊕{02}⊕{10}) ={57}⊕{ae}⊕{07} ={fe}. 4.3 Polynomials with Coefficients in GF(2) Four-term polynomials can be defined-with coefficients that are finite field elements-as: a(x)=ax+ax2+ax+ao (4.5) which will be denoted as a word in the form [ao,a,2,as].Note that the polynomials in this section behave somewhat differently than the polynomials used in the definition of finite field elements,even though both types of polynomials use the same indeterminate,x.The coefficients in this section are themselves finite field elements,i.e.,bytes,instead of bits;also,the multiplication of four-term polynomials uses a different reduction polynomial,defined below. The distinction should always be clear from the context. To illustrate the addition and multiplication operations,let b(x)=bx3+b2x2+bx+b。 (4.6) define a second four-term polynomial.Addition is performed by adding the finite field coefficients of like powers of x.This addition corresponds to an XOR operation between the corresponding bytes in each of the words -in other words,the XOR of the complete word values. Thus,using the equations of(4.5)and(4.6), a(x)+b(x)=(a3⊕b)x3+(a2田b2)x2+(a1⊕b)x+(a⊕b) (4.7) Multiplication is achieved in two steps.In the first step,the polynomial product c(x)=a(x). b(x)is algebraically expanded,and like powers are collected to give c(x)=Cox+csx+cax+cx+czx2+cx+co (4.8) where co=a·b C4=a3·b⊕a2·b2⊕a1·b3 C1=a1·b⊕a·b C5=a3·b2©a2·b3 C2=a2b⊕a1b⊕a·b2 C6=a3·b (4.9) 12
12 {57} · {02} = xtime({57}) = {ae} {57} · {04} = xtime({ae}) = {47} {57} · {08} = xtime({47}) = {8e} {57} · {10} = xtime({8e}) = {07}, thus, {57} · {13} = {57} · ({01} Å {02} Å {10}) = {57} Å {ae} Å {07} = {fe}. 4.3 Polynomials with Coefficients in GF(28 ) Four-term polynomials can be defined - with coefficients that are finite field elements - as: 1 0 2 2 3 3 a(x) = a x + a x + a x + a (4.5) which will be denoted as a word in the form [a0 , a1 , a2 , a3 ]. Note that the polynomials in this section behave somewhat differently than the polynomials used in the definition of finite field elements, even though both types of polynomials use the same indeterminate, x. The coefficients in this section are themselves finite field elements, i.e., bytes, instead of bits; also, the multiplication of four-term polynomials uses a different reduction polynomial, defined below. The distinction should always be clear from the context. To illustrate the addition and multiplication operations, let 1 0 2 2 3 3 b(x) = b x + b x + b x + b (4.6) define a second four-term polynomial. Addition is performed by adding the finite field coefficients of like powers of x. This addition corresponds to an XOR operation between the corresponding bytes in each of the words – in other words, the XOR of the complete word values. Thus, using the equations of (4.5) and (4.6), ( ) ( ) ( ) ( ) ( ) ( ) 1 1 0 0 2 2 2 3 a x + b x = a3 Åb3 x + a Åb x + a Åb x + a Åb (4.7) Multiplication is achieved in two steps. In the first step, the polynomial product c(x) = a(x) · b(x) is algebraically expanded, and like powers are collected to give 1 0 2 2 3 3 4 4 5 5 6 6 c(x) = c x + c x + c x + c x + c x + c x + c (4.8) where 0 a0 b0 c = · 4 a3 b1 a2 b2 a1 b3 c = · Å · Å · 1 a1 b0 a0 b1 c = · Å · 5 a3 b2 a2 b3 c = · Å · 2 a2 b0 a1 b1 a0 b2 c = · Å · Å · 6 a3 b3 c = · (4.9)
c3=a3·b©a2·b⊕a1·b2⊕a·b3. The result,c(x),does not represent a four-byte word.Therefore,the second step of the multiplication is to reduce c(x)modulo a polynomial of degree 4;the result can be reduced to a polynomial of degree less than 4.For the AES algorithm,this is accomplished with the polynomialx+1,so that xmod(x+1)=ximod4 (4.10) The modular product of a(x)and b(x),denoted by a(x)b(x),is given by the four-term polynomial d(x),defined as follows: d(x)=dx'+d,x2+d x +do (4.11) with d=(a·b)©(a3·b)©(a2·b2)©(a1·b) d1=(a1·b)©(a·b)⊕(a3·b2)⊕(a2·b3) (4.12) d2=(a2b)©(a1b)©(ab2)⊕(a3b) d3=(a3b)©(a2·b)⊕(a1·b2)©(a·b) When a(x)is a fixed polynomial,the operation defined in equation (4.11)can be written in matrix form as: d ao a;a a 「b d a a2 b (4.13) d az a ao a3 b2 a2 a aoJ儿b」 Because x+1 is not an irreducible polynomial over GF(2),multiplication by a fixed four-term polynomial is not necessarily invertible.However,the AES algorithm specifies a fixed four-term polynomial that does have an inverse (see Sec.5.1.3 and Sec.5.3.3): a(x)={03x3+{01}x2+{01x+{02} (4.14) a'(x)={0bx3+{0d}x2+{09}x+{0e} (4.15) Another polynomial used in the AES algorithm(see the Rotword (function in Sec.5.2)has ao =a=a2={00)and a3=(01),which is the polynomial x.Inspection of equation(4.13)above will show that its effect is to form the output word by rotating bytes in the input word.This means that [bo,b1,b2,b3]is transformed into [bi,b2,b3,bol. 5.Algorithm Specification For the AES algorithm,the length of the input block,the output block and the State is 128 bits.This is represented by Nb=4,which reflects the number of 32-bit words (number of columns)in the State. 13
13 3 a3 b0 a2 b1 a1 b2 a0 b3 c = · Å · Å · Å · . The result, c(x), does not represent a four-byte word. Therefore, the second step of the multiplication is to reduce c(x) modulo a polynomial of degree 4; the result can be reduced to a polynomial of degree less than 4. For the AES algorithm, this is accomplished with the polynomial x 4 + 1, so that 4 mod 4 mod( 1) i i x x + = x . (4.10) The modular product of a(x) and b(x), denoted by a(x) Ä b(x), is given by the four-term polynomial d(x), defined as follows: 1 0 2 2 3 3 d(x) = d x + d x + d x + d (4.11) with ( ) ( ) ( ) ( ) d0 a0 b0 a3 b1 a2 b2 a1 b3 = · Å · Å · Å · ( ) ( ) ( ) ( ) d1 a1 b0 a0 b1 a3 b2 a2 b3 = · Å · Å · Å · (4.12) ( ) ( ) ( ) ( ) d2 a2 b0 a1 b1 a0 b2 a3 b3 = · Å · Å · Å · ( ) ( ) ( ) ( ) d3 a3 b0 a2 b1 a1 b2 a0 b3 = · Å · Å · Å · When a(x) is a fixed polynomial, the operation defined in equation (4.11) can be written in matrix form as: ú ú ú ú û ù ê ê ê ê ë é ú ú ú ú û ù ê ê ê ê ë é = ú ú ú ú û ù ê ê ê ê ë é 3 2 1 0 3 2 1 0 2 1 0 3 1 0 3 2 0 3 2 1 3 2 1 0 b b b b a a a a a a a a a a a a a a a a d d d d (4.13) Because 1 4 x + is not an irreducible polynomial over GF(28 ), multiplication by a fixed four-term polynomial is not necessarily invertible. However, the AES algorithm specifies a fixed four-term polynomial that does have an inverse (see Sec. 5.1.3 and Sec. 5.3.3): a(x) = {03}x 3 + {01}x 2 + {01}x + {02} (4.14) a -1(x) = {0b}x 3 + {0d}x 2 + {09}x + {0e}. (4.15) Another polynomial used in the AES algorithm (see the RotWord() function in Sec. 5.2) has a0 = a1 = a2 = {00} and a3 = {01}, which is the polynomial x 3 . Inspection of equation (4.13) above will show that its effect is to form the output word by rotating bytes in the input word. This means that [b0, b1, b2, b3] is transformed into [b1, b2, b3, b0]. 5. Algorithm Specification For the AES algorithm, the length of the input block, the output block and the State is 128 bits. This is represented by Nb = 4, which reflects the number of 32-bit words (number of columns) in the State