Low Cost Design of Advanced Encryption Standard (AES) Processor

Low Cost Design of Advanced Encryption Standard (AES) Processor. Ming-Chih Chen Department of Electronic Engineering National Kaohsiung First University of Science and Technology. Outline. Introduction Previous AES Design Methods



Presentation Transcript

  1. Low Cost Design of Advanced Encryption Standard (AES) Processor Ming-Chih Chen Department of Electronic Engineering National Kaohsiung First University of Science and Technology

  2. Outline • Introduction • Previous AES Design Methods • Two Proposed Substructure Sharing Methods for XOR-based Operations • Two Proposed CSE Algorithms for Sum-of-Product Operations • Comparisons and Implementations • Conclusions

  3. Introduction

  4. Introduction • In Oct. 2000, NIST (National Institute of Standards and Technology) selected the Rijndael algorithm as the new Advanced Encryption Standard (AES). • The Rijndael AES algorithm is a symmetric block cipher that processes 128-bit data blocks using cipher keys of 128, 192, or 256 bits. • Applications of AES include wireless network security (IEEE 802.11), smart cards, etc.

  5. Advanced Encryption Standard • Finite Field Operations • AES Transformations & Algorithm

  6. Finite Field Operations

  7. Finite Field Addition • Bitwise XOR operation (or modulo-2 addition) (Polynomial notation) (Binary notation) (Hexadecimal notation)
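
The addition rule above can be tried out in a few lines. This is a minimal sketch; the function name `gf_add` is hypothetical, and the operands {57} and {83} are borrowed from the multiplication example on the next slide rather than from this one.

```python
def gf_add(a: int, b: int) -> int:
    """Add two GF(2^8) elements: coefficient-wise addition modulo 2, i.e. bitwise XOR."""
    return a ^ b


if __name__ == "__main__":
    total = gf_add(0x57, 0x83)
    # Show the same result in binary and hexadecimal notation.
    print(f"{total:08b}", f"{total:02x}")
```

Because every element is its own additive inverse, subtraction in this field is the same XOR.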

  8. Multiplication in GF(2^8) • Multiplication of two polynomials modulo the irreducible polynomial m(x) = x^8 + x^4 + x^3 + x + 1 • Ex: {57}·{83} = {c1} • Multiplicative identity: {01} • The multiplicative inverse of b(x) is denoted by b^-1(x) • Extended Euclidean algorithm • b(x)a(x) + m(x)c(x) = 1 => b^-1(x) = a(x) mod m(x)
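
A minimal sketch of the multiplication above, using shift-and-add with reduction by m(x) (0x11b). The names `gf_mul` and `gf_inv` are hypothetical. Note that the inverse here is computed by exponentiation (b^254, which equals b^-1 because b^255 = 1 for nonzero b), not by the extended Euclidean algorithm named on the slide; it is swapped in only because it is shorter.

```python
M_X = 0x11B  # m(x) = x^8 + x^4 + x^3 + x + 1


def gf_mul(a: int, b: int) -> int:
    """Multiply two GF(2^8) elements modulo m(x) by shift-and-add."""
    result = 0
    while b:
        if b & 1:          # current bit of b set: add (XOR) the shifted a
            result ^= a
        a <<= 1            # multiply a by x
        if a & 0x100:      # degree 8 reached: reduce by m(x)
            a ^= M_X
        b >>= 1
    return result


def gf_inv(b: int) -> int:
    """Multiplicative inverse as b^254; maps 0 to 0."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, b)
    return result


if __name__ == "__main__":
    print(f"{gf_mul(0x57, 0x83):02x}")
```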

  9. Multiplication by x • If b7 = 0: left shift • If b7 = 1: left shift followed by bitwise XOR with {1b} • This operation is denoted by xtime( )
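
The two cases above translate directly into code. This is a minimal sketch; the function name follows the slide's xtime( ), and the 8-bit mask is an implementation detail added here.

```python
def xtime(b: int) -> int:
    """Multiply a GF(2^8) element by x, i.e. by {02}."""
    shifted = (b << 1) & 0xFF      # left shift, keeping 8 bits
    if b & 0x80:                   # b7 = 1: reduce with {1b}
        shifted ^= 0x1B
    return shifted


if __name__ == "__main__":
    # Repeated xtime gives multiplication by {02}, {04}, {08}, {10}.
    value = 0x57
    for _ in range(4):
        value = xtime(value)
        print(f"{value:02x}")
```

Multiplication by any constant can then be built from xtime and XOR, for example {57}·{13} = {57} + xtime({57}) + xtime^4({57}).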

  10. Polynomials with Coefficients in GF(2^8) • Each coefficient of a polynomial is a byte (8 bits) • Polynomial addition: a(x) + b(x) • Byte-wise XOR of corresponding coefficients • Polynomial multiplication modulo x^4 + 1 • d(x) = a(x) b(x) (similar to cyclic convolution)
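
The cyclic-convolution view can be sketched as follows. Reducing modulo x^4 + 1 maps x^(i+j) to x^((i+j) mod 4), so coefficient k of the product collects every a_i·b_j with i + j ≡ k (mod 4). The names `gf_mul` and `poly_mul_mod` and the lowest-degree-first coefficient order are assumptions of this sketch.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two GF(2^8) elements modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def poly_mul_mod(a: list, b: list) -> list:
    """Multiply two 4-term polynomials (coefficients lowest degree first) modulo x^4 + 1."""
    d = [0, 0, 0, 0]
    for i in range(4):
        for j in range(4):
            # x^(i+j) wraps around to x^((i+j) mod 4) because x^4 = 1.
            d[(i + j) % 4] ^= gf_mul(a[i], b[j])
    return d


if __name__ == "__main__":
    a = [0x02, 0x01, 0x01, 0x03]      # {03}x^3 + {01}x^2 + {01}x + {02}
    a_inv = [0x0E, 0x09, 0x0D, 0x0B]  # {0b}x^3 + {0d}x^2 + {09}x + {0e}
    print(poly_mul_mod(a, a_inv))
```

The two polynomials used in the demo are the MixColumns and InvMixColumns constants from later slides; their product is the identity {01}.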

  11. AES Transformations & Algorithm

  12. Inputs and Outputs • Input and output • Sequences of blocks with a block length of 128 bits (Nb = 4 words for each block) • Cipher key • Sequence of cipher keys with a key length of 128, 192, or 256 bits (Nk = 4, 6, or 8 words for each key)

  13. Byte Representation • Block length = 128 bits = 16 bytes • Key length = 128, 192, or 256 bits = 16, 24, or 32 bytes • Finite field element representation • Polynomial: {01100011} = x^6 + x^5 + x + 1 • Hexadecimal representation • {01100011} = {63} • One extra bit to the left of a byte • {01}{1b}

  14. State • State: 2-D 4 x 4 array of bytes • A state has four rows and Nb columns • Equivalently, a 1-D array of 32-bit words w0, w1, w2, w3, with each word wi composed of one column of the 2-D state
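
A minimal sketch of the state layout. The column-major fill order s[r][c] = in[r + 4c] is taken from the AES standard (FIPS-197), not from this slide, and the function names are hypothetical.

```python
def bytes_to_state(block: bytes) -> list:
    """Arrange a 16-byte block into a 4 x 4 state, filling column by column."""
    return [[block[r + 4 * c] for c in range(4)] for r in range(4)]


def state_to_words(state: list) -> list:
    """View the state as four 32-bit words, one per column (top byte most significant)."""
    return [
        int.from_bytes(bytes(state[r][c] for r in range(4)), "big")
        for c in range(4)
    ]


if __name__ == "__main__":
    state = bytes_to_state(bytes(range(16)))
    for row in state:
        print(row)
    print([f"{w:08x}" for w in state_to_words(state)])
```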

  15. Key-Block-Round

  16. Rijndael AES Algorithm (a) Encryption (b) Direct Decryption (c) Modified Decryption

  17. Four Transformations in Cipher • SubBytes( ):SB • Nonlinear byte substitution • ShiftRows( ):SR • Cyclically left-shift the last three rows of the state • MixColumns( ):MC • Transformation on each column of the state • AddRoundKey( ):ARK • Each column is XORed with a 32-bit key schedule word generated from the key expansion

  18. SubBytes( ) • Take the multiplicative inverse (MI) in GF(2^8): S → S^-1 • Apply the affine transformation (AF) over GF(2): S' = M·S^-1 + C (C = {63} in hexadecimal) • where S and S' are the input and output bytes in 8-D vector format
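
The two steps of SubBytes( ) can be sketched as below. The matrix M is not reproduced in this transcript, so the sketch assumes the affine map of the AES standard, written as XORs of left rotations; `gf_mul`, `gf_inv`, `rotl8`, and `sub_byte` are hypothetical names, and the inverse is computed as S^254 for brevity.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two GF(2^8) elements modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def gf_inv(s: int) -> int:
    """Multiplicative inverse as s^254; maps 0 to 0."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, s)
    return result


def rotl8(x: int, k: int) -> int:
    """Rotate an 8-bit value left by k bits."""
    return ((x << k) | (x >> (8 - k))) & 0xFF


def sub_byte(s: int) -> int:
    """SubBytes on one byte: MI in GF(2^8), then the affine transformation with C = {63}."""
    inv = gf_inv(s)
    return inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63


if __name__ == "__main__":
    print(f"{sub_byte(0x53):02x}")
```

Evaluating `sub_byte` for all 256 inputs yields the substitution table (S-box) referred to on the next slide.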

  19. Overall Effect of SubBytes( ) • Substitution table (S-box)

  20. ShiftRows( )

  21. MixColumns( ) • Polynomial multiplication by the fixed term a(x) = {03}x^3 + {01}x^2 + {01}x + {02} modulo x^4 + 1
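
Written out per byte, multiplying a column by a(x) modulo x^4 + 1 gives four XOR sums that only need multiplication by {02} and {03}. A minimal sketch, with hypothetical names `xtime` and `mix_column`:

```python
def xtime(b: int) -> int:
    """Multiply a GF(2^8) element by {02}."""
    shifted = (b << 1) & 0xFF
    return shifted ^ 0x1B if b & 0x80 else shifted


def mix_column(col: list) -> list:
    """Apply MixColumns to one 4-byte column [s0, s1, s2, s3]."""
    out = []
    for r in range(4):
        s0, s1, s2, s3 = (col[(r + k) % 4] for k in range(4))
        # {02}*s0 + {03}*s1 + s2 + s3, with {03}*s1 = xtime(s1) + s1
        out.append(xtime(s0) ^ xtime(s1) ^ s1 ^ s2 ^ s3)
    return out


if __name__ == "__main__":
    print([f"{b:02x}" for b in mix_column([0xDB, 0x13, 0x53, 0x45])])
```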

  22. AddRoundKey( )

  23. Key Expansion • If i is not a multiple of Nk (and, for Nk = 8, i-4 is not a multiple of Nk) − w[i] = w[i-1] ⊕ w[i-Nk] • If i is a multiple of Nk − w[i] = transformation1(w[i-1]) ⊕ w[i-Nk] − Transformation 1 applies RotWord(), then SubWord(), then an XOR with Rcon[i/Nk] • If Nk = 8 and i-4 is a multiple of Nk − w[i] = transformation2(w[i-1]) ⊕ w[i-Nk] − Transformation 2 applies SubWord() only
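
The recurrence above can be sketched as follows. This is a software illustration under stated assumptions, not the presenter's hardware design: the S-box is recomputed from the field inverse and the standard affine map, the round constant is indexed as Rcon[i/Nk] as in the AES standard, and all function names are hypothetical.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two GF(2^8) elements modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def sub_byte(s: int) -> int:
    """S-box entry: inverse (s^254) followed by the affine transformation."""
    inv = 1
    for _ in range(254):
        inv = gf_mul(inv, s)
    out = inv
    for k in range(1, 5):
        out ^= ((inv << k) | (inv >> (8 - k))) & 0xFF
    return out ^ 0x63


SBOX = [sub_byte(x) for x in range(256)]


def sub_word(w: int) -> int:
    """Apply the S-box to each byte of a 32-bit word."""
    return int.from_bytes(bytes(SBOX[b] for b in w.to_bytes(4, "big")), "big")


def rot_word(w: int) -> int:
    """Rotate a 32-bit word left by one byte."""
    return ((w << 8) | (w >> 24)) & 0xFFFFFFFF


def expand_key(key: bytes) -> list:
    """Return the key schedule words w[0 .. 4*(Nr+1)-1] for a 16-, 24-, or 32-byte key."""
    nk = len(key) // 4
    nr = nk + 6
    w = [int.from_bytes(key[4 * i:4 * i + 4], "big") for i in range(nk)]
    rcon = 0x01
    for i in range(nk, 4 * (nr + 1)):
        temp = w[i - 1]
        if i % nk == 0:                      # transformation 1
            temp = sub_word(rot_word(temp)) ^ (rcon << 24)
            rcon = gf_mul(rcon, 0x02)        # next round constant
        elif nk == 8 and i % nk == 4:        # transformation 2 (256-bit keys only)
            temp = sub_word(temp)
        w.append(w[i - nk] ^ temp)
    return w


if __name__ == "__main__":
    words = expand_key(bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c"))
    print(f"{words[4]:08x}", f"{words[43]:08x}")
```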

  24. Key Expansion Structure: On-the-Fly • [Figure: on-the-fly key expansion structure operating on key words w(i) through w(i+7)]

  25. Four Transformations in Inverse Cipher • InvSubBytes( ):ISB • Nonlinear byte substitution • InvShiftRows( ):ISR • Cyclically right-shift the last three rows of the state • InvMixColumns( ):IMC • Transformation on each column of the state • AddRoundKey( ):ARK • Each column is XORed with a 32-bit key schedule word generated from the key expansion

  26. InvSubBytes( ) • Apply the inverse affine (IAF) transformation over GF(2): S^-1 = M^-1(S' + C) • Take the multiplicative inverse (MI) in GF(2^8): S^-1 → S • Overall effect: inverse S-box (S^-1-box)
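
A minimal sketch of the two InvSubBytes( ) steps. The inverse affine map is written in its rotation form from the AES standard (an assumption, since M^-1 is not reproduced in this transcript), and the forward `sub_byte` is included only to check the round trip. All names are hypothetical.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two GF(2^8) elements modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def gf_inv(s: int) -> int:
    """Multiplicative inverse as s^254; maps 0 to 0."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, s)
    return result


def rotl8(x: int, k: int) -> int:
    """Rotate an 8-bit value left by k bits."""
    return ((x << k) | (x >> (8 - k))) & 0xFF


def sub_byte(s: int) -> int:
    """Forward direction: MI, then affine transformation."""
    inv = gf_inv(s)
    return inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63


def inv_sub_byte(s_out: int) -> int:
    """Inverse direction: inverse affine transformation, then MI."""
    pre = rotl8(s_out, 1) ^ rotl8(s_out, 3) ^ rotl8(s_out, 6) ^ 0x05
    return gf_inv(pre)


if __name__ == "__main__":
    print(f"{inv_sub_byte(0xED):02x}")
```

Both directions use the same `gf_inv`, which is the sharing opportunity exploited by the integrated SB/ISB designs discussed later.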

  27. InvShiftRows • Cyclically right-shift the last three rows of the state.
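
ShiftRows( ) and InvShiftRows( ) are pure byte permutations: row r is rotated by r positions, to the left for encryption and to the right for decryption. A minimal sketch, assuming the state is a list of four 4-byte rows and using hypothetical function names:

```python
def shift_rows(state: list) -> list:
    """Cyclically left-shift row r of the state by r bytes (row 0 is unchanged)."""
    return [row[r:] + row[:r] for r, row in enumerate(state)]


def inv_shift_rows(state: list) -> list:
    """Cyclically right-shift row r of the state by r bytes (row 0 is unchanged)."""
    return [row[4 - r:] + row[:4 - r] for r, row in enumerate(state)]


if __name__ == "__main__":
    state = [[0, 1, 2, 3], [10, 11, 12, 13], [20, 21, 22, 23], [30, 31, 32, 33]]
    for row in shift_rows(state):
        print(row)
```

In hardware these transformations cost no logic, only wiring.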

  28. InvMixColumns( ) • Polynomial multiplication by the fixed term a^-1(x) = {0b}x^3 + {0d}x^2 + {09}x + {0e} modulo x^4 + 1
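
A minimal sketch of InvMixColumns( ) on one column, using a general GF(2^8) multiplier because the constants {09}, {0b}, {0d}, {0e} need more than a single xtime. The names `gf_mul` and `inv_mix_column` are hypothetical.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two GF(2^8) elements modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


# First row of the circulant InvMixColumns matrix; row r is this row rotated right by r.
FIRST_ROW = [0x0E, 0x0B, 0x0D, 0x09]


def inv_mix_column(col: list) -> list:
    """Apply InvMixColumns to one 4-byte column [s0, s1, s2, s3]."""
    out = []
    for r in range(4):
        acc = 0
        for c in range(4):
            acc ^= gf_mul(FIRST_ROW[(c - r) % 4], col[c])
        out.append(acc)
    return out


if __name__ == "__main__":
    print([f"{b:02x}" for b in inv_mix_column([0x8E, 0x4D, 0xA1, 0xBC])])
```

The larger constants are the reason IMC costs noticeably more XOR gates than MC, which motivates the sharing methods in the following slides.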

  29. Previous AES Design Methods

  30. Optimization Approaches for AES Transformations

  31. Three Categories of Transformation Optimization • The optimization of separate transformations. • The optimization of combined round transformations. • The optimization of integrated encryption/decryption transformations.

  32. The Optimization of Separate Transformations (1) • Two major transformations: • SB (ISB), MC (IMC) • SB (ISB): • Performs MI (multiplicative inverse) in GF(2^8) followed by AF. • 1. Use a 256x8-bit table look-up ROM (S-box) to store all pre-calculated results. • 2. Change the calculation of MI in GF(2^8) to that in the composite field GF((2^4)^2). • 3. Change the calculation of MI in GF(2^8) to that in the composite field GF(((2^2)^2)^2). • 4. Calculate MI in GF(2^8) based on a matrix decomposition of A^-1.

  33. Calculation of Multiplicative Inverse (MI) in GF((2^4)^2) (1.2a) • There are three stages in the calculation of MI in GF((2^4)^2).

  34. Calculation of Multiplicative Inverse (MI) in GF((2^4)^2) (1.2b) • Stage 1: • Translate from GF(2^8) to the composite field GF((2^4)^2) using the T transformation. • The implementation of the T transformation has area = 17 A_XOR and delay = 3 T_XOR.

  35. Calculation of Multiplicative Inverse (MI) in GF((2^4)^2) (1.2c) • Stage 2: • Find the MI of the two numbers in GF(2^4), where A = (0001)_2 and B = (1001)_2.

  36. Calculation of Multiplicative Inverse (MI) in GF((2^4)^2) (1.2d) • Stage 3: • Convert the number in GF((2^4)^2) back to a number in GF(2^8) using T^-1.

  37. Calculation of Multiplicative Inverse (MI) Using A^-1 (1.4) • A^-1: • The A^-1 (MI) can be calculated by • It requires four GF(2^8) multipliers, plus one A^2 and three A^4 components.

  38. The Optimization of Separate Transformations (2) • MC (IMC): • 1. Byte-level optimization: the multiplication block (XTime) multiplies a byte by the constant {02} (hexadecimal); byte-level sharing methods then reduce the number of XTime blocks. • Ex1: MC: D'' = {01}A + {01}B + {02}D + {03}E = A + B + XTime(D) + XTime(E) + E • Ex2: MC: D'' = {02}(D + E) + (A + B + D + E) + D, using {02}D = {02}D + D + D, since D + D = 0
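
The two byte-level forms can be checked against each other in software. A minimal sketch with hypothetical function names: `mc_byte_ex1` follows Ex1 and calls XTime twice, while `mc_byte_ex2` follows Ex2 and calls it once on the shared sum D + E.

```python
def xtime(b: int) -> int:
    """Multiply a GF(2^8) element by {02}."""
    shifted = (b << 1) & 0xFF
    return shifted ^ 0x1B if b & 0x80 else shifted


def mc_byte_ex1(a: int, b: int, d: int, e: int) -> int:
    """Ex1: A + B + XTime(D) + XTime(E) + E (two XTime blocks)."""
    return a ^ b ^ xtime(d) ^ xtime(e) ^ e


def mc_byte_ex2(a: int, b: int, d: int, e: int) -> int:
    """Ex2: {02}(D + E) + (A + B + D + E) + D (one XTime block)."""
    shared = d ^ e                      # D + E is reused in both terms
    return xtime(shared) ^ (a ^ b ^ shared) ^ d


if __name__ == "__main__":
    args = (0x53, 0x45, 0xDB, 0x13)
    print(f"{mc_byte_ex1(*args):02x}", f"{mc_byte_ex2(*args):02x}")
```

The second form works because XTime is linear over XOR, so XTime(D) + XTime(E) = XTime(D + E); it trades one XTime block for extra byte-wide XORs.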

  39. The Optimization of Separate Transformations (3) • 2. Bit-level optimization: the common sub-expression elimination (CSE) algorithm extracts as many common factors as possible to further reduce the hardware cost. • Ex (bits listed from bit 7 down to bit 0): {02}A = {a6, a5, a4, a3+a7, a2+a7, a1, a0+a7, a7} {03}A = {a6+a7, a5+a6, a4+a5, a3+a4+a7, a2+a3+a7, a1+a2, a0+a1+a7, a0+a7} The factor a0+a7 appears in bit 1 of {02}A and in bits 0 and 1 of {03}A; it can be extracted and replaced with a8 = (a0+a7). The factor a3+a7 appears in bit 4 of {02}A and in bits 3 and 4 of {03}A; it can also be extracted and replaced with a9 = (a3+a7).
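
The shared factors a8 and a9 can be checked bit by bit. This is a minimal sketch of the example only, not of the CSE search algorithm itself; the function names and the bit-list representation (index = bit position) are assumptions of the sketch.

```python
def xtime(b: int) -> int:
    """Reference byte-level multiplication by {02}."""
    shifted = (b << 1) & 0xFF
    return shifted ^ 0x1B if b & 0x80 else shifted


def mul02_mul03_shared(a: int):
    """Compute ({02}A, {03}A) from bit-level XOR equations that share a8 and a9."""
    x = [(a >> i) & 1 for i in range(8)]    # x[i] is bit a_i
    a8 = x[0] ^ x[7]                        # common factor a0 + a7
    a9 = x[3] ^ x[7]                        # common factor a3 + a7
    two = [x[7], a8, x[1], x[2] ^ x[7], a9, x[4], x[5], x[6]]
    three = [a8, x[1] ^ a8, x[1] ^ x[2], x[2] ^ a9,
             x[4] ^ a9, x[4] ^ x[5], x[5] ^ x[6], x[6] ^ x[7]]
    to_byte = lambda bits: sum(bit << i for i, bit in enumerate(bits))
    return to_byte(two), to_byte(three)


if __name__ == "__main__":
    two, three = mul02_mul03_shared(0x57)
    print(f"{two:02x}", f"{three:02x}")
```

Counting XOR operators in the sketch, the shared version uses 11 two-input XORs for both products, against 15 when each bit equation is built independently from the expressions on the slide.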

  40. The Optimization of Combined Round Transformations (1) • Combine SB, SR, and MC in encryption, or ISB, ISR, and IMC in decryption. • 1. Table-lookup ROM (T-box or T^-1-box):

  41. The Optimization of Combined Round Transformations (2) – 2. Combined IMC/ISR/IAF and AF/SR/MC with shared MI in GF((2^4)^2): (a) Combined AF/SR/MC (b) Combined IMC/ISR/IAF • Integration of AES encryption and decryption with shared MI in GF((2^4)^2)

  42. The Optimization of Integrated Encryption/Decryption Transformations (1) • Two major integrations: • Integration of SB and ISB, and integration of MC and IMC. • SB/ISB: • Share the same MI logic in GF(2^8) and multiplex between AF and IAF.

  43. The Optimization of Integrated Encryption/Decryption Transformations (2) • MC/IMC: • 1. Share the common factor, the XTime block, when constructing one output byte of MC and IMC, as shown in the following figure. • 2. Decompose the constant matrix of IMC as IMC = MC x C, where C is a constant matrix, as shown in the following equation.
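
The decomposition IMC = MC x C can be verified numerically. The matrix C is not reproduced in this transcript; the sketch below assumes the well-known circulant with first row ({05}, {00}, {04}, {00}), which corresponds to the polynomial {04}x^2 + {05}, and checks that it does satisfy the identity. Whether the slide used exactly this C is an assumption; `gf_mul`, `circulant`, and `mat_mul` are hypothetical names.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two GF(2^8) elements modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def circulant(first_row: list) -> list:
    """Build a 4 x 4 circulant matrix: row r is the first row rotated right by r."""
    return [[first_row[(c - r) % 4] for c in range(4)] for r in range(4)]


def mat_mul(x: list, y: list) -> list:
    """Multiply two 4 x 4 matrices with entries in GF(2^8)."""
    out = [[0] * 4 for _ in range(4)]
    for r in range(4):
        for c in range(4):
            for k in range(4):
                out[r][c] ^= gf_mul(x[r][k], y[k][c])
    return out


MC = circulant([0x02, 0x03, 0x01, 0x01])
IMC = circulant([0x0E, 0x0B, 0x0D, 0x09])
C = circulant([0x05, 0x00, 0x04, 0x00])   # assumed form of the constant matrix

if __name__ == "__main__":
    print(mat_mul(MC, C) == IMC)
```

With this factorization a decryption datapath can reuse the MC logic and add only the cheaper multiplication by C in front of it.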

  44. The Optimization of Integrated Encryption/Decryption Transformations (3) – 3. Decompose IMC = MC + F + G, where F and G are two constant matrix multiplications.

  45. Our Proposed Substructure Sharing Methods for XOR-based Operations • Bit-level Expressions of AES Transformations • Proposed Method: Bit-level Substructure Sharing

  46. Bit-level Expressions of AES Transformations

  47. Bit-level Expressions of AES Transformations • The two major kinds of transformations, SB (ISB) and MC (IMC), occupy about 65% of the total area cost of an AES implementation. • They can be expressed as bit-level XOR-based sum-of-product (SoP) operations. • SB: OutSB = MI followed by AF • ISB: OutISB = IAF followed by MI • MI: computed in GF((2^4)^2) or GF(((2^2)^2)^2) • MC: OutMC = {01}A + {01}B + {02}D + {03}E (1-byte output) • IMC: OutIMC = {0d}A + {09}B + {0e}D + {0b}E (1-byte output)

  48. Two Proposed CSE Algorithms for Sum-of-Product Operations • Bit-level SoP Expressions • Proposed Method III: Vertical CSE Algorithm • Proposed Method IV: Horizontal CSE Algorithm

  49. Bit-level Expressions (1) • A group of P bit-level equations (z_0, z_1, ..., z_{P-1}) with M0 primary input variables (a_0, a_1, ..., a_{M0-1}) and N0 product terms (w_0, w_1, ..., w_{N0-1}) can be expressed in the following matrix product form:

  50. Bit-level Expressions (2) • The N0 intermediate bit variables w_i are expressed in terms of the primary input variables, where "·" denotes the bitwise AND operation. [Defining equations not preserved in the transcript]
