
Towards FPGA Architectures Optimized For Cryptographic Algorithms



Presentation Transcript


  1. Towards FPGA Architectures Optimized For Cryptographic Algorithms 唐明

  2. Table of Contents • Antecedents • Motivation • General and Specific Objectives • State of the art of the work • Results • Publications • Future Work • Conclusions

  3. Antecedents • Cryptographic algorithms can be implemented through: • Software • ASIC • FPGAs • Choice of platform depends upon: • Algorithm performance • Cost • Flexibility

  4. Antecedents (continued) • Software • Most flexible • Low performance • Low cost • ASIC • High performance • No flexibility at all • High cost • FPGAs • Most flexible • Low cost • High performance

  5. Motivation • FPGAs: potential features • Cryptographic algorithms: basic functions

  6. FPGA: Field-Programmable Gate Array

  7. Configurable Logic Block [Figure: a CLB in logic mode (4-input combinational logic feeding 1-bit registers) and in memory mode (16x1 RAM feeding 1-bit registers)]

  8. Virtex-II Pro • 1 logic cell = one 4-input LUT + one FF + carry logic • 1 CLB = 4 slices • http://www.xilinx.com/products/tables/fpga.htm#v2p

  9. Cryptographic algorithms on FPGAs • Cryptographic algorithms contain: • Simple logical operations at the bit level • Replicated blocks • Long block lengths • They can benefit from FPGAs because: • FPGAs operate natively at the bit level • Blocks can simply be replicated • Parallelism is possible (high number of I/Os) • More physical security • Flexibility • High density

  10. Objectives • General • To achieve optimized implementations of cryptographic algorithms • Specific • DES: Data Encryption Standard • AES: Advanced Encryption Standard • ECC: Elliptic Curve Cryptography

  11. Background • The Advanced Encryption Standard (AES) is a computer security standard issued by NIST to replace DES; it became effective on May 26, 2002. • The scheme is a symmetric block cipher that encrypts and decrypts 128-bit blocks of data. • The standard key lengths are 128, 192, and 256 bits.

  12. Comparison

  13. AES: Advanced Encryption Standard • AES processes • Key scheduling • Encryption • Decryption [Figure: 128-bit plain text and a 128-bit key enter the AES block, which outputs 128-bit cipher text]

  14. AES: Advanced Encryption Standard Input = 128 bits = 16 bytes

  15. Key Scheduling

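  16. AES Encryption Algorithm Flow [Figure: IN, then ARK with the user key, then rounds 1..9 of BS, SR, MC, ARK with a sub key, then a final round of BS, SR, ARK with a sub key, then OUT] • BS: Byte Substitution • SR: Shift Rows • MC: Mix Column • ARK: Add Round Key

  17. Byte Substitution [Figure: each byte of the state matrix is replaced through a 16x16 S-box; round diagram: BS, SR, MC, ARK with sub key]

  18. ShiftRow (SR) [Figure: SR rotates the four state rows by offsets 0, 1, 2 and 3; the inverse ShiftRow (ISR) undoes the same offsets; round diagram: BS, SR, MC, ARK with sub key]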
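
The row offsets on the ShiftRow slide can be sketched in a few lines. This is a minimal illustration, assuming the state is held as a list of four rows of four bytes; the function names `shift_rows` and `inv_shift_rows` are hypothetical and do not come from the slides.

```python
def shift_rows(state):
    """SR: rotate row r of the 4x4 state left by r bytes (offsets 0, 1, 2, 3)."""
    return [row[r:] + row[:r] for r, row in enumerate(state)]


def inv_shift_rows(state):
    """ISR: rotate row r right by r bytes, which undoes shift_rows."""
    return [row[4 - r:] + row[:4 - r] for r, row in enumerate(state)]


# Example state whose entries are simply numbered row by row.
state = [[0, 1, 2, 3],
         [4, 5, 6, 7],
         [8, 9, 10, 11],
         [12, 13, 14, 15]]
shifted = shift_rows(state)
```

In hardware, both directions are pure wiring, which is one reason this step costs no logic on an FPGA.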

  19. MixColumn (MC) & Inv MixColumn (IMC) [Figure: MC and IMC are applied to each state column i = 0, 1, 2, 3; round diagram: BS, SR, MC, ARK with sub key]
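
The matrices on this slide did not survive extraction. As a hedged sketch, the code below uses the standard AES coefficient matrices and the AES field polynomial x^8 + x^4 + x^3 + x + 1, which are assumed to be what the slide showed; the helper names `xtime`, `gf_mul` and `mat_vec` are illustrative only.

```python
def xtime(a):
    """Multiply a byte by 02 in GF(2^8), reducing by x^8 + x^4 + x^3 + x + 1."""
    a <<= 1
    if a & 0x100:
        a ^= 0x11B
    return a


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) by shift-and-add."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = xtime(a)
        b >>= 1
    return result


# Standard AES MixColumn and Inv MixColumn coefficient matrices.
MC = [[0x02, 0x03, 0x01, 0x01],
      [0x01, 0x02, 0x03, 0x01],
      [0x01, 0x01, 0x02, 0x03],
      [0x03, 0x01, 0x01, 0x02]]
IMC = [[0x0E, 0x0B, 0x0D, 0x09],
       [0x09, 0x0E, 0x0B, 0x0D],
       [0x0D, 0x09, 0x0E, 0x0B],
       [0x0B, 0x0D, 0x09, 0x0E]]


def mat_vec(matrix, column):
    """Multiply a 4x4 coefficient matrix by one state column over GF(2^8)."""
    out = []
    for row in matrix:
        acc = 0
        for coeff, byte in zip(row, column):
            acc ^= gf_mul(coeff, byte)
        out.append(acc)
    return out


column = [0xDB, 0x13, 0x53, 0x45]
mixed = mat_vec(MC, column)
```

Applying `mat_vec(IMC, mixed)` returns the original column, which is the property the decryption path relies on.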

  20. AddRoundKey (ARK) $b_{i,j} = a_{i,j} \oplus k_{i,j}$ for $i, j = 0, \dots, 3$, where $a$ is the input state, $k$ the round key and $b$ the output state [Figure: round diagram: BS, SR, MC, ARK with sub key]
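
AddRoundKey is a byte-wise XOR of the state with the round key. A minimal sketch, assuming both are 4x4 lists of bytes; the name `add_round_key` and the sample values are illustrative.

```python
def add_round_key(state, round_key):
    """ARK: XOR every state byte a[i][j] with the round-key byte k[i][j]."""
    return [[a ^ k for a, k in zip(a_row, k_row)]
            for a_row, k_row in zip(state, round_key)]


# Arbitrary sample values, chosen only to exercise the function.
state = [[0x32, 0x88, 0x31, 0xE0],
         [0x43, 0x5A, 0x31, 0x37],
         [0xF6, 0x30, 0x98, 0x07],
         [0xA8, 0x8D, 0xA2, 0x34]]
key = [[0x2B, 0x28, 0xAB, 0x09],
       [0x7E, 0xAE, 0xF7, 0xCF],
       [0x15, 0xD2, 0x15, 0x4F],
       [0x16, 0xA6, 0x88, 0x3C]]
mixed_in = add_round_key(state, key)
```

Because XOR is its own inverse, the same block serves encryption and decryption.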

  21. Our Contributions • Design 1: Encryptor Core • Sequential vs. Pipelined Architecture • Design 2: Encryptor/Decryptor Core • MixColumn & Inv. MixColumn modified • Design 3: Encryptor/Decryptor Core • S-Box & Inv. S-Box

  22. Our Contributions • Design 1: Encryptor Core • Sequential vs. Pipelined Architecture

  23. AES Algorithm Implementation: Sequential Approach [Figure: plain text passes through RND 0, a latched RND 1-9 loop and RND 10 to produce cipher text; KGEN derives each round key from the user key and RCON through a latch; control signals CLK and S]

  24. AES Algorithm Implementation: Pipelined Approach [Figure: IN REG, then RND 0 through RND 10 in series, then OUT; a chain of KGEN stages derives round keys RK 0 through RK 10 from the user key]

  25. Our Contributions • Design 2: Encryptor/Decryptor Core • MixColumn & Inv. MixColumn Modified

  26. BS and Inverse BS [Figure: the S-box is MI followed by AF; the inverse S-box is IAF followed by MI; a combined datapath shares one MI block, selected by E/D]
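
The decomposition on this slide can be checked with a short sketch. It assumes MI means the multiplicative inverse in GF(2^8), AF the AES affine transformation and IAF its inverse, using the constants from the AES standard; the function names are illustrative, and the inverse is computed by slow exponentiation for clarity rather than as a model of the hardware.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def gf_inv(a):
    """MI: a^254 equals a^-1 for nonzero a; 0 maps to 0 by AES convention."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result if a else 0


def rotl8(x, n):
    """Rotate a byte left by n bits."""
    return ((x << n) | (x >> (8 - n))) & 0xFF


def affine(x):
    """AF: the AES affine transformation."""
    return x ^ rotl8(x, 1) ^ rotl8(x, 2) ^ rotl8(x, 3) ^ rotl8(x, 4) ^ 0x63


def inv_affine(x):
    """IAF: inverse of the AES affine transformation."""
    return rotl8(x, 1) ^ rotl8(x, 3) ^ rotl8(x, 6) ^ 0x05


def sbox(x):
    """S-box: MI followed by AF."""
    return affine(gf_inv(x))


def inv_sbox(x):
    """Inverse S-box: IAF followed by MI."""
    return gf_inv(inv_affine(x))
```

Both directions call the same `gf_inv`, which is the sharing that the E/D-selected datapath exploits.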

  27. MixColumn (MC) & Inv MixColumn (IMC) Revisited [Figure: the MC and IMC coefficient matrices] • Every entry is represented in GF(2^8)

  28. MixColumn (MC) & Inv MixColumn (IMC), cont. • For MC, the biggest coefficient is 03 • For IMC, the biggest coefficient is 0E • The coefficients of IMC have a higher Hamming weight • Is IMC therefore a costly operation?

  29. MixColumn (MC) & Inv MixColumn (IMC), cont. • We observe that IMC can be expressed through two equations, (1) and (2) • The biggest coefficient in Eq. 2 is 05 • Eq. 1 we already have • The Eq. 2 calculation can be made before Eq. 1

  30. Data Path for Encryption/Decryption • Encryption: MI + AF + SR + MC + ARK • Decryption: ISR + IAF + MI + ModM + MC + ARK
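
Equations (1) and (2) were lost in extraction. One well-known identity that matches the text (a largest coefficient of 05, and a ModM step placed before MC in the decryption path) is IMC = MC x M, where M is a circulant matrix with entries 05, 00, 04, 00. The sketch below checks that identity; treating M as the slide's ModM is an assumption, and the names `MOD_M` and `mat_mul` are illustrative.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def mat_mul(A, B):
    """Multiply two 4x4 matrices with entries in GF(2^8)."""
    out = [[0] * 4 for _ in range(4)]
    for i in range(4):
        for j in range(4):
            acc = 0
            for k in range(4):
                acc ^= gf_mul(A[i][k], B[k][j])
            out[i][j] = acc
    return out


MC = [[0x02, 0x03, 0x01, 0x01],
      [0x01, 0x02, 0x03, 0x01],
      [0x01, 0x01, 0x02, 0x03],
      [0x03, 0x01, 0x01, 0x02]]
IMC = [[0x0E, 0x0B, 0x0D, 0x09],
       [0x09, 0x0E, 0x0B, 0x0D],
       [0x0D, 0x09, 0x0E, 0x0B],
       [0x0B, 0x0D, 0x09, 0x0E]]
# Assumed form of the ModM step: small coefficients only (05 and 04).
MOD_M = [[0x05, 0x00, 0x04, 0x00],
         [0x00, 0x05, 0x00, 0x04],
         [0x04, 0x00, 0x05, 0x00],
         [0x00, 0x04, 0x00, 0x05]]

product = mat_mul(MC, MOD_M)
```

With this factorization the decryptor reuses the MC circuit and only adds the cheaper multiplications by 04 and 05.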

  31. Our Contributions • Design 3: Encryptor/Decryptor Core • S-Box & Inv. S-Box

  32. Byte Substitution (Revisited) [Figure: S-box and inverse S-box datapath from IN through MI with affine stages; 16x16 S-box applied to the state matrix]

  33. MI: 1st Approach • MI with lookup table • Same S-box (MI) for encryption/decryption • Memory requirements are halved • BRAMs are used for storing MI values • No initialization time is needed to prepare them [Figure: E/D-selected datapath from IN to OUT through MI, with AF, SR, MC, ARK for encryption and ISR, IAF, IMC, IARK for decryption]

  34. MI: 2nd Approach, MI Three-Stage Strategy (S. Morioka and A. Satoh, CHES 2002) • MI with composite fields GF((2^2)^2) and GF((2^4)^2) • Map the element A ∈ GF(2^8) to a composite field F • Compute the multiplicative inverse over the field F • Map back from field F to GF(2^8) [Figure: 1st transformation (M, GF(2^8) to field F), then MI manipulation in GF(2^4), then 2nd transformation (M^-1, field F to GF(2^8))]

  35. MI Implementation Let A ∈ F2 and A = A_H y + A_L; then it can be shown that:
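
The formula that followed on this slide was lost. As a hedged sketch of the middle stage only, the code below inverts A = A_H y + A_L in GF((2^4)^2) using the textbook identity A^-1 = (A_H y + A_H + A_L) / (A_H^2 λ + A_H A_L + A_L^2), which holds when y^2 + y + λ is irreducible over GF(2^4). The polynomial x^4 + x + 1, the search for λ, and all function names are assumptions and need not match the slide; the mappings to and from GF(2^8) are not shown.

```python
def gf16_mul(a, b):
    """Multiply in GF(2^4) modulo x^4 + x + 1 (assumed field polynomial)."""
    result = 0
    for _ in range(4):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0x10:
            a ^= 0x13
    return result


def gf16_inv(a):
    """Inverse in GF(2^4) as a^14; returns 0 for a = 0."""
    result = 1
    for _ in range(14):
        result = gf16_mul(result, a)
    return result


# Pick a lambda for which y^2 + y + lambda has no root in GF(2^4).
LAMBDA = next(lam for lam in range(1, 16)
              if all(gf16_mul(y, y) ^ y ^ lam for y in range(16)))


def comp_mul(a, b):
    """Multiply two elements of GF((2^4)^2), packed as (high << 4) | low."""
    ah, al, bh, bl = a >> 4, a & 0xF, b >> 4, b & 0xF
    hh = gf16_mul(ah, bh)
    high = hh ^ gf16_mul(ah, bl) ^ gf16_mul(al, bh)
    low = gf16_mul(hh, LAMBDA) ^ gf16_mul(al, bl)
    return (high << 4) | low


def comp_inv(a):
    """Invert in GF((2^4)^2) with one GF(2^4) inversion and a few products."""
    ah, al = a >> 4, a & 0xF
    delta = (gf16_mul(gf16_mul(ah, ah), LAMBDA)
             ^ gf16_mul(ah, al) ^ gf16_mul(al, al))
    d_inv = gf16_inv(delta)
    return (gf16_mul(ah, d_inv) << 4) | gf16_mul(ah ^ al, d_inv)
```

The appeal for hardware is that only 4-bit arithmetic is needed, which trades the lookup-table memory of the first approach for logic.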

  36. AES Algorithm Implementations Results

  37. Metrics to measure? • 1. Throughput: $\text{Throughput} = \dfrac{\text{clock frequency} \times \text{number of bits}}{\text{number of rounds}}$ • 2. FPGA resources used • CLB slices • BRAMs • etc.
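
The throughput metric can be evaluated directly. A minimal sketch: the function name `throughput_bps` and the 100 MHz clock are hypothetical values chosen for illustration, not results from the presentation, and the round counts assume one clock cycle per round.

```python
def throughput_bps(clock_hz, block_bits, rounds_per_block):
    """Throughput in bits per second: clock frequency x bits / rounds."""
    return clock_hz * block_bits / rounds_per_block


# Hypothetical 100 MHz clock, 128-bit AES block.
sequential = throughput_bps(100_000_000, 128, 10)  # one block every 10 cycles
pipelined = throughput_bps(100_000_000, 128, 1)    # one block every cycle
```

Under these assumptions the pipelined design delivers ten times the throughput at the same clock, at the cost of replicating the round logic.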

  38. Sequential vs. Pipelined Design [Table: results for the sequential design and the pipelined design]

  39. MixColumn vs. Inv MixColumn • Two approaches for MC/IMC • Fewer BRAMs • Fewer slices • Highest throughput reported to date

  40. S-Box vs. Inv S-Box • Two approaches for MI • Key scheduling included • No initial delay • First design uses a lookup table for MI: fast, but with high memory requirements • Second design uses the composite field approach for MI: slower, with lower memory requirements • Both are efficient compared with reported designs

  41. Encryption Card Architecture

  42. Our Contributions Elliptic Curve Cryptography

  43. Elliptic Curve Cryptography • Scalar multiplication: Q = kP • Elliptic curve operations: point doubling Q = 2P, point addition R = P + Q • GF(2^m) arithmetic: multiplication, squaring, addition, etc.
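
The layering on this slide (scalar multiplication built from point doubling and point addition) can be sketched with the binary double-and-add method. This is an illustration under assumptions: the slides do not say which scalar multiplication algorithm was used, the curve operations are passed in as functions, and the usage example substitutes integers modulo 1009 for curve points purely to exercise the control flow.

```python
def scalar_mult(k, P, add, double, identity):
    """Left-to-right double-and-add: compute Q = kP from the bits of k."""
    Q = identity
    for bit in bin(k)[2:]:
        Q = double(Q)          # point doubling, Q = 2Q
        if bit == "1":
            Q = add(Q, P)      # point addition, Q = Q + P
    return Q


# Stand-in group (NOT an elliptic curve): integers modulo 1009 under addition.
MODULUS = 1009
result = scalar_mult(
    13, 5,
    add=lambda a, b: (a + b) % MODULUS,
    double=lambda a: (2 * a) % MODULUS,
    identity=0,
)
```

For an m-bit scalar this needs m doublings and, on average, m/2 additions, so the cost of the underlying GF(2^m) arithmetic dominates the timing.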

  44. GF(2^191) Arithmetic: Squaring • A = 1111 • A^2 = 1010101
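
The example on this slide shows that squaring a binary polynomial only spreads its bits apart, inserting a zero between neighbours. A minimal sketch, assuming polynomials are stored as Python integers with bit i holding the coefficient of x^i; `gf2_square` is an illustrative name, and the result still has to be reduced modulo the field polynomial.

```python
def gf2_square(a):
    """Square a polynomial over GF(2): the coefficient at bit i moves to bit 2i."""
    result = 0
    i = 0
    while a:
        if a & 1:
            result |= 1 << (2 * i)
        a >>= 1
        i += 1
    return result


squared = gf2_square(0b1111)  # the slide's example, A = 1111
```

In hardware this bit spreading is pure wiring, so squaring costs little more than the reduction that follows it.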

  45. GF(2^191) Arithmetic: Reduction
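
The slide body is missing, so the following is only a generic sketch of modular reduction for binary polynomials stored as integers. The trinomial x^191 + x^9 + 1 is an assumption (a commonly used field polynomial for GF(2^191)); the slides do not state which polynomial was used, and `gf2_reduce` is an illustrative name.

```python
def gf2_reduce(a, modulus):
    """Reduce polynomial a modulo `modulus` over GF(2); bit i is the x^i term."""
    m = modulus.bit_length() - 1          # degree of the field polynomial
    while a.bit_length() > m:
        shift = a.bit_length() - 1 - m    # align modulus with the top bit of a
        a ^= modulus << shift             # cancel that top bit
    return a


# Assumed field polynomial for GF(2^191): x^191 + x^9 + 1.
P191 = (1 << 191) | (1 << 9) | 1

reduced = gf2_reduce(1 << 191, P191)      # x^191 reduces to x^9 + 1
```

With a sparse trinomial, each cancelled bit touches only two lower positions, so a hardware implementation reduces to a small network of XOR gates.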

  46. Karatsuba Multiplier, GF(2^191) • Then the polynomial multiplication of A and B is: • The idea of the Karatsuba algorithm is that the above product can be rewritten as:
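
The equations on this slide were lost. In the usual formulation, with A = A_H x^h + A_L and B = B_H x^h + B_L, the product A_H B_H x^(2h) + (A_H B_L + A_L B_H) x^h + A_L B_L is rewritten so that the middle term equals (A_H + A_L)(B_H + B_L) + A_H B_H + A_L B_L, giving three half-size multiplications instead of four. The sketch below applies this recursively to binary polynomials stored as integers; the names, the base-case threshold of 8 bits and the split point are assumptions, not details from the slides.

```python
def clmul(a, b):
    """Schoolbook carry-less multiplication of two polynomials over GF(2)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def karatsuba(a, b, n):
    """Multiply two polynomials over GF(2) of at most n bits with 3 sub-products."""
    if n <= 8:
        return clmul(a, b)
    half = (n + 1) // 2
    mask = (1 << half) - 1
    a_lo, a_hi = a & mask, a >> half
    b_lo, b_hi = b & mask, b >> half
    lo = karatsuba(a_lo, b_lo, half)
    hi = karatsuba(a_hi, b_hi, n - half)
    mid = karatsuba(a_lo ^ a_hi, b_lo ^ b_hi, half)
    # Over GF(2), addition and subtraction are both XOR.
    return (hi << (2 * half)) ^ ((mid ^ lo ^ hi) << half) ^ lo


a = (1 << 190) | 0x1234567890ABCDEF
b = (1 << 189) | (1 << 100) | 0xFEDCBA987
product = karatsuba(a, b, 191)
```

The result is an unreduced polynomial of up to 381 bits; a field multiplication in GF(2^191) follows it with a reduction step.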

  47. Point Addition in GF(2^191), Hessian Form

  48. Point Doubling in GF(2^191), Hessian Form

  49. Performance Results • Tool: Xilinx Foundation F4.1i • Device: XCV2600E • For ECC scalar multiplication: • Maximum reported timing: 170 µs [Gerardo, CHES 2000] • Estimated timing: < 100 µs
