EE800 Term Project

Presentation Transcript


  1. Study of AES Encryption/Decryption Optimizations EE800 Term Project Nathan Windels

  2. Outline
     • Introduction
     • AES Algorithm
     • Areas of Optimization
     • Progress/Conclusion

  3. Introduction

  4. Introduction
     Three major implementation methods:
     • Software
       - Typically, this method is much slower than hardware implementations.
     • FPGA
       - Implemented as a hardware module connected directly to pins.
       - Peripheral to a soft-core processor (communicates via on-chip bus).
       - Tightly-coupled hardware implemented as an extended instruction set.
     • Custom Hardware (ASIC)

  5. Introduction (2)
     • High-throughput implementations are mainly used in high-end devices such as accelerator cards for e-commerce services and secure trunk communications.
     • These implementations typically unroll the AES round loop and pipeline the 128-bit datapath.
     • Although they typically achieve very high throughput, their area is very large.

  6. Introduction (3)
     • The 32-bit AES implementations mainly multiplex the 128-bit datapath down to 32 bits.
     • This reduces circuit area at the expense of speed.
     • This type of implementation is ideal for embedded applications.
     • My goal is to provide synthesis results for the different implementations, as well as simulation/implementation results if time permits.

  7. The AES Algorithm

  8. AES Algorithm: Top Level

  9. AES Algorithm: Input to Encryption Process to Key Schedule

  10. AES Algorithm: Data Path From Key Schedule

  11. AES Algorithm: Data Path – SubBytes

  12. AES Algorithm: Data Path – ShiftRows (figure omitted; labels 1, 2, 3 mark the row shift amounts)

  13. AES Algorithm: Data Path – MixColumns (figure omitted; matrix multiplication of each state column)

  14. AES Algorithm: Data Path – Add Key (figure omitted; labels: Data, Round Key)

  15. AES Algorithm: Key Schedule
     • Without going into too much detail, the key is generated in a ‘similar’ way.
     • In each round, a new round key is generated from the previous round key.
     • This round key is added (XORed) to the data at the end of the round.

  16. Areas of Optimization

  17. Physical Layout - Starting Point

  18. Optimization: Key Expansion
     • Pre-calculated in software and then stored in hardware (loaded when needed)
       - Low area
       - Hardware has to wait if a new key is introduced (not good for a continually changing key)
     • Calculated in parallel with the corresponding iteration
       - This allows a changing key to be calculated on the fly
       - Extra hardware/area cost (not good for (embedded) fixed-key applications)
     • Calculated in hardware ahead of time and stored
       - High hardware cost; introduces latency when a new key is introduced
       - The circuit can be ‘turned off’ in an ASIC solution
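
The trade-off between storing the expanded key and computing it per round can be sketched in software. The following is a minimal Python sketch for AES-128 only, not the project's hardware code: the function names (`next_round_key`, `round_keys_on_the_fly`, `round_keys_precomputed`) are hypothetical, and the S-box is derived by brute force purely to keep the example self-contained.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    p = 0
    while b:
        if b & 1:
            p ^= a
        a = (a << 1) ^ (0x11B if a & 0x80 else 0)
        b >>= 1
    return p

def sbox_entry(x):
    """AES S-box value: multiplicative inverse followed by the affine map."""
    inv = next((y for y in range(1, 256) if gf_mul(x, y) == 1), 0)
    s = inv
    for n in range(1, 5):
        s ^= ((inv << n) | (inv >> (8 - n))) & 0xFF
    return s ^ 0x63

SBOX = [sbox_entry(x) for x in range(256)]

def next_round_key(prev, rcon):
    """Derive one 16-byte round key from the previous one (AES-128)."""
    w = [prev[i:i + 4] for i in range(0, 16, 4)]
    t = w[3][1:] + w[3][:1]          # RotWord
    t = [SBOX[b] for b in t]         # SubWord
    t[0] ^= rcon                     # round constant
    out = []
    for i in range(4):
        t = [a ^ b for a, b in zip(w[i], t)]
        out += t
    return out

def round_keys_on_the_fly(key, rounds=10):
    """Yield round keys one at a time; only the current key is held."""
    rk, rcon = list(key), 1
    yield rk
    for _ in range(rounds):
        rk = next_round_key(rk, rcon)
        rcon = gf_mul(rcon, 2)
        yield rk

def round_keys_precomputed(key, rounds=10):
    """Expand once and store every round key (11 x 16 = 176 bytes)."""
    return list(round_keys_on_the_fly(key, rounds))
```

The generator mirrors the "calculated in parallel" option, where only 16 bytes of key state exist at any time. The list mirrors the stored options, which need 176 bytes of storage for AES-128 and must be rebuilt whenever the key changes.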

  19. Optimization: Shift Row
     • 16x8 memory with shifting ability
     • 2 shift registers
     • Rearrangement of wires (requires no extra area, but may cause congestion in the wiring)
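
The "rearrangement of wires" option works because ShiftRows is a fixed permutation of the 16 state bytes. A minimal Python sketch of that view follows, assuming the usual column-major byte order (byte index = row + 4 × column); the names `SHIFT_ROWS_WIRING` and `shift_rows` are hypothetical.

```python
# Output byte (row r, column c) is wired to input byte (row r, column (c + r) mod 4).
SHIFT_ROWS_WIRING = [r + 4 * ((c + r) % 4) for c in range(4) for r in range(4)]

def shift_rows(state):
    """Apply ShiftRows as a pure index permutation (no arithmetic)."""
    return [state[src] for src in SHIFT_ROWS_WIRING]
```

Since the table is constant, hardware needs no logic or storage for this step, only routing, which is where the congestion concern comes from.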

  20. Optimization: Substitute Byte
     • LUT
       - Easy to implement and understand. It would be a good idea to use the on-chip ROM rather than logic elements (LEs), depending on the application.
       - Uses lots of resources
     • Combinational logic
       - No need for memories (an XOR circuit could be good in an FPGA, as we’ve seen earlier in this class)
       - Slow due to the complex circuit
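
The two SubBytes options compute the same function: the combinational version evaluates a field inversion plus an affine transform for each byte, while the LUT version stores all 256 results. A minimal Python sketch, with hypothetical names (`sbox_combinational`, `SBOX_ROM`) and a simple repeated-multiplication inverse standing in for a real gate-level circuit:

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    p = 0
    while b:
        if b & 1:
            p ^= a
        a = (a << 1) ^ (0x11B if a & 0x80 else 0)
        b >>= 1
    return p

def gf_inv(a):
    """Inverse as a^254 (maps 0 to 0), since a^255 = 1 for nonzero a."""
    r = 1
    for _ in range(254):
        r = gf_mul(r, a)
    return r if a else 0

def rotl8(x, n):
    """Rotate an 8-bit value left by n bits."""
    return ((x << n) | (x >> (8 - n))) & 0xFF

def sbox_combinational(a):
    """Compute one S-box output from scratch: inversion, then XOR-only affine map."""
    x = gf_inv(a)
    return x ^ rotl8(x, 1) ^ rotl8(x, 2) ^ rotl8(x, 3) ^ rotl8(x, 4) ^ 0x63

# LUT option: precompute every output once and store it (256 x 8 bits of ROM).
SBOX_ROM = [sbox_combinational(a) for a in range(256)]
```

A lookup such as `SBOX_ROM[0x53]` costs one memory read, whereas `sbox_combinational(0x53)` repeats the full computation, which corresponds to the long logic path noted on the slide.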

  21. Optimization: Mix Columns
     • Multiplication and XOR done in combinational logic
       - Easy to implement
       - Could be slow and cover a large area
     • Combine the MixColumns multiplication with the S-box and leave the XOR in the LEs
       - Uses very few LEs. Removes multiplication from the equation.
       - Quadruples the size of the necessary ROM, which could be a drawback
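
Folding the MixColumns multiplication into the S-box can be sketched as a widened lookup table: each entry stores the S-box output already multiplied by 2, 1, 1, and 3, so only XORs remain at run time. The Python sketch below is illustrative only; `TBOX`, `tbox_column`, and `mix_column` are hypothetical names, and a real design would pack each entry into a 32-bit ROM word.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    p = 0
    while b:
        if b & 1:
            p ^= a
        a = (a << 1) ^ (0x11B if a & 0x80 else 0)
        b >>= 1
    return p

def sbox_entry(x):
    """AES S-box value: multiplicative inverse followed by the affine map."""
    inv = next((y for y in range(1, 256) if gf_mul(x, y) == 1), 0)
    s = inv
    for n in range(1, 5):
        s ^= ((inv << n) | (inv >> (8 - n))) & 0xFF
    return s ^ 0x63

SBOX = [sbox_entry(x) for x in range(256)]

def mix_column(col):
    """Reference MixColumns on one 4-byte column, with explicit multiplies."""
    a0, a1, a2, a3 = col
    return [
        gf_mul(a0, 2) ^ gf_mul(a1, 3) ^ a2 ^ a3,
        a0 ^ gf_mul(a1, 2) ^ gf_mul(a2, 3) ^ a3,
        a0 ^ a1 ^ gf_mul(a2, 2) ^ gf_mul(a3, 3),
        gf_mul(a0, 3) ^ a1 ^ a2 ^ gf_mul(a3, 2),
    ]

# Combined table: 4 bytes per entry instead of 1, hence 4x the ROM.
TBOX = [(gf_mul(s, 2), s, s, gf_mul(s, 3)) for s in SBOX]

def tbox_column(col):
    """SubBytes + MixColumns on one (already row-shifted) column using only lookups and XOR."""
    out = [0, 0, 0, 0]
    for i, byte in enumerate(col):
        t = TBOX[byte]
        rotated = t[-i:] + t[:-i] if i else t   # rotate entry by the byte's row
        for r in range(4):
            out[r] ^= rotated[r]
    return out
```

For any column `col`, `tbox_column(col)` matches `mix_column([SBOX[b] for b in col])`, while the table grows from 256 bytes to 1024 bytes.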

  22. Conclusion: So Far....
     • Studied papers that address several of the optimizations listed above
     • Decided on an approach to modify and test existing code
     • Began modifications on the code that I’ve decided to use as a starting point
     • ...don’t quite have synthesis results yet...

  23. Papers
     • “Embedded a Low Area 32-bit AES for Image Encryption/Decryption Application”
     • “Exploring HW/SW Co-Design of AES Algorithm Using Custom Instructions”
     • “Improved Method to Increase AES System Speed”
     • “An AES Tightly Coupled Hardware Accelerator in an FPGA-based Embedded Processor Core”
     • “DSPs, BRAMs and a Pinch of Logic: New Recipes for AES on FPGAs”
