1 / 16

Modular Multiplication: C = A * B mod M where A, B < M

Improving Cryptographic Architectures by Adopting Efficient Adders in their Modular Multiplication Hardware Adnan Gutub, Hassan Tahhan Computer Engineering Department, King Fahd University of Petroleum & Minerals.

rona
Download Presentation

Modular Multiplication: C = A * B mod M where A, B < M

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Improving Cryptographic Architectures by Adopting Efficient Adders in their Modular Multiplication Hardware Adnan Gutub, Hassan Tahhan Computer Engineering Department, King Fahd University of Petroleum & Minerals

  2. In many public-key encryption schemes (e.g., RSA, ElGamal & ECC), Modular Multiplication is a basic arithmetic operations heavily used. Modular Multiplication: C = A * B mod M where A, B < M Secure System very large operand size too expensive. Straightforward Method: Multiplication then modulus division. Modular Multiplication Operation M. M. A B M M. M. C

  3. Interleaving Multipl. and reduction P = 0 for i = n-1 to 0 { P = 2 * P if ( P  M ) P = P – M if ( bi = 1 ) { P = P + A if ( P  M ) P = P – M } } Interleaving In 1983, Blakley: Pi = 2 Pi-1 + bi A + q M In the literature, proposals to solve the magnitude comparison problem. Koc’s implementation based on carry-save adders. Partial products are represented as sum-carry pairs. The 5 MSBs of the pair is tested for sign estimation.

  4. Montgomery’s Method P’ = 0 for i= 0 to n-1 { P’ = P’ + a’i * B’ if ( p’0 = 1 ) P’ = P’ + M P’ = P’ / 2 } if ( P’ M ) P’ = P’ - M In 1985, Montgomery: Pi = Pi-1 + bi A + q M / 2 No full magnitude comparison is required. The correction step can be easily removed. However, pre and post calculations are needed in order to have the required result. As in the interleaving method, implementations based on carry-save adders are the most effective solutions. Montgomery

  5. High-Radix Method Speedups the modular multiplier by requiring less number of cycles. Area and time will increase. The reduction step will be the crucial operation. As the radix increases, it becomes more complex. Walter shows that there is a direct trade-off between the required space and the overall computation time. The AT factor is independent of the choice of the radix. The factor is expected to improve for radices that are not much larger than radix-2. High-Radix

  6. Comparison Between [6] and [18] Comparison

  7. Comparison Between [6] and [18] Comparison

  8. Comparison Between [6] and [18] Comparison

  9. Improvements on [6] Pipelining: Due to data dependency, the pipelining will not improve the throughput. However, the pipeline can be used to compute two separate operations simultaneously. Improvement

  10. Improvements on [6] Parallelism: The correction step at the end of the algorithm increases the algorithm complexity. At the hardware level, the correction step can be implemented using two options. By computing the two possible results in parallel, time will be saved. Improvement

  11. Binary Adders The last stage in both algorithms does full-length addition on the carry-sum pair which can be performed in hardware through binary adders. Statistics showed that 72% of the instructions perform additions in the data path of a prototypical RISC machine. The carry-lookahead adder and the carry-skip adder were compared in terms of time, area and power. Adders

  12. Carry-Lookahead Adder CLA The total delay of the carry-lookahead adder is (log n). There is a penalty paid for this gain: the area increases. The carry-lookahead adders require (n log n) area.

  13. Carry-Skip Adder The carry-skip adder has a simple and regular structure that requires an area in the order of (n) which is hardly larger then the area required by the ripple-carry adder. The time complexity of the carry-skip adder is bounded between (n1\2) and (log_n). An equal-block-size one-level carry-skip adder will have a time complexity of  (n1\2). However, a more optimized multi-level carry-skip adder will have a time complexity of O (log n). CSK

  14. CLA versus CSK Using 32-bit operands, a multi-level carry-skip adder was 14 % faster and its power dissipation was 58 % of that of the carry-lookahead adder. Using 64-bit operands, a one-level carry-skip adder was 38% slower and its power consumption is 68 % of the the carry-lookahead adder. Comparison

  15. Conclusion This work studied the modular multiplication problem over large operand sizes. Based on a survey, two implementations for modular multiplication algorithms were modeled using VHDL and synthesized. A time-area analysis of both implementations showed that Koc’s implementation has the potential to be an effective solution in terms of time and hardware requirements. This implementation was improved further. Carry-save adders give the maximum speedup in computing the partial products since. However, full-length addition on the sum-carry pair needs to be carried out at the last iteration through dedicated binary adder. Two binary adders were studied: the CLA and the CSK. Although the two adders can be of a comparable speed, the CSK requires smaller area and consumes much less power than the CLA. Conclusion

  16. The End Improving Cryptographic Architectures by Adopting Efficient Adders in their Modular Multiplication Hardware Adnan Gutub, Hassan Tahhan Computer Engineering Department, King Fahd University of Petroleum and Minerals Thank you

More Related