1 / 77

Martin Novotn ý , Andy Rupp Ruhr University Bochum

Application of FPGA Design: Design Challenges for Implementing Realtime A5/1 Attack with Precomputation Tables. Martin Novotn ý , Andy Rupp Ruhr University Bochum. Outline. A5/1 cipher Time-Memory Trade-off Tables Original Hellman Approach Distinguished points TMTO with multiple data

Download Presentation

Martin Novotn ý , Andy Rupp Ruhr University Bochum

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Application of FPGA Design: Design Challenges for Implementing Realtime A5/1 Attack with Precomputation Tables Martin Novotný, Andy Rupp Ruhr University Bochum

  2. Outline • A5/1 cipher • Time-Memory Trade-off Tables • Original Hellman Approach • Distinguished points • TMTO with multiple data • Rainbow tables • Thin-rainbow tables • Architecture of the A5/1 TMTO engine – table generation • Implementation results

  3. Outline • A5/1 cipher • Time-Memory Trade-off Tables • Original Hellman Approach • Distinguished points • TMTO with multiple data • Rainbow tables • Thin-rainbow tables • Architecture of the A5/1 TMTO engine – table generation • Implementation results

  4. A5/1 Cipher • Encrypts GSM communication • GSM communication organized in frames • 1 frame = 114 bits in each direction • Stream cipher • produces the keystream KS being xored with the plaintext P to form ciphertext C C = PKS A5/1 010010011101010010101 P 111101101001000000000 C A5/1 101111110100010010101 KS

  5. Architecture of A5/1 Cipher • 3 linear feedback shift registers (LFSRs) • LFSRs irregularly clocked • the register is clocked iff its clocking bit (yellow) is equal to the majority of all 3 clocking bits  at least 2 registers are clocked in each cycle

  6. Algorithm of A5/1 • Reset all 3 registers • (Initialization) Load 64 bits of key K + 22 bits of frame number FN into 3 registers • K and FN xored bit-by-bit to the least significant bits • registers clocked regularly • (Warm-up) Clock for 100 cycles and discard the output • registers clocked irregularly • (Execution) Clock for 228 cycles, generate 114+114 bits (for each direction) • registers clocked irregularly • Repeat for the next frame

  7. Cryptanalysis of stream ciphers with known plaintext • From the ciphertext C and known plaintext P compute keystream KS: • KS = PC • Keystream KS is a function of: • key K: KS = f(K) • internal state L: KS = g(L) • (internal state = content of all registers) • Reset all 3 registers • (Initialization) Load 64 bits of key K + 22 bits of frame number FN into 3 registers • (Warm-up) Clock for 100 cycles and discard the output • (Execution) Clock for 228 cycles, generate 114+114 bits (for each direction) We can skip Initialization and Warm-up!!!

  8. L Cryptanalysis of A5/1 • For (known) keystream KS find the internal state L • When L found, track the A5/1 cipher back through Warm-up phase and Initialization to get the key K. • Reset all 3 registers • (Initialization) Load 64 bits of key K + 22 bits of frame number FN into 3 registers • (Warm-up) Clock for 100 cycles and discard the output • (Execution) Clock for 228 cycles, generate 114+114 bits (for each direction)

  9. L0 L1 L2 L3 KS0 KS1 KS2 KS3 Cryptanalysis of A5/1 • Internal state L has 64 bits  we need (at least) 64 bits of keystream KS One A5/1 frame has 114 bits  we can make samples KSi 0100111101101010110100101010010100010010100010011110110001 It is sufficient to find any Li

  10. Outline • A5/1 cipher • Time-Memory Trade-off Tables • Original Hellman Approach • Distinguished points • TMTO with multiple data • Rainbow tables • Thin-rainbow tables • Architecture of the A5/1 TMTO engine – table generation • Implementation results

  11. Brute force attack Check all combinations of a key K online timeT = N = 2k memory M = 1 Table lookup (For a given plaintext P) All pairs key-ciphertext {Ki, Ci} precomputed and stored (sorted by C) Online phase: Look-up C in the table (and find K) time T = 1 memoryM = N = 2k Two extreme approaches

  12. Time-Memory Trade-Off (Hellman, 1981) • Compromises the above two extreme approaches • Precomputation phase: For a given plaintext P: • precompute (ideally all) pairs key-ciphertext {Ki, Ci}; • store only some of them in the table. • Online phase: • Perform some computations; • lookup the table and find the key K. • time T = N2/3 • memory M = N2/3

  13. Idea: Encryption function E is a pseudo-random function C = EK(P) Pairs {Ki, Ci} organized in chains Ci is used to create a key Ki+1 for the next step E is pseudo-random  we perform a pseudo-random walk in the keyspace R – reduction function (DES: C has 64 bits, K has 56 bits) f – step function f(x) = R(Ex(P)) P E K C plaintext P is the same End Point Start Point P P P f f f K2 C2 C1 K3 Kt EP Ct K1 SP = E E E R R R 1234 7A3D 28DF B05B 8EC0 Precomputation (offline) phase

  14. 1234 SP1 = k10 fk11 fk12 f … fk1t-1 fk1t = EP18EC0 1235 SP2 = k20 fk21fk22f … fk2t-1fk2t = EP22A1B 1236 SP3 = k30fk31fk32f … fk3t-1 fk3t = EP34D3C … … … 9999 SPm = km0fkm1fkm2f … fkmt-1 fkmt = EPm02E3 m chains with a fixed length tgenerated Only pairs {SPi, EPi} stored (sorted by EP)  reducing memory requirements P P P f f SPj = kj0 kjt = EPj kj1 kj2 kjt-1 E E R R f E R Precomputation (offline) phase

  15. y1 C R f Lookup: = EPi? K E f f f Online phase • Given C. (and P) • … try to find K, such that C = EK(P)  SPi

  16. y1 C R Lookup: = EPi? Online phase • Given C. (and P) • … try to find K, such that C = EK(P) 

  17. y1 C R y2 f Online phase • Given C. (and P) • … try to find K, such that C = EK(P)

  18. y1 C R Lookup: y2 = EPi? f Online phase • Given C. (and P) • … try to find K, such that C = EK(P) 

  19. y1 C R y2 y3 f f Online phase • Given C. (and P) • … try to find K, such that C = EK(P)

  20. y1 C R Lookup: y2 y3 = EPi? f f Online phase • Given C. (and P) • … try to find K, such that C = EK(P) 

  21. y1 C R y2 y3 y4 f f f Online phase • Given C. (and P) • … try to find K, such that C = EK(P)

  22. y1 C R Lookup: y2 y3 y4 = EPi? f f f Online phase • Given C. (and P) • … try to find K, such that C = EK(P) 

  23. y1 C R f Lookup: y2 y3 y4 = EPi? K E f f f f Online phase • Given C. (and P) • … try to find K, such that C = EK(P)  SPi

  24. Birthday paradox problem • m chains of fixed length t generated • R is not bijective ⇒ some kijcollide. Collisions yield in chain merges or in cycles in chains • Matrix stopping rule: Hellman proved that it is not worth to increase • number of chains m or • length of chain t • beyond the point at which • m ×t2 = N • (the coverage of keyspace does not increase too much then)

  25. SP1 … … … … … EP1 SP2 … … … … … EP2 SP3 … … … … … EP3 … … … SP200 … … … ... EP200 … … … SP1 … … … … … EP1 SP2 … … … … … EP2 SP3 … … … … … EP3 … … … SP200 … … … ... EP200 … … … SP1 … … … … … EP1 SP2 … … … … … EP2 SP3 … … … … … EP3 … … … SP200 … … … ... EP200 … … … SP1 … … … … … EP1 SP2 … … … … … EP2 SP3 … … … … … EP3 … … … SP200 … … … ... EP200 … … … Birthday paradox problem • Matrix stopping rule: • m ×t2 = N • Recommendation: To use r tables, each with different reduction (re-randomization) function R • Since also N = mtr, then r = t • Hellman recommends m = t = r = N1/3

  26. Hellman TMTO – Complexity • Precomputation phase • Precomputation time PT = mtr = N(e.g. 260) • Memory M = mr = N2/3(e.g. 240 ) • Online phase • Memory M = N2/3 • Online time T = t r = t2 = N2/3(e.g. 240 ) • Table accesses TA = T = N2/3(e.g. 240 )

  27. Hellman TMTO – Complexity • Precomputation phase • Precomputation time PT = mtr = N(e.g. 260) • Memory M = mr = N2/3(e.g. 240 ) • Online phase • Memory M = N2/3 • Online time T = t r = t2 = N2/3(e.g. 240 ) • Table accesses TA = T = N2/3 (e.g. 240 ) 34 years (1 disk access ~ 1 ms)

  28. Outline • A5/1 cipher • Time-Memory Trade-off Tables • Original Hellman Approach • Distinguished points • TMTO with multiple data • Rainbow tables • Thin-rainbow tables • Architecture of the A5/1 TMTO engine – table generation • Implementation results

  29. Distinguished points (DP)(Rivest, ????) • Slight modification of original Hellman method • Goal: To reduce the number of table accesses TA (in Hellman TA = N2/3) • Distinguished point is a point of a certain property (e.g. 20 most significant bits are equal to 0). • 000000000000000000000010101001101100101010010111110010110101

  30. Distinguished Points (DP)Precomputation phase • Chains are generated until the distinguished point (DP) is reached • if the chain exceeds maximum length tmax, then it is discarded and the next chain is generated • the chain is also discarded if the DP has been reached, but the chain is too short tmin (to have better coverage) • Triples {SPj, EPj, lj} stored, sorted by EP (ljis a length of the chain) • 1234 SP1 = k10 fk11 f … … … … … … … fk1u = EP1 0EC0 • 1235 SP2 = k20 fk21f … … fk2v = EP2 0A1B • 1236 SP3 = k30fk31f … … … … … fk3w = EP3 043C • … … … • 9999 SPm = km0fkm1f … … … fkmz = EPm 02E3 chains have different lengths End Points are DP

  31. y1 C R f Lookup: y3 y4 y2 = EPi? K E f f f f 043C Distinguished Points (DP)Online phase • There is 1 distinguished point per chain – the End Point • End Point Distinguished Point • Algorithm: • computeyi+1 = f(yi) iteratively until the DP is reached (or the maximum length tmax is exceeded) • then lookup (just once per table) (if tmax is exceeded, do not lookup at all) • Advantages • Table accesses TA = r = N1/3 (c.f. TA = t r = N2/3 in original Hellman) • Chain loops are not possible  SPi

  32. Outline • A5/1 cipher • Time-Memory Trade-off Tables • Original Hellman Approach • Distinguished points • TMTO with multiple data • Rainbow tables • Thin-rainbow tables • Architecture of the A5/1 TMTO engine – table generation • Implementation results

  33. L0 L1 L2 L3 KS0 KS1 KS2 KS3 TMTO with multiple data(Biryukov & Shamir, 2000) • Important for stream ciphers: To reveal an internal state Li having k bits we need only k bits of a keystream KSi • 0100111101101010110100101010010100010010100010011110110001 • Having D data samples of the ciphertext C (or the keystreamKS) we have D times more chances to find the key K (or the internal stateL) •  We calculate r/D tables only  we reduce the precomputation time PT and the memory M× online time T and #table access TA remain unchanged

  34. L0 L1 L2 L3 KS0 KS1 KS2 KS3 TMTO with multiple dataA5/1 • 1 frame: 114 bits • Internal state: 64 bits •  114 – 64 +1 = 51 data samples from 1 frame (each sample has 64 bits) • D = 51 • We calculate D times less tables ( save memory, save time) 0100111101101010110100101010010100010010100010011110110001

  35. Outline • A5/1 cipher • Time-Memory Trade-off Tables • Original Hellman Approach • Distinguished points • TMTO with multiple data • Rainbow tables • Thin-rainbow tables • Architecture of the A5/1 TMTO engine – table generation • Implementation results

  36. Idea: to use different reduction/re-randomization function Ri in each step of chain generation, hence the step functions are: f1f2f3 … ft-1ft Online phase: Compute y1 = Rt(C), compare with EPs, if no match, then Compute y2 = ft(Rt-1(C)), compare with EPs, if no match, then Compute y3 = ft(ft-1(Rt-2(C))), compare with EPs, if no match, then … P P P f1 f2 SPj = kj0 kjt = EPj kj1 kj2 xjt-1 E E R1 R2 ft E Rt Rainbow tables(Oechslin, 2003)

  37. Rainbow tables • Just one table (or only several tables) generated, • m = N2/3 (t reduction functions used ⇒ the table can be t times longer), • t = N1/3 • Advantages • chain loops impossible • point collisions lead to chain merges only if the equal points appear in the same position of the chain • online time T about ½ of the online time of original Hellman (for single data) • number of table accesses the same like for the Hellman+DP method (for single data) • Disadvantages • Inferior to the Hellman+DP method in the case of multiple data (D > 1)(online time T and the number of table accesses TA are D-times greater)

  38. Outline • A5/1 cipher • Time-Memory Trade-off Tables • Original Hellman Approach • Distinguished points • TMTO with multiple data • Rainbow tables • Thin-rainbow tables • Architecture of the A5/1 TMTO engine – table generation • Implementation results

  39. 1st 2nd kth Thin-rainbow tables • The way to cope with the rainbow tables when having multiple data • The sequence of S different reduction functions fi is applied k-times periodically in order to create a chain: • f1f2f3 … fS-1fSf1f2f3 … fS-1fS … … … f1f2f3 … fS-1fS • Chain length • t = S×k

  40. 1st 2nd kth Thin-rainbow tables + DP (to reduce # table accesses TA) • DP criterion is checked after each fS • f1f2f3 … fS-1fSf1f2f3 … fS-1fS … … … f1f2f3 … fS-1fS • We store only chains for which kmin < k < kmax DP ? DP ? DP ? DP ?

  41. Hellman + DP DP-criterion checked after each step-function f Thin-rainbow + DP DP-criterion checked after fSonly  simpler HW, better time/area product Candidates for implementation(in case of multiple data, D>1) Both have the same precomputation complexity Both have comparable online time T and # table accesses TA

  42. 1st 2nd kth Thin-rainbow tables + DP (to reduce # table accesses TA) • DP criterion is checked after each fS • f1f2f3 … fS-1fSf1f2f3 … fS-1fS … … … f1f2f3 … fS-1fS • We store only chains for which kmin < k < kmax DP ? DP ? DP ? DP ?

  43. Outline • A5/1 cipher • Time-Memory Trade-off Tables • Original Hellman Approach • Distinguished points • TMTO with multiple data • Rainbow tables • Thin-rainbow tables • Architecture of the A5/1 TMTO engine – table generation • Implementation results

  44. Pipeline? Array of small computing elements? Implementation choices

  45. Slice – FPGAs Basic Building Block Look-up Table Flip- flop Look-up Table Flip- flop Implements combinational logic (any logic function of 4 variables) It is RAM 16x1 (holds the truth-table)

  46. LUT – configuration choices Look-up Table Flip- flop Look-up Table Flip- flop • Can be configured as: • LUT (function generator) • RAM 16x1 • SRL16 (upto 16-bit shift register)

  47. Pipeline? All A5/1 bits should have been accessible in parallel max. 240 A5/1 units(64x FF/unit)(and no control unit, …) Array of small computing elements? LFSRs can be implemented using SRL16 (1 LUT config. as up to 16-bit shift register) max. 480 A5/1 units(8x SRL16 + 5x FF/unit)(enough FFs for control unit, …) Implementation choices

  48. A5/1 TMTO basic element • Calculates one chain • Two-stroke mode: • core #1 generates keystream, core #2 is loaded • core #2 generates keystream, core #1 is loaded

  49. A5/1 TMTO basic element First, the start point SPj is loaded to core #1

  50. A5/1 TMTO basic element … then one rainbow period f1f2f3 … fS-1fS is performed … In odd steps: Core #1 generates keystream … ... that is re-randomized … … and loaded to core # 2 as a new internal state

More Related