SHA Hash Functions History & Current State

SHA Hash Functions History & Current State Helsinki Institute for Information Technology, November 03, 2009. Sergey Panasenko, independent information security consultant, Moscow, Russia. serg@panasenko.ru www.panasenko.ru

SHA Hash Functions • Hash functions cryptanalysis review. • SHA (SHA-0) & SHA-1. • SHA-2. • SHA-3 project.

Section 1. Hash functions cryptanalysis review • typical hash function structure; • goals of hash functions cryptanalysis; • cryptanalysis methods.

Typical hash function structure Merkle-Damgård construction:

Primary goals of hash functions cryptanalysis Collision: m1 and m2 with the same hash: h = hash(m1) = hash(m2). Multicollision: several messages with the same hash. Theoretical time consumption: 2^(n/2) operations for an n-bit hash function.

Primary goals of hash functions cryptanalysis First preimage: an m such that, for a given h, hash(m) = h. Second preimage: an m2 ≠ m1 such that, for a given m1, hash(m2) = hash(m1). Theoretical time consumption: 2^n operations for an n-bit hash function.

Primary goals of hash functions cryptanalysis Secret key recovery – for keyed hash functions or hash functions in keyed mode. Theoretical time consumption: 2^k operations for a k-bit key.

Secondary goals of hash functions cryptanalysis Near-collision: m1 and m2 whose hash values differ in only a few bits: hash(m1) ≈ hash(m2). Pseudo-collision: m1 and m2 with the same hash but with different initial values: hash(m1, IV1) = hash(m2, IV2). Theoretical time consumption: 2^(n/2) operations for an n-bit hash function.

Secondary goals of hash functions cryptanalysis Pseudo-preimage: an m such that, for a given h, hash(IV, m) = h, where IV is a non-standard initial value. Theoretical time consumption: 2^n operations for an n-bit hash function.

Attacks on hash functions Brute-force attacks • Step-by-step searching over the target space. • They define the theoretical time consumption for any goal. • Can be used for finding collisions, preimages or secret keys. • Highly parallelizable. • Can be accelerated greatly by specific hardware. • Can be used in the context of other attacks. • They define suitable hash or key sizes.

Attacks on hash functions Dictionary attacks • A kind of brute-force attack on a reduced target space (e.g. the words of a dictionary). • Typical application: finding a password for a given hash value. • Offline work – precomputing a table for looking up the required password.

Attacks on hash functions Dictionary attacks The simplest case of tables: one hash for every password.

Attacks on hash functions Dictionary attacks Hash chains – reducing the memory (Martin Hellman, 1980): p1 → h1 → p2 → h2 → … → pN → hN
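The chain idea can be sketched in a few lines. This is only a toy illustration, not Hellman's original construction: the password space (4-digit PINs), the use of SHA-1 as the hash, and the reduction function R below are all assumptions chosen to keep the example small. Only the start and end of each chain are stored; the rest is recomputed during lookup.

```python
import hashlib

PW_LEN = 4  # assumption: toy password space of 4-digit PINs


def H(password: str) -> bytes:
    """Hash step of the chain (SHA-1 is used here only as an example)."""
    return hashlib.sha1(password.encode()).digest()


def R(digest: bytes) -> str:
    """Hypothetical reduction function: maps a hash back into the password space."""
    n = int.from_bytes(digest[:8], "big") % (10 ** PW_LEN)
    return f"{n:0{PW_LEN}d}"


def build_table(starts, chain_len):
    """Store only endpoint -> list of chain starts; inner links are discarded."""
    table = {}
    for start in starts:
        p = start
        for _ in range(chain_len):
            p = R(H(p))
        table.setdefault(p, []).append(start)
    return table


def lookup(table, target: bytes, chain_len):
    """Try to find a password whose hash equals target."""
    h = target
    for _ in range(chain_len):
        p = R(h)
        # An endpoint hit means the target may lie on one of these chains.
        for start in table.get(p, []):
            q = start
            for _ in range(chain_len):
                if H(q) == target:
                    return q
                q = R(H(q))
        h = H(p)  # false alarm or no hit: keep walking along the chain
    return None
```

With three chains of length 50, the table holds three (start, end) pairs but covers up to 150 passwords, at the price of recomputing a chain on every endpoint hit.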

Attacks on hash functions Dictionary attacks Hash chains – collision example:

Attacks on hash functions Dictionary attacks Strengthening hash chains: • Several tables with different R-functions. • Variable-length chains.

Attacks on hash functions Dictionary attacks. Rainbow tables Different R-functions R1…RN-1, one for each column of the chains: • cyclic chains are impossible; • a collision makes two chains coincide only when it occurs in the same column, and that can be detected.

Attacks on hash functions Dictionary attacks. Rainbow tables Invented by Philippe Oechslin in 2003. Can be further strengthened by combining with variable-length chains. In active use for cracking real systems: • http://project-rainbowcrack.com; • http://lasecwww.epfl.ch; • http://www.freerainbowtables.com.

Attacks on hash functions Dictionary attacks. Rainbow tables Countermeasures: • Salt – randomizing hashing; • Increasing the time to hash – e.g. multiple hashing. Example: Niels Provos & David Mazières (1999) – bcrypt hash function. Uses salt & cost variables. Cost defines the number of internal block cipher key expansion rounds: 2^(cost+1) + 1
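Both countermeasures can be illustrated with the Python standard library. bcrypt itself is not part of the standard library, so the sketch below swaps in PBKDF2-HMAC-SHA256 as a stand-in; the function names and the mapping from cost to 2^cost iterations are assumptions made for the example, not bcrypt's actual schedule.

```python
import hashlib
import hmac
import os


def hash_password(password: str, cost: int = 12, salt=None):
    """Salted, deliberately slow hashing (PBKDF2 used as a stand-in for bcrypt)."""
    if salt is None:
        salt = os.urandom(16)  # a fresh random salt defeats precomputed tables
    iterations = 2 ** cost     # assumption: work grows exponentially with cost
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, cost, digest


def verify_password(password: str, salt: bytes, cost: int, digest: bytes) -> bool:
    """Recompute with the stored salt and cost, then compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 2 ** cost)
    return hmac.compare_digest(candidate, digest)
```

Because the salt differs per password, one precomputed table no longer covers all users, and each unit increase in cost doubles the attacker's work per guess.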

Attacks on hash functions Birthday paradox “Square root attack”: about O(√N) tries are required to find two coinciding elements among values drawn from a set of N elements. Application to hash functions (Gideon Yuval, 1979): • An adversary prepares r variants of a fraudulent document f and r variants of the original document m. • He searches among these variants for an mx and fy such that hash(mx) = hash(fy). • The user signs mx, but the signature also verifies correctly for fy.
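Yuval's attack can be tried out on a deliberately weakened hash. The sketch below is an illustration under stated assumptions: the hash is SHA-1 truncated to 20 bits so that the search finishes quickly, and document "variants" are produced by appending a counter, standing in for harmless edits such as extra spaces.

```python
import hashlib


def short_hash(data: bytes, bits: int = 20) -> int:
    """SHA-1 truncated to its top `bits` bits (a toy, deliberately weak hash)."""
    full = int.from_bytes(hashlib.sha1(data).digest(), "big")
    return full >> (160 - bits)


def yuval_attack(original: bytes, fraud: bytes, bits: int = 20,
                 max_variants: int = 1 << 16):
    """Search variants m_x of the original and f_y of the fraud with equal hashes."""
    seen_m = {}  # hash -> variant of the original document
    seen_f = {}  # hash -> variant of the fraudulent document
    for i in range(max_variants):
        suffix = b" " + str(i).encode()  # assumption: a harmless variation
        m_x = original + suffix
        f_y = fraud + suffix
        hm = short_hash(m_x, bits)
        hf = short_hash(f_y, bits)
        seen_m[hm] = m_x
        if hm in seen_f:
            return m_x, seen_f[hm]
        seen_f[hf] = f_y
        if hf in seen_m:
            return seen_m[hf], f_y
    return None
```

For an n-bit hash, roughly 2^(n/2) variants of each document are expected before a match appears, which is why the full 160-bit output is out of reach for this direct approach.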

Attacks on hash functions Collision search Another variant of hash chains: mi → hash(mi) → hash(hash(mi)) → … All hash values are compared with previous values and with values of other chains. Disadvantage: huge memory requirements. Jean-Jacques Quisquater, Jean-Paul Delescaille, 1987: store distinguished points only. A coincidence of distinguished points signals that a collision has been found. Low memory requirements.
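The distinguished-point idea can be sketched on a toy function. Everything concrete here is an assumption for illustration: the iterated function is SHA-1 truncated to 24 bits, a point is "distinguished" when its low 6 bits are zero, and chains start from consecutive integers. Only one (start, length) pair per distinguished point is kept in memory.

```python
import hashlib

BITS = 24     # assumption: toy 24-bit hash so that the search is fast
DP_BITS = 6   # assumption: a point is distinguished if its low 6 bits are zero


def f(x: int) -> int:
    """Toy iterated function: SHA-1 of a 4-byte value, truncated to BITS bits."""
    d = hashlib.sha1(x.to_bytes(4, "big")).digest()
    return int.from_bytes(d, "big") >> (160 - BITS)


def is_distinguished(x: int) -> bool:
    return x & ((1 << DP_BITS) - 1) == 0


def walk(start: int, max_len: int):
    """Iterate f from start until a distinguished point is reached."""
    x = start
    for length in range(1, max_len + 1):
        x = f(x)
        if is_distinguished(x):
            return x, length
    return None  # probably trapped in a cycle without distinguished points


def locate(start_a, len_a, start_b, len_b):
    """Replay two chains ending in the same point to find where they merge."""
    a, b = start_a, start_b
    while len_a > len_b:
        a, len_a = f(a), len_a - 1
    while len_b > len_a:
        b, len_b = f(b), len_b - 1
    if a == b:
        return None  # one start lies on the other chain: no collision
    while f(a) != f(b):
        a, b = f(a), f(b)
    return a, b


def find_collision(max_chains: int = 100000):
    seen = {}  # distinguished point -> (chain start, chain length)
    for start in range(1, max_chains):
        result = walk(start, 20 * (1 << DP_BITS))
        if result is None:
            continue
        dp, length = result
        if dp in seen:
            pair = locate(start, length, *seen[dp])
            if pair is not None:
                return pair
        else:
            seen[dp] = (start, length)
    return None
```

Memory grows with the number of chains rather than the number of hash evaluations; the price is the replay of two chains once their distinguished points coincide.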

Attacks on hash functions Collision search Michael Wiener and Paul van Oorschot, 1994: parallel collision search with specific values:

Attacks on hash functions Birthday paradox & collision search • Mihir Bellare and Tadayoshi Kohno, 2004: “amount of regularity” of hash functions – a measure of how regular the output value distribution is. The less regular the distribution, the easier it is to find a collision. • Bart Preneel, 2003: hash value size analysis. 160 bits are enough for at least 20 years.

Attacks on hash functions Differential cryptanalysis Florent Chabaud & Antoine Joux, 1998: SHI1 algorithm:

Attacks on hash functions Differential cryptanalysis

Attacks on hash functions Differential cryptanalysis Result: propagation of the difference is cancelled by the corrected bits. After 6 iterations the difference is 0. This is a 6-round local collision: two messages differ in 6 bits (after expansion) but lead to the same hash value. Next step: construct messages that expand with the required difference. Attackers use a disturbance vector – a table showing which bits of the messages must differ to achieve the collision.

Attacks on hash functions Differential cryptanalysis F. Chabaud & A. Joux: SHI1 – SHI2 – SHI3 – SHA. Non-linear operations are introduced into the iterations step by step. From deterministic to probabilistic constructions: the same principles of attack can be applied to the real SHA algorithm.

Attacks on hash functions Boomerang attack Invented by David Wagner for block ciphers in 1999. Applied to hash functions (SHA & SHA-1) by Antoine Joux and Thomas Peyrin, 2007. Boomerang attack uses one or more auxiliary differences besides the main difference. This significantly improves the probability of finding collisions.

Attacks on hash functions Boomerang attack

Attacks on hash functions Algebraic cryptanalysis Uses algebraic properties of an algorithm. Successfully applied to block ciphers (e.g. works of Nicolas Courtois against AES). Can be used in the context of other attacks. Example: Makoto Sugita, Mitsuru Kawazoe, Hideki Imai (2006) attacked reduced-round SHA-1 by combining algebraic and differential cryptanalysis.

Attacks on hash functions Message modification Xiaoyun Wang, Hongbo Yu, 2005: step-by-step modification of the message to meet the criteria for differential cryptanalysis. The message modification technique speeds up the collision search by fulfilling the required criteria for internal variables.

Attacks on hash functions Meet in the middle attack Can be applied when a function can be represented as two subfunctions applied in sequence and the second subfunction is invertible.

Attacks on hash functions Meet in the middle attack Finding a preimage for a hash value H: 1. Compute hash1() for variants of the first half of the message and store the results in a table: Tx = hash1(M1x, IV). 2. Compute the inverted hash2() for variants of the second half of the message: Ty = hash2^-1(M2y, H). 3. Search for equal Tx and Ty.
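The three steps can be sketched on a toy hash with a 16-bit intermediate state. The construction below is purely hypothetical: hash1 is SHA-1 truncated to 16 bits, hash2 is a simple add-rotate-xor step chosen because it is easy to invert, and IV is an arbitrary constant. It is meant to show the table-and-match structure, not an attack on a real function.

```python
import hashlib

MASK = 0xFFFF
IV = 0x1234  # assumption: arbitrary 16-bit initial value for the toy hash


def rotl16(x: int, r: int) -> int:
    return ((x << r) | (x >> (16 - r))) & MASK


def rotr16(x: int, r: int) -> int:
    return ((x >> r) | (x << (16 - r))) & MASK


def hash1(m1: int, iv: int) -> int:
    """First subfunction (not invertible): truncated SHA-1 of iv || m1."""
    data = iv.to_bytes(2, "big") + m1.to_bytes(2, "big")
    return int.from_bytes(hashlib.sha1(data).digest()[:2], "big")


def hash2(m2: int, t: int) -> int:
    """Second subfunction: invertible in t once m2 is fixed."""
    return rotl16((t + m2) & MASK, 5) ^ m2


def hash2_inv(m2: int, h: int) -> int:
    """Inverse of hash2 with respect to the intermediate value t."""
    return (rotr16(h ^ m2, 5) - m2) & MASK


def toy_hash(m1: int, m2: int) -> int:
    return hash2(m2, hash1(m1, IV))


def mitm_preimage(target: int, n1: int = 1 << 10, n2: int = 1 << 12):
    # Step 1: forward table Tx = hash1(M1x, IV).
    table = {hash1(m1, IV): m1 for m1 in range(n1)}
    # Steps 2 and 3: backward values Ty = hash2^-1(M2y, H), matched against Tx.
    for m2 in range(n2):
        t = hash2_inv(m2, target)
        if t in table:
            return table[t], m2
    return None
```

The table costs n1 evaluations and the backward pass at most n2, while together they test n1 × n2 message pairs; that trade of memory for time is the point of the attack.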

Attacks on hash functions Correcting blocks Allow finding preimages or collisions. Example for collisions: 1. Select arbitrary messages M and M*. 2. Find correcting blocks X and X* such that: hash(M || X) = hash(M* || X*).

Attacks on hash functions Fixed points A fixed point occurs when it is possible to find a message block Mi such that hash(M) = hash(M || Mi), i.e. the intermediate hash value remains the same after processing block Mi. Can be used for finding collisions.

Attacks on hash functions Block-level manipulations • inserting, • removing, • permutation, • substitution of message blocks without affecting the hash value.

Attacks on hash functions Two-block collisions Eli Biham et al., 2004:

Attacks on hash functions Multi-block collisions

Attacks on hash functions Specific attacks on block cipher based hash functions Allow finding collisions based on weaknesses of the underlying block cipher: • weak keys, • equivalent keys, • groups of keys, • related-key attacks.

Attacks on hash functions Side-channel attacks This group of attacks was introduced by Paul Kocher in 1996. Passive side-channel attacks (an adversary only reads side-channel information): • Electromagnetic attacks. • Power attacks (simple & differential). • Timing attacks. • Error-message attacks.

Attacks on hash functions Side-channel attacks Active side-channel attacks (an adversary influences the hash function implementation): • Optical, radiation or heating attacks. • Spike & glitch attacks. • Fault attacks (simple and differential). • Hardware modification.

Attacks on hash functions Side-channel attacks Countermeasures: • Constant time consumption of operations. • Inserting random delays, noise, random variables, redundant computations etc. • Error messages without extra information. • Duplicating calculations and comparing their results. • Shielding. • Detection of external interference.

Attacks on hash functions Other cryptanalytic methods • Using neutral bits (Eli Biham & Rafi Chen, 2004) – bits of a message that do not influence final or intermediate results during some rounds. • Attacks that exploit specifics of hash function implementations in network protocols, signature schemes etc. • Length-extension attack – appending some data to the end of a message to find a collision.

Section 2. SHA & SHA-1 • SHA structure; • SHA-1 structure; • SHA cryptanalysis; • SHA-1 cryptanalysis.

SHA Overview Secure Hash Algorithm. Designed by the U.S. National Security Agency in 1992. U.S. hashing standard in 1993-1995 (FIPS 180). Mandatory for U.S. federal departments and agencies when hashing non-classified information. Recommended for commercial organizations. Renamed SHA-0 after SHA-1 was introduced.

SHA High-level structure 160-bit hash value. Input data size – from 0 to (2^64 - 1) bits. Merkle-Damgård construction with 512-bit data blocks. Last block is always padded by: • “1” bit; • zero bits when required; • 64-bit input data length in bits.
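The padding rule can be written down directly. The sketch below assumes the message is a whole number of bytes (the standard is defined on bits) and that the length field is big-endian, as in the FIPS 180 family; the function name is illustrative.

```python
def pad(message: bytes) -> bytes:
    """Pad a byte-aligned message to a multiple of 512 bits (64 bytes)."""
    bit_len = len(message) * 8
    padded = message + b"\x80"                        # the single "1" bit
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)  # zero bits when required
    padded += bit_len.to_bytes(8, "big")              # 64-bit length in bits
    return padded
```

A 55-byte message still fits into one block, while a 56-byte message needs a second block because the "1" bit and the length field no longer fit.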

SHA Message block expansion • The 512-bit block is represented as 32-bit words W0…W15. • The following 32-bit words W16…W79 are calculated: Wn = Wn-3 ⊕ Wn-8 ⊕ Wn-14 ⊕ Wn-16.

SHA Compression function 80 iterations:

SHA Compression function fi functions: f(x, y, z) = (x & y) | (~x & z), i = 0…19; f(x, y, z) = x ⊕ y ⊕ z, i = 20…39, 60…79; f(x, y, z) = (x & y) | (x & z) | (y & z), i = 40…59.

SHA Chaining and finalization Intermediate hash values: 32-bit registers A…E. Chaining by addition modulo 2^32: A = A + a; B = B + b, etc. No finalization is performed: the output hash value is the concatenation of A…E after processing all message blocks.

SHA-1 Overview & high-level structure U.S. hashing standard since 1995 (FIPS 180-1, FIPS 180-2). Will be withdrawn (for some applications) in 2010. All procedures are the same as in the SHA algorithm, except for the message block expansion.
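To make the single difference concrete, here is a compact sketch of both algorithms in one function. The initial register values, the round constants and the 1-bit left rotation that SHA-1 adds to the message expansion are taken from the FIPS 180 family rather than from these slides; the function name and the rotate_expansion flag are illustrative, and the message is assumed to be byte-aligned.

```python
import struct


def rotl32(x: int, r: int) -> int:
    return ((x << r) | (x >> (32 - r))) & 0xFFFFFFFF


def sha(message: bytes, rotate_expansion: bool) -> bytes:
    """rotate_expansion=False sketches SHA (SHA-0); True sketches SHA-1."""
    state = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0]
    # Padding: "1" bit, zero bits, 64-bit message length in bits.
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    padded += (len(message) * 8).to_bytes(8, "big")
    for off in range(0, len(padded), 64):
        # Message block expansion: 16 words -> 80 words.
        w = list(struct.unpack(">16I", padded[off:off + 64]))
        for n in range(16, 80):
            x = w[n - 3] ^ w[n - 8] ^ w[n - 14] ^ w[n - 16]
            w.append(rotl32(x, 1) if rotate_expansion else x)
        a, b, c, d, e = state
        for i in range(80):
            if i < 20:
                f, k = (b & c) | (~b & d), 0x5A827999
            elif i < 40:
                f, k = b ^ c ^ d, 0x6ED9EBA1
            elif i < 60:
                f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
            else:
                f, k = b ^ c ^ d, 0xCA62C1D6
            temp = (rotl32(a, 5) + f + e + k + w[i]) & 0xFFFFFFFF
            a, b, c, d, e = temp, a, rotl32(b, 30), c, d
        # Chaining by addition modulo 2^32; no separate finalization.
        state = [(s + v) & 0xFFFFFFFF for s, v in zip(state, (a, b, c, d, e))]
    return struct.pack(">5I", *state)
```

With rotate_expansion=True the output can be compared against hashlib.sha1 as a sanity check; with False, the same code runs the original expansion without the rotation.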
