Universal hash families
1 / 28

Universal Hash Families - PowerPoint PPT Presentation

  • Uploaded on

Universal Hash Families. Universal hash families. Family of hash functions Finite multiset H of string-valued functions, each h ∈ H having the same nonempty domain A ⊆{0,1} * and range B ⊆{0,1} * , for some constant b

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
Download Presentation

PowerPoint Slideshow about 'Universal Hash Families' - armina

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript

Universal hash families1
Universal hash families

  • Family of hash functions

    • Finite multiset H of string-valued functions, each h ∈ H having the same nonempty domain A ⊆{0,1}* and range B ⊆{0,1}*, for some constant b

  • [Definition] -almost universal2 (-AU2) & -almost XOR universal2 (-AXU2)

    • A family of hash functions H ={h: A{0,1}b} is -almost universal2 , written -AU2, if, for all distinct x, x’∈A, Prh∈ H [h(x)=h(x’)] .

    • The family of hash functions H is -almost XOR universal2 , written -AXU2, if, for all distinct x, x’∈A, and for all c∈{0,1}b, Prh∈ H [h(x)h(x’)=c] .

      •  =maxxx’{Prh [h(x)=h(x’)]} : collision probability

    • Principle measure of an AU2 : How small its collision probability is and how fast one can compute its functions

Composition of universal hash families
Composition of universal hash families

  • Make the domain of a hash family bigger

    • H ={h: {0,1}a{0,1}b}

    • Hm ={h: {0,1}am{0,1}bm}

      • its elements are the same as in H where h(x1 x2… xm), for | xi |=a, is defined by h(x1)|| h(x2)||…|| h(xm)

    • [Proposition] If H is -AU2, then Hm is -AU2

  • Make the collision probability smaller

    • H1 ={h: A{0,1}b1}, H2 ={h: A{0,1}b2}

    • H1 & H2 ={h: A{0,1}b1+ b2}

      • its elements are pairs of functions (h1, h2)∈H1H2 and where (h1, h2)(x) is defined as h1(x)||h2(x)

    • [Proposition] If H1 is 1-AU2 and H2 is 2-AU2, then H1 & H2 is 12-AU2

Composition of universal hash families cont
Composition of universal hash families(cont.)

  • Make the image of a hash function shorter

    • H1 ={h: {0,1}a{0,1}b}, H2 ={h: {0,1}b{0,1}c}

    • H2○H1 ={h: {0,1}a{0,1}c}

      • its elements are pairs of functions (h1, h2)∈H1H2 and where (h1, h2)(x) is defined as h2(h1(x))

    • [Proposition] If H1 is 1-AU2 and H2 is 2-AU2, then H2○H1 is (1+2)-AU2

  • Turn an AU2 family H1 and AXU2 family H2 into an AXU2 family H2○H1

    • [Proposition] Suppose H1 ={h: AB} is 1-AU2 and H2 ={h: BC} is 2-AXU2. Then H2○H1 ={h: AC} is (1+2)-AXU2

Related researches
Related researches

  • Carter-Wegman(1979, 1981)

    • Efficient authentication code under strongly universal hash functions

    • Key observations

      • Long messages can be authenticated efficiently using short keys if the number of bits in the authentication tag is increased slightly compared to ‘perfect’ schemes

      • If a message is hashed to a short authentication tag, weaker properties are sufficient for the first stage of the compression

      • Under certain conditions, the hash function can remain the same for many plaintexts, provided that hash results is encrypted using a one-time pad

  • Stinson(1994)

    • Improves the works by Wegman-Carter and establishes an explicit link between authentication codes and strongly universal hash functions

  • Johansson-Kabatianskii-Smeets(1996)

    • Establish a relation between authentication codes and codes correcting independent errors

Related researches cont
Related researches(cont.)

  • Krawczyk(1994, 1995)

    • Propose universal hash functions that are linear with respect to bitwise XOR

    • Makes it easier to reuse the authentication code

      • Encrypt the m-bit hash result for each new message using a one-time pad

    • Simple and efficient constructions based on polynomials and LFSR

  • Shoup(1996)

    • Propose and analyze the constructions based on polynomials over finite fields

  • Rogaway : Bucket hashing (1995)

  • Halevi-Krawczyk : MMH (1997)

    • Make optimal used of the multiply and accumulate instruction of the Pentium MMX processor

  • Black-Halevi-Krawczyk-Krovertz-Rogaway : UMAC (1999)

    • Further improved the performance on high end processors


  • Bucket hashing

    • is an -AU introduced by Rogaway

    • Defining the Bucket Hash Family B

      • word size w(1), parameters n(1), N(3)

      • domain D={0,1}wn, range R={0,1}wN

      • Let h Band let X=X1Xnbe the string we want to hash, where each |Xi|=w. Then h(X) is defined by the following algorithm. First, for each j{1,,N}, initialize Yjto 0w. Then, for each i{1,, n} and k  hi, replace Ykby Yk  Xi. When done, set h(X) = Y1||Y2||||YN.

      • Pseudocode

        for j 1 to N do Yj 0w

        for i  1 to n do

        Yhi1 Yhi1 Xi

        Yhi2 Yhi2 Xi

        Yhi3 Yhi3 Xi

        return Y1||Y2||||YN

Constructions cont

  • Bucket Hashing with Small Key Size

    • N=2s/L

    • Each hash function hB’[w,M,N] is specified by a list of length M

      • each entry contains L integers in the interval [0, N-1]

    • L arrays are introduced, each containing N buckets

    • Next, each array is compressed to s/L words, using a fixed primitive element GF(2s/L)

      • The hash result is equal to the concatenation of the L compressed arrays, each containing s/L words

Constructions cont1

  • Hash Family Based on Fast Polynomial Evaluation

    • is based on polynomial evaluation over a finite field

    • q = 2r, Q = 2m = 2r+s, n = 1+2s,  : a linear mapping from GF(Q) onto GF(q)

      • Q = q0m, q = q0r , q0 : a prime power

    • fa(x) =a0 + a1x +  + an-1xn-1

      • x, y, a0, a1, , an-1  GF(Q), z  GF(q)

    • H = {hx,y,z : hx,y,z(a) = hx,y,z(a0, a1, , an-1) =  (y fa(x)) + z}

Constructions cont2

  • Hash Family Using Toeplitz Matrices

    • Toeplitz matrices are matrices with constant values on the left-to-right diagonals

      • A Toeplitz matrix of dimension n  m can be used to hash messages of length m to hash results of length n by vector-matrix multiplication

    • The Toeplitz construction uses matrices generated by sequences of length n + m - 1 drawn from -biased distributions

      • -biased distributions are a tool for replacing truly random sequences by more compact and easier to generate sequences

      • The lower , the more random the sequence is

    • Krawczyk proves that the family of hash functions associated with a family of Toeplitz-matrices corresponding to sequences selected from a -biased distribution is  -AXU with  = 2-n + 

Constructions cont3

  • Evaluation Hash Function

    • is one of the variants analyzed by Shoup

    • The input (of length  tn) : viewed as a polynomial M(x) of degree < t over GF(2n)

    • The key : a random element   GF(2n)

    • the hash result : equal to M() GF(2n)

    • This family of hash functions is -AXU with  = t/2n

Constructions cont4

  • Division Hash Function

    • represents the input as a polynomial M(x) of degree less than tn over GF(2)

    • The hash key : a random irreducible polynomial p(x) of degree n over GF(2)

    • The hash result : m(x) xn mod p(x)

    • This family of hash functions is -AXU with  = tn/2n

      • The total number of irreducible polynomials of degree n is roughly equal to 2n/n

Constructions cont5

  • MMH(Multilinear Modular Hashing) hashing

    • consists of a (modified) inner product between message and key modulo a prime p (close to 2w, with w the word length; below w = 32)

    • is an -AXU2, but with xor replaced by subtraction modulo p

      • The core hash function maps 32 32-bit message words and 32 32-bit key words to a 32-bit result

      • The key size is 1024 bits and  = 1.5/ 230

      • For larger messages, a tree construction can be used

      • the value of  and the key length have to be multiplied by the height of the tree

    • This algorithm is very fast on the Pentium Pro, which has a multiply and accumulate instruction

      • On a 32-bit machine, MMH requires only 2 instructions per byte for a 32-bit result

Comparing the hash functions cont
Comparing the Hash Functions(cont.)

  • Scheme A

    • the input : divided into 32 blocks of 8 Kbyte

      • each block is hashed using the same bucket hash function with N = 160

      • results in an intermediate string of 20480 bytes

  • Scheme B

    • the input : divided into 64 blocks of 4 Kbyte

      • each block is hashed using the same bucket hash function with short key(s=42, L=6, N=128)

      • results in an intermediate string of 10752 bytes

  • Scheme C

    • the input is divided into 64 blocks of 4 Kbyte

      • each block is hashed using a 331024 Toeplitz matrix, based on a -biased sequence of length 1056 generated using an 88-bit LFSR

      • The length of the intermediate string is 8448 bytes

Comparing the hash functions cont1
Comparing the Hash Functions(cont.)

  • Scheme D

    • the input : hashed twice using the polynomial evaluation hash function with  = 2-15 resulting in a combined value of 2-30

      • W = 5

      • The performance is slightly key dependent. Therefore an average over a number of keys has been computed.

  • Scheme E

    • this is simply the evaluation hash function with t = 32768

      • the resulting value of  is too small

      • However, choosing a smaller value of n that is not a multiple of 32 induces a performance penalty

  • Scheme F

    • the input : divided into 2048 blocks of 128 bytes

      • each block is hashed twice using MMH

      • the length of the intermediate string is 16384 bytes

        • It is not possible to obtain a value of closer to 2-32 in an efficient way

Message authentication based on universal hashing
Message authentication based on Universal hashing

  • Message authentication based on Universal hashing

    • Wegman-Carter approach

      • The parties share a secret key k=(h,P)

        • P : infinite random string

        • h : function drawn randomly from a strongly universal2 family of hash functions H

          • H is strongly universal2 if, for all xx’, the random variable h1(x)||h2(x), for h ∈ H, is uniformly distributed

      • To authenticate a message x, the sender transmits h(x) xored with the next piece of the pad P

    • Standard cryptographic technique

      • use of a pseudorandom function family, F

      • [Theorem] Assume H is -AXU2, and that F is replaced by the truly random function family R of functions. In this case, if an adversary makes q1 queries to the authentication algorithm S and q2 queries to the verification algorithm V, the probability of forging a MAC is at most q2

Universal hashing mac1
Universal hashing MAC

  • Why Universal hashing MAC?

    • The speed of a universal hashing MAC depends on the speed of the hashing step and encrypting step

    • The encryption does not take long

      • hash function compresses messages => the encrypting message is short

    • The combinatorial properties of the universal hash function family is mathematically proven

      • needs no “over-design” or “safe margin” the way a cryptographic primitive would

  • Universal hashing MAC makes for desirable security properties

    • can select a cryptographically conservative design for the encrypting step

    • can pay with only a minor impact on speed

    • the cryptographic primitive is applied only to the much shorter hashed image of the message

    • security and efficiency are not conflicting requirements

Universal hash families

  • The UMAC algorithm

    • species how the message, key, and nonce determine an authentication tag

    • The sender

      • will need to provide the receiver with the message, nonce, and tag

    • The receiver

      • can then compute what “should be” the tag for this particular message and nonce, and see if it matches the received tag

  • employs a subkey generation process in which the shared (convenient-length) key is mapped into UMAC's internal keys

    • subkey generation is done just once, at the beginning of a communication session during which the key does not change, and so subkey-generation is usually not performance-critical

    • UMAC depends on a few different parameters

Umac cont

  • An illustrative special case of UMAC

    • Subkey generation:

      • Using a PRG, map Key to K = K1K2  K1024 and to A

        • each Ki : a 32-bit word, |A| = 512

    • Hashing the message Msg to HM = NHXKey(Msg):

      • Let Len be |Msg| mod 4096, encoded as a 2-byte string

      • Append to Msg the minimum number of 0 bits to make |Msg| divisible by 8

      • Let Msg = Msg1 || Msg2 ||  || Msgt where each Msgi is 1024 words except for Msgt, which has between 2 and 1024 words

      • Let HM = NHK(Msg1) || NHK(Msg2) || || NHK(Msgt) || Len

    • Computing the authentication tag:

      • The tag is Tag = HMAC-SHA1A(HM || Nonce)

Umac cont1

  • Definition of NH

    • blocksize n  2, wordsize w  1

    • domain : A = {0, 1}2w {0, 1}2w  {0, 1}nw

    • range : B = {0, 1}2w

    • a random function in NH[n,w] is given by a random nw-bit string K

    • Uw : {0,  , 2w-1}, U2w : {0,  , 22w -1}

    • for integers x, ylet (x +wy) denote (x + y) mod 2w

    • M  A and M = M1  Ml, |M1| =  = |Ml|= w

    • K = {0, 1}nwandK = K1  Kn, |K1| =  = |Kn|= w

Umac cont2

  • where miUw is the number that Mi represents (as an unsigned integer)

  • where kiUw is the number that Ki represents (as an unsigned integer)

  • the right-hand side of the above equation is understood to name the (unique) 2w-bit string which represents (as an unsigned integer) the U2w-valued integer result