Public Key Cryptography (key exchange, encryption, signatures) and Crypto-Hashing

Last updated: Saturday, December 21, 2019
Prof. Amir Herzberg, room 324, Dept. of Computer Science, Bar Ilan University
http://AmirHerzberg.com

Public Key Cryptography
• Concept [DH76]: some operations are asymmetric, e.g. everybody can send me mail, but only I can read it.
• Idea: use a public key, which may be known to everyone (including the adversary)
• Encryption: public key cryptosystem (RSA)
  • Encrypt with public key, decrypt with private key
• Digital signatures (RSA, DSA, …)
  • Sign with private key, verify with public key
• Key agreement (DH)
  • Use public/private key pairs to agree on a shared secret key

Public keys are easier…
• To distribute:
  • From a directory (ensure or trust authentication)
  • From an incoming message (if authenticated)
  • Fewer keys to distribute (same public key to all)
• To maintain:
  • Can keep in non-secure storage
  • Validate (e.g. against hash) before using
  • Fewer keys: O(|parties|), not O(|parties|^2)

But public key crypto is harder…
• Requires related public, private keys
  • Private key `reverses` public key
  • Public key does not expose private key
• Substantial overhead
  • Successful cryptanalytic shortcuts ⇒ need long keys (cf. shared key!)
  • Elliptic Curves (EC) may allow shorter keys (almost no shortcuts found)
• Complex computations
  • RSA: very complex (slow) key generation
• Based on modular arithmetic…
[Figure: commercial-grade security, Lenstra & Verheul [LV02]]

Recall: Modular arithmetic
• Basic part of (integer) number theory
• For all integers x and n > 0 there are unique integers q, r s.t. x = qn + r with 0 ≤ r < n; we call r the residue of x mod n
• Notation: x ≡ y (mod n)
  • Reads: “x is congruent to y modulo n”
  • Holds if x and y have the same remainder when divided by n, namely x = r + l∙n, y = r + l'∙n for some integers l, l'
• Regular arithmetic laws apply
  • E.g. distributive, commutative, associative, …
  • (a*b) mod n = [(a mod n)*(b mod n)] mod n
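The facts on this slide can be checked with a few lines of Python. This is only an illustration; the concrete values of x, y, n, a, b are arbitrary choices and do not come from the slides.

```python
# Division with remainder: for n > 0, divmod returns q, r with 0 <= r < n,
# also when x is negative.
x, n = -17, 5
q, r = divmod(x, n)

# Congruence: x and y are congruent mod n iff n divides x - y.
y = 8
congruent = (x - y) % n == 0

# Residues can be taken before or after multiplying.
a, b = 123456789, 987654321
lhs = (a * b) % n
rhs = ((a % n) * (b % n)) % n
```

Reducing operands before multiplying is what keeps intermediate values small in modular exponentiation.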

Hard Modular Math Problems
• Hard problems:
  • No efficient solution is known
  • In spite of extensive efforts
• Factoring: given the product of two uniformly chosen (large) primes, it is infeasible to find the primes
• Discrete logarithm in finite field
  • Select random prime p, generator g ∈ {2,…,p-1}
  • Given a ∈_R {1,…,p-1}, it is infeasible to find b ∈ {1,…,p-1} s.t. a = g^b mod p
• Verification of solutions is easy
  • Factoring: multiply factors
  • Discrete log: exponentiation
  • Efficient exponentiation mod n: O((lg n)^3)
• `One-way` hard problems
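Efficient exponentiation is usually done by square-and-multiply. The sketch below is one common way to write it, not necessarily the method the lecture has in mind; the function name mod_exp is an assumption. It performs O(lg exponent) modular multiplications, each costing O((lg n)^2) with schoolbook arithmetic, which gives the O((lg n)^3) bound.

```python
def mod_exp(base: int, exponent: int, modulus: int) -> int:
    """Compute base**exponent mod modulus by square-and-multiply."""
    result = 1 % modulus          # handles modulus == 1
    base %= modulus
    while exponent > 0:
        if exponent & 1:          # current low bit of the exponent is 1
            result = (result * base) % modulus
        base = (base * base) % modulus
        exponent >>= 1
    return result
```

Python's built-in three-argument pow(base, exponent, modulus) computes the same value and is what the later sketches use.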

The Key Agreement Problem
• Motivation for simple public key problem…
• Alice and Bob want to agree on some secret
  • Trivial if they have a shared secret key…
  • Assume no prior shared secrets (e.g. key)
  • Afterwards, may use the agreed-on secret as a key
• Physical setting
  • Assume Alice and Bob can exchange a locked box
  • Origin of box is authentic (e.g. visually)
  • Problem: Alice and Bob have no shared key…
  • Solution ???

Key Agreement Using Two-Lock Box

Can we use One Time Pad as lock?
• No! The adversary sees all three messages k', k'', k''' and can find
  k = k' ⊕ k'' ⊕ k''' = (k ⊕ k_B) ⊕ (k ⊕ k_B ⊕ k_A) ⊕ (k ⊕ k_A)
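The attack can be replayed in a few lines. This is a sketch under assumptions: the helper xor and the variable names are made up here, and the order of the messages follows the formula on the slide (the secret is first locked with k_B, then k_A is added, then k_B is removed).

```python
import secrets

def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length strings (the one-time-pad 'lock')."""
    return bytes(x ^ y for x, y in zip(a, b))

k = secrets.token_bytes(16)      # secret to be transferred
k_a = secrets.token_bytes(16)    # Alice's pad
k_b = secrets.token_bytes(16)    # Bob's pad

msg1 = xor(k, k_b)               # k'   = k xor k_B
msg2 = xor(msg1, k_a)            # k''  = k xor k_B xor k_A
msg3 = xor(msg2, k_b)            # k''' = k xor k_A

# The eavesdropper only needs the three transmitted messages.
recovered = xor(xor(msg1, msg2), msg3)
```

Each pad appears in exactly two of the three messages, so both pads cancel and only k remains.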

Can we use Exponentiation as lock?
• This seems OK… but we can simplify.

Public Key Agreement [DH]
• Based on Discrete Log problem
• Agree, publish random prime p and generator g
• Alice: secret key a, public key P_A = g^a mod p
• Bob: secret key b, public key P_B = g^b mod p
• To set up a shared key k:
  • Alice computes: (P_B)^a mod p = (g^b mod p)^a mod p = g^(ba) mod p
  • Bob computes: (P_A)^b mod p = (g^a mod p)^b mod p = g^(ab) mod p
  • k = g^(ba) mod p = g^(ab) mod p
[Figure: Alice sends P_A = g^a mod p to Bob; Bob sends P_B = g^b mod p to Alice]
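A toy run of the exchange, as a sketch only. The parameters p = 23, g = 5 are a textbook-sized assumption (5 generates the multiplicative group mod 23) and are far too small for any real use; the helper name keypair is also an assumption.

```python
import secrets

p, g = 23, 5                     # toy public parameters (assumption)

def keypair():
    """Return (secret exponent, public key g^secret mod p)."""
    secret = secrets.randbelow(p - 2) + 1    # uniform in {1, ..., p-2}
    return secret, pow(g, secret, p)

a, P_A = keypair()               # Alice
b, P_B = keypair()               # Bob

k_alice = pow(P_B, a, p)         # (g^b)^a mod p
k_bob = pow(P_A, b, p)           # (g^a)^b mod p
```

Only P_A and P_B cross the channel; both sides end up with the same k without ever sending a or b.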

Caution: Authenticate Public Keys!
• Diffie-Hellman key agreement works… if the public keys are authentic
• If Bob simply receives Alice's public key over an unauthenticated channel, the exchange is subject to a `man in the middle` attack
• Suppose authenticated communication… is DH secure?
[Figure: Alice sends “Hi, I'm Alice, g^a mod p”; the attacker replaces it with “Hi, I'm Alice, g^e mod p”]
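The man-in-the-middle attack can be sketched with the same toy parameters. Everything concrete here is an assumption for illustration: p = 23, g = 5, the fixed secrets a, b, e, and the variable names. The attacker (Eve) replaces both public keys with her own g^e mod p.

```python
p, g = 23, 5                     # toy public parameters (assumption)
a, b, e = 6, 15, 13              # secrets of Alice, Bob, and Eve (fixed for the demo)

P_A, P_B, P_E = pow(g, a, p), pow(g, b, p), pow(g, e, p)

# Eve intercepts P_A and P_B and forwards P_E to both parties instead.
k_alice = pow(P_E, a, p)         # key Alice believes she shares with Bob
k_bob = pow(P_E, b, p)           # key Bob believes he shares with Alice

# Eve can compute both keys from the values she intercepted.
k_eve_alice = pow(P_A, e, p)
k_eve_bob = pow(P_B, e, p)
```

Eve now shares one key with Alice and another with Bob, so she can decrypt, read, and re-encrypt everything in both directions.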

Security of [DH] Key Agreement
• Assume authenticated communication
• Based on Discrete Log assumption:
  • Given a ∈_R {1,…,p-1}, can't find b ∈ {1,…,p-1} s.t. a = g^b mod p
  • If, given g^b mod p, it is easy to compute b, then the adversary exposes k = g^(ba) mod p
• But DH requires a stronger assumption than Discrete Log:
  • Maybe from g^b mod p and g^a mod p, the adversary can compute k = g^(ba) mod p (without knowing a, b)?
[Figure: Alice sends P_A = g^a mod p to Bob; Bob sends P_B = g^b mod p to Alice]

Can we assume authenticated channel?
• Depends on threat model…
• Passive (`eavesdropping only`) adversary?
  • Typical for audio phone / radio calls
  • Difficult for remote attackers (e.g. Internet hackers)
• Spoofing (`blind`) adversary?
  • Easy for email, IP packets, …
• Man-in-the-Middle (MITM) adversary?
• How to establish a key if the channel is not authenticated? Later…
• First: how to encrypt without a shared key?

Public Key Cryptography
[Figure: Alice encrypts the plaintext message m with Bob's public encryption key B.e, giving ciphertext c = E_{B.e}(m); Bob decrypts with his private decryption key B.d, giving m = D_{B.d}(c) = D_{B.d}(E_{B.e}(m))]
• Asymmetric, Public Key Cryptosystem (PKCS): Alice knows only Bob's public key B.e, Bob knows private key B.d
• Most common PKCS: RSA [Rivest, Shamir, Adleman, 1978]
• Slower than symmetric (shared) key cryptosystems
  • Longer keys (e.g. RSA moduli of 1024 bits or more, compared with 128-bit AES keys)
  • Slow encryption, decryption operations
  • Use PK only to encrypt a shared key, then use that key to encrypt the message
• We'll only see a simple public key encryption method

[DH] Public Key Cryptosystems (PKCs)
• Assume Bob knows Alice's public key A.e = g^a mod p
• Bob chooses an ephemeral secret r and computes v = g^r mod p
• Bob computes (A.e)^r mod p = g^(ar) mod p
• Bob encrypts message m using (A.e)^r as shared key, e.g.: c = m ⊕ (A.e)^r = m ⊕ (g^(ar) mod p)
• Bob sends c, v
• Alice uses v^a mod p = g^(ar) mod p to decrypt, e.g. m = c ⊕ (g^(ar) mod p)
• Many other PKCs… see Crypto course, [KL], …
  • El-Gamal (variant of [DH]), RSA (most common), Paillier (homomorphic), Elliptic-curves (short key)…
• We'll move to signatures…
[Figure: Bob, who knows A.e = g^a mod p, sends Alice c = m ⊕ (g^a)^r and v = g^r mod p]
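The steps above can be sketched as follows. The toy parameters p = 23, g = 5, the function names, and the use of XOR on small integers are assumptions for illustration; with such a small pad the scheme offers no security at all, and m must be no longer than the pad.

```python
import secrets

p, g = 23, 5                               # toy public parameters (assumption)

def keygen():
    """Alice's key pair: (private a, public A.e = g^a mod p)."""
    a = secrets.randbelow(p - 2) + 1
    return a, pow(g, a, p)

def encrypt(A_e: int, m: int):
    """Bob: pick ephemeral r, return (c, v) with c = m xor (A.e)^r mod p."""
    r = secrets.randbelow(p - 2) + 1
    v = pow(g, r, p)
    pad = pow(A_e, r, p)                   # g^(ar) mod p
    return m ^ pad, v

def decrypt(a: int, c: int, v: int) -> int:
    """Alice: recompute the pad as v^a mod p and remove it."""
    return c ^ pow(v, a, p)

a, A_e = keygen()
c, v = encrypt(A_e, 13)
m = decrypt(a, c, v)
```

A fresh r per message gives a fresh pad each time, which matters because reusing a pad would leak the XOR of the two plaintexts.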

Signatures… since MACs do not provide evidence!
• A MAC allows Bob to authenticate messages from Alice
• But: Alice can deny having sent m to Bob (Bob also knows k, so a tag proves nothing to a third party)
• Can we prevent this? Think of physical signatures...
• Critical for commerce: conflicting interests of parties
[Figure: Alice and Bob share key k; Alice sends m with tag MAC_k(m); Eve, who does not know k, cannot compute MAC_k(m') for her own message m']

Public Key Digital Signatures
• Sign using a private, secret signature key
• Validate using a public (non-secret) key
• Everybody can validate signatures at any time
• Provides non-repudiation: the signer is committed
[Figure: Alice signs m with her secret signature key A.s and sends m with σ = Sign_{A.s}(m); Bob uses Alice's public validation key A.v to verify that σ is Alice's signature on m]

Signature Forgery Experiment
[Figure: Alice holds the secret signing key A.s and outputs m, σ = Sign_{A.s}(m); Bob validates with A.v (the key to validate Alice's signatures) and outputs the message (ok) or `invalid tag`; Eve may have chosen messages m signed and submits her own pair m', σ']
• Asymmetric: A.v is public (known to Eve) ⇒ digital signatures
• Symmetric: A.v = A.s is a shared secret key ⇒ MAC (Message Authentication Code)

Definition: Signature Scheme
• Three algorithms <K, S, V> are a signature scheme:
  • K: key generation algorithm; input 1^n (security parameter), outputs K.s (sign key) and K.v (validate key)
    • Security parameter defines minimal key/data length, `hardness`
  • S_s(m), V_v(m,σ): signing and validating signatures
• Correctness: for all m, r it holds that V_v(m, S_s(m)) = `OK`, where (s,v) = K(1^n; r)
  • r: `random bits` notation (sometimes omitted)
• MAC can be viewed as a special case
  • Where S_s(m) = MAC_s(m) and V_v(m,σ) returns `OK` iff σ = S_s(m) (with v = s)
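The MAC special case fits the <K, S, V> interface directly. The sketch below uses HMAC-SHA256 from Python's standard library as the MAC; that choice, and the reading of the security parameter n as the key length in bytes, are assumptions made for illustration.

```python
import hashlib
import hmac
import secrets

def K(n: int = 32):
    """Key generation: for a MAC the sign key s and validate key v are the same."""
    key = secrets.token_bytes(n)
    return key, key                      # (s, v)

def S(s: bytes, m: bytes) -> bytes:
    """Sign: S_s(m) = MAC_s(m)."""
    return hmac.new(s, m, hashlib.sha256).digest()

def V(v: bytes, m: bytes, sigma: bytes) -> bool:
    """Validate: OK iff sigma equals S_v(m); compared in constant time."""
    return hmac.compare_digest(S(v, m), sigma)

s, v = K()
sigma = S(s, b"I owe Eve 1$")
```

Because v = s here, anyone who can validate can also sign, which is exactly why a MAC gives no evidence to a third party.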

Signature Experiment and Security
[Figure: Eve asks Alice “Please sign: `I owe Eve 1$`” and receives Alice's signature on `I owe Eve 1$`; Eve should not be able to produce Alice's signature on `I owe Eve 10000$`]

Compare: MAC Experiment and Security
[Figure: Alice and Bob use key k; Eve obtains the tag under k on `Bob: pay Eve 1$` and should not be able to produce a valid tag under k on `Bob: pay Eve 10000$`]

Evidence for Confidential Data?
• We often need evidence for confidential data (`for future use`)
• Sign-then-Encrypt (StE)?
  • StE may leak info, if the attacker can learn when validation failed
• Encrypt-then-Sign (EtS)
  • But: encryption should be `committing`
    • Recipient can prove plaintext (without exposing private key)
    • Holds for (most) public key cryptosystems: recover randomness during decryption
  • Sign also the public encryption key
  • Ongoing work for shared key…
[Figure: Plaintext → Encrypt → Ciphertext → Sign → Ciphertext, tag]

VIL (Variable Input Length) Signatures: `Hash-then-Sign` Paradigm
• Signing long messages directly is impractical
  • Public key operations for long blocks are impractical
  • Hybrid signature/MAC? … doesn't seem to work…
• Hash-then-Sign paradigm: sign cryptographic hash of message: h(m)
  • Using an appropriate cryptographic hash function h: {0,1}* → {0,1}^L(n)
  • L(n): output length, a function of the security parameter n
  • We often omit (n) and simply write L

Crypto Hash: Collision Resistance
• Strong Collision Resistance: infeasible to find a collision: <x,x'> s.t. x ≠ x' yet h(x) = h(x').
• Birthday paradox: 50% chance of a collision within about 1.2∙2^(L/2) tests; the expected number of trials is about 1.25∙2^(L/2)
  • Need a 160-bit hash for `effective 80 bits security`
• Natural definition, but…
  • h is fixed
  • Collisions exist
  • So there exists an adversary that outputs one
  • ⇒ Add a (public) key
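The birthday bound is easy to observe on a deliberately short hash. In this sketch SHA-256 truncated to L = 24 bits plays the role of h; the truncation, the function names, and the choice of decimal counters as inputs are all assumptions for the demo. A collision is expected after a few thousand trials, around 2^12, rather than 2^24.

```python
import hashlib

L_BITS = 24                                  # toy output length (assumption)

def h_trunc(x: bytes) -> int:
    """SHA-256 truncated to L_BITS bits."""
    return int.from_bytes(hashlib.sha256(x).digest()[:L_BITS // 8], "big")

def find_collision():
    """Hash 0, 1, 2, ... until two inputs share a digest; return them and the count."""
    seen = {}
    i = 0
    while True:
        x = str(i).encode()
        d = h_trunc(x)
        if d in seen:
            return seen[d], x, i + 1
        seen[d] = x
        i += 1

x1, x2, trials = find_collision()
```

The table of seen digests is the memory cost of the attack; it grows with the number of trials, i.e. also roughly 2^(L/2).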

Keyed CRHF
• (Strongly/Any) Collision Resistant Hash Function (CRHF): a set {h_k} of hash functions, such that any adversary, given k ∈_R {0,1}^n, cannot efficiently find a collision, i.e. <x,x'> such that x ≠ x' yet h_k(x) = h_k(x').
  • Quantified by the adversary's running time and success probability
• Adversary knows k, but not in advance, so it cannot `know` a collision
[Figure: h_k maps two pre-images x ≠ x' in {0,1}* to the same value h_k(x) = h_k(x') in the range {0,1}^L]
• Comments:
  • `Birthday paradox`: collision after ~2^(L/2) guesses
  • `Standard` hashing is not keyed!

Security of `Hash-then-Sign` Paradigm
• Given a signature scheme (K,S,V) (for L-bit messages) and a keyed CRHF h, define S_h with key <s,k> and V_h with key <v,k>:
  S_h(m) = S_s(h_k(m)), V_h(m,σ) = V_v(h_k(m), σ)
• Notice: fixed, random hash key!
• Theorem: if S is a secure FIL (Fixed Input Length) signature scheme, and h is a CRHF, then S_h is a secure signature scheme
• Proof sketch: a forgery for S_h gives either a forgery for S or a collision for h.
• Details: [KL] or [BR]
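The composition can be sketched with stand-ins for both components. Everything concrete below is an assumption for illustration: textbook RSA on a tiny, badly structured modulus plays the FIL scheme (textbook RSA is not a secure signature scheme on its own), and SHA-256(k || m) truncated to 128 bits plays the keyed hash h_k. Only the way the two are combined follows the slide.

```python
import hashlib

# Toy textbook-RSA key (assumption): two Mersenne primes, unusable in practice.
P, Q = 2**127 - 1, 2**89 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))          # private exponent (needs Python 3.8+)

def h(k: bytes, m: bytes) -> int:
    """Stand-in keyed hash h_k(m): SHA-256(k || m) truncated to 128 bits."""
    return int.from_bytes(hashlib.sha256(k + m).digest()[:16], "big")

def sign_fil(x: int) -> int:
    """FIL signing S_s(x) for a fixed-length input x < N."""
    return pow(x, D, N)

def verify_fil(x: int, sigma: int) -> bool:
    """FIL validation V_v(x, sigma)."""
    return pow(sigma, E, N) == x

def sign(k: bytes, m: bytes) -> int:
    """S_h(m) = S_s(h_k(m)): hash first, then sign the short digest."""
    return sign_fil(h(k, m))

def verify(k: bytes, m: bytes, sigma: int) -> bool:
    """V_h(m, sigma) = V_v(h_k(m), sigma)."""
    return verify_fil(h(k, m), sigma)

hash_key = b"public-hash-key!"             # fixed, public hash key (made-up value)
sigma = sign(hash_key, b"a long message " * 1000)
```

Note that validation recomputes h_k(m) and passes the digest, not m itself, to the underlying V.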

Random Oracle Methodology
• Use a fixed, keyless hash function h
• Analyze as if hash h() is a random function
  • Of course an invalid assumption, as h() is fixed!
  • Whenever h() is used, use an oracle (black box) for a random function
• Good for screening insecure solutions
  • Random oracle security ⇒ many attacks fail
  • Not a proof, but some evidence of security.
• In practice: assume random oracle and use a standard hash function…

Standard crypto hash functions
• Several widely-used, standard, unkeyed hash functions (SHA-1, MD5, see [BR] and hidden foils)
  • Security by failed cryptanalysis
  • Efficient, free implementations
  • Collisions recently found in several
• Goals:
  • Collision-Resistance
  • One-way function (`pre-image resistance`): given h(x) for random x, it is hard to find x' s.t. h(x') = h(x)
  • Randomness: output should be uniformly distributed, if input `has enough entropy`

VIL CRHF from FIL CRHF
• Recall principle: design and cryptanalyze a simple (FIL) function, use it to construct a strong (VIL) function
• Build VIL CRHF {0,1}* → {0,1}^L(n) from FIL
• FIL hash function, also called compression function: comp: {0,1}^L'(n) → {0,1}^L(n), L(n) < L'(n)
  • E.g. L'(n) = 2L(n)
[Figure: comp maps x ∈ {0,1}^L'(n) to comp(x) ∈ {0,1}^L(n)]

VIL CRHF from FIL CRHF
• Build VIL CRHF h: {0,1}* → {0,1}^L(n) from FIL CRHF c: {0,1}^(2L(n)) → {0,1}^L(n)
• 1st idea: use an iterative process, compressing block by block
  • Let the input x = x[0]||x[1]||…||x[l] where |x[i]| = L
  • Pad the last block if necessary [how? how to remove? later…]
  • Starting from y_0 = x[0], for i = 1,…,l let y_i = c(x[i], y_{i-1}); output h(x) = y_l
  • Is this a CRHF?
[Figure: chain of c blocks; x[0] enters the first block, then x[1], x[2], …, x[l]; output h(x) = y_l = c(x[l], y_{l-1})]

VIL CRHF from FIL CRHF
• Build VIL CRHF h: {0,1}* → {0,1}^L from FIL CRHF c: {0,1}^(2L) → {0,1}^L
• 1st idea: use an iterative process, compressing block by block
  • Let the input x = x[0]||x[1]||…||x[l] where |x[i]| = L
  • Pad last block if necessary
  • Let y_0 = x[0]
  • For i = 1,…,l, let y_i = c(x[i], y_{i-1}); output h(x) = y_l
• Problem (collision): h(x) = h(y_1||x[2]||…||x[l]), where y_1 = c(x[1], x[0])
[Figure: chain of c blocks; x[0] = y_0 enters the first block, then x[1], x[2], …, x[l]; output h(x) = y_l = c(x[l], y_{l-1})]
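The collision is easy to reproduce. In this sketch the compression function c is a stand-in built from truncated SHA-256, and the block length of 16 bytes and the function names are assumptions; the attack itself does not depend on how c is built.

```python
import hashlib

L = 16                                        # block and output length in bytes (assumption)

def c(block: bytes, state: bytes) -> bytes:
    """Stand-in compression function: 2L bytes in, L bytes out."""
    return hashlib.sha256(block + state).digest()[:L]

def naive_hash(blocks):
    """1st idea: y_0 = x[0], y_i = c(x[i], y_{i-1}), output y_l."""
    y = blocks[0]
    for block in blocks[1:]:
        y = c(block, y)
    return y

x = [b"A" * L, b"B" * L, b"C" * L]            # x[0] || x[1] || x[2]
y1 = c(x[1], x[0])                            # intermediate value of the honest run
x_prime = [y1, x[2]]                          # shorter message: y_1 || x[2]
```

The two inputs differ (they even have different lengths) but hash to the same value, since the second run simply starts where the first one was after one step.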

VIL CRHF from FIL CRHF: adding IV
• Build VIL CRHF h: {0,1}* → {0,1}^L from FIL CRHF c: {0,1}^(2L) → {0,1}^L
• 1st idea: use an iterative process, compressing block by block
• 2nd idea: use a fixed IV as first block: y_0 = IV ∈ {0,1}^L
  • Let the input x = x[1]||…||x[l] where |x[i]| = L
  • For i = 1,…,l, let y_i = c(x[i], y_{i-1}); output h(x) = y_l
• Suppose h(x) = h(x'), x ≠ x':
  • If |x| = |x'| ⇒ for some i, c(x[i], y_{i-1}) = c(x'[i], y'_{i-1}) with <x[i], y_{i-1}> ≠ <x'[i], y'_{i-1}>, i.e. a collision in c
  • Else: collision, or preimage for IV (contradiction to OWF?)
[Figure: chain of c blocks starting from IV = y_0; inputs x[1], x[2], …, x[l]; output h(x) = y_l = c(x[l], y_{l-1})]

Merkle-Damgard Strengthening
• Let pad(x) = x||1||0^k||bin(|x|)
  • Where bin(|x|) is the L(n)-bit binary representation of |x|
  • And: |x|+1+k ≡ 0 mod L(n)
• Let y_0 = IV be some fixed L bits (IV = Initialization Value)
• For i = 1,…,|pad(x)|/L(n) let y_i = c(x[i], y_{i-1}), where x[i] is the i-th block of pad(x)
• Output MD[c]_IV(x) = y_{l+1}, where l+1 = |pad(x)|/L(n) and the last block is bin(|x|)
• Given h(x) = h(x'), for x ≠ x', we can find z ≠ z' s.t. c(z) = c(z').
[Figure: chain of c blocks starting from IV; inputs x[1], x[2], …, x[l]||10^k, |x|; output h(x) = y_{l+1} = c(|x|, y_l)]
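A byte-oriented sketch of the construction. The assumptions: blocks are L = 16 bytes, the compression function c is a stand-in built from truncated SHA-256, the IV is all zeros, and the single 1 bit of the padding is written as the byte 0x80 (a 1 bit followed by seven 0 bits), which is how byte-oriented hash functions usually encode it.

```python
import hashlib

L = 16                                   # block and output length in bytes (assumption)
IV = bytes(L)                            # fixed initialization value (assumption: all zeros)

def c(block: bytes, state: bytes) -> bytes:
    """Stand-in compression function: 2L bytes in, L bytes out."""
    return hashlib.sha256(block + state).digest()[:L]

def pad(x: bytes) -> bytes:
    """x || 1 || 0^k || bin(|x|), with |x| in bits written as an L-byte integer."""
    zeros = (-(len(x) + 1)) % L          # bring x || 0x80 || zeros to a block boundary
    return x + b"\x80" + bytes(zeros) + (8 * len(x)).to_bytes(L, "big")

def md_hash(x: bytes) -> bytes:
    """MD[c]_IV(x): iterate c over the blocks of pad(x), starting from IV."""
    padded = pad(x)
    y = IV
    for i in range(0, len(padded), L):
        y = c(padded[i:i + L], y)
    return y

digest = md_hash(b"abc")
```

The length block is always the last input to c, which is what the security proof relies on.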

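Security of MD Strengthening
• Theorem (simplified, see [KL]): if c is a CRHF, then h = MD[c] is a CRHF.
• Proof (sketch): we use a collision in h to find a collision in c. Suppose h(x) = h(x') for x ≠ x', with l and l' data blocks respectively.
  • h(x) = c(|x| || y_l) = c(|x'| || y'_l'). Hence assume |x| = |x'| (so l = l') and y_l = y'_l (otherwise this is a collision in c).
  • Recursively for j = l down to 1, we have y_j = y'_j, i.e. c(x[j]||y_{j-1}) = c(x'[j]||y'_{j-1}). Hence x[j] = x'[j] and y_{j-1} = y'_{j-1} (otherwise this is a collision in c).
  • So all blocks are equal, i.e. x = x'. But x ≠ x'! █
[Figure: two MD chains starting from IV, over the blocks of x and x' followed by |x|, ending in the same output h(x) = h(x')]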

Pseudo-collisions
• Question: does the reverse hold, i.e.: (MD[c] is CRHF) ⇒ (c is CRHF)?
  • Collision for MD[c] from collision for compression function c?
• Not always.
  • If FIL CRHFs exist, then there exists c which is not a CRHF but MD[c] is a CRHF [Prove yourself!]
  • Intuition: think of 1-block messages: c(x,y) = c(x',y'). But h(x) = c(x,IV) ≠ c(x',IV) = h(x')!
• A collision for the compression function c is called a pseudo-collision for the hash MD[c]

Alternative FIL to VIL: Hash Trees
• To hash many parts (documents, parts of a long document)
  • Hash/compress each document (or part)
  • Hash all hashes (if recursively, add recursion level)
• Less efficient than MD when validating all inputs
  • Requires keeping state (logarithmic in document size)
• Advantages when validating only some inputs:
  • Efficiency: validate only what you need; parallelizable
  • Reuse: some recipients may not need all docs
  • Privacy: share documents only as needed
• But: every collision of compression ⇒ collision of hash
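A minimal binary hash tree, as a sketch. The choices here are assumptions, not the lecture's construction: SHA-256 as the hash, duplicating the last node on odd levels, and no separation between leaf and inner-node hashing (real designs usually add one). It shows how one document is validated against the root using only a logarithmic number of sibling hashes.

```python
import hashlib

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_levels(docs):
    """Return all tree levels, from the leaf hashes up to [root]."""
    levels = [[H(d) for d in docs]]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:
            cur.append(cur[-1])          # duplicate last node on odd-sized levels
        levels.append([H(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels

def proof(levels, index):
    """Sibling hashes on the path from leaf `index` to the root."""
    path = []
    for level in levels[:-1]:
        sibling = index ^ 1
        path.append((level[sibling], sibling < index))   # (hash, sibling is on the left?)
        index //= 2
    return path

def verify(doc, path, root):
    """Recompute the root from one document and its sibling path."""
    node = H(doc)
    for sibling, sibling_is_left in path:
        node = H(sibling + node) if sibling_is_left else H(node + sibling)
    return node == root

docs = [b"doc-0", b"doc-1", b"doc-2"]
levels = merkle_levels(docs)
root = levels[-1][0]
path_for_doc2 = proof(levels, 2)
```

A recipient who only needs doc-2 receives that document and two hashes, and never sees the other documents.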

Exploiting Chosen-Prefix Attacks
• Hash trees are vulnerable to collision attacks!
• Chosen-prefix collision attacks
  • Known for MD5, SHA0, others; also demonstrated for SHA1 (2017)
  • For any prefix p (chosen by attacker)
  • Attack finds two colliding strings c, c'
  • S.t. for any suffix s holds: h(p||c||s) = h(p||c'||s)
  • For hash trees: also for any suffix!! Much worse!!
• Implications / exploits?
  • Colliding, different executables, documents (ps…)
  • Duplicate `tickets` (sign m = “ticket #”||n, n by subject)
  • What about hash-based MAC? Next…
• More on (exploiting / different) collision attacks: http://www.iaik.tugraz.at/research/krypto/collision/

Hash based MAC
• Advantages:
  • High speed
  • Most designs based on compression functions
  • Hash functions are widely, freely available
  • Specific HMAC construction now standardized
• Heuristic constructions:
  • MAC_k(m) = h(k||m), MAC_k(m) = h(m||k), MAC_k(m) = h(k||m||k)…
  • All secure under `random oracle analysis` for h
  • Insecure if h is `only` a (strong) CRHF
  • What if h is built using MD construction?

MAC using Iterated-Design CRHF
• Assume k, m, m' are each a single block
• Standard hash functions use (variants of) the MD construction, e.g. h(x) = MD[c](x)
• MAC_k(m) = h(k||m)
  • With the IV construction (no strengthening): h(x) = IV[c](x)…
  • MAC_k(m||m') = h(k||m||m') = c(m', c(m, c(k,IV))) = c(m', MAC_k(m)) ⇒ Forgery!!
• MAC_k(m) = h(m||k)
  • Assume: collisions for any given prefix… (found for MD5, SHA)
  • With the IV construction: collision h(m) = h(m') ⇒ h(m||x) = h(m'||x) ⇒ MAC_k(m) = MAC_k(m')
  • Also implies subject to `birthday attack` (~2^(|h(x)|/2)), in exercise
• Exercise: what of MD Strengthening?
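The extension forgery on MAC_k(m) = h(k||m) can be replayed against the plain IV construction. As before, the compression function c (truncated SHA-256), the 16-byte block length, the all-zero IV, and the names are assumptions for the demo.

```python
import hashlib
import secrets

L = 16                                   # block and output length in bytes (assumption)
IV = bytes(L)

def c(block: bytes, state: bytes) -> bytes:
    """Stand-in compression function: 2L bytes in, L bytes out."""
    return hashlib.sha256(block + state).digest()[:L]

def iv_hash(blocks) -> bytes:
    """IV construction without strengthening: y_0 = IV, y_i = c(x[i], y_{i-1})."""
    y = IV
    for block in blocks:
        y = c(block, y)
    return y

def mac(k: bytes, blocks) -> bytes:
    """MAC_k(m) = h(k || m), with the key as the first block."""
    return iv_hash([k] + list(blocks))

k = secrets.token_bytes(L)               # known only to Alice and Bob
m = b"M" * L
m_ext = b"E" * L                         # block the attacker wants to append

tag = mac(k, [m])                        # observed by the attacker
forged = c(m_ext, tag)                   # computed without knowing k
```

The attacker uses the observed tag as the chaining value and continues the iteration, producing a valid tag for m||m'.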

HMAC - Hash based MAC
• HMAC uses an unkeyed hash function: HMAC_k(x) = h((k ⊕ opad) || h((k ⊕ ipad) || x))
• opad, ipad are constants selected to give a large Hamming distance between k ⊕ opad and k ⊕ ipad. Specifically, ipad is a string of x'36' bytes, and opad is a string of x'5c' bytes.
• Widely deployed standard [RFC2104]
• Not complete proof of security
  • Heuristic simplification of NMAC, which has a reduction to two assumptions on the hash
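The formula can be written out by hand and compared with Python's standard hmac module. The sketch fixes h = SHA-256 (block size 64 bytes), which is an assumption; RFC 2104 is generic in the hash. Keys longer than one block are hashed first and shorter keys are zero-padded, as the RFC specifies. The function name hmac_sha256 and the sample key and message are made up.

```python
import hashlib
import hmac

BLOCK = 64                               # SHA-256 input block size in bytes

def hmac_sha256(key: bytes, msg: bytes) -> bytes:
    """HMAC_k(x) = h((k xor opad) || h((k xor ipad) || x)) with h = SHA-256."""
    if len(key) > BLOCK:
        key = hashlib.sha256(key).digest()
    key = key.ljust(BLOCK, b"\x00")      # zero-pad the key to one block
    k_ipad = bytes(b ^ 0x36 for b in key)
    k_opad = bytes(b ^ 0x5C for b in key)
    inner = hashlib.sha256(k_ipad + msg).digest()
    return hashlib.sha256(k_opad + inner).digest()

key, msg = b"shared secret key", b"Bob: pay Eve 1$"
tag = hmac_sha256(key, msg)
reference = hmac.new(key, msg, hashlib.sha256).digest()
```

The outer hash call hides the inner chaining value, which is what blocks the extension forgery that works against h(k||m).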

Summary: Principles of Cryptography
• Arbitrary Adversary Principle: assume restrictions on the adversary's capabilities, not on the adversary's strategy!
• Kerckhoffs' principle: keys can be secret, design is known
• Sufficient key length principle:
  • Number of possible keys should be large enough
  • To make attacks infeasible, using best adversary resources expected during `sensitivity period` of data
• Limited key usage principle (refresh keys!)
• Design and cryptanalyze simple (e.g. FIL) functions, use them to construct stronger (e.g. VIL) functions
• Apply robust combiners to multiple candidates to tolerate attacks
• Well defined, conservative assumptions and security specifications
