Physical Security and Side-Channel Attacks


Rice ELEC 528/ COMP 538

Farinaz Koushanfar

Spring 2009

Outline
  • Introduction
  • Hardware targets
  • Attack classification
  • Power attacks
  • Timing attacks
  • Electromagnetic attacks
  • Fault injection attacks
Introduction
  • Classical cryptography treats security as a problem of mathematical abstractions
  • Classical cryptanalysis has had great success and promise
    • Analyzing and quantifying crypto algorithms' resilience against attacks
  • Recently, many security protocols have been broken using physical attacks
    • These take advantage of implementation-specific characteristics to recover the secret parameters
Physical attacks
  • Traditional cryptography is centered around the concepts of one-way and trapdoor functions
  • A one-way function can be rapidly calculated, but is computationally difficult to invert
  • A polynomial-time algorithm rarely finds a pre-image of a one-way function for a random set of inputs
  • A trapdoor one-way function is a function that is easy to invert if and only if a certain secret (key) is available
  • Physical attacks usually have two phases:
    • Interaction phase: the attacker exploits some physical characteristics of the device
    • Exploitation phase: analyzing the gathered information to recover the secret
Model
  • Consider a device capable of performing a cryptographic function
  • The key is usually stored in the device and protected
  • Modern crypto is based on Kerckhoffs's assumption: all of the secrecy required to operate a chip resides entirely in the key
  • The attacker only needs to extract the keys
Principle of divide-and-conquer attack
  • Divide-and-conquer (D&C) attacks attempt to recover the key in parts
  • The idea is that an observable characteristic can be correlated with a partial key
    • The partial key should be small enough to enable exhaustive search
  • Once a partial key is validated, the process is repeated to find the remaining parts of the key
  • D&C attacks may be iterative (some parts of the key depend on others) or independent
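The divide-and-conquer principle can be illustrated with a small sketch. Everything here is an assumption made for illustration: the hypothetical `leak` function stands in for a device that leaks, without noise, the Hamming weight of each key byte XORed with attacker-chosen data, and `recover_key` then searches each byte independently.

```python
def hamming_weight(x: int) -> int:
    """Number of bits set in x."""
    return bin(x).count("1")


def leak(key: bytes, data: bytes) -> list:
    """Hypothetical noise-free leakage: one Hamming weight per key byte."""
    return [hamming_weight(k ^ d) for k, d in zip(key, data)]


def recover_key(oracle, key_len: int) -> bytes:
    """Recover the key one byte at a time from the leakage oracle."""
    # Nine chosen inputs: all zeros, then one input per bit position.
    probes = [bytes(key_len)] + [bytes([1 << bit] * key_len) for bit in range(8)]
    observations = [oracle(p) for p in probes]
    recovered = []
    for i in range(key_len):
        # Exhaustive search over a partial key of only 256 candidates.
        candidates = [
            guess for guess in range(256)
            if all(hamming_weight(guess ^ p[i]) == obs[i]
                   for p, obs in zip(probes, observations))
        ]
        recovered.append(candidates[0])
    return bytes(recovered)


secret = bytes.fromhex("2b7e151628aed2a6")
print(recover_key(lambda data: leak(secret, data), len(secret)).hex())
```

For this 8-byte key the search tests 8 × 256 candidates rather than 2^64, which is the point of splitting the key into independently testable parts.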
Hardware targets
  • The most common victims of hardware cryptanalysis are smart cards (SCs)
    • Attacks on SCs are applicable to any general-purpose processor with a fixed bus length
    • Attacks on FPGAs have also been reported. FPGAs represent application-specific devices with opportunities for parallel computing
Smart Cards
  • A smart card has a small processor (8-bit or 32-bit) along with ROM, EEPROM and a small RAM
  • There are eight wires connecting the processor to the outside world
  • Power supply: SCs have no internal batteries; the current is provided by the reader
  • Clock: SCs do not have an internal clock
  • SCs are typically equipped with a shield that destroys the chip if tampering is detected
FPGAs
  • The first difference from SCs is in the applications of the two kinds of device
  • FPGAs and ASICs allow parallel computing
  • Multiple programmable configuration bits
Attack classification
  • Many attacks are possible, and they are often not mutually exclusive
  • Invasive vs. noninvasive attacks
  • Active vs. passive
    • Active attacks tamper with the device's proper functionality, either temporarily or permanently
Five major attack groups
  • Probing attacks (invasive)
  • Fault injection attacks: active attacks that may be invasive or noninvasive
  • Timing attacks exploit the device's running time
  • Power analysis attacks
  • Electromagnetic analysis attacks
Measuring phase
  • This task is usually straightforward
    • Easy for smart cards: the energy is provided by the terminal and the current can be read
  • Relatively inexpensive (<$1000) equipment can digitally sample voltage differences at high rates (1 GHz and above) with less than 1% error
  • The device's power consumption depends on many things, including its structure and the data being processed
Simple power analysis (SPA)
  • Monitoring the device’s power consumption to deduce information about data/operation
  • Example: SPA on DES – smart card
    • The internal structure is shown in the next slide
  • Summary of DES: a block cipher
    • a product cipher
    • 16 rounds (iterations) on the input bits (of P)
      • substitutions (for confusion) and
      • permutations (for diffusion)
    • Each round with a round key
      • Generated from the user-supplied key
DES Basic Structure

[Figure (cf. J. Leiwo): Input → Input Permutation → L0, R0 → rounds 1 to 16, each applying substitution S and permutation P under round keys K1 … K16 derived from K → L16, R16 → Final Permutation → Output]

  • Input: 64 bits (a block)
  • Li/Ri: left/right half of the input block for iteration i (32 bits each), subject to substitution S and permutation P (cf. Fig 2-8 in the text)
  • K: user-supplied key
    • 56 bits used + 8 unused
    • (unused for encryption but often used for error checking)
  • Ki: round key for iteration i, derived from K
  • Output: 64 bits (a block)
  • Note: Ri becomes L(i+1)
  • All basic operations are simple logical operations
    • Left shift / XOR
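The round structure described above (Ri becomes L(i+1), and the other half is XORed with a keyed function of Ri) can be sketched as a generic Feistel network. This is not DES: `toy_round_function` is a hypothetical stand-in for the real S-boxes and P permutation, the initial and final permutations are omitted, and the round keys are arbitrary values rather than a DES key schedule.

```python
MASK32 = 0xFFFFFFFF


def toy_round_function(right: int, round_key: int) -> int:
    """Hypothetical stand-in for DES's substitution S and permutation P."""
    mixed = (right ^ round_key) & MASK32
    return ((mixed << 3) | (mixed >> 29)) & MASK32  # rotate left by 3 bits


def feistel_encrypt(block: int, round_keys: list) -> int:
    """Run a 64-bit block through one Feistel round per round key."""
    left, right = (block >> 32) & MASK32, block & MASK32  # L0, R0
    for key in round_keys:
        # R(i) becomes L(i+1); L(i) XOR f(R(i), K) becomes R(i+1).
        left, right = right, left ^ toy_round_function(right, key)
    return (right << 32) | left  # the halves are swapped at the output


def feistel_decrypt(block: int, round_keys: list) -> int:
    """Decryption is the same network with the round keys reversed."""
    return feistel_encrypt(block, list(reversed(round_keys)))


keys = list(range(1, 17))  # 16 arbitrary toy round keys
plaintext = 0x0123456789ABCDEF
ciphertext = feistel_encrypt(plaintext, keys)
print(hex(ciphertext), hex(feistel_decrypt(ciphertext, keys)))
```

The round function never has to be inverted, which is why the same hardware or code path can serve for both encryption and decryption.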
Example 1 - SPA on DES (cont’d)
  • The upper trace: the entire encryption, including the initial phase, 16 DES rounds, and the final permutation
  • The lower trace: a detailed view of the second and third rounds
SPA on DES (cont’d)
  • The DES structure and 16 rounds are known
  • Instruction flow depends on data → power signature
  • Example: modular exponentiation (used in public-key schemes such as RSA, not in DES itself) is often implemented by the square-and-multiply algorithm
  • Typically the square operation is implemented differently from the multiply (for speed purposes)
  • Then, the power trace of the exponentiation can directly yield the bits of the exponent
  • All programs involving conditional branching based on the key values are at risk!
SPA example (cont’d)
  • Unprotected modular exponentiation: square-and-multiply algorithm
  • The peak values reveal the key values
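A minimal sketch of why unprotected square-and-multiply leaks under SPA. It assumes an idealized attacker who can tell a square ("S") from a multiply ("M") in the power trace; the trace string below is a hypothetical abstraction of the measured waveform, not real measurement data.

```python
def square_and_multiply(base: int, exponent: int, modulus: int):
    """Left-to-right binary exponentiation that also records the operation sequence."""
    result = 1
    trace = []
    for bit in bin(exponent)[2:]:
        result = (result * result) % modulus  # always square
        trace.append("S")
        if bit == "1":                        # key-dependent branch
            result = (result * base) % modulus
            trace.append("M")
    return result, "".join(trace)


def exponent_from_trace(trace: str) -> int:
    """Read the exponent back: 'SM' encodes a 1 bit, a lone 'S' encodes a 0 bit."""
    bits = []
    i = 0
    while i < len(trace):
        if i + 1 < len(trace) and trace[i + 1] == "M":
            bits.append("1")
            i += 2
        else:
            bits.append("0")
            i += 1
    return int("".join(bits), 2)


value, trace = square_and_multiply(7, 0b101101, 1009)
print(trace, bin(exponent_from_trace(trace)))
```

A single trace is enough here because the instruction sequence itself depends on the secret exponent.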
Differential power analysis (DPA)
  • SPA targets variable instruction flow
  • DPA targets data dependence
  • Difference between smart cards (SCs) and FPGAs
  • In SCs, one operation runs at a time
    • → Simple power tracing is possible
  • In FPGAs, parallel computation typically prevents visual SPA inspection → DPA
Example: DPA on DES
  • Divide-and-conquer strategy, comparing power consumption for different inputs
  • Record a large number of inputs and the corresponding power consumption
  • We have access to R15, which entered the last round operation, since it is equal to L16
  • Take one output bit (called M’i) at the last round and classify the curves based on that bit
  • 6 specific bits of R15 will be XOR’d with 6 bits of the key before entering the S-box
  • By guessing the 6-bit key value, we can predict the bit b, or an arbitrary output bit of an arbitrary S-box
  • Thus, with 2^6 partitions, one for each possible value of the 6 key bits, we can break the cipher much faster

[Figure: a closer look at the HW implementation of DES]

DPA (cont’d)
  • DPA can be performed on any algorithm that has the operation β = S(α ⊕ K),
    • α is known and K is the segment key

The waveforms are captured by a scope and sent to a computer for analysis

DPA (cont’d)

The predicted bit classifies each waveform wi

  • Hypothesis 1: bit is zero
  • Hypothesis 2: bit is one
  • A differential trace will be calculated for each bit!
DPA (cont’d)
  • The DPA waveform with the highest peak will validate the hypothesis
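The difference-of-means test can be sketched on a toy target. All of the following are assumptions made for illustration: a 4-bit S-box instead of a DES S-box, a 4-bit key, one power sample per measurement, and a noise-free leakage equal to the Hamming weight of the S-box output. The sketch picks the highest positive peak, on the assumption that setting more bits draws more power.

```python
# Toy 4-bit S-box (a permutation of 0..15), standing in for a DES S-box.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
TARGET_BIT = 1  # S-box output bit used to classify the measurements


def hamming_weight(x: int) -> int:
    return bin(x).count("1")


def simulate_measurements(secret_key: int):
    """Hypothetical device: the power sample is the Hamming weight of S(x XOR key)."""
    inputs = list(range(16)) * 4
    power = [float(hamming_weight(SBOX[x ^ secret_key])) for x in inputs]
    return inputs, power


def differential(inputs, power, key_guess: int) -> float:
    """Mean power where the predicted bit is 1 minus mean power where it is 0."""
    ones, zeros = [], []
    for x, w in zip(inputs, power):
        predicted_bit = (SBOX[x ^ key_guess] >> TARGET_BIT) & 1
        (ones if predicted_bit else zeros).append(w)
    return sum(ones) / len(ones) - sum(zeros) / len(zeros)


def dpa_attack(inputs, power):
    """Return the key guess with the highest differential, plus all differentials."""
    peaks = [differential(inputs, power, guess) for guess in range(16)]
    best = max(range(16), key=lambda guess: peaks[guess])
    return best, peaks


inputs, power = simulate_measurements(secret_key=0xB)
guess, peaks = dpa_attack(inputs, power)
print(hex(guess), peaks[guess])
```

With real traces the samples are noisy, so many more measurements are averaged, and the differential is computed at every time sample to obtain a differential trace per hypothesis.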
Improvements over DPA
  • Correlation power analysis (CPA): attacker steps
    • Predict the power usage of the device at one specific instant, as a function of certain key bits
      • E.g., for DES, it is assumed to be a function of the Hamming weight of the data
    • The prediction matrix stores the predicted values
    • The consumption vector stores the measured power
    • The attacker compares the actual and the predicted values using the correlation coefficient
    • E.g., the correlation between each column of the prediction matrix and the consumption vector
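The attacker steps above can be sketched as follows. The target is hypothetical: a toy 4-bit S-box, a 4-bit key, and a simulated noise-free consumption of the form a·H + b with arbitrary a and b. The `prediction` list plays the role of the prediction matrix (one column per key guess) and `consumption` the role of the consumption vector.

```python
from math import sqrt

# Toy 4-bit S-box (a permutation of 0..15), not a DES S-box.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def hamming_weight(x: int) -> int:
    return bin(x).count("1")


def pearson(xs, ys) -> float:
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def cpa_attack(inputs, consumption):
    """Correlate every column of the prediction matrix with the consumption vector."""
    prediction = [[hamming_weight(SBOX[x ^ guess]) for guess in range(16)]
                  for x in inputs]
    scores = [pearson([row[guess] for row in prediction], consumption)
              for guess in range(16)]
    best = max(range(16), key=lambda guess: scores[guess])
    return best, scores


secret = 0x9
inputs = list(range(16)) * 4
# Simulated measurements: a = 0.5 and b = 3.0 are arbitrary illustration values.
consumption = [0.5 * hamming_weight(SBOX[x ^ secret]) + 3.0 for x in inputs]
best, scores = cpa_attack(inputs, consumption)
print(hex(best), round(scores[best], 3))
```

Because correlation is insensitive to the scale a and offset b, the attacker does not need to know them, only that the consumption grows with the Hamming weight.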
Modeling the power consumption
  • Hamming weight model
    • Typically measured on a bus, Y = aH(X) + b
    • Y: power consumption; X: data value; H: Hamming weight
  • The Hamming distance model
    • Y = aH(P ⊕ X) + b
    • Accounting for the previous value on the bus (P)
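Both models are straightforward to write down. In this sketch the scale `a` and offset `b` are placeholders that would be fitted to a specific device; the defaults are chosen only to make the output easy to read.

```python
def hamming_weight(x: int) -> int:
    """H(X): number of bits set in X."""
    return bin(x).count("1")


def hamming_weight_model(x: int, a: float = 1.0, b: float = 0.0) -> float:
    """Y = a*H(X) + b: consumption depends on the value driven onto the bus."""
    return a * hamming_weight(x) + b


def hamming_distance_model(previous: int, x: int, a: float = 1.0, b: float = 0.0) -> float:
    """Y = a*H(P XOR X) + b: consumption depends on how many bus lines toggle."""
    return a * hamming_weight(previous ^ x) + b


print(hamming_weight_model(0b1011))              # three bits set
print(hamming_distance_model(0b1111, 0b1011))    # one bus line toggles
```

The Hamming weight model is the special case of the Hamming distance model in which the previous bus value P is zero.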
Correlation power analysis (CPA)
  • The equation for generating differential waveforms is replaced with correlations
  • Rather than attacking one bit, the attacker tries to predict the Hamming weight of a word (H)
  • The correlation is computed by:
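The slide's formula did not survive extraction. As an assumption about what it showed, the standard sample Pearson correlation between the N measured power values W_i and the predicted Hamming weights H_i is:

```latex
\rho_{W,H}
  = \frac{\operatorname{cov}(W,H)}{\sigma_W\,\sigma_H}
  = \frac{N\sum_{i} W_i H_i - \sum_{i} W_i \sum_{i} H_i}
         {\sqrt{N\sum_{i} W_i^2 - \left(\sum_{i} W_i\right)^2}\;
          \sqrt{N\sum_{i} H_i^2 - \left(\sum_{i} H_i\right)^2}}
```

The key hypothesis whose predictions give the correlation closest to 1 is retained.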
More about PA (cont’d)
  • Data-dependent attacks require a power consumption model
    • Can be measured and learned
  • Synchronization of the measurements needs to be addressed
  • The attack is affected by parallel computing, which lowers observability
  • The described attack is not the best achieved to date; e.g., techniques based on maximum likelihood often offer better results
Anti-DPA
  • Internal clock phase shift
Further Readings
  • Differential power analysis, by Kocher, Jaffe, and Jun
  • Power analysis tutorial, by Aigner and Oswald
  • A tutorial on physical security and side-channel attacks, by Koeune and Standaert
  • Michael Tunstall has some good material; a few of the charts are courtesy of him
  • Side channel attacks: countermeasures, by Verbauwhede
Timing attacks
  • Running time of a crypto processor can be used as an information channel
  • The idea was proposed by Kocher, Crypto’96
RSA Cryptosystem
  • Key generation:
    • Generate large (say, 2048-bit) primes p, q
    • Compute n = pq and φ(n) = (p-1)(q-1)
    • Choose small e, relatively prime to φ(n)
      • Typically, e = 3 (may be vulnerable) or e = 2^16+1 = 65537 (why?)
    • Compute unique d such that ed = 1 mod φ(n)
    • Public key = (e,n); private key = d
      • Security relies on the assumption that it is difficult to factor n into p and q
  • Encryption of m: c = m^e mod n
  • Decryption of c: c^d mod n = (m^e)^d mod n = m
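The key generation and encryption steps can be traced with deliberately tiny numbers. This is a textbook sketch only: the primes 61 and 53 and the exponent 17 are illustration values, there is no padding, and nothing here is suitable for real use.

```python
from math import gcd


def toy_rsa_keypair(p: int, q: int, e: int):
    """Return ((e, n), d) for primes p, q and public exponent e."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be relatively prime to phi(n)")
    d = pow(e, -1, phi)  # modular inverse; needs Python 3.8 or later
    return (e, n), d


def encrypt(m: int, public_key) -> int:
    e, n = public_key
    return pow(m, e, n)  # c = m^e mod n


def decrypt(c: int, d: int, n: int) -> int:
    return pow(c, d, n)  # c^d mod n = m


public_key, d = toy_rsa_keypair(61, 53, 17)
c = encrypt(65, public_key)
print(public_key, d, c, decrypt(c, d, public_key[1]))
```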
How Does RSA Decryption Work?
  • RSA decryption: compute y^x mod n
    • This is a modular exponentiation operation
  • Naïve algorithm: square and multiply
Kocher’s Observation

[Figure: square-and-multiply loop, annotated] Whether an iteration takes a long time depends on the kth bit of the secret exponent: when the bit is 1, the extra multiplication takes a while to compute; when it is 0, the step is instantaneous

Outline of Kocher’s Attack
  • Idea: guess some bits of the exponent and predict how long decryption will take
  • If guess is correct, will observe correlation; if incorrect, then prediction will look random
    • This is a signal detection problem, where signal is timing variation due to guessed exponent bits
    • The more bits you already know, the stronger the signal, thus easier to detect (error-correction property)
  • Start by guessing a few top bits, look at correlations for each guess, pick the most promising candidate and continue
RSA in OpenSSL
  • OpenSSL is a popular open-source toolkit
    • mod_SSL (in Apache = 28% of HTTPS market)
    • stunnel (secure TCP/IP servers)
    • sNFS (secure NFS)
    • Many more applications
  • Kocher’s attack doesn’t work against OpenSSL
    • Instead of square-and-multiply, OpenSSL uses CRT, sliding windows and two different multiplication algorithms for modular exponentiation
      • CRT = Chinese Remainder Theorem
      • Secret exponent is processed in chunks, not bit-by-bit
Chinese Remainder Theorem
  • n = n1 n2 … nk, where gcd(ni, nj) = 1 when i ≠ j
  • The system of congruences x ≡ x1 (mod n1), …, x ≡ xk (mod nk)
    • Has a simultaneous solution x to all congruences
    • There exists exactly one solution x between 0 and n-1
  • For RSA modulus n = pq, to compute x mod n it’s enough to know x mod p and x mod q
RSA Decryption With CRT
  • To decrypt c, need to compute m = c^d mod n
  • Use Chinese Remainder Theorem (why?)
    • d1 = d mod (p-1)
    • d2 = d mod (q-1)
    • qinv = (1/q) mod p
      • d1, d2 and qinv are precomputed
    • Compute m1 = c^d1 mod p; m2 = c^d2 mod q
      • Attack this computation in order to learn q. This is enough to learn the private key (why?)
    • Compute m = m2 + (qinv*(m1-m2) mod p)*q
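The CRT recombination above can be checked against plain modular exponentiation. The sketch below follows the slide's formulas directly; the tiny primes, exponent and message are illustration values, and a real implementation would precompute d1, d2 and qinv once rather than on every call.

```python
def rsa_crt_decrypt(c: int, d: int, p: int, q: int) -> int:
    """Compute c^d mod (p*q) through two half-size exponentiations."""
    d1 = d % (p - 1)
    d2 = d % (q - 1)
    qinv = pow(q, -1, p)           # (1/q) mod p; needs Python 3.8 or later
    m1 = pow(c, d1, p)
    m2 = pow(c, d2, q)             # the exponentiation the timing attack targets
    h = (qinv * (m1 - m2)) % p
    return m2 + h * q


p, q, e, d = 61, 53, 17, 2753      # toy parameters: e*d = 1 mod (p-1)(q-1)
c = pow(65, e, p * q)
print(rsa_crt_decrypt(c, d, p, q), pow(c, d, p * q))
```

Working modulo p and q separately uses operands of half the length, which is why implementations prefer it for speed.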

Operations Involved in Decryption

What is needed to compute c^d mod q and xy mod q?

  • Exponentiation
    • Sliding windows
  • Multiplication routines
    • Normal (when operands have unequal length)
    • Karatsuba (when operands have equal length): faster
  • Modular reduction
    • Montgomery reduction
Montgomery Reduction
  • Decryption requires computing m2 = c^d2 mod q
  • This is done by repeated multiplication
    • Simple: square and multiply (process d2 1 bit at a time)
    • More clever: sliding windows (process d2 in 5-bit blocks)
  • In either case, many multiplications modulo q
  • Multiplications use Montgomery reduction
    • Pick some R = 2^k
    • To compute x*y mod q, convert x and y into their Montgomery form xR mod q and yR mod q
    • Compute (xR * yR) * R^-1 = zR mod q
      • Multiplication by R^-1 can be done very efficiently
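A minimal sketch of Montgomery multiplication for an odd modulus q and R = 2^k > q. The function names are illustrative and this is not how any particular library structures it. Each reduction also reports whether the final conditional subtraction of q was needed, since that data-dependent extra step is what leaks through timing.

```python
def montgomery_setup(q: int, k: int) -> int:
    """Return -q^{-1} mod R for R = 2^k; q must be odd and smaller than R."""
    R = 1 << k
    if q % 2 == 0 or q >= R:
        raise ValueError("need odd q < 2^k")
    return (-pow(q, -1, R)) % R  # needs Python 3.8 or later


def redc(t: int, q: int, k: int, q_neg_inv: int):
    """Return (t * R^-1 mod q, extra_reduction) for 0 <= t < q * 2^k."""
    mask = (1 << k) - 1
    m = ((t & mask) * q_neg_inv) & mask
    u = (t + m * q) >> k          # exact division by R: shifts only, no trial division
    if u >= q:
        return u - q, True        # the data-dependent extra subtraction
    return u, False


def montgomery_multiply(x: int, y: int, q: int, k: int):
    """Return (x*y mod q, whether the multiplication step needed an extra subtraction)."""
    q_neg_inv = montgomery_setup(q, k)
    x_mont = (x << k) % q         # xR mod q
    y_mont = (y << k) % q         # yR mod q
    z_mont, extra = redc(x_mont * y_mont, q, k, q_neg_inv)  # zR mod q
    z, _ = redc(z_mont, q, k, q_neg_inv)                    # leave Montgomery form
    return z, extra


print(montgomery_multiply(45, 76, 97, 8), (45 * 76) % 97)
```

In a real exponentiation the operands stay in Montgomery form across all multiplications, so the conversion cost is paid only once at the start and once at the end.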
Schindler’s Observation
  • At the end of Montgomery reduction, if zR > q, then need to subtract q
    • Probability of this extra step is proportional to c mod q
  • If c is close to q, a lot of subtractions will be done
  • If c mod q = 0, very few subtractions
    • Decryption will take longer as c gets closer to q, then become fast as c passes a multiple of q
  • By playing with different values of c and observing how long decryption takes, attacker can guess q!
    • Doesn’t work directly against OpenSSL because of sliding windows and two multiplication algorithms
Reduction Timing Dependency

[Figure: decryption time plotted against the value of ciphertext c, with q, 2q and p marked on the ciphertext axis]

Attack Is Binary Search

[Figure: decryption time plotted against the value of ciphertext around q, showing the contributions of the number of reductions and of the multiplication routine, and the 0-1 gap]
Attack Overview
  • Initial guess g for q between 2^511 and 2^512 (why?)
  • Try all possible guesses for the top few bits
  • Suppose we know i-1 top bits of q. Goal: ith bit
    • Set g = …known i-1 bits of q…000000
    • Set ghi = …known i-1 bits of q…100000 (note: g < ghi)
      • If g < q < ghi then the ith bit of q is 0
      • If g < ghi < q then the ith bit of q is 1
  • Goal: decide whether g < q < ghi or g < ghi < q
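The bit-by-bit search can be sketched with the timing measurement abstracted away. Here `q_is_below` is a hypothetical oracle standing in for the attacker's decision, made from decryption times, that q lies below ghi; the sketch cheats by comparing against a known toy q, which a real attacker cannot do.

```python
def recover_q(total_bits: int, q_is_below) -> int:
    """Recover q from the top bit down, one oracle decision per bit."""
    known = "1"  # the top bit of a full-length q is 1
    while len(known) < total_bits:
        padding = "0" * (total_bits - len(known) - 1)
        g = int(known + "0" + padding, 2)      # known bits, then 0, then zeros
        g_hi = int(known + "1" + padding, 2)   # known bits, then 1, then zeros
        # q below g_hi means the next bit of q is 0; otherwise it is 1.
        known += "0" if q_is_below(g, g_hi) else "1"
    return int(known, 2)


toy_q = 0b1011011101  # stand-in for the 512-bit prime factor
print(bin(recover_q(10, lambda g, g_hi: toy_q < g_hi)))
```

Each step costs a constant number of timing decisions, so the work grows linearly with the bit length of q rather than with the size of q.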
Two Possibilities for ghi

[Figure: decryption time (number of reductions, multiplication routine) plotted against the value of ciphertext, with g, q and the two candidate positions of ghi marked]

  • If g < ghi < q, the difference in decryption times between g and ghi will be small
  • If g < q < ghi, the difference in decryption times between g and ghi will be large

Timing Attack Details
  • What is “large” and “small”?
    • Know from attacking previous bits
  • Decrypting just g does not work because of sliding windows
    • Decrypt a neighborhood of values near g
    • Will increase difference between large and small values, resulting in larger 0-1 gap
  • The attack requires only about 2 hours and about 1.4 million queries to recover the private key
    • Only the most significant half of the bits of q needs to be recovered
The 0-1 Gap

[Figure: the zero-one gap]

Extracting RSA Private Key

[Figure: the zero-one gap, with one region where Montgomery reduction dominates and one where the multiplication routine dominates]

Fault injection techniques
  • Transient (provisional) and permanent (destructive) faults
    • Variations to supply voltage
    • Variations in the external clock
    • Temperature
    • White light
    • Laser light
    • X-rays and ion beams
    • Electromagnetic flux
Provisional faults
  • Single event upsets
    • Temporary flips in a cell’s logical state to a complementary state
  • Multiple event faults
    • Several simultaneous SEUs
  • Dose rate faults
    • The individual effects are negligible, but cumulative effect causes fault
  • Provisional faults are used more in fault injection
Permanent faults
  • Single-event burnout faults
    • Caused by a parasitic thyristor being formed in the MOS power transistors
  • Single-event snap back faults
    • Caused by self-sustained current by parasitic bipolar transistors in MOS
  • Single-event latch-up faults
    • Creates a self sustained current in parasitics
  • Total dose rate faults
    • Progressive degradation of the electronic circuit
Fault impacts (model)
  • Resetting data
  • Data randomization: could be misleading, since the attacker has no control over the result
  • Modifying op-code – implementation dependent