Physical Security and Side-Channel Attacks

# Physical Security and Side-Channel Attacks

Rice ELEC 528 / COMP 538. Farinaz Koushanfar. Spring 2009.

### Physical Security and Side-Channel Attacks

Rice ELEC 528/ COMP 538

Farinaz Koushanfar

Spring 2009

Outline
• Introduction
• Hardware targets
• Attack classification
• Power attacks
• Timing attacks
• Electromagnetic attacks
• Fault injection attacks
Introduction
• Classical cryptography treats security as a problem of mathematical abstractions
• Classical cryptanalysis has been very successful and remains promising
• It analyzes and quantifies crypto algorithms' resilience against attacks
• Recently, many security protocols have been broken using physical attacks
• These attacks exploit implementation-specific characteristics to recover the secret parameters
Physical attacks
• Traditional cryptography is centered around the concepts of one-way and trapdoor functions
• A one-way function can be rapidly calculated, but is computationally difficult to invert
• No polynomial-time algorithm finds a pre-image of a one-way function with more than negligible probability when the input is chosen at random
• A trapdoor one-way function is a function that is easy to invert if and only if a certain secret (key) is available
• Physical attacks usually have two phases:
• Interaction phase: the attacker exploits some physical characteristics of the device
• Exploitation phase: analyzing the gathered information to recover the secret
Model
• Consider a device capable of performing cryptographic functions
• The key is usually stored in the device and protected
• Modern cryptography is based on Kerckhoffs's assumption: all of the secret data needed to keep the device secure is contained entirely in the key
• The attacker therefore only needs to extract the key
Principle of divide-and-conquer attack
• Divide-and-conquer (D&C) attacks attempt to recover the key in parts
• The idea is that an observable characteristic can be correlated with a partial key
• The partial key should be small enough to enable exhaustive search
• Once a partial key is validated, the process is repeated to find the remaining partial keys
• D&C attacks may be iterative (some parts of the key dependent on others) or independent
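The divide-and-conquer idea can be sketched with a toy example. Everything here is an assumption made for illustration: the 4-byte key, the `leak` function (a noiseless Hamming-weight observation per key byte), and the choice of probe inputs. The point is only that each byte is found by an independent search over 2^8 candidates instead of one search over 2^32.

```python
def hamming_weight(x):
    return bin(x).count("1")

SECRET_KEY = bytes([0x3A, 0xC5, 0x00, 0xFF])  # hypothetical 4-byte key

def leak(plaintext):
    """Toy side channel: one noiseless Hamming-weight value per key byte."""
    return [hamming_weight(p ^ k) for p, k in zip(plaintext, SECRET_KEY)]

def recover_key(n_bytes):
    # Probing with 0 and every single-bit value pins each key byte down uniquely.
    probes = [0] + [1 << b for b in range(8)]
    observations = [leak(bytes([p] * n_bytes)) for p in probes]
    key = []
    for i in range(n_bytes):                  # each partial key is attacked on its own
        candidates = [
            k for k in range(256)             # exhaustive search over one byte
            if all(hamming_weight(p ^ k) == obs[i]
                   for p, obs in zip(probes, observations))
        ]
        key.append(candidates[0])
    return bytes(key)

recovered = recover_key(len(SECRET_KEY))
print(recovered.hex())
```

This is the independent flavor of D&C: no byte depends on another. In an iterative attack the search for one part would use the parts already recovered.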
Hardware targets
• The most common victims of hardware cryptanalysis are smart cards (SCs)
• Attacks on SCs are applicable to any general purpose processor with a fixed bus length
• Attacks on FPGAs are also reported. FPGAs represent application specific devices with parallel computing opportunity
Smart Cards
• A smart card has a small processor (8-bit or 32-bit) along with ROM, EEPROM, and a small RAM
• There are eight wires connecting the processor to the outside world
• Power supply: SCs have no internal battery; the current is provided by the reader
• Clock: SCs do not have an internal clock
• SCs are typically equipped with a shield that destroys the chip if tampering is detected
FPGAs
• The first difference from SCs is in the applications of the two kinds of device
• FPGAs and ASICs allow parallel computing
• Multiple programmable configuration bits
Attack classification
• Many attacks are possible, and the categories are often not mutually exclusive
• Invasive vs. noninvasive attacks
• Active vs. passive
• Active attacks tamper with the device’s proper functionality, either temporarily or permanently
Five major attack groups
• Probing attack (invasive)
• Fault injection attacks: active attacks, may be invasive or noninvasive
• Timing attacks exploit device’s running time
• Power analysis attack
• Electromagnetic analysis attacks
Power attacks
Measuring phase
• This task is usually straightforward
• Easy for smart cards: the energy is provided by the terminal and the current can be read
• Relatively inexpensive (under $1000) equipment can digitally sample voltage differences at high rates (1 GHz and above) with less than 1% error
• Device’s power consumption depends on many things, including its structure and data
Simple power analysis (SPA)
• Monitoring the device’s power consumption to deduce information about data/operation
• Example: SPA on DES – smart card
• The internal structure is shown in the next slide
• Summary DES - a block cipher
• a product cipher
• 16 rounds (iterations) on the input bits (of P)
• substitutions (for confusion) and
• permutations (for diffusion)
• Each round with a round key
• Generated from the user-supplied key

[Figure: DES basic structure. Input → Input Permutation → L0, R0 → 16 rounds, each applying substitution S and permutation P with a round key K1 … K16 derived from K → L16, R16 → Final Permutation → Output. Cf. J. Leiwo]

• Input: 64 bits (a block)
• Li/Ri: left/right half of the block at iteration i (32 bits each), subject to substitution S and permutation P (cf. Fig 2-8 in the text)
• K: user-supplied key, 64 bits = 56 bits used + 8 unused (unused for encryption but often used for error checking)
• Ki: round key for iteration i (48 bits, derived from K)
• Output: 64 bits (a block)
• Note: Ri becomes L(i+1)
• All basic op’s are simple logical ops
• Left shift / XOR
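The round structure above (Ri becomes L(i+1), and the new right half is Li XORed with a keyed function of Ri) can be sketched as a generic Feistel network. This is not DES: `round_function` is an arbitrary stand-in for the real expansion, S-boxes, and permutation, the round keys are made up, and the initial and final permutations are omitted.

```python
MASK32 = 0xFFFFFFFF

def round_function(right, round_key):
    """Stand-in for the DES round function; NOT the real S-boxes or permutation."""
    x = (right ^ round_key) & MASK32
    x = (x * 0x9E3779B1) & MASK32        # arbitrary mixing constant (assumption)
    return x ^ (x >> 15)

def feistel(block, round_keys):
    left, right = block >> 32, block & MASK32
    for key in round_keys:
        # R_i becomes L_(i+1); the new right half is L_i XOR f(R_i, K_i)
        left, right = right, left ^ round_function(right, key)
    # Swap the halves at the end so decryption is the same routine with reversed keys
    return (right << 32) | left

# Hypothetical round keys; real DES derives 16 48-bit round keys from K
ROUND_KEYS = [(0x0F1E2D3C + 0x01010101 * i) & MASK32 for i in range(16)]

def encrypt(block):
    return feistel(block, ROUND_KEYS)

def decrypt(block):
    return feistel(block, ROUND_KEYS[::-1])

plaintext = 0x0123456789ABCDEF
ciphertext = encrypt(plaintext)
print(hex(ciphertext), hex(decrypt(ciphertext)))
```

The sketch shows why the round function itself never needs to be inverted: only XOR and the swap are undone, which is why all basic operations can stay simple.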
Example 1 - SPA on DES (cont’d)
• The upper trace: the entire encryption, including the initial permutation, the 16 DES rounds, and the final permutation
• The lower trace – detailed view of the second and third rounds

SPA on DES (cont’d)
• The DES structure and 16 rounds are known
• Instruction flow depends on data → power signature
• Example: modular exponentiation (used in RSA, not in DES) is often implemented with the square-and-multiply algorithm
• Typically the square operation is implemented differently from the multiply (for speed)
• The power trace of the exponentiation then directly reveals the bits of the exponent
• All programs involving conditional branching based on the key values are at risk!
SPA example (cont’d)
• Unprotected modular exponentiation – square and multiply algorithm
• The peak values reveal the key values
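A minimal sketch of why unprotected square-and-multiply leaks: the sequence of operations is a direct encoding of the exponent. The `trace` list stands in for what an attacker would read off a power trace (assuming squares and multiplies are visually distinguishable); the base, exponent, and modulus are arbitrary illustration values.

```python
def square_and_multiply(base, exponent, modulus):
    """Left-to-right binary exponentiation that also logs each operation."""
    result, trace = 1, []
    for bit in bin(exponent)[2:]:
        result = (result * result) % modulus   # always executed
        trace.append("S")
        if bit == "1":                          # key-dependent branch
            result = (result * base) % modulus
            trace.append("M")
    return result, trace

def exponent_from_trace(trace):
    """Read the exponent back from the operation sequence alone."""
    bits = []
    for op in trace:
        if op == "S":
            bits.append("0")      # every bit starts with a square
        else:
            bits[-1] = "1"        # a multiply marks the current bit as 1
    return int("".join(bits), 2)

value, trace = square_and_multiply(7, 0b101101, 1000003)
print("".join(trace))                    # SMSSMSMSSM
print(bin(exponent_from_trace(trace)))   # 0b101101
```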
Differential power analysis (DPA)
• SPA targets variable instruction flow
• DPA targets data-dependence
• Difference between smart cards (SCs) and FPGAs
• In SCs, one operation runs at a time
• → Simple power tracing is possible
• In FPGAs, parallel computation typically prevents visual SPA inspection → DPA
Example: DPA on DES
• Divide-and-conquer strategy, comparing power consumption for different inputs
• Encrypt a large number of inputs and record the corresponding power consumption
• We have access to R15, which entered the last round operation, since it is equal to L16
• Take one output bit b at the last round (called M’i) and classify the power curves based on its value
• 6 specific bits of R15 are XORed with 6 bits of the key before entering the S-box
• By guessing the 6-bit key value, we can predict the bit b, or an arbitrary output bit of an arbitrary S-box
• Thus, with 2^6 = 64 partitions, one for each possible 6-bit key value, we can break the cipher much faster than by exhaustive search

A closer look at HW implementation of DES

DPA (cont’d)
• DPA can be performed on any algorithm that has the operation β = S(α ⊕ K),
• where α is known and K is the segment key

The waveforms are captured by a scope and sent to a computer for analysis

DPA (cont’d)

The predicted bit classifies each waveform wi

• Hypothesis 1: bit is zero
• Hypothesis 2: bit is one
• A differential trace will be calculated for each bit!
DPA (cont’d)
• The DPA waveform with the highest peak will validate the hypothesis
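The classification step can be illustrated with a small simulation of a difference-of-means DPA. Several things here are assumptions rather than part of the slides: the S-box is a random 8-bit permutation standing in for a real one (DES S-boxes are 6-bit to 4-bit), the leakage is the Hamming weight of the S-box output plus Gaussian noise, and each trace is reduced to a single sample, so the "differential trace" collapses to one number per key guess.

```python
import random

rng = random.Random(2009)          # fixed seed so the toy run is repeatable
SBOX = list(range(256))
rng.shuffle(SBOX)                  # stand-in S-box, NOT taken from DES
SECRET_KEY = 0xA7                  # hypothetical key segment K

def hamming_weight(x):
    return bin(x).count("1")

def measure(alpha):
    """One simulated power sample for known input alpha."""
    return hamming_weight(SBOX[alpha ^ SECRET_KEY]) + rng.gauss(0.0, 1.0)

inputs = [a for a in range(256) for _ in range(8)]   # 2048 known inputs
samples = [measure(a) for a in inputs]

def differential(key_guess, bit=0):
    """Mean power when the predicted bit is 1 minus mean power when it is 0."""
    ones, zeros = [], []
    for a, w in zip(inputs, samples):
        predicted = (SBOX[a ^ key_guess] >> bit) & 1
        (ones if predicted else zeros).append(w)
    return sum(ones) / len(ones) - sum(zeros) / len(zeros)

# The guess with the highest differential peak validates the hypothesis
best_guess = max(range(256), key=differential)
print(hex(best_guess))
```

For the correct guess the two groups really differ by about one bit of Hamming weight; for wrong guesses the partition is essentially random and the difference stays small.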
Improvements over DPA
• Correlation power analysis (CPA) - attacker steps
• Predict the power usage of the device at one specific instant, as a function of certain key bits
• E.g., for DES, it is assumed to be function of the Hamming weight of the data
• Prediction matrix stores the predicted values
• The consumption vector stores the measured power
• The attacker compares the actual and the predicted values using the correlation coefficient
• E.g., the correlation between each column of the prediction matrix and the consumption vector
Modeling the power consumption
• Hamming weight model
• Typically measured on a bus, Y=aH(X)+b
• Y: power consumption; X: data value; H: Hamming weight
• The Hamming distance model
• Y=aH(P⊕X)+b
• Accounting for the previous value on the bus (P)
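The two models are simple enough to write down directly. In this sketch the default gain `a = 1.0` and offset `b = 0.0` are placeholders; on a real device they would have to be measured.

```python
def hamming_weight(x):
    return bin(x).count("1")

def hw_model(x, a=1.0, b=0.0):
    """Hamming weight model: Y = a*H(X) + b."""
    return a * hamming_weight(x) + b

def hd_model(previous, x, a=1.0, b=0.0):
    """Hamming distance model: Y = a*H(P xor X) + b, P = previous bus value."""
    return a * hamming_weight(previous ^ x) + b

print(hw_model(0b1011))           # 3.0
print(hd_model(0b1111, 0b1010))   # 2.0 (two bus lines toggle)
```

The Hamming distance model reduces to the Hamming weight model when the previous bus value is all zeros.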
Correlation power analysis (CPA)
• The equation for generating differential waveforms replaced with correlations
• Rather than attacking one bit, the attacker tries prediction of the Hamming weight of a word (H)
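• The correlation is computed as the correlation coefficient between the measured power Y and the predicted Hamming weight H: ρ(Y,H) = cov(Y,H) / (σ_Y σ_H)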
• Data-dependent attacks require power consumption model
• Can be measured and learned
• Synchronization of the measurements needs to be addressed
• The attack is affected by parallel computing which lowers observability
• The described attack is not the best achieved to date, e.g., techniques based on maximum likelihood often offer better results
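A toy CPA along these lines, under stated assumptions: the target is an 8-bit random permutation used as a stand-in S-box, the device leaks the Hamming weight of the S-box output with gain 1, offset 0, and Gaussian noise, and each measurement is a single, already synchronized sample. The prediction matrix is built one column per key guess and each column is correlated with the consumption vector.

```python
import math
import random

rng = random.Random(528)               # fixed seed for a repeatable toy run
SBOX = list(range(256))
rng.shuffle(SBOX)                      # stand-in S-box, NOT taken from DES
SECRET_KEY = 0x5C                      # hypothetical key byte

def hamming_weight(x):
    return bin(x).count("1")

def pearson(xs, ys):
    """Sample correlation coefficient cov(X, Y) / (sigma_X * sigma_Y)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

inputs = [rng.randrange(256) for _ in range(500)]
# Consumption vector: simulated measurements, one per known input
consumption = [hamming_weight(SBOX[x ^ SECRET_KEY]) + rng.gauss(0.0, 1.0)
               for x in inputs]

# Prediction matrix, column by column: predicted Hamming weight per key guess
correlations = []
for guess in range(256):
    column = [hamming_weight(SBOX[x ^ guess]) for x in inputs]
    correlations.append(pearson(column, consumption))

best_guess = max(range(256), key=lambda k: correlations[k])
print(hex(best_guess), round(correlations[best_guess], 2))
```

Because the whole word's Hamming weight is predicted instead of a single bit, far fewer measurements are needed here than in a one-bit difference-of-means attack on the same simulated device.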
Anti-DPA
• Internal clock phase shift
References
• Differential power analysis, by Kocher, Jaffe, and Jun
• Power analysis tutorial, by Aigner and Oswald
• A tutorial on physical security and side-channel attacks, by Koeune and Standaert
• Michael Tunstall has some good material; a few of the charts are courtesy of him
• Side channel attacks: countermeasures, by Verbauwhede
Timing attacks
• Running time of a crypto processor can be used as an information channel
• The idea was proposed by Kocher, Crypto’96
RSA Cryptosystem
• Key generation:
• Generate large (say, 2048-bit) primes p, q
• Compute n = pq and φ(n) = (p-1)(q-1)
• Choose small e, relatively prime to φ(n)
• Typically, e = 3 (may be vulnerable) or e = 2^16 + 1 = 65537 (why?)
• Compute unique d such that ed = 1 mod φ(n)
• Public key = (e,n); private key = d
• Security relies on the assumption that it is difficult to factor n into p and q
• Encryption of m: c = m^e mod n
• Decryption of c: c^d mod n = (m^e)^d mod n = m
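The key generation, encryption, and decryption steps can be traced with textbook-sized numbers. The primes 61 and 53, the exponent 17, and the message 65 are illustration values only; real keys use primes hundreds of digits long and padded messages.

```python
from math import gcd

p, q = 61, 53                  # toy primes, far too small for real use
n = p * q                      # 3233
phi = (p - 1) * (q - 1)        # 3120
e = 17                         # small public exponent
assert gcd(e, phi) == 1        # e must be relatively prime to phi(n)
d = pow(e, -1, phi)            # modular inverse (Python 3.8+): e*d = 1 mod phi(n)

m = 65                         # message, must be smaller than n
c = pow(m, e, n)               # encryption: c = m^e mod n
recovered_m = pow(c, d, n)     # decryption: c^d mod n = m
print(n, phi, d, c, recovered_m)
```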
How Does RSA Decryption Work?
• RSA decryption: compute y^x mod n
• This is a modular exponentiation operation
• Naïve algorithm: square and multiply
Kocher’s Observation

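• Whether an iteration takes a long time depends on the kth bit of the secret exponent
• If the bit is 1, the iteration includes a modular multiplication: this takes a while to compute
• If the bit is 0, the multiplication is skipped: this is instantaneous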

Outline of Kocher’s Attack
• Idea: guess some bits of the exponent and predict how long decryption will take
• If guess is correct, will observe correlation; if incorrect, then prediction will look random
• This is a signal detection problem, where signal is timing variation due to guessed exponent bits
• The more bits you already know, the stronger the signal, thus easier to detect (error-correction property)
• Start by guessing a few top bits, look at correlations for each guess, pick the most promising candidate and continue
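The signal-detection idea can be tried out in a toy simulation. This is only in the spirit of Kocher's attack, with many assumptions: the modulus and 16-bit exponent are made up, the "running time" of each modular operation is modeled as the popcount of its result (`op_time`, a hypothetical model), the attacker knows this model and the exponent length, and bits are guessed one at a time rather than a few at once. For each bit, the guess whose predicted time explains more of the measured variation (lower residual variance) is kept.

```python
import random

rng = random.Random(1996)                # fixed seed: repeatable toy experiment
MODULUS = 1_000_003 * 998_244_353        # toy modulus, far too small for real RSA
SECRET_D = 0b1011001110001101            # hypothetical 16-bit secret exponent
N_BITS = SECRET_D.bit_length()           # assumed known to the attacker

def op_time(value):
    """Hypothetical timing model: cost of a modular op = popcount of its result."""
    return bin(value).count("1")

def device_time(y):
    """Victim: left-to-right square-and-multiply; returns noisy total time."""
    s, total = 1, 0.0
    for bit in format(SECRET_D, "b"):
        s = s * s % MODULUS
        total += op_time(s)
        if bit == "1":
            s = s * y % MODULUS
            total += op_time(s)
    return total + rng.gauss(0.0, 2.0)

def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

messages = [rng.randrange(2, MODULUS) for _ in range(4000)]
timings = [device_time(y) for y in messages]

states = [1] * len(messages)   # attacker's emulation of s for the known prefix
residual = list(timings)       # measured time minus the time explained so far
recovered_bits = ""
for _ in range(N_BITS):
    squared = [s * s % MODULUS for s in states]
    res0 = [r - op_time(v) for r, v in zip(residual, squared)]
    multiplied = [v * y % MODULUS for v, y in zip(squared, messages)]
    res1 = [r - op_time(m) for r, m in zip(res0, multiplied)]
    # A correct guess removes real timing variation; a wrong one adds noise.
    if variance(res1) < variance(res0):
        recovered_bits += "1"
        states, residual = multiplied, res1
    else:
        recovered_bits += "0"
        states, residual = squared, res0

print(recovered_bits)
```

Each correctly recovered bit shrinks the unexplained part of the timing, which is the sense in which the signal gets stronger as more bits are known.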
RSA in OpenSSL
• OpenSSL is a popular open-source toolkit
• mod_SSL (in Apache = 28% of HTTPS market)
• stunnel (secure TCP/IP servers)
• sNFS (secure NFS)
• Many more applications
• Kocher’s attack doesn’t work against OpenSSL
• Instead of square-and-multiply, OpenSSL uses CRT, sliding windows and two different multiplication algorithms for modular exponentiation
• CRT = Chinese Remainder Theorem
• Secret exponent is processed in chunks, not bit-by-bit
Chinese Remainder Theorem
• n = n1 n2 … nk, where gcd(ni, nj) = 1 when i ≠ j
• The system of congruences x = x1 mod n1, …, x = xk mod nk has a simultaneous solution x to all congruences
• There exists exactly one solution x between 0 and n-1
• For RSA modulus n = pq, to compute x mod n it’s enough to know x mod p and x mod q

RSA Decryption With CRT
• To decrypt c, need to compute m = c^d mod n
• Use Chinese Remainder Theorem (why?)
• d1 = d mod (p-1)
• d2 = d mod (q-1)
• qinv = (1/q) mod p
• d1, d2, and qinv are precomputed
• Compute m1 = c^d1 mod p; m2 = c^d2 mod q
• Compute m = m2 + (qinv*(m1-m2) mod p)*q

Attack the computation of m2 = c^d2 mod q in order to learn q.

This is enough to learn private key (why?)
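A small sketch of CRT decryption with toy parameters (p = 61, q = 53, e = 17 are illustration values). It also shows why learning q is enough: p follows by dividing n by q, and the private exponent follows from p, q, and the public e. The function name `private_key_from_q` is made up for this example.

```python
p, q = 61, 53                         # toy primes
n, e = p * q, 17
d = pow(e, -1, (p - 1) * (q - 1))     # full private exponent (Python 3.8+)

# Precomputed CRT values
d1 = d % (p - 1)
d2 = d % (q - 1)
qinv = pow(q, -1, p)                  # (1/q) mod p

def decrypt_crt(c):
    m1 = pow(c, d1, p)                # half-size exponentiation mod p
    m2 = pow(c, d2, q)                # half-size exponentiation mod q
    return m2 + ((qinv * (m1 - m2)) % p) * q

def private_key_from_q(n, e, q):
    """Anyone who learns q can rebuild the private exponent."""
    p = n // q
    return pow(e, -1, (p - 1) * (q - 1))

c = pow(65, e, n)
print(decrypt_crt(c), private_key_from_q(n, e, q) == d)
```

The two exponentiations work with half-length moduli and exponents, which is the usual reason for preferring CRT decryption.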

Operations Involved in Decryption

What is needed to compute c^d mod q and x*y mod q?

• Exponentiation
• Sliding windows
• Multiplication routines
• Normal (when operands have unequal length)
• Karatsuba (when operands have equal length): faster
• Modular reduction
• Montgomery reduction
Montgomery Reduction
• Decryption requires computing m2 = c^d2 mod q
• This is done by repeated multiplication
• Simple: square and multiply (process d2 1 bit at a time)
• More clever: sliding windows (process d2 in 5-bit blocks)
• In either case, many multiplications modulo q
• Multiplications use Montgomery reduction
• Pick some R = 2^k
• To compute x*y mod q, convert x and y into their Montgomery form xR mod q and yR mod q
• Compute (xR * yR) * R^-1 = zR mod q
• Multiplication by R^-1 can be done very efficiently
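A compact sketch of Montgomery multiplication, assuming an odd modulus q and R chosen as the smallest power of two above q. The function names are made up for this example and the code is not how OpenSSL implements it; it only shows that multiplying by R^-1 needs nothing more than masks, shifts, and at most one final subtraction, which is reported as a flag.

```python
def montgomery_setup(q):
    """Return (k, R, q_neg_inv) with R = 2^k > q and q*q_neg_inv = -1 mod R."""
    k = q.bit_length()
    R = 1 << k
    q_neg_inv = (-pow(q, -1, R)) % R      # requires q odd (Python 3.8+)
    return k, R, q_neg_inv

def redc(T, q, k, R, q_neg_inv):
    """Montgomery reduction: (T * R^-1 mod q, extra_reduction) for 0 <= T < q*R."""
    m = ((T & (R - 1)) * q_neg_inv) & (R - 1)   # makes T + m*q divisible by R
    t = (T + m * q) >> k                         # exact division by R
    if t >= q:
        return t - q, True                       # the data-dependent extra step
    return t, False

def mont_mul(x, y, q):
    k, R, q_neg_inv = montgomery_setup(q)
    xR, yR = (x * R) % q, (y * R) % q            # convert to Montgomery form
    zR, _ = redc(xR * yR, q, k, R, q_neg_inv)    # zR = x*y*R mod q
    z, _ = redc(zR, q, k, R, q_neg_inv)          # convert back: z = x*y mod q
    return z

print(mont_mul(1234, 5678, 10007), (1234 * 5678) % 10007)
```

In a real exponentiation the conversion to and from Montgomery form is done once, and all intermediate products stay in Montgomery form.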
Schindler’s Observation
• At the end of Montgomery reduction, if zR > q, then need to subtract q
• Probability of this extra step is proportional to c mod q
• If c is close to q, a lot of subtractions will be done
• If c mod q = 0, very few subtractions
• Decryption will take longer as c gets closer to q, then become fast as c passes a multiple of q
• By playing with different values of c and observing how long decryption takes, attacker can guess q!
• Doesn’t work directly against OpenSSL because of sliding windows and two multiplication algorithms
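The dependence of the extra subtraction on the operand can be checked empirically. In this sketch the toy modulus, the number of trials, and the helper names are assumptions, and the second operand `b` is taken to be the value as it enters the Montgomery multiplication (reduced mod q). The fraction of multiplications that need the extra subtraction is compared for an operand just below q and one just above q.

```python
import random

def extra_reduction(a, b, q, k, q_neg_inv):
    """True if Montgomery-multiplying a by b needs the final subtraction of q."""
    mask = (1 << k) - 1
    T = a * b
    m = ((T & mask) * q_neg_inv) & mask
    return ((T + m * q) >> k) >= q

def extra_reduction_rate(b, q, trials=2000, seed=0):
    rng = random.Random(seed)
    k = q.bit_length()                         # R = 2^k
    q_neg_inv = (-pow(q, -1, 1 << k)) % (1 << k)
    hits = sum(extra_reduction(rng.randrange(q), b % q, q, k, q_neg_inv)
               for _ in range(trials))
    return hits / trials

Q = 1_000_003                                  # toy odd modulus
rate_just_below = extra_reduction_rate(Q - 1, Q)   # operand close to q
rate_just_above = extra_reduction_rate(Q + 1, Q)   # operand just past q
print(rate_just_below, rate_just_above)
```

The sharp drop as the operand passes q is the discontinuity that the attack measures through decryption time.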
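Reduction Timing Dependency

[Figure: decryption time vs. value of ciphertext c, with marks at q, 2q, and p]

Attack Is Binary Search

[Figure: decryption time vs. value of ciphertext near q; curves for the number of reductions and the multiplication routine; the 0-1 gap is marked]
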
Attack Overview
• Initial guess g for q between 2^511 and 2^512 (why?)
• Try all possible guesses for the top few bits
• Suppose we know the top i-1 bits of q. Goal: the ith bit
• Set g = …known i-1 bits of q…000000
• Set g_hi = …known i-1 bits of q…100000 (note: g < g_hi)
• If g < q < g_hi then the ith bit of q is 0
• If g < g_hi < q then the ith bit of q is 1
• Goal: decide whether g < q < g_hi or g < g_hi < q
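The bit-by-bit search can be sketched with an idealized oracle in place of real timing measurements. The secret value, the function names, and the oracle are assumptions for illustration: `ideal_oracle` simply answers whether q lies between g and g_hi, which in the real attack has to be inferred from the size of the timing gap.

```python
SECRET_Q = 1_000_003            # hypothetical 20-bit "prime factor"

def ideal_oracle(g, g_hi):
    """Stand-in for the timing measurement: True when g <= q < g_hi (large gap)."""
    return g <= SECRET_Q < g_hi

def recover_prime(bits, timing_gap_is_large):
    g = 1 << (bits - 1)                 # top bit of a `bits`-bit number is 1
    for i in range(bits - 2, -1, -1):   # from the next-highest bit down to bit 0
        g_hi = g | (1 << i)             # same known prefix, candidate bit set to 1
        if not timing_gap_is_large(g, g_hi):
            g = g_hi                    # g_hi is still not above q: the bit is 1
        # otherwise q lies between g and g_hi: the bit is 0, g is unchanged
    return g

print(recover_prime(20, ideal_oracle))
```

Each query fixes one bit, so the number of oracle decisions is linear in the bit length of q.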

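Two Possibilities for g_hi

[Figure: decryption time vs. value of ciphertext (number of reductions, multiplication routine), marking g, q, and the two candidate positions of g_hi]

• If g < g_hi < q, the difference in decryption times between g and g_hi will be small
• If g < q < g_hi, the difference in decryption times between g and g_hi will be large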

Timing Attack Details
• What is “large” and “small”?
• Know from attacking previous bits
• Decrypting just g does not work because of sliding windows
• Decrypt a neighborhood of values near g
• Will increase difference between large and small values, resulting in larger 0-1 gap
• Attack requires only 2 hours, about 1.4 million queries to recover the private key
• Only the most significant half of the bits of q needs to be recovered
The 0-1 Gap

[Figure: zero-one gap]

Extracting RSA Private Key

[Figure: zero-one gap; Montgomery reduction dominates in one region, the multiplication routine dominates in the other]

Fault injection techniques
• Transient (provisional) and permanent (destructive) faults
• Variations to supply voltage
• Variations in the external clock
• Temperature
• White light
• Laser light
• X-rays and ion beams
• Electromagnetic flux
Provisional faults
• Single event upsets
• Temporary flips in a cell’s logical state to a complementary state
• Multiple event faults
• Several simultaneous SEUs
• Dose rate faults
• The individual effects are negligible, but cumulative effect causes fault
• Provisional faults are used more in fault injection
Permanent faults
• Single-event burnout faults
• Caused by a parasitic thyristor being formed in the MOS power transistors
• Single-event snap back faults
• Caused by self-sustained current by parasitic bipolar transistors in MOS
• Single-event latch-up faults
• Create a self-sustained current in parasitic structures
• Total dose rate faults
• Progressive degradation of the electronic circuit
Fault impacts (model)
• Resetting data
• Data randomization: could be misleading, since the attacker has no control over the resulting values
• Modifying op-code – implementation dependent