
ECE 645

Spring 2007

PROJECT 2

Specification


Topic Options


Public Key (Asymmetric) Cryptosystems

[Diagram: Alice → Encryption (public key of Bob, $K_B$) → Network → Decryption (private key of Bob, $k_B$) → Bob]


RSA as a trap-door one-way function

PUBLIC KEY: $C = f(M) = M^e \bmod N$ (maps $M$ to $C$)

PRIVATE KEY: $M = f^{-1}(C) = C^d \bmod N$ (maps $C$ back to $M$)

$N = P \cdot Q$

$P$, $Q$ - large prime numbers

$e \cdot d \equiv 1 \pmod{(P-1)(Q-1)}$


RSA keys

PUBLIC KEY: $\{ e, N \}$

PRIVATE KEY: $\{ d, P, Q \}$

$N = P \cdot Q$

$P$, $Q$ - large prime numbers

$e \cdot d \equiv 1 \pmod{(P-1)(Q-1)}$


Early Factoring Device – Lehmer Sieve

Bicycle chain sieve [D. H. Lehmer, 1928]

Computer Museum, Mountain View, CA


Supercomputer Cray-1 from 1980’s

Computer Museum, Mountain View, CA


FPGA based supercomputers

Machine                              Released
SRC 6 from SRC Computers             2002
Cray XD1 from Cray                   2005
SGI Altix from SGI                   2005
SRC 7 from SRC Computers, Inc.       2006


COPACOBANA

Ruhr University, Bochum,

University of Kiel, Germany, 2006

Cost: € 8980

120 Spartan 3 FPGAs

Clock frequency 100 MHz


Factoring 1024-bit RSA keys using Number Field Sieve (NFS)

[Flow diagram: Polynomial Selection → Relation Collection → Linear Algebra → Square Root]

Relation Collection: Sieving, followed by Cofactoring of 200 bit & 350 bit numbers

Cofactoring methods: Trial division; ECM, p-1 method, rho method


Topic 1

Trial Division Sieve


Topic 1: Trial Division Sieve (1)

Given:

Inputs:

Variables:

  1. Integers $N_1, N_2, N_3, \ldots$, each of size $k$ bits

Constants:

  2. Factor base = set of all primes not exceeding a certain bound $B$

     $= \{ p_1 = 2, p_2 = 3, p_3 = 5, \ldots, p_t \le B \}$

Parameters of interest:

  $4 \le k \le 512$

  $3 \le B \le 10^5$


Topic 1: Trial Division Sieve (2)

Required:

Outputs:

For each integer $N_i$:

A list of primes from the factor base that divide $N_i$, and the number of times each prime divides $N_i$.

For example, if

$N_i = p_1^{e_1} \cdot p_2^{e_2} \cdot p_3^{e_3} \cdot M_i$,

where $M_i$ is not divisible by any prime belonging to the factor base, then the output is

$\{p_1, e_1\}, \{p_2, e_2\}, \{p_3, e_3\}$


Topic 1: Trial Division Sieve (3)

Example:

Constants:

k=10, B=5

Factor base = {2, 3, 5}

Variables:

$N_1 = 408 = 2^3 \cdot 3 \cdot 17$

$N_2 = 630 = 2 \cdot 3^2 \cdot 5 \cdot 7$

Outputs:

For $N_1$: {2, 3}, {3, 1}

For $N_2$: {2, 1}, {3, 2}, {5, 1}
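As a software reference model for the example above, the sketch below builds the factor base and divides each prime out of $N_i$. The function names `primes_up_to` and `trial_division` are assumptions made here for illustration; the project itself targets a hardware implementation, which this code does not attempt to model.

```python
def primes_up_to(bound):
    """Return all primes p <= bound (sieve of Eratosthenes)."""
    is_prime = [True] * (bound + 1)
    is_prime[0:2] = [False, False]
    for i in range(2, int(bound ** 0.5) + 1):
        if is_prime[i]:
            for j in range(i * i, bound + 1, i):
                is_prime[j] = False
    return [i for i, flag in enumerate(is_prime) if flag]


def trial_division(n, factor_base):
    """Return ([(prime, exponent), ...], cofactor) for a positive integer n.

    The cofactor is the part of n not divisible by any prime in the base
    (M_i in the notation of the slides).
    """
    factors = []
    for p in factor_base:
        exponent = 0
        while n % p == 0:
            n //= p
            exponent += 1
        if exponent:
            factors.append((p, exponent))
    return factors, n


factor_base = primes_up_to(5)                       # B = 5
factors_n1, cofactor_n1 = trial_division(408, factor_base)
factors_n2, cofactor_n2 = trial_division(630, factor_base)
```

For the two inputs of the example this gives `[(2, 3), (3, 1)]` with cofactor 17, and `[(2, 1), (3, 2), (5, 1)]` with cofactor 7.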


Topic 1: Trial Division Sieve (4)

Optimization Criteria:

Maximum number of integers Ni fully processed per unit

of time for a given k and B.


Topic 2

Greatest Common Divisor

&

Multiplicative Inverse


Topic 2: Greatest Common Divisor and Multiplicative Inverse (2)

Given:

Inputs:

a, N: k-bit integers; a < N

Outputs:

$y = \gcd(a, N)$

$x = a^{-1} \bmod N$

i.e., integer $1 \le x < N$, such that

$a \cdot x \bmod N = 1$

Parameters of interest:

$4 \le k \le 1024$


Greatest common divisor

Greatest common divisor of a and b, denoted by gcd(a, b),

is the largest positive integer that divides both a and b.

$d = \gcd(a, b)$ iff 1) $d \mid a$ and $d \mid b$

2) if $c \mid a$ and $c \mid b$ then $c \le d$


gcd (8, 44) =

gcd (-15, 65) =

gcd (45, 30) =

gcd (31, 15) =

gcd (121, 169) =


Quotient and remainder

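Given integers $a$ and $n$, $n > 0$

$\exists!\; q, r \in \mathbb{Z}$ such that

$a = q \cdot n + r$ and $0 \le r < n$

$q$ – quotient, $r$ – remainder (of $a$ divided by $n$)

$q = \lfloor a / n \rfloor = a \text{ div } n$

$r = a - q \cdot n = a - \lfloor a / n \rfloor \cdot n = a \bmod n$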


Euclid’s Algorithm

for computing gcd(a,b)

i      q_i        r_i
-2                r_{-2} = max(a, b)
-1     q_{-1}     r_{-1} = min(a, b)
0      q_0        r_0
1      q_1        r_1
…      …          …
t-1    q_{t-1}    r_{t-1} = gcd(a, b)
t                 r_t = 0

$r_{i+1} = r_{i-1} \bmod r_i$

$q_i = \lfloor r_{i-1} / r_i \rfloor$

$r_{i+1} = r_{i-1} - q_i \cdot r_i$


Euclid’s Algorithm

Example: gcd(36, 126)

i      q_i          r_i
-2                  r_{-2} = max(a, b) = 126
-1     q_{-1} = 3   r_{-1} = min(a, b) = 36
0      q_0 = 2      r_0 = 18 = gcd(36, 126)
1                   r_1 = 0

$r_{i+1} = r_{i-1} \bmod r_i$

$q_i = \lfloor r_{i-1} / r_i \rfloor$

$r_{i+1} = r_{i-1} - q_i \cdot r_i$


Multiplicative inverse modulo n

The multiplicative inverse of a modulo n is an integer [!!!]

x such that

a x  1 (mod n)

The multiplicative inverse of a modulo n is denoted by

a-1 mod n (in some books a or a*).

According to this notation:

a a-1  1 (mod n)


Extended Euclid’s Algorithm (1)

ri = xi a + yi n

qi

q-1 =  n/a 

q0

q1

qt-1

yi

y-2=1

y-1=0

y0

y1

yt-1

yt

xi

x-2=0

x-1=1

x0

x1

xt-1

xt

ri

r-2 = n

r-1 = a

r0

r1

rt-1

rt=0

i

-2

-1

0

1

t-1

t

ri-1

qi =

ri

ri+1 = ri-1 - qi ri

xi+1 = xi-1 - qi xi

yi+1 = yi-1 - qi yi

rt-1 = xt-1 a + yt-1 n


Extended Euclid’s Algorithm (2)

rt-1 = xt-1 a + yt-1 n

rt-1 = xt-1 a + yt-1 n  xt-1 a (mod n)

If rt-1 = gcd (a, n) = 1 then

xt-1 a  1 (mod n)

and as a result

xt-1 = a-1 mod n


Extended Euclid’s Algorithm

for computing $z = a^{-1} \bmod n$

i      q_i                   r_i            x_i
-2                           r_{-2} = n     x_{-2} = 0
-1     q_{-1} = ⌊n/a⌋        r_{-1} = a     x_{-1} = 1
0      q_0                   r_0            x_0
1      q_1                   r_1            x_1
…      …                     …              …
t-1    q_{t-1}               r_{t-1} = 1    x_{t-1} ≡ a^{-1} (mod n)
t                            r_t = 0        x_t = ±n

$q_i = \lfloor r_{i-1} / r_i \rfloor$

$r_{i+1} = r_{i-1} - q_i \cdot r_i$

$x_{i+1} = x_{i-1} - q_i \cdot x_i$

Note: If $r_{t-1} \ne 1$ the inverse does not exist


Extended Euclid’s Algorithm

Example $z = 20^{-1} \bmod 117$

i      q_i          r_i            x_i
-2                  r_{-2} = 117   x_{-2} = 0
-1     q_{-1} = 5   r_{-1} = 20    x_{-1} = 1
0      q_0 = 1      r_0 = 17       x_0 = -5
1      q_1 = 5      r_1 = 3        x_1 = 6
2      q_2 = 1      r_2 = 2        x_2 = -35
3      q_3 = 2      r_3 = 1        x_3 = 41 = 20^{-1} mod 117
4                   r_4 = 0        x_4 = -117

$q_i = \lfloor r_{i-1} / r_i \rfloor$

$r_{i+1} = r_{i-1} - q_i \cdot r_i$

$x_{i+1} = x_{i-1} - q_i \cdot x_i$

Check:

$20 \cdot 41 \bmod 117 = 1$
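The same recurrences can be run in software to check a hardware design. The sketch below is an assumption-laden illustration rather than the course's reference code: the name `mod_inverse` is hypothetical, only the $r_i$ and $x_i$ columns are tracked, and the result is reduced modulo $n$ because $x_{t-1}$ can come out negative.

```python
def mod_inverse(a, n):
    """Return a^{-1} mod n via the extended Euclid recurrences, or None.

    Assumes 0 < a < n. Returns None when gcd(a, n) != 1 (no inverse exists).
    """
    r_prev, r_cur = n, a      # r_{-2} = n, r_{-1} = a
    x_prev, x_cur = 0, 1      # x_{-2} = 0, x_{-1} = 1
    while r_cur != 0:
        q = r_prev // r_cur
        r_prev, r_cur = r_cur, r_prev - q * r_cur
        x_prev, x_cur = x_cur, x_prev - q * x_cur
    if r_prev != 1:           # r_{t-1} is the gcd
        return None
    return x_prev % n         # bring x_{t-1} into the range 1 <= x < n


z = mod_inverse(20, 117)
```

For the worked example, `z` is 41, and `20 * z % 117` is 1.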


Topic 3

RSA Encryption & Decryption

with

Montgomery Multipliers

based on Carry Save Adders


RSA as a trap-door one-way function

PUBLIC KEY: $C = f(M) = M^e \bmod N$ (maps $M$ to $C$)

PRIVATE KEY: $M = f^{-1}(C) = C^d \bmod N$ (maps $C$ back to $M$)

$N = P \cdot Q$

$P$, $Q$ - large prime numbers

$e \cdot d \equiv 1 \pmod{(P-1)(Q-1)}$


Exponentiation: $Y = X^E \bmod N$

$E = (e_{L-1}, e_{L-2}, \ldots, e_1, e_0)_2$

Right-to-left binary exponentiation

    Y = 1;
    S = X;
    for i=0 to L-1
    {
        if (e_i == 1)
            Y = Y · S mod N;
        S = S^2 mod N;
    }

Left-to-right binary exponentiation

    Y = 1;
    for i=L-1 downto 0
    {
        Y = Y^2 mod N;
        if (e_i == 1)
            Y = Y · X mod N;
    }
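Both scans of the exponent can be checked against each other in software. The following sketch mirrors the two pseudocode loops; the function names are assumptions, and plain `%` stands in for whatever modular multiplier the hardware design uses.

```python
def exp_right_to_left(x, e, n):
    """Y = X^E mod N, scanning exponent bits from e_0 up to e_{L-1}."""
    y = 1 % n
    s = x % n
    while e > 0:
        if e & 1:              # current bit e_i is 1
            y = (y * s) % n
        s = (s * s) % n        # S = S^2 mod N
        e >>= 1
    return y


def exp_left_to_right(x, e, n):
    """Y = X^E mod N, scanning exponent bits from e_{L-1} down to e_0."""
    y = 1 % n
    for i in range(e.bit_length() - 1, -1, -1):
        y = (y * y) % n        # Y = Y^2 mod N
        if (e >> i) & 1:       # current bit e_i is 1
            y = (y * x) % n
    return y


y_rl = exp_right_to_left(65, 17, 3233)
y_lr = exp_left_to_right(65, 17, 3233)
```

Both functions should agree with Python's built-in `pow(x, e, n)`. The right-to-left form needs the extra register `S` but allows the multiplication and the squaring of one iteration to run in parallel, while the left-to-right form always multiplies by the fixed operand `X`.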


Montgomery Modular Multiplication (1)

$C = A \cdot B \bmod M$

$A$, $B$, $M$ – $k$-bit numbers

Integer domain        Montgomery domain
A                     $A' = A \cdot 2^k \bmod M$
B                     $B' = B \cdot 2^k \bmod M$
C = A · B             $C' = C \cdot 2^k \bmod M$

$C' = MP(A', B', M) = A' \cdot B' \cdot 2^{-k} \bmod M = (A \cdot 2^k) \cdot (B \cdot 2^k) \cdot 2^{-k} \bmod M = A \cdot B \cdot 2^k \bmod M$


Montgomery Modular Multiplication (2)

Integer domain → Montgomery domain (A → A’):

$A' = MP(A, 2^{2k} \bmod M, M)$

Montgomery domain → Integer domain (C’ → C):

$C = MP(C', 1, M)$
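The domain conversions above can be exercised with a small software model of $MP$. The sketch below uses a textbook bit-serial (radix-2) Montgomery product, which is an assumption made for illustration: it does not model carry save adders or any particular hardware architecture, and the name `mont_product` and the sample operands are hypothetical.

```python
def mont_product(a, b, m, k):
    """Return MP(a, b, m) = a * b * 2^(-k) mod m.

    Assumes m is odd, m < 2^k, and 0 <= a, b < m.
    """
    acc = 0
    for i in range(k):
        acc += ((a >> i) & 1) * b   # add b when bit i of a is set
        if acc & 1:                 # make acc even by adding the odd modulus
            acc += m
        acc >>= 1                   # exact division by 2
    if acc >= m:                    # acc < 2m here, so one subtraction suffices
        acc -= m
    return acc


m, k = 117, 7                       # odd modulus that fits in k = 7 bits
a, b = 20, 41
r2 = pow(2, 2 * k, m)               # 2^(2k) mod M, precomputed once

a_mont = mont_product(a, r2, m, k)          # A' = MP(A, 2^(2k) mod M, M)
b_mont = mont_product(b, r2, m, k)          # B' = MP(B, 2^(2k) mod M, M)
c_mont = mont_product(a_mont, b_mont, m, k) # C' = MP(A', B', M)
c = mont_product(c_mont, 1, m, k)           # C  = MP(C', 1, M)
```

Here `c` equals `a * b % m`. In an exponentiation the two conversions are paid only once, while every intermediate multiplication stays in the Montgomery domain.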


Montgomery Modular Multiplication (3)

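[Diagram: Montgomery reduction of the 2k-bit product $X = A'B'$ with digits $x_{2n-1}, x_{2n-2}, \ldots, x_1, x_0$. Multiples of $M$ ($q_0 M$, then $q_1 M b$, and so on) are added step by step so that the least significant digits become 0 one at a time. At the end the low half is all zeros and the upper k bits hold $C'$.]

$C' \cdot 2^k = X + zM$

$C' \cdot 2^k \equiv X = A'B' \pmod M$

$C' \equiv A'B' \cdot 2^{-k} \pmod M$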


Fast modular exponentiation

using Chinese Remainder Theorem

$M = C^d \bmod N$

$C_P = C \bmod P$, $d_P = d \bmod (P-1)$

$C_Q = C \bmod Q$, $d_Q = d \bmod (Q-1)$

$M_P = C_P^{d_P} \bmod P$

$M_Q = C_Q^{d_Q} \bmod Q$

$M = M_P \cdot R_Q + M_Q \cdot R_P \bmod N$

where

$R_P = (P^{-1} \bmod Q) \cdot P = P^{Q-1} \bmod N$

$R_Q = (Q^{-1} \bmod P) \cdot Q = Q^{P-1} \bmod N$
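The recombination formula can be checked numerically. The sketch below follows the slide's equations directly; the function name `rsa_decrypt_crt` and the toy key (P = 61, Q = 53, e = 17, d = 2753) are assumptions for illustration only.

```python
def rsa_decrypt_crt(c, d, p, q):
    """Compute M = C^d mod N using two half-size exponentiations."""
    n = p * q
    m_p = pow(c % p, d % (p - 1), p)   # M_P = C_P^{d_P} mod P
    m_q = pow(c % q, d % (q - 1), q)   # M_Q = C_Q^{d_Q} mod Q
    r_p = pow(p, q - 1, n)             # R_P = P^{Q-1} mod N (1 mod Q, 0 mod P)
    r_q = pow(q, p - 1, n)             # R_Q = Q^{P-1} mod N (1 mod P, 0 mod Q)
    return (m_p * r_q + m_q * r_p) % n


p, q, e, d = 61, 53, 17, 2753
n = p * q
message = 65
ciphertext = pow(message, e, n)
decrypted = rsa_decrypt_crt(ciphertext, d, p, q)
```

`decrypted` equals `message`, and the result matches the direct computation `pow(ciphertext, d, n)`. In practice $R_P$ and $R_Q$ depend only on the key, so they would be precomputed once rather than on every decryption.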


Time of exponentiation

without and with Chinese Remainder Theorem

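SOFTWARE

Without CRT: $t_{EXP}(k) = c_s \cdot k^3$

With CRT: $t_{EXP\text{-}CRT}(k) \approx 2 \cdot c_s \cdot (k/2)^3 = \frac{1}{4} \, t_{EXP}(k)$

HARDWARE

Without CRT: $t_{EXP}(k) = c_h \cdot k^2$

With CRT: $t_{EXP\text{-}CRT}(k) \approx c_h \cdot (k/2)^2 = \frac{1}{4} \, t_{EXP}(k)$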


Topic 4

RSA Encryption & Decryption

with

Word-Based

Montgomery Multipliers


Data dependency graph of a classical architecture

by Tenca & Koc


Data dependency graph of a new design

from GWU & GMU


Block diagram of the new architecture


Block diagram of the main Processing Element


Topic 5

p-1 Method of Factoring


p-1 algorithm

Inputs :

N – number to be factored

a – arbitrary integer such that gcd(a, N) = 1

B1 – smoothness bound for Phase 1

Outputs:

q - factor of N, 1 < q ≤ N

or FAIL


p-1 algorithm – Phase 1

precomputations

main computations

postcomputations

out of scope for this project


p-1 Phase 1 – Numerical example

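$N = 1\,740\,719 = 1279 \cdot 1361$

$a = 2$

$B_1 = 20$

$k = 2^4 \cdot 3^2 \cdot 5 \cdot 7 \cdot 11 \cdot 13 \cdot 17 \cdot 19 = 232\,792\,560$

$q_0 = a^k \bmod N = 2^{232\,792\,560} \bmod 1\,740\,719 = 1\,003\,058$

$q = \gcd(1\,003\,058 - 1;\ 1\,740\,719) = 1361$

Why did the method work?

$q - 1 = 1360 = 2^4 \cdot 5 \cdot 17 \mid k$

$a^k \bmod q = a^{(q-1) \cdot m} \bmod q = 1$

$q \mid a^k - 1$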
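Phase 1 of the example can be reproduced in a few lines. The sketch below is an illustration only: the function names are hypothetical, and the exponent $k$ is computed as the least common multiple of 1..B1, which for B1 = 20 coincides with the product of maximal prime powers shown in the example.

```python
from math import gcd


def smoothness_exponent(b1):
    """Return k = lcm(1, 2, ..., b1), the product of maximal prime powers <= b1."""
    k = 1
    for i in range(2, b1 + 1):
        k = k * i // gcd(k, i)
    return k


def p_minus_1_phase1(n, a, b1):
    """Return a factor q of n with 1 < q <= n, or None (FAIL).

    Assumes gcd(a, n) = 1, as required by the algorithm's inputs.
    """
    k = smoothness_exponent(b1)
    q0 = pow(a, k, n)          # q0 = a^k mod N
    q = gcd(q0 - 1, n)
    return q if q > 1 else None


k = smoothness_exponent(20)
factor = p_minus_1_phase1(1740719, 2, 20)
```

For the example, `k` is 232792560 and `factor` is 1361. The other prime, 1279, is not found in this phase because 1279 - 1 = 2 · 3² · 71 contains the prime 71, which exceeds B1.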


Design Methodology

Options


by Mike Babst

DSPlogic


Methodology 1

RTL VHDL

Classical VHDL-based

Design Methodology


Structure of a Typical Digital System

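[Block diagram: Data Inputs and Data Outputs connect to the Execution Unit (Datapath); Control Inputs and Control Outputs connect to the Control Unit (Control); Control Signals link the Control Unit and the Execution Unit]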


Hardware Design with RTL VHDL

[Diagram: design flow from Interface and Pseudocode to block diagrams of the Execution Unit and the Control Unit, an ASM chart, and VHDL code for each part]


Steps of the Design Process

  • Text description

  • Interface

  • Pseudocode

  • Block diagram of the Execution Unit

  • Interface with the division into Execution Unit

    and Control Unit

  • ASM chart and/or block diagram of the Control Unit

  • RTL VHDL code

  • Testbench

  • Debugging

  • Synthesis and implementation

  • Experimental testing (not required in this course)


Project 2 - Platform & tools

Target devices: Xilinx FPGAs

Tools:

VHDL Simulation: Aldec Active HDL or Xilinx ModelSim

VHDL Synthesis: Synplify Pro or Xilinx XST

Implementation: Xilinx ISE or Xilinx WebPack

All tools available in S&T 2, rooms 203 & 265.

Xilinx tools available for free for home use.

Aldec Active HDL student edition available for home use.


Methodology 2

Graphical Data Flow Language

DSPlogic RCToolbox


See the presentation by

Mike Babst, PhD

DSPlogic

available through WebCT


Project 2 - Platform & tools

Target devices: Xilinx FPGAs

Tools:

Design Entry & Debugging:

DSPlogic RC Toolbox

MathWorks Simulink

MathWorks Matlab

Synthesis and Implementation:

Xilinx System Generator

Xilinx ISE

All tools available in S&T 2, room 220.


Two hands-on sessions

given by Dr. Babst

during the first two weeks after

the selection of the project


Reconfigurable computers

supported by DSPlogic toolset

Machine                    Released
Cray XD1 from Cray         2005
SGI Altix from SGI         2005


What is a

Reconfigurable Computer?

[Diagram: a microprocessor system (several processors P, each with its own memory, joined by an interface, with I/O) shown next to a reconfigurable system (several FPGAs, each with its own memory, joined by an interface, with I/O)]


Methodology 3

HLL Compilers

Celoxica Handel C


Design Flow

[Flow diagram: Executable Specification → Handel-C → either VHDL → Synthesis → EDIF, or EDIF directly → Place & Route]


Handel-C / ANSI-C Comparisons

[Venn diagram comparing ANSI-C and HANDEL-C]

ANSI-C only: ANSI-C Standard Library; Side Effects (e.g. X = Y++); Recursion; Floating Point

Common to both: Preprocessors (e.g. #define); Pointers; Structures; ANSI-C Constructs (for, while, if, switch); Arrays; Bitwise logical operators; Logical operators; Arithmetic operators; Functions

HANDEL-C only: Handel-C Standard Library; Parallelism; Arbitrary width variables; RAM, ROM; Channels; Signals; Interfaces; Enhanced bit manipulation


Handel-C Language (1)

A subset of ANSI-C

Sequential software style with a “par” construct to implement parallelism

A channel “chan” statement allows for communication and synchronization between parallel branches

Level of design abstraction is above RTL but below behavioral


Handel-C Language (2)

Each assignment and delay statement takes one clock cycle

Automatic generation of the state machine from an algorithmic description of the circuit in terms of parallel and sequential blocks

Automatic scheduling of parallel and sequential blocks; that is, the code following a group is scheduled only after that whole group has completed


Handel-C Language (3)

Automatic generation of clocks, clock enables and resets

Combinational logic may be implemented using for example bus, port and signal types

It is possible to design at a level where some Handel-C statements look similar to Verilog, but the overall program structure is different


Platform & tools – HLL Compilers

Target devices: Xilinx FPGAs

Tools:

Design Entry & Debugging:

Celoxica DK4 Design Suite

(integrated environment providing

Handel C compiler, debugging,

simulation, and synthesis to EDIF

and VHDL)

Synthesis and Implementation:

Xilinx ISE

All tools available in S&T 2, rooms 203 & 265.


VHDL macro declaration in Handel-C

ENTITY parmult IS

port (

clk: IN std_logic;

a: IN std_logic_VECTOR(7 downto 0);

b: IN std_logic_VECTOR(7 downto 0);

q: OUT std_logic_VECTOR(15 downto 0));

END parmult;

interface parmult (unsigned 16 q) parmult_instance (unsigned 1 clk, unsigned 8 a, unsigned 8 b) with {busformat = "B(I)"};


VHDL macro instantiation in Handel-C

unsigned 8 x1, x2;

unsigned resultX;

interface parmult

(unsigned 16 q)

parmult_instance1

(unsigned 1 clk = __clock,

unsigned 8 a = x1,

unsigned 8 b = x2 )

with {busformat = "B(I)"};


Celoxica RC10 board supporting Handel C libraries

used in the GMU ECE 448 FPGA and ASIC Design with VHDL


Literature

Additional literature with the detailed

description of all algorithms available

for each project.


Project Organization

  • 1-3 person teams allowed

  • 2 person teams preferred

  • Please submit, by Friday midnight at the latest, your

    - ranking of 4 topics

    - ranking of 3 design methodologies
