Inverse Matrix Accelerator
Characterization Presentation
Written by: Haim Natan, Benny Pano
Supervisor: Gregory Mironov
Technion – Israel Institute of Technology, Department of Electrical Engineering, High Speed Digital Systems Lab
Project Goal
Design and implement FPGA circuitry that inverts a matrix using a Monte Carlo based algorithm.
Input & Output
• Input: a 625x625 matrix
• Output: the 625x625 inverted matrix
[Diagram: Matrix → Inverter → Inverted matrix]
Project Requirements
• The matrix will be of size 625x625
• Matrix elements will be 64-bit double-precision floating point
• The inverted matrix should be as accurate as possible
• Calculation time < 20 ms
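As a rough sanity check on these requirements, the sketch below works out the storage for one matrix and the average time available per output element. It assumes the 20 ms budget covers the whole 625x625 result and ignores I/O; the slides do not say either way, and the constant names are illustrative only.

```python
# Back-of-the-envelope budget implied by the requirements (assumptions:
# the 20 ms limit covers the full matrix, and I/O time is ignored).
N = 625                 # matrix dimension from the requirements
BYTES_PER_ELEMENT = 8   # 64-bit double precision
DEADLINE_S = 20e-3      # calculation time limit

elements = N * N
matrix_bytes = elements * BYTES_PER_ELEMENT
ns_per_element = DEADLINE_S / elements * 1e9

print(f"elements per matrix : {elements}")
print(f"bytes per matrix    : {matrix_bytes}")
print(f"time per element    : {ns_per_element:.1f} ns")
```

Under these assumptions one matrix holds 390,625 elements (3,125,000 bytes), which leaves about 51 ns on average per output element.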
The algorithm (simplified version)
N – number of Markov chains
T – length of each chain
b – an element of the inverted matrix
MP() – a chain generator

b_{i,j} := 0;
For c := 1 to N do {
    k_0 := i; w_0 := 1;
    For t := 1 to T do {
        k_t := MP(k_{t-1});
        w_t := sign(d_{k_{t-1},k_t}) * w_{t-1} * E_{k_{t-1}};
        if k_t = j then b_{i,j} += w_t;
    }
}
b_{i,j} /= N;
The algorithm (continued)
• D = I - A
• E_i = Σ_j |d_{i,j}|
• P is a transition probability matrix such that p_{i,j} = |d_{i,j}| / E_i
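The two algorithm slides can be put together in a small software sketch. This is a plain-Python illustration, not the FPGA design: the function names (`precompute`, `mp`, `estimate_element`), the cumulative-sum sampler, and the seeded generator are assumptions made for the example. Two further deviations from the simplified pseudocode are labeled in the comments: the sketch adds the t = 0 term (w_0 when i = j), so that the estimate targets the full series I + D + D^2 + ..., and it stops a chain early when it reaches a row with E_k = 0. The series only converges when the spectral radius of D is below 1.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def precompute(A):
    """Build D = I - A, the row sums E_i = sum_j |d_ij|, and cumulative |d_ij| rows."""
    n = len(A)
    D = [[(1.0 if r == c else 0.0) - A[r][c] for c in range(n)] for r in range(n)]
    cum = [list(accumulate(abs(x) for x in row)) for row in D]
    E = [row[-1] for row in cum]  # last cumulative value equals E_i
    return D, E, cum


def mp(k, E, cum, rng):
    """Chain generator: draw the next state from row k of P (p_kj = |d_kj| / E_k)."""
    u = rng.random() * E[k]
    # Clamp guards against u rounding up to E[k].
    return min(bisect_right(cum[k], u), len(cum[k]) - 1)


def estimate_element(A, i, j, n_chains, chain_len, seed=0):
    """Monte Carlo estimate of element (i, j) of the inverse of A."""
    rng = random.Random(seed)
    D, E, cum = precompute(A)
    total = 0.0
    for _ in range(n_chains):
        k, w = i, 1.0
        if k == j:
            total += w  # t = 0 term (assumption: omitted in the simplified pseudocode)
        for _ in range(chain_len):
            if E[k] == 0.0:
                break  # assumption: a row of zeros ends the chain
            k_next = mp(k, E, cum, rng)
            d = D[k][k_next]
            w *= ((d > 0) - (d < 0)) * E[k]  # sign(d) * E_k, i.e. d / p
            k = k_next
            if k == j:
                total += w
    return total / n_chains


if __name__ == "__main__":
    A = [[0.8, 0.1], [0.2, 0.9]]
    est = [[estimate_element(A, i, j, 20000, 30) for j in range(2)] for i in range(2)]
    print(est)
```

The weight update uses sign(d) * E_k because dividing d_{k,k'} by its sampling probability |d_{k,k'}| / E_k leaves exactly that factor. For A = [[1.0, 0.5], [0.0, 1.0]] the only possible chain from state 0 is 0 → 1, so the estimate of b_{0,1} is exactly -0.5 regardless of the seed.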
Implementation Guidelines
• Unroll the loops (over c and t)
• Pipeline
• Use built-in multipliers
• Parallelize operations
• Cut out the fat
Basic Flow Diagram
[Diagram: inside the FPGA, blocks labeled Pre Algorithm, RAM, Chain Generator, and Monte-Carlo Algorithm; signals labeled A, P, (E), k, and B]
Initial algorithm architecture
[Diagram: a row of MP blocks of length T starting from k = i; rows of SW blocks for E1 through En; A blocks accumulating from 0 into b_{i,j}]
Switch & Accumulator
[Diagram: SW and A blocks with ports Kin/Kout, Tin/Tout, Ein/Eout, Rin/Rout, Win/Wout, Wint, Cin/Cout, Vin/Vout, and a multiplier (*)]

Switch (SW):
    Eout = Ein
    Rout = Rin
    Kout = Kin
    If Rin = Kin Then Tout = Ein Else Tout = Tin

Accumulator (A):
    Cout = Cin
    Wout = Win * Tin
    Wint = Wout
    If Cin = Kin Then Vout = Vin + Wint Else Vout = Vin
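The port equations above can be read as two small combinational cells. The sketch below models them as pure Python functions to make the data flow explicit. Which ports belong to which cell is inferred from the slide, and the function names, argument order, and use of plain numbers for the signals are assumptions for illustration, not the VHDL design.

```python
def switch_cell(k_in, t_in, e_in, r_in):
    """Switch (SW): pass K, E, R through; select E onto T when the row index R matches K."""
    t_out = e_in if r_in == k_in else t_in
    return k_in, t_out, e_in, r_in  # (Kout, Tout, Eout, Rout)


def accumulator_cell(k_in, t_in, w_in, c_in, v_in):
    """Accumulator (A): update the weight and add it to V when the column index C matches K."""
    w_out = w_in * t_in
    w_int = w_out
    v_out = v_in + w_int if c_in == k_in else v_in
    return w_out, c_in, v_out  # (Wout, Cout, Vout)


if __name__ == "__main__":
    # Row index matches the current state, so E is selected onto T.
    print(switch_cell(k_in=3, t_in=0.0, e_in=0.7, r_in=3))
    # Column index matches the current state, so the new weight is accumulated.
    print(accumulator_cell(k_in=5, t_in=0.5, w_in=2.0, c_in=5, v_in=1.0))
```

Read this way, a column of switches picks out E for the current state K, and the accumulator multiplies the running weight by that value and adds it to the running sum whenever the chain sits on the target column.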
Past Achievements
• 32x32 matrix
• 8-bit fixed point
• T = 8
• Time < 1 ms
• Close to maximum utilization of Virtex II
VHDL To-do List
• Floating point arithmetic units
• Algorithm
• Markov chain generator
• Simulation
• Synthesis