
Industrial Semantics Or How to Stop the Maths Getting in the Way of the Marketing Joe Stoy




  1. Industrial Semantics, or How to Stop the Maths Getting in the Way of the Marketing • Joe Stoy, Founder and Principal Engineer, Bluespec, Inc. (with help from many at Bluespec) • APPSEM05 Workshop, 15 September 2005

  2. Basic Message • An industrial tool needs good semantics • Robust • Simple • Conforming to users’ model • But the theory must be “under the covers” • Learning curve • Perceived learning curve

  3. Outline • A technology based on Term-Rewriting Systems • A superstructure based on functional programming semantics • Semantic issues

  4. Tool and market • For designing chips (ASICs, FPGAs, ...) • currently low-level with Verilog or VHDL • chip complexity rising (millions of gates) • For chip designers, verification engineers, system architects • ASICs have huge NREs ($500K–$1M) • mistakes (respins) cost another NRE • tools run into millions of $$$ per team, form a significant fraction of a company’s budget (e.g., ~10%) • tools tend to run on UNIX (Solaris, Linux)

  5. History of this technology • Research at MIT on high-level synthesis & verification (Prof. Arvind et al.) • Technology productization within industry • Major pilot project: arbiter for 160 Gb/s router (1.5M gates, 200 MHz, 1.3m) • Bluespec, Inc.: VC funding, high-level synthesis tool, product available • [Timeline axis: ~1996, 2000, 2003, 2004]

  6. Design Flow • [Flow diagram; labels: Bluespec SystemVerilog, Transaction, DAL, Design Assertion Level, Bluespec Synthesis, Bluesim, Cycle-Accurate Simulation, Event-Based Simulation, Verilog RTL, RTL Synthesis, Netlist]

  7. Cosim: Typical Use Models
      • Bluespec integrated into SystemC/Verilog, where Bluespec is:
         • …integrated into Verilog System-on-Chip (SoC) design
         • …back-annotated into SystemC model
         • …part of mixed SystemC/Bluespec model
      • SystemC/Verilog integrated into Bluespec, where Bluespec is designed with:
         • …existing Verilog IP re-used
         • …a SystemC model awaiting the Verilog

  8. A technology based on term-rewriting systems

  9. Term-rewriting systems while some rule is applicable • choose an applicable rule • apply it to the term (or a subcomponent) • FP’s standard operational semantics • Our rewritings are less free-wheeling (don’t change structure of term) • maybe “state transition system” a better name
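The loop on this slide can be sketched in a few lines of Python. This is only an illustration of "while some rule is applicable, choose one and apply it": the `run_rules` helper and the GCD rule set are hypothetical and are not taken from the talk. As the slide notes, the rules here never change the shape of the term; they only update a fixed set of state variables, which is why "state transition system" may be the better name.

```python
import random

def run_rules(state, rules, rng=None, max_steps=1000):
    """While some rule is applicable, choose an applicable rule and apply it.

    Each rule is a (name, guard, action) triple; guard and action take the
    whole state. The choice among applicable rules is arbitrary.
    """
    rng = rng or random.Random(0)
    trace = []
    for _ in range(max_steps):
        applicable = [r for r in rules if r[1](state)]
        if not applicable:
            break
        name, _guard, action = rng.choice(applicable)
        state = action(state)
        trace.append(name)
    return state, trace

# Hypothetical rule set: Euclid's GCD by repeated subtraction.
gcd_rules = [
    ("swap",
     lambda s: s["a"] < s["b"],
     lambda s: {"a": s["b"], "b": s["a"]}),
    ("subtract",
     lambda s: s["a"] >= s["b"] and s["b"] != 0,
     lambda s: {"a": s["a"] - s["b"], "b": s["b"]}),
]

final, trace = run_rules({"a": 15, "b": 6}, gcd_rules)
print(final, trace)
```

The two guards are mutually exclusive, so the random choice never matters in this particular example; with overlapping guards the same loop would explore one of several legal rule sequences.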

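  10. Clocked synchronous hardware • The compiler translates BSV source code into Verilog RTL • [Diagram: transition logic takes the inputs I and the current state S from a collection of state elements, and produces the outputs O and the “next” state S]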

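  11. Scheduling and control logic • [Diagram: the modules’ current state feeds the rules, which produce conditions p1 … pn (“CAN_FIRE”) and actions d1 … dn; the scheduler turns the conditions into f1 … fn (“WILL_FIRE”), which control the muxing of the actions into the modules’ next state]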

  12. Term-rewriting systems while some rule is applicable • choose an applicable rule • apply it to the term (or a subcomponent) • FP’s standard operational semantics • Our rewritings are less free-wheeling (don’t change structure of term) • Rules are constructed from guarded atomic actions with interfaces

  13. Guarded Atomic Actions • Actions are guarded fifo.enq(x); • if fifo full, cannot happen • hides lots of tedious bureaucracy • Actions are atomic

  14. Atomicity • atomic

  15. Atomicity • ατομος

  16. Atomicity • a-tomic

  17. Atomicity • a-tomic • not • asymmetric • atypical • amoral

  18. Atomicity • a-tomic • not • asymmetric • atypical • amoral • cut • microtome • tomography • tome (of a multi-volume book)

  19. Atomicity • Rules are atomic • “Not cut” • Whenever they run, they run to completion • never interrupted • No other activities are interleaved with them • This greatly simplifies design • avoids many race conditions

  20. Guarded Atomic Actions • Actions can be composed a1;a2 • resulting action atomic • guarded by guards of a1 and a2

  21. Guarded Atomic Actions • Conditionals: if (b) a1; • Guarded by “b implies (a1’s guards)” • Another conditional: when (b) a1; • guarded by b (and a1’s guards) • Yet another perhaps-if (b) a1; • unguarded • a1’s guards conjoined to b • Nice algebra • (separate question — which to have in BSV)
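The "nice algebra" of guards on the last two slides can be written down as a small Python model. This is a sketch under assumptions: an action is taken to be a pair of a guard and a set of register updates computed from the current state, and the names `Action`, `compose`, `if_`, `when`, `perhaps_if` and `fire` are hypothetical, not BSV constructs. The `enq` action stands in for `fifo.enq(x)` on a FIFO of capacity 2.

```python
from dataclasses import dataclass
from typing import Callable, Dict

State = Dict[str, int]

@dataclass(frozen=True)
class Action:
    guard: Callable[[State], bool]      # may the action happen in this state?
    effect: Callable[[State], State]    # register updates, computed from the old state

def compose(a1: Action, a2: Action) -> Action:
    """a1;a2: atomic, guarded by the guards of both a1 and a2."""
    return Action(lambda s: a1.guard(s) and a2.guard(s),
                  lambda s: {**a1.effect(s), **a2.effect(s)})

def if_(b, a1: Action) -> Action:
    """if (b) a1: guarded by 'b implies a1's guards'."""
    return Action(lambda s: (not b(s)) or a1.guard(s),
                  lambda s: a1.effect(s) if b(s) else {})

def when(b, a1: Action) -> Action:
    """when (b) a1: guarded by b and a1's guards."""
    return Action(lambda s: b(s) and a1.guard(s), a1.effect)

def perhaps_if(b, a1: Action) -> Action:
    """perhaps-if (b) a1: unguarded; a1's guards are conjoined to b."""
    return Action(lambda s: True,
                  lambda s: a1.effect(s) if (b(s) and a1.guard(s)) else {})

def fire(a: Action, s: State):
    """Apply an action atomically: either all of its updates happen or none."""
    if not a.guard(s):
        return s, False
    return {**s, **a.effect(s)}, True

# Hypothetical actions: enqueue on a FIFO of capacity 2, and bump a counter.
enq = Action(lambda s: s["count"] < 2, lambda s: {"count": s["count"] + 1})
bump = Action(lambda s: True, lambda s: {"x": s["x"] + 1})

full = {"count": 2, "x": 0}
print(fire(compose(enq, bump), full))               # blocked: bump does not happen either
print(fire(if_(lambda s: False, enq), full))        # fires, does nothing
print(fire(when(lambda s: True, enq), full))        # blocked by enq's guard
print(fire(perhaps_if(lambda s: True, enq), full))  # fires, does nothing
```

The composed action shows the point of atomicity: when the FIFO is full, the counter update is suppressed together with the enqueue, so no half-finished state is ever visible.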

  22. … with interfaces • A BSV design is structured by modules • Modules communicate only through interfaces

  23. Modules and interfaces • [Diagram: a module, with labels state, rule and interface]

  24. An example
      module mkTest ();
         int n = 15;   // constant
         Reg#(int) state <- mkReg(0);
         NumFn f <- mkFact2();
         rule go (state == 0);
            f.start (n);
            state <= 1;
         endrule
         rule finish (state == 1);
            $display ("Result is %d", f.result());
            state <= 2;
         endrule
      endmodule: mkTest
      interface NumFn;
         method Action start (int n);
         method int result ();
      endinterface
      module mkFact2 (NumFn);
         Reg#(int) x <- mkReg(?);
         Reg#(int) j <- mkReg(0);
         rule step (j > 0);
            x <= x * j;
            j <= j - 1;
         endrule
         method start (n) if (j == 0);
            x <= 1;
            j <= n;
         endmethod
         method result () if (j == 0);
            return x;
         endmethod
      endmodule: mkFact2

  25. Method invocations fit into rules
      module mkTest ();
         …
         Fact f <- mkFact2();
         …
         rule finish (state == 1);
            $display ("…%d", f.result());
            state <= 2;
         endrule
      endmodule
      module mkFact2 (Fact);
         Reg#(int) x <- mkReg(?);
         Reg#(int) j <- mkReg(0);
         …
         method result () if (j == 0);
            return x;
         endmethod
      endmodule
      • Rule condition is: state==1 && j==0
         • Explicit condition and all implicit conditions of all method calls in the rule
      • Thus,
         • a part of the rule’s condition (j == 0), and
         • a part of the rule’s computation (reading x)
        are in a different module, via a method invocation

  26. Modularizing rules
      module mkTest ();
         …
         Fact f <- mkFact2();
         rule go (state == 0);
            f.start(15);
            state <= 1;
         endrule
         …
      endmodule
      module mkFact2 (Fact);
         Reg#(int) x <- mkReg(?);
         Reg#(int) j <- mkReg(0);
         …
         method start (int n) if (j == 0);
            x <= 1;
            j <= n;
         endmethod
      endmodule
      • Rule condition: state==0 && j==0
      • Rule actions: state<=1, x<=1 and j<=15
      • Thus, a part of the rule’s action is in a different module
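How a rule's condition picks up the implicit conditions of the methods it calls can be mimicked with a small Python model of the two modules. This is a sketch, not BSV semantics in full: the `*_guard` methods, the `run` loop and the choice of n = 5 (instead of the slide's 15) are assumptions made for the illustration, and rules are simply executed one at a time.

```python
class Fact2:
    """Stand-in for mkFact2: registers x and j, one rule, two guarded methods."""
    def __init__(self):
        self.x = None   # mkReg(?): no defined initial value
        self.j = 0      # mkReg(0)

    def step_guard(self):   return self.j > 0
    def step(self):         self.x, self.j = self.x * self.j, self.j - 1

    def start_guard(self):  return self.j == 0      # implicit condition of start
    def start(self, n):     self.x, self.j = 1, n

    def result_guard(self): return self.j == 0      # implicit condition of result
    def result(self):       return self.x

class Test:
    """Stand-in for mkTest: rule conditions include the guards of called methods."""
    def __init__(self, n):
        self.n, self.state, self.f, self.output = n, 0, Fact2(), []

    def go_guard(self):     return self.state == 0 and self.f.start_guard()
    def go(self):
        self.f.start(self.n)
        self.state = 1

    def finish_guard(self): return self.state == 1 and self.f.result_guard()
    def finish(self):
        self.output.append(self.f.result())   # stands in for $display
        self.state = 2

def run(top, max_steps=100):
    """Fire one enabled rule at a time until no rule is enabled."""
    rules = [("go", top.go_guard, top.go),
             ("finish", top.finish_guard, top.finish),
             ("step", top.f.step_guard, top.f.step)]
    trace = []
    for _ in range(max_steps):
        enabled = [(name, act) for name, guard, act in rules if guard()]
        if not enabled:
            break
        name, act = enabled[0]
        act()
        trace.append(name)
    return trace

top = Test(n=5)
trace = run(top)
print(trace, top.output)
```

The rule `finish` cannot fire while `j > 0`, even though its explicit condition `state == 1` already holds: the guard of `result` keeps it waiting until `step` has run to completion.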

  27. Order of Evaluation • Not lazy (unlike, e.g., Haskell’s) • Schedule as many rules as possible in each clock cycle • (patented technology: James Hoe et al.)

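  28. Clocked synchronous hardware • The compiler translates BSV source code into Verilog RTL • [Diagram: transition logic takes the inputs I and the current state S from a collection of state elements, and produces the outputs O and the “next” state S]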

  29. Rule semantics mapped to hardware semantics • [Diagram: a sequence of rule steps Ri, Rj, Rk set against HW clocks, with Ri, Rj and Rk executing within clock cycles] • The effect of each cycle is as if a sequence of rules was executed one-at-a-time • Consequence: the HW state can never result from an interleaving of actions from different rules • Rule atomicity (therefore, correctness) is preserved

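  30. Scheduling and control logic • [Diagram: the modules’ current state feeds the rules, which produce conditions p1 … pn (“CAN_FIRE”) and actions d1 … dn; the scheduler turns the conditions into f1 … fn (“WILL_FIRE”), which control the muxing of the actions into the modules’ next state]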

  31. … leads to pragmatic constraints on rule combination • Initial set: • A rule fires within a clock cycle • A rule fires at most once in a clock cycle • A rule’s effect is only visible in the next clock • We only combine rules in a certain fixed order within a cycle • All rules which read a register must precede any which write it • We only consider rules enabled at the start of the clock cycle • Each rule is independent of previous rules executing in the same cycle • The logic path delay depends on individual rule paths, and not on combinations of rules • … • (Some since relaxed)
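Why "all rules which read a register must precede any which write it" makes a clock cycle look like a sequence of rules can be checked on a toy model. The sketch below is an assumption-laden illustration, not the compiler's algorithm: `Rule`, `one_cycle`, `one_at_a_time` and `readers_precede_writers` are hypothetical names, guards are left out, and each rule is described only by the registers it reads and writes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List

State = Dict[str, int]

@dataclass(frozen=True)
class Rule:
    name: str
    reads: FrozenSet[str]
    writes: FrozenSet[str]
    effect: Callable[[State], State]   # register updates computed from the state the rule sees

def one_cycle(state: State, rules: List[Rule]) -> State:
    """Hardware view: every rule sees the start-of-cycle state; updates land together."""
    updates: State = {}
    for r in rules:
        updates.update(r.effect(state))
    return {**state, **updates}

def one_at_a_time(state: State, order: List[Rule]) -> State:
    """Rule view: rules execute in sequence, each seeing the previous rule's result."""
    for r in order:
        state = {**state, **r.effect(state)}
    return state

def readers_precede_writers(order: List[Rule]) -> bool:
    """True if no rule reads a register that an earlier rule in the order writes."""
    written = set()
    for r in order:
        if r.reads & written:
            return False
        written |= r.writes
    return True

# Two hypothetical rules: y <= x, and x <= x + 1.
copy_x = Rule("copy_x", frozenset({"x"}), frozenset({"y"}), lambda s: {"y": s["x"]})
incr_x = Rule("incr_x", frozenset({"x"}), frozenset({"x"}), lambda s: {"x": s["x"] + 1})

start = {"x": 10, "y": 0}
hw = one_cycle(start, [copy_x, incr_x])
good, bad = [copy_x, incr_x], [incr_x, copy_x]

print(hw)
print(readers_precede_writers(good), one_at_a_time(start, good))
print(readers_precede_writers(bad), one_at_a_time(start, bad))
```

In the order where the reader comes first, the one-cycle result and the one-at-a-time result agree; in the other order they differ, so that order cannot be used to explain what the hardware did in the cycle.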

  32. Benefits of atomic-action semantics

  33. Consider this example • Process 0 increments register x • Process 1 transfers a unit from register x to register y • Process 2 decrements register y • This is an abstraction of some real applications: • Bank account: 0 = deposit to checking, 1 = transfer from checking to savings, 2 = withdraw from savings • Packet processor: 0 = packet arrives, 1 = packet is processed, 2 = packet departs • … • [Diagram: process 0 adds 1 to x; process 1 subtracts 1 from x and adds 1 to y; process 2 subtracts 1 from y]

  34. Concurrency in the example • [Diagram: process 0 adds 1 to x under cond0; process 1 moves 1 from x to y under cond1; process 2 subtracts 1 from y under cond2] • Process j (= 0,1,2) only updates under condition condj • Only one process at a time can update a register. Note: • Process 0 and 2 can run concurrently if process 1 is not running • Both of process 1’s updates must happen “indivisibly” (else inconsistent state) • Suppose we want to prioritize process 2 over process 1 over process 0 • Process priority: 2 > 1 > 0

  35. Is either correct? • Process priority: 2 > 1 > 0
      First version:
      always @(posedge CLK) // process 0
         if (!cond1 && cond0)
            x <= x + 1;
      always @(posedge CLK) // process 1
         if (!cond2 && cond1) begin
            y <= y + 1;
            x <= x - 1;
         end
      always @(posedge CLK) // process 2
         if (cond2)
            y <= y - 1;
      Second version:
      always @(posedge CLK) begin
         if (!cond2 && cond1)
            x <= x - 1;
         else if (cond0)
            x <= x + 1;
         if (cond2)
            y <= y - 1;
         else if (cond1)
            y <= y + 1;
      end
      • Where’s the error? The process 0 condition in the first version should read: if ((!cond1 || cond2) && cond0)
      • Which of these solutions are correct, if any?
      • What’s required to verify that they’re correct?
      • Now, what if I changed the priorities: 1 > 2 > 0?
      • And, what if the processes are in different modules?

  36. With Bluespec, design is direct • Process priority: 2 > 1 > 0
      (* descending_urgency = "proc2, proc1, proc0" *)
      rule proc0 (cond0);
         x <= x + 1;
      endrule
      rule proc1 (cond1);
         y <= y + 1;
         x <= x - 1;
      endrule
      rule proc2 (cond2);
         y <= y - 1;
      endrule
      • Functional correctness follows directly from rule semantics
      • Related actions are grouped naturally with their conditions, so they are easy to change
      • Interactions between rules are managed by the compiler (scheduling, muxing, control)
      • Same hardware as the RTL
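The question "what's required to verify that they're correct?" has a brute-force answer for an example this small: enumerate all eight combinations of cond0, cond1, cond2 and compare each hand-written RTL version against the rule semantics. The Python below is a sketch under assumptions. The `reference` function is one reading of the rules with urgency proc2 > proc1 > proc0, in which proc1 conflicts with proc2 (both write y) and proc0 conflicts with proc1 (both write x). The functions `rtl_separate`, `rtl_separate_fixed` and `rtl_combined` transcribe the conditions of the two Verilog versions on the previous slide, and all of these names are hypothetical.

```python
from itertools import product

def reference(c0, c1, c2):
    """Rule semantics, urgency proc2 > proc1 > proc0. Returns (change in x, change in y)."""
    fire2 = c2
    fire1 = c1 and not fire2    # proc1 yields to proc2 (both write y)
    fire0 = c0 and not fire1    # proc0 yields to proc1 (both write x)
    return int(fire0) - int(fire1), int(fire1) - int(fire2)

def rtl_separate(c0, c1, c2):
    """First RTL version: one always block per process."""
    p0 = (not c1) and c0
    p1 = (not c2) and c1
    p2 = c2
    return int(p0) - int(p1), int(p1) - int(p2)

def rtl_separate_fixed(c0, c1, c2):
    """First version with the process 0 condition (!cond1 || cond2) && cond0."""
    p0 = ((not c1) or c2) and c0
    p1 = (not c2) and c1
    p2 = c2
    return int(p0) - int(p1), int(p1) - int(p2)

def rtl_combined(c0, c1, c2):
    """Second RTL version: a single always block with if/else chains."""
    dx = -1 if ((not c2) and c1) else (1 if c0 else 0)
    dy = -1 if c2 else (1 if c1 else 0)
    return dx, dy

def mismatches(candidate):
    """All (cond0, cond1, cond2) combinations where candidate disagrees with reference."""
    return [c for c in product([False, True], repeat=3)
            if candidate(*c) != reference(*c)]

print(mismatches(rtl_separate))
print(mismatches(rtl_separate_fixed))
print(mismatches(rtl_combined))
```

Under this reading, the first version goes wrong only when all three conditions hold: process 1 is blocked by process 2, so process 0 should be free to increment x, but its condition still tests !cond1.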

  37. Reorder Buffer Verification-centric design

  38. Example from CPU design • Speculative, out-of-order • Many, many concurrent activities • [Diagram: Fetch, Decode, Branch, ALU Unit, MEM Unit, Re-Order Buffer (ROB), Register File, Instruction Memory and Data Memory, connected by FIFOs] • Nirav Dave, MEMOCODE, 2004

  39. ROB actions
      • [Diagram: the Re-Order Buffer, with Head and Tail pointers, between the Register File, Decode Unit, ALU Unit(s) and MEM Unit(s)]
      • Each ROB entry holds: State, Instruction, Operand 1, Operand 2, Result
      • Entry states: E = Empty, W = Waiting, Di = Dispatched, K = Killed, Do = Done
      • Actions shown:
         • Insert an instr into ROB
         • Get operands for instr
         • Get a ready ALU instr
         • Get a ready MEM instr
         • Put ALU instr results in ROB
         • Put MEM instr results in ROB
         • Resolve branches
         • Writeback results

  40. But, what about all the potential race conditions? • Reading from the register file at the same time a separate instruction is writing back to the same location • Which value to read? • An instruction is being inserted into the ROB simultaneously with a dependent upstream instruction’s result coming back from an ALU • Put a tag or the value in the operand slot? • An instruction is being inserted into the ROB simultaneously with a branch mis-prediction • must kill the mis-predicted instructions and restore a “consistent state” across many modules

  41. Rule Atomicity
      • Lets you code each operation in isolation
      • Eliminates the nightmare of race conditions (“inconsistent state”) under such complex concurrency conditions
      • Insert Instr in ROB
         • Put instruction in first available slot
         • Increment tail pointer
         • Get source operands (RF <or> prev instr)
      • Dispatch Instr
         • Mark instruction dispatched
         • Forward to appropriate unit
      • Write Back Results to ROB
         • Write back results to instr result
         • Write back to all waiting tags
         • Set to done
      • Commit Instr
         • Write results to register file (or allow memory write for store)
         • Set to Empty
         • Increment head pointer
      • Branch Resolution
         • …
      • All behaviors are explicable as a sequence of atomic actions on the state

  42. Performance Semantics Another processor example

  43. Rule-based Specifications
      • [Diagram: pipeline stages IF, Dec, Exe, Mem, Wb connected by FIFOs bI, bD, bE, bW, with iMem, dMem, the register file RF and bypasses; example: R1 = 2 + 3 becomes R1 = 5]
      • Each pipeline stage is described as a set of atomic rules:
      rule Execute Add:
         when (bD.first == (Ri = va + vb)) ==>
         begin
            result = va + vb;      // compute addition
            bE.enq (Ri = result);  // enqueue result into bE
            bD.deq;                // dequeue instruction from bD
         end
      • Any legal behavior can be understood in terms of applying one rule at a time

  44. Performance Concerns • [Diagram: “a cycle in slow motion” on the pipeline IF, Dec, Exe, Mem, Wb with FIFOs bI, bD, bE, bW, iMem, dMem and RF, and instructions I0 to I5 in flight] • The designer wants to make sure that one instruction executes every cycle • FIFOs must support both enq and deq in each cycle

  45. What are the semantics of FIFOs?

  46. Example from a commercially available FIFO IP component • [Diagram: FIFO block with ports data_in, push_req_n, pop_req_n, clk, rstn, data_out, full, empty] • These constraints are taken from several paragraphs of documentation, spread over many pages, interspersed with other text

  47. A FIFO interface in BSV
      interface FIFOQueue #(type aType);
         method Action push (aType val);
         method aType first();
         method ActionValue#(aType) pop();
         method Action clear();
      endinterface

  48. Methods as ports
      • push: n-bit argument; has side effect (action); ports: n-bit data, enab, rdy (not full)
      • first: n-bit result; has no side effect; ports: n-bit data, rdy (not empty)
      • pop: n-bit result; has side effect (action); ports: n-bit data, enab, rdy (not empty)
      • clear: no argument; has side effect (action); ports: enab, rdy (always true)
      • [Diagram: the FIFOQueue module with these ports]

  49. FIFO semantics: types
      interface FIFOQueue #(type aType);
         method Action push (aType val);
         method aType first();
         method ActionValue#(aType) pop();
         method Action clear();
      endinterface

  50. FIFO semantics: laws • Algebra of enq, deq, etc • Not at present part of BSV • Though SVA assertions are • Needed for formal verification work • So far, all in atomic-action world
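To give a feel for what an "algebra of enq, deq, etc." might look like, here is a Python model of the FIFOQueue interface with its rdy conditions, plus a few candidate laws written as executable checks. Everything here is an assumption made for illustration: the laws are plausible examples rather than ones stated in the talk, the `depth` parameter and the `law_*` and `contents` names are hypothetical, and a false guard raises an exception in this model, whereas in BSV it would simply stop the calling rule from firing.

```python
from collections import deque

class GuardNotReady(Exception):
    """Raised when a method is invoked while its rdy condition is false."""

class FIFOQueue:
    def __init__(self, depth=2):          # depth is a hypothetical parameter
        self._depth = depth
        self._items = deque()

    # rdy conditions, as on the "Methods as ports" slide
    def push_rdy(self):  return len(self._items) < self._depth   # not full
    def first_rdy(self): return len(self._items) > 0             # not empty
    def pop_rdy(self):   return len(self._items) > 0             # not empty
    def clear_rdy(self): return True                             # always true

    def push(self, val):
        if not self.push_rdy():
            raise GuardNotReady("push: full")
        self._items.append(val)

    def first(self):
        if not self.first_rdy():
            raise GuardNotReady("first: empty")
        return self._items[0]

    def pop(self):
        if not self.pop_rdy():
            raise GuardNotReady("pop: empty")
        return self._items.popleft()

    def clear(self):
        self._items.clear()

    def contents(self):
        return list(self._items)

# Candidate laws, each checked on a fresh (empty) queue.
def law_push_then_first(x):
    q = FIFOQueue(); q.push(x)
    return q.first() == x

def law_push_then_pop(x):
    q = FIFOQueue(); q.push(x)
    return q.pop() == x and q.contents() == []

def law_order(x, y):
    q = FIFOQueue(); q.push(x); q.push(y)
    return q.pop() == x and q.first() == y

def law_clear(x):
    q = FIFOQueue(); q.push(x); q.clear()
    return (not q.first_rdy()) and q.push_rdy()

print(law_push_then_first(7), law_push_then_pop(7), law_order(1, 2), law_clear(3))
```

Laws of this kind talk only about sequences of atomic method calls, which matches the slide's remark that everything so far lives in the atomic-action world; they say nothing yet about which methods may be called together in one clock cycle.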
