1 / 50

CS184a: Computer Architecture (Structures and Organization)

CS184a: Computer Architecture (Structures and Organization). Day11: October 30, 2000 Interconnect Requirements. Last Time. Saw various compute blocks Role of automated mapping in exploring design space To exploit structure in typical designs we need programmable interconnect

ellery
Download Presentation

CS184a: Computer Architecture (Structures and Organization)

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CS184a:Computer Architecture(Structures and Organization) Day11: October 30, 2000 Interconnect Requirements Caltech CS184a Fall2000 -- DeHon

  2. Last Time • Saw various compute blocks • Role of automated mapping in exploring design space • To exploit structure in typical designs we need programmable interconnect • All reasonable, scalable structures: • small to moderate sized logic blocks • connected via programmable interconnect • been saying delay across programmable interconnect is a big factor Caltech CS184a Fall2000 -- DeHon

  3. Today • Interconnect Design Space • Dominance of Interconnect • Simple things • and why they don’t work • Interconnect Implications • Characterizing Interconnect Requirements Caltech CS184a Fall2000 -- DeHon

  4. Dominant Area Caltech CS184a Fall2000 -- DeHon

  5. Dominant Time Caltech CS184a Fall2000 -- DeHon

  6. Dominant Time Caltech CS184a Fall2000 -- DeHon

  7. Dominant Power XC4003A data from Eric Kusse (UCB MS 1997) Caltech CS184a Fall2000 -- DeHon

  8. For Spatial Architectures • Interconnect dominant • area • power • time • …so need to understand in order to optimize architectures Caltech CS184a Fall2000 -- DeHon

  9. Interconnect • Problem • Thousands of independent (bit) operators producing results • true of FPGAs today • …true for *LIW, multi-uP, etc. in future • Each taking as inputs the results of other (bit) processing elements • Interconnect is late bound • don’t know until after fabrication Caltech CS184a Fall2000 -- DeHon

  10. Design Issues • Flexibility -- route “anything” • (w/in reason?) • Area -- wires, switches • Delay -- switches in path, stubs, wire length • Power -- switch, wire capacitance • Routability -- computational difficulty finding routes Caltech CS184a Fall2000 -- DeHon

  11. (1) Shared Bus • Familiar case • Use single interconnect resource • Reuse in Time • Consequence? Caltech CS184a Fall2000 -- DeHon

  12. Shared Bus • Consider operation: y=Ax2 +Bx +C • 3 mpys • 2 adds • ~5 values need to be routed from producer to consumer • Performance lower bound if have design w/: • m multipliers • u madd units • a adders • i simultaneous interconnection busses Caltech CS184a Fall2000 -- DeHon

  13. Viewpoint • Interconnect is a resource • Bottleneck for design can be in availability of any resource • Lower Bound on Delay: Logical Resource / Physical Resources • May be worse • dependencies • ability to use resource Caltech CS184a Fall2000 -- DeHon

  14. Flexibility (+) routes everything (given enough time) can be trick to schedule use optimally Delay (Power) (--) wire length O(kn) parasitic stubs: kn+n series switch: 1 O(kn) sequentialize I/B Area (++) knswitches O(n) Shared Bus Caltech CS184a Fall2000 -- DeHon

  15. Term: Bisection Bandwidth • Partition design into two equal size halves • Minimize wires (nets) with ends in both halves • Number of wires crossing is bisection bandwidth Caltech CS184a Fall2000 -- DeHon

  16. (2) Crossbar • Avoid bottleneck • Every output gets its own interconnect channel Caltech CS184a Fall2000 -- DeHon

  17. Crossbar Caltech CS184a Fall2000 -- DeHon

  18. Crossbar Caltech CS184a Fall2000 -- DeHon

  19. Flexibility (++) routes everything (guaranteed) Delay (Power) (-) wire length O(kn) parasitic stubs: kn+n series switch: 1 O(kn) Area (-) Bisection bandwidth n kn2 switches O(n2) Crossbar Caltech CS184a Fall2000 -- DeHon

  20. Crossbar • Too expensive • Switch Area = k*n2*2.5Kl2 • Switch Area/LUT = k*n* 2.5Kl2 • n=1024, k=4 => 10M l2 • What can we do? Caltech CS184a Fall2000 -- DeHon

  21. Avoiding Crossbar Costs • Typical architecture trick: • exploit expected problem structure Caltech CS184a Fall2000 -- DeHon

  22. Avoiding Crossbar Costs • Typical architecture trick: • exploit expected problem structure • We have freedom in operator placement • Designs have spatial locality • =>place connected components “close” together • don’t need full interconnect? Caltech CS184a Fall2000 -- DeHon

  23. Exploit Locality • Wires expensive • Local interconnect cheap • 1D versions? • (explore on hmwrk) Caltech CS184a Fall2000 -- DeHon

  24. Exploit Locality • Wires expensive • Local interconnect cheap • Use 2D to make more things closer • Mesh? Caltech CS184a Fall2000 -- DeHon

  25. Mesh Analysis • Can we place everything close? Caltech CS184a Fall2000 -- DeHon

  26. Mesh “Closeness” • Try placing “everything” close Caltech CS184a Fall2000 -- DeHon

  27. Flexibility - ? Ok w/ large w Delay (Power) Series switches 1--n Wire length w--n Stubs O(w)--O(wn) Area Bisection BW -- wn Switches -- O(nw) O(w2n) [linear pop] larger on homework Mesh Analysis Caltech CS184a Fall2000 -- DeHon

  28. Mesh • Plausible • …but What’s w • …and how does it grow? Caltech CS184a Fall2000 -- DeHon

  29. Characterize Locality • Want to exploit locality • How much locality do we have? • Impact on resources required? Caltech CS184a Fall2000 -- DeHon

  30. Bisection Bandwidth • Bisect design • Bisection bandwidth of design • => lower bound on network bisection bandwidth • Design with more locality • => lower bisection bandwidth • Enough? N/2 cutsize N/2 Caltech CS184a Fall2000 -- DeHon

  31. Characterizing Locality • Single cut not capture locality within halves • Cut again • => recursive bisection Caltech CS184a Fall2000 -- DeHon

  32. Regularizing Growth • How do bisection bandwidths shrink (grow) at different levels of bisection hierarchy? • Basic assumption: Geometric • 1 • 1/ • 1/2 Caltech CS184a Fall2000 -- DeHon

  33. Geometric Growth • (F,a)-bifurcator • F bandwidth at root • geometric regression a at each level Caltech CS184a Fall2000 -- DeHon

  34. Good Model? Log-log plot ==> straight lines represent geometric growth Caltech CS184a Fall2000 -- DeHon

  35. Rent’s Rule • Long standing empirical relationship • IO = C*NP • 0P 1.0 • compare (F,a)-bifurcator • a= 2P • Captures notion of locality • some signals generated and consumed locally • reconvergent fanout Caltech CS184a Fall2000 -- DeHon

  36. Monday class stopped here Caltech CS184a Fall2000 -- DeHon

  37. Rent’s Rule • Typically consider • 0.5P 0.75 • “High-Speed” Logic P=0.67 • Memory (P~0.1-0.2) • Example (i10) • max C=7, P=0.68 • avg C=5, P=0.72 Caltech CS184a Fall2000 -- DeHon

  38. What tell us about design? • Recursive bandwidth requirements in network Caltech CS184a Fall2000 -- DeHon

  39. What tell us about design? • Recursive bandwidth requirements in network • lower bound on resource requirements • N.B. necessary but not sufficient condition on network design • I.e. design must also be able to use the wires Caltech CS184a Fall2000 -- DeHon

  40. What tell us about design? • Interconnect lengths • Intuition • if p>0.5, everything cannot be nearest neighbor • as p grows, so wire distances Caltech CS184a Fall2000 -- DeHon

  41. What tell us about design? • Interconnect lengths • IO=(n2)P cross distance n • dIO/dn end at exactly distance n • E(l)=Integral 0 to n=N • of n*(dIO/dn)/n2 • assume iid sources • E(l)=O(N(p-0.5)) • p>0.5 Caltech CS184a Fall2000 -- DeHon

  42. IONP Bisection BWNP side length NP N if p<0.5 Area N2p p>0.5 What Tell us about design? N.B. 2D VLSI world has “natural” Rent of P=0.5 (area vs. perimeter) Caltech CS184a Fall2000 -- DeHon

  43. Rent’s Rule Caveats • Modern “systems” on a chip -- likely to contain subcomponents of varying Rent complexity • Less I/O at certain “natural” boundaries • System close • (Rent’s Rule apply to workstation, PC, PDA?) Caltech CS184a Fall2000 -- DeHon

  44. Area/Wire Length • Bad news • Area ~ O(N2p) • faster than N • Avg. Wire Length ~ O(N(p-0.5)) • grows with N • Can designers/CAD control p (locality) once appreciate its effects? • I.e. maybe this cost changes design style/criteria so we mitigate effects? Caltech CS184a Fall2000 -- DeHon

  45. What Rent didn’t tell us • Bisection bandwidth purely geometrical • No constraint for delay • I.e. a partition may leave critical path weaving between halves Caltech CS184a Fall2000 -- DeHon

  46. Critical Path and Bisection Minimum cut may cross critical path multiple times. Minimizing long wires in critical path => increase cut size. Caltech CS184a Fall2000 -- DeHon

  47. Rent Weakness • Not account for path topology • ? Can we define a “Temporal” Rent which takes into consideration? • Promising research topic Caltech CS184a Fall2000 -- DeHon

  48. Finishing Up... Caltech CS184a Fall2000 -- DeHon

  49. Big Ideas[MSB Ideas] • Interconnect Dominant • power, delay, area • Can be bottleneck for designs • Can’t afford full crossbar • Need to exploit locality • Can’t have everything close Caltech CS184a Fall2000 -- DeHon

  50. Big Ideas[MSB Ideas] • Rent’s rule characterize locality • => Area growth O(N2p) • p>0.5 => interconnect growing faster than compute elements • expect interconnect to dominate other resources Caltech CS184a Fall2000 -- DeHon

More Related