
A Hierarchical Modeling Framework for On-Chip Communication Architectures

Xinping Zhu, Sharad Malik. Department of Electrical Engineering, Princeton University.


  1. A Hierarchical Modeling Framework for On-Chip Communication Architectures
     Xinping Zhu, Sharad Malik
     Department of Electrical Engineering, Princeton University

  2. Outline
     • Introduction
     • Design Space of On-Chip Communication Architectures (OCAs)
     • Modeling Methodology
     • Modeling Infrastructure
     • Case Studies: CoreConnect/AMBA Bus and RAW Network Architectures
     • Conclusions and Future Work

  3. Architecture of the SoC

  4. Wide Diversity of Choices
     • Examples
       • Bus
         • CoreConnect Bus Architecture (IBM)
         • AMBA (ARM)
       • Packet Switching Network
         • RAW Architecture (MIT): tile-based 2D-mesh interconnection network
         • Alpha 21364 (DEC/HP): 2D-torus interconnection network
     • Challenge: Design and Validation of Systems with OCAs
       • Selecting an appropriate OCA
       • Validating the final design

  5. Modeling of PEs and OCAs
     Challenge: What is the OCA “ISA”?

  6. Modeling On-Chip Communication Architectures (OCAs)
     • Functional Primitives
       • Shared Memory Model
         • OCA read(x, u) moves data x in the shared memory into local variable u
         • OCA write(y, v) writes the value of local variable y into shared variable v
       • Message Passing Model
         • OCA send(x, i) sends the value of x to PE i asynchronously
         • OCA receive(y, j) receives a value from PE j into y synchronously
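The four functional primitives on this slide can be illustrated with a small Java sketch. It is not the authors' implementation: the class names `OcaNetwork` and `ProcessingElement`, the string addresses for shared variables, and the use of `int` payloads are all assumptions made for illustration. Reads return the value instead of filling an output argument, which is the more natural shape in Java.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical stand-in for the OCA: a shared address space plus point-to-point channels. */
class OcaNetwork {
    final Map<String, Integer> sharedMemory = new ConcurrentHashMap<>();
    private final Map<String, BlockingQueue<Integer>> channels = new ConcurrentHashMap<>();

    /** One unbounded FIFO channel per (sender, receiver) pair; an assumption of this sketch. */
    BlockingQueue<Integer> channel(int from, int to) {
        return channels.computeIfAbsent(from + "->" + to, k -> new LinkedBlockingQueue<>());
    }
}

/** Hypothetical PE exposing the four functional primitives from the slide. */
class ProcessingElement {
    private final int id;
    private final OcaNetwork net;

    ProcessingElement(int id, OcaNetwork net) {
        this.id = id;
        this.net = net;
    }

    /** Shared memory model: return the value of shared variable x (0 if never written). */
    int ocaRead(String x) {
        return net.sharedMemory.getOrDefault(x, 0);
    }

    /** Shared memory model: write local value y into shared variable v. */
    void ocaWrite(int y, String v) {
        net.sharedMemory.put(v, y);
    }

    /** Message passing model: enqueue x for PE i and return at once (asynchronous). */
    void ocaSend(int x, int i) {
        net.channel(id, i).add(x);
    }

    /** Message passing model: block until a value from PE j is available (synchronous). */
    int ocaReceive(int j) {
        try {
            return net.channel(j, id).take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for PE " + j, e);
        }
    }
}
```

The asymmetry is the point of the sketch: `ocaSend` never waits, while `ocaReceive` blocks on an empty channel, matching the asynchronous send and synchronous receive described above.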

  7. CommunicationComputeFunction(…) {
         int image[IMAGE_SIZE], image2[IMAGE_SIZE];
         // Load Input Images
         receive(PE0, image, IMAGE_SIZE);
         // Run Kernels
         convolve(image, image2);
         // Store Output
         send(PE1, image2, IMAGE_SIZE);
     }
     [Figure labels: Façade, OCA Functional Primitives, OCA Operational Intrinsics, PE datapath, OCA modules]
     The façade between PE and OCA provides a unified interface to a set of interfaces in the OCA subsystem.
     Communication Sequence(…) {
         wait(BufferNotFull) {
             while (!AllReceived) {
                 ReceivePacket;
             }
         }
         ...
         while (HaveCredit) {
             SendPacket;
         }
     }
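A minimal sketch of the façade idea follows, assuming a credit-based link as the operational layer. The names `CreditLink` and `OcaFacade`, the fixed packet size, and the single-threaded loopback (one façade object plays both sender and receiver) are hypothetical simplifications, not part of the framework described in the slides. The functional calls `send` and `receive` hide packetization and the `HaveCredit` / `SendPacket` / `ReceivePacket` steps.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

/** Hypothetical operational layer: a link that accepts a packet only when a credit is free. */
class CreditLink {
    private int credits;
    private final Deque<int[]> inFlight = new ArrayDeque<>();

    CreditLink(int credits) {
        this.credits = credits;
    }

    boolean haveCredit() {
        return credits > 0;
    }

    void sendPacket(int[] packet) {
        if (!haveCredit()) {
            throw new IllegalStateException("no credit available");
        }
        credits--;
        inFlight.add(packet);
    }

    /** Removes the oldest packet and returns its credit; null if the link is empty. */
    int[] receivePacket() {
        int[] packet = inFlight.poll();
        if (packet != null) {
            credits++;
        }
        return packet;
    }
}

/** Façade: functional send/receive on top of the packet-level operations. */
class OcaFacade {
    private final CreditLink link;
    private final int packetSize;
    private final Deque<int[]> pending = new ArrayDeque<>();

    OcaFacade(CreditLink link, int packetSize) {
        this.link = link;
        this.packetSize = packetSize;
    }

    /** Functional primitive: split data into packets and push as many as credits allow. */
    void send(int[] data) {
        for (int off = 0; off < data.length; off += packetSize) {
            pending.add(Arrays.copyOfRange(data, off, Math.min(off + packetSize, data.length)));
        }
        pump();
    }

    /** Operational loop: while (HaveCredit) SendPacket. */
    private void pump() {
        while (link.haveCredit() && !pending.isEmpty()) {
            link.sendPacket(pending.poll());
        }
    }

    /** Functional primitive: collect packets until size words have arrived. */
    int[] receive(int size) {
        int[] out = new int[size];
        int got = 0;
        while (got < size) {
            int[] packet = link.receivePacket();
            if (packet == null) {
                throw new IllegalStateException("link drained before all data arrived");
            }
            int n = Math.min(packet.length, size - got); // words beyond size are dropped
            System.arraycopy(packet, 0, out, got, n);
            got += n;
            pump(); // the returned credit may release a pending packet
        }
        return out;
    }
}
```

The PE-side code only ever calls `send` and `receive`; swapping the credit scheme for another flow-control mechanism would change `CreditLink` and `pump` but not the callers.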

  8. Structural Specification
     What are the structural primitives for OCAs? (Computer architects)
     • Machine structure: composed of hierarchical modules; the universe of these modules needs to be explored
     • How to organize these modules? Classification is needed
     • PE microarchitectures: well understood
     • OCA architecture: still to be done

  9. Class Hierarchy of OCA Structural Primitives
     [Figure: class tree rooted at Module. Classes shown: Link, Mux, Duplex Link, DeMux, CrossBar, Bus Backplane, Synchronous Backplane, Asynchronous Backplane, AMBA Backplane, CoreConnect Backplane; Buffer, FIFO, Multi Queue, Central Pool.]
     • Link
       • Parameters: datawidth, latency
       • Ports
     • Buffer
       • Parameters: depth, datawidth
       • Buffering policy: FIFO, MultiQueue, etc.
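One way to picture the Link and Buffer branches of this hierarchy is the Java sketch below. It assumes the parameters named on the slide (datawidth, latency, depth) become constructor arguments and that a buffering policy is supplied by a subclass. The root class is called `OcaModule` here only to avoid a name clash with `java.lang.Module`; the method names `push` and `pop` and the `int` word type are likewise illustrative choices, not taken from the original library.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Root of the structural hierarchy ("Module" on the slide). */
abstract class OcaModule {
    final String name;

    OcaModule(String name) {
        this.name = name;
    }
}

/** Link: parameterized by data width (bits) and latency (cycles). */
class Link extends OcaModule {
    final int dataWidth;
    final int latency;

    Link(String name, int dataWidth, int latency) {
        super(name);
        this.dataWidth = dataWidth;
        this.latency = latency;
    }
}

/** Buffer: parameterized by depth and data width; subclasses supply the buffering policy. */
abstract class Buffer extends OcaModule {
    final int depth;
    final int dataWidth;

    Buffer(String name, int depth, int dataWidth) {
        super(name);
        this.depth = depth;
        this.dataWidth = dataWidth;
    }

    /** Returns false when the buffer is full and the word was not stored. */
    abstract boolean push(int word);

    /** Returns the next word under this buffer's policy, or null when empty. */
    abstract Integer pop();
}

/** FIFO policy: words leave in arrival order. */
class FifoBuffer extends Buffer {
    private final Deque<Integer> words = new ArrayDeque<>();

    FifoBuffer(String name, int depth, int dataWidth) {
        super(name, depth, dataWidth);
    }

    @Override
    boolean push(int word) {
        if (words.size() >= depth) {
            return false;
        }
        words.add(word);
        return true;
    }

    @Override
    Integer pop() {
        return words.poll();
    }
}
```

A multi-queue or central-pool buffer would extend `Buffer` in the same way and override only `push` and `pop`, which is the kind of reuse the class hierarchy is meant to enable.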

  10. Class Hierarchy of OCA Structural Primitives
      [Figure: class tree rooted at Module. Classes shown: Interface, SendInterface, ReceiveInterface, SlaveInterface, MasterInterface; ResourceScheduler, Allocator, Arbiter.]
      • Interface
        • Implements the façade
        • Translates datapath actions into OCA actions
      • ResourceScheduler
        • Arbitration policy
        • Port convention

  11. Class Hierarchy of OCA Structural Primitives
      [Figure: complete class tree rooted at Module. Classes shown: Link, Mux, Duplex Link, DeMux, CrossBar, Bus Backplane, Synchronous Backplane, Asynchronous Backplane, AMBA Backplane, CoreConnect Backplane; Buffer, FIFO, Multi Queue, Central Pool; Interface, ReceiveInterface, SendInterface, SlaveInterface, MasterInterface; ResourceScheduler, Allocator, Arbiter.]

  12. Modeling Infrastructure
      [Figure: flow diagram with labels Operational Semantics, Timing Behavior, Machine Description, Module/Actor, Machine Configuration, Executable Model, Simulator Kernel, Stimulus, Execution, Performance.]
      • Two modeling and simulation environments
        • Ptolemy II from UC Berkeley: an object-oriented, heterogeneous design and modeling framework
        • Liberty Simulation Environment (LSE) from Princeton: a fast execution-driven compiled-code modeling and simulation framework

  13. Case Studies
      • Two types of on-chip interconnection schemes
        • On-chip buses: AMBA bus and IBM CoreConnect bus
        • On-chip packet-switching network: RAW network from MIT
      What are the basic components?

  14. Components of Bus Architectures

  15. A RAW On-chip Network

  16. Inside the Router
      [Figure: router architecture. Five ports (West, South, East, North, Local), each with an input buffer (Buf) and in/out connections; a 5 x 5 crossbar; a scheduler with request, grant, select, and config signals.]
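The router components named on this slide (one input buffer per port, a scheduler, and a 5 x 5 crossbar) can be sketched as a single-cycle step function. Everything below is an assumption for illustration: the `Flit` and `Router` classes, the rule that each flit already carries its desired output port, and the fixed-order scan used as the scheduling policy. The actual RAW router logic is not described in the transcript.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical flit: a payload plus the output port it wants to leave through. */
class Flit {
    final int payload;
    final int outPort;

    Flit(int payload, int outPort) {
        this.payload = payload;
        this.outPort = outPort;
    }
}

/** Sketch of a five-port router: input buffers, a scheduler, and a 5 x 5 crossbar. */
class Router {
    static final int WEST = 0, SOUTH = 1, EAST = 2, NORTH = 3, LOCAL = 4, PORTS = 5;

    private final List<Deque<Flit>> inputBuffers = new ArrayList<>();

    Router() {
        for (int p = 0; p < PORTS; p++) {
            inputBuffers.add(new ArrayDeque<>());
        }
    }

    /** A flit arrives on an input port and waits in that port's buffer. */
    void accept(int inPort, Flit flit) {
        inputBuffers.get(inPort).add(flit);
    }

    /**
     * One cycle. The head flit of each input buffer requests its output port;
     * the scheduler grants each output to at most one input (fixed scan order,
     * an assumption), and the crossbar forwards the granted flits.
     * Returns out[p] = flit leaving through port p, or null if port p is idle.
     */
    Flit[] step() {
        Flit[] out = new Flit[PORTS];
        for (int in = 0; in < PORTS; in++) {
            Flit head = inputBuffers.get(in).peek();
            if (head != null && out[head.outPort] == null) {
                out[head.outPort] = inputBuffers.get(in).poll(); // grant and forward
            }
            // Otherwise the request is denied and the flit stays buffered.
        }
        return out;
    }
}
```

When two inputs request the same output in one cycle, the later one in scan order stalls and retries on the next call to `step`, which is where a different arbitration policy could be substituted.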

  17. Design Space Exploration Illustrated
      public class Arbiter extends ResourceScheduler {
          InputPort[] request = new InputPort[SIZE];
          OutputPort[] grant = new OutputPort[SIZE];
          public int ProcessRequest() {
              // process requests with round-robin algorithm
              ...
          }
          ...
      }
      public class PriorityArbiter extends Arbiter {
          InputPort[] request = new InputPort[SIZE];
          InputPort[] priority = new InputPort[SIZE];
          OutputPort[] grant = new OutputPort[SIZE];
          public int ProcessRequest() {
              // process requests with priority
              ...
          }
          ...
      }
      • Resource Scheduler
        • Functionality: arbitrate the communication resources such as bus backplane, crossbar, etc.
      • Reusability Analysis
        • Request/grant ports convention
        • Encapsulation of arbitration algorithms
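The elided `ProcessRequest` bodies could look like the sketch below, which keeps the slide's class names but replaces the port objects with plain arrays so that it runs on its own. The `processRequest(boolean[])` signature, the return convention (index of the granted requester, or -1), and the rule that a higher number means higher priority are assumptions, not details from the original code.

```java
/** Base class from the slide; here it only fixes the request/grant convention. */
abstract class ResourceScheduler {
    /** Returns the index of the granted requester, or -1 if nobody is requesting. */
    abstract int processRequest(boolean[] request);
}

/** Round-robin arbiter: the search starts just after the most recent winner. */
class Arbiter extends ResourceScheduler {
    private int last = -1; // index granted most recently

    @Override
    int processRequest(boolean[] request) {
        int n = request.length;
        for (int k = 1; k <= n; k++) {
            int i = (last + k) % n;
            if (request[i]) {
                last = i;
                return i;
            }
        }
        return -1;
    }
}

/** Priority arbiter: among active requesters, the highest priority value wins. */
class PriorityArbiter extends Arbiter {
    private final int[] priority;

    PriorityArbiter(int[] priority) {
        this.priority = priority.clone();
    }

    @Override
    int processRequest(boolean[] request) {
        int best = -1;
        for (int i = 0; i < request.length; i++) {
            // Strict comparison: ties go to the lower index.
            if (request[i] && (best < 0 || priority[i] > priority[best])) {
                best = i;
            }
        }
        return best;
    }
}
```

Because both classes share the `ResourceScheduler` contract, a bus or crossbar model written against that contract can switch arbitration policy by changing only which subclass it instantiates.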

  18. Toolset Evaluation
      • Design Space Exploration
        • Two on-chip bus systems and one on-chip packet-switching network
      • Reusability
        • “White-box” reuse
        • Reusable class hierarchy
      • Boosting productivity
        • “Plug and Play” modules
        • A stable and well-maintained library reduces development time and increases productivity
      • Flexibility and extensibility
        • Shared central buffer, integrated power/performance simulation

  19. Simulation Results
      Simulation Speed vs. Development Time
      • Language issue: Java vs. C
      • Scheduler efficiency: DE (discrete-event) vs. clock-based
      • Modeling granularity: speed vs. accuracy tradeoff
      *: Updated simulation speed and code size due to the reimplementation of the simulation framework and key scheduling algorithms.
      All simulations ran on a dual Pentium III 800 MHz machine running Linux.

  20. Conclusions and Future Work
      • Need Design Space Exploration for OCAs
      • Reusable Structural Components of OCAs
      • Fast and Accurate Performance Enables Early Evaluation
      • Modeling Tradeoffs
      • Future Work
        • SystemC Based Modeling
        • Integrated Application Driven Simulation Environment
        • Orion: An integrated performance/power network simulator

  21. Acknowledgements
      • Part of the MESCAL Project (Modern Embedded Systems, Compilers, Architectures, and Languages), Princeton and UC Berkeley
        • www.gigascale.org/mescal
        • mescal.princeton.edu
      • A Gigascale Silicon Research Center (GSRC) effort, funded by DARPA and MARCO
        • www.gigascale.org
      • Liberty Research Group @ Princeton
        • http://liberty.cs.princeton.edu
