
Optimizations for a Simulator Construction System Supporting Reusable Components

David A. Penry and David I. August, The Liberty Architecture Research Group, Princeton University


Presentation Transcript


  1. Optimizations for a Simulator Construction System Supporting Reusable Components • David A. Penry and David I. August • The Liberty Architecture Research Group, Princeton University

  2. Architectural Exploration [Diagram labels: Architecture Options, Architectural Simulator] • Architectural options are studied using simulators • More iterations = better decisions • Need fast path to simulator • Need fast simulator

  3. Simulator Construction Systems [Diagram labels: Architecture Description, Simulator Builder, Architectural Simulator Instance] • Reuse simulator infrastructure • But still must be able to reuse descriptions • Structural composition • Medium-grained components • Standard communication contracts • High parameterizability • Separation of concerns

  4. The Reuse Penalty • Reusability leads to a speed penalty: • more component instances • more signals • more general code • Therefore: reusable systems are often slower • How can we mitigate the reuse penalty?

  5. Liberty Simulation Environment [Diagram labels: Data, Enable, Ack] • Simulator construction system for high reuse • Two-tiered specifications • Leaf module templates in C • Netlisting language for instantiation and customization • Three-signal standard communications contract with overrides (control functions) • Code is generated
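The three-signal contract on slide 5 can be illustrated with a small sketch. This is not LSE code: the `Connection` class, the `resolve` function, and the rule that a transfer happens only when both enable and ack are true are assumptions made for illustration, as is the direction of each signal. A control function is modeled here as a plain function that may rewrite the three signals before they are interpreted.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Tuple

Signals = Tuple[Any, Optional[bool], Optional[bool]]


@dataclass
class Connection:
    """Hypothetical connection carrying the three signals; None means unresolved."""
    data: Any = None
    enable: Optional[bool] = None  # assumed direction: sender to receiver
    ack: Optional[bool] = None     # assumed direction: receiver to sender


def passthrough(data: Any, enable: Optional[bool], ack: Optional[bool]) -> Signals:
    """Default control function: leave all three signals untouched."""
    return data, enable, ack


def resolve(conn: Connection,
            control: Callable[..., Signals] = passthrough) -> Tuple[str, Any]:
    """Apply the control function, then classify the connection state."""
    data, enable, ack = control(conn.data, conn.enable, conn.ack)
    if enable is None or ack is None:
        return "unresolved", None
    if enable and ack:
        return "transferred", data
    return "idle", None


def refuse_all(data: Any, enable: Optional[bool], ack: Optional[bool]) -> Signals:
    """Example override: the receiver side never accepts."""
    return data, enable, False


full = Connection(data=5, enable=True, ack=True)
print(resolve(full))              # both sides agree, so the data moves
print(resolve(full, refuse_all))  # the override blocks the transfer
```

Keeping the override outside the module body is what lets the same leaf module be reused with different flow-control behavior, which is the point the slide makes about customization.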

  6. Contrast: SystemC • Simulator construction libraries (C++) • Partially supports reuse: + Structural composition + Module granularity varies ? Communications contracts by convention - Low parameterizability - Separation of concerns • Description is a C++ program

  7. Models of Computation • SystemC uses Discrete Event (DE) • LSE uses Heterogeneous Synchronous Reactive (HSR), Edwards (1997) • Unparsed code blocks (black boxes) • Values begin unresolved and resolve monotonically • Chaotic scheduling [Diagram labels: blocks A, B, C, D]
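The HSR behavior listed on slide 7 (values begin unresolved, resolve monotonically, and blocks may run in any order) can be sketched as a fixed-point loop. This is a minimal illustration, not the LSE runtime: the function `run_chaotic`, the use of `None` as the unresolved value, and the three example blocks are all assumptions.

```python
from typing import Callable, Dict, List, Optional, Tuple

UNRESOLVED = None  # assumed encoding of the "unresolved" value
Values = Dict[str, Optional[int]]
Block = Callable[[Values], Values]


def run_chaotic(blocks: List[Block], signals: List[str]) -> Tuple[Values, int]:
    """Invoke blocks in the given order until no signal changes."""
    values: Values = {name: UNRESOLVED for name in signals}
    invocations = 0
    changed = True
    while changed:
        changed = False
        for block in blocks:
            invocations += 1
            for name, value in block(values).items():
                if value is UNRESOLVED:
                    continue
                if values[name] is UNRESOLVED:
                    values[name] = value  # unresolved -> resolved, exactly once
                    changed = True
                elif values[name] != value:
                    raise ValueError(f"signal {name} changed after resolving")
    return values, invocations


def source(v: Values) -> Values:
    return {"a": 1}


def increment(v: Values) -> Values:
    return {"b": v["a"] + 1} if v["a"] is not UNRESOLVED else {}


def double(v: Values) -> Values:
    return {"c": v["b"] * 2} if v["b"] is not UNRESOLVED else {}


names = ["a", "b", "c"]
good_values, good_calls = run_chaotic([source, increment, double], names)
bad_values, bad_calls = run_chaotic([double, increment, source], names)
print(good_values, good_calls, bad_calls)
```

Both orders reach the same final values, but the unlucky order needs twice as many block invocations in this example, which is the cost a static schedule is meant to avoid.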

  8. Potential HSR Benefits vs. DE [Diagram labels: blocks A, B, C, D] • Static schedules possible • Lower per-signal overhead • Use of unresolved value to avoid redundant computation

  9. Experimental methodology • Three models of a 4-way out-of-order microprocessor • SystemC using custom speed-optimized components • LSE model using custom speed-optimized components • LSE model using standard reusable components • 9 benchmarks (CPU 2000/MediaBench) • See paper for compiler, etc.
     Non-edge signals Model Signals Instances
     Custom SystemC 4 71 32
     Custom LSE 3 138 48
     Reusable LSE 11 489 423

  10. Custom LSE vs. SystemC • Custom LSE outperforms custom SystemC • Reduction in overhead • Use of unresolved signal value • Static instantiation and code specialization • Dynamic schedule for both

  11. Reuse Penalty • Reusable model suffers large reuse penalty (0.26) • Many more signals • Many more non-edge signals • More components • All dynamic schedules

  12. Creating Static Schedules [Diagram labels: blocks A, B, C, D] • Edwards' algorithm (1997) • Construct a signal dependency graph • Break into strongly-connected components (SCCs) and schedule them in topological order • Partition each SCC into a head and tail • Schedule tail recursively, then repeat head (any order) and tail's schedule • Coalesce

  13. Creating Static Schedules (continued) [Diagram labels: blocks A, B, C, D; signals 1, 2, 3, 4]

  14. Creating Static Schedules (continued) [Diagram labels: blocks A, B, C, D; signals 1, 2, 3, 4; groups a, b, c] • Schedule: a b c

  15. Creating Static Schedules (continued) [Diagram labels: blocks A, B, C, D; signals 1, 2, 3, 4; groups a, b, c; T, H] • Schedule: 1 b 4

  16. Creating Static Schedules (continued) [Diagram labels: blocks A, B, C, D; signals 1, 2, 3, 4; groups a, b, c; T, H] • Schedule: 1 2 3 2 4

  17. Creating Static Schedules (continued) [Diagram labels: blocks A, B, C, D; signals 1, 2, 3, 4; T, H] • Schedule: 1 2 3 2 4 • A B C B (D) • Choosing an optimal partition takes exponential time
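The scheduling steps on slides 12 to 17 can be sketched as follows. This is an illustrative reading of the slides, not the authors' implementation. Several things are assumptions: the example graph (the real one is only in the slide diagram), the head-selection heuristic `choose_head=max` (the slides note that an optimal choice is exponential), repeating the head-plus-tail pass once per head element, and the signal-to-block mapping used for coalescing.

```python
from typing import Callable, Dict, Iterable, List, Set

Graph = Dict[int, Set[int]]  # signal -> signals that depend on it


def strongly_connected_components(graph: Graph) -> List[Set[int]]:
    """Tarjan's algorithm; returns SCCs in topological order (sources first)."""
    index: Dict[int, int] = {}
    low: Dict[int, int] = {}
    stack: List[int] = []
    on_stack: Set[int] = set()
    found: List[Set[int]] = []

    def visit(v: int) -> None:
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in sorted(graph[v]):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            component: Set[int] = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.add(w)
                if w == v:
                    break
            found.append(component)

    for v in sorted(graph):
        if v not in index:
            visit(v)
    found.reverse()  # Tarjan emits sink components first
    return found


def static_schedule(graph: Graph,
                    choose_head: Callable[[Iterable[int]], int] = max) -> List[int]:
    """Schedule every SCC in topological order, splitting cycles into head and tail."""
    order: List[int] = []
    for component in strongly_connected_components(graph):
        if len(component) == 1:
            order.extend(component)
            continue
        head = {choose_head(component)}  # heuristic, not an optimal partition
        tail = component - head
        tail_graph = {v: graph[v] & tail for v in tail}
        tail_order = static_schedule(tail_graph, choose_head)
        order.extend(tail_order)
        for _ in range(len(head)):  # assumed repetition count
            order.extend(sorted(head))
            order.extend(tail_order)
    return order


def coalesce(signal_order: List[int], producer: Dict[int, str]) -> List[str]:
    """Map signals to the blocks producing them and merge adjacent repeats."""
    blocks: List[str] = []
    for signal in signal_order:
        if not blocks or blocks[-1] != producer[signal]:
            blocks.append(producer[signal])
    return blocks


# Hypothetical graph: signals 2 and 3 form a cycle between signals 1 and 4.
example: Graph = {1: {2}, 2: {3}, 3: {2, 4}, 4: set()}
signals = static_schedule(example)
print(signals)
print(coalesce(signals, {1: "A", 2: "B", 3: "C", 4: "D"}))
```

With this assumed graph the sketch produces the signal order 1 2 3 2 4 and the block order A B C B D, matching the schedules shown on slides 16 and 17. Every key in the graph must list only successors that are also keys.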

  18. Dynamic sub-schedule embedding [Diagram labels: blocks A, B, C] • SCCs arise due to incomplete information • “Optimal” schedules are optimal only with respect to the available information • An “optimal” schedule may be worse than a dynamic one • When an SCC is “too big”, just schedule that section dynamically
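The idea on slide 18 of falling back to dynamic scheduling for oversized SCCs can be sketched as a plan-building step. The function `embed_dynamic`, the `max_static_size` threshold, and the tuple-based plan format are hypothetical names chosen for this illustration. The per-component static scheduler is passed in, and `sorted` stands in for it below only as a placeholder.

```python
from typing import Callable, FrozenSet, Iterable, List, Set, Tuple, Union

Step = Tuple[str, Union[int, FrozenSet[int]]]


def embed_dynamic(sccs: Iterable[Set[int]],
                  schedule_component: Callable[[Set[int]], List[int]],
                  max_static_size: int) -> List[Step]:
    """Build a mixed plan: static steps for small SCCs, one dynamic region otherwise."""
    plan: List[Step] = []
    for component in sccs:
        if len(component) > max_static_size:
            # Too big: leave this region to a run-time (dynamic) scheduler.
            plan.append(("dynamic", frozenset(component)))
        else:
            plan.extend(("static", s) for s in schedule_component(component))
    return plan


# SCCs are assumed to arrive already in topological order.
components = [{1}, {2, 3, 5, 6, 7}, {4}]
plan = embed_dynamic(components, sorted, max_static_size=3)
for step in plan:
    print(step)
```

The threshold is a tuning knob: a static schedule for a large SCC repeats its tail many times, so beyond some size the dynamic region may execute fewer block invocations.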

  19. Dependency information enhancement [Diagram labels: blocks A, B, C] • In practice, we see big SCCs • Peek in the black box • Simple parsing of communication overrides (control functions) • Can ask user to tell about internal dependencies • Not too painful because it is reused
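Why extra dependency information shrinks SCCs, as slide 19 suggests, can be shown with a small sketch. The block descriptions, the `declared` table of internal dependencies, and both function names are hypothetical. Without declarations, each output of a black-box block is conservatively assumed to depend on every input of that block.

```python
from typing import Dict, List, Optional, Set, Tuple

Blocks = Dict[str, Tuple[List[str], List[str]]]  # block -> (inputs, outputs)
Declared = Dict[str, Dict[str, List[str]]]       # block -> output -> inputs it reads
Edge = Tuple[str, str]


def dependency_edges(blocks: Blocks, declared: Optional[Declared] = None) -> Set[Edge]:
    """Build signal dependency edges, using declared dependencies where available."""
    declared = declared or {}
    edges: Set[Edge] = set()
    for name, (inputs, outputs) in blocks.items():
        for out in outputs:
            needed = declared.get(name, {}).get(out, inputs)  # default: all inputs
            edges.update((src, out) for src in needed)
    return edges


def has_cycle(edges: Set[Edge]) -> bool:
    """Kahn's algorithm: a cycle exists if some node can never be removed."""
    nodes = {n for edge in edges for n in edge}
    indegree = {n: 0 for n in nodes}
    for _, dst in edges:
        indegree[dst] += 1
    ready = [n for n in nodes if indegree[n] == 0]
    removed = 0
    while ready:
        n = ready.pop()
        removed += 1
        for src, dst in edges:
            if src == n:
                indegree[dst] -= 1
                if indegree[dst] == 0:
                    ready.append(dst)
    return removed != len(nodes)


# Block A reads b_out and drives a1 and a2; block B reads a1 and drives b_out.
blocks: Blocks = {"A": (["b_out"], ["a1", "a2"]), "B": (["a1"], ["b_out"])}
# Declared by the component author: a1 does not read any input in the same cycle.
hints: Declared = {"A": {"a1": [], "a2": ["b_out"]}}

print(has_cycle(dependency_edges(blocks)))         # conservative graph
print(has_cycle(dependency_edges(blocks, hints)))  # graph with declared dependencies
```

In this example the conservative graph contains a cycle between `a1` and `b_out`, while the declared dependencies remove it, so the region can be scheduled statically. Because the declaration lives with the reusable component, it is written once and benefits every model that instantiates it.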

  20. Evaluation of Information Enhancement • Control function parsing more useful alone • Not principally through scheduling • It is important to have both kinds of enhancement

  21. Reuse Penalty Revisited • Reuse penalty mitigated in part • Reusable LSE model 6% faster than custom SystemC

  22. Conclusions • A tradeoff exists between speed and reuse • The simulator construction system can help • Higher base speed makes reuse penalty less painful • Optimizations are possible with HSR model • Ability of the scheduler to adapt to the available information is powerful • This adaptation is not possible with DE • You can have high reuse at reasonable speeds

  23. Future Work • Release of LSE • Fall 2003 • http://liberty.princeton.edu • Hybrid model of computation • Embed HSR in DE, DE in HSR • Automatic extraction of HSR portions from DE

  24. Other optimizations • Improved block coalescing • See paper • Code specialization • Implementation of APIs depends upon environment
