Design-Space Exploration of Resource-Sharing Solutions for Custom Instruction Set Extensions

Design-Space Exploration of Resource-Sharing Solutions for Custom Instruction Set Extensions. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 28, NO. 12, DECEMBER 2009. Marcela Zuluaga and Nigel Topham. 2010/05/20. Presenter: 陳俊霖.


Presentation Transcript


  1. Design-Space Exploration of Resource-Sharing Solutions for Custom Instruction Set Extensions. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 28, NO. 12, DECEMBER 2009. Marcela Zuluaga and Nigel Topham. 2010/05/20. Presenter: 陳俊霖

  2. Outline • Introduction • Merging two DFGs • Parametric Resource-Sharing Heuristic • A. Multioperation Vertices • B. Controlling the Area-Latency Tradeoff • C. Controlling the Execution-Time Impact of Merging • D. Vertex Grouping • Results

  3. Introduction Resource sharing can reduce the die area and energy consumption of a customized processor. However, the latencies of the instruction set extensions (ISEs) may increase after merging.

  4. Introduction • Assumptions: • ISEs have been identified by a previous compiler phase. • ISEs are represented as a collection of directed acyclic graphs (DAGs) annotated with execution frequency. • All inputs arrive together and all outputs leave together. • Problem: • How can such a collection of graphs be merged to reduce the overall die area while minimizing the increase in execution latency?

  5. Merging two DFGs (labels from the slide's worked-example figure) • Global common strings: G1(2,4) G2(0,2); G1(3) G2(4); G1(1) G2(1) • Local common strings: G’(1)G’(5), G’(3)G’(7) • Local common string: G’(3)G’(7) • MaxStrLocal: G’(3)G’(7) • MaxStrGlobal: G1(2,4) G2(0,2) • MaxStrLocal: G’(1)G’(5)

  6. Parametric Resource-Sharing Heuristic Resource sharing is induced by the search for maximum-area common substrings between two paths belonging to different graphs. Candidate substrings are ranked by the expected area saved rather than by substring length alone, so that the area reduction is maximized.
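
A minimal sketch of the maximum-area common substring search, assuming each path is a list of operator names and each operator has a known area. The function name `max_area_common_substring`, the operator names, and the area figures are hypothetical, chosen only for illustration; the slides do not give the authors' implementation. The code is the standard longest-common-substring dynamic program, with matched positions weighted by operator area instead of counted.

```python
from typing import Dict, List, Tuple


def max_area_common_substring(
    path_a: List[str],
    path_b: List[str],
    area: Dict[str, float],
) -> Tuple[float, int, int, int]:
    """Return (saved_area, start_a, start_b, length) of the best common substring.

    Each DP cell holds the (area, length) of the common run that ends at
    path_a[i - 1] and path_b[j - 1]; areas are assumed to be non-negative.
    """
    best = (0.0, 0, 0, 0)
    prev = [(0.0, 0)] * (len(path_b) + 1)
    for i in range(1, len(path_a) + 1):
        curr = [(0.0, 0)] * (len(path_b) + 1)
        for j in range(1, len(path_b) + 1):
            if path_a[i - 1] == path_b[j - 1]:
                run_area, run_len = prev[j - 1]
                run_area += area[path_a[i - 1]]  # extend the run by one operator
                run_len += 1
                curr[j] = (run_area, run_len)
                if run_area > best[0]:
                    best = (run_area, i - run_len, j - run_len, run_len)
        prev = curr
    return best


# Hypothetical paths and areas: two common substrings of equal length 2 exist,
# but "add, mul" saves far more area than "shl, add".
path_a = ["add", "mul", "shl", "add"]
path_b = ["shl", "add", "xor", "add", "mul"]
area = {"add": 1.0, "mul": 8.0, "shl": 2.0, "xor": 1.0}
result = max_area_common_substring(path_a, path_b, area)
```

In this example both candidate substrings have the same length, so a length-based ranking could not distinguish them, while the area-based ranking picks the one containing the multiplier.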

  7. Parametric Resource-Sharing Heuristic • Five parameters control the algorithm so that many alternative solutions can be found. • αT, βT, θT: threshold parameters, real values in the range [0, 1]. • They limit the increase in ISE execution delay in relation to the area saved by merging operators. • MultiOp: binary value. • It controls the creation of multioperation vertices from similar operators. • Grouping: binary value. • It determines whether certain operator groupings will be recognized and exploited during the merging process.
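
The five parameters span the design space that is explored. The following sketch shows one way to enumerate parameter settings, assuming the thresholds are sampled on a small grid; the class `HeuristicConfig`, the function `sweep`, and the grid values are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass
from itertools import product
from typing import Iterator, Sequence


@dataclass(frozen=True)
class HeuristicConfig:
    alpha_t: float   # threshold alpha_T, in [0, 1]
    beta_t: float    # threshold beta_T, in [0, 1]
    theta_t: float   # threshold theta_T, in [0, 1]
    multi_op: bool   # MultiOp: allow multioperation vertices
    grouping: bool   # Grouping: recognize operator groups

    def __post_init__(self) -> None:
        # Reject thresholds outside the documented range [0, 1].
        for name in ("alpha_t", "beta_t", "theta_t"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must lie in [0, 1], got {value}")


def sweep(thresholds: Sequence[float]) -> Iterator[HeuristicConfig]:
    """Yield every combination of the sampled thresholds and both binary flags."""
    flags = (False, True)
    for a, b, t, m, g in product(thresholds, thresholds, thresholds, flags, flags):
        yield HeuristicConfig(a, b, t, m, g)


# Three sample values per threshold (an arbitrary grid) give 3 * 3 * 3 * 2 * 2 settings.
configs = list(sweep([0.0, 0.5, 1.0]))
```

Each configuration would drive one run of the merging heuristic, and the resulting area and latency figures form one point in the explored design space.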

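  8. A. Multioperation Vertices • Vertices that perform similar but different operations can be merged with a small overhead. • The creation of multioperation vertices is governed by the parameter MultiOp. • Area saving of a multioperation vertex that merges operators x and y: • Area(x,y) = Ax + Ay - Axy, where Ax and Ay are the areas of the separate operators and Axy is the area of the merged multioperation unit.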
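
A small sketch of the area-saving formula Area(x,y) = Ax + Ay - Axy and of how MultiOp could gate it. The function `merge_saving`, the operator names, and all area figures are hypothetical. Two further assumptions are made here and are not stated on the slide: identical operators share one unit at no extra cost (so Axy = Ax), and operator pairs without a known multioperation unit are not merged.

```python
from typing import Dict, FrozenSet


def merge_saving(
    op_x: str,
    op_y: str,
    unit_area: Dict[str, float],
    multi_area: Dict[FrozenSet[str], float],
    multi_op: bool,
) -> float:
    """Area saved by mapping op_x and op_y onto one vertex (0.0 if not merged)."""
    if op_x == op_y:
        # Assumption: a shared identical unit costs Axy = Ax, so the saving is Ay.
        return unit_area[op_y]
    if not multi_op:
        # MultiOp = 0: different operations are never merged.
        return 0.0
    key = frozenset((op_x, op_y))
    if key not in multi_area:
        return 0.0
    # Area(x, y) = Ax + Ay - Axy
    return unit_area[op_x] + unit_area[op_y] - multi_area[key]


# Hypothetical areas in arbitrary units.
unit_area = {"add": 100.0, "sub": 110.0, "mul": 900.0}
multi_area = {frozenset(("add", "sub")): 130.0}  # combined add/sub unit

saving_on = merge_saving("add", "sub", unit_area, multi_area, multi_op=True)
saving_off = merge_saving("add", "sub", unit_area, multi_area, multi_op=False)
```

With these figures the add/sub pair saves 80 units when MultiOp is enabled and nothing when it is disabled.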

  9. B. Controlling the Area-Latency Tradeoff • The heuristic must decide whether the increased functional-unit latency resulting from a merge is sufficiently offset by the area savings to make the merge beneficial. • θ is introduced to quantify the area-latency tradeoff. • θk, for k ∈ {x, y}, is defined by an expression with two terms (the equation itself is not preserved in this transcript): • First term: the relative decrease in latency obtained by not performing the merge. • Second term: the area savings that result from merging. • If θx or θy exceeds the threshold θT, the merged graph G’ is discarded from Gout, the offending common substring is stored in the set S* of forbidden substrings, and merging restarts.
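
The discard-and-restart rule can be sketched as a loop over candidate common substrings. Because the θ equation is not preserved in this transcript, the sketch takes θ as a caller-supplied function `theta_of`; that function, the name `merge_with_threshold`, and the θ values in the example are hypothetical. The substring labels and areas are the ones shown on the next slide.

```python
from typing import Callable, Dict, List, Set, Tuple


def merge_with_threshold(
    candidates: Dict[str, float],
    theta_of: Callable[[str], Tuple[float, float]],
    theta_t: float,
) -> Tuple[List[str], Set[str]]:
    """Accept candidate substrings by decreasing area, honoring theta_T.

    Returns (accepted substrings, forbidden set S*).
    """
    forbidden: Set[str] = set()  # S*: substrings that must not be merged
    while True:
        accepted: List[str] = []
        restart = False
        by_area = sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)
        for name, _saved_area in by_area:
            if name in forbidden:
                continue
            theta_x, theta_y = theta_of(name)
            if theta_x > theta_t or theta_y > theta_t:
                # Poor tradeoff: forbid this substring and redo the merge.
                forbidden.add(name)
                restart = True
                break
            accepted.append(name)
        if not restart:
            return accepted, forbidden


# Areas as on the slide; the theta values below are invented for illustration.
candidates = {"G1(1)G3(3)": 6.0, "G1(3)G3(2)": 5.0}
theta_table = {"G1(1)G3(3)": (0.9, 0.2), "G1(3)G3(2)": (0.1, 0.3)}
accepted, forbidden = merge_with_threshold(candidates, theta_table.__getitem__, 0.5)
```

The loop always terminates because every restart adds one substring to S*, and S* can hold at most all candidates.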

  10. B. Controlling the Area-Latency Tradeoff (labels from the slide's worked-example figure) • Common substrings: G1(1)G3(3): area = 6; G1(3)G3(2): area = 5 • Calculate θ1 and θ2. • If θ1 and θ2 < θT, the merge is kept. • If θ1 or θ2 > θT, discard and restart; the substring becomes forbidden. • S* = {G1(1)G3(3)} • S* = {G1(0)G2(3)}

  11. C. Controlling the Execution-Time Impact of Merging • Although θT guards against a poor tradeoff between area savings and increased latency, it is not sufficient on its own. • If G1 is a frequently executed graph, any latency added to it by merging is paid on every execution, so the resulting Gout is not a good solution.

  12. C. Controlling the Execution-Time Impact of Merging • A value αi is computed for each Gi to counteract this effect (the equation itself is not preserved in this transcript). • Fi: the normalized execution frequency of Gi. • Mi: the percentage of the area of Gi taken by operations that could be merged. • Each αi is compared with the threshold αT before merging. If αi exceeds αT, Gi is excluded from the set of input graphs. • The effect of αT is to leave Gi unmerged if the merging process would increase its latency beyond an acceptable threshold.
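
The αT filter can be sketched as a pre-pass over the input graphs. The formula that combines Fi and Mi into αi is not available in this transcript, so the sketch takes it as a caller-supplied function. The product Fi * Mi used in the example is only a placeholder assumption, not the paper's definition; the names `GraphStats` and `select_mergeable` and all numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class GraphStats:
    name: str
    freq: float                # F_i: normalized execution frequency
    mergeable_fraction: float  # M_i: share of area in possibly merged operations


def select_mergeable(
    graphs: Sequence[GraphStats],
    alpha_of: Callable[[GraphStats], float],
    alpha_t: float,
) -> Tuple[List[str], List[str]]:
    """Split graphs into (kept for merging, excluded because alpha_i > alpha_T)."""
    kept: List[str] = []
    excluded: List[str] = []
    for graph in graphs:
        if alpha_of(graph) > alpha_t:
            excluded.append(graph.name)  # left unmerged to protect its latency
        else:
            kept.append(graph.name)
    return kept, excluded


def placeholder_alpha(graph: GraphStats) -> float:
    # Placeholder only: NOT the formula from the paper.
    return graph.freq * graph.mergeable_fraction


graphs = [
    GraphStats("G1", freq=0.7, mergeable_fraction=0.8),
    GraphStats("G2", freq=0.2, mergeable_fraction=0.9),
    GraphStats("G3", freq=0.1, mergeable_fraction=0.5),
]
kept, excluded = select_mergeable(graphs, placeholder_alpha, alpha_t=0.3)
```

Under the placeholder, the frequently executed G1 is excluded from merging while G2 and G3 remain candidates, which is the qualitative behavior the slide describes.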

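  13. C. Controlling the Execution-Time Impact of Merging • Another case (shown as a figure on the slide). • Another metric is introduced (its equation is not preserved in this transcript), where n is the number of input graphs.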

  14. D. Vertex Grouping • Certain operator sequences can be combined as an atomic unit during logic synthesis to yield smaller and faster solutions than their individual components. • The Grouping parameter controls whether operator groups are identified and retained, instead of trying to merge each operator independently.

  15. Results • The specific effect of varying αT, βT, and θT. • MultiOp = 0 and Grouping = 0. • As θT is reduced, the resulting solutions are pushed to the left.

  16. Results (plot legend) • x’s: MultiOp=0, Grouping=0 • circles: MultiOp=1, Grouping=0 • squares: MultiOp=0, Grouping=1 • crosses: MultiOp=1, Grouping=1
