## Automated Design of Self-Adjusting Pipelines


**Automated Design of Self-Adjusting Pipelines**
Jieyi Long and Seda Ogrenci Memik, Dept. of EECS, Northwestern Univ.

**Outline**
• Introduction
• Self-Adjusting Pipeline (SAP)
• Systematic Design Framework of SAP
• Experiment: Microprocessor Pipeline
• Conclusions

**Introduction**
Novel design methodologies are needed!
• Aggressive scaling down
  • Process variation (PV)
  • Circuit behaviors are harder to predict
• Steadily increasing integration capacity
  • Complexity of the designs is amplified from generation to generation
• Challenges for the traditional CAD flow
  • Synthesis: has to be conservative
  • Simulation: challenging due to the variations
  • Verification: very tricky due to the complexity

• Self-Adjusting Architecture
  • A promising methodology to address the above-mentioned challenges
  • Offers a way to handle uncertainty after manufacturing
  • Still in its infancy
  • Automated design tools
• Examples of Self-Adjusting Architectures
  • Razor [Ernst et al., MICRO 2003]
  • SACTA [Long et al., ICCAD 2007]

**Self-Adjusting Pipeline (SAP)**
• Impact of process variation
  • Traditionally, pipeline stages are designed to have the same nominal delay (i.e., balanced)
  • Due to PV, balanced designs are actually NOT balanced!
  • The impact of PV differs from stage to stage
  • FMAX model [Bowman et al., TCAD 2007]: with the same nominal delay, pipeline stages having a larger number of independent critical paths and a smaller logic depth have a higher probability of a longer delay
[Figure: pipeline registers R1, R2, R3 on a common clk; stages labeled "More Vulnerable" and "Less Vulnerable"]
• Allocate execution time on a need basis
  • Our solution: create dynamic clock skews to satisfy the actual need of the stages
• How to obtain the actual execution time of each stage?
  • Razor: detect after execution
  • Our solution: measure and predict!
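The FMAX observation above (more independent critical paths and smaller logic depth make a stage slower on average at the same nominal delay) can be illustrated with a small Monte Carlo sketch. This is not the Bowman et al. model itself. It assumes, purely for illustration, that gate delays are independent Gaussians with a 10% relative sigma, that a path delay is the sum of its gate delays, and that a stage delay is the maximum over its paths. The function name, the parameter values, and the `cache_like` / `fetch_like` labels are all hypothetical.

```python
import random


def mean_stage_delay(num_paths, logic_depth, nominal=1.0, rel_sigma=0.10,
                     trials=2000, seed=0):
    """Average stage delay over many simulated chips (illustrative model only).

    Assumptions (not from the slides): i.i.d. Gaussian gate delays, every
    critical path has the same nominal delay, stage delay = slowest path.
    """
    rng = random.Random(seed)
    gate_mean = nominal / logic_depth      # each path sums to `nominal` on average
    gate_sigma = rel_sigma * gate_mean     # per-gate variation
    total = 0.0
    for _ in range(trials):
        # One simulated chip: draw every path, keep the slowest one.
        worst = max(
            sum(rng.gauss(gate_mean, gate_sigma) for _ in range(logic_depth))
            for _ in range(num_paths)
        )
        total += worst
    return total / trials


# Many shallow paths versus few deep paths, same nominal delay of 1.0.
cache_like = mean_stage_delay(num_paths=64, logic_depth=4)
fetch_like = mean_stage_delay(num_paths=4, logic_depth=32)
print(f"many shallow paths: {cache_like:.3f}")
print(f"few deep paths:     {fetch_like:.3f}")
```

Under these assumptions both averages exceed the nominal delay, and the stage with many shallow paths is the slower of the two, which is the imbalance that dynamic clock skew is meant to absorb.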
• How to fix the timing error?
  • Our solution: measure and predict!
  • We predict the error before it manifests itself, so we might have time to fix it
• Two supporting circuit elements
  • Delay sensor [based on Ghosh et al., TCAD 2007]
  • Adjustable skew buffer
[Figure: delay sensor schematic; labels: CLK, P, Sawtooth, VREF, TAH, Tmax, TD]
[Figure: adjustable skew buffer]

**Systematic Design Framework of SAP**
• Objective function: average performance
• Performance metric of a set of chips: speed-binning Batch Performance [Das et al., ASGI 2007]
  • Frequency bins: $[f_1, f_2], \ldots, [f_i, f_{i+1}], \ldots, [f_n, f_{n+1}]$
  • Yield $y_i$ of bin $[f_i, f_{i+1}]$: the fraction of chips falling into the bin

$$BP = \sum_{i=1}^{n} f_i \cdot y_i$$

• Variables
  • Locations of the delay sensors
    • Cannot be too early in the stage, otherwise the prediction will not be accurate
    • Cannot be too late in the stage, otherwise we do not have time to fix the error
  • Nominal delay of the adjustable skew buffer
    • Cannot be too small, otherwise the timing error in the first stage is not fixed
    • Cannot be too large, otherwise there might be a timing error in the second stage

**Systematic Design Framework of SAP: Problem Definition**
Automated delay sensor insertion and clock skew buffer configuration
• Given:
  1) two back-to-back pipeline stages, where the first stage is more vulnerable to process variation than the second
  2) the maximum tolerable delay of each internal node in the pipeline
• Determine:
  1) the locations of the delay sensors
  2) the nominal delay of the adjustable skew buffers
such that the Batch Performance is maximized.

**Systematic Design Framework of SAP: Mixed-Integer Programming Formulation**
• Directed acyclic graph (DAG) representation of the pipelines
  • Gates → vertices
  • Registers → primary I/O vertices
  • Interconnects → directed edges
  • Primary path: a path between a primary input vertex and a primary output vertex
  • Coverage requirement: each primary path must be covered by one and only one delay sensor
  • The edges that carry a delay sensor form a cut of the DAG
• Decision variables
  • We assign a decision variable $x_i$ to each vertex
  • The decision variables specify the locations of the sensors: a sensor is on edge $(v_i, v_j)$ iff $x_i - x_j = 1$
• Constraints specifying the coverage requirement

$$
\begin{aligned}
x_i - x_j &\ge 0 && \text{for each edge } (v_i, v_j) \\
x_p &= 1 && \text{for all } v_p \in PI_1 \\
x_p &= 0 && \text{for all } v_p \in PI_2 \\
x_q &= 0 && \text{for all } v_q \in PO_1 \cup PO_2
\end{aligned}
$$

• Forbidden vertex set $V_F$
  • Contains the vertices whose delay to any primary output is less than the worst-case delay of the OR-MUX chain

$$x_f = 0 \quad \text{for all } v_f \in V_F$$

• Objective function: Batch Performance
  • $\Pr(f)$: the probability that the pipeline stage meets the timing constraints at frequency $f$

$$BP = \sum_{i=1}^{n} f_i \cdot y_i = \sum_{i=1}^{n} f_i \cdot \bigl(\Pr(f_i) - \Pr(f_{i+1})\bigr)$$

• Analysis of $\Pr(f)$
  • We consider two situations: at least one error is predicted, or no error is predicted
  • Definitions:
    • $D_i$: accumulated delay at node $i$
    • $D_i^m$: the maximum tolerable delay at node $i$
    • $\alpha(x) = 1$ if $x > 0$, and $0$ otherwise
• Case 1: at least one error is predicted
  • The sensor on edge $(v_i, v_j)$ detects an error if $\alpha\bigl((x_i - x_j)(D_i - D_i^m)\bigr) = 1$
  • At least one sensor detects an error:

$$R_1 = \sum_{(v_i, v_j)} \alpha\bigl((x_i - x_j)(D_i - D_i^m)\bigr) > 0$$

  • The skew buffer is reconfigured to generate a skew of amount $\delta$
  • For stage 1, the effective clock cycle time becomes $1/f + \delta$, so for each $v_k \in PO_1$ we need $\alpha(1/f + \delta - D_k) = 1$
  • For stage 2, the effective clock cycle time becomes $1/f - \delta$, so for each $v_k \in PO_2$ we need $\alpha(1/f - \delta - D_k) = 1$
  • Timing correctness requirement:

$$R_2 = \Bigl(\prod_{v_k \in PO_1} \alpha(1/f + \delta - D_k)\Bigr)\Bigl(\prod_{v_k \in PO_2} \alpha(1/f - \delta - D_k)\Bigr) = 1$$

  • The probability of the error being fixed: $\Pr(R_1 > 0 \text{ and } R_2 = 1)$
• Case 2: no error is predicted, and there is actually no timing error:

$$R_3 = \Bigl(\prod_{v_k \in PO_1} \alpha(1/f - D_k)\Bigr)\Bigl(\prod_{v_k \in PO_2} \alpha(1/f - D_k)\Bigr) = 1$$

• The probability that the pipeline executes correctly:

$$\Pr(f) = \Pr(R_1 > 0 \text{ and } R_2 = 1) + \Pr(R_1 = 0 \text{ and } R_3 = 1)$$

• Complete formulation:

$$
\begin{aligned}
\max \quad & \sum_{i=1}^{n} f_i \cdot \bigl(\Pr(f_i) - \Pr(f_{i+1})\bigr) \\
\text{s.t.} \quad & x_i - x_j \ge 0 && \text{for each edge } (v_i, v_j) \\
& x_p = 1 && \text{for all } v_p \in PI_1 \\
& x_p = 0 && \text{for all } v_p \in PI_2 \\
& x_q = 0 && \text{for all } v_q \in PO_1 \cup PO_2 \\
& x_f = 0 && \text{for all } v_f \in V_F \\
& x_i \in \{0, 1\}
\end{aligned}
$$

**Systematic Design Framework of SAP: Simulated Annealing Solving the MIP Formulation**
• Solution space: $X = \{x_1, x_2, \ldots, x_n\}$ satisfying the constraints of the MIP formulation
• Initial solution: $x_i = 1$ iff $v_i$ belongs to $PI_1$
• Solution perturbation $M_0(x_j, X)$ (0-to-1 toggle):
  i) keep the value of $x_i$ for each $i \ne j$;
  ii) change the value of $x_j$ from 0 to 1 if 1) $x_j = 0$ and $x_i = 1$ for each edge $(v_i, v_j)$, and 2) $v_j$ does not belong to $V_F$
• Solution perturbation $M_1(x_i, X)$ (1-to-0 toggle):
  i) keep the value of $x_j$ for each $j \ne i$;
  ii) change the value of $x_i$ from 1 to 0 if $x_i = 1$ and $x_j = 0$ for each edge $(v_i, v_j)$

**Experiment: Microprocessor Pipeline**
[Figure: pipeline stages IF, MAP, IQ, REG, ALU, Cache, annotated "Less Vulnerable" and "More Vulnerable"]
• DEC Alpha-like 6-stage pipeline
  • Cache and IF are next to each other
  • Cache has many critical paths, each consisting of a small number of gates
  • IF has just a few critical paths, each consisting of a large number of gates
  • According to the FMAX model [Bowman et al., TCAD 2007], the delay of the cache tends to be longer
  • It is beneficial to create a dynamic clock skew when the cache does need a longer execution time
• Setup
  • The critical paths of the Cache and IF are extracted from the Verilog code of the OpenSPARC processor
  • We assume 45nm technology
• Results
  • The average frequency increases from 1.989GHz to 2.178GHz (9.5% improvement)

**Conclusions**
• We identified the challenges in modern VLSI CAD tool design
• We proposed to leverage the Self-Adjusting Pipeline to address these problems
• We proposed a systematic design framework for SAP
• Application: microprocessor pipeline
• Experimental results illustrate the effectiveness of our approach

**Thank You!**
Any questions?
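The coverage constraints and the two perturbation moves M0 and M1 from the simulated-annealing slides can be sketched in Python on a toy graph. Everything concrete here is an assumption for illustration: the six-vertex DAG, the vertex names, the choice of `g3` as a forbidden vertex, and the extra guards that keep primary inputs at 1 and primary outputs at 0 (the slides state these as constraints rather than as part of the moves). No cost function or annealing schedule is included, so this is only the move set and a feasibility check, not the authors' solver.

```python
import random

# Hypothetical stage-1 DAG: a, b are primary inputs (PI1), o is a primary output (PO1).
EDGES = [("a", "g1"), ("b", "g1"), ("b", "g2"),
         ("g1", "g3"), ("g2", "g3"), ("g3", "o")]
PI1 = {"a", "b"}
PO1 = {"o"}
FORBIDDEN = {"g3"}  # V_F: assumed too close to the output for a fix to arrive in time

VERTICES = sorted({v for edge in EDGES for v in edge})
PREDS = {v: [u for u, w in EDGES if w == v] for v in VERTICES}
SUCCS = {v: [w for u, w in EDGES if u == v] for v in VERTICES}


def initial_solution():
    """x_i = 1 iff v_i is a stage-1 primary input."""
    return {v: int(v in PI1) for v in VERTICES}


def sensor_edges(x):
    """A sensor sits on edge (u, v) iff x[u] - x[v] == 1."""
    return [(u, v) for u, v in EDGES if x[u] - x[v] == 1]


def is_feasible(x):
    """Check the MIP constraints for this single-stage toy graph."""
    return (all(x[u] - x[v] >= 0 for u, v in EDGES)
            and all(x[v] == 1 for v in PI1)
            and all(x[v] == 0 for v in PO1 | FORBIDDEN))


def toggle_0_to_1(x, v):
    """M0: raise x[v] if it is 0, all predecessors are 1, and v is allowed."""
    ok = (x[v] == 0 and v not in FORBIDDEN and v not in PO1
          and all(x[u] == 1 for u in PREDS[v]))
    if ok:
        x[v] = 1
    return ok


def toggle_1_to_0(x, v):
    """M1: lower x[v] if it is 1, all successors are 0, and v is not an input."""
    ok = x[v] == 1 and v not in PI1 and all(x[w] == 0 for w in SUCCS[v])
    if ok:
        x[v] = 0
    return ok


def primary_paths(v, path=()):
    """Enumerate all paths from v to a primary output."""
    path = path + (v,)
    if v in PO1:
        yield path
    for w in SUCCS[v]:
        yield from primary_paths(w, path)


def covered_exactly_once(x):
    """Coverage requirement: every primary path crosses exactly one sensor edge."""
    sensors = set(sensor_edges(x))
    return all(sum(edge in sensors for edge in zip(p, p[1:])) == 1
               for s in PI1 for p in primary_paths(s))


x = initial_solution()
print(sensor_edges(x))       # sensors right after the inputs
toggle_0_to_1(x, "g1")       # push the cut past gate g1
print(sensor_edges(x))

# Random walk over the move set; every visited solution stays feasible.
rng = random.Random(1)
for _ in range(50):
    move = rng.choice([toggle_0_to_1, toggle_1_to_0])
    move(x, rng.choice(VERTICES))
print(is_feasible(x), covered_exactly_once(x))
```

Because each move only fires when it keeps $x_i - x_j \ge 0$ on every edge, the labels stay non-increasing along every primary path, which is why each path always crosses exactly one sensor edge.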
