A Unified Model for Timing Speculation: Evaluating the Impact of Technology Scaling, CMOS Design Style, and Fault Recovery Mechanism

Marc de Kruijf

Shuou Nomura

Karu Sankaralingam


From Hard to Harder

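[Figure: technology nodes from 10000nm through 720nm, 360nm, 180nm, and 90nm to 45nm & beyond, on a scale running from "Hard" to "Harder". Other labels in the figure: 4000um, 1500um.]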


What is the Problem?

  • Non-ideal transistor scaling

    • Transistor wear-out

    • Process, voltage, and temperature (PVT) variations

    • Errors due to particle interference

    • Noise coupling & crosstalk


What is the Problem?

[Figure: a "Performance Toolbox" and a "Reliability Toolbox". Techniques shown: dynamic verification, multi-core, coherence & consistency, timing speculation, DMR, ECC, on-chip network, TMR, RMT, watchdog, branch prediction, out-of-order, HW checkpoints.]

  • Need high-level analysis tools


Our Contribution

  • A model for timing speculation

    • Unifies hardware + system

    • Small set of high-level inputs

[Figure label: processor designer]

Also...

Q. What is the impact of technology scaling?

A. Further benefits are small to none.

Q. What is the impact of CMOS design style?

A. Very low power designs benefit most.

Q. What is the impact of the fault recovery mechanism?

A. Fine-grained recovery is key to high efficiencies.


Outline

  • Timing Speculation

  • Model Overview

    • Hardware Efficiency Model

    • System Recovery Model

  • Results

  • Conclusion


Timing Speculation

[Figure: a clock waveform with the clock period (= 1/frequency) compared against circuit delay plus variations. Annotations: "Timing failure!" with "detect & recover"; "slower clock" with "OK!".]


Model Overview

[Figure: three plots, each against error rate. Hardware Efficiency (energy vs. error rate) and System Recovery (time vs. error rate) combine into Overall Efficiency (energy vs. error rate).]

Model Inputs

1. A hardware path delay distribution

2. Effect of variations on path delay as N(μ,σ)

3. The time between recovery checkpoints

4. The time to restore a checkpoint


Hardware Efficiency Model

Input 1: Path delay distribution

Input 2: Path delay variation (σ)

[Figure: plot labels, in order of appearance: # paths vs. path delay (input 1); error prob. vs. clock period (input 2); energy vs. clock period (e.g. frequency scaling); energy vs. error rate (output).]
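The step from the two hardware inputs to an error rate can be illustrated with a small sketch. It assumes each path's actual delay is normally distributed around its nominal delay μ with σ given as a fraction of μ, and it averages the per-path tail probability over a path histogram. The histogram values, the function names, and the averaging step are illustrative assumptions, not the authors' method.

```python
import math

def path_error_prob(nominal_delay, clock_period, sigma_ratio):
    """P(actual delay > clock period) for one path, with delay ~ N(mu, sigma_ratio * mu)."""
    sigma = sigma_ratio * nominal_delay
    z = (clock_period - nominal_delay) / sigma
    # Upper tail of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2))

def error_rate(path_histogram, clock_period, sigma_ratio):
    """Per-path error probability averaged over the histogram (an assumed aggregation)."""
    total = sum(count for _, count in path_histogram)
    weighted = sum(count * path_error_prob(delay, clock_period, sigma_ratio)
                   for delay, count in path_histogram)
    return weighted / total

# Hypothetical histogram: (nominal path delay in ns, number of paths).
histogram = [(0.6, 500), (0.8, 300), (0.95, 150), (1.0, 50)]

# Shortening the clock period raises the error rate.
for period in (1.2, 1.1, 1.0, 0.9):
    print(period, error_rate(histogram, period, sigma_ratio=0.046))
```

The σ ratio of 0.046 is the High Performance CMOS value at 45nm quoted in the slides; everything else in the example is made up for illustration.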


System Recovery Model

(applies to all backward error recovery systems)

overhead(rate) = failures(rate) x ( waste(rate) + restore )

System Recovery Model Inputs

1. The time between recovery checkpoints (cycles)

2. The time to restore a checkpoint (restore)

[Figure: time vs. error rate.]


Results

Is the model useful?

What can we learn?

  • Technology Node: 45nm, 11nm

  • CMOS Design Style: High Performance CMOS, Low Power CMOS, Ultra-low Power CMOS

  • Recovery System: Razor, Reunion, Paceline


Hardware Model Inputs

  • Path delay distribution

    • Application: H.264 decoding

    • Hardware: OpenRISC processor

  • Effect of process variations as N(μ,σ) using ITRS data

    • High Performance CMOS

      • 45nm σ = 0.046μ

      • 11nm σ = 0.051μ

    • Low Power CMOS

      • 45nm σ = 0.029μ

      • 11nm σ = 0.042μ

    • Ultra-low Power CMOS

      • 45nm σ = 0.196μ
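To get a feel for what these σ values mean, the sketch below computes the clock period a conventional design would need in order to cover μ + kσ. The choice k = 3 and the function name are assumptions made for illustration; the slides do not state which guard band the authors compare against.

```python
# Sigma as a fraction of mu, taken from the list above.
SIGMA_RATIO = {
    ("High Performance", "45nm"): 0.046,
    ("High Performance", "11nm"): 0.051,
    ("Low Power", "45nm"): 0.029,
    ("Low Power", "11nm"): 0.042,
    ("Ultra-low Power", "45nm"): 0.196,
}

def guard_banded_period(sigma_ratio, k=3.0):
    """Clock period, relative to the nominal delay mu, that covers mu + k * sigma."""
    return 1.0 + k * sigma_ratio

for (style, node), ratio in SIGMA_RATIO.items():
    print(f"{style} CMOS @ {node}: {guard_banded_period(ratio):.3f} x mu")
```

Under this assumption the ultra-low power design carries by far the widest margin, which is at least consistent with the later result that it benefits most from timing speculation.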


Hardware Efficiency

Energy = Power x Time

EDP = Power x Time^2

[Figure: energy and normalized EDP vs. error rate. Results for High Performance CMOS.]
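The two metrics weight execution time differently, which a short arithmetic sketch makes concrete. The numbers are hypothetical, and power is held constant purely to isolate the effect of time; this is not a claim about how power behaves under timing speculation.

```python
def energy(power, time):
    """Energy = Power x Time."""
    return power * time

def edp(power, time):
    """Energy-delay product: EDP = Power x Time^2."""
    return power * time ** 2

# Hypothetical case: run time drops by 10% at constant power.
baseline_time, faster_time, power = 1.0, 0.9, 1.0

energy_ratio = energy(power, faster_time) / energy(power, baseline_time)
edp_ratio = edp(power, faster_time) / edp(power, baseline_time)

print(f"normalized energy: {energy_ratio:.2f}")
print(f"normalized EDP:    {edp_ratio:.2f}")
```

Because time enters EDP squared, a 10% reduction in time lowers EDP by 19% while lowering energy by only 10%.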


Recovery Model Inputs

  • The time between recovery checkpoints &

  • The time to restore a checkpoint

    • Razor

      • Latch-level detection + pipeline rollback

      • 1 cycle checkpoint size & 5 cycle recovery cost

    • Reunion

      • DMR detection + checkpoint

      • 100 cycle checkpoint size & 100 cycle recovery cost

    • Paceline

      • DMR detection + checkpoint + flush

      • 100 cycle checkpoint size & 1000 cycle recovery cost
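The recovery formula from the System Recovery Model slide, overhead = failures x (waste + restore), can be evaluated with these parameters. The sketch below assumes errors occur independently with a fixed per-cycle probability and are detected in the cycle where they occur, so that failures follows a geometric distribution; those assumptions, the function name, and the example error rate are mine, not the authors'.

```python
def recovery_overhead(error_rate, checkpoint_cycles, restore_cycles):
    """Expected extra cycles per checkpoint interval: failures x (waste + restore).

    error_rate is an assumed per-cycle error probability in [0, 1).
    """
    if error_rate == 0.0:
        return 0.0
    n = checkpoint_cycles
    p_success = (1.0 - error_rate) ** n
    # Expected number of failed attempts before one interval completes.
    failures = 1.0 / p_success - 1.0
    # Expected cycles executed in a failed attempt, given that it failed.
    p_fail = 1.0 - p_success
    waste = sum(k * (1.0 - error_rate) ** (k - 1) * error_rate
                for k in range(1, n + 1)) / p_fail
    return failures * (waste + restore_cycles)

# (cycles between checkpoints, cycles to restore), from the list above.
SYSTEMS = {"Razor": (1, 5), "Reunion": (100, 100), "Paceline": (100, 1000)}

for name, (interval, restore) in SYSTEMS.items():
    extra = recovery_overhead(1e-4, interval, restore)
    print(f"{name}: {extra / interval:.4f} extra cycles per useful cycle")
```

Dividing by the checkpoint interval normalizes the overhead per useful cycle, which makes the three systems comparable at the same error rate.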


System Recovery

[Figure: normalized time vs. error rate.]


Overall Efficiency

1. High Performance CMOS

2. Low Power CMOS

3. Ultra-low Power CMOS

[Figure: EDP vs. error rate.]


Overall Efficiency

High Performance CMOS: 23% peak, 8-15% typical

[Figure: normalized EDP vs. error rate.]


Overall Efficiency

Low Power CMOS: 18% peak, 5-10% typical

[Figure: normalized EDP vs. error rate.]


Overall Efficiency

Ultra-low Power CMOS: 47% peak, 20-30% typical

[Figure: normalized EDP vs. error rate.]


Conclusions

  • A High-level Model

  • Results

    • Efficiency gains improve only minimally with scaling

    • Ultra-low power (sub-threshold) CMOS benefits most

    • Fine-grained recovery is key

  • Future Work

    • Incorporate more sources of variation

    • A tool for processor designers?

      • Under development at http://www.cs.wisc.edu/vertical


Questions?

[Figure labels: multi-core, coherence & consistency, on-chip network, timing speculation, branch prediction, out-of-order.]

DSN 2010


Timing Speculation

[Figure: sources of timing variation (manufacturing, process, runtime, application) shown alongside speed binning, online timing analysis, and timing speculation.]

Figure adapted from Greskamp et al., Paceline: [...]. In PACT ’07.


System Recovery Model

System Recovery Model Inputs

1. The time between recovery checkpoints (cycles)

2. The time to restore a checkpoint (restore)

failures(rate): expected # failures before success

waste(rate): expected # cycles executed upon failure


Overall Inputs

  • Path delay distribution

    • Application: H.264 decoding

    • Hardware: OpenRISC processor

  • Effect of process variations on path delay as N(μ,σ) using ITRS data

    • High Performance CMOS @45nm σ = 0.046μ

    • Low Power CMOS @45nm σ = 0.029μ

    • Ultra-low Power CMOS @45nm σ = 0.196μ

  • The time between recovery checkpoints &

  • The time to restore a checkpoint

    • Razor – Latch-level detection + pipeline rollback (1 & 5 cycles)

    • Reunion – DMR detection + checkpoint (100 & 100 cycles)

    • Paceline – DMR detection + checkpoint + flush (100 & 1000 cycles)

