
Architecture Exploration For Ambient Energy Harvesting Nonvolatile Processors



Presentation Transcript


  1. Architecture Exploration For Ambient Energy Harvesting Nonvolatile Processors

  2. Introduction • Future systems: powered by technology that harvests ambient energy sources • Battery-free systems • Ambient energy sources: • Solar energy • Wi-Fi and Radio Frequency (RF) energy • Motion energy – piezoelectric devices • E.g., a wirelessly powered smart contact lens

  3. Application Categories Applications vary in complexity, throughput constraints and computational demands. Based on their demand for nonvolatility, they are categorized into: • Signal detection and sensing: Detection and relaying. E.g., UV radiation, blood pressure, blood sugar level, temperature • Signal detection and analysis: Computation is carried out to analyze the signal for diagnosis. E.g., wearable EEG/ECG • Signal prediction: Predicts future patterns. E.g., wearable systems that warn against seizures Ambient energy sources are unreliable. Category 1 is easier to implement; Categories 2 and 3 require QoS (work must be completed within a fixed time).

  4. Energy Harvesting System Structure • Energy Harvesting and Management: Determines entire power used for signal sensing, processing and transmission • Digital Signal Processor: More about it later • I/O Interface and analog RF frontend: Digital interfaces, antennas, etc

  5. Processor Design: Volatile Vs Nonvolatile • Volatile processor with periodic checkpointing – Forced rollback to previously checkpointed state • NV processor: enables more complex state-dependent signal processing that tolerates power source insufficiency and unreliability – consumes more power for read and write

  6. Architectural Exploration Parameters to be analyzed: • Number of pipeline stages • Data to be backed up • Frequency of backup Assumptions: • MIPS ISA • Clock frequency: 8 kHz, limited by the strength of the Wi-Fi signal used • Instruction memory (ROM) and ICache (SRAM, NVM) • Data memory (nonvolatile) and DCache (NV write-back)

  7. Non-pipelined Configuration (NP) The entire state of the processor can be characterized by the state of a single instruction • Program Counter (PC): Identifies the instruction being executed and needs to be stored • Register File (RegFile): A volatile RegFile is energy efficient because of its large number of frequent reads and writes There is a tradeoff between the energy consumed in backing up and recovering data and the overall performance. Which data to save? When to save? 3 policies: • Backup Every Cycle (BEC) • On Demand All Backup (ODAB) • On Demand Selective Backup (ODSB)

  8. NV – Backup Every Cycle (BEC) • Employs an NVM RegFile in spite of the significant energy penalty; otherwise both a volatile and a nonvolatile copy would need to be updated every cycle • PC and a few registers in the RegFile are written every cycle • Instructions like StoreWord and Jump do not require a RegFile write

  9. NV – On Demand All Backup (ODAB) • All RegFile entries to be backed up in the event of reduced power state • If input power < preset threshold, power warning signal is activated • Control unit backs up PC and resets atomic flag • Upon power restore, energy is accumulated in the capacitor

  10. NV – On Demand Selective Backup (ODSB) • Synchronous power warning signal ensures that the instruction at the current PC finishes executing and writing back. PC + 4 is stored to avoid re-execution • A change flag identifies whether a register has been written • Control unit doesn’t generate addresses for unchanged data • Reduces backup time and energy penalty
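
The two on-demand policies can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' hardware: the class `NonPipelinedCore`, the helper `new_nvm`, the 32-entry register file and the dictionary standing in for nonvolatile storage are all hypothetical names chosen for the example. It counts how many words each policy writes when the power warning fires.

```python
from dataclasses import dataclass, field

NUM_REGS = 32  # assumed RegFile size (MIPS general-purpose registers)


def new_nvm():
    """Hypothetical nonvolatile backup store: one PC slot plus one slot per register."""
    return {"pc": 0, "regs": [0] * NUM_REGS}


@dataclass
class NonPipelinedCore:
    pc: int = 0
    regs: list = field(default_factory=lambda: [0] * NUM_REGS)
    changed: list = field(default_factory=lambda: [False] * NUM_REGS)

    def write_reg(self, index, value):
        self.regs[index] = value
        self.changed[index] = True  # change flag is set on every register write

    def backup_odab(self, nvm):
        """On Demand All Backup: save the PC and every RegFile entry."""
        nvm["pc"] = self.pc
        nvm["regs"] = list(self.regs)
        return 1 + NUM_REGS  # words written

    def backup_odsb(self, nvm):
        """On Demand Selective Backup: save PC + 4 and only the flagged registers."""
        nvm["pc"] = self.pc + 4  # current instruction has completed, so resume at the next one
        written = 1
        for index, dirty in enumerate(self.changed):
            if dirty:
                nvm["regs"][index] = self.regs[index]
                written += 1
        self.changed = [False] * NUM_REGS  # backup copy is now up to date
        return written
```

With two registers written since the last backup, `backup_odsb` writes three words while `backup_odab` writes 33, which is the source of the reduced backup time and energy on this slide.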

  11. Simulation Results And Comparison • Total area is similar as NVM cache and backup blocks are much bigger than logic • BEC has lowest peak frequency due to frequent backups • Recovery time: Time from activation of Energy OK signal to the time all backup operations are complete • ODSB backup time < ODAB backup time

  12. Simulation Results And Comparison • ODSB is more energy efficient with a stable source such as solar • ODSB can reduce the backup energy penalty by 69% with 0.002% area overhead • BEC doesn’t need time to accumulate energy in the capacitor, so it is viable when power failure is extremely frequent (less than 1 in 10 cycles)

  13. N-stage Pipeline • Increased circuit complexity and activity factor result in a higher power threshold compared to the non-pipelined processor • 5 Stage Pipeline (5SP) under study • Two backup schemes proposed: • Shifted PC and Volatile Flip-flops (SPC/VFF) • Nonvolatile Flip-flops Solution (NVFF)

  14. Shifted PC & Volatile FF (SPC/VFF) • Pipelined data flow with bypassing and forwarding, and complex control flow to handle hazards • Shifter buffer stores the PC value of each pipeline stage • When power goes down, the instruction in the write back stage is finished; the unfinished PC to be backed up is the one in the data memory stage • Shifter is used instead of rolling back, since a different PC needs to be backed up for jumps and branches • After recovery, an extra 4 clock cycles are needed to re-execute the last 4 instructions lost from the later pipeline stages
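
A small sketch may help show why a shifter is used rather than arithmetic on the PC. The class `PCShifter` and its method names are hypothetical; the only facts taken from the slide are that one PC is tracked per stage of the five-stage pipeline and that the PC in the data memory stage is the one backed up. Stalls and flushes are not modeled.

```python
from collections import deque

STAGES = ("IF", "ID", "EX", "MEM", "WB")


class PCShifter:
    """Tracks the PC held in each pipeline stage (index 0 = IF, index 4 = WB)."""

    def __init__(self):
        # None marks a stage that does not hold an instruction yet
        self.slots = deque([None] * len(STAGES), maxlen=len(STAGES))

    def clock(self, fetched_pc):
        # Every PC moves one stage deeper; maxlen discards the entry leaving WB
        self.slots.appendleft(fetched_pc)

    def pc_to_backup(self):
        # The WB instruction completes, so the first unfinished one sits in MEM
        return self.slots[STAGES.index("MEM")]


shifter = PCShifter()
for pc in (0, 4, 100, 104, 108, 112):  # the instruction at PC 4 jumps to 100
    shifter.clock(pc)
```

After these six clocks the write back stage holds PC 4 and `shifter.pc_to_backup()` returns 100. Adding 4 to the write back PC would give 8, the wrong resume point after the jump, which is why the per-stage PCs are kept.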

  15. Nonvolatile FF Solution (NVFF) • This solution uses NVM flip-flops • SPC/VFF requires 11% less time and 57% less energy than NVFF

  16. Out-of-order Processor (OoO) • More complex than NP and 5SP • System state is broadly distributed across structures such as PC, ROB, RegFile, Map Table, Issue Queue, Load Store Queue, BHT and BTB • Larger power requirement → fewer periods where the input power exceeds the min threshold. Which structures need to be backed up?

  17. Resource Selection Strategies The resource selection strategies proposed are: • Minimum State Resource backup solution (MinR) • Low-latency Backup solution (LLB) • Middle-level Backup solution (MLB) • Min-state-lost Backup solution (MPL) • Integrated Flexible Atomic Backup Solution (IFA)

  18. Resource Selection Strategies • Minimum State Resource backup solution (MinR): • Backs up min number of bits required to preserve functionality • Depends on branch misprediction mechanism to minimize the number of valid/ relevant state bits prior to backup. • ROB and PC: Backs up the first uncommitted PC at the head of ROB • ARegFile is backed up as it is small • Map Table: Pseudo-Misprediction is used to restore Map table • PRegFile, Ready Table, Free List, BHT, BTB can be recovered

  19. Resource Selection Strategies • Low Latency Backup solution (LLB): Aims to minimize the number of bits to store if backup begins immediately • Backs up the entire ROB, IQ, ARegFile, Map Table and PRegFile • Middle-level Backup solution (MLB): Backs up the Ready Table and Free List as well • Min-state-lost Backup solution (MPL): All structures, including BHT and BTB, are backed up • Integrated Flexible Atomic Backup Solution (IFA): Even if the power is below threshold, it can allow an optional state (BHT) to be stored, subject to an optimistic attempt
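
The strategies nest, which is easy to see when each one is written as a set of structures. The sketch below only encodes what the slides list; the identifiers (`STRATEGIES`, `added_by`, the structure labels) are made up for illustration. IFA is left out because the slides do not enumerate its base set, and MPL may cover further structures than the ones named here.

```python
# Structures saved by each strategy, as listed on the slides (labels are illustrative).
MINR = frozenset({"head PC", "ARegFile"})
LLB = frozenset({"ROB", "IQ", "ARegFile", "MapTable", "PRegFile"})
MLB = LLB | {"ReadyTable", "FreeList"}  # LLB plus Ready Table and Free List
MPL = MLB | {"BHT", "BTB"}              # assumption: at least MLB plus the predictor state

STRATEGIES = {"MinR": MINR, "LLB": LLB, "MLB": MLB, "MPL": MPL}


def added_by(richer, leaner):
    """Return the structures that `richer` saves and `leaner` does not."""
    return sorted(STRATEGIES[richer] - STRATEGIES[leaner])
```

For example, `added_by("MLB", "LLB")` returns `['FreeList', 'ReadyTable']`, and `LLB < MLB < MPL` holds as a strict subset chain.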

  20. OoO Strategies Comparison In MinR, the pseudo-misprediction operation for the Map Table requires extra backup clock cycles. While recovering, extra clock cycles are needed to restore the PRegFile, Ready Table and Free List

  21. OoO Strategies Comparison • LLB: ROB, PRegFile are large → increase backup time and energy. Recovery energy is smaller as instructions in ROB are backed up (no re-execution) • MPL incurs largest backup and recovery penalties, but backing up all structures incurs min latency to return to peak performance after a power failure • OoO needs higher threshold, but periods of sufficient power are common enough to allow superior performance to pay for lost clock cycles

  22. Simulation Results • The configurations are compared with a baseline non-pipelined volatile processor without checkpointing or data backup • The volatile processor’s progress returns to zero when power drops below the threshold • Nonvolatile NP and 5SP have a higher power threshold • OoO runs for only a small fraction of the time, but its performance can be up to 4x faster than NP and 5SP
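
The difference between the volatile baseline and a nonvolatile processor can be illustrated with a toy forward-progress model. Everything here is an assumption made for the sketch: the function name `cycles_to_finish`, the per-cycle boolean power trace, and the simplification that every power-up (including the first) costs `restore_cycles` before useful work resumes. Backup cost is not modeled.

```python
def cycles_to_finish(trace, work, nonvolatile, restore_cycles=0):
    """Return how many trace cycles elapse before `work` useful cycles complete.

    trace: iterable of booleans, True when input power is above the threshold.
    Returns None if the task does not finish within the trace.
    """
    progress = 0
    stall = 0
    was_on = False
    for t, on in enumerate(trace):
        if on:
            if not was_on and nonvolatile:
                stall = restore_cycles  # restore state from NVM after power returns
            if stall > 0:
                stall -= 1
            else:
                progress += 1
                if progress == work:
                    return t + 1
        elif not nonvolatile:
            progress = 0  # volatile state is lost, so progress returns to zero
        was_on = on
    return None


# Power is available for 6 cycles, then absent for 4, repeated five times
trace = ([True] * 6 + [False] * 4) * 5
```

For a task of 10 useful cycles on this trace, the volatile call `cycles_to_finish(trace, 10, nonvolatile=False)` returns `None` because no single window is long enough, while the nonvolatile call with `restore_cycles=1` finishes after 16 cycles.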

  23. Validation • The non-pipelined On Demand strategy was explored using an actual fabricated processor (THU1010N) • It has an Intel 8051-like CISC architecture • The saved state includes the state machine that captures the current instruction • PC and RegFiles are FeRAM-based FFs. The FFs have additional backup FeCaps • The NV processor based system is interfaced to a solar panel and a UV sensor

  24. Operation • Upon power failure detection, NV control logic backs up DFFs to FeCaps • When power resumes, data is restored from FeCaps to DFFs • Internal RC oscillator is used, since an external oscillator becomes unstable at low power Simulator calibration: • Several kernels executed both on the platform and the simulator • Intermittent power supply modeled by a 1 kHz square waveform • Processor frequency: 3 MHz • Each kernel is executed 1000 times to obtain completion time • Stable power case: No mismatch; Unstable power case: mismatch < 5% • Simulator averages energy consumed per instruction to estimate remaining energy
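
The calibration numbers imply a fixed budget of clock cycles per powered window, which a short calculation makes concrete. The helper names below are hypothetical, and the 50% duty cycle is an assumption: the slide gives the waveform frequency but not how long each period stays high. The mismatch formula is likewise the usual relative error, assumed here rather than quoted from the source.

```python
def cycles_per_on_phase(clock_hz, supply_hz, duty=0.5):
    """Clock cycles available in one powered phase of a square-wave supply.

    duty is the fraction of each supply period with power present (assumed 0.5).
    """
    return int(clock_hz * duty / supply_hz)


def mismatch_percent(measured, simulated):
    """Relative difference between platform and simulator completion times, in percent."""
    return abs(simulated - measured) / measured * 100


on_cycles = cycles_per_on_phase(3_000_000, 1_000)  # 3 MHz core, 1 kHz supply
```

Under these assumptions each powered phase lasts 0.5 ms and gives the 3 MHz processor 1500 cycles before the next backup.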

  25. Dependence On Input Power • Input signal characteristics play a major role in determining the optimal design • Performance of backup schemes was evaluated with home and office Wi-Fi sources for harvesting • In the home, the NP ODSB architecture performs best; in the office, OoO MPL is most desirable

  26. Dependence On Nature Of Input Source • Input energy sources differ in magnitude • For each case, the best performing backup policy is adopted • For same input power source, the actual execution time for NP and 5SP is almost same • Higher power threshold in OoO results in longer Off time

  27. Meeting QoS Requirements • Some applications (like ECG) require periodic outputs within fixed time periods – QoS constraints • Ambient energy is unreliable • Piezo and solar can provide almost 100% QoS • QoS can be improved by: • Shrinking size and using FinFETs • Power reduction techniques: dark silicon aware architecture, clock gating, DVFS, DATS, Tunnel FET, low power sub-threshold circuits

  28. Conclusion • Explored various factors: battery-less systems with ambient energy • Intermittent energy source: different nonvolatile processor configurations, techniques to conserve state while maximizing forward progress • Examined tradeoffs between performance and energy for different architectures • Compared and validated simulation results with a nonvolatile solar energy harvesting processor platform • The video of HPCA 2015 Best Paper Competition Demo

  29. References • Kaisheng Ma, Yang Zheng, Shuangchen Li, Karthik Swaminathan, Xueqing Li, Yongpan Liu, Jack Sampson, Yuan Xie, and Vijaykrishnan Narayanan. Architecture exploration for ambient energy harvesting nonvolatile processors. In The International Symposium on High-Performance Computer Architecture (HPCA-21). • A. Parks, A. Sample, Z. Yi, and J. Smith. A wireless sensing platform utilizing ambient RF energy. In IEEE Radio and Wireless Symposium (RWS), 2013. • S. Kannan, A. Gavrilovska, K. Schwan, and D. Milojicic. Optimizing checkpoints using NVM as virtual memory. In IPDPS, 2013. • X. Dong, C. Xu, Y. Xie, and N. Jouppi. NVSim: A circuit-level performance, energy, and area model for emerging nonvolatile memory. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 31(7):994–1007, 2012.

  30. Questions?

  31. Thank You
