
Cooperative cross-layer protection for resource constrained embedded systems

Prof. Nikil Dutt, Prof. Nalini Venkatasubramanian, Prof. Lichun Bao. Kyoungwoo Lee (topic exam). June 17, 2008.


Presentation Transcript


  1. Cooperative cross-layer protection for resource constrained embedded systems • Prof. Nikil Dutt, Prof. Nalini Venkatasubramanian, Prof. Lichun Bao • Kyoungwoo Lee (topic exam) • June 17, 2008

  2. Contents • Motivation • Cooperative, Cross-layer Methods • PPC (Partially Protected Caches) • EAVE (Error-Aware Video Encoding) • Thesis Outline and Plan

  3. Motivation • Mobile computing is popular in business, communication, entertainment, education, the battlefield, wellness, and science • Mobile devices are resource-limited • The fundamental problem is to achieve low power with high performance

  4. Motivation (cont'd) • Reliability is an emerging and critical concern • Mobile applications are running close to humans • Wearable computing and wellness mobile devices • New enhanced technology makes devices vulnerable to errors due to high complexity and high integration • Soft error rate increases exponentially as technology scales [Hazucha, 00] • Redundancy techniques incur high overheads in power and performance • TMR (Triple Modular Redundancy) may incur overheads exceeding 200% without optimization [Nieuwland, 06] • It is challenging to optimize multiple properties (e.g., performance, power, and reliability) in mobile embedded systems
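
To make the TMR overhead concrete, the following is a minimal sketch of majority voting over three redundant copies of a word. It is not taken from the talk; the function name `tmr_vote` and the example values are assumptions for illustration. Keeping two extra copies already costs roughly 200% extra storage or computation before the voter itself is counted.

```python
def tmr_vote(a: int, b: int, c: int) -> int:
    """Return the bitwise majority of three redundant copies of a word.

    Each output bit takes the value held by at least two of the copies,
    so a fault confined to one copy is masked.
    """
    return (a & b) | (a & c) | (b & c)


# Hypothetical example: one of the three copies suffers a single bit flip.
golden = 0b10110010
flipped = golden ^ (1 << 5)
voted = tmr_vote(golden, flipped, golden)
print(voted == golden)
```

The voter masks any fault confined to one copy, but it cannot recover when two copies are corrupted in the same bit position.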

  5. Reliability Across Layers in Mobile Devices • Layers: Application, Middleware/OS, Hardware, Network • Errors and error control schemes at system abstraction layers

  6. Errors and Error Control Schemes at Hardware • Hardware failures are increasing as technology scales • (e.g.) SER increases by up to 1000 times [Mastipuram, 04] • Redundancy techniques are expensive • (e.g.) ECC-based protection in caches incurs a 95% performance penalty [Li, 05] • FIT: Failures in Time (failures per 10^9 hours) • MTTF: Mean Time To Failure • MTBF: Mean Time Between Failures • TMR: Triple Modular Redundancy • EDC: Error Detection Codes • ECC: Error Correction Codes • RAID: Redundant Array of Inexpensive Disks
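
The FIT and MTTF metrics listed above are related by a simple conversion when the failure rate is assumed constant. The sketch below shows that conversion; the function names are hypothetical and the constant-rate assumption is ours, not a claim from the slides.

```python
HOURS_PER_FIT_WINDOW = 1e9  # FIT counts failures per 10^9 device-hours


def fit_to_mttf_hours(fit: float) -> float:
    """Convert a FIT rate to MTTF in hours, assuming a constant failure rate."""
    if fit <= 0:
        raise ValueError("FIT must be positive")
    return HOURS_PER_FIT_WINDOW / fit


def mttf_hours_to_fit(mttf_hours: float) -> float:
    """Convert an MTTF in hours back to a FIT rate under the same assumption."""
    if mttf_hours <= 0:
        raise ValueError("MTTF must be positive")
    return HOURS_PER_FIT_WINDOW / mttf_hours


print(fit_to_mttf_hours(1000.0))  # MTTF in hours for a 1000 FIT component
```

Under this model the two quantities are reciprocal up to the 10^9 scaling, so a shorter MTTF corresponds directly to a higher FIT.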

  7. Errors and Error Control Schemes at Software • Software errors become dominant as system complexity increases • (e.g.) Several bugs per thousand lines of code • Hard to debug, and redundancy techniques are expensive • (e.g.) Backward recovery with checkpoints is inappropriate for real-time applications • QoS: Quality of Service

  8. Errors and Error Control Schemes in Networks • Networks are unreliable (especially wireless networks) • Combined approaches across OSI layers have been investigated for optimal solutions [Vuran, 06][Schaar, 07] • SNR: Signal to Noise Ratio • MTTR: Mean Time To Recovery • CRC: Cyclic Redundancy Check • MIMO: Multiple-Input Multiple-Output

  9. Thesis Problem Statement • Study conflicts among system properties • Examine errors and error control schemes across system abstraction layers • Maximize reliability while minimizing costs of power and performance for mobile embedded systems

  10. Why Cross-Layer Approach? • Cross-layer interactions and conflicts arise between system properties • DVS increases SER exponentially • Over-protection or under-protection • Protecting all multimedia data with ECC is overkill • Cross-layer approaches can maximize reliability with minimal power and performance overheads • Benefits of cross-layer approaches • Global system view • Coordination for intelligent selection • Adaptation • Cross-layer approaches have shown promise in saving resources at the cost of QoS [Mohapatra, 05][Yuan, 04] • DVS: Dynamic Voltage Scaling • SER: Soft Error Rate • ECC: Error Correction Codes • QoS: Quality of Service

  11. Thesis Proposed Contribution: CC-PROTECT • Cooperative Cross-layer Protection (CC-PROTECT) by exploiting error-awareness and error control schemes across system abstraction layers • Contribution • Present cost-efficient reliability methods (cooperative cross-layer protection) • Open expanded tradeoff spaces and operating points • Rediscover applicability of existing approaches for other purposes

  12. Outline of CC-PROTECT • [Figure: CC-PROTECT overview. Labels: Mobile Video Application; Error-Aware Video Encoder (EAVE) with Error-Controller (e.g., frame drop) and Error-Resilient Encoder (e.g., PBPAIR), turning Original Video into Error-Aware Video; MW/OS with Monitor & Translate SER, Support EAVE & PPC, Trigger Selective DFR, BER (Backward Error Recovery), and DFR (Drop & Forward Recovery); PPC with Unprotected Cache, Protected Cache, EDC error detection, and Data Mapping of frame K and frame K+1; Error-prone Networks with Packet Loss and Frame Drop; Feedback and Parameter signals: Error Injection Rate, Frame Loss Rate, QoS Loss, SER]

  13. Contents • Motivation • Cooperative, Cross-layer Methods • PPC (Partially Protected Caches) • EAVE (Error-Aware Video Encoding) • Thesis Outline and Plan

  14. Soft Errors (Transient Faults) • SER increases exponentially as technology scales • Integration, voltage scaling, altitude, latitude • Caches are hit hardest due to: • Larger portion in processors (more than 50%) • No masking effects (e.g., logical masking) • [Figure: a soft error as a bit flip between 0 and 1 in a transistor; Intel Itanium II Processor [Baumann, 05]; MTTF annotations of 1 month and 5 hours] • MTTF: Mean Time To Failure

  15. Conventional Protection for Caches • Conventional Protected Caches (Safe) • Unaware of fault tolerance at applications • Implement a redundancy technique such as ECC to protect all data on every access • Overkill for multimedia applications • ECC (e.g., a Hamming code) incurs a performance penalty of up to 95% and a power overhead of up to 22% • [Figure: cache with ECC, annotated "Unaware of Application" and "High Cost"]
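
The slide names a Hamming code as an example of ECC. As a rough illustration of the per-access work such a scheme implies, here is a minimal sketch of the textbook Hamming(7,4) single-error-correcting code. It is only an assumption-laden toy: the slides do not say which code width or layout was evaluated, and the function names are hypothetical.

```python
def hamming74_encode(data):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword.

    Codeword positions 1..7 hold p1, p2, d1, p4, d2, d3, d4.
    """
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct up to one flipped bit; return (data bits, error position or 0)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4  # 1-based index of the flipped bit
    if position:
        c[position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], position


# Hypothetical example: flip one stored bit and recover the data on read.
stored = hamming74_encode([1, 0, 1, 1])
stored[4] ^= 1
print(hamming74_decode(stored))
```

Every write pays for the encode step and every read pays for the syndrome check, which is the kind of per-access cost that uniform protection imposes even on error-tolerant data.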

  16. Related Work • Process Technology Solutions • Hardening [Baze, IEEE Trans. on Nuclear Science 00] • SOI [Musseau, IEEE Trans. on Nuclear Science 96] • Process complexity, yield loss, and substrate cost • Microarchitectural Solutions for Caches • Cache Scrubbing [Mukherjee, PRDC04] • Low Power Cache [Li, ISLPED04] • Area Efficient Protection [Kim, DATE06] • Multiple Bit Correction [Neuberger, TODAES 03] • Cache Size Selection [Cai, ASP-DAC06] • In-Cache Replication [Zhang, DSN03] • Replication Cache [Zhang, IEEE Computers 05] • High overheads in terms of power, performance, and area • Our Solution • Protects caches from failures due to soft errors by exploiting the error-tolerance of applications • Can be used in conjunction with any of the techniques above

  17. Unequal Data Protection • Not all pages are equally failure critical • Multimedia data is failure non-critical • Program variables are failure critical • Failures: system crash, infinite loop, segmentation fault, etc. • QoS degradation is not a failure • [Figure: only 9 pages out of 83 are failure critical]

  18. PPC (Partially Protected Caches) • Propose PPC architectures to provide unequal protection for mobile multimedia systems [Lee, TVLSI08] • Unprotected cache and Protected cache at the same level of the memory hierarchy • Protected cache is typically smaller, to keep its power and delay the same as or less than those of the Unprotected cache • How to partition data? • [Figure: PPC with Unprotected Cache and Protected Cache side by side above Memory]

  19. PPC for Multimedia Applications • Propose a selective data protection based on HPC [Lee, CASES06] • Unequal protection at hardware layer exploiting error-tolerance at application layer • Simple data partitioning for multimedia applications • Multimedia data is failure non-critical • All other data is failure critical • [Figure: PPC with Unprotected Cache (power/delay reduction) and Protected Cache (fault tolerance) above Memory] • HPC: Horizontally Partitioned Caches

  20. PPC for General Applications • DPExplore [Lee, PPCDIPES08] • Explore the partitioning space by exploiting awareness of the vulnerability of each data page • Vulnerable time • Data is vulnerable during any interval after which it is eventually read by the CPU or written back to memory • Pages causing high vulnerable time are failure critical • Vulnerable time closely estimates failure rate • [Figure: timeline of cached data with events Incoming data, Read, Write, and Eviction at t0 to t3, with intervals marked Vulnerable or invulnerable]
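
The vulnerable-time metric above can be sketched as a small trace analysis. This is not DPExplore itself, whose algorithm the slides do not detail. It assumes a per-line event trace with hypothetical event names (`fill`, `read`, `write`, `evict`), and it assumes that an interval counts as vulnerable only if it ends in a CPU read or in the eviction of dirty data (a write-back), while an interval ending in an overwrite or a clean eviction does not.

```python
def vulnerable_time(events):
    """Sum the vulnerable intervals for one cache line.

    events: time-ordered list of (time, kind) pairs, where kind is one of
    'fill', 'read', 'write', 'evict' (hypothetical trace format).
    """
    total = 0
    dirty = False
    previous = None  # time of the last event while the line is cached
    for time, kind in events:
        if previous is not None:
            ends_in_read = kind == "read"
            ends_in_writeback = kind == "evict" and dirty
            if ends_in_read or ends_in_writeback:
                total += time - previous
        if kind == "fill":
            dirty = False
        elif kind == "write":
            dirty = True
        if kind == "evict":
            previous, dirty = None, False
        else:
            previous = time
    return total


def rank_pages(trace_by_page):
    """Order pages by vulnerable time, most vulnerable first."""
    totals = {page: vulnerable_time(ev) for page, ev in trace_by_page.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical traces: page 'a' follows the fill, read, write, evict pattern.
traces = {
    "a": [(0, "fill"), (10, "read"), (25, "write"), (40, "evict")],
    "b": [(0, "fill"), (40, "evict")],
}
print(rank_pages(traces))
```

Pages at the top of such a ranking would be candidates for the Protected cache, with the rest mapped to the Unprotected cache; where to draw the cut is the partitioning space the slide refers to.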

  21. Experimental Results – Failure Rate Failure rate of PPC is close to that of Safe (Safe is a protected cache configuration with an ECC protection, i.e., protecting all data, and Unsafe is an unprotected cache)

  22. Experimental Results – Performance Runtime of PPC is close to that of Unsafe

  23. Experimental Results – Power Energy consumption of PPC is close to that of Unsafe

  24. Summary – PPC (Partially Protected Caches) • Not all data are equally failure critical • Propose a PPC architecture to provide unequal data protection • Support unequal protection at the hardware layer by exploiting error-tolerance and vulnerability at the application layer • Present cost-efficient reliability • Related Publications • [Lee, CASES06] • [Lee, PPCDIPES08] • [Lee, TVLSI08]

  25. Contents • Motivation • Cooperative, Cross-layer Methods • PPC (Partially Protected Caches) • EAVE (Error-Aware Video Encoding) • Thesis Outline and Plan

  26. Error-Resilient Video Encoding • Error-resilient video encodings have been developed to combat errors in networks • PBPAIR: energy-efficient and error-resilient video encoding [Kim, 06] • Passive Error Exploitation • It compresses video data according to PLR • Embed error-resilience against packet losses • Maintain the QoS • [Figure: Mobile Video Application sending video over error-prone networks with packet loss; labels Parameters, Resilience, PLR, Network] • PBPAIR: Probability-Based Power Aware Intra Refresh

  27. Related Work • Energy/QoS-aware video encoding • Video encoding parameters [Mohapatra, IPDPS05] • Motion estimation algorithm [Tourapis, VCIP00] • Integrated power management [Mohapatra, ACM MM03] • Global cross-layer adaptation [Yuan, MMCN04] • Transmission power and QoS [Eisenberg, IEEE Trans. on CSVT 02] • These do not consider error-resilience • Error-resilient video encoding • Error-resilient GOP [Yang, JVCIP07] • AIR (Adaptive Intra Refreshing) [Worral, ICASSP01] • PGOP (Progressive GOP) [Cheng, PCS04] • PBPAIR (Probability-Based Power Aware Intra Refresh) [Kim, MCCR06] • Passive error exploitation • Our Solution • Error-aware video encoding: exploits errors actively to minimize energy consumption

  28. Active Error Exploitation – Intentional Frame Drop • Intentional Frame Drop (one way to actively exploit errors) can result in energy reduction for each operation • FDT-1 affects the following components with respect to power, performance, and QoS in mobile video applications • [Figure: Mobile Video Application path Enc (CPU), Tx (WNI), error-prone networks with packet loss, Rx (WNI), Dec (CPU), with FDT-1, FDT-2, and FDT-3 marked along the path] • FDT: Frame Drop Type • Enc: Encoding, Dec: Decoding • WNI: Wireless Network Interface

  29. Error-Aware Video Encoding • Propose EE-PBPAIR [Lee, DIPES08] • Intentionally drop frames at video encoding • Reduce the energy consumption for video encoding • Maintain the video quality by exploiting error-resilience of PBPAIR • [Figure: Error-Aware Video Encoder (EAVE). Labels: Original Video, Error-Controller (e.g., frame dropping) with EIR input, intentional frame drop, Error-Resilient Encoder (e.g., PBPAIR), Error-Resilient Video, Error-Aware Video, error-prone networks with packet loss] • EIR: Error Injection Rate

  30. Error-Aware Video Encoding (EAVE) • Cross-layer architecture • Intentional exploitation of errors at application incorporating error-resilience in network • [Figure: encoder and network connected by a feedback loop; labels Resilience, FLR, EIR, feedback, PLR] • EIR: Error Injection Rate • FLR: Frame Loss Rate • PLR: Packet Loss Rate
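
One way to picture the EIR knob and its QoS feedback is the sketch below. It is a hypothetical illustration only: it does not implement PBPAIR or EE-PBPAIR, the evenly spaced drop pattern and the one-step feedback policy are assumptions, and all names (`frames_to_encode`, `adjust_eir`, the PSNR target) are invented for the example.

```python
def frames_to_encode(num_frames, eir_percent):
    """Return indices of frames kept when about eir_percent% are dropped.

    Drops are spread evenly using integer arithmetic (assumed policy).
    """
    if not 0 <= eir_percent < 100:
        raise ValueError("eir_percent must be in [0, 100)")
    kept = []
    for i in range(num_frames):
        dropped_before = (i * eir_percent) // 100
        dropped_after = ((i + 1) * eir_percent) // 100
        if dropped_after == dropped_before:
            kept.append(i)  # frame i is passed on to the resilient encoder
    return kept


def adjust_eir(eir_percent, measured_psnr_db, target_psnr_db, step=1, max_eir=50):
    """Hypothetical feedback rule: back off when quality is below target,
    otherwise inject more errors to save energy."""
    if measured_psnr_db < target_psnr_db:
        return max(0, eir_percent - step)
    return min(max_eir, eir_percent + step)


kept = frames_to_encode(100, 10)
print(len(kept), adjust_eir(10, 28.0, 30.0))
```

Frames that are never encoded also never need to be transmitted, received, or decoded, which is why the energy effect of an encoder-side drop propagates along the whole path.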

  31. Experimental Results – Energy Reduction • Energy saving occurs at every component in the path from encoding to decoding in mobile video applications • Setup: PLR = 10% and EIR = 10% • EC: Energy Consumption • Enc EC: EC for Encoding • Tx EC: EC for Transmission • Rx EC: EC for Receiving • Dec EC: EC for Decoding • PSNR: Peak Signal to Noise Ratio
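
PSNR is the QoS metric named on this slide. For reference, the standard definition can be sketched as follows for 8-bit samples; the function name and the flat-list frame representation are assumptions for illustration, not the evaluation code behind the results.

```python
import math


def psnr_db(original, decoded, max_value=255):
    """Peak Signal to Noise Ratio in dB between two equal-length sample lists."""
    if len(original) != len(decoded) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(max_value ** 2 / mse)


# Hypothetical example: every sample is off by one intensity level.
print(psnr_db([100, 100, 100, 100], [101, 101, 101, 101]))
```

Higher values mean the decoded frame is closer to the original, so a quality loss caused by dropped frames shows up as a lower PSNR.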

  32. Experimental Results – Expanded Tradeoff Space

  33. Summary – EAVE (Error-Aware Video Encoding) • Intentional Frame Drop is one way to exploit errors actively • Propose an error-aware video encoding (EE-PBPAIR) • Intentional frame dropping and the inherent energy efficiency of PBPAIR reduce the energy consumption for video encoding • Present a knob (EIR) to adjust the amount of errors based on QoS feedback • Maintain the video quality using error-resilience of PBPAIR • Related Publication • [Lee, DIPES08]

  34. Contents • Motivation • Cooperative, Cross-layer Methods • PPC (Partially Protected Caches) • EAVE (Error-Aware Video Encoding) • Thesis Outline and Plan

  35. Thesis Outline • Thesis Problem • Exploit errors and error control schemes across layers to maximize reliability with minimal costs for mobile embedded systems • Topic 1 – Approach at hardware and application layers • PPC (unequal data protection at hardware exploiting error tolerance at application) [Lee, CASES06][Lee, PPCDIPES08][Lee, TVLSI08] • Topic 2 – Approach at application, middleware, and network layers • EAVE (intentional exploitation of errors at application, incorporating error resilience in networks) [Lee, DIPES08] • Topic 3 – Approach across application/middleware-OS/HW • CC-PROTECT (middleware-driven cooperative exploitation of errors and error control schemes across layers) [under submission to ACM MM 08 and on-going work]

  36. Outline of CC-PROTECT • [Figure: CC-PROTECT overview. Labels: Mobile Video Application; Error-Aware Video Encoder (EAVE) with Error-Controller (e.g., frame drop) and Error-Resilient Encoder (e.g., PBPAIR), turning Original Video into Error-Aware Video; MW/OS with Monitor & Translate SER, Support EAVE & PPC, Trigger Selective DFR, BER (Backward Error Recovery), and DFR (Drop & Forward Recovery); PPC with Unprotected Cache, Protected Cache, EDC error detection, and Data Mapping of frame K and frame K+1; Error-prone Networks with Packet Loss and Frame Drop; Feedback and Parameter signals: Error Injection Rate, Frame Loss Rate, QoS Loss, SER]

  37. Time Plan • Fall, 2003 ~ Spring, 2008 • PPC, EAVE, etc. • Summer, 2008 • CC-PROTECT • Extended versions of previous work • End of Summer, 2008 • Final Defense • Dissertation

  38. Publications • [Lee, TVLSI08] K. Lee, A. Shrivastava, I. Issenin, N. Dutt, and N. Venkatasubramanian, “Partially protected caches to reduce failures due to soft errors in multimedia applications”, In IEEE Transactions on Very Large Scale Integration Systems (TVLSI), 2008, to appear. • [Lee, DIPES08] K. Lee, M. Kim, N. Dutt, and N. Venkatasubramanian, “Error exploiting video encoder to extend energy/QoS tradeoffs for mobile embedded systems”, In 6th IFIP Working Conference on Distributed and Parallel Embedded Systems (DIPES), Sep. 2008, to appear. • [Lee, PPCDIPES08] K. Lee, A. Shrivastava, N. Dutt, and N. Venkatasubramanian, “Data partitioning techniques for partially protected caches to reduce soft error induced failures”, In 6th IFIP Working Conference on Distributed and Parallel Embedded Systems (DIPES), Sep. 2008, to appear. • [Lee, CASES06] K. Lee, A. Shrivastava, I. Issenin, N. Dutt, and N. Venkatasubramanian, “Mitigating soft error failures for multimedia applications by selective data protection”, In Int. Conference on Compilers, Architecture, & Synthesis for Embedded Systems (CASES), Oct. 2006. • [Lee, ICME05] K. Lee, N. Dutt, and N. Venkatasubramanian, “Experimental Study on Energy Consumption of Video Encryption for Mobile Handheld Devices”, In IEEE International Conference on Multimedia and Expo (ICME 05), Poster Session, July 2005. • [Mohapatra, IPDPS05] S. Mohapatra, R. Cornea, H. Oh, K. Lee, M. Kim, N. Dutt, R. Gupta, A. Nicolau, S. Shukla, and N. Venkatasubramanian, “A cross-layer approach for power-performance optimization in distributed mobile systems”, In Next Generation Software Program in conjunction with IEEE International Parallel and Distributed Processing Symposium (IPDPS), April 2005.

  39. Thank you! Any Questions or Comments?
