
Mitigating the Impact of Hardware Defects on Multimedia Applications – A Cross-Layer Approach
Kyoungwoo Lee (1), Aviral Shrivastava (2), Minyoung Kim (1), Nikil Dutt (1), and Nalini Venkatasubramanian (1)
(1) Department of Computer Science, University of California at Irvine
(2) Department of Computer Science and Engineering, Arizona State University
Copyright © 2008 UCI ACES/DSM Laboratories

Multimedia Mobile Devices are Popular (ACM Multimedia’08)
- Applications: web browsing, image browsing, satellite TV, video streaming, animation, video conferencing, map routing, mobile TV, 3D graphics
- Mobile devices are resource-limited.
- The main problem is to achieve low power together with high performance, high QoS, and high reliability.

Mobile Multimedia System
[Figure: mobile video conferencing over a wireless network. A mobile video encoder (application such as video encoding, operating system, hardware) turns raw video data into compressed video data. Error sources shown: soft errors, packet loss, bugs, and exceptions. Goal: low-cost reliability.]

Temporary Hardware Faults
- Temporary hardware faults, such as transient faults (soft errors) and intermittent faults, cause failures: system crashes, infinite loops, segmentation faults, etc.
- Causes of transient faults or soft errors:
  - Environmental causes: natural or man-made external radiation such as alpha particles, protons, and neutrons
  - Technology factors: technology scaling, increasing transistor density, lower operating voltages, etc.
  - Marginal design parameters: timing problems due to races, hazards, and skew
  - Signal integrity problems: crosstalk, ground bounce, etc.
[Figure: system layers (application, middleware/operating system, hardware) with a soft error at the hardware layer.]

Soft Errors on an Increase
- Soft error = transient fault = bit flip (in memory)
- Soft error rate (SER) increases exponentially as technology scales: integration, voltage scaling, altitude, latitude, etc.
- SER model: $\mathrm{SER} \propto N_{flux} \times CS \times \exp\left(-\frac{Q_{critical}}{Q_S}\right)$, where $Q_{critical} = C \times V$ (capacitance times voltage), $N_{flux}$ is the neutron flux intensity, $CS$ is the area of the cross section, and $Q_S$ is the charge collection efficiency.
- MTTF: Mean Time To Failure
[Figure: transistor illustration with MTTF annotations ranging from 1 month down to hours [Baumann, 05]; system layers (application, middleware/operating system, hardware) with a soft error at the hardware layer.]
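The voltage dependence in the SER model on this slide can be made explicit by taking the ratio of the SER at two supply voltages. This is a worked consequence of the model as stated, under the assumption that the neutron flux, cross section, capacitance $C$, and $Q_S$ stay fixed while only the voltage changes from $V_1$ to $V_2$; it is not an additional result from the talk.

```latex
\frac{\mathrm{SER}(V_2)}{\mathrm{SER}(V_1)}
  = \frac{\exp\left(-C V_2 / Q_S\right)}{\exp\left(-C V_1 / Q_S\right)}
  = \exp\left(\frac{C\,(V_1 - V_2)}{Q_S}\right)
```

Under these assumptions, lowering the supply voltage by $\Delta V = V_1 - V_2 > 0$ multiplies the SER by $\exp(C\,\Delta V / Q_S)$, which is the exponential relationship between SER and supply voltage that the following slides rely on.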

Soft Error is an Every Second Concern
- Soft Error Rate (SER) is measured in FIT (Failures in Time): the number of errors in one billion operation hours.
- SER per 0.13 µm = 1,000 FIT ≈ 104 years in MTTF
- Soft errors are becoming an every-second problem:

| System | SER (FIT) | MTTF | Reason |
| [...] µm | 1,000 | 104 years | |
| [...] µm | 64x8x1,000 | 81 days | High integration |
| [...] nm | 2x1,000x64x8x1,000 | 1 hour | Technology scaling and twice the integration |
| A [...] 65 nm | 2x2x1,000x64x8x1,000 | 30 minutes | Memory takes up 50% of soft errors in a system |
| A system with voltage [...] 65 nm | 100x2x2x1,000x64x8x1,000 | 20 seconds | Exponential relationship between SER and supply voltage |
| A system with voltage [...] in flight (35,[...]) | 800x100x2x2x1,000x64x8x1,000 | 0.02 seconds | High intensity of neutron flux in flight (high altitude) |
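The FIT-to-MTTF conversion behind these figures is simple arithmetic: a rate of N FIT means N failures per 10^9 operating hours, so MTTF is 10^9 / N hours. The sketch below only illustrates that conversion; the function names `mttf_hours` and `humanize` are made up for this example, and a year is assumed to be 8,760 hours.

```python
def mttf_hours(fit: float) -> float:
    """Mean time to failure in hours for a failure rate given in FIT.

    One FIT is one failure per 10**9 operating hours.
    """
    if fit <= 0:
        raise ValueError("FIT must be positive")
    return 1e9 / fit


def humanize(hours: float) -> str:
    """Render a duration in the largest unit that keeps the value >= 1."""
    units = (
        ("years", 8760.0),      # assumption: 365 days per year
        ("days", 24.0),
        ("hours", 1.0),
        ("minutes", 1.0 / 60.0),
        ("seconds", 1.0 / 3600.0),
    )
    for name, hours_per_unit in units:
        if hours >= hours_per_unit or name == "seconds":
            return f"{hours / hours_per_unit:.2f} {name}"
    raise AssertionError("unreachable")


# SER products as written on the slide.
rates = {
    "base": 1_000,
    "high integration": 64 * 8 * 1_000,
    "scaling + 2x integration": 2 * 1_000 * 64 * 8 * 1_000,
    "memory is 50% of errors": 2 * 2 * 1_000 * 64 * 8 * 1_000,
    "voltage scaling": 100 * 2 * 2 * 1_000 * 64 * 8 * 1_000,
    "flight altitude": 800 * 100 * 2 * 2 * 1_000 * 64 * 8 * 1_000,
}
for label, fit in rates.items():
    print(f"{label}: {fit:,} FIT -> MTTF {humanize(mttf_hours(fit))}")
```

With these conventions the base rate of 1,000 FIT comes out at roughly 114 years, in the same range as the figure quoted on the slide, and the remaining products land close to the slide's rounded 81 days, 1 hour, 30 minutes, 20 seconds, and 0.02 seconds.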

Caches and Video Encoding
- Soft error rate is proportional to the exposed time and area [Cai, 06].
  - Soft error rate (SER) is measured in FIT (Failures in Time) per unit size
  - SER = 1,000 FIT per Mbit for SRAM
  - The larger the memory system, the higher the SER
  - The longer the execution, the higher the SER
- Caches are hit most because they occupy the largest portion of the processor (more than 50%).
- H.263 video encoding
  - Consists of complex algorithms: motion estimation, discrete cosine transform, quantization scale, variable length encoding
  - Processes a huge amount of video data
- Video encoding is time-intensive and memory-intensive, and thus very vulnerable to soft errors.
Reference: Y. Cai, et al., “Cache size selection for performance, energy and reliability of time-constrained systems”, ASP-DAC.

Soft Error Protection Within HW
- ECC (Error Correction Codes): Forward Error Recovery (FER)
  - ECC incurs high overheads in terms of power (22% [Phelan, 03]), performance (95% [Li, 05]), and area (25% [Kreuger, 08])
  - Conventional micro-architectural techniques within the hardware layer still rely on ECC
- EDC (Error Detection Codes)
  - EDC is much less expensive than ECC in terms of power, performance, and area: up to 73% less in power and 47% less in performance than ECC [Li, 04]
  - A detected error still needs to be corrected
  - Checkpoints and rolling backward (BER: Backward Error Recovery) are bad for real-time requirements
[Figure: timeline with checkpoints K and K+1; after error detection, BER rolls back to checkpoint K while FER moves forward. System layers: application, middleware/operating system, hardware.]
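The gap between detecting and correcting an error can be illustrated with the simplest error detection code, a single even-parity bit per stored word. This is a generic textbook sketch and not the EDC evaluated in the talk; all function names are hypothetical.

```python
def parity_bit(word: int) -> int:
    """Even-parity bit: 1 when the word holds an odd number of 1 bits."""
    return bin(word).count("1") % 2


def store(word: int) -> tuple[int, int]:
    """Model a cache write: keep the word together with its parity bit."""
    return word, parity_bit(word)


def is_intact(word: int, parity: int) -> bool:
    """Model a cache read: recompute parity and compare with the stored bit."""
    return parity_bit(word) == parity


def flip_bit(word: int, position: int) -> int:
    """Model a soft error as a single bit flip at the given position."""
    return word ^ (1 << position)


word, parity = store(0b1011_0010)
corrupted = flip_bit(word, 3)

print(is_intact(word, parity))        # clean read passes the check
print(is_intact(corrupted, parity))   # single bit flip is detected
# Parity says *that* something flipped, not *which* bit, so it cannot
# repair the word; some recovery scheme still has to act on the detection.
```

A single parity bit misses any even number of flips in the same word, which is one reason real designs use stronger codes; the point here is only that detection alone is cheap and leaves recovery to another mechanism.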

Within-Layer Approach
- Within-layer approaches: e.g., HW-based protection at the hardware layer against soft errors, and error-resilient video encoding at the application layer against packet loss
- Cross-layer approach: integrate and coordinate techniques across system layers in a cooperative manner for system optimization
- Can we coordinate within-layer approaches across layers to combat errors and provide reliability at minimal cost?
[Figure: system layers (application, middleware/operating system, hardware) with soft errors and packet loss as error sources.]

Related Cross-Layer Work
- GRACE, UIUC [W. Yuan, Ph.D. thesis, ’04; A. F. Harris III, Ph.D. thesis, ’06]
  - QoS/power tradeoffs
  - Primarily OS adaptation for power management in multimedia mobile devices
  - Network adaptation for power management in multimedia communications
- DYNAMO middleware for FORGE, UCI [S. Mohapatra, Ph.D. thesis, ’05; R. Cornea, Ph.D. thesis, ’07]
  - QoS/power tradeoffs for mobile embedded systems
  - Middleware-driven coordination and proxy-based cooperation
  - Content transcoding at the application layer
  - Network traffic shaping at the network layer
  - Backlight (LCD display) setting at the hardware layer
  - NIC shutdown, CPU DVS/DFS at the hardware layer
- xTune, UCI and SRI [M. Kim, Ph.D. thesis, ’08]
  - QoS/power/timeliness adaptation for distributed real-time embedded systems
  - A formal methodology for cross-layer tuning and verifiable timeliness of mobile embedded systems
- Our contribution
  - QoS/power/reliability system optimization for mobile multimedia embedded systems
  - A cross-layer approach that provides reliability at minimal cost

Related Cross-Layer Work: GRACE
- GRACE, UIUC [GRACE, 05]
  - Primarily OS adaptation for power management in multimedia mobile devices
  - Network adaptation for power management in multimedia communications
References:
- W. Yuan and K. Nahrstedt, “Practical voltage scaling for mobile multimedia devices”, ACM International Conference on Multimedia.
- D. G. Sachs, et al., “GRACE: A cross-layer adaptation framework for saving energy”, IEEE Computer, special issue on Power-Aware Computing, Dec 2003.

Related Cross-Layer Work: DYNAMO
- DYNAMO: a proxy-based, middleware-driven cross-layer approach for QoS/energy tradeoffs (middleware coordination)
  - Content transcoding at the application layer
  - Network traffic shaping at the network layer
  - Backlight (LCD display) setting at the hardware layer
  - NIC shutdown, CPU DVS/DFS at the hardware layer
References:
- Shivajit Mohapatra, “DYNAMO: Power aware middleware for distributed mobile computing”, Ph.D. Thesis, University of California, Irvine, 2005.
- Radu Cornea, “Content annotation for power and quality trade-offs in mobile multimedia systems”, Ph.D. Thesis, University of California, Irvine, 2007.
- Shivajit Mohapatra, et al., “DYNAMO: A cross-layer framework for end-to-end QoS and energy optimization in mobile handheld devices”, IEEE JSAC, May 2007.
- Radu Cornea, et al., “Software annotations for power optimization on mobile devices”, DATE, 2006.
- Shivajit Mohapatra, et al., “Integrated power management for video streaming to mobile handheld devices”, ACM Multimedia, Nov 2003.

Related Cross-Layer Work: xTune
- xTune: a formal methodology for cross-layer tuning of mobile embedded systems
  - Informed selection from a formal model and analysis
  - Enhanced by integrating it with observations of the system
  - Result: adaptive reasoning and proactive control
[Figure: xTune applied between a handheld and a server.]
References:
- Minyoung Kim, “xTune: A formal methodology for cross-layer tuning of mobile real-time embedded systems”, Ph.D. Thesis, University of California, Irvine, 2005.
- Minyoung Kim, et al., “xTune: A formal methodology for cross-layer tuning of mobile embedded systems”, ACM SIGBED Review, Jan 2008.
- Minyoung Kim, et al., “PBPAIR: An energy-efficient error-resilient encoding using probability based power aware intra refresh”, ACM SIGMOBILE MCCR, 2006.

Outline
- Motivation and Related Work
- Problem Statement
- Our Solution
  - CC-PROTECT: Cooperative Cross-Layer Protection
  - Mitigate the impact of soft errors at minimal cost
- Experiments
- Conclusion

Problem Statement and Our Goals
- Soft errors on caches for video encoding
  - Soft errors are transient faults at the hardware layer
  - SER is becoming a critical concern as technology scales
  - Caches are hit most
  - Video encoding is time-intensive and memory-intensive
- Impact of soft errors
  1. Failures
  2. Quality degradation
- Problem: develop a cross-layer approach to mitigate the impact of soft errors by
  1. Reducing the failure rate
  2. Minimizing the quality loss
- Minimize the cost (power and performance)
[Figure: mobile video encoding stack: application (e.g., video encoding), middleware/operating system, error-prone hardware (e.g., error-prone cache) hit by a soft error.]

CC-PROTECT Overview
- Previously: hardware-based error protection (ECC, etc.) on a protected cache
- CC-PROTECT (Cooperative Cross-layer Protection) combines:
  - EDC on the protected cache at the hardware layer
  - DFR for error correction at the middleware/operating system layer
  - PBPAIR for error resilience at the application layer
- ECC: Error Correction Codes; EDC: Error Detection Codes; DFR: Drop and Forward Recovery; PBPAIR: Probability-Based Power Aware Intra Refresh
[Figure: system layers with an unprotected cache and a protected cache at the hardware layer, hit by a soft error.]

Failure Mitigation
- Goal 1: reduce soft error induced failures

Partial Cross-Layer Protection: PPC
- PPC (Partially Protected Caches) [Lee, 06]:
  - One protected cache (ECC, etc.), typically smaller
  - One unprotected cache
- Compiler
  - Maps failure-critical (FC) data into the protected cache
  - Maps failure-non-critical (FNC) data into the unprotected cache
- Still incurs overheads due to expensive ECC protection
  - 29% energy reduction compared to the protected cache
  - 10% energy overhead compared to the unprotected cache
[Figure: processor pipeline with a PPC (unprotected cache and protected cache) in front of memory; FC pages map to the protected cache and FNC pages to the unprotected cache.]
Reference: K. Lee, et al., “Mitigating soft error failures for multimedia applications by selective data protection”, CASES, Oct 2006.

PPC with EDC at Hardware
[Figure: system layers with a PPC at the hardware layer (unprotected cache and protected cache), where the protected cache uses EDC; labels: video data, non-video data, soft error, resource saving. ECC: Error Correction Codes; EDC: Error Detection Codes.]

DFR across HW & MW/OS
- Drop and Forward Recovery (DFR) at video encoding
  - Transforms components into the next correct state, e.g., detect an error and move forward to the next frame encoding
  - In contrast, BER rolls backward
- Especially well suited for multimedia applications
  - Hardware defects are managed by DFR (with timeliness)
  - Quality degradation due to DFR is minimized by the inherent error tolerance of video data
[Figure: timeline with Frame K and Frame K+1; after error detection, DFR moves forward to Frame K+1 while BER rolls back, compared with FER; benefit: resource saving. System layers: application, middleware/operating system, hardware with a soft error.]
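The drop-and-move-forward idea can be sketched as an encoding loop that abandons the current frame when an error is detected and continues with the next one. This is only an illustration under simplifying assumptions: error detection is modeled as an exception, and `SoftErrorDetected`, `encode_with_dfr`, and the `encode_frame` callback are hypothetical names, not part of the authors' implementation.

```python
from typing import Callable, Iterable


class SoftErrorDetected(Exception):
    """Stand-in for an EDC-reported error during a frame's encoding."""


def encode_with_dfr(
    frames: Iterable[bytes],
    encode_frame: Callable[[bytes], bytes],
) -> tuple[list[tuple[int, bytes]], list[int]]:
    """Encode frames in order, dropping any frame whose encoding hits an error.

    Returns the encoded frames (with their indices) and the indices of the
    dropped frames. No checkpoint is restored and nothing is re-executed,
    which is what distinguishes this from backward error recovery.
    """
    encoded: list[tuple[int, bytes]] = []
    dropped: list[int] = []
    for index, frame in enumerate(frames):
        try:
            encoded.append((index, encode_frame(frame)))
        except SoftErrorDetected:
            # Drop the current frame and move forward to the next one.
            dropped.append(index)
    return encoded, dropped


def flaky_encoder(frame: bytes) -> bytes:
    """Toy encoder that reports an error on one particular frame."""
    if frame == b"frame-2":
        raise SoftErrorDetected
    return frame.upper()


frames = [f"frame-{i}".encode() for i in range(4)]
encoded, dropped = encode_with_dfr(frames, flaky_encoder)
print([i for i, _ in encoded], dropped)
```

The dropped frame then looks, from the decoder's point of view, much like a frame lost in the network, which is what lets error-resilient encoding absorb the quality impact.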

Mitigation of QoS Degradation
- Goal 2: mitigate quality degradation due to soft errors and frame drops

Resilience to Network-induced Packet Losses
- Error-resilient video encoding compresses video data so that it is resilient against network errors such as packet losses.
- Goal: improve the video QoS
- Example: PBPAIR, which is also energy efficient
- PLR: Packet Loss Rate; PBPAIR: Probability-Based Power Aware Intra Refresh
[Figure: mobile video encoding stack (error-resilient video encoding, middleware/operating system, hardware) turning raw video data into error-resilient compressed video data sent over an error-prone network with packet loss; the network's PLR is an input to the encoder.]

PBPAIR: Error Resilient Video Encoding
- PBPAIR (Probability Based Power Aware Intra Refresh) [Kim, 06]
- Two parameters
  1. PLR (Packet Loss Rate): network status. The higher the PLR, the more intra macroblocks.
  2. Intra_Threshold: user-level resilience request. The higher the Intra_Threshold, the more intra macroblocks.
- Error-resilient and energy-efficient video encoding
  - Tradeoffs among energy efficiency, compression efficiency, and QoS
  - Up to 34% energy reduction compared to previous encodings at 10% PLR
[Figure: PBPAIR encoder taking PLR from the lossy network and Intra_Threshold from the user.]
Reference: Minyoung Kim, et al., “PBPAIR: An energy-efficient error-resilient encoding using probability based power aware intra refresh”, ACM SIGMOBILE MCCR, 2006.

Resilience to Soft Error induced Frame Drops
- Error-resilient video encoding compresses video data so that it is resilient against not only packet losses but also soft errors.
- A soft error induced frame drop is treated like a loss: the middleware translates SER (Soft Error Rate) into FLR (Frame Loss Rate).
- Benefit: resource saving
- PLR: Packet Loss Rate; PBPAIR: Probability-Based Power Aware Intra Refresh
[Figure: mobile video encoding stack (error-resilient video encoding, middleware/operating system, hardware) turning raw video data into error-resilient compressed video data sent over an error-prone network with packet loss; inputs to the encoder are the PLR from the network and the FLR derived from the SER.]

Translation from SER to FLR
- $N_{SE} = S_{cache} \times N_{inst} \times R_{SE}$
  - $N_{SE}$ is the number of soft errors per frame encoding
  - $S_{cache}$ is the size of the caches in KB: a 32 KB unprotected cache and a 2 KB protected cache for a PPC in our study
  - $N_{inst}$ is the number of instructions for one frame encoding; ACET (Average Case Execution Time) is used in our study
  - $R_{SE}$ is the soft error rate per KB and per instruction; the rate used in our study is accelerated by several orders of magnitude
- $N_{SE}$ is converted into a percentage, which is the FLR
  - Example: $N_{SE} = 32 \times 10^{9} \times R_{SE} = 0.32$, so FLR = 32%
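The translation on this slide is a single product followed by a conversion to percent, which the following sketch spells out. The per-KB, per-instruction rate did not survive in the transcript; the value `1e-11` used below is only what the slide's arithmetic (32 KB, 10^9 instructions, result 0.32) implies, so treat it as an assumption. The function names and the cap at 100% are also choices made for this illustration.

```python
def soft_errors_per_frame(s_cache_kb: float, n_inst: float, r_se: float) -> float:
    """N_SE = S_cache * N_inst * R_SE (expected soft errors per frame encoding)."""
    return s_cache_kb * n_inst * r_se


def frame_loss_rate_percent(s_cache_kb: float, n_inst: float, r_se: float) -> float:
    """Express N_SE as a percentage, i.e. the frame loss rate (FLR).

    Capping at 100% is an assumption of this sketch, not something the
    slide states.
    """
    n_se = soft_errors_per_frame(s_cache_kb, n_inst, r_se)
    return min(100.0, 100.0 * n_se)


# Slide example: 32 KB cache, 10**9 instructions per frame.
# ASSUMPTION: r_se = 1e-11 is inferred from the slide's result of 0.32.
flr = frame_loss_rate_percent(s_cache_kb=32, n_inst=1e9, r_se=1e-11)
print(f"FLR = {flr:.0f}%")
```

The resulting FLR can then be handed to the error-resilient encoder in the same way a packet loss rate would be.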

Adaptive CC-PROTECT
- Naïve DFR
  - Always apply DFR when an error is detected
  - Significant quality degradation
- Adaptive DFR/BER
  - Slack-aware DFR/BER: depends on elapsed time
    if T_elapsed < T_threshold: BER, else: DFR (T_threshold is a portion of ACET, the Average Case Execution Time)
  - Frame-aware DFR/BER: depends on frame importance
    if frame K is important (e.g., an I-frame): BER, else: DFR
  - QoS-aware DFR/BER: depends on the video quality fed back from the decoding side
    if QoS_feedback < QoS_requirement: BER, else: DFR
[Figure: timelines over frames K-1, K, K+1, K+2 showing error detection, T_elapsed, and the choice between BER (re-encode frame K) and DFR (move on to frame K+1).]
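The three adaptive policies on this slide are small decision rules, sketched below as plain functions. The names, signatures, and the `Recovery` enum are hypothetical; the 0.6 default for the slack-aware threshold mirrors the 60% of ACET setting reported in the experiments later in the talk, and everything else is a direct transcription of the slide's if/else rules.

```python
from enum import Enum


class Recovery(Enum):
    DFR = "drop and forward recovery"
    BER = "backward error recovery"


def naive_dfr() -> Recovery:
    """Always drop the frame when an error is detected."""
    return Recovery.DFR


def slack_aware(t_elapsed: float, acet: float, threshold_fraction: float = 0.6) -> Recovery:
    """Roll back only if little of the frame's time budget has been spent."""
    t_threshold = threshold_fraction * acet
    return Recovery.BER if t_elapsed < t_threshold else Recovery.DFR


def frame_aware(frame_is_important: bool) -> Recovery:
    """Roll back for important frames (e.g., I-frames), drop the others."""
    return Recovery.BER if frame_is_important else Recovery.DFR


def qos_aware(qos_feedback_db: float, qos_requirement_db: float) -> Recovery:
    """Roll back while the quality reported by the decoder is below target."""
    return Recovery.BER if qos_feedback_db < qos_requirement_db else Recovery.DFR


# Error detected 20 ms into a frame whose average encoding time is 50 ms:
print(slack_aware(t_elapsed=20.0, acet=50.0))   # below 30 ms threshold
# Error detected 40 ms in: too late to redo the frame in time.
print(slack_aware(t_elapsed=40.0, acet=50.0))
```

Each rule trades quality against cost: BER re-encodes the frame and preserves quality at extra time and energy, while DFR gives up the frame to stay on schedule.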

Within-Layer Protections and CC-PROTECT (Cross-Layer Protection)
- Within-layer protections
  - Error-resilient video encoding (e.g., PBPAIR) against packet loss
  - Error-protected data cache (e.g., PPC with ECC) against soft errors
  - No coupling, no cooperation
  - Local optimization within layers
- CC-PROTECT: cross-layer protection
  - PPC with EDC at the hardware layer
  - The middleware relates SER at the hardware to FLR at the application and selects a policy based on available information (parameters and constraints)
  - Resilience and mitigation (QoS) at the application, driven by SER, FLR, and PLR
- CC-PROTECT
  1. achieves system-level optimization
  2. extends the applicability of existing schemes
[Figure: mobile video encoding stack (application such as video encoding, middleware/operating system, hardware) turning raw video data into compressed video data sent over an error-prone network.]

Outline
- Motivation and Related Work
- Problem Statement
- Our Solution
- Experiments
  - Experimental setup and compositions
  - Effectiveness of CC-PROTECT in terms of failure rate, QoS, runtime, and energy consumption
  - Effectiveness of adaptive DFR/BER schemes
- Conclusion

Experimental Framework
- Tool flow: application (H.263 video encoding), compiler (gcc), executable and page mapping, cache simulator (SimpleScalar), analyzer, report
- Report: failure rate, access time, energy, QoS
- Inputs: video data, DFR parameters, soft error rate, power numbers, delay penalties
- Video encoders
  1. Error-prone video encoding (GOP-K)
  2. Error-resilient video encoding (PBPAIR)
- Cache configurations
  1. Protected cache parameters
  2. Unprotected cache parameters
- Video streams: COASTGUARD (high activity), AKIYO (low activity), FOREMAN (mid activity)

Compositions
1. BASE (no protection): error-prone video encoding (GOP-K) + unprotected cache
2. HW-PROTECT: error-prone video encoding (GOP-K) + PPC with ECC
3. APP-PROTECT: error-resilient video encoding (PBPAIR) + unprotected cache
4. MULTI-PROTECT: error-resilient video encoding (PBPAIR) + PPC with ECC
5. CC-PROTECT: error-resilient video encoding (PBPAIR) + DFR + PPC with EDC
- Composition 1 is no protection; compositions 2, 3, and 4 are within-layer protections; composition 5 is cross-layer protection.
- Middleware/operating system roles in CC-PROTECT: soft error monitoring, SER translation, and selection between DFR and BER
[Figure: layers and their options: application (video encoding: GOP-K or PBPAIR), middleware/operating system (DFR), hardware (data cache: unprotected cache or PPC with EDC).]

Effectiveness of CC-PROTECT
- First set of experiments: evaluate CC-PROTECT against existing protections in terms of failure rate, video quality, energy consumption, and performance for FOREMAN.QCIF (mid activity)

Failure Rate
- Failure rate is the number of failures (e.g., system crashes) due to soft errors, out of thousands of simulations.
- CC-PROTECT reduces the failure rate by more than 1,000 times compared to BASE.

Video Quality
- QoS is the video quality measured in PSNR.
- CC-PROTECT achieves video quality close to that of the other compositions.

Energy Consumption
- Energy consumption includes the energy consumed by the caches, bus, and main memory.
- CC-PROTECT reduces the energy consumption of the memory subsystem by 49% compared to BASE.
- EDC impact: 17% reduction compared to HW-PROTECT, 4% reduction compared to BASE
- EDC + DFR impact: 36% reduction compared to HW-PROTECT, 26% reduction compared to BASE
- EDC + DFR + PBPAIR (CC-PROTECT) impact: 56% reduction compared to HW-PROTECT, 49% reduction compared to BASE

Performance
- Performance is estimated as the access time to the memory subsystem (caches, bus, and memory).
- CC-PROTECT reduces the memory access time by 58% compared to BASE.

Effectiveness of CC-PROTECT
- CC-PROTECT achieves low-cost reliability: more than 50% cost reduction and higher reliability than within-layer protections, at some cost in QoS.

Effectiveness of Adaptive CC-PROTECT
- Second set of experiments: compare the adaptive CC-PROTECT schemes (SA-DFR/BER, FA-DFR/BER, and QA-DFR/BER) with the naïve schemes (Naïve DFR and Naïve BER) in terms of video quality and energy consumption for FOREMAN.QCIF (mid activity)
  - For failure rate and performance, please refer to our paper
- SA-DFR/BER: 60% of ACET (Average Case Execution Time) is the threshold value
  - 60% is the smallest threshold value that yields better QoS than BASE
- FA-DFR/BER: the 2nd frame must be protected
  - Losing the 2nd frame affects the QoS most
- QA-DFR/BER: the threshold value (in dB) for selecting DFR or BER is the PSNR value of BASE for FOREMAN

QoS
- Adaptive CC-PROTECT improves the video quality compared to Naïve DFR.

Energy Consumption
- The energy consumption of adaptive CC-PROTECT falls between that of Naïve DFR and Naïve BER; among the adaptive schemes, QA-DFR/BER is the best in terms of energy.

Conclusion
- Soft errors are a critical design concern for mobile multimedia embedded systems.
- Previously proposed within-layer protection techniques are expensive for resource-constrained mobile devices.
- We propose CC-PROTECT, an approach that coordinates existing schemes across layers to mitigate the impact of soft errors on the failure rate and video quality in mobile video encoding systems:
  - PPC (Partially Protected Caches) with EDC (Error Detection Codes) at the hardware layer
  - DFR (Drop and Forward Recovery) at the middleware
  - PBPAIR (Probability-Based Power Aware Intra Refresh) at the application layer
- We demonstrate low-cost (about 50% cost reduction) reliability (1,000x fewer failures) at a minimal cost in QoS (less than 1%).
- Future work includes:
  - Extending CC-PROTECT to various errors and to a runtime approach
  - Intelligent schemes to improve the effectiveness
  - Design space exploration techniques

Thanks! Any Questions?

Backup Slides

Soft Errors on an Increase
- SER increases exponentially due to technology scaling
  - 0.18 µm: 1,000 FIT per Mbit of SRAM
  - 0.13 µm: 10,000 to 100,000 FIT per Mbit of SRAM
- Voltage scaling increases SER significantly
- SER model: $\mathrm{SER} \propto N_{flux} \times CS \times \exp\left(-\frac{Q_{critical}}{Q_S}\right)$, where $Q_{critical} = C \times V$
- Soft errors are a main design concern!
Reference: [Hazucha et al., IEEE] P. Hazucha and C. Svensson, “Impact of CMOS Technology Scaling on the Atmospheric Neutron Soft Error Rate”, IEEE Trans. on Nuclear Science, 47(6):2586-2594.

Soft Error is an Every Second Concern
- Soft Error Rate (SER) is measured in FIT (Failures in Time): the number of errors in one billion operation hours.
- SER per 0.13 µm = 1,000 FIT ≈ 104 years in MTTF
- Soft errors are becoming an every-second problem:
  - SER for [...] µm = 64x8x1,000 FIT ≈ 81 days in MTTF
  - SER for [...] nm = 2x1,000x64x8x1,000 FIT ≈ 1 hour in MTTF
  - SER for a [...] 65 nm = 2x2x1,000x64x8x1,000 FIT ≈ 30 minutes in MTTF
  - SER with voltage scaling for a [...] 65 nm = 100x2x2x1,000x64x8x1,000 FIT ≈ 20 seconds in MTTF
  - SER with voltage scaling in flight (35,[...]) = 800x100x2x2x1,000x64x8x1,000 FIT ≈ 0.02 seconds in MTTF
References:
- Actel, “Neutrons from above – Soft Error Rates”, Actel tech. rep., 2002.
- Robert Baumann, “Soft errors in advanced computer systems”, IEEE Design and Test of Computers, 2005.
- Gordon E. Moore, “Cramming more components onto integrated circuits”, Electronics, 1965.
- S. Mitra, et al., “Robust system design with built-in soft-error resilience”, IEEE Computer, 2005.
- P. Hazucha et al., “Impact of CMOS technology scaling on the atmospheric neutron soft error rate”, IEEE Trans. on Nuclear Science, 2000.
- Ritesh Mastipuram and Edwin C. Wee, “Soft errors’ impact on system reliability”,

#45 Problem Statement and Our Goals
- Two impacts of soft errors: 1. Failure, 2. Quality
[Figure: Mobile Video Conferencing / Mobile Video Encoding. Layer stack of Application (e.g., video encoding), Middleware / Operating System, and Error-Prone Hardware (e.g., error-prone cache) hit by a soft error; raw video data goes in, compressed video data goes out over an error-prone network.]

#46 FER and BER
- Forward Error Recovery (FER)
  - Transform components into any correct state
  - ECC
  - Overkill for multimedia applications
- Backward Error Recovery (BER)
  - Roll back into the previous correct state
  - EDC + checkpoint and roll backward
  - Bad for the real-time requirement
[Figure: error detection between Checkpoint K and Checkpoint K+1, with the BER and FER recovery paths.]

#47 Error-Resilience at Application
- PBPAIR [Kim, 06] takes the packet loss rate into account to determine the error-resilience level
  - Error Rate = Packet Loss Rate (PLR)
- EE-PBPAIR [Lee, 08] has a mechanism to adjust the packet loss rate
- EE-PBPAIR at the application layer encodes video data to be resilient against not only packet losses but also soft errors
  - Error Rate = PLR + FLR (Frame Loss Rate)
  - SER (Soft Error Rate) at hardware is translated into FLR (Frame Loss Rate) at middleware
[Figure: layer stack of Hardware (soft error), Middleware / Operating System, and Application.]

Preliminary and Extra Experimental Results

#49 Energy Consumption

#50 CC-PROTECT for AKIYO (low activity)
CC-PROTECT obtains better results for low-activity video streams

#51 CC-PROTECT for COASTGUARD (high activity)
CC-PROTECT obtains effective results across various video streams

#52 Failure Rate
Adaptive CC-PROTECT has a worse failure rate than Naïve DFR but is still better than BASE

#53 Performance
Adaptive CC-PROTECT balances between Naïve DFR and Naïve BER

#54 Compositions in the following slides
- Base: GOP + Unprotected Cache
- HW-Protection 1: GOP + Protected Cache with ECC
- HW-Protection 2: GOP + Protected Cache with EDC + BER (checkpoint and roll-backward)
- App-Protection: PBPAIR + Unprotected Cache
- All-Protection: PBPAIR + Protected Cache with ECC
- Cross-Layer Protection 1: GOP + PPC with EDC + DFR (drop and forward recovery)
- Cross-Layer Protection 2: PBPAIR + PPC with EDC + DFR (drop and forward recovery)

#55 Failure Rate

#56 Video Quality

#57 Performance

#58 Energy Consumption

#59 Naïve DFR
- Naïve DFR
  - Strategy: any soft error results in DFR
  - Pros: high energy saving and high reliability
  - Cons: QoS degradation
    - e.g., consecutive frames dropped
[Figure: error detection in Frame K triggers DFR to Frame K+1; frame sequence K-1, K, K+1, K+2 with an error and a drop; QoS?]

#60 Slack-Aware Adaptive DFR/BER
- SA-DFR/BER
  - Strategy: enough slack time can help improve QoS by retrying the frame
  - Pros: QoS improvement
  - Cons: increased energy consumption
- Policy:
    if T_elapsed < T_threshold
        go back to Frame K (BER)
    else
        drop and move forward to Frame K+1 (DFR)
    where T_threshold is C% of ACET
[Figure: error detection in Frame K, with DFR to Frame K+1 or BER back to Frame K; frame sequence K-1, K, K+1, K+2 with an error, a drop, and a BER retry.]
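The slack-aware rule on this slide can be written as a small decision function. This is a minimal sketch under stated assumptions: the function name, the string return values, and the time units are hypothetical, and only the comparison of T_elapsed against C% of ACET comes from the slide.

```python
def slack_aware_action(t_elapsed, acet, c_percent):
    """Choose a recovery action when a soft error is detected in Frame K.

    t_elapsed: time already spent encoding Frame K when the error is detected
    acet:      the ACET value referenced on the slide, in the same time unit
    c_percent: the C% knob; T_threshold = C% of ACET

    Returns "BER" (go back and re-encode Frame K) when enough slack remains,
    otherwise "DFR" (drop Frame K and move forward to Frame K+1).
    """
    t_threshold = acet * c_percent / 100.0
    return "BER" if t_elapsed < t_threshold else "DFR"


if __name__ == "__main__":
    # Hypothetical numbers: ACET of 100 ms and a 50% threshold.
    for elapsed_ms in (10, 49, 50, 90):
        print(elapsed_ms, "ms elapsed ->", slack_aware_action(elapsed_ms, 100, 50))
```

Raising `c_percent` makes retries more likely, which trades energy for QoS as the pros and cons on the slide describe.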

#61 Frame-Aware Adaptive DFR/BER
- FA-DFR/BER
  - Strategy: frames that are important from a QoS perspective should not be dropped
  - Pros: QoS improvement
  - Cons: increased energy consumption and the need to change the encoder
- Alternative policies (labeled A, B, C on the slide):
    if F_K == F_I-frame
        go back to Frame K (BER)
    else
        drop and move forward to F_K+1 (DFR)

    if F_K-1 (previous frame) was dropped
        go back to Frame K (BER)
    else
        drop and move forward to F_K+1 (DFR)

    if Diff(K-1, K) > Diff_threshold
        go back to Frame K (BER)
    else
        drop and move forward to F_K+1 (DFR)
[Figure: error detection in Frame K, with DFR to Frame K+1 or BER back to Frame K; frame sequence K-1, K, K+1, K+2 with an error, a drop, and a BER retry.]
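The three frame-aware conditions on this slide can be sketched as separate predicates. This is an illustrative sketch only: `FrameState`, its field names, the function names, and the difference metric are assumptions, and the functions are named by their condition because the slide text does not make clear which condition carries which of the labels A, B, and C.

```python
from dataclasses import dataclass


@dataclass
class FrameState:
    """Hypothetical summary of what the middleware knows about Frame K."""
    is_i_frame: bool          # Frame K is an I-frame
    prev_frame_dropped: bool  # Frame K-1 was dropped
    diff_from_prev: float     # difference between Frames K-1 and K (metric assumed)


def retry_if_i_frame(frame):
    """Go back to Frame K (BER) only when it is an I-frame."""
    return "BER" if frame.is_i_frame else "DFR"


def retry_if_prev_dropped(frame):
    """Go back to Frame K (BER) when Frame K-1 was already dropped."""
    return "BER" if frame.prev_frame_dropped else "DFR"


def retry_if_large_diff(frame, diff_threshold):
    """Go back to Frame K (BER) when it differs strongly from Frame K-1."""
    return "BER" if frame.diff_from_prev > diff_threshold else "DFR"


if __name__ == "__main__":
    frame = FrameState(is_i_frame=False, prev_frame_dropped=True, diff_from_prev=0.2)
    print(retry_if_i_frame(frame))
    print(retry_if_prev_dropped(frame))
    print(retry_if_large_diff(frame, diff_threshold=0.5))
```

The second predicate addresses the consecutive-drop problem noted for Naïve DFR, since it never allows two frames in a row to be dropped because of soft errors.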

#62 QoS-Aware Adaptive DFR/BER
- QA-DFR/BER
  - Strategy: QoS/delay feedback from the receiver helps adjust DFR policies
    - e.g., QoS degradation triggers BER
    - e.g., QoS degradation can increase the time threshold, increasing the chance of a retry
    - e.g., if delay matters, apply DFR aggressively
  - Pros: QoS is managed by the user end
  - Cons: it may always invoke BER
- Low-quality feedback aggressively increases error resilience or decreases DFR by adjusting threshold values
  - T_threshold is increased by quality feedback: BER is applied more often
  - T_threshold is decreased by delay feedback: DFR is applied more often
[Figure: sender streams to receiver and receiver returns feedback; error detection in Frame K, with DFR to Frame K+1 or BER back to Frame K.]
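The feedback rule on this slide amounts to moving the time threshold up on quality feedback and down on delay feedback. The sketch below assumes the threshold is expressed as a percentage of ACET, as in the slack-aware policy; the function name, the fixed step size, and the clamping to the range 0 to 100 are assumptions, not details from the slides.

```python
def adjust_threshold(c_percent, low_quality, high_delay, step=10.0):
    """Update the threshold percentage from receiver feedback.

    low_quality: receiver reported degraded quality -> raise the threshold,
                 so BER (retry) is applied more often
    high_delay:  receiver reported excessive delay -> lower the threshold,
                 so DFR (drop) is applied more often
    step:        hypothetical adjustment per feedback message
    """
    if low_quality:
        c_percent += step
    if high_delay:
        c_percent -= step
    # Clamp so the threshold stays a valid percentage.
    return max(0.0, min(100.0, c_percent))


if __name__ == "__main__":
    c = 50.0
    # Repeated low-quality feedback pushes the threshold to its ceiling.
    for _ in range(6):
        c = adjust_threshold(c, low_quality=True, high_delay=False)
        print(c)
```

The saturation at the ceiling illustrates the drawback listed on the slide: with persistent low-quality feedback, the policy drifts toward invoking BER nearly every time.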

#63 Randomly Adaptive DFR/BER
- Random DFR/BER
  - Strategy: select DFR or BER based on pseudo-random generation with a probability
  - Pros: new knob to adjust the DFR policy
  - Cons: no intelligence
- Policy:
    if P_pseudo-random > P_threshold
        go back to Frame K (BER)
    else
        drop and move forward to Frame K+1 (DFR)
    where P_threshold is the weight of DFR and P_pseudo-random is a pseudo-random number between 0 and 100
[Figure: error detection in Frame K, with DFR to Frame K+1 or BER back to Frame K; frame sequence K-1, K, K+1, K+2 with an error, a drop, and a BER retry.]
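The random policy on this slide can be sketched with Python's standard `random` module. The function name and the use of a seeded `random.Random` instance are assumptions made for reproducibility; the comparison of a pseudo-random number in the range 0 to 100 against P_threshold is taken from the slide.

```python
import random


def random_action(p_threshold, rng):
    """Pick BER or DFR at random when a soft error is detected.

    p_threshold: weight of DFR, between 0 and 100
    rng:         a random.Random instance (seeded by the caller if desired)
    """
    p_pseudo_random = rng.uniform(0, 100)
    return "BER" if p_pseudo_random > p_threshold else "DFR"


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    trials = 10_000
    drops = sum(random_action(70, rng) == "DFR" for _ in range(trials))
    print(f"DFR chosen in {100 * drops / trials:.1f}% of errors")
```

With P_threshold set to 70, about 70% of detected errors lead to a drop, so the threshold acts directly as the DFR weight described on the slide.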

Results for DFR + BER