Winter Semester 2010 ”Politehnica” University of Timisoara Course No. 5: Expanding Bio-Inspiration: Towards Reliable MuxTree  Memory Arrays – Part 2 –


Winter Semester 2010 “Politehnica” University of Timisoara Course No. 5: Expanding Bio-Inspiration: Towards Reliable MuxTree Memory Arrays – Part 2 – Emerging Systems

Presentation Outline
Chapter 1: Bio-Inspired Reliability (with a plea for bio-inspiration and a comparison between artificial Embryonics cells and biological stem cells)
Chapter 2: A Bird’s Eye View Over Faults (fault tolerance motivation, causes of unexpected soft errors, and a description of the physical phenomena involved)
Chapter 3: Embryonics and SEUs (particularities of the project, datapath model for memory structures, and reliability analysis)

Chapter 3: Embryonics and SEUs (1)
Current state of the art
Bio-inspired memory for Embryonics genome storage
Genome storage is critical:
– it drives the actual hardware (polymerase and ribosomic genome)
– it contains instructions on how additional hardware will be driven (operative genome)
No memory protection mechanisms exist currently
Memory protection is both desirable and feasible

3.1. Error-Type Distribution
“By far the most common type of chip failure is a soft error of a single cell on a chip”
Multiple bit flips: 1% to 7% of the total soft fails recorded
Double bit flips: under 5% of the total events
Two quadruple bit-flip events witnessed; predicted rate of 1 in 65 years per device

3.2. Datapath Model for Memory Structures
3D matrix: M rows and N columns of physically identical storage molecules, each with F 1-bit memory cells
Data is circulated synchronously

3.2. Datapath Model for Memory Structures
For each $L_{i,j}$, a vicinity $V(L_{i,j}) = \{L_{x,y}, L_{i,j}, L_{z,w}\}$ is defined
Data shifting process:

3.2. Datapath Model for Memory Structures
Useful for error injection testing
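The datapath model lends itself to a small simulation. The sketch below is only an illustration under stated assumptions: the transcript does not preserve the shifting rule, so the code assumes a row-major ring in which every molecule takes over the word of its predecessor on each synchronous step. The names make_array, shift and inject_fault are hypothetical and do not come from the slides.

```python
def make_array(m, n, f):
    """Build an M x N grid of molecules, each holding F one-bit cells (all zero)."""
    return [[[0] * f for _ in range(n)] for _ in range(m)]


def shift(array):
    """One synchronous step: each molecule receives its predecessor's word.

    Assumption: molecules are chained in row-major order and the last
    molecule feeds the first one, so the data circulates in a ring.
    """
    m, n = len(array), len(array[0])
    flat = [list(array[i][j]) for i in range(m) for j in range(n)]
    rotated = [flat[-1]] + flat[:-1]
    return [[rotated[i * n + j] for j in range(n)] for i in range(m)]


def inject_fault(array, i, j, k):
    """Flip bit k of molecule (i, j) in place, modelling a single-event upset."""
    array[i][j][k] ^= 1
```

Injecting a fault and then shifting shows how a flipped bit travels with its word around the array, which is the property that makes such a model convenient for error injection tests.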

3.3. Reliability Analysis
Basic assumption: failures are exponentially distributed inside a molecule
Similar assumptions have been found to work well
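Under the exponential assumption, the standard textbook relations are R(t) = exp(-λt) and MTTF = 1/λ for a single unit with constant failure rate λ. The sketch below evaluates them; the function names are hypothetical, and any λ passed in is a placeholder rather than a value from the slides.

```python
import math


def reliability(lam, t):
    """Probability that a unit with constant failure rate lam survives until time t."""
    return math.exp(-lam * t)


def mttf(lam):
    """Mean time to failure of an exponentially distributed lifetime."""
    return 1.0 / lam


def time_to_reliability(lam, r):
    """Time at which reliability has dropped to the level r (0 < r <= 1)."""
    return -math.log(r) / lam
```

For instance, time_to_reliability(lam, 0.9) gives the period during which a single molecule stays above 90% reliability; the protected-array figures quoted later in the slides come from more involved expressions than this single-unit case.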

3.4. Error Coding
Failure situations:
A. Single failure; recovery by parity-based coding
B. Double failure; core affected by at least one error, at most two errors on the same row; recovery by Hamming-like codes
C. Multiple failure; same as the previous case, likelihood found to be minimal
D. Terminal failure; too many faults, cannot be recovered
E. No failures detected; either normal operation or an undetectable combination of errors; recovery measures are either not required or cannot be established
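To make single-error recovery concrete, here is a minimal sketch of the textbook Hamming(7,4) single-error-correcting code. It is an illustration only: the slides do not specify the exact code used in the Embryonics memory, and the function names are hypothetical.

```python
def hamming74_encode(data):
    """Encode 4 data bits as a 7-bit codeword laid out as p1 p2 d1 p3 d2 d3 d4."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_correct(codeword):
    """Return (corrected codeword, syndrome); syndrome 0 means no error was seen."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3  # 1-based position of the flipped bit
    if syndrome:
        c[syndrome - 1] ^= 1
    return c, syndrome


def hamming74_decode(codeword):
    """Extract the 4 data bits from a (corrected) codeword."""
    return [codeword[2], codeword[4], codeword[5], codeword[6]]
```

A single flipped bit anywhere in the codeword is located by the syndrome and repaired. Two flipped bits produce a nonzero syndrome that points at the wrong position, which is why double failures call for a stronger code.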

3.4. Error Coding
Strategies for tolerating faults in Embryonics:
– Fault tolerance at the molecular level
Advantage: isolating faulty molecules is possible; the transparent reconfiguration process can be used
Disadvantage: a considerable portion of the molecular core is taken up by redundant coding
– Fault tolerance at the macro-cell level
Advantage: separate macro-cells for redundant coding and additional logic
Disadvantage: the reconfiguration process is quite difficult due to the lack of addressing

Macro-Cell Level, Classic SEC

Macro-Cell Level, Protochip SEC
Faults in a row are superimposed onto a protochip
In each protochip, the failure types form independent Poisson processes
a: the probability of a type A failure
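The Poisson view can be sketched numerically. The code below assumes that failures arrive in a protochip as a Poisson process with some total rate, and that each failure is independently of type A with probability a; by the standard thinning property, type A failures then form a Poisson process with rate a times the total rate. This reading of a, and the function names, are assumptions made for illustration rather than the slides' exact formulation.

```python
import math


def poisson_pmf(k, rate, t):
    """Probability of exactly k events in time t for a Poisson process with the given rate."""
    mu = rate * t
    return mu ** k * math.exp(-mu) / math.factorial(k)


def type_count_pmf(k, total_rate, p_type, t):
    """Probability of exactly k failures of one type by time t.

    Assumption: each failure is independently of this type with probability
    p_type, so the thinned process is Poisson with rate p_type * total_rate.
    """
    return poisson_pmf(k, p_type * total_rate, t)
```

With k = 0 the second function gives the probability that no failure of that type has occurred by time t, which is the kind of term a protochip-based reliability expression is built from.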

Macro-Cell Level, Protochip SEC

Macro-Cell Level, Protochip DEC
a: the probability of a type A failure

Macro-Cell Level, Protochip DEC

3.6. Molecular Level
Molecular reliability λ known

3.6. Molecular Level
Reliability > 90%: periods of 28.4 million hours (SEC) vs. 63.3 million hours (DEC)
Reliability = 50%: reached after 89.8 million hours (SEC) vs. million hours (DEC)

3.7. Conclusions
Final expressions of R and MTTF are quite complicated
Failure rate λ is essentially empirical:
– determined through extensive measurements
– may be affected by aggressive environments
– constant → variable

3.7. Conclusions
Unfortunately, there is no accurate model for cosmic rays
Understanding the causes of soft fails and modeling them is a hot field of research
Soft fails are stochastic in nature

3.7. Conclusions
Different macro-cell configurations; may prove too small for real applications
Classic reliability analysis is difficult, being based on non-stochastic parameters
Protochip-based analysis gives similar results and is better suited to other influences (such as cosmic rays)