Download presentation
Presentation is loading. Please wait.
Published byGeorgina Matthews Modified over 9 years ago
1
Fault-tolerant Typed Assembly Language Frances Perry, Lester Mackey, George A. Reis, Jay Ligatti, David I. August, and David Walker Princeton University TAL FT : An Idealized Fault-tolerant System Goal: Detect all faults that change a program’s observable behavior and prove that well-typed programs always detect a single fault. General Strategy: Two redundant and independent copies of each computation (labeled green and blue) that are compared before stores and control flow transfers. The TAL FT Machine A machine state consists of a code memory C, a value memory M, a register file R, a current instruction ir, and a store queue Q.. 0x0393 mov r 1 4 0x0394 mov r 3 4 0x0395 st G r 2 r 1 0x0396 st B r 4 r 3. C Q A value is observed when it is stored to memory. Q buffers a store from the green computation until the corresponding blue store is performed. Special hardware is used by memory and control flow instructions to detect mismatches between the computations and trigger the special fault state. Performance Results Conclusion For More Information… See our upcoming paper in PLDI ’07 http://www.cs.princeton.edu/~frances/tal_ft-pldi07.pdf Visit the Project ap Homepage http://www.cs.princeton.edu/sip/proj/zap Transient faults are already a significant concern, and their impact is increasing. TAL FT formalizes idealized hardware for a hybrid hardware-software fault-tolerance scheme. TAL FT ’s type system captures the invariants necessary to reason about the system. We formally proved that all well-typed programs are fault-tolerant relative to the fault model. We simulated the TAL FT hardware using the Velocity Research Compiler. Despite essentially doubling the number of instructions, TAL FT is only 34% slower than the equivalent fault-intolerant code. Formalizing the TAL FT System Operational semantics ! ’ specifies how a program executes. is a machine state (R,C,M,Q,ir). Special operational rules randomly introduce faults by arbitrarily modifying a value in the register file R or the queue Q Judgment ` Z states that machine state is well-typed under zap tag Z. Z is either empty or a color c, representing the color of the computation that has been corrupted. If Z is empty then all standard typing invariants must hold. If Z is a color c, then values colored c do not need to conform to their given types. The Fault-tolerance Theorem A program is fault tolerant when all the faulty executions simulate fault-free executions of the program. The relationship 1 sim c 2 says that 1 and 2 are identical modulo values colored c. (Simplified) Fault-Tolerance Theorem: Either a faulty execution is indistinguishable from the equivalent fault-free execution, or it terminates in the fault state. If ` and sim c 1 and ! ’ then either 1 ! 2 and ’ sim c 2 or 1 ! fault. Transient Faults Transient faults (also known as soft fails) occur when an energetic particle strikes a transistor, causing it to change state. Does not permanently damage hardware. May corrupt computation by altering stored values and signal transfers. 1 + 1 = 18 Sun, HP, and Cypress Semiconductor have admitted that transient faults caused crashes and problems at eBay, AOL, Los Alamos, and other major sites. Faster clock rates, increasing transistor density, decreasing voltages, and smaller feature size all contribute to increasing fault rates of approximately 8% per generation. 2006 2014 (estimated) Transient faults are a well-known problem in the architecture community. There are many existing solutions, but do these solutions actually work? Solutions generally consist of algorithms stated in English, and it is left to the audience to judge correctness. In 2004, a typical laptop with 1GB RAM had 1 soft failure per year. Fault rates increase with altitude. Fault rates on a trans-pacific flight increase 300x to nearly 1 fault per roundtrip. Frances Perry. CRA-W/CDC Programming Languages Summer School. May 9-11, 2007.
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.