Published by Constance Skinner. Modified over 9 years ago.
1
Dynamic Voting Schemes to Enhance Evolutionary Repair in Reconfigurable Logic Devices
C. Milliord, C. A. Sharma, and R. F. DeMara
University of Central Florida
29 September 2005
2
Technical Objective: Autonomous FPGA Regeneration
Redundancy vs. Regeneration:
- Overhead from Unutilized Spares (weight, size, power)
  - Redundancy: increases with amount of spare capacity
  - Regeneration: weakly-related to number recovery capacity
- Granularity of Fault Coverage (resolution where fault handled)
  - Redundancy: restricted at design-time
  - Regeneration: variable at recovery-time
- Fault-Resolution Latency (availability via downtime required to handle fault)
  - Redundancy: based on time required to select spare resource
  - Regeneration: based on time required to find suitable recovery
- Quality of Repair (likelihood and completeness)
  - Redundancy: determined by adequacy of spares available (?)
  - Regeneration: affected by multiple characteristics (+ or -)
- Autonomous Operation (recover without outside intervention)
  - Redundancy: yes
  - Regeneration: yes
- Everyday example
  - Redundancy: spare tire
  - Regeneration: can of fix-a-flat
Increased availability without pre-configured spares …
NASA Moon, Mars, and Beyond: realize 10's of years of service life ???
Reconfiguration allows new fault-handling paradigm
3
Problem Statement
- FPGAs in Space
  - Harsh conditions lead to faults in hardware
    - Radiation
    - Extreme temperatures
    - Mechanical stress
  - Long mission duration
- Experiment with several combinations of GAs and voting schemes
  - Population of FPGA configurations that are physically distinct, but functionally equivalent
  - Voting involves 3 or more configurations, with a majority output
- Hypothesis
  - The added space and computation associated with a voting scheme is justified by a quicker and more complete repair
4
EHW Environments
- Evolvable Hardware (EHW) Environments enable experimental methods to research soft computing intelligent search techniques
- EHW operates by repetitive reprogramming of real-world physical devices using an iterative refinement process
- Two modes of Evolvable Hardware
  - Extrinsic Evolution: Genetic Algorithm with a software model (simulation in the loop); once done, build it; device "design-time" refinement
  - Intrinsic Evolution: Genetic Algorithm with hardware in the loop; device "run-time" refinement; new approach to Autonomous Repair of failed devices
- Application
  - Stardust Satellite: >100 FPGAs onboard
  - Hostile environment: radiation, thermal stress
  - How to achieve reliability to avoid mission failure???
5
Genetic Algorithms (GAs)
- GA cycle (diagram): start with a population of candidate solutions; evaluate fitness of individuals with the fitness function; selection of parents; crossover and mutation of parents produce offspring; replacement; repeat until goal reached
- Initial population of configurations
  - Functionally equivalent, physically distinct
- Fitness level
  - Based on number of correct outputs for all possible inputs
- Creating a new generation
  - Mutation: "100011101" -> "101011101"
  - Crossover: "101100" & "011110" -> "101110"
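The two operators on this slide can be reproduced with a few lines of Python. This is only a sketch: the function names, the explicit `position` and `point` arguments, and the choice of single-point crossover are assumptions made for illustration, since the slide gives only the before and after bitstrings.

```python
def mutate(bits: str, position: int) -> str:
    """Flip the single bit at a zero-based position (illustrative helper)."""
    flipped = "1" if bits[position] == "0" else "0"
    return bits[:position] + flipped + bits[position + 1:]


def crossover(parent_a: str, parent_b: str, point: int) -> str:
    """Single-point crossover: head of parent_a joined to tail of parent_b."""
    return parent_a[:point] + parent_b[point:]


# Reproduce the slide's examples.
print(mutate("100011101", 2))            # 101011101
print(crossover("101100", "011110", 3))  # 101110
```

A cut point of 3 is one choice that reproduces the slide's offspring; because the two parents agree at positions 2 and 3, cut points 2 and 4 give the same child.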
6
Previous Work
- [1] Re-routing scheme replaces faulty CLB
  - Time-saving method with low overhead
- [2] TMR fault-detection
  - On-line approach
  - High overhead and power consumption
- [3] On-line technique using a BIST
  - Limited power consumption
  - Spare resources
- [4] GA repair of integer multiplier
  - Voting system may not always outperform individual with the highest fitness
  - Initialized GA with copies of one hand-designed configuration
References
[1] Xu, J., Si, P., Huang, W., and Lombardi, F., "A novel fault tolerant approach for SRAM-based FPGAs", Proceedings of the Pacific Rim Int'l Symposium, Dec. 1999, pp. 40-44.
[2] Li, Y., Li, D., and Wang, Z., "A new approach to detect-mitigate-correct radiation-induced faults for SRAM-based FPGAs in aerospace application", Proceedings of the IEEE National Aerospace and Electronics Conference, Oct. 2000, pp. 588-594.
[3] Abramovici, M., Emmert, J., and Stroud, C., "Roving STARs: an integrated approach to on-line testing, diagnosis, and fault tolerance for FPGAs in adaptive computing systems", Proceedings of The Third NASA DoD Workshop, July 2001, pp. 73-92.
[4] Vigander, S., "Evolutionary fault repair of electronics in space applications", Dissertation, University of Sussex, Brighton, UK, 2001.
7
Experimental Setups
- C++ program that simulates FPGA circuit design/repair
  - Input files
    - GA parameters
    - Logic function truth table (input/output pairs)
    - FPGA parameters
    - Configuration properties of perfect individuals (to simulate repair in voting experiments)
  - Output files
    - Configuration properties at selected generations
    - Data showing fitness level at each generation (to produce graphs)
- Loosely Coupled (LC) Virtex System
  - PC workstation running Xilinx EDK and ISE with AVNET V2Pro PCI card
- (SoC) version using PowerPC embedded in FPGA fabric now operational … results reported on previous environment
8
Experimental Inputs
- GA parameters
  - Population size
  - Offspring population size
  - Mutation rate
  - Tournament size (2)
  - Maximum number of generations
- FPGA parameters
  - Number of inputs (6)
  - Number of outputs (6)
  - Number of CLBs
  - Number of look-up tables (LUTs) per CLB (SW only)
  - Number of LUT select lines (SW only)
- Truth table (input/output pairs)

I1 I2 I3 I4 I5 I6   O1 O2 O3 O4 O5 O6
0  0  0  0  0  0    0  0  0  0  0  0
0  1  0  0  0  0    0  0  0  0  0  0
0  1  0  0  0  1    0  0  0  0  1  0
0  1  0  0  1  0    0  0  0  1  0  0
0  1  0  0  1  1    0  0  0  1  1  0
0  1  0  1  0  0    0  0  1  0  0  0
0  1  0  1  0  1    0  0  1  0  1  0
0  1  0  1  1  0    0  0  1  1  0  0
0  1  0  1  1  1    0  0  1  1  1  0
0  1  1  0  0  0    0  0  0  0  0  0

Ideal Fitness = 60
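A minimal sketch of a fitness function consistent with this slide: one point per correct output bit, summed over the ten input/output pairs shown, which gives 10 x 6 = 60 for a perfect circuit. The names `IO_PAIRS`, `fitness`, and `multiplier` are hypothetical. Treating the target as a 3-bit by 3-bit multiplier is an assumption: it matches every row shown, but the slide does not name the logic function.

```python
# The ten input/output pairs from the slide, as (I1..I6, O1..O6) bitstrings.
IO_PAIRS = [
    ("000000", "000000"),
    ("010000", "000000"),
    ("010001", "000010"),
    ("010010", "000100"),
    ("010011", "000110"),
    ("010100", "001000"),
    ("010101", "001010"),
    ("010110", "001100"),
    ("010111", "001110"),
    ("011000", "000000"),
]


def fitness(circuit, io_pairs=IO_PAIRS) -> int:
    """Count output bits that match the truth table across all pairs."""
    score = 0
    for inputs, expected in io_pairs:
        actual = circuit(inputs)
        score += sum(a == e for a, e in zip(actual, expected))
    return score


def multiplier(inputs: str) -> str:
    """Assumed target: multiply the two 3-bit halves, return a 6-bit product."""
    a, b = int(inputs[:3], 2), int(inputs[3:], 2)
    return format(a * b, "06b")


print(fitness(multiplier))  # 60
```

Any candidate configuration can be scored the same way by passing a callable that maps a 6-bit input string to a 6-bit output string.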
9
Experiment #1
- Circuit evolution, no repair
- Maximize GA performance before voting (tweak parameters)
  - Used 200 for max number of generations
  - Varied the mutation rate from .001 to .097 with a step of .004
  - Population sizes of 15, 40, and 50
  - 6, 9, 12, 16, and 36 for number of CLBs
- Evolve several perfect configurations
  - Repeated the most successful runs for 1000 generations
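The parameter sweep described here can be enumerated as follows. This is a sketch only: the slide does not say whether every combination was run, so the full Cartesian grid is an assumption, and the function and variable names are hypothetical.

```python
from itertools import product


def mutation_rates(start=1, stop=97, step=4):
    """Mutation rates .001 to .097 in steps of .004, built from integer
    thousandths to avoid floating-point drift."""
    return [t / 1000 for t in range(start, stop + 1, step)]


POPULATION_SIZES = [15, 40, 50]
CLB_COUNTS = [6, 9, 12, 16, 36]

rates = mutation_rates()
# Assumed full grid over (mutation rate, population size, CLB count).
grid = list(product(rates, POPULATION_SIZES, CLB_COUNTS))

print(len(rates))  # 25
print(len(grid))   # 375
```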
10
FPGA Genetic Representations
- Chromosome goals
  - Allow all possible LUT configurations
  - Allow all possible CLB interconnections given constraints of routing support
  - Disallow illegal FPGA configurations and non-coding introns (junk DNA)
  - Facilitate crossover operator
- Bitstring representation is natural choice, though may not scale well (investigating generative reps)
- Representation shown here is sample specific to Xilinx Virtex FPGA
- Chromosome layout (diagram): CLB 0, CLB 1, ..., CLB n, each holding LUT 0, LUT 1, LUT 2, LUT 3
11
Experiment #1 Results
Generations = 200, pop size = 50, CLBs = 9 (MR = mutation rate, F = fitness)

MR    F    MR    F    MR    F    MR    F    MR    F
.001  55   .021  54   .041  53   .061  55   .081  53
.005  57   .025  53   .045  52   .065  55   .085  52
.009  54   .029  54   .049  54   .069  54   .089  59
.013  54   .033  56   .053  52   .073  53   .093  55
.017  54   .037  53   .057  52   .077  54   .097  53
12
Perfect Individuals
- Parameters used in evolving perfect individuals (fitness of 60)
  - Maximum Number of Generations: 1000
  - Mutation Rate: .002
  - Population Size: 50
  - Number of CLBs: 9
- These create a diverse initial population for TMR style voting in Experiment #2
13
…Perfect Individuals

Config.  Generations  AND   OR    NOR   XOR   NAND
1        150          11    8     4     5     8
2        382          8     10    3     8     7
3        473          13    8     5     7     3
4        582          7     6     5     10    8
5        881          8     7     10    6     5
Avg.     493.6        9.4   7.8   5.4   7.2   6.2
14
Three-plex Experiments
- Six injected stuck-at faults on LUT inputs
  - Resulting fitness of perfect individuals: 38, 40, 47
- Parameters
  - Number of Generations: 400
  - Mutation Rate: .089
  - Population Size: 50
  - Number of CLBs: 9
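To illustrate what a stuck-at fault on a LUT input does, the sketch below models a LUT as a truth-table string and forces one input line to a constant. The 2-input LUT size, the helper names, and the MSB-first input ordering are all assumptions for illustration; the slides do not describe how the simulator encodes LUTs or faults.

```python
def lut_read(truth_table: str, inputs: str) -> str:
    """Look up the output bit; inputs index the table, MSB first."""
    return truth_table[int(inputs, 2)]


def stuck_at(inputs: str, line: int, value: str) -> str:
    """Force one input line to a constant '0' or '1'."""
    return inputs[:line] + value + inputs[line + 1:]


def count_wrong_outputs(truth_table: str, line: int, value: str) -> int:
    """Count input patterns whose output changes under the stuck-at fault."""
    width = (len(truth_table) - 1).bit_length()
    wrong = 0
    for n in range(len(truth_table)):
        inputs = format(n, f"0{width}b")
        faulty = lut_read(truth_table, stuck_at(inputs, line, value))
        if faulty != lut_read(truth_table, inputs):
            wrong += 1
    return wrong


AND2 = "0001"  # hypothetical 2-input AND LUT: only inputs "11" give 1
print(count_wrong_outputs(AND2, 0, "1"))  # 1
```

In this toy case the faulty AND gate passes its second input straight through, so only the input pattern "01" produces a wrong output.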
15
Experiment #2
- Simulating repair
- Implement voting schemes
  - Injected stuck-at faults
  - Implemented 3-plex and 5-plex voting schemes
  - Chose GA/FPGA parameters according to Experiment #1
  - For each voting run, graphed the fitness of best fit individual vs. number of generations for voting elements and system
  - Repeated 3-plex experiment with a single element (no voting) for 3X number of generations
- Voting arrangement (diagram): FPGA input data feeds the configurations from GA #1, GA #2, and GA #3; their outputs go to the voter, which produces the FPGA output data
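A majority voter over an odd number of configurations can be sketched as below. Voting independently on each output bit is an assumption; the slides state only that the voter takes a majority output. The function name and example bitstrings are hypothetical.

```python
def majority_vote(outputs):
    """Bitwise majority over an odd number of equal-length output strings."""
    n = len(outputs)
    width = len(outputs[0])
    voted = []
    for i in range(width):
        ones = sum(1 for out in outputs if out[i] == "1")
        voted.append("1" if 2 * ones > n else "0")
    return "".join(voted)


# Three imperfect elements, each wrong in a different bit of "001010".
print(majority_vote(["001011", "101010", "001000"]))  # 001010
```

The example shows why a voted system can score higher than any of its elements: as long as the elements' errors do not coincide on the same output bit, the majority masks them.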
16
Three-plex Voting Results
Partial Repair: Max Fitness = 58 at generation 68
17
Complete Repair achieved at generation 302
18
Three-plex Voting Results
Complete Repair at generation 33
19
Three-plex Voting Results
Perfect fitness is temporarily reached at generation 17
20
Three-plex Voting Summary
GA columns show voting fitness/final fitness. Final vote fitness is taken at generation 400.

Rank  Highest Voting   Earliest Generation  GA #1  GA #2  GA #3  Final Vote
      Fitness Reached  of Highest Fitness                        Fitness
10    56               2                    56/56  48/54  56/56  56
9     57               2                    48/54  55/55  52/55  56
8     58               68                   56/57  55/56  55/57  58
7     60               302                  55/55  56/56  58/58  60
6     60               261                  58/58  56/56  58/58  60
5     60               179                  57/57  56/56  55/55  60
4     60               33                   51/56  53/58  59/59  60
3     60               17                   58/58  56/56  52/54  59
2     60               3                    51/56  52/56  54/56  60
1     60               2                    55/55  52/54  56/56  59
21
Compare: Single GA
- Run 1200 generations
  - Total GA computation equivalent to a 3-plex run for 400 generations
- 3 runs
  - Max fitness of 56 at 934 generations
  - Max fitness of 56 at 852 generations
  - Max fitness of 57 at 274 generations
- N-plex Voting advantageous
  - Improved the likelihood of obtaining a complete repair significantly with fewer total number of circuit evaluations
  - $n \times g_v \ll g_o$ for n-plex voting with $g_v$ voting generations vs. $g_o$ evolutionary generations without voting
22
Experiment #3: 5-plex
- Six injected stuck-at faults on LUT inputs
  - Resulting fitness of perfect individuals: 38, 40, 47
- Parameters
  - Number of Generations: 300
  - Mutation Rate: .089
  - Population Size: 50
  - Number of CLBs: 9
23
Five-plex Voting Results
Complete Repair at generation 48
24
Five-plex Voting Results
Complete Repair fitness at generation 34
25
Five-plex Voting Results
Perfect fitness at generation 2
26
Five-plex Voting Summary
GA columns show voting fitness/final fitness. Final vote fitness is taken at generation 300. A "--" marks a cell whose value is missing from the extracted slide text.

Rank  Highest Voting   Earliest Generation  GA #1  GA #2  GA #3  GA #4  GA #5  Final Vote
      Fitness Reached  of Highest Fitness                                      Fitness
10    59               68                   55/58  54/55  55/55  54/55  53/55  57
9     60               154                  54/56  56/56  52/52  56/58  55/55  60
8     60               108                  56/56  54/54  55/55  --     53/53  60
7     60               55                   53/55  56/56  57/57  56/56  --     60
6     60               48                   55/55  56/56  53/56  55/57  59/59  60
5     60               34                   56/56  55/58  51/55  57/57  55/55  60
4     60               27                   56/56  53/55  52/57  55/55  52/56  58
3     60               4                    56/56  53/55  51/56  56/56  52/56  60
2     60               3                    49/54  51/55  55/55  53/57  52/56  55
1     60               2                    50/54  56/56  54/54  50/55  54/56  60
27
3-plex vs. 5-plex
- 3-plex scheme
  - 7 out of 10 runs reached perfect fitness
  - Average of 113.86 generations to do so
  - 5 out of 10 runs exhibited perfect fitness upon completion (400 generations)
- 5-plex scheme
  - 9 out of 10 reached perfect fitness
  - Average of 48.33 generations needed
  - 7 out of 10 exhibited perfect fitness at completion (300 generations)
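The figures on this slide can be recomputed from the two voting summary slides. The sketch below transcribes, for each run, the highest voting fitness, the earliest generation at which it was reached, and the final vote fitness. The tuple layout and the `summarize` helper are hypothetical, introduced only for this check.

```python
from statistics import mean

PERFECT = 60

# (highest voting fitness, earliest generation reached, final vote fitness)
THREE_PLEX = [
    (56, 2, 56), (57, 2, 56), (58, 68, 58), (60, 302, 60), (60, 261, 60),
    (60, 179, 60), (60, 33, 60), (60, 17, 59), (60, 3, 60), (60, 2, 59),
]
FIVE_PLEX = [
    (59, 68, 57), (60, 154, 60), (60, 108, 60), (60, 55, 60), (60, 48, 60),
    (60, 34, 60), (60, 27, 58), (60, 4, 60), (60, 3, 55), (60, 2, 60),
]


def summarize(runs):
    """Return (runs reaching 60, mean generations to reach it, runs ending at 60)."""
    reached = [gen for best, gen, _ in runs if best == PERFECT]
    held = sum(1 for _, _, final in runs if final == PERFECT)
    return len(reached), round(mean(reached), 2), held


print(summarize(THREE_PLEX))  # (7, 113.86, 5)
print(summarize(FIVE_PLEX))   # (9, 48.33, 7)
```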
28
Conclusion
- Autonomous FPGA Repair Strategy combining dynamic redundancy with online evolution
- TMR Style Voting beneficial in presence of partial refurbishment
  - Complete repair can be quickly obtained with three/five imperfectly repaired individuals
  - Improvement of fitness in an individual GA can outperform voting fitness
- Stabilization of a complete repair is more important than how quickly it is achieved
  - In all six runs where a perfect fitness was obtained after 50 generations, the fitness was maintained
  - Only 5 of 10 runs which obtained a perfect fitness before 50 generations maintained that fitness for remainder of run
29
- Development Board to Self-Contained FPGA
- Qualitative Analysis of CRR model
  - Number of iterations and completeness of regeneration repair
  - Percentage of time the device remains online despite physical resource fault (availability)
- Hardware Resource Management
  - Optimization of hardware profile for Xilinx Virtex II Pro
- Field Testing on SRAM-based FPGA in a Cubesat mission
30
For further info … EH Website http://cal.ucf.edu
31
Backup Slides On following pages …
32
Fault Recovery Characteristics of Selected Approaches
Previous Work on Fault Recovery
Normalized Power Consumption (Energy per Operation):
- n-plex solution using n redundant devices
- Reconfiguration cost r
- Gate-Level redundancy g
- Updated with scan rate s on c CLBs
33
Previous Work - Tool Level

Approach                         FPGA Supported         On-chip  Bit Stream  System Coupling  Potential Limitations
                                                        System   Reuse       Degree
Moraes, Mesquita, Palma, Moller  Virtex XCV300 devices  No       N           Loose            Lack of Area Relocation Capability
Raghavan, Sutton                 Xilinx Virtex devices  No       N           Loose            Cumbersome CAD flow
Blodget, McMillan                Virtex II devices      Partial  Y           Medium           Limited hardware speed and capacity. Lack of information for bit stream reuse