
1 Dynamic Voting Schemes to Enhance Evolutionary Repair in Reconfigurable Logic Devices
C. Milliord, C. A. Sharma, and R. F. DeMara
University of Central Florida
29 September 2005

2 Technical Objective: Autonomous FPGA Regeneration
| Characteristic | Redundancy | Regeneration |
|---|---|---|
| Overhead from Unutilized Spares (weight, size, power) | increases with amount of spare capacity | weakly-related to number recovery capacity |
| Granularity of Fault Coverage (resolution where fault handled) | restricted at design-time | variable at recovery-time |
| Fault-Resolution Latency (availability via downtime required to handle fault) | based on time required to select spare resource | based on time required to find suitable recovery |
| Quality of Repair (likelihood and completeness) | determined by adequacy of spares available (?) | affected by multiple characteristics (+ or -) |
| Autonomous Operation (recover without outside intervention) | yes | yes |
| Everyday example | spare tire | can of fix-a-flat |
Increased availability without pre-configured spares …
NASA Moon, Mars, and Beyond: Realize 10’s years service life ???
Reconfiguration allows new fault-handling paradigm

3 Problem Statement
FPGAs in Space
- Harsh conditions lead to faults in hardware
  - Radiation
  - Extreme temperatures
  - Mechanical stress
  - Long mission duration
Experiment with several combinations of GAs and voting schemes
- Population of FPGA configurations that are physically distinct, but functionally equivalent
- Voting involves 3 or more configurations, with a majority output
Hypothesis
- The added space and computation associated with a voting scheme is justified by a quicker and more complete repair

4 EHW Environments
Evolvable Hardware (EHW) environments enable experimental methods to research soft computing intelligent search techniques
EHW operates by repetitive reprogramming of real-world physical devices using an iterative refinement process
Two modes of Evolvable Hardware
- Extrinsic Evolution: Genetic Algorithm with a software model (simulation in the loop); device “design-time” refinement (Done? Build it)
- Intrinsic Evolution: Genetic Algorithm with hardware in the loop; device “run-time” refinement; new approach to Autonomous Repair of failed devices
Application
- Stardust Satellite: >100 FPGAs onboard
- Hostile environment: radiation, thermal stress
- How to achieve reliability to avoid mission failure?

5 Genetic Algorithms (GAs)
GA cycle (diagram): start -> population of candidate solutions -> evaluate fitness of individuals (fitness function) -> selection of parents -> crossover and mutation of parents into offspring -> replacement -> repeat until goal reached
Initial population of configurations
- Functionally equivalent, physically distinct
Fitness level
- Based on number of correct outputs for all possible inputs
Creating a new generation
- Mutation: “100011101” -> “101011101”
- Crossover: “101100” & “011110” -> “101110”
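The two operators on this slide can be sketched on plain bit strings. This is a minimal illustration, not the authors' C++ simulator: the function names, the choice of cut point, and the per-bit mutation helper are assumptions made for the example. A cut after the third bit is one of several cut points that reproduce the slide's crossover result.

```python
import random


def flip_bit(chromosome, index):
    """Return a copy of the bit string with the bit at `index` inverted."""
    flipped = "1" if chromosome[index] == "0" else "0"
    return chromosome[:index] + flipped + chromosome[index + 1:]


def single_point_crossover(parent_a, parent_b, point):
    """Take the head of parent_a up to `point` and the tail of parent_b."""
    return parent_a[:point] + parent_b[point:]


def mutate(chromosome, rate, rng):
    """Flip each bit independently with probability `rate` (assumed scheme)."""
    return "".join(
        ("1" if bit == "0" else "0") if rng.random() < rate else bit
        for bit in chromosome
    )


# The slide's examples: one flipped bit, and a crossover of two parents.
mutated = flip_bit("100011101", 2)
child = single_point_crossover("101100", "011110", 3)

# Random mutation at the rate later used in the voting experiments (0.089).
randomly_mutated = mutate("100011101", 0.089, random.Random(0))
```

With a mutation rate of 0 the helper returns its input unchanged, and with a rate of 1 it inverts every bit, which is a quick way to sanity-check the operator.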

6 Previous Work
[1] Re-routing scheme replaces faulty CLB
- Time-saving method with low overhead
[2] TMR fault-detection
- On-line approach
- High overhead and power consumption
[3] On-line technique using a BIST
- Limited power consumption
- Spare resources
[4] GA repair of integer multiplier
- Voting system may not always outperform individual with the highest fitness
- Initialized GA with copies of one hand-designed configuration
References
[1] Xu, J., Si, P., Huang, W., and Lombardi, F., “A novel fault tolerant approach for SRAM-based FPGAs”, Proceedings of the Pacific Rim Int’l Symposium, Dec. 1999, pp. 40-44.
[2] Li, Y., Li, D., and Wang, Z., “A new approach to detect-mitigate-correct radiation-induced faults for SRAM-based FPGAs in aerospace application”, Proceedings of the IEEE National Aerospace and Electronics Conference, Oct. 2000, pp. 588-594.
[3] Abramovici, M., Emmert, J., and Stroud, C., “Roving STARs: an integrated approach to on-line testing, diagnosis, and fault tolerance for FPGAs in adaptive computing systems”, Proceedings of The Third NASA DoD Workshop, July 2001, pp. 73-92.
[4] Vigander, S., “Evolutionary fault repair of electronics in space applications”, Dissertation, University of Sussex, Brighton, UK, 2001.

7 Experimental Setups
C++ program that simulates FPGA circuit design/repair
- Input files
  - GA parameters
  - Logic function truth table (input/output pairs)
  - FPGA parameters
  - Configuration properties of perfect individuals (to simulate repair in voting experiments)
- Output files
  - Configuration properties at selected generations
  - Data showing fitness level at each generation (to produce graphs)
Loosely Coupled (LC) Virtex System
- PC workstation running Xilinx EDK and ISE with AVNET V2Pro PCI card
- (SoC) version using PowerPC embedded in FPGA fabric now operational … results reported on previous environment

8 Experimental Inputs
GA parameters
- Population size
- Offspring population size
- Mutation rate
- Tournament size (2)
- Maximum number of generations
FPGA parameters
- Number of inputs (6)
- Number of outputs (6)
- Number of CLBs
- Number of look-up tables (LUTs) per CLB (SW only)
- Number of LUT select lines (SW only)
Truth table (input/output pairs)
| I1 I2 I3 I4 I5 I6 | O1 O2 O3 O4 O5 O6 |
|---|---|
| 000000 | 000000 |
| 010000 | 000000 |
| 010001 | 000010 |
| 010010 | 000100 |
| 010011 | 000110 |
| 010100 | 001000 |
| 010101 | 001010 |
| 010110 | 001100 |
| 010111 | 001110 |
| 011000 | 000000 |
Ideal Fitness = 60
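A fitness of 60 matches ten input/output pairs times six output bits, so one plausible reading is that fitness counts correct output bits over the listed pairs. The sketch below follows that reading, which is an assumption: the slide does not spell out the formula. The ten rows also happen to be consistent with a 3-bit by 3-bit multiplier (the first three input bits times the last three). That is an inference from the data, used here only to build a known-good reference circuit.

```python
# The ten input/output pairs listed on the slide.
IO_PAIRS = [
    ("000000", "000000"),
    ("010000", "000000"),
    ("010001", "000010"),
    ("010010", "000100"),
    ("010011", "000110"),
    ("010100", "001000"),
    ("010101", "001010"),
    ("010110", "001100"),
    ("010111", "001110"),
    ("011000", "000000"),
]


def fitness(circuit, io_pairs=IO_PAIRS):
    """Count output bits that match the expected value over all pairs."""
    score = 0
    for inputs, expected in io_pairs:
        actual = circuit(inputs)
        score += sum(a == e for a, e in zip(actual, expected))
    return score


def multiplier_3x3(inputs):
    """Reference circuit (assumed): multiply the two 3-bit halves of the input."""
    a = int(inputs[:3], 2)
    b = int(inputs[3:], 2)
    return format(a * b, "06b")


def all_zero(inputs):
    """A badly broken circuit whose outputs are all stuck at 0."""
    return "000000"


ideal = fitness(multiplier_3x3)
broken = fitness(all_zero)
```

Even the all-zero circuit scores well under this measure, because most expected output bits are 0. This may help explain why the faulted individuals in the later slides still start with fitness values in the high 30s and 40s.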

9 Experiment #1
Circuit evolution – no repair
Maximize GA performance before voting (tweak parameters)
- Used 200 for max number of generations
- Varied the mutation rate from 0.001 to 0.097 with a step of 0.004
- Population sizes of 15, 40, and 50
- 6, 9, 12, 16, and 36 for number of CLBs
Evolve several perfect configurations
- Repeated the most successful runs for 1000 generations
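The parameter values named on this slide can be written out as a small grid. The variable names are illustrative. Whether every combination was actually run is not stated on the slide, so the sketch only enumerates the values and does not assume a full factorial sweep.

```python
# Mutation rates from 0.001 to 0.097 in steps of 0.004 (25 values).
# round() keeps the values free of floating-point drift.
MUTATION_RATES = [round(0.001 + 0.004 * step, 3) for step in range(25)]

# Population sizes and CLB counts listed on the slide.
POPULATION_SIZES = [15, 40, 50]
CLB_COUNTS = [6, 9, 12, 16, 36]

MAX_GENERATIONS = 200
```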

10 FPGA Genetic Representations
Chromosome Goals:
- Allow all possible LUT configurations
- Allow all possible CLB interconnections given constraints of routing support
- Disallow illegal FPGA configurations and non-coding introns (junk DNA)
- Facilitate crossover operator
Bitstring representation is natural choice, though may not scale well (investigating generative reps)
Representation shown here is sample specific to Xilinx Virtex FPGA
Chromosome layout (diagram): CLB 0 [LUT 0, LUT 1, LUT 2, LUT 3] | CLB 1 [LUT 0, LUT 1, LUT 2, LUT 3] | … | CLB n [LUT 0, LUT 1, LUT 2, LUT 3]
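A bit-string chromosome with this CLB/LUT layout could be indexed as sketched below. The four LUTs per CLB come from the slide's diagram. The four select lines per LUT, the field ordering, and the omission of routing bits are assumptions made to keep the example small, and are not the authors' actual encoding.

```python
LUTS_PER_CLB = 4               # LUT 0..LUT 3, as in the slide's diagram
SELECT_LINES = 4               # assumption: 4-input LUTs
LUT_BITS = 2 ** SELECT_LINES   # one truth-table bit per input combination


def lut_slice(chromosome, clb, lut):
    """Return the truth-table bits of LUT `lut` inside CLB `clb`."""
    start = (clb * LUTS_PER_CLB + lut) * LUT_BITS
    return chromosome[start:start + LUT_BITS]


def eval_lut(table, select_bits):
    """Look up the LUT output for the given select-line values."""
    return table[int(select_bits, 2)]


# Build a 2-CLB chromosome that is all zeros except CLB 1 / LUT 2,
# which holds a 4-input AND (only the last truth-table entry is 1).
AND4 = "0" * 15 + "1"
luts = ["0" * LUT_BITS] * (2 * LUTS_PER_CLB)
luts[1 * LUTS_PER_CLB + 2] = AND4
chromosome = "".join(luts)
```

Because every LUT occupies a fixed-width field, any cut point that falls on a field boundary keeps whole LUTs intact. This is one way such a layout can facilitate the crossover operator.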

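11 Experiment #1 Results
Generations = 200, pop size = 50, CLBs = 9 (MR = mutation rate, F = fitness)
| MR | F | MR | F | MR | F | MR | F | MR | F |
|---|---|---|---|---|---|---|---|---|---|
| .001 | 55 | .021 | 54 | .041 | 53 | .061 | 55 | .081 | 53 |
| .005 | 57 | .025 | 53 | .045 | 52 | .065 | 55 | .085 | 52 |
| .009 | 54 | .029 | 54 | .049 | 54 | .069 | 54 | .089 | 59 |
| .013 | 54 | .033 | 56 | .053 | 52 | .073 | 53 | .093 | 55 |
| .017 | 54 | .037 | 53 | .057 | 52 | .077 | 54 | .097 | 53 |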

12 Perfect Individuals
Parameters used in evolving perfect individuals (fitness of 60)
- Maximum Number of Generations: 1000
- Mutation Rate: 0.002
- Population Size: 50
- Number of CLBs: 9
These create a diverse initial population for TMR style voting in Experiment #2

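13 …Perfect Individuals
| Config. | Generations | AND | OR | NOR | XOR | NAND |
|---|---|---|---|---|---|---|
| 1 | 150 | 11 | 8 | 4 | 5 | 8 |
| 2 | 382 | 8 | 10 | 3 | 8 | 7 |
| 3 | 473 | 13 | 8 | 5 | 7 | 3 |
| 4 | 582 | 7 | 6 | 5 | 10 | 8 |
| 5 | 881 | 8 | 7 | 10 | 6 | 5 |
| Avg. | 493.6 | 9.4 | 7.8 | 5.4 | 7.2 | 6.2 |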

14 Three-plex Experiments
Six injected stuck-at faults on LUT inputs
- Resulting fitness of perfect individuals: 38, 40, 47
Parameters
- Number of Generations: 400
- Mutation Rate: 0.089
- Population Size: 50
- Number of CLBs: 9
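A stuck-at fault on a LUT input forces that select line to a constant 0 or 1 no matter what signal drives it. The sketch below models this on a truth-table LUT. The `stuck_at` map and the function name are hypothetical, and the slide does not describe how the simulator injects its six faults.

```python
def eval_lut(table, select_bits, stuck_at=None):
    """Look up a LUT output, forcing any stuck select lines to a fixed value.

    table       -- truth table as a bit string of length 2**k
    select_bits -- the k input bits driven onto the LUT
    stuck_at    -- hypothetical fault map {input_index: "0" or "1"}
    """
    bits = list(select_bits)
    for index, value in (stuck_at or {}).items():
        bits[index] = value  # the fault overrides the driven signal
    return table[int("".join(bits), 2)]


AND4 = "0" * 15 + "1"  # 4-input AND: only inputs "1111" produce 1

healthy = eval_lut(AND4, "1111")
stuck_low = eval_lut(AND4, "1111", stuck_at={0: "0"})
stuck_high = eval_lut(AND4, "0111", stuck_at={0: "1"})
```

A stuck input effectively halves the reachable part of the truth table. That is why a repair can succeed by rewriting the LUT contents or by routing around the faulty line rather than by replacing the resource.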

15 Experiment #2
Simulating repair
Implement voting schemes
- Injected stuck-at faults
- Implemented 3-plex and 5-plex voting schemes
- Chose GA/FPGA parameters according to Experiment #1
- For each voting run, graphed the fitness of best fit individual vs. number of generations for voting elements and system
- Repeated 3-plex experiment with a single element (no voting) for 3X number of generations
Voting architecture (diagram): FPGA input data -> configurations from GA #1, GA #2, and GA #3 -> FPGA output data -> voter -> output
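The voter stage can be sketched as a bitwise majority over the outputs of the individual configurations. Bitwise voting and the tiny three-row truth table are assumptions for illustration. The example shows three individuals that are each imperfect in a different place voting their way to a perfect system output.

```python
def majority_vote(outputs):
    """Bitwise majority over an odd number of equal-length bit strings."""
    ones_needed = len(outputs) // 2 + 1
    return "".join(
        "1" if column.count("1") >= ones_needed else "0"
        for column in zip(*outputs)
    )


def fitness(rows, expected_rows):
    """Count output bits that match the expected bits across all rows."""
    return sum(
        a == e
        for row, expected in zip(rows, expected_rows)
        for a, e in zip(row, expected)
    )


EXPECTED = ["000010", "000100", "000110"]

# Each hypothetical individual has exactly one wrong bit, in a different row.
GA1 = ["100010", "000100", "000110"]
GA2 = ["000010", "000101", "000110"]
GA3 = ["000010", "000100", "010110"]

voted = [majority_vote(rows) for rows in zip(GA1, GA2, GA3)]

individual_scores = [fitness(ga, EXPECTED) for ga in (GA1, GA2, GA3)]
system_score = fitness(voted, EXPECTED)
```

The vote only masks an error when a majority of elements are correct on that bit. If two of the three individuals were wrong on the same bit, the system output would be wrong there as well.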

16 Three-plex Voting Results
Partial Repair: Max Fitness = 58 at generation 68

17 Complete Repair achieved at generation 302

18 Three-plex Voting Results Complete Repair at generation 33

19 Three-plex Voting Results Perfect fitness is temporarily reached at generation 17

20 Three-plex Voting Summary
GA columns show voting fitness / final fitness.
| Rank | Highest Voting Fitness Reached | Earliest Generation of Highest Fitness | GA #1 | GA #2 | GA #3 | Final Vote Fitness (Generation 400) |
|---|---|---|---|---|---|---|
| 10 | 56 | 2 | 56/56 | 48/54 | 56/56 | 56 |
| 9 | 57 | 2 | 48/54 | 55/55 | 52/55 | 56 |
| 8 | 58 | 68 | 56/57 | 55/56 | 55/57 | 58 |
| 7 | 60 | 302 | 55/55 | 56/56 | 58/58 | 60 |
| 6 | 60 | 261 | 58/58 | 56/56 | 58/58 | 60 |
| 5 | 60 | 179 | 57/57 | 56/56 | 55/55 | 60 |
| 4 | 60 | 33 | 51/56 | 53/58 | 59/59 | 60 |
| 3 | 60 | 17 | 58/58 | 56/56 | 52/54 | 59 |
| 2 | 60 | 3 | 51/56 | 52/56 | 54/56 | 60 |
| 1 | 60 | 2 | 55/55 | 52/54 | 56/56 | 59 |

21 Compare: Single GA
Run 1200 generations
- Total GA computation equivalent to a 3-plex run for 400 generations
3 runs
- Max fitness of 56 at 934 generations
- Max fitness of 56 at 852 generations
- Max fitness of 57 at 274 generations
N-plex Voting advantageous
- Significantly improved the likelihood of obtaining a complete repair, with fewer total circuit evaluations
- n × g_v << g_o for n-plex voting with g_v voting generations vs. g_o evolutionary generations without voting
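The cost comparison can be checked with simple arithmetic. The sketch below takes the generations at which the seven successful 3-plex runs first reached a voting fitness of 60 (from the Three-plex Voting Summary slide) and compares n × g_v against the 1200-generation budget of the single-GA runs. Counting cost in generations instead of individual circuit evaluations is a simplifying assumption.

```python
N_PLEX = 3
SINGLE_GA_GENERATIONS = 1200  # g_o budget; no single-GA run exceeded fitness 57

# Earliest generation of perfect voting fitness, per successful 3-plex run.
VOTING_GENERATIONS = [302, 261, 179, 33, 17, 3, 2]

# Total GA generations spent across all voting elements: n * g_v.
voting_costs = [N_PLEX * g_v for g_v in VOTING_GENERATIONS]

cheaper_than_single = [cost < SINGLE_GA_GENERATIONS for cost in voting_costs]
```

Every successful run stays under the single-GA budget. The gap is widest for the fast repairs (for example 3 × 33 = 99 generations) and narrowest for the slowest one (3 × 302 = 906).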

22 Experiment #3: 5-plex
Six injected stuck-at faults on LUT inputs
- Resulting fitness of perfect individuals: 38, 40, 47
Parameters
- Number of Generations: 300
- Mutation Rate: 0.089
- Population Size: 50
- Number of CLBs: 9

23 Five-plex Voting Results Complete Repair at generation 48

24 Five-plex Voting Results Complete Repair at generation 34

25 Five-plex Voting Results Perfect fitness at generation 2

26 Five-plex Voting Summary
GA columns show voting fitness / final fitness. Empty cells are missing from the transcript.
| Rank | Highest Voting Fitness Reached | Earliest Generation of Highest Fitness | GA #1 | GA #2 | GA #3 | GA #4 | GA #5 | Final Vote Fitness (Generation 300) |
|---|---|---|---|---|---|---|---|---|
| 10 | 59 | 68 | 55/58 | 54/55 | 55/55 | 54/55 | 53/55 | 57 |
| 9 | 60 | 154 | 54/56 | 56/56 | 52/52 | 56/58 | 55/55 | 60 |
| 8 | 60 | 108 | 56/56 | 54/54 | 55/55 | | 53/53 | 60 |
| 7 | 60 | 55 | 53/55 | 56/56 | 57/57 | 56/56 | | 60 |
| 6 | 60 | 48 | 55/55 | 56/56 | 53/56 | 55/57 | 59/59 | 60 |
| 5 | 60 | 34 | 56/56 | 55/58 | 51/55 | 57/57 | 55/55 | 60 |
| 4 | 60 | 27 | 56/56 | 53/55 | 52/57 | 55/55 | 52/56 | 58 |
| 3 | 60 | 4 | 56/56 | 53/55 | 51/56 | 56/56 | 52/56 | 60 |
| 2 | 60 | 3 | 49/54 | 51/55 | 55/55 | 53/57 | 52/56 | 55 |
| 1 | 60 | 2 | 50/54 | 56/56 | 54/54 | 50/55 | 54/56 | 60 |

27 3-plex vs. 5-plex
3-plex scheme
- 7 out of 10 runs reached perfect fitness
- Average of 113.86 generations to do so
- 5 out of 10 runs exhibited perfect fitness upon completion (400 generations)
5-plex scheme
- 9 out of 10 reached perfect fitness
- Average of 48.33 generations needed
- 7 out of 10 exhibited perfect fitness at completion (300 generations)
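These summary figures can be re-derived from the two voting summary slides. The sketch below encodes each run as (highest voting fitness, earliest generation of that fitness, final vote fitness), read from those tables. The tuple layout and function name are illustrative.

```python
PERFECT = 60

# (highest voting fitness, earliest generation reached, final vote fitness)
THREE_PLEX = [
    (56, 2, 56), (57, 2, 56), (58, 68, 58), (60, 302, 60), (60, 261, 60),
    (60, 179, 60), (60, 33, 60), (60, 17, 59), (60, 3, 60), (60, 2, 59),
]
FIVE_PLEX = [
    (59, 68, 57), (60, 154, 60), (60, 108, 60), (60, 55, 60), (60, 48, 60),
    (60, 34, 60), (60, 27, 58), (60, 4, 60), (60, 3, 55), (60, 2, 60),
]


def summarize(runs):
    """Return (runs reaching 60, mean generations to reach it, runs ending at 60)."""
    generations = [gen for highest, gen, _ in runs if highest == PERFECT]
    ended_perfect = sum(1 for _, _, final in runs if final == PERFECT)
    mean_generations = round(sum(generations) / len(generations), 2)
    return len(generations), mean_generations, ended_perfect


three_plex_summary = summarize(THREE_PLEX)
five_plex_summary = summarize(FIVE_PLEX)
```

The derived values agree with the slide: 7 runs, 113.86 generations on average, and 5 perfect at completion for 3-plex, against 9 runs, 48.33 generations, and 7 perfect at completion for 5-plex.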

28 Conclusion
Autonomous FPGA Repair Strategy combining dynamic redundancy with online evolution
TMR Style Voting beneficial in presence of partial refurbishment
- Complete repair can be quickly obtained with three/five imperfectly repaired individuals
Improvement of fitness in an individual GA can outperform voting fitness
Stabilization of a complete repair is more important than how quickly it is achieved
- In all six runs where a perfect fitness was obtained after 50 generations, the fitness was maintained
- Only 5 of 10 runs which obtained a perfect fitness before 50 generations maintained that fitness for remainder of run

29 Development Board to Self-Contained FPGA
Qualitative Analysis of CRR model
- Number of iterations and completeness of regeneration repair
- Percentage of time the device remains online despite physical resource fault (availability)
Hardware Resource Management
- Optimization of hardware profile for Xilinx Virtex II Pro
Field Testing on SRAM-based FPGA in a Cubesat mission

30 For further info … EH Website http://cal.ucf.edu

31 Backup Slides On following pages …

32 Previous Work on Fault Recovery
Fault Recovery Characteristics of Selected Approaches
Normalized Power Consumption (Energy per Operation):
- n-plex solution using n redundant devices
- Reconfiguration cost r
- Gate-level redundancy g
- Updated with scan rate s on c CLBs

33 Previous Work - Tool Level Approach
| Approach | FPGA Supported | On-chip System | Bit Stream Reuse | System Coupling Degree | Potential Limitations |
|---|---|---|---|---|---|
| Moraes, Mesquita, Palma, Moller | Virtex XCV300 devices | No | N | Loose | Lack of area relocation capability |
| Raghavan, Sutton | Xilinx Virtex devices | No | N | Loose | Cumbersome CAD flow |
| Blodget, McMillan | Virtex II devices | Partial | Y | Medium | Limited hardware speed and capacity; lack of information for bit stream reuse |

