Aerospace Conference ‘12 A Framework to Analyze, Compare, and Optimize High-Performance, On-Board Processing Systems Nicholas Wulf Alan D. George Ann Gordon-Ross.

Presentation transcript:

Slide 1: Aerospace Conference ’12
A Framework to Analyze, Compare, and Optimize High-Performance, On-Board Processing Systems
Nicholas Wulf, Alan D. George, Ann Gordon-Ross
3/5/2012

Slide 2: On-Board Proc. Framework
Goal: Provide theory for determining Pareto-optimal device & fault-tolerant (FT) strategy for on-board processing systems
- Focus on small satellites, UAVs, smart munitions, AUVs
- Generalized comparison framework deters ad hoc design methodology, leading to Pareto-optimal designs
Approach: Divide exploration space into its key components for study
- Device Set: Includes CPUs, DSPs, FPGAs, GPUs, & any Rad-Hard versions
- FT Strategies: Sacrifice resources for fault tolerance (may be app-specific)
- Mission Characteristics: Defines system's environmental conditions
- Application Set: Includes most common kernels (e.g., space apps.)
- Analysis: Design evaluation metrics calculated from other components
  - Is selected design acceptable?
  - How does it compare to others?
Figure: Device Set, FT Strategy Set, Mission Characteristics, and Application Kernel Set feed into Analysis

Slide 3: Calculating Computational Density
Computational Density (CD) used to analyze device performance
- Developed within CHREC at the University of Florida
- Applicable to wide range of architectures (e.g., CPU, DSP, GPU, FPGA, etc.)
- Produces results in terms of operations per second
  - Different CD result for each type of precision (e.g., bit, 8-bit int., 64-bit double)
  - CD also depends on type of operation (e.g., add, multiply, divide)
Standard objective methodology for calculating CD on FPGAs
- Generate and analyze compute core on FPGA
  - Perform twice for each core: once with and once without DSP resources
  - Core must match desired precision & operation
- Collect data on resources used and max freq.
- Use optimal packing algorithm to determine maximum number of cores on device
- CD equals number of cores times worst clock rate from the list of cores used
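The FPGA procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Core` and `Device` types, the two-resource model (LUTs and DSP blocks), and all numbers are hypothetical, and a brute-force search stands in for the "optimal packing algorithm", which the slides do not describe. It also assumes each core completes one operation per clock cycle, so MHz times cores gives millions of operations per second.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Core:
    luts: int        # logic resources used by one core (hypothetical unit)
    dsps: int        # DSP blocks used by one core
    fmax_mhz: float  # max clock frequency reported for this core


@dataclass(frozen=True)
class Device:
    luts: int
    dsps: int


def computational_density(device, dsp_core, logic_core):
    """Return (CD in Mops/s, number of DSP cores, number of logic cores).

    Assumes dsp_core uses at least one DSP block and logic_core uses
    at least one LUT and no DSP blocks.
    """
    best = (0.0, 0, 0)
    n_dsp = 0
    # Try every feasible count of DSP-based cores.
    while (n_dsp * dsp_core.luts <= device.luts
           and n_dsp * dsp_core.dsps <= device.dsps):
        free_luts = device.luts - n_dsp * dsp_core.luts
        # Either add no logic-only cores or as many as still fit.
        for n_logic in (0, free_luts // logic_core.luts):
            clocks = [c.fmax_mhz
                      for c, n in ((dsp_core, n_dsp), (logic_core, n_logic))
                      if n > 0]
            if clocks:
                # CD = number of cores x worst clock among the cores used.
                cd = (n_dsp + n_logic) * min(clocks)
                if cd > best[0]:
                    best = (cd, n_dsp, n_logic)
        n_dsp += 1
    return best


if __name__ == "__main__":
    fpga = Device(luts=1000, dsps=4)
    with_dsp = Core(luts=10, dsps=1, fmax_mhz=400.0)
    without_dsp = Core(luts=100, dsps=0, fmax_mhz=300.0)
    print(computational_density(fpga, with_dsp, without_dsp))
```

Mixing the two core variants lets the device fill leftover logic once the DSP blocks run out, at the cost of clocking every core at the slower variant's frequency, which is why the search compares both options for each DSP-core count.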

Slide 4: Framework Methodology
Designs evaluated according to two metrics
- Power: power consumed by device
  - Sensor data rate & application determine amount and type of required processing
  - Device utilization is the ratio of required processing to the corresponding device CD
  - FT strategy overhead increases utilization
  - Linear interpolation across the device's power range (static to max) by utilization gives power consumption
- Dependability: time between failures
  - Device radiation characteristics and environment determine device upset rate
  - Lower utilization means less of the device is vulnerable to upsets
  - FT strategies help to mitigate upsets
Figure (Power panel): Sensor and Device; Utilization: 24% and Utilization: 72%; 72% of device consumed; power scale from Static (0% utilization) through 50% to Max (100%)
Figure (Dependability panel): Device Cross-Section (SEE / Particle·m²) and Environmental Radiation (Particles / m²·ster·s) give Device Upset Rate: 10 upsets/day; only 72% of device is vulnerable; TMR protects against 99% of upsets; MTBF: days
MTBF = Mean Time Between Failures; TMR = Triple-Modular Redundancy; CD = Computational Density
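A rough sketch of how the two metrics could be computed follows. The function names and the multiplicative models are assumptions, not the authors' implementation: utilization is taken as required processing divided by device CD and scaled by (1 + FT overhead), power is interpolated linearly between static and max power, and the failure rate is taken as upset rate times the vulnerable fraction times the fraction of upsets the FT strategy does not cover. The transcript omits the slide's own MTBF value, and this sketch does not claim to reproduce it.

```python
def utilization(required_ops, device_cd, ft_overhead=0.0):
    """Fraction of the device in use; ft_overhead=2.0 means 200% extra."""
    return required_ops / device_cd * (1.0 + ft_overhead)


def power_watts(util, static_w, max_w):
    """Linear interpolation between static power (0%) and max power (100%)."""
    return static_w + util * (max_w - static_w)


def mtbf_days(upsets_per_day, util, ft_coverage=0.0):
    """Mean time between failures under the assumed multiplicative model.

    ft_coverage is the fraction of upsets the FT strategy protects against.
    """
    failures_per_day = upsets_per_day * util * (1.0 - ft_coverage)
    return 1.0 / failures_per_day


if __name__ == "__main__":
    # 24% base utilization with a 200% FT overhead, as in the slide's figure.
    u = utilization(required_ops=24.0, device_cd=100.0, ft_overhead=2.0)
    # The static and max power values here are hypothetical.
    print(u, power_watts(u, static_w=1.0, max_w=3.0))
    print(mtbf_days(upsets_per_day=10.0, util=u, ft_coverage=0.99))
```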

Slide 5: HSI Case Studies
Hyperspectral Imaging (HSI) enables detection of key materials within a scene
- Detects intensities of hundreds of wavelengths for each pixel forming an image cube
- Compares characteristic spectrum of pixels to spectral signatures of desired materials
- Marks location of identified materials within the scene
Image cube too large for transfer to processing station due to limited bandwidth
Solution: perform computation in-situ and only transfer final results
Two HSI missions chosen as case studies for on-board processing
- Hyperion on EO-1 satellite requiring 2.2× bit integer MAC/s
- AVIRIS on ER-2 aircraft requiring 348× bit integer MAC/s
Figure: Detect, Compare, Identify
MAC = Multiply Accumulate

Slide 6: Raw Results
Studied design combinations of 6 FPGAs and 3 fault-tolerant strategies
- Devices: 3 standard Virtexes, 2 low power Spartans, and 1 Rad-Hard SIRF
- Fault-tolerant strategies:
  - No fault tolerance (NFT): 0% overhead and no added reliability
  - Algorithm-based FT (ABFT): 10% overhead and significant added reliability
  - Triple-modular redundancy (TMR): 200% overhead and strongest reliability
Power consumption and dependability metrics calculated for each design
Table note: Values predicted with linear regression on feature size in underlined italics
FPGA = Field-Programmable Gate Array
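The design space described here is the cross product of devices and FT strategies. A minimal sketch of enumerating it is shown below; the device labels are hypothetical placeholders because the transcript does not name the specific parts, while the overhead fractions are the ones quoted on the slide.

```python
from itertools import product

# Placeholder labels: 3 Virtex parts, 2 Spartan parts, 1 Rad-Hard SIRF part.
DEVICES = ["virtex_1", "virtex_2", "virtex_3",
           "spartan_1", "spartan_2", "radhard_sirf"]

# Resource overhead of each FT strategy as a fraction (200% -> 2.00).
FT_OVERHEAD = {"NFT": 0.00, "ABFT": 0.10, "TMR": 2.00}


def enumerate_designs(devices=DEVICES, ft_overhead=FT_OVERHEAD):
    """Return every (device, strategy, overhead) combination."""
    return [(dev, ft, ovh)
            for dev, (ft, ovh) in product(devices, ft_overhead.items())]


if __name__ == "__main__":
    designs = enumerate_designs()
    print(len(designs))  # 6 devices x 3 strategies
    for design in designs[:3]:
        print(design)
```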

Slide 7: Optimal Design Selection
Framework determines Pareto-optimal designs based on metrics
- Non-Pareto-optimal designs can be completely ignored
Constraints on metrics enforced to remove invalid designs
- Removes unreliable and/or power-hungry designs based on mission
Out of 18 designs, only 5 designs selected as optimal for each mission
- Designers decide which design is best based on desired tradeoffs
Figure panels: EO-1 Hyperion Satellite; ER-2 AVIRIS Aircraft
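The selection step can be illustrated with a small Pareto filter over the two metrics: lower power is better and higher MTBF is better. This is a sketch, not the authors' code; the `Design` type, the constraint parameters, and the sample values are all hypothetical.

```python
from typing import NamedTuple, Optional


class Design(NamedTuple):
    name: str
    power_w: float    # lower is better
    mtbf_days: float  # higher is better


def dominates(a, b):
    """True if a is at least as good as b on both metrics and better on one."""
    return (a.power_w <= b.power_w and a.mtbf_days >= b.mtbf_days
            and (a.power_w < b.power_w or a.mtbf_days > b.mtbf_days))


def pareto_front(designs, max_power_w: Optional[float] = None,
                 min_mtbf_days: Optional[float] = None):
    """Drop designs violating mission constraints, then keep non-dominated ones."""
    valid = [d for d in designs
             if (max_power_w is None or d.power_w <= max_power_w)
             and (min_mtbf_days is None or d.mtbf_days >= min_mtbf_days)]
    return [d for d in valid if not any(dominates(o, d) for o in valid)]


if __name__ == "__main__":
    candidates = [
        Design("A", 1.0, 10.0),
        Design("B", 2.0, 50.0),
        Design("C", 2.5, 40.0),   # worse than B on both metrics
        Design("D", 5.0, 100.0),
        Design("E", 0.5, 1.0),
    ]
    for d in pareto_front(candidates, min_mtbf_days=5.0):
        print(d)
```

The pairwise comparison is quadratic in the number of designs, which is fine for a space of 18 candidates.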

Slide 8: Conclusions and Future Research
Conclusions
- The design of on-board processing systems can be complicated
  - Many constraints due to sparse and hostile environments
  - Many combinations of devices and FT-strategies to consider
- Our framework simplifies exploration space and highlights most promising designs, allowing designers to choose best design for their needs
Future Research
- Add CPUs to list of studied devices (from high performance to low power)
- Consider memory devices in addition to processors
  - Integral to on-board processing systems
  - Affects performance, power, and dependability
- Expand framework to include metrics such as lifetime (TID), cost, size/weight, maximum dose rate