Honeywell Defense & Space, Clearwater, FL

Slides:



Advertisements
Similar presentations
Technology Drivers Traditional HPC application drivers – OS noise, resource monitoring and management, memory footprint – Complexity of resources to be.
Advertisements

EEL6686 Guest Lecture February 25, 2014 A Framework to Analyze Processor Architectures for Next-Generation On-Board Space Computing Tyler M. Lovelly Ph.D.
Rad Hard By Software for Space Multicore Processing David Bueno, Eric Grobelny, Dave Campagna, Dave Kessler, and Matt Clark Honeywell Space Electronic.
Advanced Processing Systems Honeywell Proprietary1 12/04/2003 Honeywell UF HCS & Honeywell DSES Opportunities Presented by Advanced Processing Systems.
IBM RS6000/SP Overview Advanced IBM Unix computers series Multiple different configurations Available from entry level to high-end machines. POWER (1,2,3,4)
City University London
Chapter 13 Embedded Systems
Legion Worldwide virtual computer. About Legion Made in University of Virginia Object-based metasystems software project middleware that connects computer.
John David Eriksen Jamie Unger-Fink High-Performance, Dependable Multiprocessor.
1 A survey on Reconfigurable Computing for Signal Processing Applications Anne Pratoomtong Spring2002.
Scientific Computing in Space Using COTS Processors
The Pursuit for Efficient S/C Design The Stanford Small Sat Challenge: –Learn system engineering processes –Design, build, test, and fly a CubeSat project.
RSC Williams MAPLD 2005/BOF-S1 A Linux-based Software Environment for the Reconfigurable Scalable Computing Project John A. Williams 1
4.x Performance Technology drivers – Exascale systems will consist of complex configurations with a huge number of potentially heterogeneous components.
Future Airborne Capability Environment (FACE)
Command and Data Handling (C&DH)
RiceNIC: A Reconfigurable and Programmable Gigabit Network Interface Card Jeff Shafer, Dr. Scott Rixner Rice Computer Architecture:
Panel Three - Small Businesses: Sustaining and Growing a Market Presence Open Interfaces and Market Penetration Protecting Intellectual Innovation and.
Page No. 1 Kelvin Nichols Payload Operations and Integration Center EO50 Delay Tolerant Networking (DTN) Implementation on the International Space Station.
.1 RESEARCH & TECHNOLOGY DEVELOPMENT CENTER SYSTEM AND INFORMATION SCIENCES JHU/MIT Proprietary Titan MESSENGER Autonomy Experiment.
A Case Study in HW/SW Codesign and RC Project Risk Management: The Honeywell Reconfigurable Space Computer (HRSC) Jeremy Ramos Advanced Processing Systems.
NMP ST8 Dependable Multiprocessor Precis Presentation Dr. John R. Samson, Jr. Honeywell Defense & Space Systems U.S. Highway 19 North Clearwater,
Page 1 Reconfigurable Communications Processor Principal Investigator: Chris Papachristou Task Number: NAG Electrical Engineering & Computer Science.
Nov 3, 2009 RN - 1 Jet Propulsion Laboratory California Institute of Technology Current Developments for VLBI Data Acquisition Equipment at JPL Robert.
Issues Autonomic operation (fault tolerance) Minimize interference to applications Hardware support for new operating systems Resource management (global.
MAPLD 2005/254C. Papachristou 1 Reconfigurable and Evolvable Hardware Fabric Chris Papachristou, Frank Wolff Robert Ewing Electrical Engineering & Computer.
An Architecture and Prototype Implementation for TCP/IP Hardware Support Mirko Benz Dresden University of Technology, Germany TERENA 2001.
March 2004 At A Glance NASA’s GSFC GMSEC architecture provides a scalable, extensible ground and flight system approach for future missions. Benefits Simplifies.
NMP ST8 Dependable Multiprocessor (DM)
Accelerated Long Range Traverse (ALERT) Paul Springer Michael Mossey.
Graphical Design Environment for a Reconfigurable Processor IAmE Abstract The Field Programmable Processor Array (FPPA) is a new reconfigurable architecture.
Rad Hard By Software for Space Multicore Processing David Bueno, Eric Grobelny, Dave Campagna, Dave Kessler, and Matt Clark Honeywell Space Electronic.
NMP ST8 Dependable Multiprocessor (DM)
Backprojection and Synthetic Aperture Radar Processing on a HHPC Albert Conti, Ben Cordes, Prof. Miriam Leeser, Prof. Eric Miller
ESA Harwell Robotics & Autonomy Facility Study Workshop Autonomous Software Verification Presented By: Rick Blake.
Dependable Multiprocessing with the Cell Broadband Engine Dr. David Bueno- Honeywell Space Electronic Systems, Clearwater, FL Dr. Matt Clark- Honeywell.
NMP ST8 Dependable Multiprocessor (DM) Precis Presentation Dr. John R. Samson, Jr. Honeywell Defense & Space Systems U.S. Highway 19 North Clearwater,
© 2004 IBM Corporation Power Everywhere POWER5 Processor Update Mark Papermaster VP, Technology Development IBM Systems and Technology Group.
March 2004 At A Glance The AutoFDS provides a web- based interface to acquire, generate, and distribute products, using the GMSEC Reference Architecture.
JSTAR Independent Test Capability (ITC) Core Flight System (CFS) Utilization October 26, 2015 Justin R Morris NASA IV&V Program.
Page No. 1 Overview Kelvin Nichols Payload Operations and Integration Center EO50 SSCN Delay Tolerant Networking (DTN)
1 OPERATING SYSTEMS. 2 CONTENTS 1.What is an Operating System? 2.OS Functions 3.OS Services 4.Structure of OS 5.Evolution of OS.
Introduction to Operating Systems Concepts
Parallel Programming Models
Chapter 6: Securing the Cloud
Adam Schlesinger NASA – JSC November 3, 2011
Introduction to Distributed Platforms
Rekayasa Perangkat Lunak
Computing models, facilities, distributed computing
cFE FSW at APL & FSW Reusability
Operating Systems : Overview
Dependable Multiprocessing with the Cell Broadband Engine
Joseph JaJa, Mike Smorul, and Sangchul Song
Cluster Active Archive
Grid Computing.
Adam Schlesinger NASA – JSC November 3, 2011
Chapter III Desktop Imaging Systems & Issues
Adam Schlesinger NASA - JSC October 30, 2013
Anne Pratoomtong ECE734, Spring2002
Rekayasa Perangkat Lunak
QNX Technology Overview
Chapter 2: The Linux System Part 1
Operating Systems : Overview
Autonomous Operations in Space
Subject Name: Operating System Concepts Subject Number:
3rd Studierstube Workshop TU Wien
Co-designed Virtual Machines for Reliable Computer Systems
Chapter-1 Computer is an advanced electronic device that takes raw data as an input from the user and processes it under the control of a set of instructions.
L. Glimcher, R. Jin, G. Agrawal Presented by: Leo Glimcher
Presentation transcript:

Honeywell Defense & Space, Clearwater, FL Dependable Multiprocessor (DM) Support for Diverse and Heterogeneous Processing Précis Presentation Dr. Matthew Clark Honeywell Defense & Space, Clearwater, FL matthew.clark@honeywell.com (727) 539 - 3439 September 22, 2009

DM: A COTS-Based High-Performance Payload Cluster Computing Platform What is DM? A high-performance, COTS-based, fault tolerant cluster onboard processing system that can operate in a natural space radiation environment High throughput density (>300 MOPS/watt), scalable & software based High system availability >0.995 High probability of timely and correct delivery of data >0.995 Technology independent system software that manages cluster of high performance COTS processing elements Technology independent system software that enhances radiation upset tolerance TRL6 Testbed Data Processor Emulated Spacecraft Computer Discrete Control Network System Controller (Mass Data Store) Ethernet Switch DM Why is DM important? Status? Flying high-performance COTS in space is a long-held NASA and DoD objective DM is bringing this objective closer to reality Enables heretofore unrealizable levels of onboard data and autonomy processing Enables faster, more efficient application development Enables users to port applications directly from laboratory to space environment DM is a significant paradigm shift provides ~ 10X – 100X throughput density available with current RHBP & software-based RHBD processing at much lower cost software-based technology allows space to keep pace with COTS NASA NMP ST8 has invested >$12M in the development and demonstration of DM technology through TRL6 Demonstrated DM predictive Availability, “Computational Consistency,” and Performance models Demonstrated ability to meet NASA Level 1 requirements/goals Successfully completed system-level radiation testing DM project has further developed, refined and demonstrated the process for migrating COTS high performance computing to space DM technology has been demonstrated on wide variety of platforms DM technology is applicable to wide range of missions Seeking a ride to space to achieve TRL7 What is it? An Advanced on-board Payload Data Processing System Developed for JPL New Millennium Program Space Technology 8 The Dependable Multiprocessor is a technology being developed by Honeywell Inc. It allows us to do complex processing of science data onboard the spacecraft. Using standard commercial components arranged in a unique architecture, along with special software to detect and correct radiation induced errors, this technology will allow us to build computers that are up to 1000 times more capable than current day spaceborne computer systems. These new, highly capable computers will make possible a new generation of intelligent spacecraft and robotics for future NASA science and exploration missions. The ST8 technology validation effort sought to fly 4-node cluster with a Rad Hard SBC host or controller node, a mass storage node, and data processors on a cPCI backplane. The SBC includes the three (3) Gigabit Ethernet ports for high speed networking of a cluster of these high performance data processing nodes. It leverages four data processors consisting of the 7447a SBC with an altivec coprocessor Today’s DM architecture is a ground TRL6 validation experiment 800 MHz ~ 5200 MFLOPS ~ 0.6 kg The system completes 1K Complex FFT in ~ 9.8 msec ~ 433 MFLOPS/watt DM is a Payload data processing system architecture, including a software framework and set of fault tolerance techniques, which provides: Heterogeneous Scalable Architecture Enables COTS based, high performance, processors Runtime Environment Familiar to Application Developers Supports parallel/distributed processing Incorporates reconfigurable co-processors Advanced Fault Tolerance Facilitates porting of applications from the laboratory to the spacecraft Maintains required dependability and availability Responsive to environment, application criticality and system mode Autonomous and adaptive controller for configuration Optimizes resource utilization and system efficiency TRL6 Status: Radiation Testing proton and heavy ion testing established SEE rates for all components on COTS DP boards system-level testing performed with one COTS DP board exposed to proton beam while running the flight experiment application suite (SAR, Matrix Multiply, 2DFFT, LUD, AltiVec (FFTW), stressing Logic test, and stressing Branch test) in a DM system context DM flight experiment instrumentation including emulated ground station operated and post-experiment data analysis demonstrated DMM middleware performed as designed DM system successfully recovered from all radiation-induced faults validated DM predictive Availability, “Computational Consistency,” and Performance models DM Markov Models demonstrated DM predictive Availability, “Computational Consistency,” and Performance models models based on component-level radiation test results and comprehensive SWIFI (Software-Implemented Fault Injection) campaigns extrapolation to various radiation environments, i.e., orbits, and other applications Demonstrated ability to meet NASA level 1 requirements/goals > 0.995 Availability > 0.995 “Computational Consistency,” the probability of timely and correct delivery of data > 300 MOPS per watt > 307 MOPS/watt HSI application on 7447a processor with AltiVec (including System Controller power) > 1077 MOPS/watt HSI application on PS Semi dual core processor with AltiVec Demonstrated ease of use independent 3rd party with minimal knowledge of fault tolerance ported two (2) diverse applications to DM testbed in less than three (3) days including scalable parallelization, hybrid ABFT/in-line replication, 2D convolution and median filter ABFT library functions, FEMPI, and check-pointing DM Technology is Ready for a Flight Experiment 2

DM TRL6 Testbed System TRL6 Testbed Configuration System Controller (SC) Honeywell Ganymede (PPC603e) VxWorks 5.5 4x Data Processing Nodes (DP) XES XPedite 6031 (PPC7447a @ 800MHz with AltiVec) ruggedized, conductively-cooled COTS SBCs Wind River PNE-LE CGL 1.4 (Kernel 2.6.14) 1 DP emulates rad-hard “mass memory” device 100BaseT Ethernet Network Spacecraft Communication Interface over RS422 on SC Dependable Multiprocessor Middleware (DMM) Critical Design Review - ruggedized, conductively-cooled, COTS boards can fly in space DM TRL6 Testbed XES XPedite 6031 Data Processors Ethernet Backplane Extender Cards Honeywell Ganymede System Controller Standard Compact PCI Backplane DM is the path to space; DMM & COTS processors ready for flight experiment

The Next Steps in Diversity and Heterogeneity To enhance DM’s diversity and heterogeneity, efforts are being made in the following areas: Hardware Additional COTS (SOI) processing architectures with path to space for improved throughput density Additional high-speed interconnects with path to space for improved bandwidth, reliability, fault tolerance Software Additional POSIX compliant operating systems to expand supported data-processing platforms Newer versions of VxWorks (6.x) Wider variants of Linux (e.g., linux-rt) Non-monolithic kernels (e.g., QNX) Add support for Open MPI to expand the types of user applications that can be transparently migrated to DM environment Upgrade current HAM software foundation to a Service AvailabilityTM Forum (SAF) compliant suite DM is flexible; designed to meet current & future needs

Summary & Conclusion DM is as an architecture and methodology that enables COTS-based, high performance, scalable, multi-computer systems, and accommodates future technology upgrades (HW & SW) DM can rapidly incorporate new techniques/technologies to overcome performance gaps with regards to throughput, power, mass and cost DM technology is platform-agnostic middleware DM is a significant paradigm shift - for applications that only need to be radiation tolerant, DM can provide 10x-100x throughput density (MOPS/watt) over current software programmable RHBP & RHBD processing capability with reduced cost, risk, and schedule software-based technology allows space to keep pace with COTS DM technology enables more onboard processing, faster onboard processing, faster frame processing, lower downlink bandwidth, and data/information direct to the war fighter DM was developed by NASA as flight project - extensive ground testing - predictive models - ready to fly DM technology can take COTS to space; ready to support a flight experiment