1
Honeywell Defense & Space, Clearwater, FL
Dependable Multiprocessor (DM) Support for Diverse and Heterogeneous Processing
Précis Presentation
Dr. Matthew Clark
Honeywell Defense & Space, Clearwater, FL
(727)
September 22, 2009
2
DM: A COTS-Based High-Performance Payload Cluster Computing Platform
What is DM?
- A high-performance, COTS-based, fault-tolerant cluster onboard processing system that can operate in a natural space radiation environment
  - High throughput density (>300 MOPS/watt), scalable & software based
  - High system availability >0.995
  - High probability of timely and correct delivery of data >0.995
  - Technology-independent system software that manages a cluster of high-performance COTS processing elements
  - Technology-independent system software that enhances radiation upset tolerance

[Figure: TRL6 Testbed, with labels Data Processor, Emulated Spacecraft Computer, Discrete Control Network, System Controller, (Mass Data Store), Ethernet Switch, DM]

Why is DM important?
- Flying high-performance COTS in space is a long-held NASA and DoD objective
  - DM is bringing this objective closer to reality
- Enables heretofore unrealizable levels of onboard data and autonomy processing
- Enables faster, more efficient application development
- Enables users to port applications directly from laboratory to space environment
- DM is a significant paradigm shift
  - provides ~10X-100X the throughput density available with current RHBP & software-based RHBD processing at much lower cost
  - software-based technology allows space to keep pace with COTS

Status?
- NASA NMP ST8 has invested >$12M in the development and demonstration of DM technology through TRL6
  - Demonstrated DM predictive Availability, “Computational Consistency,” and Performance models
  - Demonstrated ability to meet NASA Level 1 requirements/goals
  - Successfully completed system-level radiation testing
- DM project has further developed, refined and demonstrated the process for migrating COTS high performance computing to space
- DM technology has been demonstrated on a wide variety of platforms
- DM technology is applicable to a wide range of missions
- Seeking a ride to space to achieve TRL7

What is it?
An Advanced On-board Payload Data Processing System Developed for JPL New Millennium Program Space Technology 8

The Dependable Multiprocessor is a technology being developed by Honeywell Inc. It allows complex processing of science data onboard the spacecraft. Using standard commercial components arranged in a unique architecture, along with special software to detect and correct radiation-induced errors, this technology will allow us to build computers that are up to 1000 times more capable than current spaceborne computer systems. These new, highly capable computers will make possible a new generation of intelligent spacecraft and robotics for future NASA science and exploration missions.

The ST8 technology validation effort sought to fly a 4-node cluster with a rad-hard SBC host (controller) node, a mass storage node, and data processors on a cPCI backplane. The SBC includes three (3) Gigabit Ethernet ports for high-speed networking of a cluster of these high-performance data processing nodes.
It leverages four data processors consisting of 7447a SBCs with AltiVec coprocessors. Today’s DM architecture is a ground TRL6 validation experiment.
- 800 MHz
- ~5200 MFLOPS
- ~0.6 kg
- completes a 1K complex FFT in ~9.8 µsec
- ~433 MFLOPS/watt

DM is a payload data processing system architecture, including a software framework and a set of fault tolerance techniques, which provides:
- Heterogeneous Scalable Architecture
  - Enables COTS-based, high-performance processors
  - Supports parallel/distributed processing
  - Incorporates reconfigurable co-processors
- Runtime Environment Familiar to Application Developers
  - Facilitates porting of applications from the laboratory to the spacecraft
- Advanced Fault Tolerance
  - Maintains required dependability and availability
  - Responsive to environment, application criticality and system mode
- Autonomous and adaptive controller for configuration
  - Optimizes resource utilization and system efficiency

TRL6 Status:

Radiation Testing
- proton and heavy ion testing established SEE rates for all components on COTS DP boards
- system-level testing performed with one COTS DP board exposed to proton beam while running the flight experiment application suite (SAR, Matrix Multiply, 2DFFT, LUD, AltiVec (FFTW), stressing Logic test, and stressing Branch test) in a DM system context
- DM flight experiment instrumentation, including emulated ground station, operated, and post-experiment data analysis demonstrated
- DMM middleware performed as designed
- DM system successfully recovered from all radiation-induced faults
- validated DM predictive Availability, “Computational Consistency,” and Performance models

DM Markov Models
- demonstrated DM predictive Availability, “Computational Consistency,” and Performance models
- models based on component-level radiation test results and comprehensive SWIFI (Software-Implemented Fault Injection) campaigns
- extrapolation to various radiation environments, i.e., orbits, and other applications

Demonstrated ability to meet NASA Level 1 requirements/goals
- Availability
- “Computational Consistency,” the probability of timely and correct delivery of data
- > 300 MOPS per watt
  - > 307 MOPS/watt, HSI application on 7447a processor with AltiVec (including System Controller power)
  - > 1077 MOPS/watt, HSI application on PA Semi dual core processor with AltiVec

Demonstrated ease of use
- independent 3rd party with minimal knowledge of fault tolerance ported two (2) diverse applications to DM testbed in less than three (3) days, including scalable parallelization, hybrid ABFT/in-line replication, 2D convolution and median filter ABFT library functions, FEMPI, and check-pointing

DM Technology is Ready for a Flight Experiment
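The ABFT (algorithm-based fault tolerance) library functions mentioned above are not shown in this deck. As a rough illustration of the general idea only, the sketch below applies the classic checksum scheme to matrix multiply, one of the kernels in the flight experiment application suite. Every function name and the fault-injection tuple are hypothetical and are not the DM ABFT library API; integer data is assumed so that checksum comparisons are exact.

```python
# Sketch of checksum-based ABFT for matrix multiply (classic checksum scheme).
# All names here are illustrative; this is NOT the DM ABFT library API.

def matmul(a, b):
    """Plain matrix product of two lists-of-lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def with_column_checksum(a):
    """Append a row holding the sum of each column of a."""
    return a + [[sum(col) for col in zip(*a)]]

def with_row_checksum(b):
    """Append to each row of b the sum of that row."""
    return [row + [sum(row)] for row in b]

def abft_matmul(a, b, fault=None):
    """Multiply checksum-augmented matrices; optionally inject a fault.

    fault is a hypothetical (row, col, delta) upset applied to the result.
    """
    c_full = matmul(with_column_checksum(a), with_row_checksum(b))
    if fault is not None:
        i, j, delta = fault
        c_full[i][j] += delta
    return c_full

def check_and_correct(c_full):
    """Locate and repair a single corrupted data element, if any.

    Returns (C, status); status is "ok", "corrected", or "uncorrectable".
    """
    n, m = len(c_full) - 1, len(c_full[0]) - 1
    # A row is inconsistent if its data sum differs from its checksum entry.
    bad_rows = [i for i in range(n) if sum(c_full[i][:m]) != c_full[i][m]]
    # A column is inconsistent if its data sum differs from the checksum row.
    bad_cols = [j for j in range(m)
                if sum(c_full[i][j] for i in range(n)) != c_full[n][j]]
    data = [row[:m] for row in c_full[:n]]
    if not bad_rows and not bad_cols:
        return data, "ok"
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        i, j = bad_rows[0], bad_cols[0]
        # Rebuild the element from the row checksum and the other row entries.
        others = sum(v for k, v in enumerate(data[i]) if k != j)
        data[i][j] = c_full[i][m] - others
        return data, "corrected"
    return data, "uncorrectable"
```

A single upset in a data element makes exactly one row and one column inconsistent, which both locates the error and lets it be rebuilt from the row checksum. Anything else is reported as uncorrectable, at which point a system could fall back to replication or a checkpoint restart.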
3
DM TRL6 Testbed System TRL6 Testbed Configuration
System Controller (SC)
- Honeywell Ganymede (PPC603e)
- VxWorks 5.5

4x Data Processing Nodes (DP)
- XES XPedite (… MHz with AltiVec)
- ruggedized, conductively-cooled COTS SBCs
- Wind River PNE-LE CGL 1.4 (Kernel )
- 1 DP emulates rad-hard “mass memory” device

100BaseT Ethernet Network

Spacecraft Communication Interface over RS422 on SC

Dependable Multiprocessor Middleware (DMM)

Critical Design Review - ruggedized, conductively-cooled, COTS boards can fly in space

[Figure: DM TRL6 Testbed, with labels XES XPedite 6031 Data Processors, Ethernet Backplane Extender Cards, Honeywell Ganymede System Controller, Standard Compact PCI Backplane]

DM is the path to space; DMM & COTS processors ready for flight experiment
4
The Next Steps in Diversity and Heterogeneity
To enhance DM’s diversity and heterogeneity, efforts are being made in the following areas:

Hardware
- Additional COTS (SOI) processing architectures with path to space for improved throughput density
- Additional high-speed interconnects with path to space for improved bandwidth, reliability, fault tolerance

Software
- Additional POSIX-compliant operating systems to expand supported data-processing platforms
  - Newer versions of VxWorks (6.x)
  - Wider variants of Linux (e.g., linux-rt)
  - Non-monolithic kernels (e.g., QNX)
- Add support for Open MPI to expand the types of user applications that can be transparently migrated to the DM environment
- Upgrade current HAM software foundation to a Service Availability™ Forum (SAF) compliant suite

DM is flexible; designed to meet current & future needs
5
Summary & Conclusion
- DM is an architecture and methodology that enables COTS-based, high-performance, scalable, multi-computer systems, and accommodates future technology upgrades (HW & SW)
- DM can rapidly incorporate new techniques/technologies to overcome performance gaps with regard to throughput, power, mass and cost
- DM technology is platform-agnostic middleware
- DM is a significant paradigm shift
  - for applications that only need to be radiation tolerant, DM can provide 10x-100x throughput density (MOPS/watt) over current software programmable RHBP & RHBD processing capability with reduced cost, risk, and schedule
  - software-based technology allows space to keep pace with COTS
- DM technology enables more onboard processing, faster onboard processing, faster frame processing, lower downlink bandwidth, and data/information direct to the war fighter
- DM was developed by NASA as a flight project
  - extensive ground testing
  - predictive models
  - ready to fly

DM technology can take COTS to space; ready to support a flight experiment
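The predictive availability models referred to in this deck are described as Markov models, but their structure and parameters are not given here. As a hedged illustration of how such a model relates upset and recovery rates to an availability goal like the >0.995 figure quoted earlier, the sketch below uses the simplest possible case: a two-state (up/down) continuous-time Markov chain. The function names and all rates are hypothetical placeholders, not DM measurements.

```python
# Two-state (up/down) continuous-time Markov availability sketch.
# Rates used with these functions are hypothetical, not DM test results.
import math

def steady_state_availability(upset_rate, recovery_rate):
    """Long-run fraction of time in the 'up' state: mu / (lambda + mu)."""
    return recovery_rate / (upset_rate + recovery_rate)

def availability_at(t, upset_rate, recovery_rate):
    """Probability of being up at time t, starting from 'up' at t = 0."""
    total = upset_rate + recovery_rate
    a_inf = recovery_rate / total
    # The transient term decays at the combined rate (lambda + mu).
    return a_inf + (1.0 - a_inf) * math.exp(-total * t)

def max_upset_rate(target, recovery_rate):
    """Largest upset rate that still meets a steady-state availability target."""
    return recovery_rate * (1.0 - target) / target

if __name__ == "__main__":
    mu = 120.0  # hypothetical: recoveries per hour (30 s mean recovery time)
    lam = max_upset_rate(0.995, mu)
    print(f"max tolerable upset rate: {lam:.3f} per hour")
    print(f"availability at that rate: {steady_state_availability(lam, mu):.4f}")
```

With a fixed recovery time, the tolerable upset rate scales linearly with the recovery rate, which is one way to see why fast software-based recovery matters as much as the raw upset rate of the COTS parts.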