Engineering Applications on NASA’s FPGA-based Hypercomputers

Presentation transcript:

Engineering Applications on NASA’s FPGA*-based Hypercomputers
By Olaf.O.Storaasli@nasa.gov
Analytical & Computational Methods Branch, NASA Langley Research Center, Hampton, Virginia
7th Military Aerospace Programmable Logic Device (MAPLD) International Conference, Reagan Center, Washington DC, September 10, 2004
*Field-Programmable Gate Array
NOTES: find out what Rutishauser’s research is for summer ’03 slides confirm GFLOP numbers

Contents
Background: Hardware, “Gateware”
Current: Algorithms
Applications: CPU-FPGA, FPGA
Future: “New” Spacecraft Hypercomputer

NASA Reconfigurable Hypercomputers
6M gates/FPGA, 62K gates/FPGA (‘02, ‘04)
Good afternoon. Happy to be here and share some exciting results about a new computing paradigm. Our goal is to harness FPGAs (image processing, networking) for scientific computing. Our team has grown from OOS & RCS to include 6 NASA + 8 students. We’ve partnered with SBS (SAA), and collaborate with many (NSA, ...).

Computing Faster Without CPUs
GOAL: Explore Engineering Applications on NASA’s FPGA-based Hypercomputers
TEAM: Drs. Olaf Storaasli, Jarek Sobieski & Robert Singleterry, Dave Rutishauser, Joe Rehder, Garry Qualls, Robert Lewis
Students: MIT Harvard VT Brown UVA JPMorgan Case Pitt, Governor’s School
PARTNERS: Starbridge Systems (FPGA H/W + VIVA S/W), NSA, USAF, MSFC, AlphaStar

VIVA: Custom Chip Design
What: Graphically code FPGAs (drag & drop vs. text)
Figure: VIVA Menu
Traditional code: 1D; do i = 1, 1000; C = A+B; end do; parallelism esoteric
VIVA gateware: 3D; + +…+; parallelism natural
How: Converts icons and transports to an FPGA circuit
Why: Near-ASIC speed (without chip design $$$)
Corelib: Pre-built objects & examples
Data: Any type, size, precision (not fixed)
More: System Description ports to any H/W (“write once, run anywhere”)

FPGA Use
CPU + FPGA Accelerator: exploit local parallelism; max {kernel ops/cycle}; C/FORTRAN calls VIVA kernel; limit: FPGA gates + Amdahl’s Law
Replace CPUs: exploit parallelism fully; max {ops/cycle} => fill FPGA; VIVA/VHDL/Verilog code; limit: FPGA(s) gates
CPU <=> call FPGA kernel (Ax=b)
NASA GPS: 28k lines FORTRAN; the 50-line kernel takes 95% of CPU time; move it to the FPGA
Cray XD1: Opterons + Xilinx FPGAs
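The Amdahl's Law limit on the CPU + FPGA accelerator approach can be made concrete with the slide's own 95% figure. The following is a minimal sketch, not part of the original slides; the function name and the kernel speedups tried are hypothetical.

```python
def amdahl_speedup(accelerated_fraction, kernel_speedup):
    """Overall speedup when only a fraction of the runtime is accelerated."""
    serial_fraction = 1.0 - accelerated_fraction
    return 1.0 / (serial_fraction + accelerated_fraction / kernel_speedup)


# 95% of CPU time is spent in the 50-line kernel (figure from the slide).
# The kernel speedups below are hypothetical values for illustration.
for kernel_speedup in (10, 100, 1000):
    print(kernel_speedup, round(amdahl_speedup(0.95, kernel_speedup), 2))

# However fast the FPGA kernel becomes, the remaining 5% of serial work
# caps the overall speedup at 1 / (1 - 0.95), i.e. about 20x.
```

This is why the "Replace CPUs" column lists only FPGA gates as its limit: with no serial CPU portion left, the Amdahl cap no longer applies in the same way.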

GENOA-GPS* “Port” (*‘99 NASA Software-of-the-Year)
GENOA Analysis/Design (AlphaStar): progressive failure, reliability, durability; manufacturing, virtual test, life prediction; calls GPS
GPS Matrix Equation Solver (NASA): structural, EM, acoustic analysis + design; most computations in 50-line kernel; kernel coded: VIVA-GPS
VIVA 2.4 => large applications ongoing (NASA-AlphaStar-Starbridge)
Shuttle re-entry wing damage analysis time: 660 hours => minutes (goal)
Figure: Finite Element Model

Columbia Burn-thru Analysis
Figure: leading edge FEM (Panels 6, 7 and 8; 38 in)
Timeline: insulation fracture at 230 sec; spar fracture at 500 sec; RCC-Tseal fracture at 503 sec

Maximize Performance via Parallelism
FPGA Use
CPU + FPGA Accelerator: exploit local parallelism; max {kernel ops/cycle}; C/FORTRAN calls VIVA kernel; limit: FPGA gates + Amdahl’s Law
Replace CPUs: exploit parallelism fully; max {ops/cycle} => fill FPGA; 100% VIVA code; limit: FPGA(s) gates
Adds/FPGA: 16, 32, 128, 256, 512, 640
% FPGA used: 1, 2, 8, 41, 51, 109
Ops: 4, 34, 77, 154, 192
1000+ adds/clock cycle => 10^11 ops/sec (vs. 1 add/cycle on CPUs)
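The step from adds per clock cycle to operations per second is a single multiplication by the clock rate. The slide does not state a clock rate, so the sketch below only works out the rate implied by its two figures; the function names are hypothetical.

```python
def ops_per_second(ops_per_cycle, clock_hz):
    """Throughput when ops_per_cycle operations complete on every clock cycle."""
    return ops_per_cycle * clock_hz


def implied_clock_hz(ops_per_cycle, ops_per_sec):
    """Clock rate implied by a quoted throughput and per-cycle parallelism."""
    return ops_per_sec / ops_per_cycle


# Slide figures: 1000 adds per clock cycle and 10^11 ops/sec.
clock = implied_clock_hz(1000, 1e11)
print(clock / 1e6, "MHz implied")

# For comparison, the slide's "1 add/cycle on CPUs" at the same clock rate.
print(ops_per_second(1, clock), "ops/sec")
```

The inferred clock rate is an inference from the quoted numbers, not a figure from the presentation.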

Memory: FPGA & SDRAM
- keep “action” on/near FPGA
- 2-8 GB SDRAM (large applications)
- 144 x 2 KB blocks of RAM

File I/O
FileIn/FileOut in Corelib
Transfers 2 KB blocks (Disk <=> FPGA RAM)
User can access FPGA RAM 4 bytes at a time

Add Files in Parallel
Read 2 files => store in FPGA RAM => add files => write result
Diagram: two Read (R) => Store (S) paths feed an adder (+), followed by Write (W)
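The read, store, add, write flow can be mimicked in software. This is a rough sketch only: in-memory byte strings stand in for the two files, and the 2 KB block and 4-byte word sizes come from the File I/O slide. The little-endian unsigned 32-bit word format, the wrap-around on overflow and all function names are assumptions, not details of VIVA's FileIn/FileOut objects.

```python
import struct

BLOCK_BYTES = 2048  # 2 KB transfer block (from the File I/O slide)
WORD_BYTES = 4      # RAM is accessed 4 bytes at a time (from the slide)


def add_blocks(block_a, block_b):
    """Add two equal-length blocks word by word (assumed unsigned 32-bit, wrapping)."""
    count = len(block_a) // WORD_BYTES
    fmt = "<%dI" % count
    words_a = struct.unpack(fmt, block_a)
    words_b = struct.unpack(fmt, block_b)
    sums = [(x + y) & 0xFFFFFFFF for x, y in zip(words_a, words_b)]
    return struct.pack(fmt, *sums)


def add_files(data_a, data_b):
    """Add two equal-length byte strings, one 2 KB block at a time."""
    result = bytearray()
    for offset in range(0, len(data_a), BLOCK_BYTES):
        result += add_blocks(data_a[offset:offset + BLOCK_BYTES],
                             data_b[offset:offset + BLOCK_BYTES])
    return bytes(result)


# Two 4 KB "files": the words 0..1023, and 1024 words that are all 1.
file_a = struct.pack("<1024I", *range(1024))
file_b = struct.pack("<1024I", *([1] * 1024))
total = add_files(file_a, file_b)
print(struct.unpack("<4I", total[:16]))
```

On the FPGA, the word additions within a block would run side by side on separate adders rather than in a loop, which is the point of the slide.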

Parallel Adds Faster (same file size; CPUs: 1 add)
Chart: time in cycles vs. number of FPGA adders used (2 to 28) for 4 KB, 8 KB and 16 KB files, with logarithmic trend lines; labeled values: 92, 46, 23
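The labeled chart values (92, 46, 23) halve from one to the next, which is what an idealized model predicts each time the adder count doubles. The sketch below is a deliberately simple model that ignores file I/O and memory latency; the workload size is a hypothetical example, not a value read from the chart.

```python
import math


def add_cycles(num_adds, num_adders):
    """Cycles needed for num_adds independent adds using num_adders in parallel."""
    return math.ceil(num_adds / num_adders)


# Hypothetical workload: a 4 KB file of 4-byte words is 1024 adds.
for adders in (1, 4, 8, 16):
    print(adders, add_cycles(1024, adders))
```

With one adder, as on the CPU in the slide's comparison, the cycle count equals the number of adds.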

Algorithms Developed
Matrix Algebra: {V}, [M], {V}^T{V}, [M]x[M], GCD, …
n! => Probability: Combinations/Permutations
CORDIC => Transcendentals: sin, log, exp, cosh, …
∂y/∂x & ∫f(x)dx => Runge-Kutta: CFD; Newmark Beta: CSM
Matrix Equation Solvers: [A]{x} = {b}, Gauss & Jacobi
Dynamic Analysis: [M]{ü} + [C]{u̇} + [K]{u} + NL = {P(t)} (NLT: non-linear terms)
Analog Computing: digital accuracy
Nonlinear Analysis: reduces NL time
Structural Design & Optimization
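CORDIC suits FPGAs because it computes transcendentals with only shifts, adds and a small table of constants. As an illustration, here is a floating-point model of rotation-mode CORDIC for sine and cosine. It is a generic textbook form, not the VIVA implementation, and the function name and iteration count are assumptions.

```python
import math


def cordic_sin_cos(theta, iterations=32):
    """Rotation-mode CORDIC for |theta| <= pi/2 (floating-point model)."""
    # Table of elementary rotation angles atan(2^-i).
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    # Each micro-rotation stretches the vector; pre-scale by the total gain.
    gain = 1.0
    for i in range(iterations):
        gain /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = gain, 0.0, theta
    for i in range(iterations):
        direction = 1.0 if z >= 0 else -1.0
        # In hardware the multiplications by 2^-i are plain bit shifts.
        x, y = x - direction * y * 2.0 ** -i, y + direction * x * 2.0 ** -i
        z -= direction * angles[i]
    return y, x  # (sin(theta), cos(theta))


sine, cosine = cordic_sin_cos(0.5)
print(sine, math.sin(0.5))
print(cosine, math.cos(0.5))
```

A hardware version would use fixed-point integers and a precomputed angle table; each iteration adds roughly one bit of accuracy.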

Applications: VIVA Code
Jacobi Matrix Solver
Gauss Matrix Solver
Runge-Kutta
Cellular Automata

Gauss-Jordan A x = B Solver
• VIVA code solves n equations. Example:
  x0 + x1 + x2 = 0
  x0 - 2x1 + 2x2 = 4
  x0 + 2x1 - x2 = 2
  => x0 = 4, x1 = -2, x2 = -2
• Run on hypercomputer emulator, then FPGA
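The slide's three-equation example can be checked with a short software version of Gauss-Jordan elimination. This sketch is a plain Python rendering of the textbook method with partial pivoting added for robustness; it is not a transcription of the VIVA code, and the function name is hypothetical.

```python
def gauss_jordan(a, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [A | b], converted to floats.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Choose the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        m[col], m[pivot] = m[pivot], m[col]
        # Scale the pivot row so the pivot becomes 1.
        pivot_value = m[col][col]
        m[col] = [v / pivot_value for v in m[col]]
        # Clear this column in every other row (above and below the pivot).
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[n] for row in m]


# The system from the slide.
A = [[1, 1, 1], [1, -2, 2], [1, 2, -1]]
B = [0, 4, 2]
solution = gauss_jordan(A, B)
print(solution)
```

The row updates inside each elimination step are independent of one another, which is the parallelism an FPGA implementation can exploit.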

Spring-Mass Solver
Method: 4-stage Runge-Kutta
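To make the method concrete, here is a minimal sketch of the classical 4-stage Runge-Kutta scheme applied to an undamped, unforced spring-mass system, m x'' + k x = 0. The slide gives no parameters, so the mass, stiffness, initial conditions and step size below are hypothetical, and the absence of damping and forcing is an assumption.

```python
import math


def rk4_step(f, t, y, h):
    """Advance y' = f(t, y) by one step of size h with classical RK4."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]


def spring_mass(mass, stiffness):
    """Right-hand side for m x'' + k x = 0, with state y = [x, v]."""
    def f(t, y):
        x, v = y
        return [v, -stiffness / mass * x]
    return f


def simulate(mass=1.0, stiffness=1.0, x0=1.0, v0=0.0, h=0.01, steps=1000):
    f = spring_mass(mass, stiffness)
    t, y = 0.0, [x0, v0]
    for _ in range(steps):
        y = rk4_step(f, t, y, h)
        t += h
    return t, y


t_end, (x_end, v_end) = simulate()
# With m = k = 1, x0 = 1 and v0 = 0 the exact solution is x(t) = cos(t).
print(x_end, math.cos(t_end))
```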

Cellular Automata
• Parallel: Stephen Wolfram, A New Kind of Science
• Complexity via simple interactions w/o PDEs
• CFD => Structures
• Cell-neighbor interactions; simple compute/cell
Figure: FEA solution vs. Cellular Automata solution (labels: d, P)
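The idea of "simple compute per cell" using only neighbor values can be illustrated with a one-dimensional elementary cellular automaton of the kind Wolfram studies. This is a generic illustration, not the structural or CFD automaton from the slide; the choice of Rule 30, the periodic boundary and the function names are assumptions.

```python
def step(cells, rule=30):
    """One generation of an elementary cellular automaton (periodic boundary)."""
    n = len(cells)
    new_cells = []
    for i in range(n):
        left, center, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        # The three neighbor bits select one bit of the 8-bit rule number.
        index = (left << 2) | (center << 1) | right
        new_cells.append((rule >> index) & 1)
    return new_cells


def run(width=31, generations=15, rule=30):
    """Start from a single live cell in the middle and record each generation."""
    cells = [0] * width
    cells[width // 2] = 1
    history = [cells]
    for _ in range(generations):
        cells = step(cells, rule)
        history.append(cells)
    return history


for row in run():
    print("".join("#" if c else "." for c in row))
```

Every cell's update depends only on its immediate neighbors from the previous generation, so all cells can be updated at once, one small circuit per cell.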

Cantilever Beam Optimization
Constants: L = 24”, W = 3”, P = 20 lbs, density = 0.097 lbs/in^3
Constraint: allowed stress = 40K lbs/in^2
Find thickness, d, to minimize weight

VIVA FPGA Code Minimizes Beam Weight
d chosen 1023 times
VIVA results: d = 0.156” (0.155 exact); minimum weight = 1.09 lbs (1.082 exact)
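The slide's formulas did not survive in the transcript, so this sketch assumes the standard textbook model: a tip-loaded cantilever with a rectangular cross-section, where weight = density x L x W x d and root bending stress = 6PL / (W d^2). Under that assumption the closed-form optimum reproduces the "exact" values quoted above (d about 0.155 in, weight about 1.082 lbs). The brute-force search over 1023 candidates uses a hypothetical 0.001 in grid; the grid VIVA actually used is not stated.

```python
import math

LENGTH = 24.0             # L, in
WIDTH = 3.0               # W, in
LOAD = 20.0               # P, lbs
DENSITY = 0.097           # lbs/in^3
STRESS_ALLOWED = 40000.0  # lbs/in^2


def weight(d):
    """Beam weight for thickness d (assumed model: density * L * W * d)."""
    return DENSITY * LENGTH * WIDTH * d


def max_stress(d):
    """Root bending stress of a tip-loaded cantilever (assumed): 6PL / (W d^2)."""
    return 6.0 * LOAD * LENGTH / (WIDTH * d * d)


def exact_thickness():
    """Thickness at which the stress constraint is exactly active."""
    return math.sqrt(6.0 * LOAD * LENGTH / (WIDTH * STRESS_ALLOWED))


def search_thickness(candidates=1023, step_size=0.001):
    """Try each candidate thickness; keep the lightest one that meets the constraint."""
    feasible = [i * step_size for i in range(1, candidates + 1)
                if max_stress(i * step_size) <= STRESS_ALLOWED]
    return min(feasible, key=weight)


d_exact = exact_thickness()
d_grid = search_thickness()
print(round(d_exact, 4), round(weight(d_exact), 3))
print(round(d_grid, 4), round(weight(d_grid), 3))
```

Because every candidate thickness is evaluated independently, the 1023 trials could be spread across many parallel evaluators rather than looped over.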

“a bold new course into the cosmos”
Reconfigurable Scalable Computing (RSC) for Space Applications: $14.8M

Spirit & Opportunity Rovers
6 radiation-tolerant FPGAs: 1M gates @ 100 kRads
Next: 6M gates @ 200 kRads

What: Reconfigurable Scalable Computing (RSC) for Space Applications
Who: Langley, Goddard, NSA, Starbridge, Jefferson Lab, ASRC, Queensland
When: 4 years (FY ‘05-’08)
How: $14.8M
Goal: Effective, affordable processing for moon & Mars missions
Plan: Design, implement and demonstrate RSC for space applications
Hardware: Stacked scalable FPGAs
Gateware: Conventional (MPI/Linux) + Special (VIVA)
More:

Summary
Hardware: Exploiting advanced FPGA-based systems
FPGAs: Rapid growth, inherently parallel, flexible, efficient
VIVA: Powerful & growing (tailored to NASA needs)
Applications:
- Many engineering algorithms (VIVA => FPGAs)
- GPS-VIVA => CPU + FPGA accelerator
Speed: 640 ops/cycle (2 x 10^11 ops/sec) measured
Future: Reconfigurable Scalable Computing for Space

The End