Storaasli 5/9/03 Analytical and Computational Methods Computing Faster without CPUs Scientific Applications on FPGA-based* Reconfigurable Hypercomputers.

Presentation transcript:

Storaasli 5/9/03, Analytical and Computational Methods
Computing Faster without CPUs: Scientific Applications on FPGA-based* Reconfigurable Hypercomputers
by Dr. Olaf Storaasli, Analytical & Computational Methods Branch, Structures and Materials
for May Seminar, Electronic Systems Branch
* Field-Programmable Gate Array

Slide 2: NASA Research Background
- FEA: NASTRAN, Viking ==> Mars
- IPAD: Integrated Design
- Finite Element Machine: Early Parallel (//) Computer
- Cray GigaFLOP Award: Shuttle SRB
- Matrix Equation Solver: FORTRAN, C, Java
- Lanczos Eigensolver: 88x Speedup
- Intel: Supercomputer Users Board, P6 Award
- Symposia: Large-Scale Apps. (5)
- NASA Software of the Year Award
- Creativity & Innovation Awards
- NASA Fellowship: Norway

Slide 3: Exploring Scientific Applications on Reconfigurable Hypercomputers
’02, ’03 Creativity & Innovation
62K gates/FPGA; 6M gates/FPGA

Slide 4: Computing Faster Without CPUs
GOAL: Evaluate FPGA*-based Hypercomputer Potential for NASA Scientific Computations
TEAM: Dr. Olaf Storaasli, Principal Investigator
PARTNERS: Star Bridge Systems, NSA, USAF, MSFC, William Fithian (Harvard), Siddhartha Krishnamurthy (VT), Shaun Foley (MIT), Cris Kania (GS), Neha Dandawate (GS), Patrick Butler (VT), Kristin Barr (JPMorgan), Robert Lewis (Morehouse), Vincent Vance (VT), Jarek Sobieski, Robert Singleterry, Dave Rutishauser, Joe Rehder, Garry Qualls

Slide 5: William Fithian* (Harvard, Merit Scholar, Oracle Award)
* NASA-NHGS mentorship ’00-’02

Slide 6: First Langley Hypercomputers (10 FPGAs each)


Slide 8: FPGA Programming
- User controls gates: middle man removed
- Code options:
  - 1-D text, sequential, FORTRAN-like: C-to-Gate, VHDL (parallelism esoteric)
  - 3-D graphic, parallel, drag & drop: Viva (parallelism inherent; data flow like an analog computer)
[Image label: NASA Hypercomputers]

Slide 9: FPGA: New Computing Paradigm
Traditional CPU:
- Sequential: 1 operation/cycle
- Fixed gates & data types
- Wasteful: 99% gates idle/cycle yet all draw power
- Software: Text - 1D, e.g. do i = 1, billion; c = a+b; end do
- 26 MFLOPS / 250 MHz SGI
Reconfigurable FPGA:
- Parallel: Inherent
- Dynamic gates & data types
- Efficient: Optimizes gates to task
- Gateware: VIVA Icons & Transports
- 392+ MFLOPS / 64 MHz FPGA
- … GFLOPS / 10 FPGA board
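The two throughput figures on this slide can be normalized by clock rate to show what "inherent parallelism" buys. The sketch below uses only the slide's numbers (26 MFLOPS at 250 MHz, 392 MFLOPS at 64 MHz); the function name `flops_per_cycle` is an illustrative assumption, and the result is an average, not a statement about how either device is built.

```python
def flops_per_cycle(mflops: float, clock_mhz: float) -> float:
    """Average floating-point operations completed per clock cycle."""
    return mflops / clock_mhz

# Figures quoted on the slide.
cpu = flops_per_cycle(26.0, 250.0)    # SGI CPU: 26 MFLOPS at 250 MHz
fpga = flops_per_cycle(392.0, 64.0)   # one FPGA: 392+ MFLOPS at 64 MHz

print(f"CPU : {cpu:.3f} flops/cycle")
print(f"FPGA: {fpga:.3f} flops/cycle")
print(f"Throughput ratio: {392.0 / 26.0:.1f}x")
```

Read this way, the FPGA figure corresponds to roughly six floating-point results per cycle at about a quarter of the clock rate, against about one result every ten cycles for the CPU figure.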

Slide 10:
- Select-Drag-Drop to Code “icon” Primitives
- Add new code to library
- Complex algorithms “drill in”

Slide 11: VIVA: Custom Chip Design Gateware
What: Graphics tool to “route” FPGAs (VHDL cumbersome)
How: Converts icon-transport “gateware” to circuit logic
Why: Achieve near-ASIC speed (w/o chip design $)
Growth in VIVA Capability:
VIVA 1 (Feb ’01):
- NO Floating Point
- NO Scientific Functions
- NO File Input/Output
- NO Vector-Matrix Support
- Access to One FPGA
- Primitive Documentation
- Weekly Changes
- Frequent “bugs”
VIVA 2 (July ’02):
- Extensive Data Types
- Trig, Logs, Transcendentals
- File Input/Output
- Vector-Matrix Support
- Access to Multiple FPGAs
- Extensive Documentation
- Stable Development
- Few “bugs”


Slide 13: Parallel Use of Parallel FPGAs
[Diagram labels: Viva User1 through Viva User6; Windows Server; HC-38m Hypercomputer (7 FPGAs, 6M Gates)]

Slide 14: Algorithms Developed*
* In AIAA & Military & Aerospace Programmable Logic Device (MAPLD) papers.
- n! => Probability: Combinations/Permutations AirSC
- CORDIC => Transcendentals: sin, log, exp, cosh…
- ∂y/∂x & ∫ f(x)dx => Runge-Kutta: CFD, Newmark Beta: CSM
- Matrix Equation Solver: [A]{x} = {b} - Gauss & Jacobi
- Nonlinear Analysis: Analog simulation avoids NLT devp’t time
- Matrix Algebra: {V}, [M], {V}^T{V}, [M]x[M], GCD, …
- Dynamic Analysis: [M]{ü} + [C]{u̇} + [K]{u} + NLT = {P(t)}
- Analog Computing: digital accuracy
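To illustrate why CORDIC suits gate-level hardware, here is a minimal rotation-mode sketch for sine and cosine. It is a generic textbook formulation and not the VIVA implementation from the papers; the function name, iteration count, and use of floating point are assumptions made for readability. In fixed-point hardware, the multiplications by 2^-i become bit shifts, so each iteration needs only shifts, adds, and a small table lookup.

```python
import math

def cordic_sin_cos(theta: float, iterations: int = 32) -> tuple[float, float]:
    """Rotation-mode CORDIC for |theta| <= pi/2; returns (sin, cos)."""
    # Elementary rotation angles atan(2^-i), normally stored in a lookup table.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    # Each micro-rotation stretches the vector; pre-scale by the inverse gain.
    gain = 1.0
    for i in range(iterations):
        gain /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = gain, 0.0, theta
    for i in range(iterations):
        d = 1.0 if z >= 0.0 else -1.0      # rotate toward zero residual angle
        shift = 2.0 ** -i                  # a right shift by i bits in hardware
        x, y = x - d * y * shift, y + d * x * shift
        z -= d * angles[i]
    return y, x

s, c = cordic_sin_cos(0.5)
print(s, math.sin(0.5))
print(c, math.cos(0.5))
```

Each added iteration contributes roughly one more bit of accuracy, which is why the iteration count is usually matched to the word length.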

Slide 15: Numeric Integration
f(x) = x^2
Each step adds f(x)*Δx to the running sum Σ f(x)*Δx and advances x_{i+1} = x_i + Δx.
[Diagram labels: Control; Output (area under curve); plot of f(x) versus x with step Δx]
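The accumulate-and-step loop on this slide can be written out in a few lines. This is a sketch of the rectangle rule as the slide describes it, not the VIVA circuit; the interval [0, 1] and the step count of 1000 are assumptions chosen for the example.

```python
def integrate_rectangles(f, x0: float, x1: float, n: int) -> float:
    """Sum f(x)*dx over n equal steps, advancing x by dx each step."""
    dx = (x1 - x0) / n
    x = x0
    area = 0.0
    for _ in range(n):
        area += f(x) * dx   # accumulate f(x) * delta-x
        x += dx             # x_{i+1} = x_i + delta-x
    return area

# Area under f(x) = x^2 on [0, 1]; the exact value is 1/3.
area = integrate_rectangles(lambda x: x * x, 0.0, 1.0, 1000)
print(area)
```

With left-hand rectangles the result slightly underestimates the exact area for an increasing function, and the error shrinks in proportion to the step size.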

Slide 16: VIVA Sparse Matrix Equation Solver: Jacobi Iterative (3x3 Demo)
[A]{x} = {b}
x_1 = (b_1 - A_12*x_2 - A_13*x_3)/A_11
[Diagram labels: Control; 3 Row Loads; 3 parallel (//) Dot Products]
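The Jacobi update shown for x_1 applies to every row in the same way, and each row depends only on the previous iterate, which is what allows the three dot products to run in parallel. A minimal sequential sketch follows; the matrix, right-hand side, and iteration count are made-up example values, not data from the presentation.

```python
def jacobi(A, b, iterations: int = 50):
    """Jacobi iteration: x_i = (b_i - sum_{j != i} A_ij * x_j) / A_ii."""
    n = len(b)
    x = [0.0] * n
    for _ in range(iterations):
        x_new = []
        for i in range(n):
            # Dot product of row i with the previous iterate, skipping the diagonal.
            off_diag = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x_new.append((b[i] - off_diag) / A[i][i])
        x = x_new   # all rows update together from the old x
    return x

# Hypothetical diagonally dominant 3x3 system whose solution is (1, 1, 1).
A = [[4.0, 1.0, 1.0],
     [1.0, 5.0, 2.0],
     [1.0, 2.0, 6.0]]
b = [6.0, 8.0, 9.0]
x = jacobi(A, b)
print(x)
```

Strict diagonal dominance, as in this example, is a sufficient condition for the iteration to converge.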

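Slide 17: Analog Diagram of 2x2 Equations Solution
y_i = (P - b z_{i-1})/a
z_i = (M - d y_{i-1})/e
[Diagram: two feedback loops, each with a multiply (x), subtract (-), and divide (/) stage. Input z -> b z -> P - b z -> divide by a -> output y; input y -> d y -> M - d y -> divide by e -> output z. Loop values are initialized before iterating.]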
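The two update equations correspond to the system a*y + b*z = P, d*y + e*z = M, with each unknown fed back into the other's equation. The sketch below iterates them from zero and adds a convergence guard: the feedback loop settles only when |b*d| < |a*e|. The function name and the coefficient values are assumptions made for the example.

```python
def solve_2x2_feedback(a, b, d, e, P, M, iterations: int = 60):
    """Iterate y_i = (P - b*z_{i-1})/a and z_i = (M - d*y_{i-1})/e."""
    # The error shrinks by a factor b*d/(a*e) every two steps.
    if abs(b * d) >= abs(a * e):
        raise ValueError("iteration does not converge: need |b*d| < |a*e|")
    y, z = 0.0, 0.0   # initialized loop values
    for _ in range(iterations):
        # Tuple assignment uses the previous y and z on the right-hand side.
        y, z = (P - b * z) / a, (M - d * y) / e
    return y, z

# Hypothetical system: 4y + z = 9, 2y + 5z = 12, with solution y = 11/6, z = 5/3.
y, z = solve_2x2_feedback(a=4.0, b=1.0, d=2.0, e=5.0, P=9.0, M=12.0)
print(y, z)
```

Swapping the two equations of a system that fails the guard often restores convergence, since it exchanges the roles of the diagonal and off-diagonal coefficients.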

Slide 18: Fixed-Point Iteration: VIVA Diagram

Slide 19: Year 2: Exploit Latest FPGAs
Plans:
- Millions of Matrix Equations: Structures, Electromagnetics & Acoustics
- Rapid Static & Dynamic Structural Analyses
- Cray Vector Computations in Weather Code (VT PhD)
- Robert on Administrator’s Fellowship at Star Bridge Systems
- Simulate advanced computing concepts using VIVA
- Collaborate with SBS, NSA, A&T… to expand VIVA libraries
- Tailor VIVA development for NASA applications
- Target applications to NASA programs (e.g. EDB Collaboration??)
Rapid Growth in FPGA Capability, FPGA (Feb ’01) -> FPGA (Oct ’02); “…” marks a value missing from the transcript:
- Xilinx FPGA: XC… -> XC2V…
- Gates: …K -> … million (97x)
- Multiplies on chip and Clock Speed (MHz): values missing; one ratio, (3x), survives
- Memory on chip: … Kb -> 3.5 Mb (175x)
- Memory Speed: 466 Gb/s -> 5 Tb/s (11x)
- Reconfigure Time: 100 ms -> 40 ms (2.5x)
- GFLOPS: … -> 47 (120x)
- Total GFLOPS: … (10 FPGAs) -> 329 (7 FPGAs)
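The surviving figures in the capability table can be cross-checked against numbers quoted elsewhere in the deck. The sketch below assumes that “(120x)” is the per-FPGA GFLOPS ratio and that the Feb ’01 total refers to a 10-FPGA board; both readings are inferred from the table layout rather than stated outright.

```python
# Oct '02 figures from the table.
per_fpga_gflops_2002 = 47.0
fpgas_2002 = 7
total_2002 = per_fpga_gflops_2002 * fpgas_2002          # 329 GFLOPS

# Assumed reading: "(120x)" is the per-FPGA GFLOPS growth since Feb '01.
per_fpga_gflops_2001 = per_fpga_gflops_2002 / 120.0     # about 0.39 GFLOPS
total_2001 = per_fpga_gflops_2001 * 10                  # 10-FPGA board, about 3.9 GFLOPS

print(f"Oct '02 total: {total_2002:.0f} GFLOPS")
print(f"Feb '01 per FPGA: {per_fpga_gflops_2001 * 1000:.0f} MFLOPS")
print(f"Feb '01 total: {total_2001:.1f} GFLOPS")
```

Under those assumptions the back-calculated Feb ’01 values agree with the 392+ MFLOPS per FPGA quoted on slide 9 and the “Year 1: 4 GFLOPS” figure in the summary.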

Slide 20: Summary: What We’re Learning
We like FPGA promise - accomplished much
- Hardware: Testing 3 futuristic FPGA systems
- FPGAs: Inherently parallel (//), flexible, efficient, & fast; dramatic advances
- VIVA: Powerful & growing (tailor to NASA needs)
- Applications: ’02 - Diverse “pathfinder” algorithms developed; ’03 - Comprehensive NASA engineering applications
- Speed: Year 1: 4 GFLOPS => Year 2: 329 GFLOPS
- Future: exploit capability on NASA “cutting edge” innovations

Slide 21: Langley Reconfigurable Computing Research
1. Singleterry, Robert C., Jaroslav Sobieszczanski-Sobieski, and Samuel Brown. “Field-Programmable Gate Array Computer in Structural Analysis: an Initial Exploration.” 43rd American Institute of Aeronautics and Astronautics (AIAA) Structures, Structural Dynamics, and Materials Conference. April 22-25,
2. Storaasli, Olaf O., Robert C. Singleterry, and Samuel Brown. “Scientific Computations on a NASA Reconfigurable Hypercomputer.” Abstract accepted for 5th Military and Aerospace Programmable Logic Devices (MAPLD) Conference, Paper in preparation. September 10-12,
3. Fithian, William, Samuel Brown, and Tyler Reed. “Object Synchronization in VIVA 1.5.” Briefing prepared for VIVA users at NASA Marshall, Eglin AFB, Progress Forge, Inc., and Star Bridge Systems, Inc. March 26,
4. Barr, Kristen, Shaun Foley, and Robert A. Lewis II. “Hypercomputing with the CORDIC Algorithm.” August, Presentation of research conducted under Dr. Olaf O. Storaasli, June-August,
5. Butler, Patrick. New Horizons Governors School Mentorship Project. May, Presentation of research conducted under Dr. Olaf O. Storaasli, September 2000 - May
6. Dandawate, Neha. “Reckless Speeding: The Investigation of the Programming Capabilities of the HAL Hypercomputer.” July, Presentation of research conducted under Dr. Olaf O. Storaasli, June - July,
7. Dandawate, Neha. “The Investigation of the Programming Capabilities of the HAL-15 Hypercomputer.” July, Paper on research conducted under Dr. Olaf O. Storaasli, June - July,
8. Fithian, William. “Developing a Matrix Equation Solver for the HAL-15 Hypercomputer.” December, Proposal for research to be conducted under Dr. Olaf O. Storaasli, September 2001 - May
9. Fithian, William. “Developing a Matrix Equation Solver for the HAL-15.” May, Presentation of research conducted under Dr. Olaf O. Storaasli, September 2001 - May
10. Fithian, William. “Jacobi Iterative Matrix Equation Solver for Star Bridge Systems FPGA Hypercomputer.” September, In preparation.
11. Foley, Shaun. “Scientific Hypercomputing.” August, Paper describing research conducted under Dr. Olaf O. Storaasli, June - August,
12. Krishnamurthy, Siddhartha. “Development of an Integration Algorithm for Field Programable Gate Arrays using VIVA.” July, Paper describing research conducted under Dr. Robert C. Singleterry, June - Aug
Further Information: Google: “olaf acmb”