Reconfigurable Supercomputing: What are the Problems? What are the Solutions? Reiner Hartenstein TU Kaiserslautern Dagstuhl, Germany, April 2 - 7, 2006.

Presentation transcript:

Reconfigurable Supercomputing: What are the Problems? What are the Solutions? Reiner Hartenstein TU Kaiserslautern Dagstuhl, Germany, April 2 - 7, 2006 Dynamically Reconfigurable Architectures

© 2006, TU Kaiserslautern 2 ISC2006 BoF Session Title and Abstract: Is Reconfigurable Computing the Next Generation Supercomputing? Advances in reconfigurable computing, particularly FPGA (field-programmable gate array) technology, have reached a performance level where they rival and exceed the performance of general purpose processors for the right applications. FPGAs have gotten cheaper thanks to smaller geometries, multimillion gate counts and volume market leverage from ASIC preproduction and other conventional uses. The potential benefit from the widespread incorporation of FPGA technology into high-performance applications is high, provided present day barriers to their incorporation can be overcome. This session will focus on defining the anticipated market changes, anticipated roles of FPGA technology in high-performance computing (from accelerators to hybrid architectures), characterizing present day barriers to the incorporation of FPGA technology (such as identifying the right applications), and partnering efforts required (tools, benchmarks, standards, etc.) to speed the adoption of reconfigurable technology in high-performance supercomputing. Keywords: Reconfigurable computing, FPGA Accelerators, Supercomputing. Date and Time: This BoF session is part of the conference program and will take place within a 45 minute-slot on Wednesday 28 June 2006 from 18: :30. BoF Organizers: John Abott, Chief Analyst, The 451 Group, USA. Dr. Joshua Harr, CTO, Linux Networx, USA. As CTO for Linux Networx, Dr. Joshua Harr has the responsibility of laying the technical roadmap for the company and is leading the team developing cluster management tools. Josh's experience with parallel processing, distributed computing, large server farms, and Linux clustering began when he built an eight-node cluster system out of used components while in college. An industry expert, Josh has been called upon to consult with businesses and lecture in college classrooms. He earned a Ph.D. in computational chemistry and a bachelor's degree in molecular biology from BYU. Dr. Eric Stahlberg, Organizing founder OpenFPGA, Ohio Supercomputer Center (OSC), USA.
The Supercomputing Paradox: rapidly growing listed Teraflops; often limited sustained Teraflops; almost stalled application implementation progress; increasing number of processors running in parallel; decreasing COTS processor cost; very high total cost of the Tera(?)flops; promising technology, poor results; scientists waiting for affordable compute capacity.

© 2006, TU Kaiserslautern 3 dangerously telling this to the supercomputing people: You … used the wrong roadmap the past 20 years !!!

© 2006, TU Kaiserslautern 4 progress stalled

© 2006, TU Kaiserslautern 5 Three Reconfigurable Computing Paradoxes: the high performance paradox; the low power paradox; the Reconfigurable Computing education paradox

© 2006, TU Kaiserslautern 6 The Pervasiveness of RC [Figure: # of hits by Google search “FPGA and ….”]

© 2006, TU Kaiserslautern 7 going into every application area Almost 10 million hits

© 2006, TU Kaiserslautern 8 The Reconfigurable Computing Education Paradox: its run-away accelerated pervasiveness, despite all these educational deficits. Curricula still ignore these extremely hot new challenges. In addition to the hardware / software chasm, we now also have the hardware / configware / software chasm.

© 2006, TU Kaiserslautern 9 Computing Curricula 2004 (1) Within about 500 pages the term reconfigurable is not found – nor its synonyms

© 2006, TU Kaiserslautern 10 von Neumann's monopoly inside curricula is obsolete

© 2006, TU Kaiserslautern 11 von Neumann is not the common model. [Diagram labels: von Neumann instruction-stream-based machine (CPU with program counter and DPU; RAM memory; von Neumann bottleneck). Mainframe age: CPU with co-processors. Microprocessor age: CPU with accelerator. Instruction-stream-based: software. Data-stream-based: hardware, morphware. The tail is wagging the dog. vN paradigm dominance? Dual paradigm.]

© 2006, TU Kaiserslautern 12 modern FPGA bestsellers: The new model is reality: FPGA fabrics, together with several µprocessors, several memory banks, and other IP cores, on the same COTS microchip

© 2006, TU Kaiserslautern 13 Bill Gates. Speech by Bill Gates at a summit meeting of US state governors: "American high schools are obsolete." "The high schools of today teach kids about today's computers like on a 50-year-old mainframe." "Without re-design for the needs of the 21st century, we will keep limiting - even ruining - the lives of millions of Americans every year."

© 2006, TU Kaiserslautern 14 carved out of stone. The most important cultural revolution since the invention of text characters: it's not the mainframe. It is the microchip!

© 2006, TU Kaiserslautern 15 RC education needed: 35 submissions from Australia, Brazil, India, USA, and throughout Europe. Jürgen Becker, Jörg Henkel, R. Hartenstein

© 2006, TU Kaiserslautern 16 Reconfigurable Computing Paradoxes: the high performance paradox; the low power paradox; the Reconfigurable Computing education paradox

© 2006, TU Kaiserslautern 17 The FPGA Low Power Paradox: "very power-hungry" [Rick Kornfeld*] *) personal communication. The awful technology of FPGAs: FPGAs run at lower clock frequencies, draw much more power and are more expensive. Reducing the electricity bill by an order of magnitude and more by supercomputer-to-FPGA migration.

© 2006, TU Kaiserslautern 18 telling this to the low power design people? You … used the wrong roadmap the past 15 years: use FPGAs! ISLPED, Oct 4 – 6, Tegernsee; PATMOS, Sep 13 – 15, Montpellier. 1991: Kaiserslautern, Germany; 1992: Paris, France; 1993: Montpellier, France

© 2006, TU Kaiserslautern 19 Reconfigurable Computing Paradoxes: the high performance paradox; the low power paradox; the Reconfigurable Computing education paradox

© 2006, TU Kaiserslautern 20 The High Performance Paradox. Effective integration density much worse than the Gordon Moore curve: by a factor of more than 10,000. 85% of all designers hate their tools. The awful technology of FPGAs: FPGAs run at lower clock frequencies and are more expensive.

© 2006, TU Kaiserslautern 21 fine-grained RC: 1st DeHon's Law [1996: Ph. D, MIT]. [Plot of density in transistors / microchip; curves: FPGA physical (Gordon Moore curve, microprocessor), FPGA routed, FPGA logical; annotations: wiring overhead, routing congestion, reconfigurability overhead.] Immense area inefficiency.

© 2006, TU Kaiserslautern 22 coarse-grained RC: Hartenstein's Law [1996: ISIS, Austin, TX]. [Plot of transistors / microchip; curves: Gordon Moore curve, FPGA routed, rDPA physical, rDPA logical.] Area efficiency very close to Moore's law, e.g. KressArray family.

© 2006, TU Kaiserslautern Claassen's Law [Plot: MOPS / milliWatt versus feature size (µ); curves: standard microprocessor, DSP instruction set processors, (fine grained reconf.) FPGAs, hardwired.]

© 2006, TU Kaiserslautern Claassen's Law [Plot: MOPS / milliWatt versus feature size (µ); curves: standard microprocessor, DSP instruction set processors, (fine grained reconf.) FPGAs, hardwired.] Hartenstein's Amendment: hardwired and coarse-grained reconf. (rDPA).

© 2006, TU Kaiserslautern 25 Selection of published speed-up factors: Los Alamos traffic simulation 47; real-time face detection 6000; video-rate stereo vision 900; pattern recognition 730; SPIHT wavelet-based image compression 457; Smith-Waterman pattern matching 288; BLAST 52; protein identification 40; molecular dynamics simulation 88; Reed-Solomon Decoding 2400; Viterbi Decoding 400; FFT; MAC; D FIR filter (no FPGA: DPLA by TU-KL) 39.4; Lee Routing (DPLA by TU-KL) 160; Grid-based DRC ("fair comparison"; no FPGA: DPLA on MoM by TU-KL); GRAPE 20. Application areas: DSP and wireless; image processing, pattern matching, multimedia; bioinformatics; astrophysics; crypto. MoM Xputer architecture. [Background plot: relative performance of microprocessor (P4) and memory; growth labels 7% / yr, 50% / yr, X 2 / yr.]
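
The growth labels on this slide (7% / yr, 50% / yr, X 2 / yr) are annual rates, so the distance between two curves compounds over the years. A minimal sketch of that arithmetic follows. Which rate belongs to which curve is not recoverable from the transcript, so the pairing in the example call is an assumption, and the function names `compound` and `gap` are hypothetical.

```python
# Illustrative arithmetic only: compound two annual growth rates and compare them.
# Pairing 50% / yr with one curve and 7% / yr with another is an assumption.


def compound(rate_per_year: float, years: int) -> float:
    """Relative performance after `years` of growth at `rate_per_year` (0.07 = 7% / yr)."""
    return (1.0 + rate_per_year) ** years


def gap(fast_rate: float, slow_rate: float, years: int) -> float:
    """Ratio between the faster and the slower growing curve after `years`."""
    return compound(fast_rate, years) / compound(slow_rate, years)


if __name__ == "__main__":
    # "X 2 / yr" corresponds to a rate of 1.0 (100% per year).
    for years in (5, 10, 20):
        print(years, round(gap(0.50, 0.07, years), 1), round(compound(1.0, years)))
```

The point of the sketch is only that a modest difference in annual rates turns into a large factor within a decade.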

© 2006, TU Kaiserslautern 26 2nd DeHon's Law [IEEE COMPUTER, 2000]. [Plot: computational density versus feature size (µ); curves: RISC, FPGA.]

© 2006, TU Kaiserslautern 27 The three RC Paradoxes: poor technology, brilliant results; poor tools; very poor education

© 2006, TU Kaiserslautern 28 Why supercomputing / HPC failed: instruction-stream-based, hence memory-cycle-hungry; the wrong way, how the data are moved around; instruction fetch overhead; address computation overhead; sequencing overhead; and other overhead because of the interconnect network architecture. The law of More:
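
The overheads named on this slide (instruction fetch, address computation, sequencing) can be made concrete with a toy memory-cycle count for one pass over an array. This is a sketch under stated assumptions, not a model taken from the talk: the per-item instruction counts are invented defaults, caches are ignored, and the names `LoopCost`, `instruction_stream_cost` and `data_stream_cost` are hypothetical.

```python
# Toy cost model: all per-item instruction counts are assumptions for illustration.
from dataclasses import dataclass


@dataclass
class LoopCost:
    data_accesses: int        # memory cycles that move operands
    instruction_fetches: int  # memory cycles that fetch instructions

    @property
    def total(self) -> int:
        return self.data_accesses + self.instruction_fetches


def instruction_stream_cost(n_items: int,
                            compute_instr: int = 1,
                            address_instr: int = 1,
                            sequencing_instr: int = 2) -> LoopCost:
    """Software loop: each item pays for fetching the load, compute,
    address-update and loop-control instructions, plus one operand access."""
    per_item_fetches = 1 + compute_instr + address_instr + sequencing_instr
    return LoopCost(data_accesses=n_items,
                    instruction_fetches=n_items * per_item_fetches)


def data_stream_cost(n_items: int) -> LoopCost:
    """Configured datapath: operands stream through, and no instruction
    is fetched at run time in this idealized model."""
    return LoopCost(data_accesses=n_items, instruction_fetches=0)


if __name__ == "__main__":
    n = 1000
    print(instruction_stream_cost(n).total, data_stream_cost(n).total)
```

With these assumed defaults the instruction-stream loop spends most of its memory cycles on instructions rather than on data, which is the sense of "memory-cycle-hungry" above; real machines soften this with instruction caches.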

© 2006, TU Kaiserslautern 29 Earth Simulator: 5120 processors, 5000 pins each. ES 20: TFLOPS. Crossbar weight: 220 t, 3000 km of cable. Moving data around inside the

© 2006, TU Kaiserslautern 30 data moved around by software i.e. by memory-cycle-hungry instruction streams which fully hit the memory wall P&R: move locality of operation, not data ! extremely unbalanced stolen from Bob Colwell CPU

© 2006, TU Kaiserslautern 31 An Archetype Common Model needed Guidance for organizing efficient solutions Make the project manageable Allow to share lessons between applications and between disciplines Useful simple archetype not widely accepted An archetype common model should provide.... Progress stalled by the software/configware chasm Configware Industry from the support undergraduate education

© 2006, TU Kaiserslautern 32 The new paradigm: how the data are traveling transport-triggered: an old hat pipeline, or chaining systolic array asynchronous (via handshake) wavefront array no, not by instruction execution

© 2006, TU Kaiserslautern 33 H. T. Kung systolic arrays: [figure: DPA (pipe network) with input data streams and output data streams; axes: time, port #] Def.: data streams (flowware). Flowware defines: ... which data item at which time at which port. source and sink ?
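The flowware definition on this slide (which data item, at which time, at which port) can be read as a plain schedule table. The following is a minimal sketch, assuming a hypothetical array with one input port per matrix row and a one-step skew between neighbouring ports; the function name `flowware_schedule` and the `skew` parameter are illustrative assumptions and do not come from the slides.

```python
from typing import List, Tuple


def flowware_schedule(matrix: List[List[int]], skew: int = 1) -> List[Tuple[int, int, int]]:
    """Return (time, port, item) triples for streaming `matrix` into an array.

    Assumption for illustration: row r is fed into port r, its first item
    enters at time r * skew, and one further item enters per time step.
    """
    schedule = []
    for port, row in enumerate(matrix):
        for k, item in enumerate(row):
            schedule.append((port * skew + k, port, item))
    # Sort by time, then port, so the table reads like a timing diagram.
    return sorted(schedule)


if __name__ == "__main__":
    for time, port, item in flowware_schedule([[1, 2, 3], [4, 5, 6]]):
        print(f"t={time} port={port} item={item}")
```

No instruction stream appears in this description: the schedule only states when each item arrives at each port, which is the point the slide makes about data streams.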

© 2006, TU Kaiserslautern 34 Data streams source and sink: not my job Not my Job!

© 2006, TU Kaiserslautern 35 [figure: array with input data streams and output data streams, surrounded by distributed memory] ASM: Auto-Sequencing Memory (figure labels: RAM, GAG). ASM implemented by distributed on-chip memory
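The idea of an auto-sequencing memory, a RAM block paired with an address generator that produces the data stream without fetching instructions, can be sketched in software. This is a minimal sketch under stated assumptions: the parameter set (`base`, counts and steps for two nested dimensions) is a hypothetical simplification chosen for illustration, not the parameterisation of the GAG cited in the deck.

```python
from typing import Iterator, List, Sequence


def gag_addresses(base: int, x_count: int, x_step: int,
                  y_count: int, y_step: int) -> Iterator[int]:
    """Yield linear addresses for a two-dimensional scan pattern.

    Hypothetical parameters: the pattern is fixed once (at configuration
    time) and then runs without any further instruction fetch.
    """
    for y in range(y_count):
        for x in range(x_count):
            yield base + y * y_step + x * x_step


def stream_from_memory(ram: Sequence[int], addresses: Iterator[int]) -> List[int]:
    """Read the RAM in generator order, producing an output data stream."""
    return [ram[a] for a in addresses]


if __name__ == "__main__":
    ram = list(range(100, 116))  # a 4 x 4 block stored row by row
    # Scan a 2 x 2 window starting at address 5 (row stride 4).
    print(stream_from_memory(ram, gag_addresses(5, 2, 1, 2, 4)))
```

Changing only the five parameters yields a different scan (for example a column-wise scan with `x_step=4, y_step=1`), which is roughly what "generic" means here.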

© 2006, TU Kaiserslautern 36 How the data are moved DMA, vN move processor [Jack Lipovski, EUROMiCRO, Nice, 1975] Henk Corporaal coins the term “transport-triggered” Application-specific distributed memory [Catthoor et al.] ASM use GAG generic address generator [TU-KL publ.: Tokyo NH journal] by the way: GAG st…. by TI [TI patent 1995] MoM: GAG-based storage scheme methodology [Herz*] *) [see Michael Herz et al.: ICECS 2002 (Dubrovnik)]

© 2006, TU Kaiserslautern 37 The dual paradigm approach von Neumann paradigm Kress-Kung paradigm Software Engineering Configware Engineering ASM CPU

© 2006, TU Kaiserslautern 38 Mathematical Synthesis Methods algebraic methods i. e., linear projections yields only uniform arrays w. linear pipes only for applications with regular data dependencies
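To make the linear-projection idea concrete, here is a small sketch that maps the iteration space of a matrix-vector product onto a linear array. The schedule `time = i + j` and the allocation `pe = i` are a textbook-style choice assumed for illustration; the slides do not give a specific projection. Because both maps are linear, every processing element does the same work at a fixed offset, which is why such methods yield only uniform arrays with linear pipes.

```python
from itertools import product
from typing import Dict, List, Tuple


def space_time_map(n: int, m: int) -> Dict[Tuple[int, int], Tuple[int, int]]:
    """Map iteration (i, j) of y[i] += a[i][j] * x[j] to (pe, time).

    Assumed projection: pe = i (project along j), time = i + j.
    """
    return {(i, j): (i, i + j) for i, j in product(range(n), range(m))}


def is_conflict_free(mapping: Dict[Tuple[int, int], Tuple[int, int]]) -> bool:
    """True if no two iterations occupy the same PE in the same time step."""
    slots = list(mapping.values())
    return len(slots) == len(set(slots))


def run_linear_array(a: List[List[int]], x: List[int]) -> List[int]:
    """Execute the iterations in schedule order, one accumulator per PE."""
    mapping = space_time_map(len(a), len(x))
    y = [0] * len(a)
    for (i, j), (pe, _time) in sorted(mapping.items(), key=lambda kv: kv[1][1]):
        y[pe] += a[i][j] * x[j]
    return y


if __name__ == "__main__":
    print(run_linear_array([[1, 2], [3, 4]], [5, 6]))
```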

© 2006, TU Kaiserslautern 39 Coarse-grained reconfigurable arrays are a Generalization of the Systolic Array.... discard algebraic synthesis methods [Rainer Kress] the achievement: also non-linear and non-uniform pipes, and even more wild pipe structures possible now reconfigurability really makes sense use optimization algorithms instead, for example: simulated annealing R. Kress
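As an illustration of using an optimization algorithm in place of algebraic synthesis, the sketch below places the nodes of a small pipe network on a grid by simulated annealing. It is a generic textbook formulation written for this example: the cost function (total Manhattan wire length), the swap move, the geometric cooling schedule and all names are assumptions, and it is not the algorithm of the DPSS or KressArray tools mentioned in the deck.

```python
import math
import random
from typing import Dict, List, Tuple

Cell = Tuple[int, int]


def wire_length(placement: Dict[str, Cell], edges: List[Tuple[str, str]]) -> int:
    """Total Manhattan distance over all connections."""
    return sum(abs(placement[a][0] - placement[b][0]) +
               abs(placement[a][1] - placement[b][1]) for a, b in edges)


def anneal_placement(nodes: List[str], edges: List[Tuple[str, str]],
                     rows: int, cols: int, steps: int = 5000,
                     t_start: float = 5.0, t_end: float = 0.01, seed: int = 0):
    """Return (best placement, best cost, initial cost). Needs len(nodes) <= rows * cols."""
    rng = random.Random(seed)
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    rng.shuffle(cells)
    # Pad with placeholder names so that nodes can also move into empty cells.
    slots = list(nodes) + [f"_free{k}" for k in range(len(cells) - len(nodes))]
    placement = dict(zip(slots, cells))
    cost = initial_cost = wire_length(placement, edges)
    best, best_cost = dict(placement), cost
    for step in range(steps):
        temp = t_start * (t_end / t_start) ** (step / steps)  # geometric cooling
        a, b = rng.sample(slots, 2)
        placement[a], placement[b] = placement[b], placement[a]
        delta = wire_length(placement, edges) - cost
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cost += delta
            if cost < best_cost:
                best, best_cost = dict(placement), cost
        else:
            placement[a], placement[b] = placement[b], placement[a]  # undo the swap
    return {n: best[n] for n in nodes}, best_cost, initial_cost


if __name__ == "__main__":
    names = list("abcdef")
    chain = list(zip(names, names[1:]))  # a simple linear pipe a -> b -> ... -> f
    placed, final, initial = anneal_placement(names, chain, rows=2, cols=4)
    print(initial, "->", final, placed)
```

Unlike a linear projection, nothing here requires the network to be regular: any edge list can be passed in, which matches the slide's point about non-linear and non-uniform pipes.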

© 2006, TU Kaiserslautern 40 array size: 10 x 16 = 160 rDPUs Coarse grain is about computing, not logic route-through only not used backbus connect SNN filter on KressArray (mainly a pipe network) [Ulrich Nageldinger] Example: mapping onto rDPA by DPSS: based on simulated annealing rDPU, 32 bit no CPU tool: KressArray Xplorer: diss. Ulrich Nageldinger (downloadable)

© 2006, TU Kaiserslautern 41 Software / Configware Co-Compilation Resource Parameters supporting different platforms Analyzer / Profiler SW code SW compiler paradigm “vN" machine CW Code CW compiler anti machine paradigm Partitioner C language source FW Code simulated annealing [Juergen Becker’s CoDe-X, 1996]

© 2006, TU Kaiserslautern 42 Software / Configware Co-Compilation Resource Parameters supporting different platforms Analyzer / Profiler SW code SW compiler paradigm “vN" machine CW Code CW compiler anti machine paradigm Partitioner C language source FW Code simulated annealing For thesis see book exhibit rack at library entrance [Juergen Becker’s CoDe-X, 1996]

© 2006, TU Kaiserslautern 43 Distributed Memory Parallelism Capability ASM array size example: 10 x 16 NN ports interconnect layer ASM backbus connect layers …

© 2006, TU Kaiserslautern 44 Applications for coarse-grained arrays (on-chip distributed memory for intermediate results) Multi-standard world HDTV receiver with steady I/O data streams at constant speed: Wide variety of multimedia applications Wide variety of real-time applications Many other applications

© 2006, TU Kaiserslautern 45 The wrong mind set.... „but you can't implement decisions!“ (remark of a high-ranked industrial research head – discussion after a talk by Ulrich Nageldinger – RAW Orlando)

© 2006, TU Kaiserslautern 46 a tiny section of the pipe network S +

© 2006, TU Kaiserslautern 47 The wrong mind set A B R C section of a very large pipe network: decision not knowing this solution: symptom of the hardware / software chasm and the configware / software chasm „but you can't implement decisions!“ =1=0

© 2006, TU Kaiserslautern 48 introducing hardware description languages (in the mid-seventies) “The decision box becomes a (de)multiplexer” This is so simple: why did it take decades to find out ? The wrong mind set – the wrong road map!
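The statement that the decision box becomes a multiplexer can be illustrated with the expression used in this deck's branching example, S = R + (if C then A else B). The sketch below is a software model only, with hypothetical function names: both candidate operands are always present at the multiplexer inputs, and the condition C merely selects one of them, so no branch in an instruction stream is involved.

```python
from typing import Iterable, List


def mux(select: int, when_one: int, when_zero: int) -> int:
    """Two-input multiplexer: both inputs are available, `select` picks one."""
    return when_one if select else when_zero


def pipe_section(r_stream: Iterable[int], a_stream: Iterable[int],
                 b_stream: Iterable[int], c_stream: Iterable[int]) -> List[int]:
    """Model of the pipe section S = R + mux(C, A, B), one result per step."""
    return [r + mux(c, a, b)
            for r, a, b, c in zip(r_stream, a_stream, b_stream, c_stream)]


if __name__ == "__main__":
    # Two time steps: C = 1 selects A, C = 0 selects B.
    print(pipe_section([10, 20], [1, 2], [3, 4], [1, 0]))
```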

© 2006, TU Kaiserslautern 49 hypothetical branching example to illustrate software-to-configware migration
S = R + (if C then A else B endif);
simple conservative CPU example (C = 1):
  step                             memory cycles   nanoseconds
  if C then read A
    read instruction               1               100
    instruction decoding
    read operand*                  1               100
    operate & reg. transfers
  if not C then read B
    read instruction               1               100
    instruction decoding
  add & store
    read instruction               1               100
    instruction decoding
    operate & reg. transfers
    store result                   1               100
  total                            5               500
*) if no intermediate storage in register file
[figure: section of a major pipe network on rDPU, inputs A, B, R, C, multiplexer (=1), adder (+), output S] clock 200 MHz (5 nanosec), no memory cycles: speed-up factor = 100
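The speed-up figure on this slide follows from simple arithmetic, reproduced below as a small sketch. The numbers (five memory cycles of 100 ns each, a 200 MHz clock) are taken from the slide; the variable and function names are illustrative, and the model assumes, as the slide does, that only memory cycles count on the CPU side and that the pipe section delivers one result per clock.

```python
MEMORY_CYCLE_NS = 100  # from the slide: each memory cycle costs 100 ns
CLOCK_MHZ = 200        # from the slide: rDPU clock of 200 MHz

# (step, memory cycles) for the case C = 1; steps without memory access are omitted.
CPU_STEPS = [
    ("read instruction (if C then read A)", 1),
    ("read operand A", 1),
    ("read instruction (if not C then read B)", 1),
    ("read instruction (add & store)", 1),
    ("store result", 1),
]


def cpu_time_ns(steps) -> int:
    """Total CPU time, counting memory cycles only."""
    return sum(cycles for _, cycles in steps) * MEMORY_CYCLE_NS


def clock_period_ns(mhz: float) -> float:
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000 / mhz


def speed_up() -> float:
    """CPU time divided by the time of one clock step of the pipe section."""
    return cpu_time_ns(CPU_STEPS) / clock_period_ns(CLOCK_MHZ)


if __name__ == "__main__":
    print(cpu_time_ns(CPU_STEPS), "ns vs", clock_period_ns(CLOCK_MHZ), "ns:",
          "speed-up", speed_up())
```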

© 2006, TU Kaiserslautern 50 why the RC paradigm shift is so important Move the stool or the grand piano? by Software by Configware

© 2006, TU Kaiserslautern 51 the data-stream-based approach has no von Neumann bottleneck … understand only this parallelism solution: the instruction-stream-based approach von Neumann bottlenecks... cannot cope with this one

© 2006, TU Kaiserslautern 52 What means Reconfigurable Computing? microprogramming? switching the multiplexers? concurrency of 64 or 256 CPUs on a single chip? routing ALU result to a register? it means using the Kress/Kung machine paradigm !

© 2006, TU Kaiserslautern 53 vN paradigm losing its dominance RAMP project proposes: Run LINUX on FPGAs

© 2006, TU Kaiserslautern 54 Cray XD1 vN paradigm losing its dominance Xilinx inside ! Xilinx FPGA

© 2006, TU Kaiserslautern 55 Recommended Pentium successor Discard most caches Have 64* cores with clever interconnect for: concurrent processes, for multithreading, and, Kung-Kress rDPA array The Desk-top Supercomputer!

© 2006, TU Kaiserslautern 56 What means Reconfigurable Computing ? The key issue: which is the underlying paradigm? Operation not based on instruction-streams at run time No instruction fetch at run time machine paradigm is data stream-based: Kress-Kung Undergraduate education needs a dual paradigm approach: symbiosis of von Neumann / Kress-Kung

© 2006, TU Kaiserslautern 57 thank you

© 2006, TU Kaiserslautern 58 END

© 2006, TU Kaiserslautern 59

© 2006, TU Kaiserslautern 60 Backup for Discussion:

© 2006, TU Kaiserslautern 61 Term to be used for „soft hardware“: accelware adaptware adjustware altware alterware arrangeware changeware conformware doughware fabricsware fabrixware fitware flexware formware FPware gateware gateroutware hpcware LUTware matchware modiware morphware® morfware mouldware muxware parware paraware passware pathware patchware performware perfware perware pipeware platformware railware rangeware RCware ressourceware routware routeware routingware RTware shapeware shuntware shuntingware speedware speedupware suiteware switchware switchingware streamware structware transferware transware variware varyware warpware xferware xware. send your proposal to: unfortunately “Morphware” is trademarked

© 2006, TU Kaiserslautern 62 Compilation: Software vs. Configware source program software compiler software code Software Engineering configware code mapper configware compiler scheduler flowware code source „ program “ Configware Engineering placement & routing data C, FORTRAN MATLAB

© 2006, TU Kaiserslautern 63 Co-Compilation software compiler software code Software / Configware Co-Compiler configware code mapper configware compiler scheduler flowware code data C, FORTRAN, MATLAB automatic SW / CW partitioner

© 2006, TU Kaiserslautern 64 Why use Reconfigurable Computing Exploit spatial parallelism, and.. … high bandwidth and low latency memory access Ride the technology curve avoiding specific silicon Adapt to change: standards, trends, ….. Reduce risk Adapt to application / deployment requirements instead of spec. hardware? instead of software? … and fine-grained parallelism when useful

© 2006, TU Kaiserslautern 65 Computing Curricula 2004 (2) # CE Configware Engineering missing volume: CE missing

© 2006, TU Kaiserslautern 66 Computing Curricula 2004 (3)

© 2006, TU Kaiserslautern 67 Computing Curricula 2004 (4) … how it should be CONFIGWARE MORPHWARE morphware and configware added

© 2006, TU Kaiserslautern 68