Enabling Technologies for Reconfigurable Computing and Software / Configware Co-Design Part 3: Resources.

Presentation transcript:

Enabling Technologies for Reconfigurable Computing and Software / Configware Co-Design Part 3: Resources for RC and Data-Stream-based Computing - Reiner Hartenstein, University of Kaiserslautern. July 8, 2002, ENST, Paris, France

© 2002, University of Kaiserslautern 2 Schedule timeslot – Reconfigurable Computing (RC) – coffee break – Data-Stream-based Computing – lunch break – Resources for RC and Data-Stream-based Computing – Recent developments – Discussion

© 2002, University of Kaiserslautern 3 >> Configware Industry Configware Industry Terminology MoPL data-procedural language Anti architecture and circuitry Stream-based Memory Architecture

© 2002, University of Kaiserslautern 4 Configware heading for mainstream Configware market taking off for mainstream FPGA-based designs more complex, even SoC No design productivity and quality without good configware libraries (soft IP cores) from various application areas. FPGA vendors and a growing no. of independent configware houses (soft IP core vendors) and design services.

© 2002, University of Kaiserslautern 5 OS for PLDs separate EDA software market, comparable to the compiler / OS market in computers, Cadence, Mentor, Synopsys just jumped in. < 5% Xilinx / Altera income from EDA software Alliances with hundreds of partners providing hundreds of IP cores, synthesizable (hopefully) (WWW sites difficult to navigate)

© 2002, University of Kaiserslautern 6 >> Terminology Configware Industry Terminology MoPL data-procedural language Anti architecture and circuitry Stream-based Memory Architecture

© 2002, University of Kaiserslautern 7 Terminology

© 2002, University of Kaiserslautern 8 Terminology & Acronyms
Software (SW): procedural sources*
Configware (CW): structural sources
Hardware (HW): hardwired platforms
ASIC: customizable hardwired platforms
Flexware (FW): reconfigurable platforms
FPGA: field-programmable gate array
FPL: field-programmable logic
RC: reconfigurable computing
RL: reconfigurable logic
RA: reconfigurable array
DPU: datapath unit
DPA: datapath array
rDPU: reconfigurable DPU
rDPA: reconfigurable DPA
*) note: firmware is SW!

© 2002, University of Kaiserslautern 9 Babylonian Confusion: Communication between areas, and between abstraction levels, is difficult – mainly because of non-intuitive, misleading or ambiguous terminology

© 2002, University of Kaiserslautern 10 >> MoPL data-procedural language Configware Industry Terminology MoPL data-procedural language Anti architecture and circuitry Stream-based Memory Architecture

© 2002, University of Kaiserslautern 11 Fundamental Ideas available (1) Data Sequencer Methodology Data-procedural Languages (Duality with v N)... supporting memory bandwidth optimization Soft Data Path Synthesis Algorithms Parallelizing Loop Transformation Methods Compilers supporting Soft Machines SW / CW Partitioning Co-Compilers

© 2002, University of Kaiserslautern 12 Fundamental Ideas available (2) Programming Xputers Similarities to programming computers How not to get confused by similarities What benefits vs. Computers ?

© 2002, University of Kaiserslautern 13 Programming Language Paradigms easy to learn

© 2002, University of Kaiserslautern 14 Similar Programming Language Paradigms very easy to learn

© 2002, University of Kaiserslautern 15 JPEG zigzag scan pattern (published in 1993)

*> Declarations
EastScan is
  step by [1,0]
end EastScan;
SouthScan is
  step by [0,1]
end SouthScan;
NorthEastScan is
  loop 8 times until [*,1]
    step by [1,-1]
  endloop
end NorthEastScan;
SouthWestScan is
  loop 8 times until [1,*]
    step by [-1,1]
  endloop
end SouthWestScan;
HalfZigZag is
  EastScan
  loop 3 times
    SouthWestScan
    SouthScan
    NorthEastScan
    EastScan
  endloop
end HalfZigZag;

goto PixMap[1,1]
HalfZigZag;
SouthWestScan
uturn (HalfZigZag)

[Figure: scan path over PixMap with axes x and y; labels: HalfZigZag, data counter]
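
The scan declarations on this slide can be traced in software. The following is a minimal Python sketch, not MoPL and not the original tooling: the `DataCounter` class and all function names are hypothetical, the pixel map is assumed to be 8 x 8 with y growing southward, and `uturn (HalfZigZag)` is assumed to mean "replay the recorded steps of HalfZigZag in reverse order".

```python
# Step vectors [dx, dy] as given on the slide; y is assumed to grow southward.
EAST, SOUTH = (1, 0), (0, 1)
NORTH_EAST, SOUTH_WEST = (1, -1), (-1, 1)


class DataCounter:
    """Hypothetical stand-in for the data counter: holds (x, y), logs moves."""

    def __init__(self, x, y):
        self.x, self.y = x, y
        self.path = [(x, y)]  # every address visited so far
        self.steps = []       # every step vector applied so far

    def step(self, vec):
        self.x += vec[0]
        self.y += vec[1]
        self.path.append((self.x, self.y))
        self.steps.append(vec)


def east_scan(dc):
    dc.step(EAST)


def south_scan(dc):
    dc.step(SOUTH)


def north_east_scan(dc):
    # loop 8 times until [*,1]: stop once y == 1 is reached
    for _ in range(8):
        dc.step(NORTH_EAST)
        if dc.y == 1:
            break


def south_west_scan(dc):
    # loop 8 times until [1,*]: stop once x == 1 is reached
    for _ in range(8):
        dc.step(SOUTH_WEST)
        if dc.x == 1:
            break


def half_zigzag(dc):
    east_scan(dc)
    for _ in range(3):
        south_west_scan(dc)
        south_scan(dc)
        north_east_scan(dc)
        east_scan(dc)


def zigzag_8x8():
    dc = DataCounter(1, 1)            # goto PixMap[1,1]
    half_zigzag(dc)
    half_steps = list(dc.steps)       # steps recorded for HalfZigZag
    south_west_scan(dc)               # the anti-diagonal in the middle
    for vec in reversed(half_steps):  # assumed meaning of uturn(HalfZigZag)
        dc.step(vec)
    return dc.path


path = zigzag_8x8()
print(path[:6])
```

Under these assumptions the path starts (1,1), (2,1), (1,2), (1,3), (2,2), (3,1), visits each of the 64 positions exactly once, and ends at (8,8).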

© 2002, University of Kaiserslautern 16 >> Anti architecture and circuitry Configware Industry Terminology MoPL data-procedural language Anti architecture and circuitry Stream-based Memory Architecture

© 2002, University of Kaiserslautern 17 GAG = Generic Address Generator. GAU (generic address unit) scheme: Base Slider (B0), Limit Slider (L0) and Address Stepper (A, ΔA); all 3 are copies of the same BSU stepper circuit. [Figure: GAU block diagram with base B and limit L]

© 2002, University of Kaiserslautern 18 GAG = Generic Address Generator. BSU = Basic Stepper Unit (stepper sequencing). [Figure: BSU block diagram; labels: address A, + / – ΔA (stepVector), B (Base), L (Limit), Step Counter compared with maxStepCount, Escape Clause, End Detect (endExec), init, tag]
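
The stepper idea on these two slides can be illustrated with a toy software model. This is only a sketch under assumptions and not the published circuit: a BSU is modelled as counting from a base toward a limit by a step value, bounded by a maximum step count, and a GAU is modelled as an address stepper whose base and limit are supplied by two further steppers (the sliders). The class, function and field names are hypothetical and merely echo the labels on the slides.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class BasicStepper:
    """Hypothetical software model of one BSU (names echo the slide labels)."""
    base: int            # B: start value
    limit: int           # L: last permitted value
    step: int            # delta A (stepVector); may be negative
    max_step_count: int  # upper bound enforced by the step counter

    def run(self) -> Iterator[int]:
        lo, hi = sorted((self.base, self.limit))
        a = self.base                       # init: load the base value
        for _ in range(self.max_step_count + 1):
            yield a
            nxt = a + self.step
            # assumed end detect: stop when the next value leaves [B, L]
            if self.step == 0 or not (lo <= nxt <= hi):
                return
            a = nxt


def gau_addresses(base_slider: BasicStepper, limit_slider: BasicStepper,
                  step: int, max_step_count: int) -> Iterator[int]:
    """Assumed GAU behaviour: for every (base, limit) pair delivered by the
    two sliders, the address stepper sweeps once from base to limit."""
    for b, l in zip(base_slider.run(), limit_slider.run()):
        yield from BasicStepper(b, l, step, max_step_count).run()


# Three windows [0..3], [4..7], [8..11], each swept with step 1.
addresses = list(gau_addresses(BasicStepper(0, 8, 4, 10),
                               BasicStepper(3, 11, 4, 10),
                               step=1, max_step_count=10))
print(addresses)
```

Using the same class for both sliders and for the address stepper mirrors the slide's remark that all three are copies of the same stepper circuit.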

© 2002, University of Kaiserslautern 19 GAG Complex Sequencer Implementation. [Figure: GAG (Generic Address Generator) containing GAUs, each with Limit Slider (L0), Base Slider (B0) and Address Stepper (A, ΔA); further labels: SDS, VLIW stack]

© 2002, University of Kaiserslautern 20 Generic Sequence Examples: atomic scan, linear scan, video scan, -90º rotated video scan, -45º rotated (mirx (v scan)), sheared video scan, non-rectangular video scan, zigzag video scan, spiral scan, perfect shuffle, feed-back-driven scans. [Figure: panels a) to g); GAU with Limit Slider (L0), Base Slider (B0) and Address Stepper (A, ΔA)]

© 2002, University of Kaiserslautern 21 Slider Demo. [Figure: GAU with Base Slider (B0), Limit Slider (L0) and Address Stepper (A, ΔA); labels: address, floor F]

© 2002, University of Kaiserslautern 22 XMDS Scan Pattern Editor GUI

© 2002, University of Kaiserslautern 23 >> Stream-based Memory Architecture Configware Industry Terminology MoPL data-procedural language Anti architecture and circuitry Stream-based Memory Architecture

© 2002, University of Kaiserslautern 24 MoM Xputer Architecture rDPA Multiple RAM banks Smart memory interface Scan Window „Cache“ published in 1990

© 2002, University of Kaiserslautern 25 Antimachine: MoM architecture

© 2002, University of Kaiserslautern 26 Linear Filter Application 11 x 22: initial 9 x 20 = 180 [Dissertation Michael Herz] 1620

© 2002, University of Kaiserslautern 27 Linear Filter: scanline unrolling 3 x 20 = 60; 900
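
One possible reading of the numbers on the two linear-filter slides, offered as an assumption and not taken from the cited dissertation: a 3 x 3 scan window over the 22 x 11 image has 20 x 9 = 180 positions and reads 9 pixels at each, giving 1620 accesses. Unrolling three scan lines into one 3 x 5 window leaves 20 x 3 = 60 positions with 15 pixels each, giving 900 accesses. The helper below (a hypothetical name) only reproduces this arithmetic.

```python
def window_accesses(img_w, img_h, win_w, win_h, step_y=1):
    """Count scan-window positions and pixel reads for a sliding window.

    The window moves one pixel at a time in x and step_y pixels in y.
    """
    cols = img_w - win_w + 1
    rows = (img_h - win_h) // step_y + 1
    positions = cols * rows
    return positions, positions * win_w * win_h


initial = window_accesses(22, 11, 3, 3)             # assumed 3 x 3 window
unrolled = window_accesses(22, 11, 3, 5, step_y=3)  # assumed 3 x 5 window
print(initial, unrolled)
```

With these assumed window sizes the sketch yields (180, 1620) and (60, 900), matching the figures printed on the slides.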

© 2002, University of Kaiserslautern o Rotation of Scan Pattern 3 x 10 =

© 2002, University of Kaiserslautern 29 Linear Filter Application: final design after inner scan line loop unrolling. Example image of x = 22 by y = 11 pixels. Speed-up factor: 11.2. [Figure: Parallelized Merged Buffer; stages: initial design, after scan line unrolling, hardware-level access optimization, final design]

© 2002, University of Kaiserslautern 30 MoM Application Examples Image Processing Grid-based design rule check [1983 * ] –4 by 4 word scan cache –Pattern-matching based –Our own nMOS „DPLA“ design –design rule violation pixel map automatically generated from textual design rules –256 M&C nMOS, 800 single metal CMOS –Speed-up > vs. Motorola *) „machine“ not yet discovered

© 2002, University of Kaiserslautern 31 MoM Architecture Features Scan Cache Size adjustable at run time Any other shape than square supported 2-dimensional memory space Supports generic „scan patterns“ –Subject of parallel access transformations –compare Francky Catthoor et al. Supports visualization

© 2002, University of Kaiserslautern 32 Hot Research Topic: Memory Architectures High Performance Embedded Memory Architectures [Catthoor et al.] High Performance Memory Communication Architectures [Herz] Custom Memory Management Methodology [Catthoor et al.] Data Reuse Transformations [Kougia et al.] Data Reuse Exploration [Soudris, Wuytack] Rapidly growing market: IP cores, module generators etc.

© 2002, University of Kaiserslautern 33 Processor Memory Performance Gap von Neumann bottleneck

© 2002, University of Kaiserslautern 34 rDPAs: classical cache does not help. The memory bandwidth problem is often more dramatic than for microprocessors. Classical interleaving is not practicable, since it is based on sequential instruction streams. Classical caches do not help, since instruction sequencing is not used. The problem: throughput of parallel data streams, not instruction streams. Super pipe networks, not parallel computers! Stream-based arrays are a memory bandwidth problem.

© 2002, University of Kaiserslautern 35 Cache does not help.... however, the anti machine has no v.N. bottleneck!

© 2002, University of Kaiserslautern 36 Data-Stream-based Soft Anti Machine. [Figure: Compiler; Scheduler; Sequencers (data stream generator); Memory (data memory) with memory banks; rDPA; “instructions”]

© 2002, University of Kaiserslautern 37 The Disk Farm? or a System On a Card? The 500GB disc card: LOTS of bandwidth. A few disks replaced by >10s Gbytes RAM and a processor. 14" MicroDrive: 1.7” x 1.4” x 0.2”. 2006: ? 1999: 340 MB, 5400 RPM, 5 MB/s, 15 ms seek. 2006: 9 GB, 50 MB/s ? (1.6X/yr capacity, 1.4X/yr BW). Integrated IRAM processor, 2x height, connected via crossbar switch, growing like Moore’s law: 16 Mbytes; ; 1.6 Gflops; 6.4 Gops. 10,000+ nodes in one rack! 100/board = 1 TB; 0.16 Tflops [Gordon Bell, Jim Gray, ISCA2000]
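
The 2006 MicroDrive figures on this slide follow from compounding the stated yearly growth factors over the seven years from 1999. A minimal sketch, assuming plain compound growth (the function name is illustrative):

```python
def project(value_1999, factor_per_year, years=2006 - 1999):
    """Compound a 1999 value forward with a constant yearly growth factor."""
    return value_1999 * factor_per_year ** years


capacity_mb = project(340, 1.6)   # capacity in MB, 1.6X per year
bandwidth_mb_s = project(5, 1.4)  # transfer rate in MB/s, 1.4X per year
print(f"{capacity_mb / 1000:.1f} GB, {bandwidth_mb_s:.0f} MB/s")
```

This gives roughly 9.1 GB and 53 MB/s, in line with the slide's rounded 9 GB and 50 MB/s.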

© 2002, University of Kaiserslautern 38 >>> Coarse Grain - END -

© 2002, University of Kaiserslautern 39 Appendix - APPENDIX -

© 2002, University of Kaiserslautern 40 Alliances

© 2002, University of Kaiserslautern 41 Xilinx Alliances The Software Alliance EDA Program... Xilinx Inc.'s Foundation... free WebPACK downloadable tool palette The Xilinx XtremeDSP Initiative (with Mentor Graphics) MathWorks / Xilinx Alliance. The Wind River / Xilinx alliance

© 2002, University of Kaiserslautern 42 The Software Alliance EDA Program provides a wide selection of EDA tools: Acugen Software, Agilent EEsof EDA, Aldec, Aptix, Auspy Development, Cadence, Celoxica, Dolphin Integration, Elanix, Exemplar, Flynn Systems, Hyperlynx, IKOS Systems, Innoveda, Mentor Graphics, MiroTech, Model Technology, Protel International, Simucad, SynaptiCAD, Synopsys, Synplicity, Translogic, Virtual Computer Corporation. It helps leading EDA vendors to integrate Xilinx Alliance software tightly into their tools.

© 2002, University of Kaiserslautern 43 The Xilinx AllianceCORE program a cooperation between Xilinx and third-party core developers, to produce a broad selection of industry-standard solutions for use in Xilinx platforms. - Partners are: Amphion Semiconductor, Ltd. ARC Cores CAST, Inc. DELTATEC Derivation Systems, Inc. Dolphin Integration (Grenoble) Eureka Technology Inc. Frontier Design Inc. GV & Associates, Inc. inSilicon Corporation iCODING Technology Inc. Loarant Corporation Mindspeed Technologies - A Conexant Business (formerly Applied Telecom) | MemecCore Mentor Graphics Inventra NewLogic Technologies, Inc. (Europe) NMI Electronics Paxonet Communications, Inc. Perigee, LLC Rapid Prototypes Inc. sci-worx GmbH (Hannover, Germany) SysOnChip TILAB (Telecom Italia Lab) VAutomation Virtual IP Group, Inc. XYLON.

© 2002, University of Kaiserslautern 44 The Xilinx Reference Design Alliance Program The Xilinx Reference Design Alliance Program helps the development of multi-component reference designs that incorporate Xilinx devices and other semiconductors. The designs are fully functional, but come with no warranties and no liability. Partners are: ADI Engineering, Innovative Integration, JK microsystems, Inc., LYR Technologies, NetLogic Microsystems

© 2002, University of Kaiserslautern 45 The Xilinx University Program The Xilinx University Program provides Xilinx Student Edition Software, Professor Workshops, a Xilinx University User Group, Presentation Materials and Lab Files, Course Examples, Research, Books, etc.

© 2002, University of Kaiserslautern 46 Altera offers over a hundred IP cores (1) Altera offers over a hundred IP cores like, for example: controller, UART, microprocessor, decoder, bus control, USB controller, PCI bus interface, Viterbi controller, fast Ethernet MAC receiver or transmitter, modulator, synchronizer, DDR SDRAM controller, Hadamard transform, interrupt controller, Real86 16 bit microprocessor, floating point, FIR filter, discrete cosine, ATM cell processor, and many others.

© 2002, University of Kaiserslautern 47 Altera offers over a hundred IP cores (2) from Altera | AMIRIX Systems, Inc. Amphion Semiconductor, Ltd. Arasan Chip Systems, Inc. CAST, Inc. Digital Core Design Eureka Technology Inc. HammerCores Innocor Ktech Telecommunications, Inc. Lexra Computing Engines Mentor Graphics - Inventra Modelware Ncomm, Inc. NewLogic Technologies Northwest Logic Nova Engineering, Inc. Palmchip Corporation Paxonet Communications PLD Applications Sciworx Simple Silicon Tensilica TurboConcept.

© 2002, University of Kaiserslautern 48 Altera IP core design services Altera IP core design services are available from: Northwest Logic

© 2002, University of Kaiserslautern 49 Altera Certified Design Center (CDC) Program Certified Design Center (CDC) Program: Barco Silex El Camino GmbH Excel Consultants Plextek Reflex Consulting Sci-worx Tality Zaiq Technologies.

© 2002, University of Kaiserslautern 50 The Altera Consultants Alliance Program (ACAP): The Altera Consultants Alliance Program (ACAP): lists 41 offices in North America and 29 in the rest of the world.

© 2002, University of Kaiserslautern 51 Development boards Development boards are offered from: Altera El Camino GmbH Gid'el Limited Nova Engineering, Inc. PLD Applications Princeton Technology Group RPA Electronics Design, LLC Tensilica.

© 2002, University of Kaiserslautern 52 Consultants and services not listed by Xilinx nor Altera (index) Algotronix, Edinburgh, Andraka Consulting Group Arkham Technology, Pasadena, CA Barco Silex, Louvain-la-Neuve, Belgium, Bottom Line Technologies, Milford, NJ Codelogic, Helderberg, South Africa, Coelacanth Engineering, Norwell, MASS Comit Systems, Inc., Santa Clara, CA EDTN Programmable Logic Design Center Flexibilis, Tampere, Finland, Geoff Bostock Designs, Wiltshire, England, Great River Technology, Albuquerque, NM, New Horizons GB Ltd, United Kingdom, North West Logic Silicon System Solutions, Canterbury, Australia, Smartech, Tampere, Finland, Tekmosv, Austin, Texas, The Rockland Group, Garden Valley, CA Nick Tredennick, Los Gatos, California, Vitesse,

© 2002, University of Kaiserslautern 53 Consultants and services not listed by Xilinx nor Altera (1) Algotronix, Edinburgh, Reconfigurable Computing and FPL in software radio, communications and computer security Andraka Consulting Group high performance FPGA designs for DSP applications Arkham Technology, Pasadena, low cost IP cores for Xilinx and Atmel, embedded processor, DSP, wireless communication, COM / CORBA / DirectX, client-server database programming, software internationalization, PCB design Barco Silex, Louvain-la-Neuve, Belgium, IP integration boards for ASIC and FPGA, consultancy, design, sub-contracting

© 2002, University of Kaiserslautern 54 Consultants and services not listed by Xilinx nor Altera (2) Bottom Line Technologies, Milford, New Jersey, FPGA design, training, designing Xilinx parts since 1985 Codelogic, Helderberg, South Africa, consulting, FPGA design services Coelacanth Engineering, Norwell, Massachusetts, design services, test development services, in wireless communication, DSP-based instrumentation, mixed-signal ATE Comit Systems, Inc., Santa Clara, California, DSP, ASIC, networking, embedded control in avionics -- FPGA / ASIC design and system software EDTN Programmable Logic Design Center

© 2002, University of Kaiserslautern 55 Consultants and services not listed by Xilinx nor Altera (3) FirstPass, Castle Rock, Colorado Vitesse, ASIC design Flexibilis, Tampere, Finland, VHDL IP cores for Xilinx products Geoff Bostock Designs, Wiltshire, England, FPGA design services Great River Technology, Albuquerque, New Mexico, FPGA design services in digital video and point-to-point data transmission for aerospace, military, and commercial broadcasters New Horizons GB Ltd, United Kingdom, FPGA design and training, Xilinx specialist North West Logic; FPGA and embedded processor design in digital communications, digital video

© 2002, University of Kaiserslautern 56 Consultants and services not listed by Xilinx nor Altera (4) Silicon System Solutions, Canterbury, Australia, VHDL IP cores for the ASIC and FPGA/CPLD/EPLD markets Smartech, Tampere, Finland, ASIC and FPGA design Tekmosv, Austin, Texas, Multiple Designs on a Single Gate Array, HDL synthesis, design conversions, chip debug, test generation The Rockland Group, Garden Valley, California, a TeleConsulting organization about logic design for FPGAs Nick Tredennick, Los Gatos, California, investor and consultant

© 2002, University of Kaiserslautern 57 Terms

© 2002, University of Kaiserslautern 58 Confusing Terminology Computer Science and EE, as well as their R&D and application areas, suffer from a Babylonian confusion. Communication not only between Computer Science and EE, but also between their special areas, and even between their different abstraction levels, is made difficult – mainly because of immature terminology relating to reconfigurable circuits and their applications. Terms are rarely standardized and often used with drastically different meanings – even within the same special area. Often terms have been so badly coined that they are not self-explanatory, but misleading. An illustrative example is the comparison of terms used in VHDL and Verilog. Ideal are "intuitive" terms. But often intuition yields the wrong idea. Whenever a new term appears in teaching, I often have to tell the students that the term does not mean what they believe.

© 2002, University of Kaiserslautern 59 Terms (1)
Term | Meaning | Example
Hardware | hardwired | Processor, ASIC
Flexware | Reconfigurable (structurally programmable) | FPLA, FPGA, KressArray
Firmware | Microprogramme (rarely used after introduction of RISC proc.) | IBM 360 Computer Family
Software | procedural programs (sequentially executable by a CPU) | Word, C, OS, Compiler, etc.
Configware | structural programs, soft IP cores, personalizing CPLD, FPGA, or other Flexware | for rDPA FPGA configuration, e. g. as a logic circuit, state machine, datapath, function
[à la Ingo Kreuz]

© 2002, University of Kaiserslautern 60 Terms (2)
Term | Meaning | Example
data | objects of computing; “data” property depends on the moment of watching | Bits, numbers, operands, results, any text (also compiler input), lists, graphs, tables, images, ...
data stream | ordered, also parallel data word lists, obtained by scheduling | I/O data streams for systolic or other arrays
programming | personalisation by loading program code | procedural code or structural code: for (re)configuration
program | source text or object code for programming | procedural or structural
[à la Ingo Kreuz]

© 2002, University of Kaiserslautern 61 Terms (3)
Term | Meaning | Example
boot program | simple program to enable programming - usually saved in non-volatile memory | comparable to the starter of the motor of a car
booting | load and execute a boot program |
[à la Ingo Kreuz]

© 2002, University of Kaiserslautern 62 Hardware Terms (1)
Term | Meaning | Example
machine | execution unit, driven by deterministic sequencer | von Neumann machine
„dataflow machine“ | not a machine, since without a deterministic sequencer (exotic concept) | (sleeping research area)
CPU | Instruction Set Processor ("von Neumann”): program counter (instruction sequencer) and DPU - mode of operation: deterministically instruction-driven | ARM, Pentium core
[à la Ingo Kreuz]

© 2002, University of Kaiserslautern 63 Hardware Terms (2)
Term | Meaning | Example
DPU | data path unit, processes operands - not a CPU since without sequencer - not a machine | ALU with registers, multiplexers etc.
Computer | CPU with RAM and interfaces |
Parallel Computer | ensemble of several Computers |
Xputer | deterministically data-driven machine (transport-triggered) - data counter(s) used instead of a program counter | MoM architectures (Kaiserslautern)
dataflow machine | indeterministically data-driven (execution sequence unpredictable) | (sleeping research area)
[à la Ingo Kreuz]
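
The distinction drawn in these term tables between an instruction-driven CPU (program counter) and a data-driven Xputer (data counter) can be illustrated with a toy model. This is only a sketch under assumptions: the three-instruction set, both function names and the memory layout are invented for illustration and do not describe the MoM architecture.

```python
def run_cpu(program, memory):
    """Instruction-driven: a program counter selects the next instruction."""
    pc, acc = 0, 0
    while program[pc][0] != "halt":
        op, arg = program[pc]
        if op == "load":
            acc = memory[arg]
        elif op == "add":
            acc += memory[arg]
        elif op == "store":
            memory[arg] = acc
        pc += 1  # sequencing happens over the instruction stream
    return memory


def run_xputer(dpu, data_addresses, memory, result_address):
    """Data-driven: the operation (dpu) is fixed before run time and a data
    counter walks a precomputed address sequence over the data memory."""
    acc = 0
    for addr in data_addresses:  # sequencing happens over the data stream
        acc = dpu(acc, memory[addr])
    memory[result_address] = acc
    return memory


# Sum four words into memory[4], once per model.
program = [("load", 0), ("add", 1), ("add", 2), ("add", 3),
           ("store", 4), ("halt", None)]
cpu_result = run_cpu(program, [5, 7, 11, 13, 0])
xputer_result = run_xputer(lambda acc, word: acc + word,
                           range(4), [5, 7, 11, 13, 0], 4)
print(cpu_result, xputer_result)
```

Both models compute the same sum; the difference is that the first fetches an instruction per step while the second fetches only data, with no instruction stream at run time.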

© 2002, University of Kaiserslautern 64 Terms on Parallelism (1)
Term | Meaning | Example
parallelism | several levels of parallelism distinguished | parallel processes, parallelism at instruction set level, pipelines
concurrent | parallel processes run on different CPUs of a parallel computer - may occasionally exchange signals or data | weather forecasting, complex simulations, etc.
ISP (instruction set parallelism) | several CPUs run in parallel by clocked synchronization | VLIW (very long instruction word) computer
[à la Ingo Kreuz]

© 2002, University of Kaiserslautern 65 Terms on Parallelism (2)
Term | Meaning | Example
pipelining | several uniform or different DPUs running simultaneously - connected to a pipeline by buffer registers | pipelined CPUs, pipe networks, systolic, etc.
chaining | several uniform or different DPUs running simultaneously - connected to a pipeline without buffer registers | combinational networks, complex arithmetic operators
Pipe network | Ensemble of DPUs, also multiple pipelines, also with irregular or wild structures | systolic arrays, stream-based computing arrays
[à la Ingo Kreuz]

© 2002, University of Kaiserslautern 66 Terms on Parallelism (3)
Term | Meaning | Example
Systolic Array | Pipe network with only linear (straight-on, no branching), uniform (all DPUs hardwired and with same functionality) pipelines | Matrix computation, DSP, DNA sequencing, etc.
stream-based computing arrays (super-systolic arrays) | pipe network, configured before fabrication | image processing, DSP, complex functions and algorithms
(coarse grain) reconf. stream-based arrays | stream-based arrays, configurable after fabrication | KressArray
[à la Ingo Kreuz]

© 2002, University of Kaiserslautern 67 Counterparts
category | property | counterpart
programming mode | procedural (classical) | structural (synthesis, design) - „field-programmable“, PLA „programming“, etc.
machine: principle of operation | controlflow-driven (instruction-driven): v. Neumann | Data-driven: Xputer
machine system: principle of operation | instruction-flow-driven (parallel computer etc.) | Data-stream-based (systolic array, DPU array, KressArray)
Set-up time (datapaths switched thru) | during run time (instruction-driven) | before run time: FPGA (at compile time), Gate Array (at fabrication)
[à la Ingo Kreuz]

© 2002, University of Kaiserslautern

© 2002, University of Kaiserslautern 69 Efficient Memory Communication should be directly supported by the Mapper Tools. Optimized Parallel Memory Controller; Synthesizable Memory Communication. An example by Nageldinger’s KressArray Xplorer. [Figure legend: sequencers, memory ports, application, not used]

© 2002, University of Kaiserslautern 70 Opportunities by new patent laws? To clever guys keen on patents: don't file for patents on the following details! Everything shown in this presentation was published years ago.