An Ultra-High Performance Scalable Processing Architecture for HPC and Embedded Applications
WorldScape Defense Company, L.L.C. Company Proprietary

Presentation transcript:

Slide 1: An Ultra-High Performance Scalable Processing Architecture for HPC and Embedded Applications
WorldScape Defense Company, L.L.C. Company Proprietary
Presentation for IPDPS Conference, 28 April 2004

Slide 2: CS301 Up Close
Multi-Threaded Array Processor
- 25.6 GFLOPS
- 3 W worst-case, 2 W typical
- 200 MHz
- 64 PEs, 4 Kbytes each
- PE Array, Control, SRAM, Bus (diagram labels)
ClearConnect bus
- 64-bit full duplex
- 1.6 Gbyte/s each direction
- 2x 0.8-Gbyte/s bridge ports
Scratchpad memory
- 128 Kbytes of SRAM
Availability
- Currently available
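
The bus and power figures on this slide can be cross-checked with a little arithmetic. The sketch below is only an illustration: it assumes the ClearConnect bus runs at the same 200 MHz clock as the array and moves one 64-bit word per cycle in each direction, which the slide does not state. All variable names are hypothetical.

```python
# Cross-check of the CS301 figures quoted on the slide.
# ASSUMPTION (not stated in the deck): the 64-bit bus is clocked at the
# 200 MHz array clock and transfers one word per cycle per direction.

CLOCK_HZ = 200e6          # 200 MHz, from the slide
BUS_WIDTH_BITS = 64       # 64-bit full duplex, from the slide
PEAK_GFLOPS = 25.6        # from the slide
POWER_WORST_W = 3.0       # worst-case power, from the slide
POWER_TYPICAL_W = 2.0     # typical power, from the slide
NUM_PES = 64              # from the slide
PE_MEM_KBYTES = 4         # per-PE memory, from the slide


def bus_bandwidth_gbytes(width_bits, clock_hz, transfers_per_cycle=1):
    """Peak one-direction bandwidth in Gbyte/s (1 Gbyte = 1e9 bytes)."""
    return width_bits / 8 * clock_hz * transfers_per_cycle / 1e9


bus_gb = bus_bandwidth_gbytes(BUS_WIDTH_BITS, CLOCK_HZ)
bridge_total_gb = 2 * 0.8                     # two 0.8-Gbyte/s bridge ports
eff_worst = PEAK_GFLOPS / POWER_WORST_W       # GFLOPS per watt, worst case
eff_typical = PEAK_GFLOPS / POWER_TYPICAL_W   # GFLOPS per watt, typical
total_pe_mem_kbytes = NUM_PES * PE_MEM_KBYTES

print(f"Bus bandwidth per direction: {bus_gb:.1f} Gbyte/s")
print(f"Combined bridge ports:       {bridge_total_gb:.1f} Gbyte/s")
print(f"Efficiency (worst-case):     {eff_worst:.1f} GFLOPS/W")
print(f"Efficiency (typical):        {eff_typical:.1f} GFLOPS/W")
print(f"Total PE memory:             {total_pe_mem_kbytes} Kbytes")
```

Under these assumptions the two bridge ports together match the bus bandwidth in one direction, and the power figures bracket the roughly 10 GFLOPS/Watt quoted elsewhere in the deck.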

Slide 3: Multi-Threaded Array Processing Architecture
Multi-threaded Array Processor
- Fully programmable in C
- Hardware multi-threading
- Extensible instruction set
Scalable internal parallelism
- Array of Processing Elements (PEs)
- Compute, bandwidth scale together
- From 10s to 1,000s of PEs
- Built-in PE redundancy
High performance, low power
- ~10 GFLOPS/Watt
Multiple high speed I/O channels
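
The claim that compute and bandwidth scale together can be illustrated with a simple linear model. This is a sketch under stated assumptions, not the vendor's performance model: the per-PE rates are derived from the 64-PE, 200 MHz figures quoted elsewhere in this deck (25.6 GFLOPS and 51.2 Gbyte/s), and the clock rate and per-PE behaviour are assumed to stay constant as the array grows. The function name `array_peak` is hypothetical.

```python
# Linear scaling sketch for a PE array.
# ASSUMPTION: per-PE compute and memory bandwidth stay fixed as PEs are
# added, using the 64-PE figures from this deck as the reference point.

GFLOPS_PER_PE = 25.6 / 64   # 0.4 GFLOPS per PE
GBYTES_PER_PE = 51.2 / 64   # 0.8 Gbyte/s of PE-memory bandwidth per PE


def array_peak(num_pes):
    """Return (peak GFLOPS, peak PE-memory Gbyte/s) for an array of num_pes."""
    return num_pes * GFLOPS_PER_PE, num_pes * GBYTES_PER_PE


for n in (16, 64, 256, 1024):
    gflops, gbytes = array_peak(n)
    # The bytes-per-FLOP ratio is what "scale together" means in this model.
    print(f"{n:5d} PEs: {gflops:7.1f} GFLOPS, {gbytes:7.1f} Gbyte/s, "
          f"{gbytes / gflops:.1f} bytes/FLOP")
```

Because each PE brings its own local memory, the ratio of memory bandwidth to compute stays constant in this model regardless of array size.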

Slide 4: Processing Elements
- PEs are highly optimised execution units: ALU, MAC, FPU
- High-bandwidth, multiport register file
- High bandwidth per PE DMA (PIO, SIO)
- Closely coupled SRAM for data
64 PEs at 200 MHz
- 25.6 GFLOPS
- 51.2 Gbyte/s bandwidth to PE memory
- 12,800 MIPS
Supports multiple data types
- 8, 16, 24, 32-bit, ... fixed-point arithmetic
- 32-bit IEEE floating-point arithmetic
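
The three headline numbers for 64 PEs at 200 MHz are consistent with simple per-cycle rates. The sketch below shows one set of per-PE rates that reproduces them; these rates are assumptions chosen to match the slide (for example, two floating-point operations per cycle would fit a multiply and an add issued together), not figures confirmed by the deck.

```python
# Reproducing the slide's peak figures from assumed per-PE, per-cycle rates.
NUM_PES = 64        # from the slide
CLOCK_HZ = 200e6    # 200 MHz, from the slide

# ASSUMPTIONS: values chosen because they reproduce the slide's numbers.
FLOPS_PER_PE_PER_CYCLE = 2    # e.g. one multiply plus one add
BYTES_PER_PE_PER_CYCLE = 4    # one 32-bit word to or from PE memory
INSTR_PER_PE_PER_CYCLE = 1    # one instruction per PE per cycle

peak_gflops = NUM_PES * CLOCK_HZ * FLOPS_PER_PE_PER_CYCLE / 1e9
pe_mem_gbytes = NUM_PES * CLOCK_HZ * BYTES_PER_PE_PER_CYCLE / 1e9
peak_mips = NUM_PES * CLOCK_HZ * INSTR_PER_PE_PER_CYCLE / 1e6

print(f"Peak compute:        {peak_gflops:.1f} GFLOPS")     # slide: 25.6
print(f"PE memory bandwidth: {pe_mem_gbytes:.1f} Gbyte/s")  # slide: 51.2
print(f"Peak instruction rate: {peak_mips:,.0f} MIPS")      # slide: 12,800
```

These are peak rates with every PE busy on every cycle; sustained application performance would depend on the workload.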

Slide 5: ClearConnect™ High-Speed Bus
- Lanes from 25 to 100 Gbit/s full duplex
- Packet switched architecture
- Scales to 4 lanes per bus
- Lane widths: 32 to 256-bit
- Distributed arbitration
- Low power
- Highly flexible

Slide 6: CS301 Up Close
Multi-Threaded Array Processor
- 25.6 GFLOPS
- 3 W worst-case, 2 W typical
- 200 MHz
- 64 PEs, 4 Kbytes each
- PE Array, Control, SRAM, Bus (diagram labels)
ClearConnect bus
- 64-bit full duplex
- 1.6 Gbyte/s each direction
- 2x 0.8-Gbyte/s bridge ports
Scratchpad memory
- 128 Kbytes of SRAM
Availability
- Currently available

Slide 7: Off-the-Shelf Products
CS PE chip
- 2 W, 25 GFLOPS
- Hardware development support
Fully functional SDK
- Application support
- Software libraries
Dual 64 PCI Development Board
- 50 GFLOPS performance
- Acceleration for clusters and HPC applications
- Development environment for embedded applications
- Growing catalog of software application libraries
- Scalable with robust evolution path

Slide 8: Systems Integration Examples
- PC plug-in accelerator
- Coprocessors in a PC server*
- Coprocessors in a blade server*
- COTS hardware
- Silver Fox**
- Algorithm development for embedded applications
*Images courtesy of Angstrom Microsystems
**Image courtesy of Office of Naval Research

Slide 9: WorldScape's Offering
Chip Technology
- 64 PE/256 PE…
- customizable…
Support Tools
- SDK, VSIPL, PCA morphware…
Board Level Integration
- custom, I/O, i/f, …
Application Integration
- FFT, PC, HSI, SceneServer …