Published by Phebe Nichols. Modified over 9 years ago.
WorldScape Defense Company, L.L.C. Company Proprietary

Slide 1: An Ultra-High Performance Scalable Processing Architecture for HPC and Embedded Applications
Presentation for IPDPS Conference, 28 April 2004

Slide 2: CS301 Up Close
Multi-Threaded Array Processor
 - 25.6 GFLOPS
 - 3 W worst-case, 2 W typical
 - 200 MHz
 - 64 PEs, 4 Kbytes each
Diagram labels: PE Array, Control, SRAM, Bus
ClearConnect bus
 - 64-bit full duplex
 - 1.6 Gbyte/s each direction
 - 2x 0.8-Gbyte/s bridge ports
Scratchpad memory
 - 128 Kbytes of SRAM
Availability
 - Currently available
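
The bus and memory figures on this slide are consistent with simple width-times-clock arithmetic. The sketch below is a hedged illustration, not vendor code: it assumes the ClearConnect bus runs at the 200 MHz core clock and moves one word per clock, which the slide does not state. All function and constant names are made up for the example.

```python
# Figures taken from the slide.
CLOCK_HZ = 200_000_000   # 200 MHz
NUM_PES = 64
PE_MEM_KBYTES = 4        # SRAM per PE
BRIDGE_PORT_GBYTES = 0.8
NUM_BRIDGE_PORTS = 2


def bus_gbytes_per_s(width_bits, clock_hz=CLOCK_HZ):
    """Peak one-direction bandwidth in Gbyte/s.

    Assumption (not from the slide): one full-width transfer per clock.
    """
    bytes_per_transfer = width_bits // 8
    return bytes_per_transfer * clock_hz / 1e9


def total_pe_memory_kbytes(num_pes=NUM_PES, kbytes_per_pe=PE_MEM_KBYTES):
    """Aggregate PE-local SRAM across the array, in Kbytes."""
    return num_pes * kbytes_per_pe


if __name__ == "__main__":
    print("64-bit bus, Gbyte/s per direction:", bus_gbytes_per_s(64))
    print("Both bridge ports, Gbyte/s:", NUM_BRIDGE_PORTS * BRIDGE_PORT_GBYTES)
    print("Total PE memory, Kbytes:", total_pe_memory_kbytes())
```

Under these assumptions a 64-bit bus at 200 MHz gives the quoted 1.6 Gbyte/s per direction, which also equals the combined rate of the two 0.8-Gbyte/s bridge ports.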

Slide 3: Multi-Threaded Array Processing Architecture
Multi-threaded Array Processor
 - Fully programmable in C
 - Hardware multi-threading
 - Extensible instruction set
Scalable internal parallelism
 - Array of Processing Elements (PEs)
 - Compute and bandwidth scale together
 - From 10s to 1,000s of PEs
 - Built-in PE redundancy
High performance, low power
 - ~10 GFLOPS/Watt
Multiple high-speed I/O channels
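
The "~10 GFLOPS/Watt" figure can be cross-checked against the CS301 numbers quoted elsewhere in this deck (25.6 GFLOPS at 3 W worst-case, 2 W typical). The following is a minimal sketch of that check; the function name is illustrative, and it assumes the peak GFLOPS figure applies at both power points.

```python
def gflops_per_watt(peak_gflops, watts):
    """Power efficiency as peak GFLOPS divided by power draw."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return peak_gflops / watts


if __name__ == "__main__":
    peak = 25.6  # CS301 peak GFLOPS, from the deck
    print("worst-case (3 W):", round(gflops_per_watt(peak, 3.0), 2))
    print("typical (2 W):", round(gflops_per_watt(peak, 2.0), 2))
```

The two results bracket 10 GFLOPS/Watt (roughly 8.5 at worst-case power and 12.8 at typical power), so the rounded headline figure is plausible for this chip.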

Slide 4: Processing Elements
PEs are highly optimised execution units:
 - ALU, MAC, FPU
 - High-bandwidth, multiport register file
 - High-bandwidth per-PE DMA (PIO, SIO)
 - Closely coupled SRAM for data
64 PEs at 200 MHz
 - 25.6 GFLOPS
 - 51.2 Gbyte/s bandwidth to PE memory
 - 12,800 MIPS
Supports multiple data types:
 - 8, 16, 24, 32-bit,... fixed-point arithmetic
 - 32-bit IEEE floating-point arithmetic
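
The three aggregate figures above follow from the PE count and clock if each PE does a fixed amount of work per cycle. The sketch below shows one set of per-PE rates that reproduces them. These rates are assumptions inferred from the totals, not stated on the slide: one instruction, two floating-point operations (for example one multiply-accumulate), and four bytes of local memory traffic per PE per cycle.

```python
NUM_PES = 64
CLOCK_HZ = 200_000_000  # 200 MHz

# Assumed per-PE, per-cycle rates (inferred, not from the slide).
INSTRUCTIONS_PER_CYCLE = 1
FLOPS_PER_CYCLE = 2      # e.g. one multiply-accumulate
MEM_BYTES_PER_CYCLE = 4  # e.g. one 32-bit word


def aggregate_rate(per_pe_per_cycle, num_pes=NUM_PES, clock_hz=CLOCK_HZ):
    """Peak array-wide rate per second for a given per-PE, per-cycle rate."""
    return per_pe_per_cycle * num_pes * clock_hz


if __name__ == "__main__":
    print("MIPS:", aggregate_rate(INSTRUCTIONS_PER_CYCLE) // 10**6)
    print("GFLOPS:", aggregate_rate(FLOPS_PER_CYCLE) / 1e9)
    print("Gbyte/s to PE memory:", aggregate_rate(MEM_BYTES_PER_CYCLE) / 1e9)
```

These are peak numbers; sustained throughput depends on the workload, which the slide does not address.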

Slide 5: ClearConnect™ High-Speed Bus
 - Lanes from 25 to 100 Gbit/s full duplex
 - Packet-switched architecture
 - Scales to 4 lanes per bus
 - Lane widths: 32 to 256-bit
 - Distributed arbitration
 - Low power
 - Highly flexible

Slide 6: CS301 Up Close
Multi-Threaded Array Processor
 - 25.6 GFLOPS
 - 3 W worst-case, 2 W typical
 - 200 MHz
 - 64 PEs, 4 Kbytes each
Diagram labels: PE Array, Control, SRAM, Bus
ClearConnect bus
 - 64-bit full duplex
 - 1.6 Gbyte/s each direction
 - 2x 0.8-Gbyte/s bridge ports
Scratchpad memory
 - 128 Kbytes of SRAM
Availability
 - Currently available

Slide 7: Off-the-Shelf Products
CS301 64 PE chip
 - 2 W, 25 GFLOPS
 - Hardware Development Support
Fully functional SDK
 - Application Support
 - Software Libraries
Dual 64 PCI Development Board
 - 50 GFLOPS performance
 - Acceleration for clusters and HPC applications
 - Development environment for embedded applications
 - Growing catalog of software application libraries
 - Scalable with robust evolution path
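
The board-level "50 GFLOPS" figure reads as a rounded sum over two CS301 chips. A minimal sketch of that aggregation follows, assuming the board carries exactly two chips and that peak throughput adds linearly. The class and field names are hypothetical, and the power total covers the chips only, not the rest of the board.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chip:
    """Per-chip figures as quoted in the deck for the CS301."""
    pes: int = 64
    peak_gflops: float = 25.6
    typical_watts: float = 2.0


def board_totals(chip, num_chips):
    """Sum per-chip figures, assuming linear scaling (an assumption)."""
    return {
        "pes": chip.pes * num_chips,
        "peak_gflops": chip.peak_gflops * num_chips,
        "chip_watts": chip.typical_watts * num_chips,
    }


if __name__ == "__main__":
    print(board_totals(Chip(), num_chips=2))
```

With two chips this gives 128 PEs and 51.2 peak GFLOPS, which the slide rounds to 50.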

Slide 8: Systems Integration Examples
 - PC plug-in accelerator
 - Coprocessors in a PC server*
 - Coprocessors in a blade server*
 - COTS hardware
 - Silver Fox**
 - Algorithm development for embedded applications
*Images courtesy of Angstrom Microsystems
**Image courtesy of Office of Naval Research

Slide 9: WorldScape’s Offering
Chip Technology
 - 64 PE/256 PE…
 - customizable…
Support Tools
 - SDK, VSIPL, PCA morphware…
Board Level Integration
 - custom, I/O, i/f, …
Application Integration
 - FFT, PC, HSI, SceneServer …