ClearSpeed CSX620 Overview



References
- ClearSpeed Technical Training Slides for ClearSpeed Accelerator 620, software version 3.0, Slide Sets 1-6. Presenter: Brian Summers (senior engineer), December 2007.
- ClearSpeed Introductory Programming Manual, January 2008.
Acknowledgement: Many slides used here are from Slide Set 1 of the training slides.

Topics
- Overview of ClearSpeed Board
- ClearSpeed Technology
- Company
- Accelerators
- ClearSpeed and HPC
- Hardware Overview
- Performance
- Software Development Kit (SDK)
- Application Examples
- Help and Support
Topics omitted from ClearSpeed Overview
- Installing Hardware and Software
- Most topics in the SDK overview (some will be covered later), e.g., Cn Language, Cn Libraries, compiler, debugging Cn, assembler, linker, simulator, graphics profiler, libraries
- Moving Data
- Tuning Tips

ClearSpeed CSX600 Accelerator Board
- A PCI-X card equipped with two ClearSpeed CSX600 coprocessors

Performance Specifications of CSX600
- Sustained double-precision performance of 25 GFLOPS on DGEMM
- 10 W maximum power consumption
- 250 MHz clock speed
- Transfer speed of internal memory: 96 Gbytes/s
- Transfer speed of external memory: 3.2 Gbytes/s
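
The figures above can be related to one another with some simple arithmetic. The following is a minimal sketch in Python, not part of the original slides. The PE count of 96 is taken from the MTAP architecture slide; the value of two floating-point operations per PE per cycle (one multiply plus one add) is an assumption made here for illustration, so the derived peak and efficiency are only as good as that assumption. All function and variable names are hypothetical.

```python
# Figures quoted on the slide for the CSX600.
CLOCK_HZ = 250e6
SUSTAINED_DGEMM_GFLOPS = 25.0
MAX_POWER_W = 10.0
INTERNAL_BW_GBYTES_S = 96.0
EXTERNAL_BW_GBYTES_S = 3.2

NUM_PES = 96                # from the MTAP architecture slide
FLOPS_PER_PE_PER_CYCLE = 2  # ASSUMPTION: one multiply plus one add per cycle


def peak_gflops(num_pes, flops_per_cycle, clock_hz):
    """Theoretical peak in GFLOPS under the stated assumption."""
    return num_pes * flops_per_cycle * clock_hz / 1e9


def summarize():
    """Derive a few ratios from the quoted specifications."""
    peak = peak_gflops(NUM_PES, FLOPS_PER_PE_PER_CYCLE, CLOCK_HZ)
    return {
        "peak_gflops": peak,
        "dgemm_efficiency": SUSTAINED_DGEMM_GFLOPS / peak,
        "gflops_per_watt": SUSTAINED_DGEMM_GFLOPS / MAX_POWER_W,
        "internal_to_external_bw": INTERNAL_BW_GBYTES_S / EXTERNAL_BW_GBYTES_S,
        "external_bytes_per_flop": EXTERNAL_BW_GBYTES_S / SUSTAINED_DGEMM_GFLOPS,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print("%-26s %.3f" % (name, value))
```

The ratio of internal to external bandwidth suggests why a kernel such as DGEMM, which can reuse data held close to the processing elements, is the benchmark quoted for sustained performance.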

Multi-threaded Array Processing (MTAP) architecture of CSX600
- Mono execution unit
  - process non-parallel data
  - handle program flow control
- Poly execution unit
  - 96 PEs
  - 6 KB SRAM
  - dual 64-bit FPU
  - integer ALU
  - 32/64-bit floating-point multiplier & adder
  - 128 B register files

Cn language
- Similar to standard C
- Main difference is poly variables
Example code:

    #include <stdiop.h>   // Output support
    #include <lib_ext.h>  // Extra functions to support features of hardware

    int main() {
        poly int n;
        n = get_penum();                // individual PE number
        printfp("PE number: %d\n", n);  // Output different message per PE
        return 0;
    }

- poly short get_penum(): number of current PE
- mono short get_num_pes(): number of PEs on CSX processor
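
For readers without access to the ClearSpeed SDK, the behaviour of the poly variable in the example can be imitated on an ordinary machine. The following is a rough emulation in Python, not Cn and not ClearSpeed code. It models a poly value as one value per PE and calls the kernel once per PE in a sequential loop, whereas the hardware executes all PEs together. The helper names run_on_all_pes and kernel are hypothetical, and numbering the PEs from 0 is an assumption.

```python
NUM_PES = 96  # PE count quoted for the CSX600 poly execution unit


def get_num_pes():
    """Stand-in for Cn's mono get_num_pes(): the same value everywhere."""
    return NUM_PES


def run_on_all_pes(kernel):
    """Emulate poly execution by calling the kernel once per PE.

    ASSUMPTION: PEs are numbered from 0 to get_num_pes() - 1.
    """
    return [kernel(penum) for penum in range(get_num_pes())]


def kernel(penum):
    """Body of the Cn example; penum plays the role of get_penum()."""
    n = penum  # corresponds to 'poly int n; n = get_penum();'
    return "PE number: %d" % n  # corresponds to the printfp call


if __name__ == "__main__":
    lines = run_on_all_pes(kernel)
    # Show only the first few lines; the emulation produces one per PE.
    for line in lines[:4]:
        print(line)
    print("... %d lines in total" % len(lines))
```

The point of the sketch is that a single statement on a poly variable yields a different value on each PE, while a mono value such as the PE count is shared by all of them.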

Note: Do not contact ClearSpeed about homework problems or other course-related questions. ClearSpeed expects professional-level questions from owners of its CSX 620 boards, not student questions about a class or homework.