HPEC-1 SMHS 7/7/2016 MIT Lincoln Laboratory Focus 3: Cell Sharon Sacco / MIT Lincoln Laboratory HPEC Workshop 19 September 2007 This work is sponsored.

Slides:



Advertisements
Similar presentations
Instructor Notes We describe motivation for talking about underlying device architecture because device architecture is often avoided in conventional.
Advertisements

Implementation of 2-D FFT on the Cell Broadband Engine Architecture William Lundgren Gedae), Kerry Barnes (Gedae), James Steed (Gedae)
Cell Broadband Engine. INF5062, Carsten Griwodz & Pål Halvorsen University of Oslo Cell Broadband Engine Structure SPE PPE MIC EIB.
Utilization of GPU’s for General Computing Presenter: Charlene DiMeglio Paper: Aspects of GPU for General Purpose High Performance Computing Suda, Reiji,
Evolution of Chip Design ECE 111 Spring A Brief History 1958: First integrated circuit – Flip-flop using two transistors – Built by Jack Kilby at.
K-means clustering –An unsupervised and iterative clustering algorithm –Clusters N observations into K clusters –Observations assigned to cluster with.
ELEC 6200, Fall 07, Oct 29 McPherson: Vector Processors1 Vector Processors Ryan McPherson ELEC 6200 Fall 2007.
© 2011 The MITRE Corporation. All rights reserved. Sharon Sacco / The MITRE Corporation HPEC 2011 Focus 3: GPU.
Cell Broadband Processor Daniel Bagley Meng Tan. Agenda  General Intro  History of development  Technical overview of architecture  Detailed technical.
HPEC_GPU_DECODE-1 ADC 8/6/2015 MIT Lincoln Laboratory GPU Accelerated Decoding of High Performance Error Correcting Codes Andrew D. Copeland, Nicholas.
MIT Lincoln Laboratory XYZ 3/11/2005 VSIPL and SAR Performance on Multiple Generations of Intel® Processors Peter Carlston, Platform Architect,
On Chip Nonlinear Digital Compensation for RF Receiver Karen M. G. V. Gettings, Andrew K. Bolstad, Show-Yah Stuart Chen, Benjamin A. Miller and Michael.
Implementation of Digital Front End Processing Algorithms with Portability Across Multiple Processing Platforms September 20-21, 2011 John Holland, Jeremy.
© 2010 The MITRE Corporation. All rights reserved. Session 2: Many Core Sharon Sacco / The MITRE Corporation HPEC 2010 Approved for Public Release:
Evaluation of Multi-core Architectures for Image Processing Algorithms Masters Thesis Presentation by Trupti Patil July 22, 2009.
PVTOL: A High-Level Signal Processing Library for Multicore Processors
Christian Steinle, University of Mannheim, Institute of Computer Engineering1 L1 Tracking – Status CBMROOT And Realisation Christian Steinle, Andreas Kugel,
Cell Broadband Engine Architecture Bardia Mahjour ENCM 515 March 2007 Bardia Mahjour ENCM 515 March 2007.
Samuel Williams, John Shalf, Leonid Oliker, Shoaib Kamil, Parry Husbands, Katherine Yelick Lawrence Berkeley National Laboratory ACM International Conference.
© 2005 Mercury Computer Systems, Inc. Yael Steinsaltz, Scott Geaghan, Myra Jean Prelle, Brian Bouzas,
XYZ 10/2/2015 MIT Lincoln Laboratory Sidecar Network Adapters: Open Architecture for ISR Data Sources Craig McNally, Larry Barbieri, Charles Yee.
Gedae Portability: From Simulation to DSPs to the Cell Broadband Engine James Steed, William Lundgren, Kerry Barnes Gedae, Inc
HPEC_NDP-1 MAH 10/11/2015 MIT Lincoln Laboratory Efficient Multidimensional Polynomial Filtering for Nonlinear Digital Predistortion Matthew Herman, Benjamin.
Slide 1 MIT Lincoln Laboratory Toward Mega-Scale Computing with pMatlab Chansup Byun and Jeremy Kepner MIT Lincoln Laboratory Vipin Sachdeva and Kirk E.
Programming Examples that Expose Efficiency Issues for the Cell Broadband Engine Architecture William Lundgren Gedae), Rick Pancoast.
HPEC-1 MIT Lincoln Laboratory Session 1: GPU: Graphics Processing Units Miriam Leeser / Northeastern University HPEC Conference 15 September 2010.
© 2007 SET Associates Corporation SAR Processing Performance on Cell Processor and Xeon Mark Backues, SET Corporation Uttam Majumder, AFRL/RYAS.
Radar Pulse Compression Using the NVIDIA CUDA SDK
MIT Lincoln Laboratory Projective Transform on Cell: A Case Study Sharon Sacco, Hahn Kim, Sanjeev Mohindra, Peter Boettcher, Chris Bowen, Nadya Bliss,
MIT Lincoln Laboratory jca-1 KAM 5/27/2016 Panel Session: Multicore Meltdown? James C. Anderson MIT Lincoln Laboratory HPEC07 Wednesday, 19 September.
HPEC SMHS 9/24/2008 MIT Lincoln Laboratory Large Multicore FFTs: Approaches to Optimization Sharon Sacco and James Geraci 24 September 2008 This.
Improved air combat awareness - with AESA and next-generation signal processing Main beam jamming rejection Wide transmit beam Communication Side lobe.
Accelerating the Singular Value Decomposition of Rectangular Matrices with the CSX600 and the Integrable SVD September 7, 2007 PaCT-2007, Pereslavl-Zalessky.
HPEC2002_Session1 1 DRM 11/11/2015 MIT Lincoln Laboratory Session 1: Novel Hardware Architectures David R. Martinez 24 September 2002 This work is sponsored.
MIT Lincoln Laboratory jca-1 KAM 11/13/2015 Panel Session: Paving the Way for Multicore Open Systems Architectures James C. Anderson MIT Lincoln.
XYZ 11/16/2015 MIT Lincoln Laboratory AltiVec Extensions to the Portable Expression Template Engine (PETE)* Edward Rutledge HPEC September,
Haney - 1 HPEC 9/28/2004 MIT Lincoln Laboratory pMatlab Takes the HPCchallenge Ryan Haney, Hahn Kim, Andrew Funk, Jeremy Kepner, Charles Rader, Albert.
CBELU - 1 James 11/23/2015 MIT Lincoln Laboratory High Performance Simulations of Electrochemical Models on the Cell Broadband Engine James Geraci HPEC.
High Performance Computing Group Feasibility Study of MPI Implementation on the Heterogeneous Multi-Core Cell BE TM Architecture Feasibility Study of MPI.
Hardware Benchmark Results for An Ultra-High Performance Architecture for Embedded Defense Signal and Image Processing Applications September 29, 2004.
LYU0703 Parallel Distributed Programming on PS3 1 Huang Hiu Fung Wong Chung Hoi Supervised by Prof. Michael R. Lyu Department of Computer.
Slide-1 Multicore Theory MIT Lincoln Laboratory Theory of Multicore Algorithms Jeremy Kepner and Nadya Bliss MIT Lincoln Laboratory HPEC 2008 This work.
Debunking the 100X GPU vs. CPU Myth An Evaluation of Throughput Computing on CPU and GPU Present by Chunyi Victor W Lee, Changkyu Kim, Jatin Chhugani,
HPEC HN 9/23/2009 MIT Lincoln Laboratory RAPID: A Rapid Prototyping Methodology for Embedded Systems Huy Nguyen, Thomas Anderson, Stuart Chen, Ford.
Optimizing Ray Tracing on the Cell Microprocessor David Oguns.
Design of A Custom Vector Operation API Exploiting SIMD Intrinsics within Java Presented by John-Marc Desmarais Authors: Jonathan Parri, John-Marc Desmarais,
1 WRB 09/02 HPEC Lincoln Lab Sept 2002 Poster B: Software Technologies andSystems W. Robert Bernecky Naval Undersea Warfare Center Ph: (401) Fax:
WorldScape Defense Company, L.L.C. Company Proprietary Slide 1 An Ultra-High Performance Scalable Processing Architecture for HPC and Embedded Applications.
Presented by Jeremy S. Meredith Sadaf R. Alam Jeffrey S. Vetter Future Technologies Group Computer Science and Mathematics Division Research supported.
Aarul Jain CSE520, Advanced Computer Architecture Fall 2007.
PTCN-HPEC-02-1 AIR 25Sept02 MIT Lincoln Laboratory Resource Management for Digital Signal Processing via Distributed Parallel Computing Albert I. Reuther.
MIT Lincoln Laboratory 1 of 4 MAA 3/8/2016 Development of a Real-Time Parallel UHF SAR Image Processor Matthew Alexander, Michael Vai, Thomas Emberley,
FFTC: Fastest Fourier Transform on the IBM Cell Broadband Engine David A. Bader, Virat Agarwal.
● Cell Broadband Engine Architecture Processor ● Ryan Layer ● Ben Kreuter ● Michelle McDaniel ● Carrie Ruppar.
HPEC 2003 Linear Algebra Processor using FPGA Jeremy Johnson, Prawat Nagvajara, Chika Nwankpa Drexel University.
George Viamontes, Mohammed Amduka, Jon Russo, Matthew Craven, Thanhvu Nguyen Lockheed Martin Advanced Technology Laboratories 3 Executive Campus, 6th Floor.
Itanium® 2 Processor Architecture
M. Richards1 ,D. Campbell1 (presenter), R. Judd2, J. Lebak3, and R
Richard Dorrance Literature Review: 1/11/13
Presented by: Tim Olson, Architect
SmartCell: A Coarse-Grained Reconfigurable Architecture for High Performance and Low Power Embedded Computing Xinming Huang Depart. Of Electrical and Computer.
Session 4: Reconfigurable Computing
A Quantitative Analysis of Stream Algorithms on Raw Fabrics
9/18/2018 Accelerating IMA: A Processor Performance Comparison of the Internal Multiple Attenuation Algorithm Michael Perrone Mgr, Cell Solution Dept.,
Reconfigurable Computing MONARCH/MCHIP
Parallel Processing in ROSA II
Memory System Performance Chapter 3
Large data arrays processing on Cell Broadband Engine
Multicore and GPU Programming
CSE 102 Introduction to Computer Engineering
Presentation transcript:

HPEC-1 SMHS 7/7/2016 MIT Lincoln Laboratory Focus 3: Cell Sharon Sacco / MIT Lincoln Laboratory HPEC Workshop 19 September 2007 This work is sponsored by the Department of the Air Force under Air Force contract FA C Opinions, interpretations, conclusions and recommendations are those of the author and not necessarily endorse by the United States Government.

MIT Lincoln Laboratory HPEC-2 SMHS 7/7/2016 Embedded Processor Evolution 20 years of exponential growth in FLOPS / W Requires switching architectures every ~5 years Cell Processor is current high performance architecture Year MFLOPS / W i860 SHARC PowerPC PowerPC with AltiVec Cell (estimated) i860 XR MPC7447A Cell MPC7410 MPC e 750 SHARC High Performance Embedded Processors FPGA? GPU?

MIT Lincoln Laboratory HPEC-3 SMHS 7/7/2016 Cell Features S ynergistic P rocessing E lement 128 SIMD Registers, 128 bits wide Dual issue instructions Local Store 256 KB Flat memory Memory Flow Controller Built in DMA Engine Element Interconnect Bus 4 ring buses Each ring 16 bytes wide ½ processor speed (VMX) Overall Performance Peak 3.2 GHz: GFLOPS (single), 14.6 GFLOPS (double) Processor to Memory bandwidth: 25.6 GB/s Power usage: ~100 W (estimated) Cell gives ~2 GFLOPS / W Cell offers significant performance benefits over other programmable technologies Max bandwidth 96 bytes / cycle ( GHz)

MIT Lincoln Laboratory HPEC-4 SMHS 7/7/2016 In This Session FFTC: Fastest Fourier Transform for the IBM Cell Broadband Engine –Virat Agarwal, Georgia Institute of Technology Implementation of SIGINT Application on Cell-BE –Richard Besler, Black River Systems Company Performance of a Multicore Matrix Multiplication Library –Robert Cooper, Mercury Computer Systems

MIT Lincoln Laboratory HPEC-5 SMHS 7/7/2016 Other Cell Related Talks Wednesday: Session 4: Novel Applications –Projective Transform on Cell: A Case Study Thursday: Session 5: Multicore Environments –High Performance Simulations of Electrochemical Models on the Cell Broadband Engine –Sourcery VSIPL++ for the Cell/B.E. –Programming Examples that Expose Efficiency Issues for the Cell Broadband Engine Architecture –PVTOL: A High-Level Signal Processing Library for Multicore Processors Focus 5: Benchmarking –Exploring Multi-core Processors with Realistic Signal- and Image- processing Application Benchmarks Poster / Demo C: Cell / GPU Technologies Session 6: Awards Session –Implementation of Polar Format SAT Image Formation on the Cell Broadband Engine