© 2011 Altera Corporation. Public. The Trends in Programmable Solutions: SoC FPGAs for Embedded Applications and Hardware-Software Co-Design. Misha Burich, Senior VP & CTO.

Presentation transcript:

The Trends in Programmable Solutions: SoC FPGAs for Embedded Applications and Hardware-Software Co-Design
Misha Burich, Senior VP & CTO, Altera Corporation
June 8, 2012

Agenda
- Market Drivers and Technology Enablers
- Embedded Computing & FPGAs
- The Emergence of SoC FPGAs
- Hardware-Software Programmability
- Summary

Market Drivers and Enabling Embedded Technologies

Most Markets Rely on Embedded Processing
- Communications and Broadcast
- Consumer and Automotive
- Test and Medical
- Computer and Storage
- Military and Industrial

Enabling Technologies: 200X-201X
- Moore's law enabled high-density programmable platforms
[Figure labels: Fine-Grained Massively Parallel Heterogeneous Arrays; FPGAs; DSPs; CPUs; Single Cores; Coarse-Grained Massively Parallel Processor Arrays; Multi-Cores; Coarse-Grained CPUs and DSPs; Multi-Core Arrays]

The Age of Systems-on-Chip
- ARM-based SoCs by Apple, Nvidia, Qualcomm, TI, others
- SoC FPGAs by Altera, Xilinx, others

Also, There Is More-Than-Moore
- System-in-package multi-die integration
  - PoP, 2.5D, 3D, micro-bumps, through-silicon vias (TSV)
- Example: Intel's integration of an Atom processor with Altera's FPGA (E600C)

Embedded Computing on FPGAs
- Soon, a majority of FPGA designs will feature an embedded processor
[Chart: % FPGA Design Starts. Source: Gartner Sept]

Altera's Embedded Initiative
- Single FPGA Design Flow for a Broad Array of Industry-Leading Embedded Processors
[Figure labels: Soft Core Processors; SoC (PiP); SoC FPGA; Cortex-M1; MP32; Cortex-A9; ColdFire V1; Atom E6XXC; FPGA Design Software; System Integration Tool]

Modern FPGA: Massively Parallel
- Million+ logic elements
- Multi-billion transistors
- Thousands of 20-Kbit memory blocks
- Thousands of MAC DSP blocks
- Many external memory, LVDS, and transceiver I/O blocks

FPGA: Ultimately Configurable Multicore
- Can implement many coarse-grained processors
- Exploiting the fine-grained parallelism of the FPGA for acceleration
- Possibly heterogeneous
  - Optimized for different tasks
- Customizable to suit the needs of a particular application
[Figure labels: Processor, Memory (repeated three times)]

An Example: Crescendo Networks
- A programmable eight-processor architecture using the embedded soft processor Nios II on a single FPGA for Layer 4/7 applications

Hardware-Software Co-Design for SoC FPGAs

SoC FPGA Programming Models
- The traditional FPGA programming model is RTL
  - State machines, datapaths, arbitration, buffering, etc.
- Also need a programming model that represents an FPGA as:
  - Processor with hardware accelerators
  - Configurable multi-core device

C/C++ Compilers for FPGAs
- Several commercial offerings
- C/C++/SystemC high-level synthesis to RTL
  - C-to-Silicon Compiler (Cadence)
  - Impulse C (Impulse Accelerated Technologies)
  - Catapult C (Mentor)
  - SynphonyC (Synopsys)
  - AutoESL (now Xilinx)
  - Cynthesizer (Forte), NEC
[Figure label: Synthesized blocks]

Integration of Embedded Processors and Hardware Accelerators in FPGAs: C2H
[Diagram labels: Program Memory; Processor: Nios II, ARM; DMA; Accelerator; DMA; Data Memory; Arbiter; Data Memory; Arbiter; Qsys]

Identify the Software Bottleneck
    main () {
        …variable declarations…
        init();
        while (!error && got_data()) {
            do_user_interface();
            gather_statistics();
            if (got_new_data())
                d_transform(in_buf, out_buf);
            check_for_errors();
        }
        cleanup();
    }
[Chart: Execution Time]
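
One way to find which call in a loop like this dominates execution time is to accumulate CPU time per function. The following is a minimal sketch, assuming the standard C `clock()` timer is precise enough for the workload; `profile_entry`, `profile_call`, `profile_hottest`, and `busy_work` are hypothetical names introduced for illustration and do not come from the slides or from any Altera tool.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical per-function timing record (not from the slides). */
typedef struct {
    const char *name;   /* label for reports */
    clock_t     ticks;  /* accumulated CPU time in clock() ticks */
    unsigned    calls;  /* number of timed invocations */
} profile_entry;

/* Run fn(arg) once and add its CPU time to the entry. */
void profile_call(profile_entry *e, void (*fn)(void *), void *arg)
{
    assert(e != NULL && fn != NULL);
    clock_t start = clock();
    fn(arg);
    e->ticks += clock() - start;
    e->calls += 1;
}

/* Return the index of the entry with the most accumulated time. */
size_t profile_hottest(const profile_entry *entries, size_t n)
{
    assert(entries != NULL && n > 0);
    size_t best = 0;
    for (size_t i = 1; i < n; i++)
        if (entries[i].ticks > entries[best].ticks)
            best = i;
    return best;
}

/* Example workload: adds 1000 to the counter that arg points to. */
void busy_work(void *arg)
{
    volatile unsigned long *counter = arg;
    for (int i = 0; i < 1000; i++)
        *counter += 1;
}
```

Wrapping each call in the main loop with `profile_call` and then asking `profile_hottest` for the winner gives a rough ranking; a real project would more likely use the profiler shipped with its toolchain.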

Compile C Function Into Hardware
    d_transform (int *t, int *p) {
        …setup code…
        for (i = 0; i < Buf_Size; i++) {
            …loop overhead…
            *t = transform (*p);
            t++;
            p++;
        }
        …exit code…
    }
[Diagram: Hardware Accelerator]
- Use HLS Compiler
- Or, Write the Function in HDL…
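
A self-contained version of this loop is useful as a software reference against which a generated accelerator can be checked. The sketch below is one possible form, not the presenter's code: the body of `transform()` is a placeholder (the slide does not define it), `BUF_SIZE` is an assumed length, and the `restrict` qualifiers are an addition that states the two buffers do not overlap, which generally gives a compiler more freedom to overlap loads and stores.

```c
#include <assert.h>
#include <stddef.h>

#define BUF_SIZE 8   /* assumed length; the slide leaves Buf_Size unspecified */

/* Placeholder per-element operation; the slide does not define transform(). */
int transform(int x)
{
    return 3 * x + 1;
}

/* Apply transform() to each of the n inputs at p, writing results to t.
   restrict promises the compiler that the two buffers do not overlap. */
void d_transform(int *restrict t, const int *restrict p, size_t n)
{
    assert(t != NULL && p != NULL);
    for (size_t i = 0; i < n; i++)
        t[i] = transform(p[i]);
}
```

Indexing with `t[i]` and `p[i]` computes the same result as the pointer-increment form on the slide; it is used here only because it makes the independence of the iterations easier to see.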

The Result
    main () {
        …variable declarations…
        init();
        while (!error && got_data()) {
            do_user_interface();
            gather_statistics();
            if (got_new_data())
                d_transform(in_buf, out_buf);
            check_for_errors();
        }
        cleanup();
    }
[Chart: Execution Time, with d_transform marked]

OpenCL Approach to SoC FPGA Programming

What Is OpenCL?
- OpenCL is a programming model developed by the Khronos Group to support multi-core programming and silicon acceleration
  - An industry consortium creating open API standards
  - Commitment to royalty-free standards

OpenCL Portability
[Figure. Source: RapidMind]

OpenCL Structure
- Natural separation between the code that runs on accelerators and the code that manages those accelerators
- The host code is pure software that can be executed on any sort of conventional microprocessor
  - Soft processor, embedded hard processor, external x86 processor
- The kernel code is C with a minimal set of extensions that allows for the specification of parallelism and memory hierarchy
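
The host/kernel split described above can be imitated in ordinary C to show the shape of a kernel. This is only an emulation under stated assumptions: in OpenCL C the per-item function would be declared `__kernel` and would obtain its index from `get_global_id(0)`, whereas here the index is passed as a parameter so the code compiles without an OpenCL toolchain. The names `vec_add_work_item` and `run_vec_add` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Kernel side: the work done by ONE work-item. In OpenCL C this would be a
   __kernel function reading its index from get_global_id(0); here the index
   is an explicit argument so the sketch builds as plain C. */
void vec_add_work_item(size_t gid, const float *a, const float *b, float *c)
{
    c[gid] = a[gid] + b[gid];
}

/* Host-side stand-in: a sequential loop that visits every work-item.
   A real runtime may execute these iterations in parallel, because no
   iteration reads a value written by another. */
void run_vec_add(size_t global_size, const float *a, const float *b, float *c)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t gid = 0; gid < global_size; gid++)
        vec_add_work_item(gid, a, b, c);
}
```

Writing the kernel as a function of a single index is what exposes the parallelism: the loop in `run_vec_add` carries no dependency between iterations, so an accelerator is free to unroll or pipeline it.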

OpenCL Host Program
- Pure software written in standard C
- Communicates with the accelerator device via a set of library routines that abstract the communication between the host processor and the kernels
    main() {
        read_data_from_file( … );
        manipulate_data( … );
        clEnqueueWriteBuffer( … );         /* copy data from host to FPGA */
        clEnqueueTask(…, my_kernel, …);    /* ask the FPGA to run a particular kernel */
        clEnqueueReadBuffer( … );          /* copy data from FPGA to host */
        display_result_to_user( … );
    }
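
The three-step flow in this host program (write buffer, run kernel, read buffer) can be sketched without an OpenCL installation by modelling the device as memory the host reaches only through copy routines. Everything below is a hypothetical mock: `mock_device`, `mock_write_buffer`, `mock_run_kernel`, and `mock_read_buffer` are stand-ins named after the roles of `clEnqueueWriteBuffer`, `clEnqueueTask`, and `clEnqueueReadBuffer`, and they do not reproduce those functions' real signatures or queueing behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEV_WORDS 16   /* assumed size of the mock device memory */

/* Mock accelerator whose memory is separate from host buffers.
   All names here are hypothetical stand-ins, not OpenCL API calls. */
typedef struct {
    int mem[DEV_WORDS];
} mock_device;

/* Plays the role of clEnqueueWriteBuffer: copy host data to the device. */
void mock_write_buffer(mock_device *dev, const int *src, size_t n)
{
    assert(dev != NULL && src != NULL && n <= DEV_WORDS);
    memcpy(dev->mem, src, n * sizeof *src);
}

/* Plays the role of clEnqueueTask: run one kernel on device memory. */
void mock_run_kernel(mock_device *dev, size_t n)
{
    assert(dev != NULL && n <= DEV_WORDS);
    for (size_t i = 0; i < n; i++)
        dev->mem[i] *= 2;   /* arbitrary example kernel: double each word */
}

/* Plays the role of clEnqueueReadBuffer: copy results back to the host. */
void mock_read_buffer(const mock_device *dev, int *dst, size_t n)
{
    assert(dev != NULL && dst != NULL && n <= DEV_WORDS);
    memcpy(dst, dev->mem, n * sizeof *dst);
}
```

Calling the three routines in order on a four-word input leaves the host input untouched and returns the doubled values in a separate output buffer, which mirrors the copy-in, compute, copy-out pattern of the slide. A real host program would additionally create a context, a command queue, and a compiled kernel object before enqueueing any work.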

OpenCL SoC FPGA Target
[Diagram labels: DDR* Interface IP; Application Specific External Protocols; Accelerator / Datapath Computation (twice); Host Program; Kernels; Processor, Memory (three times)]

Summary
- The era of SoC programmable solutions
  - The age of SoC FPGAs
- The era of innovation
  - Massively parallel embedded programmable solutions
- The new embedded software era
  - Embedded hardware-software co-design and parallel programming

Thank You
ALTERA, ARRIA, CYCLONE, HARDCOPY, MAX, MEGACORE, NIOS, QUARTUS and STRATIX words and logos are trademarks of Altera Corporation and registered in the United States and are trademarks or registered trademarks in other countries.