MPSoC Programming Solution: "CoreManager" hardware unit for dependency checking, task scheduling, and local memory management of PEs; C programmable.

Presentation transcript:

Slide 1 (TU Dresden): Heterogeneous MPSoC with Hardware Supported Dynamic Task Scheduling for SDR

MPSoC Programming Solution

- "CoreManager" hardware unit for:
  - Dependency checking
  - Task scheduling
  - Local memory management of PEs
- C programmable
- No synchronization interrupts
- OS scheduling eased

[Figure: Operating System, Process, Thread; tasks t1 to t6; CP; CoreManager; Processor 0 to Processor 3]
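The two CoreManager duties named above, dependency checking and task scheduling, can be illustrated with a small software simulation. This is only a sketch under stated assumptions: the slide does not describe the CoreManager's actual algorithm, so the greedy policy, the function name `schedule`, and the example task graph and run times are all hypothetical.

```python
import heapq


def schedule(tasks, deps, num_pes):
    """Greedy dynamic scheduling of a task graph onto num_pes processing elements.

    tasks: dict mapping task name -> run time in cycles (hypothetical values)
    deps:  dict mapping task name -> set of task names it must wait for
    Returns a dict mapping task name -> (PE index, start cycle, end cycle).
    """
    if num_pes < 1:
        raise ValueError("need at least one processing element")
    # Unfinished dependencies per task; a task is ready when its set is empty.
    remaining = {t: set(deps.get(t, ())) for t in tasks}
    ready = sorted(t for t, d in remaining.items() if not d)
    idle = list(range(num_pes))  # indices of idle PEs
    running = []                 # min-heap of (end cycle, task, PE index)
    placement = {}
    now = 0
    while ready or running:
        # Task scheduling: hand ready tasks to idle PEs.
        while ready and idle:
            task = ready.pop(0)
            pe = idle.pop(0)
            end = now + tasks[task]
            placement[task] = (pe, now, end)
            heapq.heappush(running, (end, task, pe))
        # Advance time to the next task completion.
        now, done, pe = heapq.heappop(running)
        idle.append(pe)
        idle.sort()
        # Dependency checking: release tasks whose last dependency just finished.
        for t in sorted(remaining):
            pending = remaining[t]
            if done in pending:
                pending.discard(done)
                if not pending:
                    ready.append(t)
    if len(placement) != len(tasks):
        raise ValueError("dependency cycle or unknown dependency")
    return placement


# Hypothetical example: t3 needs the results of both t1 and t2.
example = schedule({"t1": 100, "t2": 100, "t3": 100}, {"t3": {"t1", "t2"}}, 2)
for name in sorted(example):
    print(name, example[name])
```

In this sketch the processors never synchronize with each other directly; a task simply does not start until the scheduler has seen all of its dependencies complete, which is the role the slide assigns to the CoreManager.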

Slide 2: Heterogeneous MPSoC: 'Tomahawk'

[Figure: chip layout with a 10 mm dimension marker. Labeled blocks: Vector Fixed Point DSP, Scalar Floating Point DSP, Core Manager, Control Processor, Scratchpad Memory, LDPC Decoder, Filter ASIP, Peripherals. Area and power annotations: 5.9 mm², ~280 mW; 3.8 mm², ~85 mW; 2.5 mm², ~30 mW; 3.3 mm², ~27 mW.]

nm UMC; 40 GOPS, MHz

Slide 3: Software Scaling Results

Setup: 0% or 50% probability of dependence between tasks, 4 kB data transfers (in and out)

- Hardware task scheduling = power and performance efficient solution for MPSoC programming problem
- Scalability depends on:
  - Task-to-scheduling time ratio
  - Inter-task dependency
- Baseband signal processing:
  - Task time ~10^2 to 10^4 cycles
  - SW scheduling: ~1000 cycles/task
  - HW accelerated scheduling: ~60 cycles/task

[Figure: SpeedUp versus Number of Cores]
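To see why the task-to-scheduling time ratio matters, the cycle counts on this slide can be put into a rough bound. The model below is an assumption, not taken from the presentation: a single central scheduler dispatches tasks one at a time, tasks are independent, and data transfers are ignored. Under those assumptions the scheduler can keep at most task_cycles / sched_cycles cores busy. The function name `speedup_bound` is hypothetical.

```python
def speedup_bound(num_cores, task_cycles, sched_cycles):
    """Upper bound on speedup over sequential execution without scheduling.

    Assumed model (not from the slides): one central scheduler spends
    sched_cycles per task, serially, so it can keep at most
    task_cycles / sched_cycles cores busy at a time.
    """
    if num_cores < 1 or task_cycles <= 0 or sched_cycles <= 0:
        raise ValueError("arguments must be positive")
    return min(float(num_cores), task_cycles / sched_cycles)


# Cycle counts quoted on the slide: ~1000 cycles/task in software,
# ~60 cycles/task with hardware acceleration, task time 10^2 to 10^4 cycles.
for label, sched in (("SW", 1000), ("HW", 60)):
    for task in (100, 1000, 10000):
        bound = speedup_bound(16, task, sched)
        print(f"{label} scheduling, {task}-cycle tasks, 16 cores: bound {bound:.2f}")
```

For tasks of about 1000 cycles, this simplified bound gives no speedup at all for ~1000-cycle software scheduling, and roughly 16 cores' worth of work for ~60-cycle hardware scheduling. Inter-task dependencies and the 4 kB transfers, which the slide's measurements include, would lower both figures.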