EE249 Discussion: A Method for Architecture Exploration for Heterogeneous Signal Processing Systems. Sam Williams, EE249 Discussion Section, October 15, 2002

Presentation transcript:

1 EE249 Discussion
A Method for Architecture Exploration for Heterogeneous Signal Processing Systems
Sam Williams
EE249 Discussion Section, October 15, 2002

2 Related Work – System Level Modeling and Analysis
Polis/CFSMs
  – Elements are mapped to hardware and software components
  – Performance evaluated via simulation
  – Hardware/software synthesis
Chinook
  – Design of embedded systems
  – Mapping to IP blocks
  – Synthesized communication
RASSP System
  – VHDL modeling of DSPs
  – ADEPT environment for hardware/software co-design
Abstraction of architecture models can provide a speedup in design space exploration
SPADE separates architecture from application models
  – Functionality is not modeled in the architecture

3 Basics – Workloads and Resources
Applications generate workloads
  – Computation
  – Communication
  – Storage
Architecture provides resources
  – Computation: processors, coprocessors, ASICs, etc.
  – Communication: buses, Ethernet, specialized interfaces, etc.
  – Memory: RAMs, ROMs, etc.
A system is the realization of a graph that connects computation and memory components via communication components, together with the mapping of applications onto it

4 Basics – Traces
Signals
  – Logic transitions
  – Hardware specific
  – Example (signal names from the figure): a, b, c, d
Instructions
  – Specific to ISA
  – RISC instructions
  – Example:
      li a0,addr
      ld a1,0(a0)
      addi a1,1,a1
      sd a1,0(a0)
Macro Instructions / Functions (extremely coarse-grain)
  – iDCT
  – Structure moves
  – Example:
      load_next_frame(frame);
      decode_frame(frame,temp);
      copy_frame_to_buffer(temp);
      update();
      …

5 Architecture Modeling
Functional models are not required
Data-dependent behavior results in data-dependent traces
Built from a library of components
Processing resource:
  – Trace Driven Execution Unit (TDEU) = trace interpreter
      – Table of latencies for each instruction
      – Could be extended for other metrics (power, cost, etc.)
  – Some number of communication interfaces
      – Each translates the generic internal protocol to a specific one
Other resources include buses and memories
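The trace interpreter idea on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions: the class name, the operation names, and the cycle counts in `LATENCY_TABLE` are made up for the example and do not come from the slides or from SPADE itself.

```python
from dataclasses import dataclass

# Hypothetical latency table (cycles per trace entry); values are illustrative only.
LATENCY_TABLE = {"idct": 120, "vld": 80, "copy": 20}


@dataclass
class TraceDrivenExecutionUnit:
    """Interprets symbolic trace entries; it models timing, not functionality."""
    latencies: dict
    busy_cycles: int = 0

    def interpret(self, trace):
        # Look up each trace entry in the latency table and accumulate its cost.
        for op in trace:
            self.busy_cycles += self.latencies[op]
        return self.busy_cycles


tdeu = TraceDrivenExecutionUnit(LATENCY_TABLE)
total = tdeu.interpret(["vld", "idct", "idct", "copy"])
print(total)  # 340
```

Other metrics such as power could be handled the same way by adding a second table keyed on the same operation names.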

6 Application Modeling
Map functions to Kahn process networks
Unbounded FIFOs are an acceptable approximation
Read/Write operations
  – Generate a trace entry (bytes transferred over the channel)
  – Perform the port accesses in the Kahn process network
Execution operation
  – Only generates trace entries
[Figure: application code with functions F1, F2, F3 and processes P1, P2]
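A minimal sketch of how read, write, and execute operations could emit trace entries follows. The `Channel` and `Process` classes, the 64-byte token, and the assignment of F1 to P1 and F2 to P2 are assumptions made for illustration; the slides do not give this level of detail, and a real Kahn read would block on an empty FIFO rather than fail.

```python
from collections import deque


class Channel:
    """Unbounded FIFO, as the Kahn model assumes."""
    def __init__(self):
        self.fifo = deque()


class Process:
    def __init__(self, name):
        self.name = name
        self.trace = []  # symbolic trace entries handed to the architecture model

    def write(self, channel, token, nbytes):
        channel.fifo.append(token)            # real port access
        self.trace.append(("write", nbytes))  # trace entry: bytes transferred

    def read(self, channel, nbytes):
        token = channel.fifo.popleft()        # sketch only: no blocking on empty FIFO
        self.trace.append(("read", nbytes))
        return token

    def execute(self, function_name):
        # Execution only records a trace entry; it performs no port access.
        self.trace.append(("execute", function_name))


p1, p2 = Process("P1"), Process("P2")
ch = Channel()

# Hypothetical schedule: P1 runs F1 and sends a 64-byte block; P2 receives it and runs F2.
p1.execute("F1")
p1.write(ch, token=[0] * 64, nbytes=64)
block = p2.read(ch, nbytes=64)
p2.execute("F2")

print(p1.trace)  # [('execute', 'F1'), ('write', 64)]
print(p2.trace)  # [('read', 64), ('execute', 'F2')]
```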

7 Mapping, Simulation, and Analysis
Mapping
  – Processes are mapped to a TDEU (n to 1)
  – Ports are mapped to interfaces of the TDEU (1 to 1)
Simulation
  – Application and architecture models are co-simulated
  – Traces are generated on the fly
  – Performance numbers are obtained by co-simulating the traces on the architecture
Analysis
  – Utilization, stalls, latencies, bandwidth
  – Could add power, area, cost, etc.
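As a rough illustration of the n-to-1 process mapping and one of the analysis metrics, the sketch below computes per-TDEU utilization. The unit names, the busy-cycle counts, and the total simulated time are all hypothetical values chosen for the example.

```python
# Hypothetical mapping: several processes may share one TDEU (n to 1).
mapping = {"P1": "tdeu0", "P2": "tdeu0", "P3": "tdeu1"}

# Hypothetical busy cycles each process contributed during simulation.
busy_cycles = {"P1": 400, "P2": 200, "P3": 300}
simulated_cycles = 1000  # total simulated time, in cycles


def utilization(mapping, busy_cycles, simulated_cycles):
    """Fraction of simulated time each TDEU spent executing trace entries."""
    per_unit = {}
    for process, unit in mapping.items():
        per_unit[unit] = per_unit.get(unit, 0) + busy_cycles[process]
    return {unit: cycles / simulated_cycles for unit, cycles in per_unit.items()}


result = utilization(mapping, busy_cycles, simulated_cycles)
print(result)  # {'tdeu0': 0.6, 'tdeu1': 0.3}
```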

8 The Y-Chart
Applications and architecture are clearly separable
Several applications will be run on this system
Representative applications are collected
Designer makes a best guess at the architecture
System is evaluated by mapping each application to the architecture, simulating, and analyzing the resulting numbers
Designer then redesigns the architecture and/or applications and repeats the mapping/simulation flow

9 Y-Chart (continued)
[Figure: Y-chart flow]
  – Application side: Applications (C/C++), Application Models
  – Architecture side: Spec, Blocks, Architecture Model
  – Flow: Mapping, Simulations, Analysis
  – Feedback paths: remap, repartition, rearchitect
  – Function|Latency Table sources: cycle accurate simulator, databook, guesses

10 MPEG2 Example
C code was partitioned and mapped to a Kahn process network
Run standalone to gather frequencies of operations and bandwidth requirements
Mapped to a TriMedia MPEG2 system (10 processing elements, 33 interfaces)
Simulated over a series of streams, bus loads, and frame periods, yielding a frames-dropped metric
Performance simulation was about 3600 times slower than hardware
  – 300 CPU days for a 2 hour movie
  – Limits analysis to short clips
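The 300 CPU day figure follows directly from the quoted slowdown, as this short check shows. The 10-second clip length at the end is an assumed example, not a number from the slides.

```python
movie_hours = 2
slowdown = 3600  # simulation slowdown relative to hardware, as quoted on the slide

simulation_hours = movie_hours * slowdown  # 7200 hours of simulation
cpu_days = simulation_hours / 24
print(cpu_days)  # 300.0

# Hypothetical short clip: 10 seconds of video takes 36000 s, i.e. 10 hours, to simulate.
clip_seconds = 10
clip_simulation_hours = clip_seconds * slowdown / 3600
print(clip_simulation_hours)  # 10.0
```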

11 Conclusion
Easy exploration of heterogeneous programmable architectures
On-the-fly trace-driven co-simulation
Functionality is not required, only behavior
Can be extended to analyze any number of metrics (power, cost, area, etc.), although the authors did not do so
  – Frames_Dropped(x,y,z,…) = 0
  – Power(x,y,z,…) < 25W
  – Cost(x,y,z,…) < $30
× Application is partitioned by hand
× Mapping is performed by hand
× Performance characteristics of components must be simulated, known, or estimated
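If the analysis were extended to several metrics, the three constraints on this slide could be used to filter candidate design points. The sketch below assumes that per-architecture results are already available; the candidate names and all metric values are invented for illustration.

```python
# Hypothetical analysis results for three candidate architectures.
candidates = {
    "arch_a": {"frames_dropped": 0, "power_w": 22.0, "cost_usd": 28.0},
    "arch_b": {"frames_dropped": 3, "power_w": 18.0, "cost_usd": 25.0},
    "arch_c": {"frames_dropped": 0, "power_w": 27.5, "cost_usd": 29.0},
}


def meets_constraints(metrics):
    """Constraints from the slide: no dropped frames, under 25 W, under $30."""
    return (
        metrics["frames_dropped"] == 0
        and metrics["power_w"] < 25
        and metrics["cost_usd"] < 30
    )


feasible = [name for name, metrics in candidates.items() if meets_constraints(metrics)]
print(feasible)  # ['arch_a']
```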