System-level Trade-off of Networks-on-Chip Architecture Choices Network-on-Chip System-on-Chip Group, CSE-IMM, DTU.

© System-on-Chip Group, CSE-IMM, DTU

Motivation
[Figure: layered system view. Labels: Application (tasks 1-5 and their dependencies), Middleware (OS, HdS), Hardware (system-on-chip with processing elements a, b, c and a network), mapping.]

System-level Analysis
• Consequences of different application decomposition and mappings of tasks to processors (software or hardware)
• Effects of different middleware: scheduling, synchronization and resource allocation policies
• Effects of different network topologies and communication protocols

Outline
• Motivation
• Modeling of Communication
  - Properties of Networks-on-Chip (NoC)
• Example
• Design Space Exploration
  - Timing Aware and others
• Conclusions

Modeling of Communication
[Figure: communication between processors a and b drawn for three architectures: point-to-point, bus, and network-on-chip (e.g. mesh). Labels in the NoC drawings: routers R1-R3, links L1-L4.]
NoC combines multi-hop, concurrency and sharing.

System Analysis Methodology
1. Choose hardware
2. Map tasks
3. Choose communication architecture
4. Evaluate the performance and cost
5. Iterate until performance and cost are met
Result: optimal system
Annotation on the slide: specifically for NoC
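The iterate-until-met loop of this methodology can be sketched as below. This is only an illustration under stated assumptions, not the group's tool: the function names `explore` and `toy_evaluate`, the mapping and network labels, and all latency and cost numbers are invented for the example.

```python
from itertools import product


def explore(mappings, networks, evaluate, max_latency, max_cost):
    """Try (mapping, network) candidates until one meets both targets.

    `evaluate` is a caller-supplied function returning (latency, cost).
    Returns (mapping, network, latency, cost) or None if nothing fits.
    """
    for mapping, network in product(mappings, networks):
        latency, cost = evaluate(mapping, network)
        if latency <= max_latency and cost <= max_cost:
            return mapping, network, latency, cost
    return None


# Toy figures, invented for illustration only (not from the slides).
TOY_LATENCY = {"bus": 12, "mesh": 9, "torus": 8}
TOY_COST = {"bus": 1, "mesh": 3, "torus": 4}


def toy_evaluate(mapping, network):
    # The toy model ignores the mapping; a real evaluator would not.
    return TOY_LATENCY[network], TOY_COST[network]


result = explore(["timing_aware"], ["bus", "mesh", "torus"],
                 toy_evaluate, max_latency=10, max_cost=3)
print(result)
```

A loop of this kind stops at the first candidate that satisfies the targets, so it finds a feasible design; to claim an optimal one, every candidate would have to be evaluated and compared.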

Networked Multi-processors
• Data transfers between processors are considered as message tasks
• The network can be considered as a communication processor on which message tasks are scheduled
• The network provides:
  - Topology = resource allocator
  - Protocol = scheduler
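One way to picture the network as a communication processor is the small sketch below, where the topology decides which resources a message task occupies and the protocol decides the order in which message tasks are served. It is a simplified illustration, not the framework behind the slides: the function `schedule`, the link names, the assumption that all messages are ready at cycle 0, and the assumption that a message holds its whole route for a fixed duration are all hypothetical.

```python
def schedule(messages, route_of, priority_of, duration=1):
    """Schedule message tasks on network resources.

    route_of:    topology as resource allocator, message -> set of resources.
    priority_of: protocol as scheduler, message -> sort key (lower goes first).
    Assumptions (not from the slides): every message is ready at cycle 0 and
    holds all resources on its route for `duration` cycles.
    Returns {message: (start_cycle, finish_cycle)}.
    """
    free_at = {}   # resource -> first cycle at which it is free again
    result = {}
    for msg in sorted(messages, key=priority_of):
        resources = route_of(msg)
        start = max((free_at.get(r, 0) for r in resources), default=0)
        finish = start + duration
        for r in resources:
            free_at[r] = finish
        result[msg] = (start, finish)
    return result


# Hypothetical example: three messages, fixed priorities X > Y > Z.
PRIORITY = {"X": 0, "Y": 1, "Z": 2}
NOC_ROUTES = {"X": {"L1"}, "Y": {"L2"}, "Z": {"L1", "L2"}}

noc = schedule(list(PRIORITY), NOC_ROUTES.__getitem__, PRIORITY.__getitem__)
bus = schedule(list(PRIORITY), lambda m: {"BUS"}, PRIORITY.__getitem__)
print(noc)
print(bus)
```

In this toy setup X and Y run concurrently on the NoC because their routes share no link, while Z waits for both; on the bus all three messages are serialized.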

Design Space Exploration
Simple MPSoC example:
• 5 identical tasks
• 3 inter-task dependencies (x, y, z)
• 3 identical processors (a, b, c)
• Unknown network
[Figure: task graph (tasks and their dependencies) and two candidate mappings onto processors a, b, c, labelled "Allocation Aware" and "Timing Aware", each with the network still undetermined.]

Timing Aware
Mapping: PE a: 1 & 2; PE b: 3; PE c: 4 & 5
[Figure: task graph (tasks and their dependencies x, y, z) and schedules of the messages x, y, z on the links for TORUS, MESH and BUS. Value shown under each of the three topologies: 8.]
Routes listed on the slide (the transcript does not preserve which NoC drawing each set belongs to):
• X: R1,L1,R2; Y: R3,L2,R2; Z: R1,L1,R2,L2,R3
• X: R1,L3,R3,L2,R2; Y: R3,L2,R2; Z: R1,L3,R3
• X: BUS; Y: BUS; Z: BUS
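The route lists on this slide can be checked for shared resources with a few lines of code. The sketch below is an illustration only: the labels `set_1` and `set_2` are placeholders because the transcript does not say which topology each route set belongs to, and treating only links and the bus (not routers) as contended resources is an assumption.

```python
from collections import defaultdict

# Routes copied from the slide; set names are placeholders.
ROUTES = {
    "set_1": {"X": ["R1", "L1", "R2"],
              "Y": ["R3", "L2", "R2"],
              "Z": ["R1", "L1", "R2", "L2", "R3"]},
    "set_2": {"X": ["R1", "L3", "R3", "L2", "R2"],
              "Y": ["R3", "L2", "R2"],
              "Z": ["R1", "L3", "R3"]},
    "bus":   {"X": ["BUS"], "Y": ["BUS"], "Z": ["BUS"]},
}


def is_router(hop):
    # Assumption: names starting with "R" are routers and are not contended.
    return hop.startswith("R")


def shared_resources(routes):
    """Return {resource: [messages]} for resources used by 2+ messages."""
    users = defaultdict(list)
    for msg, path in routes.items():
        for hop in path:
            if not is_router(hop):
                users[hop].append(msg)
    return {res: msgs for res, msgs in users.items() if len(msgs) > 1}


def link_hops(routes):
    """Number of links (or bus segments) each message traverses."""
    return {msg: sum(1 for hop in path if not is_router(hop))
            for msg, path in routes.items()}


for name, routes in ROUTES.items():
    print(name, link_hops(routes), shared_resources(routes))
```

Under these assumptions each NoC route set lets two of the three messages proceed concurrently on disjoint links, whereas the bus is shared by all three.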

Deadline-based Performance
[Figure: schedules of messages x, y, z on the links (L1-L4) or the bus, for TORUS, MESH and BUS, under three configurations.]
• QoS Aware: any traffic from "a" has higher priority
• Timing Aware: PE a: 1 & 2; PE b: 3; PE c: 4 & 5
• Allocation Aware: PE a: 2 & 3; PE b: 4 & 5; PE c: 1

Power Profile
[Figure: the deadline-based performance schedules for TORUS, MESH and BUS under the three configurations, repeated as the basis for the power profile.]
• QoS Aware: any traffic from "a" has higher priority
• Timing Aware: PE a: 1 & 2; PE b: 3; PE c: 4 & 5
• Allocation Aware: PE a: 2 & 3; PE b: 4 & 5; PE c: 1

Power Profile
[Figure: deadline-based performance schedules for TORUS and BUS with the corresponding power profiles. Legend: one marker = 100 power units, another marker = 10 power units. Totals are given in power units and rates in power-units/cycle; the numeric values are not preserved in the transcript.]

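Power Profile over 3 Periods
[Figure: power profiles of torus and bus over three periods.]
• Torus is ~4 cycles faster than the bus
• Torus is faster but causes power spikes
• Percentages shown on the plot: 250% and 201% (the transcript does not preserve what they are relative to)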

Conclusions
• System-level modeling framework which combines application, middleware and execution platform
• Extension to model network-on-chip
• Example
  - System-level trade-off analysis
  - Early design space exploration
• Work in progress
  - Find real application for evaluation