
Configuration Reusing in On-Line Task Scheduling for Reconfigurable Computing Systems
By M.M. Bassiri and H. S. Shahhoseini
Presented by Elisha Colmenar, School of Engineering, Engineering Systems and Computing

Outline Introduction Background Real-Time Application Defining the Problem Proposed Algorithm Simulation Results Critiques Conclusion

Introduction
Reconfigurable Computing (RC) systems can:
- Be reconfigured at runtime
- Support partial reconfigurability
- Execute tasks in a true multitasking manner
Managing such a system at runtime requires a reconfigurable operating system built around a Reconfigurable Processing Unit (RPU). Like any operating system, it needs an efficient task scheduler; one of its main responsibilities is managing resources for tasks and assigning them to the RPU. This paper targets 1D area scheduling.

Background
Tasks are soft real-time and independent. Each task is modeled as Ti = (wi, hi, ei, ai, di, ri):
- wi = width
- hi = height
- ei = execution time
- ai = arrival time
- di = deadline
- ri = reconfiguration time of the task
Soft real-time: a missed deadline is not fatal. Independent: there are no dependencies between tasks.
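The task model above can be sketched as a small data class. This is a sketch of my own; the paper defines only the tuple, not any code, and the latest-start-time helper is my addition derived from the deadline and execution time.

```python
from dataclasses import dataclass

# Sketch of the task tuple Ti = (wi, hi, ei, ai, di, ri) from the slide.
@dataclass
class Task:
    w: int  # width on the RPU surface
    h: int  # height on the RPU surface
    e: int  # execution time
    a: int  # arrival time
    d: int  # deadline (soft: a miss is not fatal)
    r: int  # reconfiguration time

    def latest_start_time(self) -> int:
        # The task must start by d - e to finish before its deadline.
        return self.d - self.e

t = Task(w=4, h=4, e=10, a=0, d=30, r=3)
print(t.latest_start_time())  # 20
```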

Background Cont’d
Two types of RPU area models for task mapping, each with a flexible and a partitioned variant:
- 2D area model: flexible 2D, partitioned 2D
- 1D area model: flexible 1D, partitioned 1D

Real-Time Application
A large variety of applications can satisfy the stated assumptions, such as:
- Reconfigurable co-processor implementation
- Image and video processing
- Cryptography
- Telecommunication
- Neural network implementation

Defining the Problem
Find an efficient on-line scheduler and placer for independent tasks on an RPU with the 1D area model. The main goals are to minimize:
- Task rejection ratio (TRR)
- Overall execution time of the tasks
- Reconfiguration overhead (i.e., total reconfiguration time)
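As a quick illustration of the first goal, the task rejection ratio is simply the fraction of arrived tasks that the scheduler could not place. This is a minimal sketch; the function name is mine, not the paper's.

```python
def task_rejection_ratio(rejected: int, arrived: int) -> float:
    """Fraction of arrived tasks that could not be scheduled."""
    return rejected / arrived if arrived else 0.0

print(task_rejection_ratio(5, 50))  # 0.1
```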

Proposed Algorithm
Reusing-Based Scheduling (RBS): some arrived tasks are not removed from the RPU surface after their execution completes; they stay on the surface as long as possible. Reusing a configuration that is already on the RPU eliminates its reconfiguration overhead. With this idea alone, however, it is impossible to keep all arrived tasks on the RPU, so the algorithm is extended as described on the following slides.
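The core saving of RBS can be sketched in a few lines: a task pays its reconfiguration time only when its configuration is not already preserved on the RPU surface. The names and set-based representation here are my own, not the paper's.

```python
def start_overhead(config_id: str, r: int, preserved: set) -> int:
    """Reconfiguration overhead paid when a task starts under RBS."""
    # Zero cost when the configuration is still on the RPU surface.
    return 0 if config_id in preserved else r

preserved_configs = {"fft", "aes"}  # hypothetical preserved configurations
print(start_overhead("fft", r=5, preserved=preserved_configs))  # 0 (reused)
print(start_overhead("dct", r=5, preserved=preserved_configs))  # 5 (reconfigure)
```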

Division of Tasks
Arriving tasks are divided into two groups (P1 and P2):
- Significant (P2): large reconfiguration overhead (Oi) and high probability of recurrence, estimated with a Poisson probability distribution. These are preserved for configuration reuse to reduce reconfiguration overhead.
- Non-significant (P1): removed from the RPU after their completion.
Existing tasks are divided into three categories:
- Running tasks: currently executing on the RPU
- Scheduled tasks: scheduled but not yet started
- Preserved tasks: execution has completed; kept on the surface for reuse
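A hedged sketch of the significance test: the slides only say that a task is significant when its reconfiguration overhead is large and its Poisson-modelled probability of recurrence is high. The thresholds and the exact recurrence formula below are my assumptions, not the paper's.

```python
import math

def recurrence_probability(arrival_rate: float) -> float:
    # P(at least one more arrival in a unit window) under Poisson(rate).
    return 1.0 - math.exp(-arrival_rate)

def is_significant(overhead: float, arrival_rate: float,
                   overhead_threshold: float = 5.0,   # assumed threshold
                   prob_threshold: float = 0.5) -> bool:  # assumed threshold
    return (overhead > overhead_threshold
            and recurrence_probability(arrival_rate) > prob_threshold)

print(is_significant(overhead=10.0, arrival_rate=2.0))  # True  -> group P2
print(is_significant(overhead=1.0, arrival_rate=2.0))   # False -> group P1
```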

Total Scheme of RPU Partitioning
The RPU area is divided into three regions, which prevents surface fragmentation:
- Free area: holds no tasks
- Occupied area: holds running tasks and scheduled tasks
- Preserved area: holds preserved tasks
As shown in the slide's figure, non-significant (P1) and significant (P2) tasks are placed in partitions of a certain width (e.g., WP1), and the partitions can be dynamically resized at runtime.

Dynamic Partitioning
Partition utilization (PU) is computed per partition Pi, in terms of:
- Woccupied = total width of the occupied area
- Wpreserved = total width of the preserved area
- Wfree = total width of the free area in partition Pi
- WPi = width of partition Pi
A constraint bounds the maximum allowable (feasible) partition size.
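The PU computation can be sketched as follows. The original formula was a slide image, so the exact form here, PU = (Woccupied + Wpreserved) / WPi with WPi = Woccupied + Wpreserved + Wfree, is my reading of the definitions, not the paper's equation.

```python
def partition_utilization(w_occupied: int, w_preserved: int, w_free: int) -> float:
    """Fraction of the partition's width that is in use (assumed form).

    The partition width WPi is taken to be the sum of its occupied,
    preserved, and free widths.
    """
    w_pi = w_occupied + w_preserved + w_free
    return (w_occupied + w_preserved) / w_pi

print(partition_utilization(w_occupied=30, w_preserved=10, w_free=10))  # 0.8
```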

Task Placement
- Best-fit policy: the smallest preserved task that can accommodate the arrived task is selected for replacement.
- Least Probability of Recurrence (LPR) policy: among preserved tasks large enough to accommodate the new task, the one with the smallest probability of recurrence is replaced by the new task.
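The two replacement policies can be sketched over a list of preserved tasks. The dict representation and field names are my own; the paper gives only the policy descriptions.

```python
def best_fit(preserved, new_width):
    """Smallest preserved task wide enough to host the new task."""
    fits = [t for t in preserved if t["width"] >= new_width]
    return min(fits, key=lambda t: t["width"]) if fits else None

def least_prob_of_recurrence(preserved, new_width):
    """Among preserved tasks wide enough, the one least likely to recur."""
    fits = [t for t in preserved if t["width"] >= new_width]
    return min(fits, key=lambda t: t["p_recur"]) if fits else None

preserved = [  # hypothetical preserved tasks
    {"name": "A", "width": 4, "p_recur": 0.9},
    {"name": "B", "width": 6, "p_recur": 0.2},
    {"name": "C", "width": 8, "p_recur": 0.1},
]
print(best_fit(preserved, 5)["name"])                  # B (smallest that fits)
print(least_prob_of_recurrence(preserved, 5)["name"])  # C (lowest p_recur)
```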

Pseudo-code of Algorithm
if (SGi == 0) {                       /* non-significant task */
    if (free area exists in P1) {
        Use the Stuffing method       /* schedules tasks into arbitrary free
                                         rectangles that will exist in the future */
    } else if (partition resizing is feasible) {
        Expand P1 and place the new task in P1
    } else {
        Reject the task
    }
} else {                              /* SGi == 1: significant task */
    Scan the occupied and preserved areas of P2
    if (the task's configuration exists in P2) {      /* reuse */
        if (the existing instance j is running) {
            fj = sj + ej              /* finish time = start time + execution time */
            LST = di - ei             /* latest start time = deadline - execution time */
            if (LST > fj) {
                Let the running instance finish, then schedule the new
                task to run immediately after it
            } else {                  /* LST < fj: reuse would miss the deadline */
                if (free area is available)               { Use the free area }
                else if (partition expansion is feasible) { Resize the partition }
                else if (some task in P2 can be replaced) { Replace it with the new task }
                else                                      { Reject the new task }
            }
        } else {                      /* instance preserved, not running */
            Reuse the preserved configuration
        }
    } else {
        Use the free area
    }
}

Simulation Setup
- Language: C++
- Host: 1 GHz Pentium III computer
- Simulated device: 96 × 64 RCU (Xilinx XCV1000 FPGA, Virtex 2.5)
- Four groups of input; each task group includes 20 task sets, each of which includes 50 synthetic tasks

Simulation Setup Cont’d
The simulated arriving tasks have a probability distribution rate based on real-world applications, shown in Table 1.

Simulation Setup Cont’d
The four input groups differ in a parameter called repetition rate (RR):
- Group A: 5 ≤ RRA ≤ 15
- Group B: 15 < RRB ≤ 30
- Group C: 30 < RRC ≤ 60
- Group D: RRD = 0
The other parameters are shown in Table 2.

Simulation Setup Cont’d
For the reported results, a Workload parameter is used instead of raw task arrival times, defined in terms of:
- Ti = arriving task
- ΔT = time interval from the arrival of the first task to the last
- W = total width of the RPU
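The Workload formula itself was a slide image. A common normalization, and my assumption here, is total requested width-time over offered capacity, Workload = Σi(wi · ei) / (ΔT · W):

```python
def workload(tasks, delta_t, total_width):
    """Workload = sum_i(w_i * e_i) / (delta_t * W)  -- assumed form.

    tasks: iterable of (width, execution_time) pairs.
    """
    return sum(w * e for (w, e) in tasks) / (delta_t * total_width)

# Two hypothetical tasks over a 10-unit interval on a width-7 RPU:
print(workload([(4, 10), (6, 5)], delta_t=10, total_width=7))  # 1.0
```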

Simulation Setup Cont’d
Reusing-Based Scheduling (RBS) is compared against three other schedulers:
- Window-Based Stuffing (WBS)
- Fragmentation-Aware by Handa (FA-H)
- Fragmentation-Aware by Cui (FA-C)

Simulation Results
Task rejection ratio for Task Groups A, B, C, and D: as the task repetition rate becomes larger, the RBS task rejection ratio becomes lower. As expected, the TRR of RBS is not better in Group D, because its repetition rate is zero.

Total Execution Time of Tasks
RBS is much better than the other algorithms because of the high degree of configuration reuse in the RBS algorithm.

Runtime Comparison

Critiques
In my own opinion:
- The algorithm can be reproduced, but should be reproduced using the latest Xilinx device (Virtex 6 or 7) and a faster computer.
- The paper is well written, with good flow from the background at the beginning to the simulation results at the end.
- The meaning of the Workload parameter across the various algorithms was not obvious; the execution time and runtime comparisons were the more important results.
- The paper only uses the 1D area model, even though 2D area models are more efficient in configurable resource utilization.

Conclusion
Reusing-Based Scheduling (RBS) performs on-line scheduling and placement based on maximal task reuse. RBS addresses the stated goals:
- Minimizes the task rejection ratio (TRR)
- Has the lowest total execution time of the tasks
- Reduces reconfiguration overhead

THANK YOU