Partitioning Presented by AMIT KUMAR GUPTA(2001VLS007)

Partitioning Presented by AMIT KUMAR GUPTA(2001VLS007) RAM BABU ROY(2001VLS022)

Agenda
- Motivation
- Objective
- System partitioning: structural and functional
- Major partitioning issues
- Survey of basic algorithms
- Partitioning functionality among hardware components
- Partitioning functionality among both hardware and software components
- Hardware components are implemented by designing structure
- Software components are implemented by compiling software
- Applications of partitioning

Motivation
- Extend design automation to the system level
- Support integrated design of HW/SW

What does HW/SW partitioning actually mean?
- Selecting the appropriate parts of the system for hardware or software implementation
- This has a crucial impact on the cost and overall performance of the system
- For small systems, partitioning can be done using the designer's experience and intuition
- For large systems, it needs high-performance heuristics and CAD tools

Steps in partitioning
- HW/SW partitioning problem formulation
- Optimize performance
- Minimize cost
- Satisfy all design constraints
- Some performance constraints can only be met in hardware, so a dedicated ASIC/FPGA is used

Partitioning approaches differ in
- Initial specification
- Level of granularity
- Degree of automation
- Cost function
- Partitioning algorithm

Structural partitioning
- Structural partitioning can be easily mapped to a graph partitioning problem
- Size/performance tradeoffs are difficult
- Large number of objects
- Limited to hardware designs only

Functional partitioning
Divide the functionality into indivisible pieces called functional objects.
Advantages:
- Size/performance tradeoffs
- Small number of objects
- Both hardware and software solutions are possible

Partitioning issues
- Specification abstraction level
- Granularity
- System-component allocation
- Metrics and estimations
- Objective and closeness functions
- Partitioning algorithms
- Output
- Flow of control and designer interaction

Metrics
- Monetary cost
- Execution time
- Communication bit rates
- Power consumption, area, pins
- Testability, reliability, program size, data size, memory size

Partitioning algorithms
- Constructive algorithms
- Iterative algorithms

Basic partitioning algorithms
- Random mapping
- Hierarchical clustering
- Multistage clustering
- Group migration
- Ratio cut
- Kernighan-Lin algorithm
- Simulated annealing
- Genetic evolution
- Integer linear programming

Hardware/software partitioning algorithms
- Greedy algorithms
- Hill-climbing algorithms
- Binary constraint search (BCS) HW/SW partitioning
- Energy-conscious HW/SW partitioning
- Preference-driven hierarchical HW/SW partitioning
- Simulated annealing
- Tabu search

Energy-Conscious HW/SW-Partitioning of Embedded Systems: Basic Concept
Energy dissipation is a hot topic in the design of embedded systems, especially mobile ones. This is because applications such as digital video cameras and cellular phones draw their current from batteries that can supply only a limited amount of energy. We show that energy-conscious HW/SW partitioning can lead to drastic reductions in the energy dissipation of a whole embedded system. The obtained results show energy savings of up to 59% while the performance remains approximately the same or even improves slightly. As a main result, energy-conscious HW/SW partitioning is a promising method to be deployed in addition to classical energy and/or power reduction methods. Since the power dissipation varies according to the executed instruction, the term software energy is justified.

Preference-Driven Hierarchical Hardware/Software Partitioning
We present a hierarchical evolutionary approach to hardware/software partitioning for real-time embedded systems. In contrast to most previous approaches, we apply a hierarchical structure and dynamically determine the granularity of tasks and hardware modules to adaptively optimize the solution while keeping the search space as small as possible. Efficient ranking is another problem addressed in this paper. Experimental results show that our algorithm is both effective and efficient.

Hierarchical Models and our approach

Hierarchical Models and our approach (continued)

Hierarchical Evolutionary Algorithm
In the hardware/software partitioning problem, for a nonhierarchical task graph, each node is to be assigned to a hardware module. In an EA, such a node-hardware tuple becomes a gene in an individual. For the hierarchical task graph, however, how to encode genes needs careful consideration. A simple approach is to associate each element with a finest-level task node. Note that no task is represented more than once in an HTG instance. This guarantees correctness when constructing the individual. We use the notation (Vi, Mk) to denote that task Vi is assigned to (hardware) module Mk. Then a gene list for the instance in Figure 2(a) might be {(V1, M1), (V21, M2), (V22, M3), (V31, M4), (V32, M5)}, and {(V1, M'1), (V21, M'2), (V22, M'3), (V31a, M'4), (V31b, M'5), (V31c, M'6), (V31d, M'7), (V32, M'8)} for Figure 2(b).
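
As a rough illustration of the encoding described above, the sketch below stores an individual as a list of (task, module) genes and checks that no task appears more than once. The names `Gene` and `is_valid_individual` and the task and module labels are assumptions made for this example; they are not taken from the original paper.

```python
from typing import List, Tuple

# Hypothetical representation: a gene is a (task name, hardware module name) pair.
Gene = Tuple[str, str]


def is_valid_individual(genes: List[Gene]) -> bool:
    """Return True if no task is represented more than once in the gene list."""
    tasks = [task for task, _module in genes]
    return len(tasks) == len(set(tasks))


# A coarse instance: V31 is kept as a single task.
coarse = [("V1", "M1"), ("V21", "M2"), ("V22", "M3"), ("V31", "M4"), ("V32", "M5")]

# A finer instance: V31 is replaced by its subtasks V31a..V31d.
# Module labels here are illustrative only.
fine = [
    ("V1", "M1"), ("V21", "M2"), ("V22", "M3"),
    ("V31a", "M4"), ("V31b", "M5"), ("V31c", "M6"), ("V31d", "M7"),
    ("V32", "M8"),
]

coarse_ok = is_valid_individual(coarse)
fine_ok = is_valid_individual(fine)
# Listing V1 twice violates the "no task more than once" rule.
duplicate_ok = is_valid_individual(coarse + [("V1", "M5")])
```

A flat list keeps the gene order explicit, which is convenient when a gene is later replaced by the genes of its subtasks.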

Hierarchical Evolutionary Algorithm (continued)
- Ns is the total number of nodes at the finest level
- Ni is the number of nodes in the present instance
- K is a user-defined constant
- G is the number of iterations
- Θ is the probability of going deeper into a complex node
- 1 - Θ is the probability of mapping Vi to another hardware module
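
One possible reading of the two probabilities above is a mutation step that, with probability Θ, expands a complex node into its subtasks and otherwise remaps the task to another module. The sketch below follows that reading. The function `mutate`, the `children` map, and the rule that a node without subtasks is always remapped are assumptions for illustration; the transcript does not give the formula that derives Θ from Ns, Ni, K and G, so `theta` is simply passed in.

```python
import random
from typing import Dict, List, Tuple

Gene = Tuple[str, str]  # hypothetical (task name, module name) pair


def mutate(
    genes: List[Gene],
    index: int,
    children: Dict[str, List[str]],
    modules: List[str],
    theta: float,
    rng: random.Random,
) -> List[Gene]:
    """Return a mutated copy of `genes`; the input list is left unchanged."""
    task, module = genes[index]
    subtasks = children.get(task, [])
    if subtasks and rng.random() < theta:
        # Go deeper: replace the complex node by its subtasks,
        # each assigned to a randomly chosen module.
        expanded = [(sub, rng.choice(modules)) for sub in subtasks]
        return genes[:index] + expanded + genes[index + 1:]
    # Otherwise map the task to a different module, if one exists.
    others = [m for m in modules if m != module]
    new_module = rng.choice(others) if others else module
    return genes[:index] + [(task, new_module)] + genes[index + 1:]


rng = random.Random(0)
parent = [("V1", "M1"), ("V31", "M2")]
children = {"V31": ["V31a", "V31b"]}
modules = ["M1", "M2", "M3"]

expanded = mutate(parent, 1, children, modules, theta=1.0, rng=rng)
remapped = mutate(parent, 0, children, modules, theta=0.0, rng=rng)
```

With `theta=1.0` the complex node is always expanded, and with `theta=0.0` the gene is always remapped, which makes the two branches easy to check in isolation.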

Hierarchical Evolutionary Algorithm (continued)

Preference Driven Ranking
When solving the partitioning problem, handling multiple, often conflicting design objectives is not easy. ISMAUT offers an efficient way to compare alternative designs according to the designer's preferences. Let the fitness of a design x be represented by Vx, and denote the kth attribute of x by ak(x). Then $V_x = \sum_{k=1}^{n} w_k \, v_k(a_k(x))$, where n is the number of attributes, vk() maps the raw attribute values to the set [0, 1], and wk is the corresponding weight.

Preference Driven Ranking (continued)
Designs with a larger fitness value are considered to be more desirable. Let x and x' be two individuals with attributes ak(x) and ak(x'), k = 1, 2, ..., n. Suppose that, according to the designer's preference, x is considered preferable to x', denoted by x > x'.

Preference Driven Ranking (continued)
Solving the linear programming problems can therefore be reduced to checking the objective function values at each of these extreme points. To compare two indifferent individuals x and y, calculate
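
The extreme-point idea can be sketched in a few lines, assuming a weighted-sum fitness and a feasible weight region that is a bounded polytope. A linear function over such a region attains its minimum at an extreme point, so the worst-case fitness gap between two designs can be found by evaluating only those points. The names `fitness`, `worst_case_gap` and `identity`, and all numbers, are assumptions for illustration. This is not the paper's ranking formula, which the transcript does not preserve.

```python
from typing import Callable, List, Sequence

ValueFn = Callable[[float], float]


def fitness(attrs: Sequence[float], weights: Sequence[float],
            value_fns: Sequence[ValueFn]) -> float:
    """Weighted sum of normalized attribute values."""
    return sum(w * v(a) for w, v, a in zip(weights, value_fns, attrs))


def worst_case_gap(attrs_x: Sequence[float], attrs_y: Sequence[float],
                   extreme_points: List[Sequence[float]],
                   value_fns: Sequence[ValueFn]) -> float:
    """Smallest value of fitness(x) - fitness(y) over the weight polytope.

    The gap is linear in the weights, so checking the extreme points suffices.
    """
    return min(
        fitness(attrs_x, w, value_fns) - fitness(attrs_y, w, value_fns)
        for w in extreme_points
    )


def identity(a: float) -> float:
    """Value function for attributes that are already scaled to [0, 1]."""
    return a


value_fns = [identity, identity]
# Extreme points of the weight set {w1 + w2 = 1, w1 >= w2 >= 0}.
extreme_points = [(0.5, 0.5), (1.0, 0.0)]
x = (0.8, 0.3)
y = (0.6, 0.4)

gap_xy = worst_case_gap(x, y, extreme_points, value_fns)
gap_yx = worst_case_gap(y, x, extreme_points, value_fns)
x_never_worse = gap_xy >= 0
```

A non-negative worst-case gap means x is at least as fit as y for every weight vector in the region, without solving a linear program.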

Example Results:

Summary of preference-driven HW/SW partitioning
We presented several techniques to improve the hardware/software partitioning process for large, complex embedded systems. We proposed the use of both hierarchical task specification and hardware modules. To facilitate the partitioning process, we extended the existing EA approach so that it can effectively handle hierarchical structures. We introduced the idea of employing the extreme points in multi-objective linear programming to eliminate the time-consuming procedure of solving multiple linear programming problem instances. The experimental results obtained so far have clearly demonstrated the advantages of our proposed approach.

Overview of the co-synthesis environment

Overview of the co-synthesis environment (continued)
- The initial system specification is a set of processes interacting through communication channels.
- This specification is further decomposed into units of smaller granularity.
- The partitioning algorithm generates as output a model consisting of two sets of interacting processes.
- The processes in one set are marked as candidates for hardware implementation, while the processes in the other set are marked as software implementation candidates.
- The main goal of partitioning is to maximize performance in terms of execution speed.

The partitioning steps

The partitioning steps (continued)
1. Extraction of blocks of statements, loops, and subprograms: identify the regions that are responsible for most of the execution time spent inside a process (regions with a large CL). Candidate regions are typically loops and subprograms, but can also be blocks of statements with a high CL.
- The designer guides the identification and extraction of the regions and decides implicitly on the granularity of further partitioning:
a. By identifying a certain region to be extracted (regardless of its CL) and assigning it to the hardware or software partition
b. By imposing boundary values:
2. Process graph generation:
3. Partitioning of the process graph:
4. Process merging: During the first step, one or several child processes may be extracted from a parent process. If, as a result of step 3, some of the child processes are assigned to the same partition as their parent process, they are, optionally, merged back together.

Objectives to be considered
1. To identify basic regions (processes, subprograms, loops, and blocks of statements) that are responsible for most of the execution time, in order to assign them to the hardware partition;
2. To minimize communication between the hardware and software domains;
3. To increase parallelism within the resulting system at the following three levels:
- internal parallelism of each hardware process (during high-level synthesis, operations are scheduled to be executed in parallel by the available functional units);
- parallelism between processes assigned to the hardware partition;
- parallelism between the hardware coprocessor and the microprocessor executing the software processes.

Statistics used
Two types of statistics are used by the partitioning algorithm:
1. Computation load (CL) of a basic region is a quantitative measure of the total computation executed by that region, considering all its activations during the simulation process. It is expressed as the total number of operations executed inside that region, where each operation is weighted with a coefficient depending on its relative complexity.
- The relative computation load (RCL) of a block of statements, loop, or subprogram is the computation load of the respective basic region divided by the computation load of the process the region belongs to. The relative computation load of a process is the computation load of that process divided by the total computation load of the system.
2. Communication intensity (CI) on a channel connecting two processes is expressed as the total number of send operations executed on the respective channel.
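
The three statistics can be illustrated with a small sketch, assuming that simulation yields per-region operation counts and a trace of send operations per channel. The operation weights, the counts, and the names `computation_load`, `relative_computation_load` and `communication_intensity` are hypothetical and chosen only for this example.

```python
from collections import Counter
from typing import Dict, Iterable

# Hypothetical complexity coefficients per operation type.
OP_WEIGHTS: Dict[str, float] = {"add": 1.0, "mul": 3.0, "div": 8.0}


def computation_load(op_counts: Dict[str, int],
                     op_weights: Dict[str, float] = OP_WEIGHTS) -> float:
    """CL: executed operations, each weighted by its relative complexity."""
    return sum(op_weights[op] * count for op, count in op_counts.items())


def relative_computation_load(region_cl: float, enclosing_cl: float) -> float:
    """RCL: CL of a region divided by the CL of the process (or system) containing it."""
    return region_cl / enclosing_cl if enclosing_cl else 0.0


def communication_intensity(send_trace: Iterable[str]) -> Counter:
    """CI: number of send operations observed on each channel."""
    return Counter(send_trace)


# Operation counts gathered over all activations (made-up numbers).
loop_cl = computation_load({"add": 100, "mul": 50})
process_cl = computation_load({"add": 200, "mul": 100})
loop_rcl = relative_computation_load(loop_cl, process_cl)

# One entry per send operation, labeled by channel name (made-up trace).
ci = communication_intensity(["c1", "c2", "c1"])
```

Regions with a high RCL are the natural candidates for extraction, and channels with a high CI are the ones a partitioner would prefer not to cut across the hardware/software boundary.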

Simulated annealing
Iterative improvement algorithms based on neighborhood search are widely used for hardware/software partitioning. To avoid being trapped in a local minimum, heuristics are implemented, which are very often based on simulated annealing. Simulated annealing selects the neighboring solution randomly and always accepts an improved solution. It also accepts worse solutions with a certain probability that depends on the deterioration of the cost function and on a control parameter called temperature. Simulated annealing algorithms can be quickly implemented and are widely applicable to many different problems. Limitations: long execution time and the large number of experiments needed to tune the algorithm.

Simulated annealing algorithm
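
A minimal sketch of simulated annealing for a two-way HW/SW partition follows. It is a generic textbook form, not the algorithm from the cited paper. The toy cost function, the per-node estimates, the geometric cooling factor `alpha`, and the "flip one node" simple move are all assumptions made for illustration.

```python
import math
import random
from typing import Callable, List, Tuple

Partition = List[bool]  # True = node in hardware, False = node in software

# Hypothetical per-node estimates, for illustration only.
SW_TIME = [8, 3, 6, 2]
HW_AREA = [4, 2, 5, 1]
AREA_BUDGET = 6


def toy_cost(partition: Partition) -> float:
    """Software execution time plus a penalty for exceeding the area budget."""
    sw_time = sum(t for t, in_hw in zip(SW_TIME, partition) if not in_hw)
    area = sum(a for a, in_hw in zip(HW_AREA, partition) if in_hw)
    return sw_time + 10 * max(0, area - AREA_BUDGET)


def simulated_annealing(
    initial: Partition,
    cost: Callable[[Partition], float],
    t_start: float = 10.0,
    t_end: float = 0.01,
    alpha: float = 0.95,
    moves_per_temp: int = 50,
    seed: int = 0,
) -> Tuple[Partition, float]:
    rng = random.Random(seed)
    current, current_cost = list(initial), cost(initial)
    best, best_cost = list(current), current_cost
    temperature = t_start
    while temperature > t_end:
        for _ in range(moves_per_temp):
            # Simple move: flip one randomly chosen node between HW and SW.
            candidate = list(current)
            i = rng.randrange(len(candidate))
            candidate[i] = not candidate[i]
            candidate_cost = cost(candidate)
            delta = candidate_cost - current_cost
            # Always accept improvements; accept deteriorations with
            # probability exp(-delta / temperature).
            if delta <= 0 or rng.random() < math.exp(-delta / temperature):
                current, current_cost = candidate, candidate_cost
                if current_cost < best_cost:
                    best, best_cost = list(current), current_cost
        temperature *= alpha  # geometric cooling (one common choice)
    return best, best_cost


all_software = [False] * len(SW_TIME)
sa_best, sa_best_cost = simulated_annealing(all_software, toy_cost)
```

The best solution seen so far is tracked separately from the current one, because accepted uphill moves can carry the search away from it.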

Cooling schedules

Generation of a new solution with improved move

Partitioning times with SA: simple moves (SM) and improved moves (IM)

Variation of cost function during simulated annealing with simple moves for 100 nodes

Variation of cost function during simulated annealing with improved moves for 100 nodes

Partitioning time with SA

Tabu search algorithm
Tabu search controls uphill moves not purely randomly but in an intelligent way. The tabu search approach accepts uphill moves and stimulates convergence toward a global optimum by creating and exploiting data structures that take advantage of the search history when selecting the next move. Two key elements of the TS algorithm are the data structures called short-term and long-term memory. Short-term memory stores information about the most recent history of the search. It is used to avoid the cycling that could occur if a certain move returns to a recently visited solution. Long-term memory, on the other hand, stores information on the global evolution of the algorithm. These are typically frequency measures relative to the occurrence of a certain event. They can be applied to perform diversification, which is meant to improve exploration of the solution space by broadening the spectrum of visited solutions.

Tabu search algorithm
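
A minimal sketch of tabu search with both kinds of memory follows. It is a generic form, not the algorithm from the cited paper. The toy cost function, the tabu tenure, the aspiration rule, and the frequency penalty used for diversification are assumptions made for illustration.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple

Partition = List[bool]  # True = node in hardware, False = node in software

# Hypothetical per-node estimates, for illustration only.
SW_TIME = [8, 3, 6, 2]
HW_AREA = [4, 2, 5, 1]
AREA_BUDGET = 6


def toy_cost(partition: Partition) -> int:
    """Software execution time plus a penalty for exceeding the area budget."""
    sw_time = sum(t for t, in_hw in zip(SW_TIME, partition) if not in_hw)
    area = sum(a for a, in_hw in zip(HW_AREA, partition) if in_hw)
    return sw_time + 10 * max(0, area - AREA_BUDGET)


def tabu_search(
    initial: Partition,
    cost: Callable[[Partition], float],
    iterations: int = 200,
    tenure: int = 3,
    freq_penalty: float = 0.0,
) -> Tuple[Partition, float]:
    current = list(initial)
    best, best_cost = list(current), cost(current)
    tabu: Deque[int] = deque(maxlen=tenure)  # short-term memory: recently moved nodes
    frequency = [0] * len(current)           # long-term memory: move counts per node
    for _ in range(iterations):
        chosen, chosen_score, chosen_cost = None, None, None
        for i in range(len(current)):
            neighbor = list(current)
            neighbor[i] = not neighbor[i]
            neighbor_cost = cost(neighbor)
            # Aspiration: a tabu move is allowed if it beats the best known cost.
            if i in tabu and not neighbor_cost < best_cost:
                continue
            # Diversification: optionally penalize frequently moved nodes.
            score = neighbor_cost + freq_penalty * frequency[i]
            if chosen is None or score < chosen_score:
                chosen, chosen_score, chosen_cost = i, score, neighbor_cost
        if chosen is None:
            break  # every move is tabu and none satisfies aspiration
        # Take the best admissible move, even if it is uphill.
        current[chosen] = not current[chosen]
        tabu.append(chosen)
        frequency[chosen] += 1
        if chosen_cost < best_cost:
            best, best_cost = list(current), chosen_cost
    return best, best_cost


ts_best, ts_best_cost = tabu_search([False] * len(SW_TIME), toy_cost)
```

Unlike simulated annealing, this search is deterministic: each step takes the best non-tabu neighbor, and the tabu list prevents the immediate reversal of recent moves.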

Parameters and CPU time with TS

Partitioning times with SA and TS

Future possibilities
- Many systems exhibit a high degree of regularity (regularity means that many of the behaviors in the system are identical, differing only in the data on which they operate). Future algorithms should include techniques to partition regular and semi-regular behaviors.
- Feed metrics back to the system and partition again
- Since partitioning is a quite mature field, the majority of future tasks will involve adapting existing techniques for applicability at the functional level
- Develop an algorithm that partitions at multiple levels of granularity
- Combine functional partitioning with high-level synthesis

References
- Petru Eles, Zebo Peng, Krzysztof Kuchcinski, Alexa Doboli, "System Level Hardware/Software Partitioning Based on Simulated Annealing and Tabu Search"
- Gang Quan, Xiaobo (Sharon) Hu, Garrison Greenwood, "Preference-Driven Hierarchical Hardware/Software Partitioning"
- Jörg Henkel, Yanbing Li, "Energy-Conscious HW/SW-Partitioning of Embedded Systems: A Case Study on an MPEG-2 Encoder"
- D. D. Gajski, F. Vahid, S. Narayan, J. Gong, "Specification and Design of Embedded Systems"