Simulated Evolution Algorithm for Multi-Objective VLSI Netlist Bi-Partitioning. Sadiq M. Sait, Aiman El-Maleh, Raslan Al Abaji, King Fahd University.

Presentation transcript:

1 Simulated Evolution Algorithm for Multi-Objective VLSI Netlist Bi-Partitioning. Sadiq M. Sait, Aiman El-Maleh, Raslan Al Abaji. King Fahd University of Petroleum & Minerals, Dhahran, Saudi Arabia. 27 May 2003, IEEE ISCAS, Bangkok, Thailand.

2 Outline: Introduction; Problem Formulation; Cost Functions; Proposed Approaches; Experimental Results; Conclusion.

3 VLSI Technology Trends
Design characteristics and key CAD capabilities:
0.06M, 2 MHz, 6 um: SPICE simulation
0.13M, 12 MHz, 1.5 um: CAE systems, silicon compilation
1.2M, 50 MHz, 0.8 um: HDLs, synthesis
3.3M, 200 MHz, 0.6 um: top-down design, emulation
7.5M, 333 MHz, 0.25 um: cycle-based simulation, formal verification
The challenge of sustaining such fast growth toward giga-scale integration has shifted, to a large degree, from manufacturing process technology to design technology.

4 The VLSI Chip in 2006. Technology: 0.1 um. Transistors: 200 M. Logic gates: 40 M. Size: 520 mm^2. Chip I/Os: 4,000. Power: 160 W. Supply current: ~160 A. Clock (GHz), wiring levels and voltage are listed without values. Tradeoffs: performance, power consumption, noise immunity, area, cost, time-to-market.

5 VLSI Design Cycle. The VLSI design process is carried out at a number of levels: 1. System Specification; 2. Functional Design; 3. Logic Design; 4. Circuit Design; 5. Physical Design; 6. Design Verification; 7. Fabrication; 8. Packaging, Testing and Debugging.

6 Physical Design. Physical design converts a circuit description (behavioral/structural) into a geometric description. This description is used to manufacture a chip. The physical design cycle consists of: 1. Partitioning; 2. Floorplanning and Placement; 3. Routing; 4. Compaction.

7 Why Do We Need Partitioning? Partitioning decomposes a complex system into smaller subsystems. Each subsystem can be designed independently, which speeds up the design process (divide-and-conquer approach). A complex IC is divided into a number of functional blocks, each designed by one engineer or a team of engineers. The partitioning scheme has to minimize the interconnections between subsystems.

8 Levels of Partitioning. System-level partitioning: system into PCBs. Board-level partitioning: PCBs into chips. Chip-level partitioning: chips into subcircuits/blocks.

9 Classification of Partitioning Algorithms
Group migration: 1. Kernighan-Lin; 2. Fiduccia-Mattheyses (FM); 3. Multilevel K-way Partitioning
Iterative heuristics: 1. Simulated annealing; 2. Simulated evolution; 3. Tabu search; 4. Genetic
Performance driven: 1. Lawler et al.; 2. Vaishnav; 3. Choi et al.; 4. Jun'ichiro et al.
Others: 1. Spectral; 2. Multilevel Spectral

10 Related Previous Work
1969: A bottom-up approach for delay optimization (clustering) was proposed by Lawler et al.
1998: A circuit partitioning algorithm under a path delay constraint was proposed by Jun'ichiro et al. The algorithm consists of clustering and iterative improvement phases.
1999: Two low-power-oriented techniques based on the simulated annealing (SA) algorithm were proposed by Choi et al.
1999: An enumerative partitioning algorithm targeting low power was proposed by Vaishnav et al. It enumerates alternate partitionings and selects one that has the same delay but less power dissipation (not feasible for huge circuits).

11 Motivation. Need for power optimization: portable devices; power consumption is a hindrance to further integration; increasing clock frequency. Need for delay optimization: in current sub-micron designs, wire delay tends to dominate gate delay; larger die sizes imply long on-chip global routes, which affect performance; optimizing delay due to off-chip capacitance.

12 Objective. Design a class of iterative algorithms for VLSI multi-objective partitioning. Explore partitioning from a wider angle and consider circuit delay, power dissipation and interconnect at the same time, under a given balance constraint.

13 Problem formulation. Objectives: power cost is optimized; delay cost is optimized; cutset cost is optimized. Constraint: partitions balanced to within a certain tolerance (10%).

14 Problem formulation. The circuit is modeled as a hypergraph $H(V, E)$, where $V = \{v_1, v_2, v_3, \ldots, v_n\}$ is a set of modules (cells) and $E = \{e_1, e_2, e_3, \ldots, e_k\}$ is a set of hyperedges, namely the set of signal nets. Each net is a subset of $V$ containing the modules that the net connects. A two-way partitioning of a set of nodes $V$ determines two subsets $V_A$ and $V_B$ such that $V_A \cup V_B = V$ and $V_A \cap V_B = \emptyset$.
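The two-way partitioning condition and the balance constraint can be checked in a few lines. This is a minimal sketch, not the authors' code: the function names are hypothetical, and reading the 10% tolerance as "each block holds between 40% and 60% of the cells" is an assumption, since the slides do not define how the tolerance is measured.

```python
def is_valid_bipartition(cells, block_a, block_b):
    """True if the two blocks cover all cells and share none."""
    return (block_a | block_b) == cells and not (block_a & block_b)


def is_balanced(block_a, block_b, tolerance=0.10):
    """Assumed balance rule: each block size stays within
    (0.5 +/- tolerance) of the total number of cells."""
    total = len(block_a) + len(block_b)
    return abs(len(block_a) - len(block_b)) <= 2 * tolerance * total


cells = {"v1", "v2", "v3", "v4", "v5", "v6"}
even_a, even_b = {"v1", "v2", "v3"}, {"v4", "v5", "v6"}
skew_a, skew_b = {"v1", "v2", "v3", "v4", "v5"}, {"v6"}
```

Here the 3/3 split passes both checks, while the 5/1 split is a valid bipartition that violates the assumed balance rule.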

15 Cutset. Based on the hypergraph model $H = (V, E)$. Cost 1: $c(e) = 1$ if $e$ spans more than one block. Cutset = sum of hyperedge costs. Efficient gain computation and update. Example shown on the slide: cutset = 3.
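The cutset cost follows directly from the definition above: count the nets whose cells lie in more than one block. A minimal sketch, in which the dictionary layout and the names `nets`, `block_of` and `cutset_size` are illustrative choices rather than anything given on the slides:

```python
def cutset_size(nets, block_of):
    """Number of nets (hyperedges) that span more than one block.

    nets: maps a net name to the set of cells it connects.
    block_of: maps a cell name to its block (0 or 1).
    """
    return sum(
        1 for cells in nets.values()
        if len({block_of[c] for c in cells}) > 1
    )


nets = {
    "n1": {"a", "b"},
    "n2": {"b", "c", "d"},
    "n3": {"c", "d"},
}
block_of = {"a": 0, "b": 0, "c": 1, "d": 1}
cut = cutset_size(nets, block_of)  # only n2 crosses the cut
```

This recomputes the cost from scratch on every call; the "efficient gain computation and update" mentioned on the slide would instead update the count incrementally when a cell moves.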

16 Delay Model. Path $\pi$: $SE_1 \to C_1 \to C_4 \to C_5 \to SE_2$. $Delay_\pi = CD_{SE1} + CD_{C1} + CD_{C4} + CD_{C5} + CD_{SE2}$, with $CD_{C1} = BD_{C1} + LF_{C1} \cdot (C_{offchip} + CINP_{C2} + CINP_{C3} + CINP_{C4})$.
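The delay model above can be sketched as two small functions. The slides do not expand the abbreviations, so reading CD as cell delay, BD as base delay, LF as load factor and CINP as the input capacitance of a driven cell is an assumption, as are the function names and the numeric values.

```python
def cell_delay(base_delay, load_factor, fanout_input_caps, offchip_cap=0.0):
    """CD = BD + LF * (C_offchip + sum of CINP of the driven cells).

    offchip_cap is only nonzero when the cell drives a cut net.
    """
    return base_delay + load_factor * (offchip_cap + sum(fanout_input_caps))


def path_delay(cell_delays):
    """Delay of a path is the sum of the delays of the cells on it."""
    return sum(cell_delays)


# Cell C1 drives C2, C3 and C4; its net is assumed to be cut.
cd_c1 = cell_delay(1.0, 0.5, [0.2, 0.3, 0.5], offchip_cap=1.0)
# Hypothetical delays for SE1, C4, C5 and SE2 on the same path.
total = path_delay([0.5, cd_c1, 1.0, 1.5, 0.5])
```

The example shows why cutting a net on a critical path is costly: the off-chip capacitance enters the load of the driving cell and lengthens every path through it.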

17 Delay. $Delay(P_i)$ is defined for any path $P_i$ between two cells or nodes (the equation is not reproduced in this transcript); $P$ is the set of all paths of the circuit.

18 Power. The average dynamic power consumed by a CMOS logic gate in a synchronous circuit is given by an expression (not reproduced in this transcript) in which $N_i$ is the number of output gate transitions per cycle (switching probability) and the remaining term is the load capacitance.

19 Power. The load of a cell has two components: the load capacitance driven by the cell before partitioning, and the additional load due to off-chip capacitance (cut net). Total power dissipation of a circuit: equation not reproduced in this transcript.

20 Power. The additional off-chip load can be assumed identical for all nets. The set of visible gates consists of the gates driving a load outside the partition.
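The power equations on these slides did not survive in the transcript. A commonly used textbook expression for the average dynamic power of a CMOS gate is $0.5 \cdot (V_{dd}^2 / T_{cycle}) \cdot C^{Load} \cdot N_i$; the sketch below assumes that form, together with the slide's statement that visible gates carry an extra off-chip load. All function and parameter names are hypothetical, and the sketch should not be read as the authors' exact cost function.

```python
def average_dynamic_power(switching_prob, load_cap, vdd, t_cycle):
    """Assumed textbook form: 0.5 * Vdd^2 / T_cycle * C_load * N_i."""
    return 0.5 * (vdd ** 2 / t_cycle) * load_cap * switching_prob


def total_power(cells, visible, vdd, t_cycle, extra_cap):
    """Sum gate power over all cells.

    cells: maps a cell name to (switching probability, basic load cap).
    visible: cells driving a load outside their partition; each gets
             the same extra off-chip capacitance added to its load.
    """
    total = 0.0
    for name, (n_i, c_basic) in cells.items():
        load = c_basic + (extra_cap if name in visible else 0.0)
        total += average_dynamic_power(n_i, load, vdd, t_cycle)
    return total


cells = {"g1": (0.5, 2e-12), "g2": (0.25, 4e-12)}
p_uncut = total_power(cells, set(), 1.0, 1e-9, 4e-12)
p_cut = total_power(cells, {"g2"}, 1.0, 1e-9, 4e-12)
```

Because the supply voltage, cycle time and basic loads are fixed by the circuit, only the contribution of the visible gates changes between partitions under these assumptions, which is why frequently switching nets are the ones worth keeping uncut.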

21 Unifying Objectives: How? Weighted sum approach: 1. Problems in choosing weights. 2. Need to tune for every circuit.

22 Fuzzy logic for cost function. Imprecise values of the objectives are best represented by linguistic terms, which are the basis of fuzzy algebra. Conflicting objectives. Operators for the aggregating function.

23 Fuzzy logic for multi-objective function. 1. The cost-to-membership mapping. 2. A linguistic fuzzy rule for combining the membership values in an aggregating function. 3. Translation of the linguistic rule into appropriate fuzzy operators. And-like operators: min operator $\mu = \min(\mu_1, \mu_2)$; And-like OWA $\mu = \beta \cdot \min(\mu_1, \mu_2) + \frac{1}{2}(1 - \beta)(\mu_1 + \mu_2)$. Or-like operators: max operator $\mu = \max(\mu_1, \mu_2)$; Or-like OWA $\mu = \beta \cdot \max(\mu_1, \mu_2) + \frac{1}{2}(1 - \beta)(\mu_1 + \mu_2)$, where $\beta$ is a constant in the range [0, 1].
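The two OWA operators on this slide translate directly into code. A minimal sketch with hypothetical function names; `beta` is the constant in [0, 1] from the slide.

```python
def and_like_owa(mu1, mu2, beta):
    """beta * min(mu1, mu2) + 0.5 * (1 - beta) * (mu1 + mu2)."""
    return beta * min(mu1, mu2) + 0.5 * (1 - beta) * (mu1 + mu2)


def or_like_owa(mu1, mu2, beta):
    """beta * max(mu1, mu2) + 0.5 * (1 - beta) * (mu1 + mu2)."""
    return beta * max(mu1, mu2) + 0.5 * (1 - beta) * (mu1 + mu2)


strict_and = and_like_owa(0.2, 0.8, beta=1.0)  # pure min
averaged = and_like_owa(0.2, 0.8, beta=0.0)    # pure average
strict_or = or_like_owa(0.2, 0.8, beta=1.0)    # pure max
```

With `beta = 1` the operators reduce to plain min and max; with `beta = 0` both reduce to the arithmetic mean, so `beta` controls how strictly the worst (or best) membership dominates.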

24 Membership functions. $O_i$ and $C_i$ are the lower bound and actual cost of objective $i$; $\mu_i(x)$ is the membership of solution $x$ in the set "good $i$"; $g_i$ is the relative acceptance limit for each objective.
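The membership equation itself is not reproduced in the transcript. One shape consistent with the symbols defined here is a piecewise-linear function: full membership when the cost reaches the lower bound, zero membership once the cost exceeds $g_i$ times the lower bound, and a linear ramp in between. The sketch below assumes that shape; it is an illustration, and the function name is hypothetical.

```python
def membership(cost, lower_bound, g):
    """Assumed piecewise-linear membership in the fuzzy set "good i".

    cost: actual cost C_i of the solution for objective i.
    lower_bound: O_i, assumed > 0.
    g: relative acceptance limit g_i, assumed > 1.
    """
    ratio = cost / lower_bound
    if ratio <= 1.0:
        return 1.0          # at or below the lower bound
    if ratio >= g:
        return 0.0          # beyond the acceptance limit
    return (g - ratio) / (g - 1.0)


at_bound = membership(100.0, 100.0, 2.0)
halfway = membership(150.0, 100.0, 2.0)
rejected = membership(250.0, 100.0, 2.0)
```

Under this assumption a solution whose cost is 1.5 times the lower bound, with an acceptance limit of 2, receives a membership of 0.5.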

25 Fuzzy linguistic rule. A good partitioning can be described by the following fuzzy rule: IF a solution has small cutset AND low power AND short delay AND good balance THEN it is a good solution.

26 Fuzzy cost function. The above rule is translated to an AND-like OWA operator. The result represents the total fuzzy fitness of the solution, and our aim is to maximize this fitness. It is computed from the cutset, power, delay and balance fitness values, respectively.
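The aggregated fitness equation is missing from the transcript. A natural reading is the two-input And-like OWA extended to four memberships, with the pairwise average replaced by the mean of all four; the sketch below assumes exactly that generalization. The function name, dictionary keys and sample values are hypothetical.

```python
def fuzzy_fitness(memberships, beta):
    """Assumed four-input And-like OWA:
    beta * min(mu_j) + (1 - beta) * mean(mu_j)."""
    values = list(memberships.values())
    return beta * min(values) + (1 - beta) * sum(values) / len(values)


solution = {"cutset": 0.4, "power": 0.6, "delay": 0.8, "balance": 1.0}
fitness = fuzzy_fitness(solution, beta=0.5)
```

The minimum term penalizes a solution that is poor in any single objective, while the mean term still rewards improvement in the others; a search procedure would try to maximize this value.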

27 Simulated Evolution Algorithm
Simulated evolution
Begin
  Start with an initial feasible partition S
  Repeat
    Evaluation: evaluate the goodness G_i of all modules
    Selection: for each cell V_i do
      if Random Rm > G_i then select the cell
    End for
    Allocation: for each selected cell V_i do
      move the cell to the destination block
    End for
  Until stopping criteria are satisfied
  Return best solution
End
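The evaluation, selection and allocation loop above can be sketched in Python. This is a simplified illustration under stated assumptions, not the authors' implementation: goodness is taken as the fraction of a cell's nets that are uncut (cutset only, no power or delay), allocation simply flips a selected cell to the other block when balance allows, the stopping criterion is a fixed iteration count, and all names are hypothetical.

```python
import random


def cutset_size(nets, block_of):
    """Number of nets whose cells lie in more than one block."""
    return sum(1 for cells in nets.values()
               if len({block_of[c] for c in cells}) > 1)


def is_balanced(block_of, tolerance):
    """Assumed rule: block sizes within (0.5 +/- tolerance) of all cells."""
    n = len(block_of)
    size_b = sum(block_of.values())
    return abs(2 * size_b - n) <= 2 * tolerance * n


def cut_goodness(cell, nets_of, nets, block_of):
    """Assumed goodness: fraction of the cell's nets that are uncut."""
    incident = nets_of[cell]
    if not incident:
        return 1.0
    uncut = sum(1 for n in incident
                if len({block_of[c] for c in nets[n]}) == 1)
    return uncut / len(incident)


def simulated_evolution(nets, start, iterations=50, tolerance=0.10, seed=0):
    rng = random.Random(seed)
    current = dict(start)
    nets_of = {c: [] for c in current}
    for name, cells in nets.items():
        for c in cells:
            nets_of[c].append(name)
    best, best_cut = dict(current), cutset_size(nets, current)
    for _ in range(iterations):
        # Evaluation: goodness of every cell in the current partition.
        goodness = {c: cut_goodness(c, nets_of, nets, current)
                    for c in current}
        # Selection: cells with low goodness are more likely to be picked.
        selected = [c for c in current if rng.random() > goodness[c]]
        # Allocation: move each selected cell, undoing unbalanced moves.
        for c in selected:
            current[c] ^= 1
            if not is_balanced(current, tolerance):
                current[c] ^= 1
        cut = cutset_size(nets, current)
        if cut < best_cut:
            best, best_cut = dict(current), cut
    return best, best_cut


nets = {"n1": {"a", "b"}, "n2": {"b", "c"}, "n3": {"c", "d"},
        "n4": {"e", "f"}, "n5": {"f", "g"}, "n6": {"g", "h"},
        "n7": {"d", "e"}}
start = {"a": 0, "e": 0, "c": 0, "g": 0, "b": 1, "f": 1, "d": 1, "h": 1}
best, best_cut = simulated_evolution(nets, start, tolerance=0.25)
```

Unlike a greedy scheme, the selection step is probabilistic, so a well-placed cell can still occasionally move; tracking the best partition seen separately from the current one keeps such moves from losing a good solution.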

28 Cut goodness. $d_i$: set of all nets connected to cell $i$ and not cut. $w_i$: set of all nets connected to cell $i$ and cut.

29 Power goodness. $V_i$ is the set of all nets connected to cell $i$, and $U_i$ is the set of all nets connected to cell $i$ and cut.

30 Delay goodness. $K_i$ is the set of cells in all paths passing through cell $i$. $L_i$ is the set of cells in all paths passing through cell $i$ that are not in the same block as $i$.

31 Final selection fuzzy rule. IF cell $i$ is near its optimal cut-set goodness as compared to other cells AND near its optimal power goodness as compared to other cells AND (near its optimal net delay goodness as compared to other cells OR $T_{max}(i)$ is much smaller than $T_{max}$) THEN it has a high goodness.

32 Fuzzy goodness. $T_{max}$: delay of the most critical path in the current iteration. $T_{max}(i)$: delay of the longest path traversing cell $i$. $X_{path} = T_{max} / T_{max}(i)$. The fuzzy goodness of a cell combines its cutset, power and delay goodness values, respectively.

33 Experimental Results: ISCAS benchmark circuits.

34 SimE versus Tabu Search and GA against time, circuit S13207.

35 Experimental Results: SimE versus TS versus GA. SimE results were better than those of TS and GA, with faster execution time.

36 Conclusion. The present work addressed the important issue of reducing power consumption and delay in VLSI circuits. It formulated the problem of multi-objective VLSI partitioning and provided solutions to it. The TS partitioning algorithm outperformed GA in terms of solution quality and execution time.

37 Thank you