Distributed Dynamic BDD Reordering

Distributed Dynamic BDD Reordering. Ziv Nevo, IBM Haifa Research Lab, Haifa, Israel; Monica Farkash, IBM Systems Group, Austin, Texas

Outline: BDDs; BDD Reordering; Rudell’s Sifting Algorithm; Empirical Observations; Distributed Dynamic BDD Reordering; Experimental Results; Conclusions

BDDs (Bryant, 1986): A data structure for efficiently storing and manipulating Boolean functions. Widely used in EDA tools: model checkers, synthesizers, optimizers, and more. BDDs have been partly replaced by SAT-based techniques, yet they are still dominant in some applications.

BDDs, example: (a & b & c) | (c & d). A (reduced ordered) BDD is a layered directed acyclic graph. Each BDD node is labeled with a Boolean variable. Variables are encountered at most once, and in the same order, on every path from the root to a leaf. BDD nodes with the same label are therefore layered in levels. [Figure: BDD of the example function; node labels a, b, c, c, d, and terminal 1.]

BDD Reordering: BDD size (number of nodes) is highly sensitive to the selected variable order; sizes may vary from linear to exponential. BDD-based applications should therefore choose their initial BDD order carefully. In addition, BDD-based applications usually change their set of represented functions on the fly. It is therefore wise to switch orders from time to time: dynamic BDD reordering.
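
The sensitivity to variable order can be checked on a small case. The sketch below is not from the slides: it counts reduced ordered BDD nodes by brute-force Shannon expansion, which is only practical for a handful of variables. The names `bdd_size`, `pairs`, and `slide_fn` are illustrative, and `pairs` is the textbook function (a1 & b1) | (a2 & b2) | (a3 & b3), chosen here as an assumption because its size depends strongly on the order.

```python
def bdd_size(f, order):
    """Count the internal nodes of the reduced ordered BDD of f under `order`.

    f maps a dict {variable name: bool} to a bool; `order` lists the variables
    from the root level down. Brute-force Shannon expansion: fine for a few
    variables, exponential in general.
    """
    unique = {}  # (var, low child, high child) -> node id; shares equal nodes

    def build(level, env):
        if level == len(order):
            return 1 if f(env) else 0            # terminal nodes 0 and 1
        var = order[level]
        low = build(level + 1, {**env, var: False})
        high = build(level + 1, {**env, var: True})
        if low == high:                           # redundant test: skip the node
            return low
        return unique.setdefault((var, low, high), len(unique) + 2)

    build(0, {})
    return len(unique)


def pairs(env):
    """(a1 & b1) | (a2 & b2) | (a3 & b3): an illustrative order-sensitive function."""
    return any(env[f"a{i}"] and env[f"b{i}"] for i in (1, 2, 3))


def slide_fn(env):
    """The example function from the BDD slide: (a & b & c) | (c & d)."""
    return (env["a"] and env["b"] and env["c"]) or (env["c"] and env["d"])


interleaved = ["a1", "b1", "a2", "b2", "a3", "b3"]
separated = ["a1", "a2", "a3", "b1", "b2", "b3"]

print(bdd_size(pairs, interleaved))               # 6
print(bdd_size(pairs, separated))                 # 14
print(bdd_size(slide_fn, ["a", "b", "c", "d"]))   # 5
```

With n pairs, the interleaved order needs 2n nodes while the separated order needs 2^(n+1) - 2, which is the linear-versus-exponential gap the slide refers to.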

BDD Reordering: But finding the best BDD order is NP-hard, so heuristics are required.

BDD Reordering: Many heuristics exist for producing a good initial BDD order. One popular algorithm for dynamic BDD reordering is Rudell’s sifting algorithm (Rudell, 1993), which utilizes many (efficient) exchanges of adjacent variables.

Rudell’s Sifting Algorithm: While there are unselected variables: (1) select an unselected variable with the maximal number of nodes; (2) exchange the selected variable with its predecessor until it becomes the first variable in the ordering; (3) exchange the selected variable with its successor until it becomes the last variable in the ordering; (4) exchange the selected variable with its predecessor until it comes back to the position where BDD size was minimal.
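
The loop above can be sketched in a few lines. This is a simplified illustration, not Rudell's implementation: instead of performing local exchanges of adjacent variables, it rebuilds a tiny BDD from scratch for every candidate position, which is only feasible for toy functions. The names `level_counts`, `sift`, and `pairs` are assumptions made for the example.

```python
from collections import Counter


def level_counts(f, order):
    """Nodes per variable in the reduced ordered BDD of f (brute force)."""
    unique = {}  # (var, low child, high child) -> node id

    def build(level, env):
        if level == len(order):
            return 1 if f(env) else 0
        var = order[level]
        low = build(level + 1, {**env, var: False})
        high = build(level + 1, {**env, var: True})
        if low == high:
            return low
        return unique.setdefault((var, low, high), len(unique) + 2)

    build(0, {})
    return Counter(var for var, _, _ in unique)


def sift(f, order):
    """One sifting pass: each variable is tried at every position exactly once."""
    order = list(order)
    size = lambda o: sum(level_counts(f, o).values())
    pending = list(order)                      # the "unselected" variables
    while pending:
        counts = level_counts(f, order)
        var = max(pending, key=lambda v: counts[v])   # most nodes first
        pending.remove(var)
        rest = [v for v in order if v != var]
        # Stand-in for "search up" and "search down": try every position.
        candidates = [rest[:i] + [var] + rest[i:] for i in range(len(order))]
        order = min(candidates, key=size)      # keep the best place found
    return order


def pairs(env):
    """(a1 & b1) | (a2 & b2) | (a3 & b3), an illustrative test function."""
    return any(env[f"a{i}"] and env[f"b{i}"] for i in (1, 2, 3))


separated = ["a1", "a2", "a3", "b1", "b2", "b3"]
result = sift(pairs, separated)
print(result, sum(level_counts(pairs, result).values()))
```

Because the current position is always among the candidates, a pass can never increase the BDD size. A real implementation gets the same effect far more cheaply, since each adjacent exchange only touches two levels.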

Rudell’s Sifting Algorithm, in short: While there are unselected variables: select a variable; search up; search down; bring it to an optimal place.

Optimizations to Rudell’s Algorithm: Search towards the closest end first. Set an upper limit on BDD size during search. Group sifting (Panda and Somenzi, 1995). Block-restricted sifting (Meinel and Slobodova, 1997). Sampling methods (Slobodova and Meinel, 1998). Lower bounds (Drechsler and Gunther, 2001; Ebendt and Drechsler, 2005).

Rudell’s Sifting Algorithm: Yet some applications spend nearly half their runtime doing dynamic BDD reordering.

Empirical Observations: We analyzed a selection of reordering sessions (BDDs with 10^6 – 10^7 nodes) using Rudell’s algorithm. Only about a third of the variables relocate. 1%-6% of the variables account for 70% of the total gain in BDD nodes.

Empirical Observations - Conclusions: Rudell’s sifting algorithm performs many best-place searches that have no (or negligible) effect. It is hard to predict whether a specific best-place search will bring significant results. We can, however, perform these searches in parallel. Futile searches do not induce a change in the common data structures. Since many searches are futile, synchronization efforts are minimal.

Distributed Dynamic BDD Reordering: A master-slaves design. The master broadcasts the BDD to all slaves, then distributes search tasks to them. Each search task involves finding a better place for a single variable. Slaves report their search results back to the master: best place and gain in BDD size. The master moves variables according to the search results.
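
A minimal sketch of this division of labor follows, under loud assumptions: worker threads stand in for slave machines, a tuple snapshot of the order stands in for the broadcast BDD, and `toy_size` is a made-up cost function replacing real BDD size. The function names (`move`, `search_task`, `master`) are hypothetical, and the master here applies only the single best relocation per round.

```python
from concurrent.futures import ThreadPoolExecutor


def move(order, var, pos):
    """Return a new order with `var` relocated to index `pos`."""
    rest = [v for v in order if v != var]
    return rest[:pos] + [var] + rest[pos:]


def search_task(order, var, size_of):
    """Slave side: find the best place for one variable.

    Works on its own copy and only reports (var, best place, gain);
    it never changes the master's order.
    """
    order = list(order)
    base = size_of(order)
    best_pos = min(range(len(order)), key=lambda p: size_of(move(order, var, p)))
    return var, best_pos, base - size_of(move(order, var, best_pos))


def master(order, size_of, n_slaves=4, min_gain=1):
    """Master side: hand out one search per variable, apply the best result."""
    order = list(order)
    with ThreadPoolExecutor(max_workers=n_slaves) as slaves:
        while True:
            snapshot = tuple(order)            # stand-in for broadcasting the BDD
            results = list(slaves.map(
                lambda v: search_task(snapshot, v, size_of), snapshot))
            var, pos, gain = max(results, key=lambda r: r[2])
            if gain < min_gain:                # no significant gain left: stop
                return order
            order = move(order, var, pos)      # only the master moves variables


def toy_size(order):
    """Hypothetical stand-in for BDD size: distance between paired variables."""
    pos = {v: i for i, v in enumerate(order)}
    return sum(abs(pos[f"a{i}"] - pos[f"b{i}"]) for i in (1, 2, 3))


start = ["a1", "a2", "a3", "b1", "b2", "b3"]
final = master(start, toy_size)
print(toy_size(start), toy_size(final), final)
```

The loop terminates because every applied relocation lowers an integer-valued size by at least `min_gain`.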

Distributed Dynamic BDD Reordering: [Figure sequence: Master, Slave1, Slave2; and so on…]

Distributed Reordering - Properties: Slaves never put variables back in place. Slaves can be allocated per reordering session. Slaves can easily be added or dropped on the fly. Most optimizations to Rudell’s algorithm may be applied. The master may decide to consider only significant gains. The master may split the “search up” task from the “search down” task.

Synchronization Modes: Synchronous mode – the master waits for answers from all slaves, applies the search results, then provides slaves with new tasks. Continuous mode – the master provides jobs to slaves as soon as they become idle; only when a slave reports a significant gain does the master hold job allocation and wait for answers from all slaves.

Merging Slave Results: Two non-intersecting relocations provide independent gains. The master may therefore apply more than one relocation in each iteration. The master may be greedy and apply the best-gain relocation first.
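
One way to read the greedy merge is sketched below. It assumes, as an interpretation rather than something stated on the slide, that a relocation from level `src` to level `dst` touches exactly the levels between them, so two relocations intersect when those level ranges overlap. The function name `merge_results` and the tuple layout are hypothetical.

```python
def merge_results(results, min_gain=1):
    """Pick a set of mutually non-intersecting relocations, best gain first.

    results: list of (var, src, dst, gain) tuples reported by the slaves,
             where src and dst are levels in the current order.
    Returns the relocations the master may apply in this iteration.
    """
    chosen = []
    spans = []  # level ranges already claimed by a chosen relocation
    for var, src, dst, gain in sorted(results, key=lambda r: r[3], reverse=True):
        if gain < min_gain:
            break                              # remaining gains are insignificant
        lo, hi = min(src, dst), max(src, dst)
        if all(hi < s_lo or lo > s_hi for s_lo, s_hi in spans):
            chosen.append((var, src, dst, gain))
            spans.append((lo, hi))
    return chosen


reports = [
    ("x", 0, 3, 50),   # best gain: taken
    ("y", 2, 5, 40),   # overlaps levels 0..3: skipped
    ("z", 4, 6, 30),   # disjoint from 0..3: taken
    ("w", 7, 7, 0),    # futile search: ignored
]
print(merge_results(reports))
# [('x', 0, 3, 50), ('z', 4, 6, 30)]
```

Under this reading, a move inside one level range leaves the positions of all variables outside that range unchanged, so the reported levels of the other chosen relocations remain valid.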

Experimental Results - Setting: We implemented distributed reordering within the BDD-based model-checking engines of the formal-verification tool RuleBase PE. We used five 2.4 GHz Intel Xeon machines with 2 GB RAM each, connected by 1 Gbit Ethernet. We applied the upper-bound, closest-end, and lower-bounds optimizations. A topological sort determined the initial BDD order.

Experimental Results – Reorder Time (using 4 slaves): Continuous merging time is 23.4% on average.

Experimental Results – Total MC Time (using 4 slaves): Continuous merging time is 47.2% on average.

Experimental Results - Scalability: Average speedup grows by ~0.2 per additional slave.

Conclusions: Rudell’s sifting algorithm tends to leave many BDD variables in place. Many computations have no (or negligible) effect, and can thus be easily parallelized. Our distributed version of Rudell’s algorithm speeds up reordering by a factor of 4 using 4 slaves. There is no significant deterioration in the resulting order quality. There are many (possibly better) ways to merge slave results.