Dynamic State-Space Partitioning in External-Memory Graph Search
Rong Zhou † and Eric A. Hansen ‡
† Palo Alto Research Center
‡ Mississippi State University

External-memory graph search
- Internal memory (1 ~ 4 GB) vs. external memory (160 GB ~ 1.5 TB)
- External memory is cheap and almost inexhaustible
- But random access to external memory (e.g., for duplicate detection) is 10^5 ~ 10^6 times slower than access to internal memory

Previous work
- Hash-based delayed duplicate detection [Korf & Schultze AAAI-05; Korf JACM-08]
- Structured duplicate detection [Zhou & Hansen AAAI-04, 06]
- Both use state-space abstraction to …
  - partition nodes into buckets or disk files
  - leverage local graph structure to save RAM or disk space

Structured duplicate detection [Zhou & Hansen AAAI-04]
- Localizes memory references in duplicate detection by exploiting graph structure revealed by a state-space projection function
- Example of a projection function: blank pos. = …
[Figure: puzzle states in which every tile other than the blank is shown as "?"]
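
A minimal sketch of such a projection function, assuming a 15 Puzzle state is stored as a 16-entry tuple in row-major order with 0 marking the blank. The representation and the name `blank_position` are illustrative assumptions, not taken from the slides.

```python
from typing import Tuple

# Assumed representation: 16 tiles in row-major order, 0 marks the blank.
State = Tuple[int, ...]


def blank_position(state: State) -> int:
    """Project a concrete state onto an abstract state: the index of the blank."""
    return state.index(0)


# Two different concrete states that share the same abstract state.
s1 = tuple(range(16))                      # blank in cell 0
s2 = (0, 2, 1) + tuple(range(3, 16))       # blank in cell 0, tiles 1 and 2 swapped
s3 = (1, 0) + tuple(range(2, 16))          # blank in cell 1

print(blank_position(s1), blank_position(s2), blank_position(s3))
```

The projection discards everything except the blank's cell, so `s1` and `s2` land in the same block while `s3` lands in a different one.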

Abstract state-space graph
- Created by the state-space projection function
- Example: 16 abstract states vs. > 10 trillion states
[Figure: abstract state-space graph with nodes B0 through B15]
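
As a rough illustration of the numbers on this slide, the sketch below builds an abstract graph for the blank-position projection on a 4x4 board and compares its size with the number of reachable 15 Puzzle states (16!/2). The function name `abstract_graph` and the dictionary layout are assumptions made for this example.

```python
import math


def abstract_graph(width: int = 4, height: int = 4) -> dict:
    """Map each blank position to the blank positions reachable in one move."""
    graph = {}
    for pos in range(width * height):
        row, col = divmod(pos, width)
        succ = []
        if row > 0:
            succ.append(pos - width)   # blank moves up
        if row < height - 1:
            succ.append(pos + width)   # blank moves down
        if col > 0:
            succ.append(pos - 1)       # blank moves left
        if col < width - 1:
            succ.append(pos + 1)       # blank moves right
        graph[pos] = sorted(succ)
    return graph


graph = abstract_graph()
reachable_states = math.factorial(16) // 2  # half of all tile permutations are reachable

print(len(graph))          # number of abstract states
print(reachable_states)    # number of concrete states
```

The abstract graph is small enough to keep in RAM at all times, which is what lets it guide decisions about which blocks of concrete nodes to load from disk.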

Duplicate-detection scope
A set of blocks (of stored nodes) that is guaranteed to contain all stored successor nodes of the currently expanding node.
[Figure: abstract state-space graph with blocks B0 through B15, highlighting B0 together with B1 and B4]
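
The definition can be sketched in code, assuming the blank-position projection on a 4x4 board. The helper names (`neighbors`, `successors`, `duplicate_detection_scope`) are hypothetical and chosen only for this illustration.

```python
def neighbors(pos: int, width: int = 4) -> list:
    """Cells the blank can move to from cell `pos` on a width x width board."""
    row, col = divmod(pos, width)
    out = []
    if row > 0:
        out.append(pos - width)
    if row < width - 1:
        out.append(pos + width)
    if col > 0:
        out.append(pos - 1)
    if col < width - 1:
        out.append(pos + 1)
    return out


def successors(state: tuple) -> list:
    """Concrete successors of a puzzle state (0 marks the blank)."""
    blank = state.index(0)
    result = []
    for target in neighbors(blank):
        tiles = list(state)
        tiles[blank], tiles[target] = tiles[target], tiles[blank]
        result.append(tuple(tiles))
    return result


def duplicate_detection_scope(block: int) -> set:
    """Blocks that can hold successors of any node stored in `block`."""
    return set(neighbors(block))


state = tuple(range(16))  # blank in cell 0, so the node belongs to block B0
scope = duplicate_detection_scope(0)
print(sorted(scope))
print(all(s.index(0) in scope for s in successors(state)))
```

While nodes of one block are being expanded, only the blocks in its scope need to be in RAM for duplicate detection; all other blocks can stay on disk.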

Edge partitioning
Reduces the duplicate-detection scope to one block of stored nodes (guaranteed!)
[Figure: abstract state-space graph with blocks B0 through B15, showing the outgoing abstract edges of a block handled one at a time]
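
One way to picture this, again assuming the blank-position projection: instead of generating all successors of a node at once, the nodes of a block are expanded once per outgoing abstract edge, applying only the move that corresponds to that edge. The sketch below is not the authors' implementation; `abstract_edges`, `apply_edge`, and `expand_by_edge` are hypothetical names.

```python
def abstract_edges(src: int, width: int = 4) -> list:
    """Destination blocks of the abstract edges leaving block `src`."""
    row, col = divmod(src, width)
    dsts = []
    if row > 0:
        dsts.append(src - width)
    if row < width - 1:
        dsts.append(src + width)
    if col > 0:
        dsts.append(src - 1)
    if col < width - 1:
        dsts.append(src + 1)
    return dsts


def apply_edge(state: tuple, src: int, dst: int) -> tuple:
    """Apply only the move that takes the blank from cell `src` to cell `dst`."""
    tiles = list(state)
    tiles[src], tiles[dst] = tiles[dst], tiles[src]
    return tuple(tiles)


def expand_by_edge(block_states: list, src: int):
    """Yield (dst, successors) pairs, one abstract edge at a time."""
    for dst in abstract_edges(src):
        generated = [apply_edge(s, src, dst) for s in block_states]
        # Every generated node maps to block `dst`, so only that one block
        # has to be resident in RAM while this edge is processed.
        yield dst, generated


block_b0 = [tuple(range(16))]  # one stored node whose blank is in cell 0
for dst, generated in expand_by_edge(block_b0, 0):
    print(dst, {s.index(0) for s in generated})
```

The trade-off in this sketch is that each block is visited once per outgoing edge rather than once overall, in exchange for a scope of a single block.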

What is a good abstraction?
- Capture local structure
  - DDD: Interleaving expansion and merging
  - SDD: Fewer incremental expansions
- Distribute nodes evenly into buckets
  - Make sure the largest bucket fits in RAM
  - Not too many buckets
- Achieving both is challenging, especially for static abstraction

A pathological example
- Degenerate state-space projection function
- In theory, there are 518,918,400 buckets
- But most (> 99.99%) of them are empty!
[Figure: search from Start to Goal]
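
The slide does not say how the projection is defined. One reading that reproduces the bucket count, offered here purely as an assumption, is an abstraction that tracks the cells of 8 distinct items on a 16-cell board, which gives 16!/8! ordered placements. The arithmetic below checks that figure and what "more than 99.99% empty" implies for the number of non-empty buckets.

```python
import math

# Assumption: buckets correspond to ordered placements of 8 distinct items in 16 cells.
theoretical_buckets = math.factorial(16) // math.factorial(8)

# If more than 99.99% of buckets are empty, fewer than 0.01% are non-empty.
max_nonempty_buckets = theoretical_buckets // 10_000

print(theoretical_buckets)
print(max_nonempty_buckets)
```

Under this reading, a table sized for every theoretical bucket would be almost entirely wasted, which is the motivation for choosing the abstraction from the nodes actually generated.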

Greedy abstraction algorithm
- Start with a "blank" abstraction
- Mark all state variables as unselected
- While (size of abstract graph ≤ M)
  - Find an unselected variable V_i such that adding it to the current abstraction minimizes the largest bucket size
  - Add V_i to the set of abstraction variables
  - Mark V_i as selected
  - Update the current abstraction
  - Move nodes to their new buckets

Example

Nodes
Node  X  Y  Z
a     1  4  6
b     1  5  6
c     2  4  6
d     2  5  6
e     3  4  7
f     3  5  7

1st iteration
Vars   Values    States
{X}    {X = 1}   {a, b}
       {X = 2}   {c, d}
       {X = 3}   {e, f}
{Y}    {Y = 4}   {a, c, e}
       {Y = 5}   {b, d, f}
{Z}    {Z = 6}   {a, b, c, d}
       {Z = 7}   {e, f}

2nd iteration
Vars    Values           States
{X,Y}   {X = 1, Y = 4}   {a}
        {X = 1, Y = 5}   {b}
        {X = 2, Y = 4}   {c}
        {X = 2, Y = 5}   {d}
        {X = 3, Y = 4}   {e}
        {X = 3, Y = 5}   {f}
{X,Z}   {X = 1, Z = 6}   {a, b}
        {X = 1, Z = 7}   ∅
        {X = 2, Z = 6}   {c, d}
        {X = 2, Z = 7}   ∅
        {X = 3, Z = 6}   ∅
        {X = 3, Z = 7}   {e, f}
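
The greedy selection can be sketched on the six example nodes as follows. This is an illustration under stated assumptions, not the authors' code: the abstract graph size is taken to be the product of the observed domain sizes (empty buckets included), the loop stops before that size would exceed the bound, and all names (`partition`, `largest_bucket`, `abstract_graph_size`, `greedy_abstraction`, `max_abstract_states`) are hypothetical.

```python
from collections import defaultdict

NODES = {
    "a": {"X": 1, "Y": 4, "Z": 6},
    "b": {"X": 1, "Y": 5, "Z": 6},
    "c": {"X": 2, "Y": 4, "Z": 6},
    "d": {"X": 2, "Y": 5, "Z": 6},
    "e": {"X": 3, "Y": 4, "Z": 7},
    "f": {"X": 3, "Y": 5, "Z": 7},
}


def partition(nodes, variables):
    """Group node names into buckets keyed by their values on `variables`."""
    buckets = defaultdict(list)
    for name, state in nodes.items():
        buckets[tuple(state[v] for v in variables)].append(name)
    return dict(buckets)


def largest_bucket(nodes, variables):
    """Size of the fullest bucket under the given abstraction variables."""
    return max(len(b) for b in partition(nodes, variables).values())


def abstract_graph_size(nodes, variables):
    """Assumption: one abstract state per combination of observed values."""
    size = 1
    for v in variables:
        size *= len({state[v] for state in nodes.values()})
    return size


def greedy_abstraction(nodes, all_variables, max_abstract_states):
    """Repeatedly add the variable that minimizes the largest bucket."""
    selected, unselected = [], list(all_variables)
    while unselected:
        best = min(unselected, key=lambda v: largest_bucket(nodes, selected + [v]))
        if abstract_graph_size(nodes, selected + [best]) > max_abstract_states:
            break  # assumption: stop before the abstract graph grows past the bound
        selected.append(best)
        unselected.remove(best)
    return selected, partition(nodes, selected)


selected, buckets = greedy_abstraction(NODES, ["X", "Y", "Z"], max_abstract_states=6)
print(selected)
print(buckets)
```

On this data the first iteration compares largest buckets of 2, 3, and 4 for X, Y, and Z, and the second compares 1 and 2 for {X,Y} and {X,Z}, matching the tables above.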

Computational results
- Planning results on 15 Puzzle
  - First planner to optimally solve all 100 of Korf's 15 Puzzle instances (93 for previous best solver)
  - 5x RAM for #88)
  - Uses only Manhattan-Distance heuristic
- STRIPS planning (6 domains from IPC)
  - Peak RAM reduced by up to ~19x
  - Better time-space tradeoff
  - Improves with accuracy of heuristic function

Bucket size histogram for instance #88

Conclusion and future work
- Not all abstractions are created equal, even those with the same resolution!
  - Largest bucket depends on starting state
  - Static abstraction ineffective for heuristic search
- Future work
  - Sampling approach
  - Parallel search