Neighborhood Interchangeability (NI) for Non-Binary CSPs & Application to Databases (Lal: M.S. thesis defense, Constraint Systems Laboratory, April 21, 2005)

Presentation transcript:

Slide 1: Neighborhood Interchangeability (NI) for Non-Binary CSPs & Application to Databases
Anagh Lal
Constraint Systems Laboratory, Computer Science & Engineering, University of Nebraska-Lincoln
M.S. thesis defense, April 21, 2005
Research supported by NSF CAREER award # and by Maude Hammond Fling Faculty Research Fellowship.

Slide 2: Main contributions
CSPs
1. Interchangeability: an algorithm for neighborhood interchangeability (NI) in non-binary CSPs
2. Dynamic bundling: integrating NI + backtrack search for solving non-binary CSPs
3. Exploratory: towards detecting substitutability
Databases
1. A new model of the join query as a CSP
2. A new sorting-based bundling algorithm
3. A new sort-merge join algorithm that produces bundled tuples
4. Exploratory: application to materialized views

Slide 3: Outline
- Background
- Neighborhood Interchangeability (NI) for non-binary CSPs
- Empirical evaluations
- Database algorithms based on dynamic bundling
- Conclusions & future work

Slide 4: Constraint Satisfaction Problem
Given P = (V, D, C)
- V: set of variables
- D: set of their domains
- C: set of constraints restricting the acceptable combination of values for variables
- Solution is a consistent assignment of values to variables
Query: find 1 solution, all solutions, etc.
Examples: SAT, scheduling, product configuration
NP-complete in general
[Figure: example constraint graph over V1, V2, V3, V4 with domains {d}, {a, b, d}, {a, b, c}, {c, d, e, f}]
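As a minimal sketch of the definition above, the following Python encodes P = (V, D, C) for a four-variable example and enumerates its solutions by brute force. The figure does not survive in the transcript, so the pairing of domains with variables and the not-equal constraints on the edges V1-V3, V2-V3, V2-V4, V3-V4 are assumptions, chosen to agree with the claims made on the later slides.

```python
from itertools import product

# Assumed example: domains and constraint edges are illustrative, not taken from the figure.
variables = ["V1", "V2", "V3", "V4"]
domains = {
    "V1": ["d"],
    "V2": ["c", "d", "e", "f"],
    "V3": ["a", "b", "d"],
    "V4": ["a", "b", "c"],
}
# Each edge is a hypothetical binary constraint "the two variables take different values".
not_equal_edges = [("V1", "V3"), ("V2", "V3"), ("V2", "V4"), ("V3", "V4")]


def is_consistent(assignment):
    """True if the complete assignment satisfies every constraint."""
    return all(assignment[x] != assignment[y] for x, y in not_equal_edges)


def all_solutions():
    """Enumerate every combination of values and keep the consistent ones."""
    solutions = []
    for values in product(*(domains[v] for v in variables)):
        assignment = dict(zip(variables, values))
        if is_consistent(assignment):
            solutions.append(assignment)
    return solutions


if __name__ == "__main__":
    for solution in all_solutions():
        print(solution)
```

Brute-force enumeration is exponential in the number of variables; it is used here only to make the definition of a solution concrete.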

Slide 5: Systematic search
Basic mechanism
- DFS & backtracking (BT)
- Variable being instantiated: current variable
- Uninstantiated variables: future variables
- Instantiated variables: past variables
Constraint propagation
- Remove values inconsistent with constraints
- Forward checking filters domains of future variables given the instantiation of current variable
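The mechanism above can be sketched as a depth-first search that, after each instantiation of the current variable, filters the domains of the future variables and backtracks on a domain wipe-out. This is a generic illustration for binary constraints, not the implementation used in the thesis; the function name `forward_checking_search`, the `conflicts` predicate, and the example data are assumptions.

```python
def forward_checking_search(variables, domains, conflicts):
    """Return one solution as a dict, or None.

    conflicts(x, a, y, b) is True when x=a together with y=b violates a constraint.
    """
    def search(assignment, live_domains):
        if len(assignment) == len(variables):
            return dict(assignment)
        # Current variable: the first one not yet instantiated (static ordering).
        current = next(v for v in variables if v not in assignment)
        for value in live_domains[current]:
            pruned = {}
            wiped_out = False
            # Forward checking: filter the domain of every future variable.
            for future in variables:
                if future in assignment or future == current:
                    continue
                pruned[future] = [b for b in live_domains[future]
                                  if not conflicts(current, value, future, b)]
                if not pruned[future]:
                    wiped_out = True  # a future domain is empty: try the next value
                    break
            if wiped_out:
                continue
            next_domains = {**live_domains, **pruned, current: [value]}
            result = search({**assignment, current: value}, next_domains)
            if result is not None:
                return result
        return None  # every value failed: backtrack

    return search({}, domains)


# Assumed example data (hypothetical not-equal constraints on four edges).
variables = ["V1", "V2", "V3", "V4"]
domains = {"V1": ["d"], "V2": ["c", "d", "e", "f"],
           "V3": ["a", "b", "d"], "V4": ["a", "b", "c"]}
edges = [("V1", "V3"), ("V2", "V3"), ("V2", "V4"), ("V3", "V4")]
edge_set = {frozenset(e) for e in edges}


def conflicts(x, a, y, b):
    return frozenset((x, y)) in edge_set and a == b


if __name__ == "__main__":
    print(forward_checking_search(variables, domains, conflicts))
```

Past variables never need re-checking in this scheme, because every value left in a future domain is already consistent with all instantiations made so far.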

Slide 6: Value interchangeability [Freuder, '91]
Equivalent values in the domain of a variable
[Figure: example constraint graph over V1, V2, V3, V4 with domains {d}, {a, b, d}, {a, b, c}, {c, d, e, f}]
Full Interchangeability (FI):
- d, e, f interchangeable for V2 in any solution
Neighborhood Interchangeability (NI):
- Efficiently approximates FI
- Finds e, f but misses d
- Discrimination tree DT(Vx)
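A minimal sketch of how NI sets can be computed for a binary CSP: two values of a variable are neighborhood interchangeable when they are consistent with exactly the same (neighbor, value) pairs. The code groups values by that signature in a dictionary rather than building the discrimination tree explicitly, which yields the same partition. The example network (domains per variable and not-equal edges) is an assumption chosen to reproduce the claims on this slide.

```python
from collections import defaultdict

# Assumed example data; the constraint on each edge is "values must differ".
domains = {"V1": ["d"], "V2": ["c", "d", "e", "f"],
           "V3": ["a", "b", "d"], "V4": ["a", "b", "c"]}
edges = [("V1", "V3"), ("V2", "V3"), ("V2", "V4"), ("V3", "V4")]


def neighbors(var):
    """Variables sharing a constraint with var."""
    return sorted({y if x == var else x for x, y in edges if var in (x, y)})


def ni_sets(var):
    """Partition the domain of var into neighborhood-interchangeable sets."""
    groups = defaultdict(set)
    for a in domains[var]:
        # Signature: every (neighbor, value) pair consistent with var = a.
        signature = frozenset((n, b) for n in neighbors(var)
                              for b in domains[n] if a != b)
        groups[signature].add(a)
    return sorted(sorted(g) for g in groups.values())


if __name__ == "__main__":
    print(ni_sets("V2"))
```

Under these assumptions, d lands in its own set because V3 still has d in its domain, even though V3 = d appears in no solution; that purely local view is why NI only approximates FI.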

Slide 7: Bundling: using NI in search
Static bundling [Haselböck, '93]
Dynamic bundling [Our group, '01]
- Dynamically identifies NI
- Finds fatter solution than BT & static bundling
- Never less efficient than BT & static bundling
[Figure: search trees over V1 and V2 for BT, static bundling, and dynamic bundling on the example CSP with domains {d}, {a, b, d}, {a, b, c}, {c, d, e, f}]

Slide 8: Robust solutions
Single solution: V1 ← d, V2 ← e, V3 ← a, V4 ← c
Robust solution: V1 ← {d}, V2 ← {d, e, f}, V3 ← {a}, V4 ← {b, c}
Solution bundle: Cartesian product of bundles of variables
Solution-bundle size = 1 × 3 × 1 × 2 = 6
[Figure: example constraint graph over V1, V2, V3, V4 with domains {d}, {a, b, d}, {a, b, c}, {c, d, e, f}]
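The bundle-size arithmetic above can be checked with a short sketch: the size of a solution bundle is the product of the sizes of the per-variable bundles, and the individual solutions are the elements of their Cartesian product. The variable name `robust_solution` is illustrative.

```python
from itertools import product
from math import prod

# The robust solution from the slide: one bundle of values per variable.
robust_solution = {
    "V1": ["d"],
    "V2": ["d", "e", "f"],
    "V3": ["a"],
    "V4": ["b", "c"],
}

# Size of the solution bundle: product of the bundle sizes.
bundle_size = prod(len(values) for values in robust_solution.values())

# Expanding the bundle gives the individual solutions it represents.
expanded = [dict(zip(robust_solution, combo))
            for combo in product(*robust_solution.values())]

if __name__ == "__main__":
    print(bundle_size)
    for solution in expanded:
        print(solution)
```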

Slide 9: Phase transition [Cheeseman et al. '91]
- Significant increase of cost around critical value
- In CSPs, order parameter is constraint tightness & ratio
- Algorithms compared around phase transition
[Figure: cost of solving plotted against the order parameter, with regions labeled "mostly solvable problems" and "mostly unsolvable problems" on either side of the critical value of the order parameter]

Slide 10: Non-binary CSPs
[Figure: constraint network over variables V, V1, V2, V3, V4 and constraints C1, C2, C3, C4, with a table listing the variables of each constraint; domains shown include {1, 2, 3, 4, 5, 6} and {1, 2, 3}]
- Scope(Cx): the set of variables involved in Cx
- Arity(Cx): size of scope
- Computing NI for non-binary CSPs is not a trivial extension from binary CSPs

Slide 11: CSP parameters
- n: number of variables
- a: domain size
- t: constraint tightness, the ratio of the number of disallowed tuples over all possible tuples
- deg: degree of a variable
- c_k: number of constraints of arity k
- $p_k = c_k / \binom{n}{k}$: constraint ratio
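As a small worked example of the constraint ratio, the sketch below divides the number of constraints of arity k by the number of possible scopes of that arity, C(n, k). The function name `constraint_ratio` and the sample numbers are illustrative and not taken from the thesis.

```python
from math import comb


def constraint_ratio(c_k, n, k):
    """p_k = c_k / C(n, k): fraction of the possible arity-k scopes that carry a constraint."""
    return c_k / comb(n, k)


if __name__ == "__main__":
    # Hypothetical instance: 20 variables and 57 ternary constraints.
    # There are C(20, 3) = 1140 possible ternary scopes.
    print(constraint_ratio(57, 20, 3))
```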

Slide 12: Outline
- Background
- Neighborhood Interchangeability (NI) for non-binary CSPs
  - Non-binary discrimination tree (nb-DT)
- Empirical evaluations
- Database algorithms based on dynamic bundling
- Conclusions & future work

Slide 13: NI for non-binary CSPs
1. Building an nb-DT for each constraint
   - Determines the NI sets of a variable given the constraint
2. Intersecting partitions from nb-DTs
   - Yields NI sets of V (partition of D_V)
3. Processing paths in nb-DTs
   - Gives, for free, updates necessary for forward checking
[Figure: constraint network over V, V1, V2, V3, V4 with constraints C1 to C4, where V has domain {1, 2, 3, 4, 5, 6}; nb-DT(V, C1) with leaves {1, 2}, {3, 4}, {5, 6}; nb-DT(V, C2) with leaves {1, 2}, {3, 4}, {5}, {6}]
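Steps 1 and 2 can be sketched as follows: for each constraint, values of V that are supported by the same set of tuples over the other variables in the scope fall into the same set, and the NI sets of V are the intersection of these per-constraint partitions. The sketch groups values with a dictionary instead of building the nb-DT itself, and the tuples of C1 and C2 are hypothetical, chosen so that the per-constraint partitions match the leaves shown on the slide.

```python
from collections import defaultdict


def partition_by_constraint(var, scope, tuples, domain):
    """Partition `domain` of `var` by the tuples that support each value in one constraint."""
    idx = scope.index(var)
    support = defaultdict(set)
    for t in tuples:
        if t[idx] in domain:
            # Project the tuple onto the other variables of the scope.
            support[t[idx]].add(t[:idx] + t[idx + 1:])
    groups = defaultdict(set)
    for value in domain:
        groups[frozenset(support[value])].add(value)
    return {frozenset(g) for g in groups.values()}


def intersect_partitions(p, q):
    """Coarsest partition that refines both p and q."""
    return {a & b for a in p for b in q if a & b}


domain_v = [1, 2, 3, 4, 5, 6]
# Hypothetical constraint definitions (allowed tuples), for illustration only.
c1_scope = ("V", "V1", "V2")
c1_tuples = [(1, 1, 1), (2, 1, 1), (3, 2, 2), (4, 2, 2), (5, 3, 3), (6, 3, 3)]
c2_scope = ("V", "V3")
c2_tuples = [(1, 1), (2, 1), (3, 2), (4, 2), (5, 3), (6, 1), (6, 3)]

part_c1 = partition_by_constraint("V", c1_scope, c1_tuples, domain_v)
part_c2 = partition_by_constraint("V", c2_scope, c2_tuples, domain_v)
ni_sets_v = intersect_partitions(part_c1, part_c2)

if __name__ == "__main__":
    print(sorted(sorted(s) for s in ni_sets_v))
```

With more than two constraints on V, the intersection is applied repeatedly, one partition at a time.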

Slide 14: Building an nb-DT: nb-DT(V, C1)
[Figure: nb-DT(V, C1) built from the tuples of C1 over V, V1, V2; each path from the root is annotated with tuples and ends in a set of values from the domain of V: {1, 2}, {3, 4}, {5, 6}]
Complexity: $O(\mathit{deg} \cdot a^{k+1} \cdot (1 - t))$

Slide 15: Bundling = Search + NI
Benefits of bundling
1. Bundles solutions
2. Bundles no-goods
Dynamic bundling (DynBndl)
- Re-computes NI during search
- Yields larger bundles, boosts effects of bundling
Skeptics' objection to DynBndl
- Costly & not worthwhile
We show that the converse holds
[Figure: search tree over V, V1, V2, V3, V4 showing a solution bundle and a no-good bundle]

Slide 16: Advantages of DynBndl
- We exploit nb-DTs for forward checking
- DynBndl versus FC (BT + forward checking)
  - Finding all solutions: theoretically best
  - Finding first solution: empirical evidence
- DynBndl yields multiple, robust solutions for less cost

Slide 17: Outline
- Background
- Neighborhood Interchangeability (NI) for non-binary CSPs
- Empirical evaluations
- Database algorithms based on dynamic bundling
- Conclusions & future work

Slide 18: Empirical evaluations
DynBndl versus FC (BT + forward checking)
Experiments
- Effect of varying tightness
- In the phase-transition region
  - Effect of varying domain size
  - Effect of varying constraint ratio (CR)
Randomly generated problems, Model B
ANOVA to statistically compare performance of DynBndl and FC with varying t
t-distribution for confidence intervals

Slide 19: Experimental set-up
Generated 16 data sets
- n = {20, 30} × a = {10, 15} × {CR1, CR2, CR3, CR4}
- 9 to 12 values for t ∈ [25%, 75%]
- 1,000 instances per tightness value
Performance measurements
- FBS, size of the first solution bundle
- NV, number of nodes visited in the search tree
- CC, number of constraints checked
- CPU time

Slide 20: Analysis: Varying tightness
Low tightness
- Large FBS 33 at t= (Dataset #13, t=0.35)
- Small additional cost
Phase transition
- Multiple solutions present
- Maximum no-good bundling causes max savings in CPU time, NV, & CC
High tightness
- Problems mostly unsolvable
- Overhead of bundling minimal
[Figure: FBS, NV (hundreds), and CPU time (sec) plotted against tightness t for DynBndl and FC; n=20, a=15, CR=CR]

Slide 21: Analysis: Varying domain size
Increasing a in phase-transition
- FBS increases: more chances for symmetry
- CPU time decreases: more bundling of no-goods
[Table: Improv (CPU) % and FBS for a=10 and a=15, one row per constraint ratio (CR), for increasing a with n=30; values not recoverable from the transcript]
Because the benefits of DynBndl increase with increasing domain size, DynBndl is particularly interesting for database applications where large domains are typical

Slide 22: Outline
- Background
- Neighborhood Interchangeability (NI) for non-binary CSPs
- Empirical evaluations
- Database algorithms based on dynamic bundling
  - Sorting-based bundling algorithm
  - Dynamic-bundling-based join algorithm
- Conclusions & future work

Slide 23: Databases & CSPs
DB terminology | CSP terminology
Table, relation | Constraint (relational constraint)
Join condition | Constraint (join-condition constraint)
Attribute | CSP variable
Tuple in a table | Tuple in a constraint or allowed by one
A sequence of natural joins | All solutions to a CSP
Same computational problems, different cost models
- Databases: minimize # I/O operations
- CSP community: # CPU operations
Challenges for using CSP techniques in DB
- Use of lighter data structures to minimize memory usage
- Fit in the iterator model of database engines

Slide 24: Join operator
$R1 \bowtie_{x \, \theta \, y} R2$
- Most expensive operator in terms of I/O
- θ is "=" → Equi-Join
- x is same as y → Natural Join
Join algorithms
- Nested Loop
- Sorting-based
  - Sort-Merge, Progressive Merge-Join (PMJ)
  - Partitions relations by sorting, minimizes # scans of relations
- Hashing-based
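To make the sorting-based family concrete, here is a minimal in-memory sketch of a classic sort-merge equi-join. It is not PMJ and not the bundling join presented later; it ignores I/O and paging entirely. The function name `sort_merge_join` and the sample relations are illustrative assumptions.

```python
def sort_merge_join(r1, r2, key):
    """Equi-join two lists of tuples on key(t); returns pairs of matching tuples."""
    left = sorted(r1, key=key)
    right = sorted(r2, key=key)
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        ki, kj = key(left[i]), key(right[j])
        if ki < kj:
            i += 1
        elif ki > kj:
            j += 1
        else:
            # Find the run of equal keys on each side, then pair them up.
            i_end = i
            while i_end < len(left) and key(left[i_end]) == ki:
                i_end += 1
            j_end = j
            while j_end < len(right) and key(right[j_end]) == ki:
                j_end += 1
            for a in left[i:i_end]:
                for b in right[j:j_end]:
                    out.append((a, b))
            i, j = i_end, j_end
    return out


if __name__ == "__main__":
    r1 = [(1, "x"), (2, "y"), (2, "z")]
    r2 = [(2, "p"), (3, "q"), (2, "r")]
    print(sort_merge_join(r1, r2, key=lambda t: t[0]))
```

After sorting, each relation is scanned once, which is the property the slide refers to as minimizing the number of scans.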

Slide 25: The join query
Join query:
SELECT R2.A, R2.B, R2.C FROM R1, R2 WHERE R1.A=R2.A AND R1.B=R2.B AND R1.C=R2.C
[Figure: relations R1 and R2]
Result (compacted): 10 tuples in 3 nested tuples
A | B | C
{1, 5} | {12, 13, 14} | {23}
{2, 4} | {10} | {25}
{6} | {13, 14} | {27}
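The compacted result can be expanded back into flat tuples to check the count: each nested tuple stands for the Cartesian product of its attribute bundles. The sketch below is only an illustration of that reading; the names `nested_tuples` and `expand` are assumptions.

```python
from itertools import product

# The three nested tuples from the compacted result, as (A, B, C) bundles.
nested_tuples = [
    ((1, 5), (12, 13, 14), (23,)),
    ((2, 4), (10,), (25,)),
    ((6,), (13, 14), (27,)),
]


def expand(nested):
    """Flat tuples represented by one nested tuple."""
    return list(product(*nested))


flat_result = [t for nested in nested_tuples for t in expand(nested)]

if __name__ == "__main__":
    for row in flat_result:
        print(row)
    print(len(flat_result), "tuples in", len(nested_tuples), "nested tuples")
```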

Slide 26: Modeling join query as a CSP
- Attributes of relations → CSP variables
- Attribute values → variable domains
- Relations → relational constraints
- Join conditions → join-condition constraints
SELECT R1.A, R1.B, R1.C FROM R1, R2 WHERE R1.A=R2.A AND R1.B=R2.B AND R1.C=R2.C
[Figure: constraint network with variables R1.A, R1.B, R1.C, R2.A, R2.B, R2.C and relational constraints R1 and R2]

Slide 27: Progressive Merge Join
PMJ: a sort-merge algorithm by [Dittrich et al. '03]
Two phases
1. Sorting phase: sorts sub-sets of relations & produces early results
2. Merging phase: merges sorted sub-sets
We use the framework of the PMJ for our external join

Slide 28: New join algorithm
Sorting & merging phases
- Load sub-sets of relations in memory
- Compute in-memory join using dynamic bundling
In-memory join
- Uses sorting-based bundling (shown next)
- Computes join of in-memory relations using dynamically computed bundles
Cool animation upon request

Slide 29: Computing a bundle of R1.A
[Figure: relation R1 with columns A, B, C, showing a partition, unequal partitions, symmetric partitions, and the resulting bundle {1, 5}]
Partition of a constraint
- Tuples of the relation having the same value of R1.A
Compare projected tuples of first partition with those of another partition
Compare with every other partition to get complete bundle
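The partition comparison described above can be sketched as follows: sort the relation, split it into partitions by the value of the first attribute, project each partition onto the remaining attributes, and add to the bundle every value whose projected tuples match those of the first partition. This is a simplified illustration without the processed-values bookkeeping; the contents of R1 are not in the transcript, so the rows below are hypothetical.

```python
from itertools import groupby, product


def bundle_of_first_value(relation):
    """Bundle of the smallest value of the first attribute (A) in `relation`."""
    rows = sorted(relation)
    # One partition per value of A, projected onto the remaining attributes.
    partitions = [(a, [row[1:] for row in group])
                  for a, group in groupby(rows, key=lambda r: r[0])]
    first_value, first_projection = partitions[0]
    bundle = {first_value}
    for value, projection in partitions[1:]:
        # Symmetric partitions have identical projected tuples.
        if projection == first_projection:
            bundle.add(value)
    return bundle


# Hypothetical R1 over (A, B, C), for illustration only.
r1 = (list(product((1, 5), (12, 13, 14), (23,)))
      + list(product((2, 4), (10,), (25,)))
      + list(product((6,), (13, 14), (27,))))

if __name__ == "__main__":
    print(bundle_of_first_value(r1))
```

Because the rows are sorted before partitioning, the projected tuples of each partition are in the same order, so comparing two partitions is a plain list comparison.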

Slide 30: Experiments
XXL library for implementation & evaluation
Data sets
- Random: 2 relations R1, R2 with same schema as example
  - Each relation: 10,000 tuples
  - Memory size: 4,000 tuples
  - Page size: 200 tuples
- Real-world problem: 3 relations, 4 attributes
Compaction rate achieved
- Random problem: 1.48
  - Savings even with (very) preliminary implementation
- Real-world problem: 2.26 (69 tuples in 32 nested tuples)

Slide 31: Outline
- Background
- Neighborhood Interchangeability (NI) for non-binary CSPs
- Empirical evaluations
- Database algorithms based on dynamic bundling
- Conclusions & future work

Slide 32: Conclusions
- Algorithm for computing NI sets in non-binary CSPs
- DynBndl
  - produces multiple robust solutions
  - significantly reduces cost of search at phase transition
- New dynamic-bundling-based join algorithm
Constraint Processing inspires innovative solutions to fundamental difficult problems in Databases

Slide 33: Future work
- Sort constraint definitions to improve CSP techniques
- Design bundling mechanisms for gap & linear constraints in Constraint Databases
- Explore benefits of bundling in Databases
  - Sampling operator
  - Main-memory databases
  - Automatic categorization of query results

Slide 34: Thanks!!

Slide 35: Related work
Join algorithms
- Well established algorithms
- Do not focus on exploiting symmetry
Database compression
- Output results are not compressed
- Compression at value level, not tuple level

Slide 36: Related work (contd)
[Mamoulis & Papadias 1998]
- Join using FC for spatial DB
- Restricted to binary constraints
- No compaction of solution space
[Bayardo et al. 1996]
- Reduce the number of the intermediate tuples of a sequence of joins
[Rich et al. 1993]
- Do not compact join attribute values
- Do not detect redundancy present in the grouped sub-relations

Slide 37: Analysis of overheads
For bundling
- Additional data structures: 2 arrays, 1 pointer
- Only 1 array (Processed values) may become cumbersome
Array size is largest
- when all the values of a variable are in one bundle
- But this case also leads to best savings!

Slide 38: Sorting-based bundling
Heuristic for variable ordering
- Place variables linked by join conditions as close to each other as possible
[Figure: variable ordering R1.A, R2.A, R1.B, R2.B, R1.C, R2.C over relations R1 and R2]
- Sort relations using above ordering
- Next: compute bundles of variable ahead in variable ordering (R1.A)

Slide 39: Join using bundling
[Figure: relations R1 and R2 with columns A, B, C, a "Processed values" array for each relation, and the variable ordering R1.A, R2.A, R1.B, R2.B, R1.C, R2.C]
Computing bundle for R1.A
- Select partition to compare for R1.A
- Symmetric partitions: adding to bundle of R1.A; current bundle of R1.A = {1, 5}
- Update processed values for R1.A
Computing bundle for R2.A
- Select partition to compare
- Symmetric partitions: adding to bundle of R2.A; current bundle of R2.A = {1, 5}
- Update processed values for R2.A

Slide 40: Join using bundling
[Figure: relations R1 and R2 with columns A, B, C, a "Processed values" array for each relation, and the variable ordering R1.A, R2.A, R1.B, R2.B, R1.C, R2.C]
- Current bundle of R1.A = {1, 5}
- Current bundle of R2.A = {1, 5}
- Common(R1.A, R2.A) = {1, 5}
- Assign {1, 5} to R1.A
- Compute current constraint of R1
- Assign {1, 5} to R2.A
- Compute current constraint of R2
- Next variable: R1.B