A Cooperative Database System (CoBase) for Query Relaxation Wesley W. Chu, Hua Yang, and Gladys Chow Presented by David Liu.

1 A Cooperative Database System (CoBase) for Query Relaxation Wesley W. Chu, Hua Yang, and Gladys Chow Presented by David Liu

2 Motivation
6/24/2015, David Liu, UCB Database Seminar
- Oftentimes when you query, you want ‘about the same’ instead of ‘exactly’
  - Medical Image Diagnosis: match images to diseases
- Other times, you might not even want near items, just the least far ones
  - ARPA/Rome Labs Planning Initiative (ARPI) Transportation problem

3 High-level description of solution
- View a query Q’s response set R as a subset of all information stored in the database
- All records in R satisfy a set of constraints C put forth by Q
- If R is empty, then perform incremental relaxation
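
The incremental relaxation idea on this slide can be sketched roughly as follows. This is only an illustration, not CoBase's implementation: the function name `relaxed_query`, the fixed numeric `step`, and the `max_steps` cap are assumptions made for the example, and CoBase itself relaxes through type abstraction hierarchies rather than a fixed numeric tolerance. The GPA values are sample data.

```python
def relaxed_query(records, attribute, target, step=0.01, max_steps=10):
    """Widen an equality constraint until the response set R is non-empty.

    Hypothetical helper: returns (matches, tolerance_used).
    """
    for level in range(max_steps + 1):
        tolerance = level * step  # level 0 is the exact query
        matches = [r for r in records
                   if abs(r[attribute] - target) <= tolerance]
        if matches:  # R is non-empty, so stop relaxing
            return matches, tolerance
    return [], max_steps * step  # nothing found within the allowed relaxation


# Sample data (illustrative only)
students = [
    {"name": "A", "gpa": 3.682},
    {"name": "B", "gpa": 3.702},
    {"name": "C", "gpa": 3.100},
]
matches, tolerance = relaxed_query(students, "gpa", 3.700)
```

The exact query returns nothing, so the constraint is widened one step at a time until at least one record satisfies it.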

4 CoBase
- Main design features:
  - Relaxation: if there’s no exact match, try to find a ‘close’ neighbor and see if it matches
  - Control: allow the user to control relaxations
  - Explanation: justify relaxations to the user in semantic terms

5 Architecture
[Figure: CoBase architecture. Source: A Cooperative Database System for Query Relaxation, page 4]

6 Demonstration

7 Relaxation: Type Abstraction Hierarchies
- Sample query: SELECT * FROM Students s WHERE s.GPA = 3.700
- Suppose that there are no students with GPA = 3.700, but one with 3.682 and another with 3.702
- Conceptually, we might have wanted the query to return these tuples
- We can use Type Abstraction Hierarchies (TAHs) to classify GPAs conceptually

8 Relaxation: Type Abstraction Hierarchy (TAH)

9 TAH Operators
- There are two special operators used to exploit the TAH:
  - Generalize(node x): get the parent of x, which encapsulates instances that are similar to x
  - Specialize(node x): get the set of all instances represented by node x
  - [Definition not preserved in the transcript]
  - Note: these two operators are not inverses

10 TAH Operators
- A relaxation can be seen as:
  - Specialize(Generalize(x)), where x is the value/predicate that we are trying to relax
- An n-level relaxation is then:
  - Specialize(Generalize^n(x)), which is the same as n iterative generalizations followed by a specialization

11 Relaxation Example
- Example: subtree of the GPA TAH:
  - Generalize(3.700) will yield node A
  - Specialize(Generalize(3.700)) will yield the set of values: {3.667,…,4.000}
  - Specialize(Generalize^2(3.700)) will yield the following set: {3.352,…,3.700,…,4.000}
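
The two operators and the example above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the `TAHNode` class, the sibling node "B", the root "AB", and the intermediate values 3.500 and the exact leaf sets are made up for the example; only node A and the range endpoints come from the slide.

```python
class TAHNode:
    """A node in a type abstraction hierarchy (illustrative structure)."""

    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self


def generalize(node, n=1):
    """Move up n levels in the hierarchy, stopping at the root."""
    for _ in range(n):
        if node.parent is None:
            break
        node = node.parent
    return node


def specialize(node):
    """Collect every instance (leaf label) represented by node."""
    if not node.children:
        return [node.label]
    return [v for child in node.children for v in specialize(child)]


# Hypothetical subtree of a GPA TAH
leaves_a = [TAHNode(v) for v in (3.667, 3.700, 4.000)]
leaves_b = [TAHNode(v) for v in (3.352, 3.500)]
node_a = TAHNode("A", leaves_a)
node_b = TAHNode("B", leaves_b)
root = TAHNode("AB", [node_b, node_a])

x = leaves_a[1]                            # the instance 3.700
one_level = specialize(generalize(x))      # values under node A
two_levels = specialize(generalize(x, 2))  # values under the parent of A
```

Because `specialize(generalize(x))` returns the whole neighborhood of `x` rather than `x` itself, the sketch also shows why the two operators are not inverses.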

12 Multi-attribute Type Abstraction Hierarchy (MTAH)
- MTAHs are multiple-attribute type abstraction hierarchies
- These are a generalization of single-attribute TAHs
- MTAHs can be used to classify geographical data

13 MTAHs: Example
[Figure based on: A Cooperative Database System for Query Relaxation, page 6. Labels: Bizerte, Tunis, Saminjah, Sfax, Gabes, Jerba, Gafsa, El_Borma, Djedeida]

14 Automatic Generation of TAHs
- Main idea:
  - Recursively partition the search space into two until each partition has fewer than T items
  - Repartition each partition further to obtain an N-ary partition. This is done with a hill-climbing algorithm

15 Automatic Generation of TAHs
- Main idea:
  - Binary partitioning: recursively partition the search space into two until each partition has fewer than T items
  - N-ary partitioning: repartition each partition further to obtain an N-ary partition. This is done with a hill-climbing algorithm

16 Automatic Generation of TAHs
- After each partition, calculate the Categorical Utility of the partitioning to decide whether to terminate
- Relaxation error is used to measure utility
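
The binary partitioning step could look roughly like the sketch below. It is an assumption-laden illustration, not the authors' algorithm: relaxation error is taken here to be the average pairwise absolute difference within a cluster with all values equally likely, the cut is chosen by exhaustive search over contiguous splits of the sorted values, and the function names are hypothetical. The N-ary hill-climbing refinement and the Categorical Utility stopping test are not shown.

```python
def relaxation_error(values):
    """Average pairwise |x_i - x_j| in a cluster, assuming equally likely values."""
    n = len(values)
    if n == 0:
        return 0.0
    return sum(abs(a - b) for a in values for b in values) / (n * n)


def best_binary_cut(values):
    """Index that splits sorted values into two contiguous clusters
    with the lowest size-weighted relaxation error."""
    n = len(values)

    def cost(cut):
        left, right = values[:cut], values[cut:]
        return (len(left) * relaxation_error(left)
                + len(right) * relaxation_error(right)) / n

    return min(range(1, n), key=cost)


def build_partitions(values, threshold):
    """Recursively split in two until each cluster has fewer than threshold items."""
    values = sorted(values)
    if len(values) < threshold or len(values) < 2:
        return [values]
    cut = best_binary_cut(values)
    return (build_partitions(values[:cut], threshold)
            + build_partitions(values[cut:], threshold))


clusters = build_partitions([30, 1, 11, 2, 10], threshold=3)
```

Restricting the search to contiguous clusters of a sorted list is what keeps the number of candidate cuts linear at each level instead of exponential.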

17 Generation of TAHs: complexity
- In general, partitioning is exponential: O(N^N), where N is the number of items
- Partitioning a sorted set into contiguous clusters allows O(n^2) worst-case performance and O(n log n) average performance

18 CoSQL
- Extension to SQL to add relaxation operators
  - Context Free
  - Context Sensitive
  - Control
  - Interactive

19 CoSQL: Context Free
- Approximate
  - ^v_1
  - Return values approximate to v_1
- Between two members
  - between(v_1, v_2)
  - Return values between two values
- Within a set
  - Within(v_1, v_2, …, v_n)
  - Specifies set membership

20 CoSQL: Context Sensitive
- Context-sensitive nearness
  - Near-to X
- User-specified nearness
  - Similar-to X based-on ((a_1 w_1) (a_2 w_2) … (a_n w_n))
  - a_i are attributes and w_i are weights
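
One way to read the user-specified nearness operator is as a weighted distance over the listed attributes. The sketch below assumes a weighted sum of absolute differences; the slides do not give the actual measure, so the distance function, the function names, the parameter `k`, and the sample records are all hypothetical.

```python
def weighted_distance(candidate, target, weights):
    """Weighted sum of absolute attribute differences (assumed measure)."""
    return sum(w * abs(candidate[a] - target[a]) for a, w in weights.items())


def similar_to(records, target, weights, k=1):
    """Return the k records closest to target under the weighted distance."""
    return sorted(records,
                  key=lambda r: weighted_distance(r, target, weights))[:k]


# Illustrative records with two numeric attributes
records = [
    {"name": "P", "length": 8000, "width": 150},
    {"name": "Q", "length": 7900, "width": 100},
    {"name": "R", "length": 6000, "width": 148},
]
target = {"length": 8000, "width": 148}

# Plays the role of based-on ((length 2) (width 1))
closest = similar_to(records, target, {"length": 2, "width": 1}, k=2)
```

Raising the weight of an attribute makes mismatches on that attribute count for more when candidates are ranked.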

21 CoSQL: Control Operators
- Prioritization of relaxation
  - Relaxation-order(a_1, a_2, …, a_n)
- Relaxation restriction
  - Not-relaxable(a_1, a_2, …, a_n)
- Preference list
  - Preference-list(v_1, v_2, …, v_n) on a particular attribute a
- Unacceptable values
  - Unacceptable-list(v_1, v_2, …, v_n) on a particular attribute a

22 CoSQL: Control Operators cont’d
- Using another TAH
  - Alternative-TAH(TAH-Name)
- Restricting amount of relaxation
  - Relaxation-level(v)
- Answer set
  - Answer-set(s)
  - Specifies the minimum number of answers

23 CoSQL: Interactive operators
- Nearer, further
  - These interactive operators are invoked after the user sees an answer set
  - Not SQL per se
  - Used to interactively control geographical queries

24 Explanation Mediators
- By having automated relaxation, the user loses understanding of the system
- Explanation mediator explains relaxations and justifies them to the user
- Explanations come from an explanation dictionary

25 Performance
- Queries from the ARPI transportation domain had the following results:
  - Database retrieval time: 10 secs
  - Query relaxation time: 2 secs, about 1/5 of database retrieval time
  - Explanation time: another 2 secs, also about 1/5 of database retrieval time
  - Total overhead is about 40% (2 secs + 2 secs against 10 secs of retrieval)
  - The most important measure, relaxation quality, is difficult to measure
  - Unclear: exact running times of TAH generation and storage space for these TAHs

26 TAHs and B-trees?
- TAHs are much like B-tree indexes:
  - Hierarchical
  - Cluster-based
  - Partition search space
  - TAH:B-tree::MTAH:R-tree
    - With the exception that R-trees allow overlapping partitions
  - A TAH is like an iterative access method that traverses up and down the tree

27 Applications
- Medical image matching
- ARPI Transportation Planning
- Electronic Warfare

28 Evaluation
- Mutually exclusive partitioning could be a problem
  - The optimal arrangement for CoBase’s relaxation approach is to radiate outward from the querying ‘epicenter’
- Multiple dimensions exacerbate the partitioning problem
- Indexing techniques might be beneficial to allow overlapping partitions

29 The End

30 Categorical Utility (CU)
- Categorical Utility is the objective value of a partition
- RE of a point: [formula not preserved in the transcript]
  - x_i is a point, P(x_j) = probability of point x_j

31 Categorical Utility (CU)
- Categorical Utility is the objective value of a partition
- RE of a partition: [formula not preserved in the transcript]
  - C is a partition, the x_i are the points in the partition, P(x_i) is the probability of occurrence of each point, RE(x_i) is the relaxation error of the point in the partition

32 Categorical Utility (CU)
- Categorical Utility is the objective value of a partition
- RE of a partitioning: [formula not preserved in the transcript]
  - P is a partitioning, P(C_k) is the probability of occurrence of each partition, RE(C_k) is the relaxation error of the partition
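
The formulas on slides 30 to 32 were images and did not survive in the transcript. A plausible reconstruction that matches the variable descriptions is given below. It assumes the distance between two values is their absolute difference, and the index bounds n (points in a partition) and N (partitions in a partitioning) are labels introduced here. As on the slides, P denotes both a probability and a partitioning.

```latex
\begin{align*}
RE(x_i) &= \sum_{j=1}^{n} P(x_j)\,\lvert x_i - x_j \rvert \\
RE(C)   &= \sum_{i=1}^{n} P(x_i)\, RE(x_i) \\
RE(P)   &= \sum_{k=1}^{N} P(C_k)\, RE(C_k)
\end{align*}
```

Read bottom-up, the error of a partitioning is the probability-weighted error of its partitions, and the error of a partition is the probability-weighted error of its points.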

