1 Chengkai Li Kevin-Chen-Chuan Chang Ihab Ilyas Sumin Song Presented by: Mariam John CSE 6392 03/20/2006 RankSQL: Query Algebra and Optimization for Relational.

Presentation transcript:

1 RankSQL: Query Algebra and Optimization for Relational Top-k Queries
Chengkai Li, Kevin Chen-Chuan Chang, Ihab Ilyas, Sumin Song
Presented by: Mariam John, CSE 6392, 03/20/2006

2 Contents
• Introduction
• RankSQL
• Ranking Query Model
• Rank-Relational Algebra
• Ranking Query Plans: Execution Model
• Conclusion

3 Introduction
• Top-k queries return only the top k query results according to a user-specified ranking function.
• Most available solutions live in middleware or focus on specific operators and queries.
• Top-k queries are not treated as a first-class query type in RDBMSs; relational algebra has no notion of ranking.

4 RankSQL
• Provides seamless support for top-k queries, integrated with the existing SQL query facility in an RDBMS.
• Supports ranking as a first-class database construct.
• Extends relational algebra and query optimization.

5 Example of a Top-k Query
SELECT * FROM Hotel h, Restaurant r, Museum m
WHERE c1 AND c2 AND c3
ORDER BY p1 + p2 + p3
LIMIT k
Boolean predicates:
  c1: r.cuisine = Italian
  c2: h.price + r.price < 100
  c3: r.area = m.area
Ranking predicates:
  p1: cheap(h.price)
  p2: close(h.addr, r.addr)
  p3: related(m.collection, "dinosaur")

6 Rank Query Model
• A rank-relational query has four types of predicates:
  Filtering: Boolean-selection predicates and Boolean-join predicates
  Ranking: rank-selection predicates and rank-join predicates
• The goal is to support rank-relational queries efficiently.

7 Rank-Relational Query
• Such queries add a ranking dimension to query processing and optimization.
• Filtering restricts tuple “membership” by applying a Boolean function of Boolean selection or join predicates.
• Ranking restricts “order” by applying a monotonic scoring function of ranking predicates.
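The split between membership and order can be sketched in a few lines of Python. This is only an illustration of the query semantics (filter, score, keep the top k), not of how RankSQL evaluates such queries; the hotel data, the `cheap`, `quality` and `affordable` predicates, and the choice of a sum as the monotonic scoring function are all assumptions made for the example.

```python
import heapq

# Hypothetical data and predicates, invented for illustration only.
hotels = [
    {"name": "A", "price": 60, "stars": 3},
    {"name": "B", "price": 120, "stars": 5},
    {"name": "C", "price": 80, "stars": 4},
    {"name": "D", "price": 40, "stars": 2},
]

def cheap(t):
    """Ranking predicate: score in [0, 1], higher for cheaper hotels."""
    return max(0.0, 1.0 - t["price"] / 200.0)

def quality(t):
    """Ranking predicate: score in [0, 1], higher for more stars."""
    return t["stars"] / 5.0

def affordable(t):
    """Boolean predicate: decides membership only, never order."""
    return t["price"] < 100

def top_k(tuples, bool_preds, rank_preds, k):
    # Filtering: a tuple is a member only if every Boolean predicate holds.
    members = [t for t in tuples if all(c(t) for c in bool_preds)]
    # Ranking: order the members by a monotonic scoring function (a sum here).
    def score(t):
        return sum(p(t) for p in rank_preds)
    return heapq.nlargest(k, members, key=score)

result = top_k(hotels, [affordable], [cheap, quality], k=2)
print([t["name"] for t in result])
```

Note that this naive version scores every member before returning anything, which is the materialize-then-sort behavior the later slides argue against.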

8 Ranking as First-Class Construct
• Support for ranking as a first-class construct in RDBMSs is lacking.
• Relational algebra models Boolean filtering as a first-class construct in query processing.
• c1 is a selection over R, and c2 is a join condition over R × S.

9 Filtering as a First-Class Construct
• The algebra framework supports the following for Boolean filtering:
  - splitting
  - interleaving
• These enable query optimization to transform a canonical form into efficient query plans.

10 Ranking as First-Class Construct
• Algebraic support for optimization is lacking for ranking.
• The sorting operator is ‘monolithic’.
• It may be beneficial to evaluate ranking predicates one by one and interleave them with Boolean filtering.

11 Challenges
• First, we must extend relational algebra to:
  - handle ranking
  - define algebraic laws for equivalence transformations
• Second, we need to generalize query optimization techniques to integrate the parallel dimensions of Boolean filtering and ranking.

12 Rank-Relational Algebra
• A rank-relation is a relation with its tuples scored and ordered accordingly.
• How do we rank a relation, given

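13 Ranking principle
• Given a scoring function $F(p_1, \ldots, p_n)$ and a set $P$ of evaluated predicates, the maximum possible score of a tuple $t$, denoted by $\overline{F}_P[t]$, is defined as:
  $\overline{F}_P(p_1, \ldots, p_n)[t] = F(p_1, \ldots, p_n)$, where $p_i = p_i[t]$ if $p_i \in P$, and $p_i = 1$ otherwise.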
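A small sketch may help make the ranking principle concrete. It assumes predicate scores lie in [0, 1], so an unevaluated predicate is replaced by its best case, 1, and it assumes the scoring function is a sum; the helper name `max_possible_score`, the tuple `t` and its scores are hypothetical.

```python
def max_possible_score(t, F, predicates, evaluated):
    """Upper bound on t's final score when only `evaluated` predicates are known.

    predicates: dict mapping predicate name -> function(tuple) -> score in [0, 1]
    evaluated:  set of predicate names already computed for t
    """
    args = [
        p(t) if name in evaluated else 1.0  # unevaluated: assume the best case
        for name, p in predicates.items()
    ]
    return F(args)

# Hypothetical tuple whose predicate scores are stored as attributes.
t = {"p1": 0.7, "p2": 0.5, "p3": 0.9}
predicates = {name: (lambda tup, n=name: tup[n]) for name in ("p1", "p2", "p3")}

bounds = [
    max_possible_score(t, sum, predicates, evaluated)
    for evaluated in (set(), {"p1"}, {"p1", "p2"}, {"p1", "p2", "p3"})
]
print(bounds)
```

Because the scoring function is monotonic, the bound can only stay the same or drop as more predicates are evaluated, and it equals the true score once all of them are.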

14 Examples of Rank-Relations

15 Operators
• Relational-algebra operators need to be extended to manipulate rank-relations.
• To support ranking as a first-class construct, define a new operator ‘μ’.
• This new ‘rank’ operator should satisfy two requirements: splitting and interleaving.

16 New Operator, μ
• Extend relational algebra by adding a new rank operator, μ. What does μ mean?
• Extend the original semantics of existing operators with rank-awareness, enabling interaction with the new rank operator.
• Extend relational algebra so that it provides several equivalences relevant to ranking.
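One way to read the rank operator is as "evaluate one more ranking predicate and reorder by the resulting upper bounds". The sketch below models only that semantics, under stated assumptions: a rank-relation is represented as a pair (ordered tuples, set of evaluated predicates), the scoring function is a sum, unevaluated predicates count as 1, and the relation `R` and helper names are hypothetical. It then checks one equivalence of the kind the slide mentions: applying μ predicate by predicate, in either order, yields the same result as a single sort on the full score.

```python
# Hypothetical relation; each tuple stores its predicate scores directly.
R = [
    {"id": "t1", "p1": 0.9, "p2": 0.2},
    {"id": "t2", "p1": 0.6, "p2": 0.9},
    {"id": "t3", "p1": 0.4, "p2": 0.5},
]
ALL = ("p1", "p2")

def bound(t, evaluated):
    """Max possible score with F = sum; unevaluated predicates count as 1."""
    return sum(t[p] if p in evaluated else 1.0 for p in ALL)

def as_rank_relation(tuples, evaluated):
    """Order tuples by their bound under `evaluated` (ties broken by id)."""
    ordered = sorted(tuples, key=lambda t: (-bound(t, evaluated), t["id"]))
    return ordered, frozenset(evaluated)

def mu(rank_relation, p):
    """Rank operator (semantic sketch): evaluate one more predicate, reorder."""
    tuples, evaluated = rank_relation
    return as_rank_relation(tuples, evaluated | {p})

base = as_rank_relation(R, set())
one_shot = as_rank_relation(R, {"p1", "p2"})  # monolithic sort on the full score
split_a = mu(mu(base, "p1"), "p2")            # split: p1 first, then p2
split_b = mu(mu(base, "p2"), "p1")            # split: p2 first, then p1
print([t["id"] for t in split_a[0]])
```

This version re-sorts the whole input at each step, so it shows what the operator computes rather than how an incremental implementation would compute it.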

17 Results of Operators

18 Ranking Query Plans: Execution Model
• Extend the common execution model to handle rank queries.
• Operators incrementally output rank-relations.
• A query has an explicitly requested result size.
• The key capability of a rank-aware operator is deciding whether it has obtained enough information from its input tuples to incrementally produce the next ranked output tuple.
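The "enough information" test can be sketched with Python generators. The sketch assumes a two-predicate query scored by p1 + p2, an input stream already ordered by p1 (so each input tuple arrives with upper bound p1 + 1), and an operator `mu_p2` that evaluates p2. The data, the function names and the `stats` counter are hypothetical. A buffered tuple is emitted as soon as its full score is at least the upper bound of the most recently read input tuple, because no later input can beat it.

```python
import heapq
from itertools import islice

# Hypothetical tuples with precomputed predicate scores in [0, 1].
DATA = [
    {"id": "a", "p1": 0.9, "p2": 0.8},
    {"id": "b", "p1": 0.8, "p2": 0.3},
    {"id": "c", "p1": 0.6, "p2": 0.9},
    {"id": "d", "p1": 0.4, "p2": 0.4},
    {"id": "e", "p1": 0.2, "p2": 0.1},
]

def rank_scan(tuples, stats):
    """Emit (bound, tuple) in descending p1 order; p2 is still unevaluated.

    stats["reads"] counts how many tuples the consumer actually pulled.
    """
    for t in sorted(tuples, key=lambda t: -t["p1"]):
        stats["reads"] += 1
        yield t["p1"] + 1.0, t

def mu_p2(stream):
    """Evaluate p2 and emit tuples in descending order of p1 + p2."""
    heap = []  # max-heap emulated with negated scores
    for seq, (old_bound, t) in enumerate(stream):
        # Every tuple from here on (including t) scores at most old_bound,
        # so buffered tuples at or above it are safe to output now.
        while heap and -heap[0][0] >= old_bound:
            neg, _, out = heapq.heappop(heap)
            yield -neg, out
        heapq.heappush(heap, (-(t["p1"] + t["p2"]), seq, t))
    while heap:  # input exhausted: flush the buffer in score order
        neg, _, out = heapq.heappop(heap)
        yield -neg, out

stats = {"reads": 0}
top2 = [t["id"] for _, t in islice(mu_p2(rank_scan(DATA, stats)), 2)]
print(top2, stats["reads"])
```

Since the consumer asks for only two results, the operator stops pulling input once the second one is safe to emit; the explicitly requested result size is what lets the plan avoid reading the whole input.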

19 Example

20 Conclusion
• RankSQL provides a systematic framework for efficient evaluation of top-k queries in RDBMSs.
• It extends relational algebra to make ranking a first-class construct.
• The query execution model is extended to handle ranking queries.
• Rank-aware operators are selective and context-sensitive.