CSE 6392 – Data Exploration and Analysis in Relational Databases April 20, 2006.

Ranking Using Materialized Views
View: the results of a query.
Materialized view: persistent (stored) results.
Two problems need to be solved:
1. Which views should be materialized?
2. Given a query, how do you best use the materialized views?

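Ranking Query
Input: a ranking function $f = w_1 x_1 + w_2 x_2 + \dots + w_m x_m$ and $k$ (the number of tuples).
Output: the top-k tuples.
Possible ranking algorithms:
- scan: only uses the base table
- TA: uses “views” for sorted lists
[Figure: base table with rows $t_1, \dots, t_n$ and columns $x_1, \dots, x_m$]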
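The scan option can be sketched in a few lines of Python. This is only an illustration, not the course's implementation: the table contents, the weights, and the names `score` and `top_k_scan` are all made up for the example.

```python
import heapq
from typing import List, Sequence, Tuple


def score(weights: Sequence[float], row: Sequence[float]) -> float:
    """Linear ranking function f = w_1*x_1 + ... + w_m*x_m."""
    return sum(w * x for w, x in zip(weights, row))


def top_k_scan(table: Sequence[Sequence[float]],
               weights: Sequence[float],
               k: int) -> List[Tuple[int, float]]:
    """Score every tuple of the base table once and keep the k best.

    Returns (tuple_id, score) pairs, highest score first.
    """
    scored = ((tid, score(weights, row)) for tid, row in enumerate(table))
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


# Hypothetical base table: n = 4 tuples (rows), m = 3 attributes (columns).
table = [
    (9, 1, 3),  # t0
    (2, 8, 5),  # t1
    (6, 6, 6),  # t2
    (1, 2, 9),  # t3
]

# Hypothetical query: f = 2*x1 + 4*x2 + 1*x3, k = 2.
result = top_k_scan(table, weights=(2, 4, 1), k=2)
print(result)
```

The scan always reads all n tuples, which is the cost that TA tries to avoid by reading sorted lists and stopping early.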

Ranking Query – Materialized Views
This new (not yet published) work tackles the problem of answering ranking queries from materialized views rather than from the traditional “skinny” tables.
Assume that we already have a set of materialized views, each corresponding to a ranking query. For example, the sorted top-k tuples are stored for the following functions:
$Q_1$: $3x_1 + 2x_2 + 5x_3$
$Q_2$: $2x_1 + 3x_2$
$Q_3$: $2x_2 + 4x_3$
If a new query matches one of these functions, it can be answered from the corresponding materialized view.

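Ranking Query – An Early Idea
However, suppose we get the following query, which matches none of the views:
$Q$: $2x_1 + 4x_2 + x_3$
How do we solve this?
An early idea, by example: the query $Q$: $2x_1 + 5x_2 + 4x_3$ equals $Q_2 + Q_3$, since $(2x_1 + 3x_2) + (2x_2 + 4x_3) = 2x_1 + 5x_2 + 4x_3$, so we could run the TA algorithm on the views for $Q_2$ and $Q_3$.
The current approach uses linear programming instead.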
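The early idea can be sketched as the threshold algorithm (TA) with sum aggregation over the two views. The sketch below is a hedged illustration only: it assumes each view stores every tuple sorted by its view score (not just a top-k prefix) and supports random access by tuple id, and the table data and the names `build_view` and `threshold_algorithm` are hypothetical.

```python
import heapq
from typing import Dict, List, Sequence, Tuple

View = List[Tuple[int, float]]  # (tuple_id, view score), sorted descending


def build_view(table: Sequence[Sequence[float]],
               weights: Sequence[float]) -> View:
    """Materialize a ranking view: every tuple id with its score, best first."""
    rows = [(tid, sum(w * x for w, x in zip(weights, row)))
            for tid, row in enumerate(table)]
    return sorted(rows, key=lambda pair: pair[1], reverse=True)


def threshold_algorithm(views: Sequence[View],
                        k: int) -> Tuple[List[Tuple[int, float]], int]:
    """TA where a tuple's overall score is the SUM of its view scores.

    Returns the top-k (tuple_id, score) pairs and the depth reached
    in the sorted lists when the algorithm stopped.
    """
    lookup: List[Dict[int, float]] = [dict(v) for v in views]  # random access
    seen: Dict[int, float] = {}
    depth = 0
    while depth < len(views[0]):
        threshold = 0.0
        for view in views:
            tid, view_score = view[depth]      # sorted access
            threshold += view_score            # best score an unseen tuple could have
            if tid not in seen:
                seen[tid] = sum(d[tid] for d in lookup)
        depth += 1
        best = heapq.nlargest(k, seen.items(), key=lambda pair: pair[1])
        if len(best) == k and best[-1][1] >= threshold:
            return best, depth                 # early stop
    return heapq.nlargest(k, seen.items(), key=lambda pair: pair[1]), depth


# Hypothetical base table with attributes (x1, x2, x3).
table = [(9, 1, 3), (2, 8, 5), (6, 6, 6), (1, 2, 9)]

q2 = build_view(table, (2, 3, 0))  # Q2 = 2*x1 + 3*x2
q3 = build_view(table, (0, 2, 4))  # Q3 = 2*x2 + 4*x3

# Q = 2*x1 + 5*x2 + 4*x3 = Q2 + Q3, with k = 1.
top, depth = threshold_algorithm([q2, q3], k=1)
print(top, depth)
```

On this data TA stops after reading two of the four entries in each view, and the answer agrees with scoring every tuple directly with $2x_1 + 5x_2 + 4x_3$. The trick only works because the query is an exact sum of stored views, which is why it does not cover a query such as $2x_1 + 4x_2 + x_3$.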

Ranking Query – Current Solution
Geometric background. Suppose you have the following: $Q_1$: $2x_1 + 4x_2 + x_3$, and $k = 1$ (top tuple).
[Figure: X–Y plane showing the point (3,2), an iso-score line, and a line perpendicular to it]
Every point on an iso-score line has the same score.
The highest score is the best.
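The geometric picture can be made concrete with a short worked derivation. It assumes a two-attribute scoring function with a generic weight vector $w = (w_1, w_2)$, matching the two-dimensional figure; the symbols $L_s$, $p$, $q$, and $t$ are introduced here for illustration and are not in the notes.

```latex
% Iso-score line for score s:
\[
  L_s = \{\, (x, y) : w_1 x + w_2 y = s \,\}
\]
% Any two points p, q on L_s have the same score, so
\[
  w \cdot (p - q) = s - s = 0 ,
\]
% i.e. the weight vector w is perpendicular to every iso-score line.
% Moving a distance t from p along the unit vector w / \lVert w \rVert gives
\[
  w \cdot \Bigl( p + t \, \frac{w}{\lVert w \rVert} \Bigr) = s + t \, \lVert w \rVert ,
\]
% so the score grows exactly in the perpendicular direction.
```

Under these assumptions, the top tuple for $k = 1$ is the first point met when an iso-score line is swept inward from high scores along the perpendicular direction.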

Ranking Query – How Does This Actually Work?
In the original TA algorithm, the advantage is the stopping condition.
In this approach, the stopping condition is reached when the linear programming solution drops below the threshold.
This paper is not published yet.
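One plausible way to write the linear program behind this stopping condition is sketched below for the query $2x_1 + 4x_2 + x_3$ and the three views listed earlier. This is an interpretation, not the paper's stated formulation: it assumes attribute values are normalized to $[0, 1]$, and $s_1, s_2, s_3$ (the score of the last tuple read so far from each view) and $T$ are symbols introduced here.

```latex
\[
\begin{aligned}
T \;=\; \max_{x_1, x_2, x_3} \quad & 2x_1 + 4x_2 + x_3 \\
\text{subject to} \quad
  & 3x_1 + 2x_2 + 5x_3 \le s_1 && \text{(view } Q_1\text{)} \\
  & 2x_1 + 3x_2 \le s_2        && \text{(view } Q_2\text{)} \\
  & 2x_2 + 4x_3 \le s_3        && \text{(view } Q_3\text{)} \\
  & 0 \le x_i \le 1, \quad i = 1, 2, 3 .
\end{aligned}
\]
```

Under this reading, $T$ is an upper bound on the query score of any tuple not yet seen in the views. As the views are read deeper, each $s_j$ shrinks and so does $T$, and the scan can stop once $T$ is no larger than the k-th best score found so far.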

Summary of Ranking
1) Fast execution of ranking queries/functions
   - scan, TA, LPTA
   - inverted lists
2) Ranking functions in IR
   - vector space / TF-IDF
   - probabilistic
3) Ranking on the web
   - PageRank
   - HITS
4) Ranking in databases
   - keyword search (DBXplorer, DISCOVER, BANKS)
   - probabilistic information retrieval