Ranking Objects Based on Relationships: Computing Top-K over Aggregation. SIGMOD 2006, Kaushik Chakrabarti et al.


1 Ranking Objects Based on Relationships: Computing Top-K over Aggregation. SIGMOD 2006, Kaushik Chakrabarti et al.

2 Outline: Motivation; The framework and problem definition; Proposed solution; Discussion; Experiments

3 Warm-up discussion. We basically know how web search engines work: –Web crawlers collect web-page information, which is then indexed and ranked. How do we define search in a relational database? –Free-style search vs. SQL + predicates? –What is the expected outcome? –How do we rank results?

4 Motivation: Searching over a relational database –Information is scattered across different relations

5 Motivation: Full-text search and aggregation are already supported by RDBMSs –What else do we need in order to search well?

6 Related work: Information retrieval (full-text search). Research in text databases. Exploring a database via foreign key-primary key links: –DBXplorer (ICDE 2002) –BANKS (ICDE 2002) –DISCOVER (VLDB 2002). What is the related work missing? –It does not handle target objects that do not themselves contain the keywords –It lacks a scoring function for query results –It does not use aggregates to put together search results for multiple keywords

7 Contributions: Introduce an interesting problem domain. Define "Object Finder" (OF) queries. Propose scoring functions. Propose a solution for processing OF queries: –Returns the top-K ranked results –Efficient, thanks to an early-termination property

8 System Overview

9 Scoring functions: Scoring matrices and their row and column marginals

10 Scoring semantics. All query keywords present in each document: –can be too restrictive. All query keywords present in the set of related documents: –cannot use MIN as the row-marginal scoring function. Pseudo-document approach: –enlarges the search space
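The difference between the first two semantics can be illustrated with a small Python sketch. It assumes a scoring matrix whose rows are the documents related to one target object and whose columns are the query keywords; the function names, the sample scores, and the choice of SUM and MIN as combiners are illustrative assumptions, not the paper's definitions.

```python
from typing import Callable, List, Sequence

# Hypothetical scoring matrix for ONE target object:
# rows = documents related to the object, columns = query keywords.
Matrix = List[List[float]]
Agg = Callable[[Sequence[float]], float]


def row_marginal_score(m: Matrix, combine_keywords: Agg, aggregate_docs: Agg) -> float:
    """Combine keyword scores inside each document first, then aggregate over documents."""
    return aggregate_docs([combine_keywords(row) for row in m])


def column_marginal_score(m: Matrix, aggregate_docs: Agg, combine_keywords: Agg) -> float:
    """Aggregate each keyword over all documents first, then combine the keywords."""
    columns = list(zip(*m))
    return combine_keywords([aggregate_docs(col) for col in columns])


matrix = [
    [0.8, 0.0],  # document 1 mentions only keyword 1
    [0.0, 0.6],  # document 2 mentions only keyword 2
]

# MIN per row demands every keyword in every document, so the object scores 0.
row_first = row_marginal_score(matrix, min, sum)
# Summing per keyword first lets the keywords be spread over the related documents.
col_first = column_marginal_score(matrix, sum, min)
print(row_first, col_first)
```

With MIN applied per row, an object whose keywords are spread over several related documents gets no credit, which is one way to read the slide's remark that MIN cannot serve as the row-marginal function under the second semantics.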

11 Problem definition Object finder problem:

12 Process an OF query as a top-K query. A top-K query incorporates ranking. Results are totally ordered if we compute a strong top-K. A good algorithm can use early termination to avoid processing results that are not in the top-K.

13 Top-K query processing. General framework: Supporting Ad-hoc Ranking Aggregates, SIGMOD 2006 (presented in May)
    SELECT ga_1, ..., ga_n, F    -- groups
    FROM R1, ..., Rh             -- source relations
    WHERE c1 AND ... AND cl      -- join conditions
    GROUP BY ga_1, ..., ga_n     -- group definition
    ORDER BY F                   -- ordering function (an aggregate)
    LIMIT k                      -- top-k setting

14 Top-K query processing. For an OF query, the template becomes:
    SELECT TOId, TOValue, score(TOId)
    FROM TargetTable T, R, L1, ..., LN
    WHERE R.TOId = T.TOId AND R.DocId = Li.DocId (i = 1..N)
    GROUP BY TOId, TOValue
    ORDER BY score(TOId)
    LIMIT k
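A minimal runnable sketch of this query shape, using Python's built-in sqlite3 module with N = 2 keyword lists. The column layout of L1 and L2, the sample rows, and the choice of SUM(L1.Score + L2.Score) as score(TOId) are assumptions made for illustration; the slides do not define score. The ordering is written as DESC so that the highest-scoring object comes first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TargetTable (TOId INTEGER PRIMARY KEY, TOValue TEXT);
CREATE TABLE R  (TOId INTEGER, DocId INTEGER);   -- object-to-document relationships
CREATE TABLE L1 (DocId INTEGER, Score REAL);     -- documents matching keyword 1
CREATE TABLE L2 (DocId INTEGER, Score REAL);     -- documents matching keyword 2
INSERT INTO TargetTable VALUES (1, 'object A'), (2, 'object B');
INSERT INTO R  VALUES (1, 10), (1, 11), (2, 12);
INSERT INTO L1 VALUES (10, 0.9), (11, 0.4), (12, 0.5);
INSERT INTO L2 VALUES (10, 0.7), (11, 0.2), (12, 0.3);
""")

# score(TOId) is assumed here to be the sum of keyword scores over related documents.
rows = conn.execute("""
SELECT T.TOId, T.TOValue, SUM(L1.Score + L2.Score) AS score
FROM TargetTable T
JOIN R  ON R.TOId = T.TOId
JOIN L1 ON L1.DocId = R.DocId
JOIN L2 ON L2.DocId = R.DocId
GROUP BY T.TOId, T.TOValue
ORDER BY score DESC
LIMIT 1
""").fetchall()
print(rows)
```

Because every Li is inner-joined on DocId, this plain SQL form only counts documents that match all keywords, and it computes the score of every group before applying LIMIT.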

15 My work is done (please try to recall my last talk)

16 Algorithm: Generate-Prune. Phase I: Compute top-K candidates

17 Algorithm Overview

18 Algorithm, Phase II: Compute the exact top-K
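The two phases can be sketched in Python as follows. This is a generic bound-based illustration of the idea (generate candidates with early termination, then compute exact scores), not the algorithm from the paper: it assumes one list per keyword holding non-negative per-object scores sorted in descending order, a SUM aggregate, and hypothetical function names.

```python
from typing import Dict, List, Set, Tuple

ScoreList = List[Tuple[str, float]]  # (object id, score), sorted by score descending


def generate_candidates(lists: List[ScoreList], k: int) -> Set[str]:
    """Phase I (sketch): read the lists in parallel and stop as soon as
    no unseen object can still reach the k-th best lower bound."""
    seen: Dict[str, Dict[int, float]] = {}
    last = [lst[0][1] if lst else 0.0 for lst in lists]  # last score read per list
    max_depth = max((len(lst) for lst in lists), default=0)
    for depth in range(max_depth):
        for i, lst in enumerate(lists):
            if depth < len(lst):
                obj, score = lst[depth]
                seen.setdefault(obj, {})[i] = score
                last[i] = score
            else:
                last[i] = 0.0  # list exhausted: unseen objects get nothing from it
        lower = {o: sum(p.values()) for o, p in seen.items()}
        if len(lower) >= k:
            kth = sorted(lower.values(), reverse=True)[k - 1]
            if sum(last) <= kth:  # early termination
                break
    lower = {o: sum(p.values()) for o, p in seen.items()}
    if len(lower) < k:
        return set(lower)
    kth = sorted(lower.values(), reverse=True)[k - 1]
    # Upper bound: seen scores plus the last-read score of every list not yet matched.
    upper = {
        o: lower[o] + sum(last[i] for i in range(len(lists)) if i not in p)
        for o, p in seen.items()
    }
    return {o for o in seen if upper[o] >= kth}


def exact_top_k(lists: List[ScoreList], candidates: Set[str], k: int) -> List[Tuple[str, float]]:
    """Phase II (sketch): compute exact scores for the candidates only."""
    lookup = [dict(lst) for lst in lists]
    exact = {o: sum(d.get(o, 0.0) for d in lookup) for o in candidates}
    return sorted(exact.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


l1 = [("a", 0.9), ("b", 0.5), ("c", 0.1), ("d", 0.05)]
l2 = [("b", 0.7), ("a", 0.6), ("d", 0.2), ("c", 0.1)]
candidates = generate_candidates([l1, l2], k=1)
top = exact_top_k([l1, l2], candidates, k=1)
print(candidates, top)
```

In this example Phase I stops after reading two entries from each list, so objects "c" and "d" are never touched; that saving is what early termination is meant to deliver.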

19 Discussion. In this work: –Choice of aggregation function –Ranking functions in general –What do you think of this work? Not limited to this work: –Impact of a more complicated schema –Impact of query selectivity

20 Experimental results: Faster than SQL; faster than Generate-Only; robust to the number of keywords and selections; intuitive results

21 Experiments Faster than SQL

22 Experiments Faster than Generate-Only

23 Experiments Robust to # of keywords and selections

24 Thank you Questions to discuss?

