Efficient Type-Ahead Search on Relational Data: a TASTIER Approach Guoliang Li 1, Shengyue Ji 2, Chen Li 2, Jianhua Feng 1 1 Tsinghua University, Beijing,


1 Efficient Type-Ahead Search on Relational Data: a TASTIER Approach
Guoliang Li (1), Shengyue Ji (2), Chen Li (2), Jianhua Feng (1)
(1) Tsinghua University, Beijing, China
(2) University of California, Irvine, CA, USA

2 Tsinghua & UC Irvine | Efficient Type-Ahead Search on Relational Data | Guoliang Li, Shengyue Ji, Chen Li, Jianhua Feng
Traditional Keyword Search: users MUST type in complete keywords

3 Type-Ahead Search
Advantages:
- Interactive: data exploration in relational databases
- Full-text search: full-text search on-the-fly

4 Challenges and Preliminaries
- Efficiency requirement (milliseconds vs. seconds)
  - Client-side processing
  - Network delay
  - Server-side processing
- Opportunities:
  - Subsequent queries can be answered incrementally

5 Fundamentals
- Data
  - R: a relational database with a set of tables
  - D: a set of distinct words tokenized from the data in R

6 Fundamentals
- Query
  - Q = {p_1, p_2, …, p_l}: a set of prefixes
- Query result
  - R_Q: a set of subtrees (called Steiner trees), each of which contains all query prefixes; i.e., each answer is a set of relevant tuples connected through foreign keys that together contain every query prefix (conjunctive semantics)
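The conjunctive condition on this slide can be illustrated with a small check. This is only a sketch: the function name `contains_all_prefixes` and the sample words are assumptions made for illustration, and it tests a single candidate answer rather than enumerating Steiner trees.

```python
def contains_all_prefixes(answer_words, prefixes):
    """Return True if every query prefix is a prefix of some word in the answer.

    `answer_words` stands for the words found in the tuples of one candidate
    answer (hypothetical input; how answers are enumerated is not shown here).
    """
    return all(any(w.startswith(p) for w in answer_words) for p in prefixes)


# Hypothetical answer: words from tuples joined through foreign keys.
answer = {"database", "search", "sigmod"}
print(contains_all_prefixes(answer, ["database", "search", "sig"]))  # True
print(contains_all_prefixes(answer, ["database", "xml"]))            # False
```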

7 Traditional Keyword Search
- Data graph (figure) with keywords: database, search, sigmod, sigir, signature
- Query: {database search sigmod}
- Answers: Steiner trees (radius ≤ r); the figure highlights nodes a2, a3, a5

8 Type-Ahead Search
- Data graph (figure) with keywords: database, search, sigmod, sigir, signature
- Query: {database search sig}
- Answers: Steiner trees (radius ≤ r); the figure highlights nodes a2, a3, a5

9 Type-Ahead Search in Relational Data
- Step 1
  - Incremental prefix matching
- Step 2
  - Incrementally find relevant connected tuples that contain query prefixes
- Contributions
  - Efficiently finding answers using the δ-step forward index
  - Improving search efficiency
    - graph partition
    - query prediction

10 Step 1: Incremental Prefix Matching
- Example
  - D = {sigmod, search, spark, yu, graph}
  - Q = “graph s”: W_s = {sigmod, search, spark}
  - Q’ = “graph sig”: W_sig = {sigmod}
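The example above can be reproduced with a small trie. The sketch below is an illustration under assumptions, not the authors' implementation: the names `TrieNode`, `build_trie`, `extend`, and `complete_words` are hypothetical. The point it shows is that when the user extends "s" to "sig", matching resumes from the cached trie node for "s" instead of restarting at the root.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # character -> TrieNode
        self.is_word = False  # True if a complete word ends at this node


def build_trie(words):
    root = TrieNode()
    for w in words:
        node = root
        for ch in w:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True
    return root


def extend(node, suffix):
    """Walk from a previously reached node along the newly typed characters."""
    for ch in suffix:
        if node is None:
            return None
        node = node.children.get(ch)
    return node


def complete_words(node, prefix):
    """Collect all complete words below `node`, in sorted order."""
    if node is None:
        return []
    out = [prefix] if node.is_word else []
    for ch in sorted(node.children):
        out.extend(complete_words(node.children[ch], prefix + ch))
    return out


D = ["sigmod", "search", "spark", "yu", "graph"]
root = build_trie(D)

node_s = extend(root, "s")              # user has typed "graph s"
print(complete_words(node_s, "s"))      # ['search', 'sigmod', 'spark']

node_sig = extend(node_s, "ig")         # user continues to "graph sig"
print(complete_words(node_sig, "sig"))  # ['sigmod']
```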

11 Trie Index and Graph (figure)

12 Incremental Prefix Matching
- Words: sigmod, search, spark, yu, graph
- Figure: for the keyword graph and the prefix s, the matching words are search, sigmod, spark

13 Step 2: Finding answers
- Keywords (figure): graph, yu
- How to efficiently find answers?

14 Contributions
- Step 1
  - Incremental prefix matching
- Step 2
  - Efficiently finding answers using the δ-step forward index
  - Improving search efficiency
    - graph partition
    - query prediction

15 δ-step forward index (figure labels: Graph, Yu, Search)

16 Finding answers using δ-step forward index (figure labels: Yu, s)

17 Finding answers using δ-step forward index (figure labels: p, Yu, s)
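The slides show the forward index only as figures, so the following is a rough sketch under stated assumptions rather than the authors' data structure. It assumes the index maps each node to the set of words found on nodes reachable within a fixed number of hops (`delta`), and that a node is a candidate answer root if that set covers every query prefix. The graph, the node names `a1`..`a4`, and the function names are all hypothetical.

```python
from collections import deque


def build_forward_index(graph, node_words, delta):
    """For each node, collect the words on nodes reachable within `delta` hops.

    `graph` is an adjacency dict; `node_words` maps a node to its word set.
    """
    index = {}
    for start in graph:
        seen = {start}
        frontier = deque([(start, 0)])
        words = set()
        while frontier:
            node, dist = frontier.popleft()
            words |= node_words.get(node, set())
            if dist < delta:
                for nb in graph[node]:
                    if nb not in seen:
                        seen.add(nb)
                        frontier.append((nb, dist + 1))
        index[start] = words
    return index


def candidate_roots(index, prefixes):
    """Nodes whose forward list contains a completion of every prefix."""
    return sorted(
        n for n, words in index.items()
        if all(any(w.startswith(p) for w in words) for p in prefixes)
    )


# Hypothetical chain a1 - a2 - a3 - a4 (undirected, stored as adjacency lists).
graph = {"a1": ["a2"], "a2": ["a1", "a3"], "a3": ["a2", "a4"], "a4": ["a3"]}
node_words = {"a1": {"graph"}, "a2": {"yu"}, "a3": {"search"}, "a4": {"sigmod"}}

index = build_forward_index(graph, node_words, delta=1)
print(candidate_roots(index, ["yu", "s"]))  # ['a2', 'a3']
```

Because the check only consults the precomputed word sets, nodes that cannot reach a match for some prefix are discarded without traversing the graph at query time.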

18 Contributions
- Step 1
  - Incremental prefix matching
- Step 2
  - Efficiently finding answers using the δ-step forward index
  - Improving search efficiency
    - graph partition
    - query prediction

19 Graph Partition
- Step 1
  - Find subgraphs that contain query prefixes
- Step 2
  - Find answers within subgraphs

20 Graph Partition
- Q = “Graph Yu”
- Step 1: find subgraphs S_2, S_3
- Step 2: find answers within S_2, S_3
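Step 1 above can be sketched as an intersection of keyword-to-subgraph lists. The lists below are invented so that the query “Graph Yu” yields S_2 and S_3 as on the slide; the function name `candidate_subgraphs`, the lowercase tokens, and the list contents are assumptions, not data from the paper.

```python
def candidate_subgraphs(keyword_to_subgraphs, completions_per_prefix):
    """Return ids of subgraphs containing a completion of every query prefix.

    `completions_per_prefix` holds, for each prefix, the set of complete
    words matching it (for example, taken from a trie lookup).
    """
    result = None
    for words in completions_per_prefix:
        ids = set()
        for w in words:
            ids |= keyword_to_subgraphs.get(w, set())  # union over completions
        result = ids if result is None else result & ids  # intersect over prefixes
    return result or set()


# Hypothetical keyword-to-subgraph lists.
lists = {
    "graph": {"S2", "S3"},
    "yu": {"S1", "S2", "S3"},
    "search": {"S1"},
}

# Q = "Graph Yu": each prefix here matches exactly one complete word.
print(sorted(candidate_subgraphs(lists, [{"graph"}, {"yu"}])))  # ['S2', 'S3']
```

Step 2 would then search for answers only inside the returned subgraphs, which is where the pruning benefit comes from.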

21 High-Quality Graph Partition

Keyword   Partition {S_1, S_2}   Partition {S_3, S_4}
A         S_1, S_2               S_3
B         S_1, S_2               S_4
C         S_1, S_2               S_3
D         S_1, S_2               S_4
E         S_1, S_2               S_3, S_4
F         S_1, S_2               S_3, S_4

Advantages:
1. Shorter lists
2. Subgraph pruning

22 Keyword-Sensitive Partition
- Graph → Hypergraph
  - G(V, E) → G_h(V_h, E_h)
  - V_h = V
  - if (u, v) ∈ E, then (u, v) ∈ E_h
  - if u_1, u_2, …, u_n contain the same keyword, then (u_1, u_2, …, u_n) ∈ E_h
- Hypergraph partition
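The construction rules above translate directly into code. The sketch below is a minimal illustration with hypothetical names and inputs; it only builds the hypergraph and leaves the partitioning step to a hypergraph partitioner, which is not shown. Skipping keywords that occur on a single vertex is an assumption made here, since a one-vertex hyperedge cannot influence a partition.

```python
from collections import defaultdict


def build_hypergraph(vertices, edges, node_words):
    """Build (V_h, E_h) from G(V, E) and the words stored on each vertex."""
    # Rule 1: every original edge (u, v) becomes a hyperedge.
    hyperedges = [frozenset(e) for e in edges]

    # Rule 2: all vertices sharing a keyword form one hyperedge.
    by_word = defaultdict(set)
    for v in vertices:
        for w in node_words.get(v, ()):
            by_word[w].add(v)
    for w in sorted(by_word):
        group = by_word[w]
        if len(group) > 1:  # assumption: ignore keywords on a single vertex
            hyperedges.append(frozenset(group))

    return set(vertices), hyperedges


# Hypothetical input: a and c share the keyword "x" but are not adjacent.
V = ["a", "b", "c"]
E = [("a", "b")]
words = {"a": {"x"}, "b": {"y"}, "c": {"x"}}

V_h, E_h = build_hypergraph(V, E, words)
print([sorted(e) for e in E_h])  # [['a', 'b'], ['a', 'c']]
```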

23 Contributions
- Step 1
  - Incremental prefix matching
- Step 2
  - Efficiently finding answers using the δ-step forward index
  - Improving search efficiency
    - graph partition
    - query prediction

24 Query Prediction

25 Previous Method vs. Query Prediction
- Previous method
  - Find all potential complete words of the query prefixes and compute the corresponding answers, e.g., {sigmod, sigir, signature, …} for sig
- Query prediction
  - Predict the complete keywords with maximal probabilities and compute the corresponding answers using the predicted keywords, e.g., predict the 2 best keywords {sigmod, sigir} for sig

26 Query Prediction
- Query-prediction model
  - Bayesian network
  - Pr(k_i) = # of occurrences of k_i / # of nodes
  - Pr(k_i | k_j, k_n) = Pr(k_i | k_n)

27 Query Prediction
- Q = “keyword s” → keyword search
- Q = “keyword search r” → keyword search relation
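The prior Pr(k_i) = # of occurrences of k_i / # of nodes and the "predict the 2 best keywords" idea can be sketched as follows. This is a simplified illustration, not the Bayesian network from the slides: it ranks completions by the prior alone and ignores the conditional probabilities on previously typed keywords. Counting a keyword once per node, the function names, and the sample data are assumptions.

```python
from collections import Counter


def keyword_probabilities(node_words):
    """Pr(k) = number of nodes containing k / total number of nodes (assumed reading)."""
    n = len(node_words)
    counts = Counter(w for words in node_words.values() for w in set(words))
    return {w: c / n for w, c in counts.items()}


def predict(prefix, probs, k=2):
    """Return the k most probable complete keywords for `prefix`."""
    matches = [w for w in probs if w.startswith(prefix)]
    # Sort by descending probability, breaking ties alphabetically.
    return sorted(matches, key=lambda w: (-probs[w], w))[:k]


# Hypothetical nodes and their keywords.
node_words = {
    1: {"sigmod", "database"},
    2: {"sigmod"},
    3: {"sigir"},
    4: {"signature", "sigir"},
    5: {"sigmod"},
}

probs = keyword_probabilities(node_words)
print(predict("sig", probs, k=2))  # ['sigmod', 'sigir']
```

Answers are then computed only for the predicted keywords instead of for every completion of the prefix.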

28 Experimental Results
- Setting
  - C++, GNU compiler, FastCGI
  - Ubuntu, X5450 3.0GHz CPU, 3GB RAM
- Datasets
  - DBLP
  - IMDB

29 Search Efficiency

30 Scalability: Index Size

31 Scalability: Search Time

32 Thank You! Questions?
http://tastier.ics.uci.edu/
http://tastier.cs.tsinghua.edu.cn/
Search: tastier type-ahead search

