AP-Tree: Efficiently Support Continuous Spatial-Keyword Queries Over Stream

Presentation transcript:

AP-Tree: Efficiently Support Continuous Spatial-Keyword Queries Over Stream
Xiang Wang 1*, Ying Zhang 2, Wenjie Zhang 1, Xuemin Lin 1, Wei Wang 1
1 The University of New South Wales, Australia
2 University of Technology, Sydney, Australia
* Presenter

School of Computer Science and Engineering

2. Introduction
More and more spatial-textual objects are generated in a streaming fashion:
– Rapid development of Web 2.0 and GPS-enabled mobile devices
– Ensuing proliferation of social applications, e.g., Twitter, Facebook, Foursquare
– Various applications: information dissemination, location-based advertising (e.g., Groupon)
[Figure labels: text, location]

3. Introduction
Spatial-keyword query against static objects (user-initiated model, snapshot query)
– Given a set of geo-textual objects, a spatial-keyword query retrieves all the objects that satisfy its spatial and textual constraints.
Continuous spatial-keyword queries against streaming objects (server-initiated model, continuous query)
– Users register their interests as continuous spatial-keyword queries on the server.
– Each incoming geo-textual object issued by a stream publisher is immediately delivered (pushed) to all the users who are interested in it.
– This is a publish/subscribe framework.
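The delivery rule described here can be sketched as a brute-force matcher that checks every registered query against each incoming object. This is only a baseline for illustration, not the index proposed in the talk. It assumes rectangular query regions and the containment semantics stated later under Observation 2 (a query matches when all of its keywords appear in the object); the names `Query`, `GeoTextObject`, `matches`, and `deliver` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    qid: str
    region: tuple        # assumed rectangle: (xmin, ymin, xmax, ymax)
    keywords: frozenset  # textual constraint


@dataclass(frozen=True)
class GeoTextObject:
    location: tuple      # (x, y)
    keywords: frozenset


def matches(q, o):
    """True if o lies inside q's region and contains every keyword of q."""
    xmin, ymin, xmax, ymax = q.region
    x, y = o.location
    inside = xmin <= x <= xmax and ymin <= y <= ymax
    return inside and q.keywords <= o.keywords


def deliver(queries, o):
    """Return the ids of all registered queries the object should be pushed to."""
    return [q.qid for q in queries if matches(q, o)]
```

Every object is compared with every query, so the cost grows linearly with the number of registered queries; the indexes discussed in the following slides aim to avoid exactly that scan.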

4. Example
A location-based e-coupon system
– Users interested in nearby sales register their interests in the system.
– E-coupons from nearby shopping malls are delivered to relevant users.
[Figure: users A, B, and C with spatial constraints (regions) and textual constraints (keywords). Keyword labels shown: "iPad, discount"; "surface, brand-new"; "nexus"; "surface, iPad, brand-new, discount"; "nexus, surface, discount, brand-new".]

5. State-of-the-arts
– R^t-Tree (Li et al., THU, KDD 2013)
– IQ-Tree (Chen et al., NTU, SIGMOD 2013)
[Figure labels: spatial index, keyword index]

6. R^t-Tree
An R-Tree is first built to partition the queries based on their search ranges. Then, in each R-Tree node, the keywords from its descendants are recorded as a textual filter.
[Figure: an R-Tree with nodes R1 to R6 over queries q1 to q9. Node keyword filters shown: {w1, w2, w3, w4, w5, w6}, {w1, w2, w3, w4}, {w1, w5, w6}.]

7. IQ-Tree
Queries are first partitioned by a Quad-tree, where each query is indexed into one or multiple cells based on a cost model. Then, in each cell, a ranked inverted list is built to further partition the queries, where each query is indexed into the posting list corresponding to its least frequent keyword.
[Figure: Quad-tree cells holding queries such as q1, q2, q4, q6, q7, q8; example posting lists in one cell: w2: q1, q2; w5: q4.]
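The per-cell ranked inverted list can be illustrated with a small sketch. It is not the IQ-Tree implementation: the Quad-tree and its cost model are left out, keyword frequency is assumed to be counted over the queries in the cell, ties are broken alphabetically, and every query is assumed to have at least one keyword. The function names are hypothetical.

```python
from collections import Counter, defaultdict


def build_posting_lists(cell_queries):
    """cell_queries maps a query id to its set of keywords (one Quad-tree cell).

    Each query is stored once, under its least frequent keyword.
    """
    # Assumption: frequency is measured over the queries registered in this cell.
    freq = Counter(w for kws in cell_queries.values() for w in kws)
    postings = defaultdict(list)
    for qid, kws in sorted(cell_queries.items()):
        # Ties are broken alphabetically so the result is deterministic.
        rarest = min(kws, key=lambda w: (freq[w], w))
        postings[rarest].append(qid)
    return dict(postings)


def candidates(postings, object_keywords):
    """Queries whose indexed keyword appears in the object.

    These are only candidates: the remaining keywords and the region
    still have to be verified.
    """
    return sorted(q for w in set(object_keywords) for q in postings.get(w, []))
```

Indexing a query under its rarest keyword keeps posting lists short, because an object has to contain that keyword before the query is even considered.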

8. Observation 1
The choice between spatial partition and textual partition should depend on the spatial and keyword distributions of the current query workload.
[Figure labels: prefer spatial partition, prefer keyword partition]
Our solution: employ spatial partition and keyword partition adaptively, based on the query distributions.

9. Observation 2
From the perspective of the textual constraint, this problem is by nature a superset containment search.
– Given a set of objects and a query, each consisting of a set of keywords, find all the objects that are fully contained by the query.
– In our case, the roles of object and query are reversed: for each incoming object, find all the registered queries whose keywords are fully contained by the object.
[Figure label: Ordered Keyword Trie]
Our solution: integrate a variant of the ordered keyword trie into our framework for textual filtering.
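A plain ordered keyword trie for this reversed containment search might look like the sketch below. The talk integrates a variant of the structure, so this is only the textbook idea: alphabetical order is assumed as the total order on keywords, and the class and method names are hypothetical.

```python
class TrieNode:
    def __init__(self):
        self.children = {}  # keyword -> TrieNode
        self.queries = []   # ids of queries whose keyword set ends here


class OrderedKeywordTrie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, qid, keywords):
        """Store the query along the path spelled by its sorted keywords."""
        node = self.root
        for w in sorted(keywords):
            node = node.children.setdefault(w, TrieNode())
        node.queries.append(qid)

    def match(self, object_keywords):
        """Return all queries whose keyword set is contained in the object."""
        ordered = sorted(set(object_keywords))
        found = []

        def visit(node, start):
            found.extend(node.queries)
            # Only keywords later in the order can extend the current path.
            for i in range(start, len(ordered)):
                child = node.children.get(ordered[i])
                if child is not None:
                    visit(child, i + 1)

        visit(self.root, 0)
        return sorted(found)
```

Because both the stored paths and the object keywords follow the same order, the traversal only descends into branches labelled by keywords the object actually contains.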

10. AP-Tree Framework
AP-Tree (i.e., Adaptive Partition Tree) is an f-ary tree structure that partitions the queries in a top-down manner.
Cost-model-based partition strategy
– Keyword partition: leads to a k-node
– Spatial partition: leads to an s-node
– Either partition splits the queries into f buckets
A q-node is built to store the remaining queries when the termination condition is reached.
[Figure labels: k-node, s-node, q-node]

11. Keyword Partition (k-node)
Consider the keyword combinations!
[Figure (f = 3): queries q1 to q9 over the keywords w1 to w5 are split by keyword cuts. Cuts shown at the first level: [w2, w3], [w1], [w4, w5]; cuts shown at the next level: [w2], [w3], [w4], [w5]. Query groups shown: {q1, q2, q4, q5}, {q3, q6, q7}, {q8, q9}, {q1, q2}, {q3, q7}, {q4}, {q5}, {q6}.]
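One level of keyword partition can be sketched as follows. Several parts are assumptions rather than content of the slide: queries are bucketed by their keyword at position `level` in alphabetical order, the cuts are equal-sized contiguous keyword ranges instead of cost-based ones, and queries with too few keywords are set aside in a separate list. The function name `keyword_partition` is hypothetical.

```python
def keyword_partition(queries, level, f):
    """Split queries into at most f buckets by their level-th ordered keyword.

    queries maps a query id to its set of keywords.
    Returns (cuts, buckets, leftover), where cuts[i] is the keyword range
    covered by buckets[i].
    """
    # The keyword each query is partitioned on at this level.
    key = {qid: sorted(kws)[level]
           for qid, kws in queries.items() if len(kws) > level}
    # Assumption: queries with no keyword left at this level are kept aside.
    leftover = sorted(qid for qid in queries if qid not in key)

    words = sorted(set(key.values()))
    if not words:
        return [], [], leftover

    # Placeholder for the cost-based choice: equal-sized contiguous ranges.
    size = -(-len(words) // f)  # ceiling division
    cuts = [words[i:i + size] for i in range(0, len(words), size)]
    bucket_of = {w: b for b, cut in enumerate(cuts) for w in cut}

    buckets = [[] for _ in cuts]
    for qid in sorted(key):
        buckets[bucket_of[key[qid]]].append(qid)
    return cuts, buckets, leftover
```

Applying the function again to a bucket with `level + 1` gives the next layer of cuts, which is how combinations of keywords end up being considered along a path.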

12. Spatial Partition (s-node)
[Figure (f = 4): queries q1 to q9 are split by a spatial grid, and a query may be assigned to several cells. Query groups shown: {q1, q3, q4, q5}, {q1, q2, q4, q9}, {q1, q4}, {q4, q5}, {q1, q2}, {q1}, {q2}, {q3}, {q4}, {q9}. One child is labelled as a dummy node.]
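One level of spatial partition can be sketched with a uniform grid. The slide only shows the figure, so the details here are assumptions: query regions are rectangles, a query is assigned to every cell it overlaps, and queries covering the whole node region are placed in the dummy list (one possible reading of the "dummy node" label). The function name `spatial_partition` is hypothetical, and the cost-based choice of grid lines is not modelled.

```python
def spatial_partition(queries, region, rows, cols):
    """Assign rectangular queries to the cells of a rows x cols grid.

    queries maps a query id to (xmin, ymin, xmax, ymax); region is the
    rectangle covered by the node. Returns (cells, dummy), where cells maps
    (row, col) to the list of query ids overlapping that cell.
    """
    rx0, ry0, rx1, ry1 = region
    cell_w = (rx1 - rx0) / cols
    cell_h = (ry1 - ry0) / rows
    cells = {(r, c): [] for r in range(rows) for c in range(cols)}
    dummy = []

    for qid, (x0, y0, x1, y1) in sorted(queries.items()):
        # Assumption: a query covering the whole node region goes to the dummy list.
        if x0 <= rx0 and y0 <= ry0 and x1 >= rx1 and y1 >= ry1:
            dummy.append(qid)
            continue
        for (r, c), bucket in cells.items():
            cx0 = rx0 + c * cell_w
            cy0 = ry0 + r * cell_h
            # Open-interval overlap test between the query and the cell.
            if x0 < cx0 + cell_w and x1 > cx0 and y0 < cy0 + cell_h and y1 > cy0:
                bucket.append(qid)
    return cells, dummy
```

With `rows = cols = 2` this gives f = 4 buckets, and a query spanning a grid line appears in more than one cell, as in the figure.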

13. Example of AP-Tree
[Figure: an AP-Tree over queries q1 to q9. The root is split by the keyword cuts [w2, w3], [w1], [w4, w5] into the groups {q1, q2, q4, q5}, {q3, q6, q7}, and {q8, q9}; lower levels use cuts such as [w2], [w3], [w5] and end in q-nodes holding {q1, q2}, q3, q4, q5, q6, q7, q8, q9.]

14. Object Matching At A Glance
[Figure: the example AP-Tree is probed with an incoming object carrying the keywords w2, w3, w4. Candidate: {q7}. After verification, answer: {q7}.]

15. Cost Model

16. Keyword Partition Algorithm
[Figure label: local min]

17. Spatial Partition Algorithm

18. Index Construction and Maintenance
Construction: recursively partition the queries, choosing between keyword partition and spatial partition adaptively according to the cost model.
– If the cost of keyword partition is smaller, build a k-node;
– Otherwise, build an s-node.
– When the termination condition is reached, stop partitioning and assign all the remaining queries to a q-node.
Maintenance
– Employ KL-Divergence to monitor the difference between the new distribution and the old distribution for each s-node and k-node, as new queries are inserted and old queries expire.
– When the KL-Divergence exceeds a threshold, reconstruct the sub-tree of that s-node or k-node.
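The maintenance check can be sketched as a comparison of two bucket-count histograms for a node. The details are assumptions: the direction KL(new || old), the additive smoothing constant that avoids division by zero, and the threshold value are all illustrative, and the function names are hypothetical.

```python
import math


def _normalise(counts, support, eps):
    """Turn raw counts into a smoothed probability distribution over support."""
    total = sum(counts.get(k, 0) for k in support) + eps * len(support)
    return {k: (counts.get(k, 0) + eps) / total for k in support}


def kl_divergence(new_counts, old_counts, eps=1e-9):
    """KL(new || old) between two histograms given as {bucket: count} dicts."""
    support = set(new_counts) | set(old_counts)
    p = _normalise(new_counts, support, eps)
    q = _normalise(old_counts, support, eps)
    return sum(p[k] * math.log(p[k] / q[k]) for k in support)


def needs_rebuild(new_counts, old_counts, threshold):
    """True if the distribution has drifted enough to rebuild the sub-tree."""
    return kl_divergence(new_counts, old_counts) > threshold
```

For example, a node built when its two buckets held 50 queries each and now holding 90 and 10 has a divergence of roughly 0.37, so a threshold of 0.1 would trigger a rebuild while a threshold of 0.5 would not.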

19. Experiments
Baselines
– R^t-Tree (KDD 2013): spatial-first
– IQ-Tree (SIGMOD 2013): spatial-first
– RQ-Tree: keyword-first
Datasets (TWEETS, GN, CARSAIS)
– # objects: 12.7 M, 2.2 M, 5.7 M
– Vocabulary size: 1.7 M, 208 K, 81 K
– Avg. # of object keywords
Parameters (with ranges)
– Avg. # of query keywords
– Avg. query range size (%)
– Scalability (M)

20. Experiments
Effect on different datasets

21. Experiments
Effect of number of query keywords

22. Experiments
Effect of size of query region

23. Conclusion
– Propose a novel adaptive spatial-keyword partition indexing structure, namely AP-Tree, to efficiently organize a massive number of queries.
– The construction of AP-Tree adapts to the spatial and keyword distributions under the guidance of a cost model.
– Achieve a significant improvement in throughput compared to prior techniques.

Thanks! Q&A