Download presentation
Presentation is loading. Please wait.
Published byPercival Morgan Modified over 8 years ago
1
Presented by: Siddhant Kulkarni Spring 2016
2
Authors: Publication: ICDE 2015 Type: Research Paper 2
3
3
4
Text Segmentation As per previous slide POS Tagging Part of Speech Tagging Context Assignment to words jazz is popular in this country he likes jazz more than country Concept Labelling Use of models to assign meaning read harry potter <- BOOK watch harry potter <- MOVIE 4
5
5 Build a graph -> Weigh the terms using acquired knowledge -> Maximum Spanning Tree for Segmentation
6
Pairwise model for POS Don’t JUST look at the adjacent terms 6
7
Presented by: Zohreh Raghebi Spring 2016:
8
Authors: Publication: ICDE 2015 Type: Research Paper 8
9
The explosion of a vast number of location aware services Predictive queries offer a fundamental type of location-based services based on users’ future locations Predictive range query “find all hotels that are located within two miles from a user’s anticipated location after 30 minutes“ Predictive KNN query “find the three taxis that are closest to a user’s location within the next 10 minutes“ Predictive aggregate query “find the number of cars expected to be around the stadium during the next 20 minutes“ 9
10
(1) Traffic management To predict areas with high traffic in the next half hour (2) Location aware advertising To distribute coupons and sales promotions to customers more likely to show up around a certain store during the sale time in the next hour (3) Routing services To take into consideration the predicted traffic on each road segment to find the shortest path of a user’s trip starting after 15 minutes from the present time 10
11
(4) Ride sharing systems To match the drivers that will pass by a rider’s location within few minutes (5) Store finders To recommend the closest restaurants to a user’s predicted destination in 15 minutes 11
12
To process predictive queries for moving objects on road networks efficiently Index structure, named the Predictive tree (P-tree) Proposed to precompute the predicted moving objects around each node in the underlying road network graph over time 12
13
(1) They consider an Euclidean space where objects can move freely in a two dimensional space Yet, practical predictive location-based services target moving objects on road networks (2) Many techniques using a massive amount of objects’ historical trajectories Historical data is not easily obtainable for many reasons Users’ privacy and data confidentiality Unavailability of historical data in rural areas (3) To support a specific query type only Predictive range query Predictive KNN query Predictive aggregate query 13
14
A novel data structure named Predictive tree (P-tree) We introduce a probability model that computes the likelihood of a node in the road network being a destination to a moving object The probability assignment is made Node’s position in the tree Travel time between the object and this node Number of the sibling nodes Provide an incremental approach to update the predictive tree As objects move around Propose the iRoad framework that leverages the introduced predictive tree To support a wide variety of predictive queries 14
15
At each node in the given road network We keep track of a list of objects predicted to appear in this node, along with their probabilities, and travel time When an object moves we incrementally update the predictive tree This update is reflected to the original nodes in the road network. When a predictive query is received We fetch the up-to-date answer from the node of interest Compile it according to the query type. 15
16
Presented by: Zohreh Raghebi Spring 2016:
17
Authors: Publication: ICDE 2015 Type: Research Paper 17
18
Bump hunting is a common approach to extracting insights from data Looking for regions of a dataset Where a property of interest occurs frequently. In this paper, we apply that approach on graphs A subset of the nodes exhibit a property of interest The goal is to find a connected subgraph (the “bump”) Query nodes appear more often compared to non-query nodes 18
19
We find such a subgraph by maximizing the linear discrepancy The possible weighted difference between the number of query and non-query nodes in the subgraph we consider the problem under a local-access model we assume that only query nodes are provided as input while all other nodes and edges can be discovered only via calls to a costly get- neighbors function from a previously discovered node For example, accessing the social graph of an online social network (e.g., Twitter) is based on such a function 19
20
When one submits a text query to Twitter’s search engine Twitter returns a list of messages that match the query, along with the author of each message. For example, when one submits “Ukraine,” Twitter returns all recent messages that contain that keyword, together with the users who posted them. We wish to perform the following task: Among all Twitter users who posted a relevant message, Find a subset of them that form a local cluster on Twitter’s social network. Our goal is to discover a community of users who talk about that topic Our input consists only of those users who have recently published a relevant message Note that we do not have the entire social graph We can only retrieve the social connections of a user by submitting a query to Twitter’s API 20
21
Consider an online library system that allows its users to perform text search on top of a bibliographic database. Upon receiving a query, the system returns a list of authors that match the query. For example, when one submits the keyword “discrepancy,” The system returns a list of authors who have published about the topic 21
22
We wish to perform the following task: Among all authors in this particular result set, identify a subset of similar authors Nodes represent authors, and edges indicate pairs of authors whose similarity exceeds a user-defined threshold As the similarity threshold is user-defined the graph cannot be pre-computed. An all-pairs similarity join to materialize the graph at query time is impractical We can thus solve the problem without materializing the entire similarity graph 22
23
To devise algorithms that, given as input a set of query nodes Find the maximum-discrepancy connected subgraph Using as few calls to the get-neighbors function as possible Consider the discrepancy-maximization problem under an unrestricted-access model The algorithm can access nodes and edges of the graph in an arbitrary manner 23
24
The solution for the local-access model based on a two-phase approach In the first phase, we make calls to the get-neighbors function to retrieve a subgraph In many cases, the optimal subgraph can be found without retrieving the entire graph In the second phase, we can use any algorithm for the unrestricted-access model on the retrieved subgraph 24
25
Prove that the problem of linear-discrepancy maximization on graphs is NP-hard in the general case Prove that the problem has a polynomial-time solution when the graph is a tree To define fast heuristic algorithms to produce solutions 25
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.