1 A DATA MINING APPROACH FOR LOCATION PREDICTION IN MOBILE ENVIRONMENTS* by Gökhan Yavaş Feb 22, 2005 *: To appear in Data and Knowledge Engineering, Elsevier.

Presentation transcript:

1 A DATA MINING APPROACH FOR LOCATION PREDICTION IN MOBILE ENVIRONMENTS* by Gökhan Yavaş Feb 22, 2005 *: To appear in Data and Knowledge Engineering, Elsevier

2 Outline: Introduction; Background Work; Mobility Prediction Based on Mobility Rules; Experimental Results; Conclusion; Future Work.

3 Introduction: Personal Communication Systems (PCS) are becoming more popular. The dynamic relocation of users gives rise to the problem of mobility management, that is, methods for storing and updating the location information of users. Mobility prediction is the prediction of a user's next inter-cell movement.

4 Motivation: Predicted movements can be used to allocate resources effectively instead of blindly allocating excessive resources. Prediction benefits broadcast program generation [1], since data items can be broadcast to the predicted cell. Location prediction is crucial in processing location-dependent queries [2], since the answer depends on the location of the user. Queries that depend on future positions can be answered through effective location prediction. [1] Y. Saygin and O. Ulusoy. Exploiting Data Mining Techniques for Broadcasting Data in Mobile Computing Environments. IEEE Transactions on Knowledge and Data Engineering, 14(6). [2] G. Gok and O. Ulusoy. Transmission of Continuous Query Results in Mobile Computing Systems. Information Sciences, 125(1-4): 37-63, 2000.

5 Network Model: The PCS network is partitioned into smaller areas called cells. Each cell has a Base Station (BS), used for broadcasting and receiving information. Home Location Register (HLR): a database that keeps the inter-cell movement history of a user. Visitor Location Register (VLR): a database at each BS that keeps the profiles of the users located in that cell.

6 Problem Definition: The movement history of a mobile user can be obtained from the user's HLR. Movement trajectories have the form T= and are partitioned into subsequences called user actual paths (UAPs). UAPs have the form U=. We mine the UAPs to find user mobility patterns (UMPs).

7 Related Work: The roots of our method go back to the Apriori algorithm [3] for association rule mining. In the sequential pattern mining problem [4], the ordering of the items in an itemset must be taken into consideration. That approach is not appropriate for our domain because it does not take the network topology into account. [3] R. Agrawal and R. Srikant. Fast Algorithms for Mining Association Rules. In Proceedings of the Very Large Databases Conference (VLDB'94). [4] R. Agrawal and R. Srikant. Mining Sequential Patterns. In Proceedings of the IEEE Conference on Data Engineering (ICDE'95), pages 3–14, 1995.

8 Mobility Prediction Based on Mobility Rules: 1. Mining UMPs from graph traversals: movement data is mined to discover regularities (UMPs) in inter-cell movements. 2. Generation of mobility rules: mobility rules are extracted from the UMPs. 3. Mobility prediction: the next inter-cell movement is predicted based on the mobility rules.

9 Mining UMPs from Graph Traversals: Vertices of G: the cells in the coverage region. Edges of G: if two cells, A and B, are neighbors in the coverage region, then there are two edges in G, A → B and B → A. Figure: an example coverage region and the corresponding graph G.
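The graph construction described on this slide can be sketched in a few lines. This is only an illustration: it assumes cells are identified by integers and that neighboring cells are given as unordered pairs. The function name `build_cell_graph` and the four-cell layout are hypothetical and do not come from the presentation.

```python
from collections import defaultdict

def build_cell_graph(neighbor_pairs):
    """Return adjacency sets: graph[a] holds every cell b with an edge a -> b."""
    graph = defaultdict(set)
    for a, b in neighbor_pairs:
        # Neighboring cells get two directed edges, a -> b and b -> a.
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)

# Hypothetical coverage region with four cells (not the one in the slide's figure).
pairs = [(0, 1), (1, 2), (2, 3), (0, 3)]
G = build_cell_graph(pairs)
print(sorted(G[0]))  # cells reachable from cell 0 in one move
```

Storing out-neighbors per cell keeps the later candidate-extension step a simple lookup.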

10 Mining UMPs from Graph Traversals: Subsequence definition: assume we have two UAPs, A= and B=. B is a subsequence of A iff all cells of B also occur in A in the same order as in B. Example: A=, then B= is a length-2 subsequence of A. In other words, B is contained by A.
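The containment test in this definition can be sketched as an order-preserving subsequence check. The helper name `is_contained` and the sample sequences are hypothetical, since the sequences in the slide did not survive in the transcript.

```python
def is_contained(pattern, uap):
    """True if every cell of `pattern` occurs in `uap` in the same order."""
    remaining = iter(uap)
    # `cell in remaining` consumes the iterator up to the first match,
    # so each later cell must be found after the earlier ones.
    return all(cell in remaining for cell in pattern)

uap = [5, 0, 6, 9, 4]                 # hypothetical UAP
print(is_contained([0, 9], uap))      # order preserved
print(is_contained([9, 0], uap))      # order violated
```

The cells of the pattern need not be adjacent in the UAP; only their relative order matters.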

11 Mining UMPs from Graph Traversals: Every candidate has a count value that holds the support given to it by the UAPs. This is the point where our work extends the algorithm in [5, 6]. The method in [5, 6] increments the count of a candidate by 1 whenever the candidate is contained by a UAP. This is unfair: it treats a highly corrupted candidate pattern in the same way as a slightly corrupted (or entirely uncorrupted) one. [5] A. Nanopoulos, D. Katsaros, Y. Manolopoulos. A Data Mining Algorithm for Generalized Web Prefetching. IEEE Transactions on Knowledge and Data Engineering, 15(5). [6] A. Nanopoulos, D. Katsaros, Y. Manolopoulos. Effective Prediction of Web User Accesses: A Data Mining Approach. In Proceedings of the WebKDD Workshop (WebKDD'01), 2001.

12 Mining UMPs from Graph Traversals: The degree of corruption should be taken into account in the mobility prediction context. suppInc: the support assigned to a candidate pattern B by a UAP A.

13 Mining UMPs from Graph Traversals: totDist: the totDist value is defined by means of the notion of string alignment. Definition 2.1: If x and y are each a single character or a space, then a scoring function over the pair (x, y) denotes the score of aligning x and y. In our case, the scoring function is defined as follows:

14 Mining UMPs from Graph Traversals: Definition 2.3: Let A be a UAP and B be a pattern. A containment alignment X' maps A and B into strings A' and B' where: |A'| = |B'|; B is contained by A; and removal of all spaces from A' and B' leaves A and B. Total score of the alignment X':

15 Mining UMPs from Graph Traversals: For any two patterns, there may be more than one alignment. Example: consider A=, B=

16 Mining UMPs from Graph Traversals: Definition 2.4: An optimal containment alignment of UAP A and pattern B is one that has the minimum possible total score for these two patterns. The total score of an alignment is the sum of its penalties. An optimal alignment has the minimum number of mismatches and therefore the minimum alignment score. totDist(A, B) = score of the optimal alignment of the UAP A and the pattern B.

17 Mining UMPs from Graph Traversals: Example: given UAP A= and pattern B=, the optimal containment alignment for these has score of the alignment = totDist(A, B) = 3. Support assigned to the candidate pattern B by the UAP A:
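The distance-weighted support counting described on the last few slides can be sketched as follows. The scoring function and the suppInc formula did not survive in this transcript, so the sketch rests on two loudly labeled assumptions: (1) totDist is taken to be the smallest number of UAP cells skipped between the first and last matched cell of the pattern, one penalty unit each, and (2) the support increment is taken to be 1 / (1 + totDist), or 0 when the pattern is not contained. The names `tot_dist` and `supp_inc` and the sample sequences are hypothetical.

```python
def tot_dist(uap, pattern):
    """Assumed totDist: minimum number of UAP cells skipped between the first
    and last matched cell of a non-empty `pattern`; None if not contained."""
    best = None
    for start, cell in enumerate(uap):
        if cell != pattern[0]:
            continue
        pos, matched = start, True
        for wanted in pattern[1:]:
            # Greedy earliest match yields the smallest end index for this start.
            try:
                pos = uap.index(wanted, pos + 1)
            except ValueError:
                matched = False
                break
        if matched:
            gaps = (pos - start + 1) - len(pattern)
            best = gaps if best is None else min(best, gaps)
    return best

def supp_inc(uap, pattern):
    """Assumed support increment: 1 / (1 + totDist), or 0 if not contained."""
    d = tot_dist(uap, pattern)
    return 0.0 if d is None else 1.0 / (1.0 + d)

uap = [3, 4, 0, 1, 6, 5]          # hypothetical UAP
print(tot_dist(uap, [4, 5]))      # three cells (0, 1, 6) lie between 4 and 5
print(supp_inc(uap, [4, 5]))
print(supp_inc(uap, [3, 4]))      # adjacent cells, no corruption
```

Under these assumptions an uncorrupted occurrence contributes a full unit of support, and the contribution shrinks as more foreign cells separate the pattern's cells. Other scoring choices would change the numbers but not the overall idea.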

18 Mining UMPs from Graph Traversals: The quality of the patterns improves because this method counts support more accurately: the degree of corruption is taken into account. This gives rise to more accurate mobility rules, and hence to better prediction accuracy than with rules generated by the former way of support counting. Applying different methods for computing totDist will affect the quality of the rules.

19 Mining UMPs from Graph Traversals: Candidate generation. Example: C= (a length-k candidate ending in cell $c_k$). $N^+(c_k)$: the set of all nodes in G that have an incoming edge from the cell $c_k$. A cell from $N^+(c_k)$ is attached to the end of C to generate C'. C' is added to the set of candidates.
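The candidate generation step on this slide can be sketched as a lookup of the out-neighbors of the candidate's last cell. The sketch assumes the graph is stored as a dictionary of out-neighbor sets and that patterns are tuples of cell ids; the function name `extend_candidate` and the sample graph are hypothetical.

```python
def extend_candidate(candidate, graph):
    """Return every length-(k+1) candidate obtained by appending an
    out-neighbor of the last cell of `candidate`."""
    last = candidate[-1]
    # Only cells with an incoming edge from the last cell are appended,
    # so every candidate is a valid walk on the cell graph.
    return [candidate + (nxt,) for nxt in sorted(graph.get(last, ()))]

graph = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}  # hypothetical cell graph
print(extend_candidate((0, 1), graph))  # [(0, 1, 0), (0, 1, 2)]
```

Restricting extensions to graph neighbors is what lets the method exploit the network topology instead of trying every cell.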

20 Mining UMPs from Graph Traversals: Can Apriori pruning be used? No, due to the nature of our new support counting method. Support is no longer monotonically decreasing with increasing pattern size. A length-(k-1) subpattern S of a length-k pattern P need not be large even if P is large. Example: UAP, $P_1$= and its subpattern $P_2$=. The UAP assigns a support to $P_1$ and to $P_2$.

21 Mining UMPs from Graph Traversals: Example: use $supp_{min}$ = 1.33. Tables: database of UAPs; set of all large patterns (UMPs).

22 Generation of Mobility Rules: Rules are extracted from the UMPs. For a rule R: head → tail, a confidence value is calculated.

23 Generation of Mobility Rules: The rules whose confidence is higher than $conf_{min}$ are selected. Table: all possible mobility rules for the UMPs given in the previous example.
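Rule generation from UMPs can be sketched as below. The confidence formula is missing from this transcript, so the sketch assumes the usual association-rule form, confidence = 100 × support(pattern) / support(head), expressed as a percentage, and assumes each pattern is split into a prefix (head) and the remaining suffix (tail). The function name `generate_rules` and the support values are hypothetical.

```python
def generate_rules(umps, conf_min):
    """umps: dict mapping a pattern tuple to its support.
    Returns (head, tail, confidence) triples with confidence above conf_min."""
    rules = []
    for pattern, supp in umps.items():
        for i in range(1, len(pattern)):
            head, tail = pattern[:i], pattern[i:]
            if head not in umps:
                continue  # the head's support is needed to compute confidence
            conf = 100.0 * supp / umps[head]  # assumed confidence formula
            if conf > conf_min:
                rules.append((head, tail, conf))
    return rules

# Hypothetical UMPs with their supports.
umps = {(1,): 4.0, (2,): 3.0, (1, 2): 3.0, (1, 2, 3): 1.5}
for head, tail, conf in generate_rules(umps, conf_min=60):
    print(head, "->", tail, conf)
```

With these numbers only the rule (1,) → (2,) clears the threshold; the two rules derived from (1, 2, 3) have confidences of 37.5 and 50.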

24 Mobility Prediction: The user has followed a path P= up to now. Find the rules whose head parts are contained in P and whose head ends with the cell $c_{i-1}$. For each such rule, store the first cell of the tail together with the rule's (confidence + support) value as a tuple. Sort these tuples by their (confidence + support) values in descending order. Select the first m tuples.

25 Mobility Prediction: Example: assume that the current trajectory of the user is P=. Matching rules: (listed on the slide). The sorted tuple array is TupleArray = [(5, 85.83), (0, 76.5)]. If m=1, then Predicted Cells Set = {5}. If m=2, then Predicted Cells Set = {5, 0}.
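The prediction procedure on these two slides can be sketched as follows. The rules and trajectory below are hypothetical, since the slide's own rules did not survive in the transcript. Two details are assumptions of this sketch rather than statements from the slides: rules are represented as (head, tail, confidence, support) tuples, and a cell proposed by several rules is reported only once.

```python
def is_contained(pattern, path):
    """True if every cell of `pattern` occurs in `path` in the same order."""
    remaining = iter(path)
    return all(cell in remaining for cell in pattern)

def predict_next_cells(path, rules, m):
    """Return up to m predicted next cells, best (confidence + support) first."""
    scored = []
    for head, tail, conf, supp in rules:
        # A rule matches if its head ends at the user's current cell
        # and the head is contained in the path followed so far.
        if head[-1] == path[-1] and is_contained(head, path):
            scored.append((tail[0], conf + supp))
    scored.sort(key=lambda item: item[1], reverse=True)
    predicted = []
    for cell, _ in scored:
        if cell not in predicted:  # assumption: report each cell once
            predicted.append(cell)
    return predicted[:m]

path = (2, 3, 4)  # hypothetical current trajectory
rules = [
    ((3, 4), (5,), 80.0, 2.5),
    ((4,), (0,), 70.0, 3.0),
    ((4, 5), (6,), 90.0, 2.0),  # head ends in cell 5, so it does not match
]
print(predict_next_cells(path, rules, m=1))
print(predict_next_cells(path, rules, m=2))
```

When no rule matches the current trajectory the function returns an empty list, which corresponds to the "no prediction" outcome.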

26 Simulation Design: Mobile users travel on a 15-by-15 hexagonal-shaped network. To generate UAPs, UMPs are generated first; each UMP is taken as a random walk over the network. There are two types of UAPs: outliers (a random walk over the network) and non-outliers (those that follow a UMP). o (outlier percentage): the ratio of the number of outliers to the number of non-outliers.

27 Simulation Design: Corruption mechanism: insert random cells between the consecutive cells of a UMP. c (corruption ratio): the ratio of the number of such random cells to the number of cells in the corresponding UMP. Three possible outcomes of a prediction: correct prediction, incorrect prediction, no prediction. Two performance measures: precision and recall.
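The formulas for the two performance measures are missing from this transcript. The sketch below assumes definitions that are consistent with the later discussion: recall is the number of correct predictions divided by the number of prediction requests, and precision is the number of correct predictions divided by the number of predicted cells. Both definitions, the function name `precision_recall`, and the sample data are assumptions.

```python
def precision_recall(outcomes):
    """outcomes: list of (predicted_cells, actual_next_cell), one per request."""
    requests = len(outcomes)
    predictions = sum(len(cells) for cells, _ in outcomes)
    correct = sum(1 for cells, actual in outcomes if actual in cells)
    precision = correct / predictions if predictions else 0.0
    recall = correct / requests if requests else 0.0
    return precision, recall

outcomes = [
    ([5, 0], 5),  # correct prediction
    ([3], 4),     # incorrect prediction
    ([], 7),      # no prediction: lowers recall but not precision
    ([1, 2], 2),  # correct prediction
]
print(precision_recall(outcomes))
```

Under these definitions a request that receives no prediction lowers recall only, which matches the trade-off between the two measures reported in the results.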

28 Algorithms Used for Comparison: Mobility Prediction Based on Transition Matrix (TM): a cell-to-cell transition matrix is formed, and the m most probable cells are selected from it. Ignorant Prediction: m neighboring cells of the current cell are selected at random.
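The TM baseline can be sketched as a first-order transition count table. The slide only says that a cell-to-cell transition matrix is formed, so estimating it from consecutive cell pairs in the observed paths is an assumption of this sketch, as are the names `build_transition_counts` and `tm_predict` and the sample paths.

```python
from collections import Counter, defaultdict

def build_transition_counts(paths):
    """Count how often each cell is followed directly by each other cell."""
    counts = defaultdict(Counter)
    for path in paths:
        for cur, nxt in zip(path, path[1:]):
            counts[cur][nxt] += 1
    return counts

def tm_predict(counts, current_cell, m):
    """Return the m most frequent successors of `current_cell`."""
    successors = counts.get(current_cell, Counter())
    return [cell for cell, _ in successors.most_common(m)]

paths = [(1, 2, 3), (1, 2, 4), (5, 2, 3)]  # hypothetical movement histories
counts = build_transition_counts(paths)
print(tm_predict(counts, 2, m=1))
print(tm_predict(counts, 2, m=2))
```

Unlike the rule-based method, this baseline looks only at the current cell, so it can answer a request whenever the current cell has been seen with a successor before.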

29 Impact of m on Precision and Recall: Precision decreases for both our algorithm and TM, because the probability of making some incorrect predictions increases as m increases. Recall increases for all algorithms, but the increase is more significant for TM and Ignorant Prediction.

30 Impact of m on Precision and Recall: Setting m as small as possible is convenient for our method. For TM, the increase in recall is largest when m goes from 1 to 2. m ≥ 3 would waste excessive network resources. Thus we choose m = 2.

31 Impact of $supp_{min}$: Recall and precision are reduced. An increase in the $supp_{min}$ value leads to a decrease in the number of mined mobility rules, so the number of correct predictions is reduced. We choose $supp_{min}$ = 0.1.

32 Impact of $conf_{min}$: Precision increases: rule quality rises with increasing $conf_{min}$, so the number of predictions decreases faster than the number of correct predictions. Recall decreases: fewer rules are mined, which reduces the number of correct predictions. We choose $conf_{min}$ = 80.

33 Impact of Corruption Factor: Precision and recall decrease for our method and TM. For all c, our method has better precision than TM but worse recall. For our method, as c increases, the number of mined mobility rules decreases, and in some cases no prediction is made because the corrupted UAPs leave no matching rules.

34 Impact of Outlier Percentage: Neither performance measure is affected significantly for any method. Rules extracted from outlier UAPs are rarely used, so they do not reduce recall and precision significantly.

35 Conclusion: We presented a data mining algorithm for predicting user movements in a mobile computing system. The algorithm is based on mining the mobility patterns of users, then forming mobility rules from these patterns, and finally predicting a mobile user's next movements by using the mobility rules. It performs well when compared with the Ignorant method.

36 Conclusion: Performance compared with TM. Better precision: predictions are more accurate, and most of the predictions made at each request are correct. Worse recall: our method may make no prediction in response to some prediction requests, because there may be no matching rule for the user's current trajectory when the request is made.

37 Future Work: For calculating the totDist value, our method decreases the support given to a pattern by a UAP as the number of corrupted cells in the pattern increases; other methods may be employed for calculating totDist. The time domain of the mobility patterns and mobility rules is not considered: in real life, mobility patterns might be related to time, and some rules may be valid only for a specific time interval, so our algorithm could be extended to include the time domain of mobility rules. A candidate pruning criterion suitable for our support counting method may also be employed.

38 Questions & Comments