Data Mining and Machine Learning Lab Mobile Location Prediction in Spatio-Temporal Context Data Mining and Machine Learning Lab Arizona State University.

Slides:



Advertisements
Similar presentations
Cipher Techniques to Protect Anonymized Mobility Traces from Privacy Attacks Chris Y. T. Ma, David K. Y. Yau, Nung Kwan Yip and Nageswara S. V. Rao.
Advertisements

Annual International Conference On GIS, GPS AND Remote Sensing.
Data Mining and Machine Learning Lab eTrust: Understanding Trust Evolution in an Online World Jiliang Tang, Huiji Gao and Huan Liu Computer Science and.
By Venkata Sai Pulluri ( ) Narendra Muppavarapu ( )
Learning Trajectory Patterns by Clustering: Comparative Evaluation Group D.
INTRODUCTION TO MACHINE LEARNING Bayesian Estimation.
Towards Twitter Context Summarization with User Influence Models Yi Chang et al. WSDM 2013 Hyewon Lim 21 June 2013.
CS479/679 Pattern Recognition Dr. George Bebis
Budapest May 27, 2008 Unifying mixed linear models and the MASH algorithm for breakpoint detection and correction Anders Grimvall, Sackmone Sirisack, Agne.
Template design only ©copyright 2008 Ohio UniversityMedia Production Spring Quarter  A hierarchical neural network structure for text learning.
Social Media Mining Chapter 5 1 Chapter 5, Community Detection and Mining in Social Media. Lei Tang and Huan Liu, Morgan & Claypool, September, 2010.
CSC321: 2011 Introduction to Neural Networks and Machine Learning Lecture 10: The Bayesian way to fit models Geoffrey Hinton.
Critical Analysis Presentation: T-Drive: Driving Directions based on Taxi Trajectories Authors of Paper: Jing Yuan, Yu Zheng, Chengyang Zhang, Weilei Xie,
WindMine: Fast and Effective Mining of Web-click Sequences SDM 2011Y. Sakurai et al.1 Yasushi Sakurai (NTT) Lei Li (Carnegie Mellon Univ.) Yasuko Matsubara.
Chen Cheng1, Haiqin Yang1, Irwin King1,2 and Michael R. Lyu1
Report on Intrusion Detection and Data Fusion By Ganesh Godavari.
Time-dependent Similarity Measure of Queries Using Historical Click- through Data Qiankun Zhao*, Steven C. H. Hoi*, Tie-Yan Liu, et al. Presented by: Tie-Yan.
1 LM Approaches to Filtering Richard Schwartz, BBN LM/IR ARDA 2002 September 11-12, 2002 UMASS.
Presented by Zeehasham Rasheed
Computer vision: models, learning and inference
Scalable Text Mining with Sparse Generative Models
Intrusion and Anomaly Detection in Network Traffic Streams: Checking and Machine Learning Approaches ONR MURI area: High Confidence Real-Time Misuse and.
(ACM KDD 09’) Prem Melville, Wojciech Gryc, Richard D. Lawrence
Extracting Places and Activities from GPS Traces Using Hierarchical Conditional Random Fields Yong-Joong Kim Dept. of Computer Science Yonsei.
Masquerade Detection Mark Stamp 1Masquerade Detection.
Overview G. Jogesh Babu. Probability theory Probability is all about flip of a coin Conditional probability & Bayes theorem (Bayesian analysis) Expectation,
Machine Learning CUNY Graduate Center Lecture 3: Linear Regression.
A learning-based transportation oriented simulation system Theo A. Arentze, Harry J.P. Timmermans.
1 A Discriminative Approach to Topic- Based Citation Recommendation Jie Tang and Jing Zhang Presented by Pei Li Knowledge Engineering Group, Dept. of Computer.
Privacy Preserving Data Mining on Moving Object Trajectories Győző Gidófalvi Geomatic ApS Center for Geoinformatik Xuegang Harry Huang Torben Bach Pedersen.
Mirco Nanni, Roberto Trasarti, Giulio Rossetti, Dino Pedreschi Efficient distributed computation of human mobility aggregates through user mobility profiles.
Data Mining and Machine Learning Lab Exploring Temporal Effects for Location Recommendation on Location-Based Social Networks Huiji Gao, Jiliang Tang,
Data Mining and Machine Learning Lab Network Denoising in Social Media Huiji Gao, Xufei Wang, Jiliang Tang, and Huan Liu Data Mining and Machine Learning.
Annealing Paths for the Evaluation of Topic Models James Foulds Padhraic Smyth Department of Computer Science University of California, Irvine* *James.
Statistical NLP: Lecture 8 Statistical Inference: n-gram Models over Sparse Data (Ch 6)
Dynamic 3D Scene Analysis from a Moving Vehicle Young Ki Baik (CV Lab.) (Wed)
Report on Intrusion Detection and Data Fusion By Ganesh Godavari.
Spatial-Temporal Models in Location Prediction Jingjing Wang 03/29/12.
Data Mining and Machine Learning Lab Unsupervised Feature Selection for Linked Social Media Data Jiliang Tang and Huan Liu Computer Science and Engineering.
© 2010 AT&T Intellectual Property. All rights reserved. AT&T, the AT&T logo and all other AT&T marks contained herein are trademarks of AT&T Intellectual.
Chapter6. Statistical Inference : n-gram Model over Sparse Data 이 동 훈 Foundations of Statistic Natural Language Processing.
IRCS/CCN Summer Workshop June 2003 Speech Recognition.
Analysis of Topic Dynamics in Web Search Xuehua Shen (University of Illinois) Susan Dumais (Microsoft Research) Eric Horvitz (Microsoft Research) WWW 2005.
Processing Sequential Sensor Data The “John Krumm perspective” Thomas Plötz November 29 th, 2011.
CONFIDENTIAL1 Hidden Decision Trees to Design Predictive Scores – Application to Fraud Detection Vincent Granville, Ph.D. AnalyticBridge October 27, 2009.
Positional Relevance Model for Pseudo–Relevance Feedback Yuanhua Lv & ChengXiang Zhai Department of Computer Science, UIUC Presented by Bo Man 2014/11/18.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology Mining Logs Files for Data-Driven System Management Advisor.
CS Statistical Machine learning Lecture 24
Slides for “Data Mining” by I. H. Witten and E. Frank.
CHAPTER 8 DISCRIMINATIVE CLASSIFIERS HIDDEN MARKOV MODELS.
Michael Isard and Andrew Blake, IJCV 1998 Presented by Wen Li Department of Computer Science & Engineering Texas A&M University.
Date: 2015/11/19 Author: Reza Zafarani, Huan Liu Source: CIKM '15
Network Community Behavior to Infer Human Activities.
Context-based vision system for place and object recognition Antonio Torralba Kevin Murphy Bill Freeman Mark Rubin Presented by David Lee Some slides borrowed.
Intelligent DataBase System Lab, NCKU, Taiwan Josh Jia-Ching Ying 1, Wang-Chien Lee 2, Tz-Chiao Weng 1 and Vincent S. Tseng 1 1 Department of Computer.
Unsupervised Mining of Statistical Temporal Structures in Video Liu ze yuan May 15,2011.
 Present by 陳群元.  Introduction  Previous work  Predicting motion patterns  Spatio-temporal transition distribution  Discerning pedestrians  Experimental.
A Dynamic Conditional Random Field Model for Object Segmentation in Image Sequences Duke University Machine Learning Group Presented by Qiuhua Liu March.
Hybrid Load Forecasting Method With Analysis of Temperature Sensitivities Authors: Kyung-Bin Song, Seong-Kwan Ha, Jung-Wook Park, Dong-Jin Kweon, Kyu-Ho.
Protein Family Classification using Sparse Markov Transducers Proceedings of Eighth International Conference on Intelligent Systems for Molecular Biology.
Combining Evolutionary Information Extracted From Frequency Profiles With Sequence-based Kernels For Protein Remote Homology Detection Name: ZhuFangzhi.
Unsupervised Streaming Feature Selection in Social Media
2005/09/13 A Probabilistic Model for Retrospective News Event Detection Zhiwei Li, Bin Wang*, Mingjing Li, Wei-Ying Ma University of Science and Technology.
Speaker Min-Koo Kang March 26, 2013 Depth Enhancement Technique by Sensor Fusion: MRF-based approach.
CPSC 7373: Artificial Intelligence Lecture 12: Hidden Markov Models and Filters Jiang Bian, Fall 2012 University of Arkansas at Little Rock.
Ch 1. Introduction Pattern Recognition and Machine Learning, C. M. Bishop, Updated by J.-H. Eom (2 nd round revision) Summarized by K.-I.
Week 6 Cecilia La Place.
Fenglong Ma1, Jing Gao1, Qiuling Suo1
CSCI 5822 Probabilistic Models of Human and Machine Learning
Presentation transcript:

Data Mining and Machine Learning Lab Mobile Location Prediction in Spatio-Temporal Context Data Mining and Machine Learning Lab Arizona State University June 18, 2012 Huiji Gao, Jiliang Tang, Huan Liu

Mobile Location Prediction Applications:  Mobile Advertising  Traffic Planning  User oriented coupon dispersion  Disaster Relief Challenges:  Over-fitting Long spatial-temporal trajectories with massive n-gram spatial patterns and sparse temporal patterns, smoothing techniques are indispensable.  Integration Seek a good way to integrate both spatial information and temporal information.

Problem Statement Given a user with a series of his historical visits in a previous time section, and a context of the latest visit location with the time of the next visit, the location prediction problem in Nokia Mobile Data Challenge can be described as finding the probability of Where : the i-th visit at location l; : the i-th visit happens at time t; set as the “ending time of current visit” : the (i-1)-th visit happened at location l k.

Problem Statement Using Bayes’ rule, Consider under the assumption that the probability of current visit time is only relevant to the current visit location.

Problem Statement Spatial PriorTemporal Constraint The probability of next visit at location l given the current visit at l k The probability of the i-th visit happening at time t, observing that the i-th visit location is l.

Spatial Prior  Two properties of spatial prior Power Law Distribution People tend to go to few places many times, and many places few times. Short Term Effect The current visit location is more relevant to the latest visit than the older visit.  Correspondences between language and LBSN modeling [Gao et al. Exploring Social-Historcal Ties on Location-Based Social Networks. ICWSM 2012]

Spatial Prior  Hierarchical Pitman-Yor (HPY) Language Model to generate the spatial prior (HPY spatial prior)  Generate the probability of next location based on an observation of historical visiting sequence.  Consider a combination of all the n-gram patterns in the previous visits with various pattern weights.  The latest visit has higher weight than the older visit. [Gao et al. Exploring Social-Historcal Ties on Location-Based Social Networks. ICWSM 2012]

Temporal Constraint h: Hour of the day, i.e., 10:00am, 3:00pm d: Day of the week, i.e., Monday, Sunday Temporal Constraint: Daily Constraint Hourly Constraint

Temporal Constraint  For a visit location l that has happened at h (d) in the previous visits, it’s easy to get and  For a visit location l that has not happened at h (d) in the previous visits (majority part in the training set), how to compute? Compute and

Temporal Constraint  Distribution of a user’s visits at a specific location in 24 hours. (user id: 013; place id: 3 ) Compute and Maximizing Likelihood

Temporal Constraint Curve Fitting: [user id: 013; place id: 3]

Location Prediction Probability of visiting location l at time t with the latest visit at l k HPY PriorGaussian HPY Prior Hour-Day Model (HPHD)

Experiments Experiment Setting  For Submission: Training set: Set A Testing set: Set C (no ground truth) Number of users: 80  For Evaluation Divide set A into training and testing parts.  Testing set: Toy data provided by Nokia with 3373 unknown locations.  Training set: For each user, all the visits in set A that happened before the visits in testing set.

Experiments  Baseline Methods (Spatial Family) 1.Most Frequent Visit Model (MFV) Consider the most frequent visited location 2.Order-1 Markov Model (OMM) Consider the most frequent two-gram pattern with the latest visit as context. 3.Fallback Markov Model A combination of MFV and OMM 4.HPY Prior Model (HP) Consider the HPY prior only to predict the next location.

Experiments  Baseline Methods (Temporal Family) 5.Most Frequent Hourly Model (MFH) Predict the next visit at time h as the most frequent visit location at h in previous visits. 6.Most Frequent Daily Model (MFD) Predict the next visit at time d as the most frequent visit location at d in previous visits. 7.Most Frequent Hour-Day Model (MFHD) A combination of MFH and MFD (by mutiplying)

Experiments  Baseline Methods (Spatio-Temporal family) 8.HPY Prior Hourly Model (HPH) Consider the HPY prior and hourly information. 9.HPY Prior Daily Model (HPD) Consider the HPY Prior and daily information.  Proposed Method (Spatio-Temporal family) HPY Prior Hour-Day Model (HPHD) Consider the HPY prior, hourly information, and daily information.

Experiments  Results

Conclusions and Future Work  Gaussian Distribution with Two Peaks  An alternative version of HPHD. (AHPHD)  Not stable, sensitive to peak detection.  Five Submissions FHD, HP, HPH, HPHD, AHPHD

Acknowledgements This work is supported, in part, by ONR (N ). All the data used in this work is provided by Nokia Mobile Data Challenge.

Questions?