TribeFlow: Mining & Predicting User Trajectories
Flavio Figueiredo, Bruno Ribeiro, Jussara M. Almeida, Christos Faloutsos
Example: Online Check-Ins
What Happens vs. What We See
[Figure: a user's observed check-in sequence over locations L1, L2, L1, L3 at times T1, T2, T3]
Different Behaviors, Different Trajectories
[Figure: two trajectories over locations 1, 2, 3 with different durations in each location, from 0.5h to 6h]
General Motivation
Predict where an agent will go next.
How Do We Navigate?
Navigational Constraints
Geographical constraints:
–Check-in at the Eiffel Tower followed by the Statue of Liberty? Probably not.
–Starbucks at JFK followed by McDonald's at Beijing Airport? Possible.
Application constraints (e.g., links on the application).
Simple Navigation Process (e.g., PageRank)
–Visible links
–Random walks with random jumps to random nodes
Random Walk Transition Matrix
P[Next | Previous]
What if we don't see the links? Trajectories are all we see.
And no self-loops: modeling self-loops is a separate problem (see Figueiredo et al., PKDD 2014; Benson et al., WWW 2016; …).
Latent Random Walk Transition Matrices
Latent transition matrices capture:
–Navigation constraints
–Navigation preferences based on user types
We must learn:
–Transition matrices from trajectory data (from multiple users)
–User preferences over these matrices
TribeFlow's Environments
Each environment defines a transition matrix and an inter-event time distribution:
P[Node = 3 | Node = 5, Env = Green] × P[inter-event time > T | Env = Green] × P[Env = Green | User = Jane]
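The factorization above can be sketched in a few lines. This is a toy illustration, not TribeFlow's actual implementation: the preference vector, the number of environments, and the random transition matrices are all made-up values, and the inter-event time factor is omitted for brevity.

```python
import numpy as np

# Hypothetical toy model: 2 environments ("green", "red"), 4 items.
# P[Env | User]: Jane mostly navigates in the green environment.
pref = {"Jane": np.array([0.8, 0.2])}

# One transition matrix per environment: trans[z][s, d] = P[d | s, Env=z]
rng = np.random.default_rng(0)
trans = rng.dirichlet(np.ones(4), size=(2, 4))  # shape (env, source, dest)

def next_item_probs(user, prev_item):
    """P[next | user, prev] = sum_z P[Env=z | user] * P[next | prev, Env=z]."""
    return pref[user] @ trans[:, prev_item, :]

p = next_item_probs("Jane", prev_item=2)
```

Because each factor is a proper distribution, the mixture `p` is one as well.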
Problem Definition
Input:
–Large dataset of user trajectories from thousands of users
Output:
–Set of latent transition matrices
–Set of user preferences over these matrices
That is: interpretable and accurate ("Alice will listen to '4' next!")
Defining the Model
How Many Transition Matrices?
We use a Bayesian nonparametric model:
–The number of environments is learned from the data
–Dirichlet process prior over each user's preference vector (rows)
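The key property of a Dirichlet process prior is that the number of components grows with the data instead of being fixed in advance. A minimal sketch of this via the Chinese Restaurant Process representation (the `alpha` value and sampling scheme here are illustrative, not TribeFlow's inference procedure):

```python
import random

def crp_assignments(n, alpha, seed=0):
    """Chinese Restaurant Process: assign n observations to 'tables'
    (environments); a new table opens with probability alpha / (i + alpha),
    so the number of environments is learned rather than fixed."""
    random.seed(seed)
    counts = []   # observations per table
    labels = []
    for i in range(n):
        r = random.uniform(0, i + alpha)
        acc = 0.0
        for k, c in enumerate(counts):
            acc += c
            if r < acc:          # join existing table k with prob c / (i + alpha)
                counts[k] += 1
                labels.append(k)
                break
        else:
            counts.append(1)     # open a new table
            labels.append(len(counts) - 1)
    return labels

labels = crp_assignments(100, alpha=2.0)
```

With more data (larger `n`) the expected number of distinct tables grows roughly as `alpha * log(n)`.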
Decomposing User Trajectories
[Figure: a trajectory over items with inter-event times ranging from 1 min to 24 hours]
Inter-event times that are unlikely under the current environment signal a jump: a 2 min gap is unlikely in the Green environment, and a 24 hour gap is unlikely in the Red environment.
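The idea of cutting a trajectory where the inter-event time becomes implausible can be sketched with a simple threshold rule. This is a simplification: TribeFlow reasons about inter-event time likelihoods per environment, not a single global cutoff, and the `max_gap` value below is an arbitrary choice for illustration.

```python
def split_bursts(events, max_gap):
    """Split a (item, timestamp) trajectory wherever the inter-event time
    exceeds max_gap; a long gap likely marks a jump between environments."""
    bursts, current = [], [events[0]]
    for prev, cur in zip(events, events[1:]):
        if cur[1] - prev[1] > max_gap:
            bursts.append(current)
            current = []
        current.append(cur)
    bursts.append(current)
    return bursts

# (item, minutes): a 24h gap (1440 min) separates two sessions
traj = [(1, 0), (2, 2), (3, 3), (6, 1443), (5, 1448)]
bursts = split_bursts(traj, max_gap=60)
```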
Dealing With Sparseness
Sparseness issue:
–Millions of items (e.g., web pages / artists)
–~10^12 possible (source, destination) bigrams for Last.FM artists
–Only ~10^8 observed (source, destination) pairs in the largest dataset (LastFM-Groups)
Thus, random walk transition probabilities are defined over vertices rather than edges (without self-loops).
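Parameterizing transitions over vertices means storing O(n) weights per environment instead of an O(n^2) edge matrix. A minimal sketch of one such parameterization, assuming each environment carries a single weight vector `phi` and self-loops are removed by renormalizing the remaining mass (the exact functional form used by TribeFlow may differ):

```python
import numpy as np

def transition_prob(phi, s, d):
    """P[d | s] from per-vertex weights phi (one vector per environment)
    rather than a full n x n edge matrix. Self-loops are excluded by
    renormalizing over the remaining destinations."""
    if d == s:
        return 0.0
    return phi[d] / (1.0 - phi[s])

phi = np.array([0.1, 0.2, 0.3, 0.4])  # vertex weights, sum to 1
row = [transition_prob(phi, 1, d) for d in range(4)]
```

Each source's outgoing probabilities still form a proper distribution, at a cost of n parameters per environment.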
Learning the Model
TribeFlow's Generative Model
Generative model variables:
–Environments: transition matrix and inter-event time distribution
–Latent user preferences over environments
Inference is simplified by breaking each sequence into renewal windows of length B.
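The generative story can be sketched as: for each renewal window, draw an environment from the user's preferences, then draw B items and B-1 inter-event times from that environment. All numbers below are toy values, and the exponential inter-event time distribution is an assumption for illustration, not necessarily the distribution used in the paper.

```python
import numpy as np

rng = np.random.default_rng(42)
n_env, n_items, B = 2, 5, 3   # B: renewal window length

env_item = rng.dirichlet(np.ones(n_items), size=n_env)  # item weights per env
env_mean_gap = np.array([1.0, 10.0])                    # mean inter-event time per env
user_pref = rng.dirichlet(np.ones(n_env))               # P[Env | user]

def generate_window():
    """Sample one renewal window: an environment from the user's preferences,
    then B items and B-1 inter-event times from that environment."""
    z = rng.choice(n_env, p=user_pref)
    items = rng.choice(n_items, size=B, p=env_item[z])
    gaps = rng.exponential(env_mean_gap[z], size=B - 1)
    return z, items.tolist(), gaps.tolist()

z, items, gaps = generate_window()
```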
TribeFlow Inference
Gibbs sampling over:
–Item transitions
–User preferences
–Inter-event times
Merge and split moves to infer the Dirichlet process.
Fully distributed, based on Async-LDA (Asuncion et al., NIPS 2009).
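The core Gibbs move resamples a window's environment in proportion to (user preference) times (item likelihood under each environment). The sketch below uses LDA-style smoothed count ratios as a stand-in; the smoothing hyperparameters `alpha` and `beta` and the exact conditional are illustrative, not TribeFlow's actual sampler (which also conditions on inter-event times and transitions).

```python
import random

def resample_env(window_items, user_env_counts, env_item_counts, alpha, beta, n_items):
    """One Gibbs step (sketch): pick an environment z with probability
    proportional to (count of user's windows in z + alpha) times the
    smoothed likelihood of the window's items under z. Counts are held
    fixed within the window, a simplification of a collapsed sampler."""
    weights = []
    for z in range(len(user_env_counts)):
        w = user_env_counts[z] + alpha
        total = sum(env_item_counts[z]) + beta * n_items
        for item in window_items:
            w *= (env_item_counts[z][item] + beta) / total
        weights.append(w)
    r = random.uniform(0, sum(weights))
    acc = 0.0
    for z, w in enumerate(weights):
        acc += w
        if r < acc:
            return z
    return len(weights) - 1

random.seed(0)
# Env 0 has seen item 0 often, env 1 never: windows of item 0 should land in env 0.
draws = [resample_env([0, 0], [5, 1], [[10, 0], [0, 10]], 1.0, 0.1, 2)
         for _ in range(100)]
```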
Results
TribeFlow at Work
Years of data, split by time: 70% train + validation, 30% test.
Prior Work for Comparison
–Latent Markov Embedding (LME), Chen et al., KDD 2012
–Multi-core Latent Markov Embedding (MultiLME), Chen et al., KDD 2013
–Personalized Ranking LME (PRLME), Feng et al., IJCAI 2015
–Factorized Personalized Markov Chains (FPMC), Rendle et al., WWW 2010
–Progression Stages (Stages), Yang et al., WWW 2014
–Gravity Model (Gravity): commonly used to measure flow between locations (Silva et al., 2006; García-Gavilanes et al., CSCW 2014; Smith et al., CSCW)
Common issues (except Stages & Gravity): don't scale, not personalized, no probabilistic interpretation.
Predicting Where Users Go Next
[Figure: TribeFlow is both more accurate and faster than the baselines; 20 cores on one node]
Using Data Subsamples
TribeFlow is always faster and more accurate (20 cores on one node).
Further Comparisons
TribeFlow vs. MultiLME (Chen et al., 2013), scalability:
–TribeFlow is 413x faster
–At least 12% higher predictive likelihood
TribeFlow vs. Stages (Yang et al., 2014):
–TribeFlow has at least 40% higher MRR (Mean Reciprocal Rank)
–On the datasets where we were able to execute Stages
TribeFlow vs. Gravity Model (Silva et al., 2006) on the number of users that go from A to B:
–Gravity models are fast (Poisson regression)
–However, TribeFlow was 800% more accurate in MAE
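For reference, the MRR metric used above rewards placing the true next item near the top of the predicted ranking. A minimal implementation (the toy prediction lists are made up for illustration):

```python
def mean_reciprocal_rank(ranked_lists, true_items):
    """MRR: mean of 1/rank of the true next item in each ranked prediction
    list (contributing 0 when the item is absent from the list)."""
    total = 0.0
    for preds, truth in zip(ranked_lists, true_items):
        if truth in preds:
            total += 1.0 / (preds.index(truth) + 1)
    return total / len(true_items)

preds = [["a", "b", "c"], ["b", "a", "c"], ["c", "a", "b"]]
truth = ["a", "c", "b"]
mrr = mean_reciprocal_rank(preds, truth)
```

Here the true items appear at ranks 1, 3, and 3, so the MRR is (1 + 1/3 + 1/3) / 3.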
Sense Making: Music Streaming Data
Data crawled using pop artists as seeds:
–Natural bias towards pop music
Sense Making: FourSquare Data
Why these environments? They are airports.
TribeFlow is learning the geographical constraints:
–No use of GPS information
Users Are Not Synchronized
"Everybody" eventually goes through a "Beatles" phase, but at different times.
TribeFlow vs. Temporal Tensors
[Figure: a users × products × time tensor and its low-rank factorization]
If users are not synchronized, the time mode of the tensor does not align across users.
Conclusions
TribeFlow:
–Predicts & mines user trajectories
–Fast & scalable
–Accurate & interpretable
Thank You!