Mining User Similarity from Semantic Trajectories. Authors: Josh Jia-Ching Ying, Eric Hsueh-Chan Lu, Wang-Chien Lee, Tz-Chiao Weng, and Vincent S. Tseng. Presenter: Josh Jia-Ching Ying.
Outline: Introduction; Semantic Trajectory Based Friend Recommendation; Experiment; Conclusion and future work.
Introduction: With the rapid growth of, and fierce competition in, the market for social networking services, many service providers have deployed recommendation services such as friend recommendation. Most friend recommendation engines rely on users' profiles or online behavior rather than capturing the "real" characteristics of user behavior. In recent years, a new breed of social networking services, called location-based social networks (LBSNs), has emerged.
Introduction: User similarity plays a crucial role in these friend recommendation services. Most studies on measuring the similarity of mobile users analyze only the geographic features of user trajectories. The notion of a semantic trajectory was proposed by Alvares et al. in 2007: a sequence of locations with semantic tags that capture the landmarks passed by, e.g., School, Park, Restaurant.
Semantic Trajectory Based Friend Recommendation Framework. [Figure: framework diagram with four numbered stages. Inputs: trajectory logs and geographic information. Stages: Semantic Trajectory Transformation, Semantic Trajectory Pattern Mining, User Similarity Measurement, Potential Friends Recommender. Intermediate results: semantic trajectories, pattern sets, user similarity matrix. Devices shown: smart-phones or PDAs; laptops or PCs.]
Semantic Trajectory Transformation: The input is either a GPS trajectory or a cell trajectory (Eagle et al.).
Semantic Trajectory Transformation: For a GPS trajectory (basically following Alvares et al.'s approach), each visited place is replaced by its semantic tag, e.g., US Post Office → Post Office, Seniore's Pizza → Restaurant, Fremont Park → Park.
Semantic Trajectory Transformation: For a cell trajectory, stay time = leave time - arrive time. If the stay time exceeds a time threshold, the cell is called a stay cell.
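The stay-cell rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the `CellVisit` record, the field names, the time unit (seconds), and the threshold value are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class CellVisit:
    """One visit to a cell (hypothetical record layout)."""
    cell_id: str
    arrive: float  # arrival time, assumed to be in seconds
    leave: float   # leave time, assumed to be in seconds


def stay_cells(visits, time_threshold):
    """Return, in visit order, the cells whose stay time exceeds the threshold."""
    result = []
    for visit in visits:
        stay_time = visit.leave - visit.arrive  # stay time = leave time - arrive time
        if stay_time > time_threshold:
            result.append(visit.cell_id)
    return result


visits = [
    CellVisit("Cell0", 0, 1200),     # 1200 s stay
    CellVisit("Cell7", 1200, 1260),  # 60 s stay, too short to count
    CellVisit("Cell1", 1260, 3000),  # 1740 s stay
]
stays = stay_cells(visits, time_threshold=600)
```

With an assumed threshold of 600 seconds, only `Cell0` and `Cell1` qualify as stay cells; the short pass through `Cell7` is dropped.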
Semantic Trajectory Transformation: The stay cell sequence <Stay Cell0, Stay Cell1, Stay Cell2, Stay Cell3> is transformed into the semantic trajectory <{Unknown}, {School, Park}, {Park}, {Hospital}>.
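As a rough sketch of this transformation, each stay cell can be replaced by the set of semantic tags annotated for it. The dictionary `cell_annotations` and the function name are hypothetical, and the fallback to `{"Unknown"}` for cells without annotations is inferred from the example above.

```python
def to_semantic_trajectory(stay_cell_sequence, cell_annotations):
    """Map each stay cell to its set of semantic tags.

    Cells with no annotation are tagged {"Unknown"} (assumption based on
    the example in the slide).
    """
    trajectory = []
    for cell in stay_cell_sequence:
        tags = cell_annotations.get(cell) or {"Unknown"}
        trajectory.append(set(tags))
    return trajectory


# Hypothetical annotation table; Stay Cell0 has no annotation.
cell_annotations = {
    "Stay Cell1": {"School", "Park"},
    "Stay Cell2": {"Park"},
    "Stay Cell3": {"Hospital"},
}
sequence = ["Stay Cell0", "Stay Cell1", "Stay Cell2", "Stay Cell3"]
semantic = to_semantic_trajectory(sequence, cell_annotations)
```

Here `semantic` reproduces the semantic trajectory of the example: an unknown cell followed by {School, Park}, {Park}, and {Hospital}.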
User Similarity Measurement: Maximal Semantic Trajectory Pattern Similarity (MSTP-Similarity) is used to measure the similarity between two semantic trajectory pattern sets. Example patterns shown: <A,{BC}>, <A,{BC},E>, <D,{AC},E>, <D,{AC},E>, …
MSTP-Similarity, common part: Given two maximal semantic trajectory patterns, we argue that they are more similar when they have more parts in common. The common part is the longest common sequence (LCS) of the two patterns.
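The common part can be illustrated with a standard dynamic-programming LCS. The slide does not say how two elements of a pattern are matched, so the sketch below assumes that two elements match when their tag sets intersect and that the common element is the intersection; both rules, and the function name, are assumptions for illustration only.

```python
def common_part(p, q):
    """Longest common sequence of two patterns given as lists of tag sets.

    Assumption: elements match when their tag sets intersect, and the
    matched element contributes the intersection to the common part.
    """
    m, n = len(p), len(q)
    # length[i][j] = length of the longest common sequence of p[i:] and q[j:]
    length = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            best = max(length[i + 1][j], length[i][j + 1])
            if p[i] & q[j]:
                best = max(best, 1 + length[i + 1][j + 1])
            length[i][j] = best

    # Walk the table from the start to recover one longest common sequence.
    common, i, j = [], 0, 0
    while i < m and j < n:
        if p[i] & q[j] and length[i][j] == 1 + length[i + 1][j + 1]:
            common.append(p[i] & q[j])
            i += 1
            j += 1
        elif length[i + 1][j] >= length[i][j + 1]:
            i += 1
        else:
            j += 1
    return common


# Patterns <A,{BC},E> and <D,{AC},E>
p = [{"A"}, {"B", "C"}, {"E"}]
q = [{"D"}, {"A", "C"}, {"E"}]
lcs = common_part(p, q)
```

For these two patterns the common part has two elements and ends with {E}. When several longest common sequences exist, the sketch returns one of them.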
MSTP-Similarity: The participation ratio of the common part to a pattern.
MSTP-Similarity, pattern similarity: computed as either an equal average or a weighted average.
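The formulas for the participation ratio and the two averages are not preserved in this text, so the following is only a sketch under stated assumptions: the participation ratio is taken as the number of tags in the common part divided by the number of tags in the pattern, the equal average is the plain mean of the two ratios, and the weighted average weights each ratio by the number of elements in its pattern. All function names are hypothetical.

```python
def tag_count(pattern):
    """Total number of semantic tags across all elements of a pattern."""
    return sum(len(element) for element in pattern)


def participation_ratio(common, pattern):
    """Assumed definition: share of the pattern's tags covered by the common part."""
    total = tag_count(pattern)
    return tag_count(common) / total if total else 0.0


def pattern_similarity(common, p, q, weighted=False):
    """Combine the two participation ratios (p and q must be non-empty).

    weighted=False: equal average of the two ratios.
    weighted=True:  average weighted by pattern length (assumption).
    """
    ratio_p = participation_ratio(common, p)
    ratio_q = participation_ratio(common, q)
    if not weighted:
        return (ratio_p + ratio_q) / 2
    return (len(p) * ratio_p + len(q) * ratio_q) / (len(p) + len(q))


# Patterns <A,{BC}> and <A,{BC},E>; their common part is <A,{BC}>.
p = [{"A"}, {"B", "C"}]
q = [{"A"}, {"B", "C"}, {"E"}]
common = [{"A"}, {"B", "C"}]
equal = pattern_similarity(common, p, q)
weighted = pattern_similarity(common, p, q, weighted=True)
```

Under these assumptions the ratios are 1 for the first pattern and 3/4 for the second, giving an equal average of 0.875 and a length-weighted average of 0.85.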
Similarity between two Users: To measure how similar two pattern sets are, the pairwise pattern similarities can be combined with equal weight, weighted by support, or weighted by TFIDF. For two users U and V with pattern sets {P1, …, Pm} and {P1', …, Pn'}, there are m×n maximal semantic trajectory pattern similarities.
Weighting by support: A pattern with high support is more important. Two variants are considered: geometric mean and arithmetic average. [Figure: the pattern sets P1 … Pm and P1' … Pn' of users U and V.]
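One plausible reading of the weighting schemes is a weighted average over all m×n pairwise pattern similarities, where the weight of a pair is 1 (equal weight), the geometric mean of the two supports, or their arithmetic average. The exact formula is not preserved here, so this aggregation, the function name, and the sample numbers are assumptions.

```python
import math


def user_similarity(sim_matrix, supports_u, supports_v, scheme="equal"):
    """Weighted average of the m x n pairwise pattern similarities.

    sim_matrix[i][j] is the similarity between pattern i of one user and
    pattern j of the other; supports_u and supports_v hold the supports.
    """
    total, weight_sum = 0.0, 0.0
    for i, row in enumerate(sim_matrix):
        for j, sim in enumerate(row):
            if scheme == "equal":
                weight = 1.0
            elif scheme == "geometric":
                weight = math.sqrt(supports_u[i] * supports_v[j])
            elif scheme == "arithmetic":
                weight = (supports_u[i] + supports_v[j]) / 2
            else:
                raise ValueError(f"unknown scheme: {scheme}")
            total += weight * sim
            weight_sum += weight
    return total / weight_sum if weight_sum else 0.0


# Made-up example: two patterns per user, only the first pair is similar.
sim_matrix = [[1.0, 0.0], [0.0, 0.0]]
supports_u = [0.8, 0.2]
supports_v = [0.5, 0.5]
by_equal = user_similarity(sim_matrix, supports_u, supports_v, "equal")
by_geometric = user_similarity(sim_matrix, supports_u, supports_v, "geometric")
by_arithmetic = user_similarity(sim_matrix, supports_u, supports_v, "arithmetic")
```

Because the only similar pair involves a high-support pattern, both support-based schemes score the users higher (about 0.333 and 0.325) than equal weighting does (0.25). The same function could take TFIDF values in place of supports.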
TFIDF: TFIDF = TF × log(IDF), where TF is the term frequency and IDF is the inverse document frequency.
Example with three users:
  User a: <A,{BC}>: 6, <A,D>: 2
  User b: <A,{BC}>: 3, <B,E>: 6
  User c: <A,{BC}>: 3, <A,D>: 2
Term frequency:
  User a: <A,{BC}>: 6/(6+2) = 3/4, <A,D>: 2/(6+2) = 1/4
  User b: <A,{BC}>: 3/(3+6) = 1/3, <B,E>: 6/(3+6) = 2/3
  User c: <A,{BC}>: 3/(3+2) = 3/5, <A,D>: 2/(3+2) = 2/5
Inverse document frequency:
  <A,{BC}>: 3/(1+1+1) = 1
  <A,D>: 3/(1+0+1) = 3/2
  <B,E>: 3/(0+1+0) = 3
TFIDF in User a:
  <A,{BC}>: (3/4) × log 1 = 0
  <A,D>: (1/4) × log 1.5 = 0.04
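The worked example above can be reproduced with a short script. The base of the logarithm is not stated; base 10 is assumed here because it yields the 0.04 shown for <A,D>. The dictionary layout and function names are illustrative only.

```python
import math

# Per-user pattern counts from the worked example.
counts = {
    "a": {"<A,{BC}>": 6, "<A,D>": 2},
    "b": {"<A,{BC}>": 3, "<B,E>": 6},
    "c": {"<A,{BC}>": 3, "<A,D>": 2},
}


def tf(user, pattern):
    """Term frequency: the pattern's count divided by the user's total count."""
    user_counts = counts[user]
    return user_counts.get(pattern, 0) / sum(user_counts.values())


def idf(pattern):
    """Inverse document frequency: number of users / users holding the pattern.

    Only meaningful for patterns held by at least one user.
    """
    holders = sum(1 for user_counts in counts.values() if pattern in user_counts)
    return len(counts) / holders


def tfidf(user, pattern):
    """TFIDF = TF * log(IDF), with a base-10 logarithm (assumption)."""
    return tf(user, pattern) * math.log10(idf(pattern))


tfidf_a_abc = tfidf("a", "<A,{BC}>")  # (3/4) * log10(1)
tfidf_a_ad = tfidf("a", "<A,D>")      # (1/4) * log10(1.5)
```

A pattern held by every user, such as <A,{BC}>, gets weight 0 under this scheme, so it contributes nothing when TFIDF is used to weight pattern pairs.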
Using TFIDF as the weight: Two variants are considered: geometric mean and arithmetic average. [Figure: the pattern sets P1 … Pm and P1' … Pn' of users U and V.]
Experiment, dataset: The MIT Reality Mining dataset. The Reality Mining project was conducted from 2004 to 2005 at the MIT Media Laboratory. The dataset provides cell trajectory logs and cell annotation logs.
Experiment, ground truth: The MIT Media Laboratory conducted an online survey, which was completed by 94 mobile users. The survey data summarize the behavior of each mobile user. Of the 94 users, 7 have no cell trajectory logs and 10 have no cell annotation logs; the remaining 77 mobile users are used in our experiments.
Experiment, baseline: We directly apply a maximal sequential pattern mining algorithm to the stay cell sequence set of each mobile user.
Conclusion and future work: We propose a novel framework to support friend recommendation services based on the semantic trajectories of mobile users, together with MSTP-Similarity for measuring the similarity between two semantic trajectory patterns. A series of experiments shows that the proposed friend recommendation framework delivers excellent performance under various conditions. Future work: consider stay time (duration) and geographic features in the recommender.
Thank you for your attention. Questions?