Geographical and Temporal Similarity Measurement in Location-based Social Networks Chongqing University of Posts and Telecommunications KTH – Royal Institute of Technology Zhengwu Yuan Yanli Jiang Gyözö Gidofalvi
Outline Introduction Related work A hierarchical spatio-temporal similarity measure in LBSN Empirical evaluations MobiGIS 2013, Orlando, FL2
Introduction
Diagram labels: Mobile Internet technology; Internet technology; Space location technology; Location-based Social Network (LBSN); User Similarity
LBSN Applications
Information Layout of LBSN
Source: Gao et al., Data Analysis on Location-Based Social Networks
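Traditional: Cosine Similarity
Given a set of commonly rated items $I_{AB}$, the cosine similarity between two users $A$ and $B$, based on their respective ratings $R_{A,i}$ and $R_{B,i}$ on items $i \in I_{AB}$, is:
$$\mathrm{sim}_{\cos}(A,B) = \frac{\sum_{i \in I_{AB}} R_{A,i}\, R_{B,i}}{\sqrt{\sum_{i \in I_{AB}} R_{A,i}^{2}}\;\sqrt{\sum_{i \in I_{AB}} R_{B,i}^{2}}}$$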
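A minimal sketch of cosine similarity restricted to commonly rated items. The dict-based rating layout and the function name are assumptions for illustration, not part of the slides.

```python
import math

def cosine_similarity(ratings_a, ratings_b):
    """Cosine similarity over the items rated by both users.

    ratings_a, ratings_b: dicts mapping item id -> rating
    (hypothetical data layout chosen for this sketch).
    """
    common = set(ratings_a) & set(ratings_b)  # the commonly rated items I_AB
    if not common:
        return 0.0  # no shared items: treat as no similarity
    dot = sum(ratings_a[i] * ratings_b[i] for i in common)
    norm_a = math.sqrt(sum(ratings_a[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(ratings_b[i] ** 2 for i in common))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector has no direction
    return dot / (norm_a * norm_b)
```

Because only the direction of the rating vectors matters, a user who rates every shared item exactly twice as high as another user still gets similarity 1.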
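Traditional: Adjusted Cosine Similarity
Given a set of commonly rated items $I_{AB}$, the adjusted cosine similarity between two users $A$ and $B$, based on the sets of their individually rated items $I_A$ and $I_B$ and their average individual ratings on these items, $\bar{R}_A$ and $\bar{R}_B$, is:
$$\mathrm{sim}_{\mathrm{adj}}(A,B) = \frac{\sum_{i \in I_{AB}} (R_{A,i} - \bar{R}_A)(R_{B,i} - \bar{R}_B)}{\sqrt{\sum_{i \in I_{A}} (R_{A,i} - \bar{R}_A)^{2}}\;\sqrt{\sum_{i \in I_{B}} (R_{B,i} - \bar{R}_B)^{2}}}$$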
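A minimal sketch of one common reading of adjusted cosine similarity, assuming that each user's mean and norm are taken over all items that user rated, while the numerator runs over the commonly rated items only. The dict layout and function name are hypothetical.

```python
import math

def adjusted_cosine_similarity(ratings_a, ratings_b):
    """Adjusted cosine similarity between two users.

    Assumption of this sketch: means and norms use each user's own
    rated items (I_A, I_B); the numerator uses the common items (I_AB).
    """
    if not ratings_a or not ratings_b:
        return 0.0
    mean_a = sum(ratings_a.values()) / len(ratings_a)
    mean_b = sum(ratings_b.values()) / len(ratings_b)
    common = set(ratings_a) & set(ratings_b)
    num = sum((ratings_a[i] - mean_a) * (ratings_b[i] - mean_b) for i in common)
    norm_a = math.sqrt(sum((r - mean_a) ** 2 for r in ratings_a.values()))
    norm_b = math.sqrt(sum((r - mean_b) ** 2 for r in ratings_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a user with constant ratings carries no signal
    return num / (norm_a * norm_b)
```

Subtracting each user's mean compensates for users who rate systematically high or low, which plain cosine similarity does not do.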
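Traditional: Pearson Correlation Coefficient
Given a set of commonly rated items $I_{AB}$, the Pearson correlation coefficient between two users $A$ and $B$, based on their ratings $R_{A,i}$ and $R_{B,i}$ and their average individual ratings $\bar{R}_A$ and $\bar{R}_B$, is:
$$\mathrm{sim}_{\mathrm{PCC}}(A,B) = \frac{\sum_{i \in I_{AB}} (R_{A,i} - \bar{R}_A)(R_{B,i} - \bar{R}_B)}{\sqrt{\sum_{i \in I_{AB}} (R_{A,i} - \bar{R}_A)^{2}}\;\sqrt{\sum_{i \in I_{AB}} (R_{B,i} - \bar{R}_B)^{2}}}$$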
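A minimal sketch of the Pearson correlation coefficient between two users, assuming that the means and all sums are taken over the commonly rated items only. Some formulations use each user's overall mean instead, so this is one possible reading. The dict layout and function name are hypothetical.

```python
import math

def pearson_similarity(ratings_a, ratings_b):
    """Pearson correlation over the items rated by both users.

    Assumption of this sketch: means are computed over the common
    items I_AB, so items rated by only one user have no influence.
    """
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return 0.0  # correlation is undefined for fewer than two points
    mean_a = sum(ratings_a[i] for i in common) / len(common)
    mean_b = sum(ratings_b[i] for i in common) / len(common)
    num = sum((ratings_a[i] - mean_a) * (ratings_b[i] - mean_b) for i in common)
    den_a = math.sqrt(sum((ratings_a[i] - mean_a) ** 2 for i in common))
    den_b = math.sqrt(sum((ratings_b[i] - mean_b) ** 2 for i in common))
    if den_a == 0 or den_b == 0:
        return 0.0
    return num / (den_a * den_b)
```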
Similarity in LBSN
Similarity along (a combination of) different dimensions:
- Content layer, e.g.: Ye’11, McKenzie’13
- Social layer, e.g.: Ye’12
- Geographical layer, e.g.: Li’08
- Semantic locations / categories of locations, e.g.: Xiao’10, Bao’12, Ye’11
- Temporal sequential similarity, e.g.: Li’08
- Check-in temporal similarity, e.g.: Ye’
A Hierarchical Spatio-Temporal Similarity Measure in LBSN
Assumptions about user similarity:
- The closer two users' check-ins are in time and geographical location, the more similar the two users are to each other.
- The larger the number of check-ins of two users in nearby locations at similar times, the more similar the two users are to each other.
- Similarity changes with the level of detail.
Proposed method:
- Extract spatio-temporal (ST) clusters from user check-ins at different spatio-temporal levels of detail.
- For each ST level of detail, measure the cosine similarity between users using the classical Vector Space Model (VSM), with vectors composed of the number of user visits in the different ST clusters.
- Calculate the weighted combination of the similarities at the different ST levels of detail.
Hierarchical Spatio-Temporal Clustering
- Spatio-temporal variant of DBSCAN: ST-DBSCAN [Birant’07]
- An object is a core object if the number of objects within its spatial (Eps_space) and temporal (Eps_time) neighborhood is at least MinPts.
- The definitions of Directly Density-Reachable (DDR), Density-Reachable (DR), and Density-Connected are straightforward extensions.
- Clusters at different levels of detail: [figure]
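The core-object condition can be sketched as follows. This is not a full ST-DBSCAN implementation. It assumes check-ins are given as (x, y, t) tuples in planar coordinates with Euclidean spatial distance (real check-in data would more likely need a geodesic distance), and that a point counts as its own neighbor, as in standard DBSCAN. All names are hypothetical.

```python
import math

def st_neighbors(points, idx, eps_space, eps_time):
    """Indices of points within both the spatial and the temporal radius.

    points: list of (x, y, t) tuples (assumed planar coordinates).
    The query point itself is included in the result.
    """
    x, y, t = points[idx]
    return [
        j for j, (xj, yj, tj) in enumerate(points)
        if math.hypot(x - xj, y - yj) <= eps_space and abs(t - tj) <= eps_time
    ]

def is_core_object(points, idx, eps_space, eps_time, min_pts):
    """True if the ST neighborhood of points[idx] holds at least min_pts objects."""
    return len(st_neighbors(points, idx, eps_space, eps_time)) >= min_pts
```

Running the same test with larger Eps_space and Eps_time values turns more points into core objects, which is one way to think about obtaining coarser clusters at a higher level of the hierarchy.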
Vector Space Model
Define the user-location matrix within a certain period as
$$V^{l} = \left[ V_{ij} \right]_{m \times n}$$
where $m$ is the total number of users, $n$ is the number of ST clusters discovered by ST-DBSCAN(Eps_space, Eps_time, MinPts), $V_{ij}$ is the number of check-ins by user $i$ in ST cluster $j$, and $l$ is the level of detail in the clustering hierarchy.
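A minimal sketch of building the user-location matrix for one level of detail, assuming each check-in has already been assigned a cluster label and that noise points carry the label -1 (a common DBSCAN convention, assumed here). The input layout and function name are hypothetical.

```python
def build_user_cluster_matrix(checkins, noise_label=-1):
    """Count check-ins per user and ST cluster.

    checkins: iterable of (user_id, cluster_id) pairs (assumed layout).
    Returns (users, clusters, matrix) where matrix[i][j] is the number
    of check-ins of users[i] in clusters[j]. Noise check-ins are skipped.
    """
    checkins = list(checkins)
    users = sorted({u for u, _ in checkins})
    clusters = sorted({c for _, c in checkins if c != noise_label})
    row = {u: i for i, u in enumerate(users)}
    col = {c: j for j, c in enumerate(clusters)}
    matrix = [[0] * len(clusters) for _ in users]
    for u, c in checkins:
        if c != noise_label:
            matrix[row[u]][col[c]] += 1
    return users, clusters, matrix
```

Each row of the resulting matrix is the location vector of one user at that level of detail.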
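User Similarity
User similarity at a given cluster hierarchy level $l$ is the cosine similarity of the location vectors of the users:
$$\mathrm{sim}_{l}(u_a, u_b) = \frac{\sum_{j=1}^{n} V_{aj}\, V_{bj}}{\sqrt{\sum_{j=1}^{n} V_{aj}^{2}}\;\sqrt{\sum_{j=1}^{n} V_{bj}^{2}}}$$
where $V_{aj}$ and $V_{bj}$ are the numbers of check-ins of users $u_a$ and $u_b$ in ST cluster $j$ at level $l$.
The overall similarity of users is calculated across the cluster hierarchy levels as a weighted combination of the per-level similarities.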
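A minimal sketch of combining per-level cosine similarities. The slides do not give the weights here, so the normalized weighted sum and the example weights below are assumptions for illustration only.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def hierarchical_similarity(vectors_a, vectors_b, weights):
    """Weighted combination of per-level cosine similarities.

    vectors_a[l], vectors_b[l]: location vectors of the two users at level l.
    weights[l]: hypothetical non-negative weight of level l (assumed form:
    a weighted sum normalized by the total weight).
    """
    if not (len(vectors_a) == len(vectors_b) == len(weights)):
        raise ValueError("one vector pair and one weight per level are required")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    score = sum(w * cosine(a, b) for w, a, b in zip(weights, vectors_a, vectors_b))
    return score / total
```

For example, two users who never share a fine-grained cluster but fall into the same coarse cluster get a nonzero overall score, driven entirely by the coarse level.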
Dataset
Gowalla check-in datasets from the Stanford Network Analysis Project (SNAP) for the US cities:
Evaluation Metrics
Precision and recall (“relative overlap”) of the visits of a user $u_r$ and its most similar user $u$ to the Top-N ST clusters / POIs:
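A minimal sketch of one plausible reading of these metrics. It assumes the Top-N clusters of the most similar user act as the recommended set, the clusters actually visited by the target user act as the relevant set, and ties are broken by cluster id. The exact definitions used in the evaluation may differ, and all names are hypothetical.

```python
def top_n_clusters(visit_counts, n):
    """The n most visited clusters of a user, ties broken by cluster id."""
    visited = [c for c, k in visit_counts.items() if k > 0]
    visited.sort(key=lambda c: (-visit_counts[c], c))
    return visited[:n]

def precision_recall(target_counts, similar_counts, n):
    """Overlap between the target user's visits and the similar user's Top-N.

    target_counts, similar_counts: dicts mapping cluster id -> visit count.
    Assumed definitions: precision = hits / |Top-N|, recall = hits / |visited|.
    """
    recommended = set(top_n_clusters(similar_counts, n))
    relevant = {c for c, k in target_counts.items() if k > 0}
    hits = len(recommended & relevant)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```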
Results
- ST generalization at different levels of detail improves performance.
- Combining similarity measures at different ST levels of detail improves precision and recall and outperforms the fine-grained method (see ST-DBSCAN).
- Considering the amount (not only the existence) of check-ins at different ST clusters improves performance (see Jaccard).
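To illustrate why the amount of check-ins can matter, the following sketch contrasts a set-based Jaccard similarity with a count-based cosine similarity on toy vectors. The Jaccard definition over visited clusters is an assumption about the baseline, and the numbers are made up for illustration, not taken from the evaluation.

```python
import math

def jaccard(u, v):
    """Jaccard similarity of the sets of clusters each user visited at least once."""
    set_u = {j for j, c in enumerate(u) if c > 0}
    set_v = {j for j, c in enumerate(v) if c > 0}
    union = set_u | set_v
    return len(set_u & set_v) / len(union) if union else 0.0

def cosine(u, v):
    """Cosine similarity of two check-in count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Toy check-in counts over three ST clusters (illustrative only).
user_a = [10, 1, 0]
user_b = [1, 10, 0]  # same clusters as user_a, opposite preference
user_c = [9, 1, 0]   # same clusters and nearly the same preference
```

Jaccard cannot tell `user_b` and `user_c` apart relative to `user_a` because all three visited the same clusters, while cosine on counts ranks `user_c` as much closer.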
Conclusions
- We have proposed a new method to calculate user similarity in LBSNs based on the spatial and temporal properties of user check-in data.
- The method can be applied to recommend locations or friends in LBSNs, because the key to a recommendation system is measuring the similarity of users or items.
Thank you for your attention! Q/A?