Published by Bruno Tate. Modified over 9 years ago.
1
Multiple Location Profiling for Users and Relationships from Social Network and Content
Rui Li, Shengjie Wang, Kevin Chen-Chuan Chang
University of Illinois at Urbana-Champaign
2
Users' locations are important for many information services.
Example: Carol (user) lives in Los Angeles.
Social network: local friends recommendation
Content provider: local content recommendation
3
The community has explored the social network and content to profile users' locations.
Profiling a user's home location
Tweets: "Terrible LA traffic!", "Want to go to Honolulu for Spring vacation!", "See Gaga in Hollywood.", "Good Morning!"
Social network: Mike (LA), Carol (?), Lucy (Austin), Gaga (NY), Bob (San Diego), Jean (?)
Location: Los Angeles
4
Problem 1: They only profile a single home location.
Carol lives in Los Angeles and studied at the Uni. of Texas at Austin.
Evidence used: locations of a user's friends, tweeted locational words
Locational word (frequency): Paramount (1), Los Angeles (1), Hollywood (2), Austin (2)
o incomplete
o inaccurate
5
Problem 2: They entirely miss profiling relationships.
Relationship profiling
Carol follows Bob: both Carol and Bob work in Los Angeles
Carol follows Lucy: both Carol and Lucy studied at Austin
Carol tweets Hollywood: Carol lives in Los Angeles
o useful!
6
We focus on multiple location profiling for users and relationships.
Carol in the real world: Location: Los Angeles. Education: Uni. of Texas at Austin.
Tweets: "Terrible LA traffic!", "Want to go to Honolulu for Spring vacation!", "See Gaga in Hollywood.", "Good Morning!"
Social network: Mike (LA), Carol (?), Lucy (Austin), Gaga (NY), Bob (San Diego), Jean (?)
Carol's location profile: Los Angeles, Austin
Carol follows Lucy: Austin, Austin
7
Our approach is to build a model that connects known relationships with unknown locations.
Known relationships
Following relationships: Carol follows Lucy, Carol follows Mike, ….
Tweeting relationships: Carol tweets Hollywood, Carol tweets Honolulu, ….
Unknown locations: users' locations (?)
MLP model: generation model, inference algorithm
8
There are three challenges in building MLP.
Challenge 1: How to connect users' locations with relationships?
A. from users' locations to following relationships
B. from users' locations to tweeting relationships
Challenge 2: How to model that the relationships are mixed?
A. Some relationships are not based on locations.
B. Each relationship is based on a different location.
Challenge 3: How to utilize home locations from labeled users?
9
Challenge 1.A: We need to connect following relationships with two users' locations.
Even a user who has only one location follows others from different locations.
Following probability: the probability of generating a following relationship from one user to another based on their locations.
Carol in Los Angeles follows Bob in San Diego. 20%
Carol in Los Angeles follows Mike in Los Angeles. 30%
…
10
Observation: We explore the following probability by investigating a corpus.
It captures our intuition well.
It fits a power-law distribution.
11
Solution: We derive a location-based following model for the following probability.
The location-based following model
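The slide's formula is not preserved in this transcript. As a minimal sketch of what a power-law following probability could look like, assume the probability that one user follows another decays as beta * d^alpha in the distance d between their locations. The function names, the parameter values, and the distance floor are illustrative assumptions, not the authors' fitted model.

```python
import math


def following_probability(distance_miles, alpha=-0.5, beta=0.005, min_distance=1.0):
    """Power-law following probability beta * d**alpha.

    alpha and beta are placeholder values, not parameters from the talk.
    Distances below min_distance are clamped so that d = 0 stays finite.
    """
    d = max(distance_miles, min_distance)
    return min(1.0, beta * d ** alpha)


def fit_power_law(distances, probabilities):
    """Fit alpha and beta by least squares on the log-log scale.

    log p = log beta + alpha * log d, so a straight-line fit recovers both.
    """
    xs = [math.log(d) for d in distances]
    ys = [math.log(p) for p in probabilities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx
    beta = math.exp(mean_y - alpha * mean_x)
    return alpha, beta


# Nearby users are more likely to be followed than distant ones.
near = following_probability(10)
far = following_probability(1000)
```

A straight line on a log-log plot of observed following frequency against distance is the usual visual check that a power law is a reasonable fit.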
12
Challenge 1.B: We need to connect tweeting relationships with a user's location.
A user at one location tweets about different locations.
Tweeting probability: the probability of generating a tweeting relationship from a user to a venue based on a location.
Carol in Los Angeles tweets about watching a show in Hollywood. 30%
Carol in Los Angeles tweets about traffic in Los Angeles. 40%
…
13
Observation: We explore the tweeting probability by investigating a corpus.
The observed probabilities capture our intuition well.
They can be modeled as a set of multinomial distributions.
14
Solution: We derive a location-based tweeting model for the tweeting probability.
The location-based tweeting model
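One way to read "a set of multinomial distributions" is one distribution over venues per location, estimated from counts of (user location, tweeted venue) pairs. The sketch below assumes that reading. The add-one smoothing, the function name, and the toy observations are assumptions made for illustration, not details from the talk.

```python
from collections import Counter, defaultdict


def estimate_tweeting_model(observations, vocabulary, smoothing=1.0):
    """Estimate P(venue | location) as one multinomial per location.

    observations: iterable of (location, venue) pairs from users whose
    location is known. smoothing is an additive (Laplace) constant so that
    unseen venues keep a small non-zero probability.
    """
    counts = defaultdict(Counter)
    for location, venue in observations:
        counts[location][venue] += 1

    model = {}
    for location, venue_counts in counts.items():
        total = sum(venue_counts.values()) + smoothing * len(vocabulary)
        model[location] = {
            venue: (venue_counts[venue] + smoothing) / total
            for venue in vocabulary
        }
    return model


# Toy data, invented for this example.
vocabulary = ["Los Angeles", "Hollywood", "Austin", "Honolulu"]
observations = [
    ("Los Angeles", "Los Angeles"),
    ("Los Angeles", "Los Angeles"),
    ("Los Angeles", "Hollywood"),
    ("Austin", "Austin"),
]
tweeting_model = estimate_tweeting_model(observations, vocabulary)
```

Each entry of `tweeting_model` sums to one over the vocabulary, which is what makes it a multinomial distribution for that location.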
15
Challenge 2.A: There are both noisy and location-based relationships.
Noisy relationships: Carol follows Lady Gaga, Carol tweets Honolulu
Location-based relationships: Carol follows Lucy, Carol tweets Los Angeles
Noisy relationships are not useful!
16
Solution: We propose a mixture component for the two types of relationships.
1. A relationship is generated by either a location-based model or a random model.
2. A binary model selector μ indicates which model is used.
3. The selector is generated from a binomial distribution.
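The three steps above can be sketched as a two-component mixture. This is a hedged illustration: the mixing weight `rho`, the convention that μ = 1 selects the location-based model, and all function names are assumptions, since the slide does not give them. A single-trial binomial draw is a Bernoulli draw, which is what the sampler uses.

```python
import random


def sample_selector(rho, rng):
    """Draw the binary selector mu; here mu = 1 means 'location-based'."""
    return 1 if rng.random() < rho else 0


def mixture_probability(p_location, p_random, rho):
    """Probability of a relationship after marginalising out the selector."""
    return rho * p_location + (1.0 - rho) * p_random


def posterior_location_based(p_location, p_random, rho):
    """P(mu = 1 | relationship): how likely the relationship is location-based."""
    numerator = rho * p_location
    denominator = numerator + (1.0 - rho) * p_random
    return numerator / denominator if denominator > 0 else 0.0


rng = random.Random(0)
mu = sample_selector(rho=0.5, rng=rng)

# A relationship that the location-based model explains well (0.3) and the
# random model explains poorly (0.01) is attributed mostly to location.
weight = posterior_location_based(p_location=0.3, p_random=0.01, rho=0.5)
```

The posterior is what lets inference discount noisy relationships such as "Carol follows Lady Gaga" instead of forcing a location onto them.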
17
Challenge 2.B: Location-based relationships are related to multiple locations.
Location-based relationships
Carol follows Lucy: both Carol and Lucy studied at Austin
Carol tweets Hollywood: Carol lives in Los Angeles
Accurate! Complete!
18
Solution: We fundamentally model users' multiple locations in generating relationships.
A location profile is a multinomial distribution over locations, e.g. Carol {Los Angeles 0.1, Austin 0.1, …}.
Each relationship is based on one particular location from the user's profile.
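A minimal sketch of this generative step: treat the profile as a dictionary of location probabilities and draw one location assignment per relationship. The profile weights below are invented for the example (they are not the values on the slide), and the function name is hypothetical.

```python
import random


def sample_assignment(profile, rng):
    """Draw the single location that a relationship is based on.

    profile maps location -> probability (a multinomial over locations).
    """
    locations = list(profile)
    weights = [profile[location] for location in locations]
    return rng.choices(locations, weights=weights, k=1)[0]


# Hypothetical profile for Carol; the weights are assumptions.
carol_profile = {"Los Angeles": 0.6, "Austin": 0.4}
relationships = ["follows Lucy", "follows Mike", "tweets Hollywood"]

rng = random.Random(0)
assignments = {
    relationship: sample_assignment(carol_profile, rng)
    for relationship in relationships
}
```

Because every relationship gets its own draw, "Carol follows Lucy" can be explained by Austin while "Carol tweets Hollywood" is explained by Los Angeles.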
19
Challenge 3: We should utilize observed locations from some users' profiles.
20% of users provide their home locations in their profiles.
Mike (LA), Carol (?), Lucy (Austin), Gaga (NY), Bob (San Diego), Jean (?)
They are useful for profiling locations!
We cannot use them directly to generate relationships!
20
Solution: We utilize observed locations as priors to generate users' profiles.
We assume users' profiles are generated from prior distributions.
Under these priors, users' observed home locations are likely to be generated, e.g. Bob {San Diego 0.9, Los Angeles 0.05, …}.
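One common way to encode such a prior is with Dirichlet pseudo-counts: every candidate location gets a small base count and an observed home location gets an extra boost. The sketch below assumes that encoding. The `base` and `boost` values and both function names are illustrative assumptions, not the talk's parameters.

```python
def prior_pseudocounts(candidate_locations, home_location=None, base=0.1, boost=5.0):
    """Build Dirichlet pseudo-counts for one user's location profile.

    Unlabeled users get a flat prior; labeled users get extra mass on the
    home location they reported.
    """
    prior = {location: base for location in candidate_locations}
    if home_location is not None:
        prior[home_location] = prior.get(home_location, base) + boost
    return prior


def expected_profile(assignment_counts, prior):
    """Posterior-mean profile: (count + pseudo-count) normalised to sum to 1."""
    total = sum(prior.values()) + sum(assignment_counts.get(l, 0) for l in prior)
    return {
        location: (assignment_counts.get(location, 0) + prior[location]) / total
        for location in prior
    }


candidates = ["San Diego", "Los Angeles", "Austin"]
bob_prior = prior_pseudocounts(candidates, home_location="San Diego")
bob_profile = expected_profile({}, bob_prior)

# Carol has no reported home location, so her prior is flat and her profile
# is driven by the locations assigned to her relationships.
carol_prior = prior_pseudocounts(candidates)
carol_profile = expected_profile({"Los Angeles": 3, "Austin": 2}, carol_prior)
```

The prior makes the reported home location likely without making it certain, so relationship evidence can still add further locations to a labeled user's profile.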
21
Therefore, we arrive at a complete model.
22
We evaluate our model on a large Twitter corpus.
We crawled a subset of Twitter: 139K users, 50 million tweets, and 2 million following relationships.
23
Task 1: In profiling users' home locations, MLP performs accurately and improves on the baselines.
24
Task 2: In profiling users' multiple locations, MLP performs accurately and completely.
Precision and recall at rank 2 (accurately, completely)
Case studies: locations in a similar region, locations in different areas
25
Task 3: In profiling following relationships, MLP achieves 57% accuracy.
26
Thanks and Questions!
27
Backup for Questions
28
Experiments 1
We use the home locations provided in users' profiles as ground truth.
We compare against two baseline methods proposed in the literature.
29
Experiments 2
We manually labeled multiple locations of 1000 users and obtained 585 users who clearly have multiple locations.
We compare against the same baseline methods as in the previous task.
We measure performance in terms of precision and recall.
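The slides do not define the metrics exactly. A plausible reading, assumed here, is precision and recall over the top K predicted locations for each user. The function name and the example users are hypothetical.

```python
def precision_recall_at_k(ranked_predictions, true_locations, k=2):
    """Precision and recall of the top-k predicted locations for one user.

    ranked_predictions: locations ordered from most to least probable.
    true_locations: the set of manually labeled locations for the user.
    """
    top_k = ranked_predictions[:k]
    hits = sum(1 for location in top_k if location in true_locations)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(true_locations) if true_locations else 0.0
    return precision, recall


truth = {"Los Angeles", "Austin"}
both_found = precision_recall_at_k(["Los Angeles", "Austin", "Honolulu"], truth, k=2)
one_found = precision_recall_at_k(["Los Angeles", "Honolulu", "Austin"], truth, k=2)
```

Precision at rank 2 rewards a profile whose top locations are correct (accurate), while recall at rank 2 rewards one that covers all of a user's true locations (complete).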
30
Experiments 3
We manually labeled the location assignments of relationships for the 585 users whose multiple locations are known to us, and obtained 4426 relationships.
We design a meaningful baseline method, which profiles a relationship based on users' home locations.
31
MLP defines the joint probability of observations, parameters, and latent variables.
We infer users' locations and the location assignments of relationships, as latent variables in the joint probability, from the observed relationships and the given parameters.
We develop our algorithm based on the Gibbs sampling method.
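To make the idea of Gibbs sampling over location assignments concrete, here is a heavily simplified sketch. It is not the authors' algorithm: it handles one user, tweeting relationships only, a fixed tweeting model, and no noise component. Each tweeted venue is resampled in turn with weight proportional to (assignments to that location + prior pseudo-count) times P(venue | location). All names, the toy model, and the prior values are assumptions.

```python
import random


def gibbs_assignments(venues, locations, tweet_model, prior, sweeps=200, seed=0):
    """Collapsed Gibbs sampling of a location assignment for each venue.

    tweet_model[location][venue] is a fixed P(venue | location); venues that
    are missing from a location's table get a tiny floor probability.
    Returns the final assignments and the profile implied by that sample.
    """
    rng = random.Random(seed)
    z = [rng.choice(locations) for _ in venues]
    counts = {location: 0 for location in locations}
    for location in z:
        counts[location] += 1

    for _ in range(sweeps):
        for i, venue in enumerate(venues):
            counts[z[i]] -= 1  # remove the current assignment from the counts
            weights = [
                (counts[l] + prior[l]) * tweet_model[l].get(venue, 1e-9)
                for l in locations
            ]
            z[i] = rng.choices(locations, weights=weights, k=1)[0]
            counts[z[i]] += 1

    total = sum(counts.values()) + sum(prior.values())
    profile = {l: (counts[l] + prior[l]) / total for l in locations}
    return z, profile


# Toy inputs, invented for this example.
locations = ["Los Angeles", "Austin"]
tweet_model = {
    "Los Angeles": {"Hollywood": 0.5, "Los Angeles": 0.5},
    "Austin": {"Austin": 1.0},
}
prior = {"Los Angeles": 0.1, "Austin": 0.1}
venues = ["Hollywood", "Hollywood", "Los Angeles", "Austin"]

assignments, profile = gibbs_assignments(venues, locations, tweet_model, prior)
```

A fuller sampler would average over many samples after a burn-in period rather than read off the last one, and would also resample the model selector and the assignments of following relationships.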