Published by Helen Adams. Modified over 9 years ago.
1
Towards Social User Profiling: Unified and Discriminative Influence Model for Inferring Home Locations
Rui Li, Shengjie Wang, Hongbo Deng, Rui Wang, Kevin Chen-chuan Chang
University of Illinois at Urbana-Champaign
2
User profiling infers users’ essential attributes and is important for many services, such as personalized search (search engines), targeted advertisement (advertisers), and many others.
Example user: Richard. Job: Student. Location: Champaign.
3
This paper aims to profile Twitter users’ home locations from both tweets and the following network.
Input: user-centric data (tweets) and social network data (the following network), in the context of the Twitter network. Example accounts: Jessie, Rob, Lady Gaga, Cindy, Richard, TechCrunch.
Output: the user’s home location, e.g., Location: Champaign.
A user’s home location is defined as the place where most of his activities happen. It is different from a real-time geo position (e.g., the Starbucks on Green Street).
4
The problem is difficult due to the scarce signal challenge.
Tweets: only 6% of messages contain location-related terms!
Following network: only 16% of users have locations on their profiles!
[Figure: example following network of Jessie, Rob, Lady Gaga, Cindy, Richard, and TechCrunch, with node labels Champaign, New York, San Francisco, and Unknown.]
5
The problem is difficult due to the noisy signal challenge.
Tweets: a user tweets about locations different from his home location.
Following network: a user follows friends who live in locations different from his home location.
[Figure: example following network of Jessie, Rob, Lady Gaga, Cindy, Richard, and TechCrunch, with node labels Champaign, New York, San Francisco, and Unknown.]
6
We propose a unified and discriminative probabilistic framework to address the scarce signal challenge and the noisy signal challenge.
First, unify the two types of resources as a Twitter graph.
Second, model the likelihood of an edge between two nodes via a discriminative influence model.
Third, profile locations by maximizing the likelihood of observing the graph.
7
We unify two types of resources as a directed heterogeneous graph.
We represent the two types of resources as nodes on a heterogeneous graph.
We model it as a directed graph.
We associate locations with the nodes.
We aim to infer the locations of unlabeled nodes from the locations of labeled nodes.
[Figure: directed graph with nodes v1, v2 and u1 to u6. Each edge connects a tail node and a head node. Labeled nodes carry locations (New York, Champaign, Beijing, San Francisco); unlabeled nodes are marked "?".]
8
We observe two key characteristics of the probability of an edge between two nodes, i.e., how likely a tail node n_j at L(n_j) is to build an edge e to a head node n_i at L(n_i).
Observation 1: the probability decreases as the distance between the two nodes increases.
Observation 2: at the same distance, different head nodes (e.g., Chicago, Champaign) have different probabilities of attracting tail nodes.
9
We propose a discriminative influence model to capture the two key characteristics.
Conceptual level (discriminative influence model): influence probabilities decrease from the center, and different nodes have different influence scopes (θ_{n_i} for node n_i).
Mathematical level: Gaussian model.
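The two observations can be illustrated with a small sketch of a Gaussian influence function. The slide only names a "Gaussian model", so the exact form below (an isotropic 2-D Gaussian density on planar coordinates, with the influence scope used as the standard deviation) is an assumption, and `influence_probability` is a hypothetical name, not the authors' code.

```python
import math


def influence_probability(head_loc, tail_loc, scope):
    """Isotropic 2-D Gaussian density centred on the head node.

    head_loc, tail_loc: (x, y) planar coordinates, e.g. in km (assumed).
    scope: standard deviation (> 0) standing in for the influence scope.
    """
    dx = tail_loc[0] - head_loc[0]
    dy = tail_loc[1] - head_loc[1]
    sq_dist = dx * dx + dy * dy
    norm = 1.0 / (2.0 * math.pi * scope ** 2)
    return norm * math.exp(-sq_dist / (2.0 * scope ** 2))


head = (0.0, 0.0)
# Observation 1: for a fixed head node, density falls as distance grows.
near = influence_probability(head, (10.0, 0.0), scope=50.0)
far = influence_probability(head, (200.0, 0.0), scope=50.0)
# Observation 2: at the same distance, a head node with a wider scope
# attracts distant tail nodes with higher density.
far_wide = influence_probability(head, (200.0, 0.0), scope=500.0)
```

A small scope concentrates the density near the head node, while a large scope spreads it out, which is one way to read "different nodes have different influence scopes".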
10
A local profiling algorithm profiles the location of a user via the edges from and to his labeled neighbors. It is a simple but efficient closed-form solution.
Influence scope: the average distance of a user’s followers.
User location: a weighted average of different resources.
[Figure: an unlabeled user "?" connected to labeled neighbors among nodes v1, v2, u1, u2, u4, u5, with locations Champaign, New York, Beijing, and San Francisco.]
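As a rough illustration of the closed-form idea, the sketch below assumes planar coordinates and an isotropic Gaussian per edge. Under that assumption, the point that maximizes the product of the edge likelihoods is the average of the neighbor locations weighted by 1 / scope². The function names and the weighting are illustrative assumptions, not the formulas from the slides.

```python
import math


def estimate_scope(head_loc, follower_locs):
    """Influence scope as the mean distance from a head node to its
    labeled followers (follower_locs must be non-empty)."""
    dists = [math.dist(head_loc, loc) for loc in follower_locs]
    return sum(dists) / len(dists)


def local_profile(neighbors):
    """Closed-form location estimate from labeled neighbors.

    neighbors: list of ((x, y), scope) pairs, one per edge, scope > 0.
    Returns the average of the locations weighted by 1 / scope**2.
    """
    weights = [1.0 / scope ** 2 for _, scope in neighbors]
    total = sum(weights)
    x = sum(w * loc[0] for w, (loc, _) in zip(weights, neighbors)) / total
    y = sum(w * loc[1] for w, (loc, _) in zip(weights, neighbors)) / total
    return (x, y)


scope = estimate_scope((0.0, 0.0), [(3.0, 4.0), (6.0, 8.0)])
# A neighbor with a narrow scope pulls the estimate harder than one
# with a wide scope.
estimate = local_profile([((0.0, 0.0), 1.0), ((10.0, 0.0), 3.0)])
```

The weighting reflects the noisy signal challenge: an edge to a node with a very wide influence scope says little about where the user lives, so it contributes little to the average.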
11
A global algorithm profiles all the users’ locations together via all the edges in the graph. It is a complex but accurate iterative algorithm.
The local algorithm only uses limited information. Our global algorithm aims to use all information.
[Figure: directed graph with nodes v1, v2 and u1 to u6; labeled nodes carry locations (New York, Champaign, Beijing, San Francisco) and unlabeled nodes are marked "?".]
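One way to picture an iterative scheme that uses every edge is coordinate ascent: repeatedly re-estimate each unlabeled node from the current locations of all its neighbors, labeled or not, until the estimates stop moving. The sketch below is an assumption-laden simplification (planar coordinates, fixed per-head scopes, Gaussian edges), not the algorithm from the paper; `global_profile` and its parameters are hypothetical.

```python
def global_profile(edges, labeled, scopes, n_iters=100, tol=1e-9):
    """Iteratively estimate locations of all unlabeled nodes.

    edges: list of (tail, head) node pairs.
    labeled: {node: (x, y)} for nodes with known locations (non-empty).
    scopes: {head_node: scope > 0}, held fixed in this sketch.
    Returns {node: (x, y)} for every node that appears in an edge.
    """
    nodes = sorted({n for edge in edges for n in edge})
    # Start every unlabeled node at the centroid of the labeled ones.
    cx = sum(p[0] for p in labeled.values()) / len(labeled)
    cy = sum(p[1] for p in labeled.values()) / len(labeled)
    locs = {n: labeled.get(n, (cx, cy)) for n in nodes}

    # For each node, collect (other endpoint, weight) over all its edges.
    incident = {n: [] for n in nodes}
    for tail, head in edges:
        weight = 1.0 / scopes[head] ** 2
        incident[tail].append((head, weight))
        incident[head].append((tail, weight))

    for _ in range(n_iters):
        shift = 0.0
        for n in nodes:
            if n in labeled:
                continue  # known locations stay fixed
            total = sum(w for _, w in incident[n])
            x = sum(w * locs[m][0] for m, w in incident[n]) / total
            y = sum(w * locs[m][1] for m, w in incident[n]) / total
            shift = max(shift, abs(x - locs[n][0]), abs(y - locs[n][1]))
            locs[n] = (x, y)
        if shift < tol:
            break  # estimates have stopped moving
    return locs


# Chain a - u - v - b: u and v are unlabeled, and each has only one
# labeled neighbor, so a purely local estimate would ignore half the chain.
result = global_profile(
    edges=[("u", "a"), ("v", "u"), ("b", "v")],
    labeled={"a": (0.0, 0.0), "b": (30.0, 0.0)},
    scopes={"a": 1.0, "u": 1.0, "v": 1.0},
)
```

In this toy chain the information from both labeled ends propagates through the unlabeled nodes, which is the kind of signal a local method cannot use.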
12
We incorporate additional knowledge as constraints for maximizing the likelihood function.
Additional knowledge: e.g., users only live in cities or towns.
Constrained optimization: we maximize the likelihood in each method under constraints.
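A simple way to honor a constraint such as "users only live in cities or towns" is to restrict the maximization to a finite candidate set and pick the candidate with the highest likelihood. The sketch below assumes Gaussian edges on planar coordinates; the candidate names and the function `constrained_profile` are made up for illustration.

```python
def constrained_profile(neighbors, candidates):
    """Pick the candidate location with the highest log-likelihood.

    neighbors: list of ((x, y), scope) pairs, one per edge, scope > 0.
    candidates: {name: (x, y)} of allowed locations (e.g. cities).
    """
    def log_likelihood(point):
        # Sum of Gaussian log-densities; normalizing constants are
        # dropped because they do not depend on the candidate point.
        return sum(
            -((point[0] - loc[0]) ** 2 + (point[1] - loc[1]) ** 2)
            / (2.0 * scope ** 2)
            for loc, scope in neighbors
        )

    return max(candidates, key=lambda name: log_likelihood(candidates[name]))


cities = {"city_a": (0.0, 0.0), "city_b": (10.0, 0.0), "city_c": (100.0, 100.0)}
edges_of_user = [((0.0, 0.0), 1.0), ((0.0, 0.0), 1.0), ((10.0, 0.0), 1.0)]
best = constrained_profile(edges_of_user, cities)
```

Unlike an unconstrained weighted average, the result is always one of the allowed places rather than a point between them.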
13
We compare our method with state-of-the-art methods on a large Twitter corpus.
Data set: We crawled a subset of Twitter and used the users who have locations on their profiles. There are 139K users, 50 million tweets, and 2 million following relationships.
Methods: user-based location profiling and content-based location profiling.
14
Our algorithms are better than the baseline methods as we model edges discriminatively.
15
Our algorithms can take advantage of modeling two different types of resources.
16
The global profiling algorithm can further improve the local profiling algorithm.
17
Conclusion and Future Work
We explore both social network and user-centric data for profiling users’ locations in a unified approach.
We introduce a discriminative influence model.
We develop two effective profiling methods and extend the methods via modeling constraints.
The framework could be further extended to profiling other attributes.
18
Questions?