Published by Aubrie Stevenson. Modified over 9 years ago.
1
Linking Organizational Social Networking Profiles
PROJECT ID: H0791030
JEROME CHENG ZHI KAI (A0080860H)
2
Example: Holiday Inn
[Figure panels: Twitter, Facebook]
3
Motivation: Individuals
Want to find an organization's profiles, but no single place lists them
Sometimes listed on company websites, but:
  No standardized location
  Not all companies bother
6
Motivation: Organizations
Track competitors’ use of social media
Find imposter profiles
7
Problem Definition
[Diagram: Organization Name → System → Social Profiles, each labelled Official, Affiliate, or Unrelated]
8
Related Work
Focused on deduplication for individuals
Relevant here: the profile characteristics those works focus on
9
Related Work: Usernames
Connecting Corresponding Identities across Communities (Zafarani & Liu, 2009)
Connecting users across social media sites: a behavioral-modeling approach (Zafarani & Liu, 2013)
Studying User Footprints in Different Online Social Networks (Malhotra et al., 2012)
10
Related Work: Created Content
Identifying Users Across Social Tagging Systems (Iofciu, Fankhauser, Abel & Bischoff, 2011)
11
Methodology: System Design
1. Input: organization’s name (query)
2. Search Facebook/Twitter APIs, retrieve profiles
3. Convert profiles into feature vectors
4. Classify the profile vectors
12
Classifier Choice
Evaluated scikit-learn’s:
  Decision Tree
  Naïve Bayes
  Support Vector Machine
  Logistic Regression
  Random Forest
Features aren’t independent, so tree-based classifiers are well-suited
13
Feature Breakdown: Name-based
Normalized Edit Distance
  Query to Username
  Query to Display Name
Edit Distance
  Query to Username
  Query to Display Name
Length of Query
Length of Username
Length of Display Name
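The seven name-based features listed above could be computed roughly as follows. This is a minimal sketch, not the project's code: the slides do not say how the edit distance is normalized or whether strings are case-folded, so dividing by the longer string's length and lowercasing are both assumptions, and `name_features` is a hypothetical helper name.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def normalized_edit_distance(a: str, b: str) -> float:
    """Assumed normalization: divide by the length of the longer string."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


def name_features(query: str, username: str, display_name: str) -> list:
    """Hypothetical helper returning the seven features in slide order."""
    q, u, d = query.lower(), username.lower(), display_name.lower()  # assumed case-folding
    return [
        normalized_edit_distance(q, u),
        normalized_edit_distance(q, d),
        float(edit_distance(q, u)),
        float(edit_distance(q, d)),
        float(len(q)),
        float(len(u)),
        float(len(d)),
    ]
```

For example, `name_features("Holiday Inn", "HolidayInn", "Holiday Inn")` gives a raw username distance of 1 (the missing space) and a display-name distance of 0.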
14
Feature Breakdown: Name-based Quirks
Need to handle abbreviations, stopwords
  Citigroup versus Citi, General Motors versus GM
Take two edit distances: original string, processed string
Use the better score of the two
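One way to read "two edit distances, keep the better score" is sketched below. The slides do not describe the processing step, so everything specific here is an assumption: the `STOPWORDS` list, the `strip_stopwords` and `acronym` helpers, applying the processing to the query only, and treating the lower normalized distance as the better score.

```python
STOPWORDS = {"the", "of", "and", "inc", "corp", "company", "group"}  # hypothetical list


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via a rolling dynamic-programming row."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + (ch_a != ch_b)))
        previous = current
    return previous[-1]


def normalized_edit_distance(a: str, b: str) -> float:
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


def strip_stopwords(name: str) -> str:
    """Assumed processing step: lowercase and drop stopword tokens."""
    tokens = name.lower().split()
    kept = [t for t in tokens if t not in STOPWORDS]
    return " ".join(kept) if kept else name.lower()


def acronym(name: str) -> str:
    """Assumed abbreviation handling: first letter of each token."""
    return "".join(token[0] for token in name.lower().split())


def best_name_score(query: str, candidate: str, process=strip_stopwords) -> float:
    """Score the original and the processed query; keep the lower distance."""
    target = candidate.lower()
    original = normalized_edit_distance(query.lower(), target)
    processed = normalized_edit_distance(process(query), target)
    return min(original, processed)
```

With `process=acronym`, the query "General Motors" matches the username "GM" exactly, whereas the unprocessed strings are far apart.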
15
Feature Breakdown: Description
Occurrences of Query
Cosine Similarity
  Query and Description
  DuckDuckGo Description and Profile Description
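The description features above might look like this in code. It is a sketch under assumptions: the slides do not specify tokenization or term weighting, so raw term counts over lowercase alphanumeric tokens are used, and the DuckDuckGo description is passed in as a plain string (`reference_description`, a hypothetical parameter) rather than fetched.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list:
    """Assumed tokenization: lowercase alphanumeric runs."""
    return re.findall(r"[a-z0-9]+", text.lower())


def count_occurrences(query: str, description: str) -> int:
    """Case-insensitive count of the query string inside the description."""
    if not query:
        return 0
    return description.lower().count(query.lower())


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between raw term-count vectors of two texts."""
    a, b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def description_features(query: str, profile_description: str,
                         reference_description: str) -> list:
    """Hypothetical helper returning the three features in slide order."""
    return [
        float(count_occurrences(query, profile_description)),
        cosine_similarity(query, profile_description),
        cosine_similarity(reference_description, profile_description),
    ]
```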
16
Feature Breakdown: Language Models
Construct Bigram Language Model for:
  Official profile descriptions
  Affiliate profile descriptions
  Unrelated profile descriptions
Probability that candidate description belongs to each
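A per-class bigram model of this kind can be sketched as follows. The slides do not mention smoothing, tokenization, or whether raw or log probabilities are used, so add-one smoothing, whitespace tokens, and log probabilities are all assumptions; `BigramModel` and `language_model_features` are hypothetical names.

```python
import math
from collections import Counter


class BigramModel:
    """Bigram language model with add-one smoothing (smoothing is assumed)."""

    def __init__(self, descriptions):
        self.contexts = Counter()   # how often each token starts a bigram
        self.bigrams = Counter()    # how often each (previous, word) pair occurs
        self.vocabulary = set()
        for text in descriptions:
            tokens = ["<s>"] + text.lower().split() + ["</s>"]
            self.vocabulary.update(tokens)
            self.contexts.update(tokens[:-1])
            self.bigrams.update(zip(tokens, tokens[1:]))

    def log_probability(self, text: str) -> float:
        """Smoothed log probability of the text under this model."""
        tokens = ["<s>"] + text.lower().split() + ["</s>"]
        size = len(self.vocabulary) + 1  # one extra slot for unseen words
        return sum(
            math.log((self.bigrams[(prev, word)] + 1) / (self.contexts[prev] + size))
            for prev, word in zip(tokens, tokens[1:])
        )


def language_model_features(description: str, models: dict) -> dict:
    """One score per class label (e.g. official, affiliate, unrelated)."""
    return {label: model.log_probability(description)
            for label, model in models.items()}
```

Each class gets its own model trained on that class's descriptions, and the per-class scores for a candidate description become features for the classifier.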
17
Evaluation: Ground Truth Creation
1. Retrieved organizations from Freebase
2. Searched for profiles on Twitter/Facebook
3. Manually labelled as official/affiliate/unrelated
18
Evaluation: Ground Truth Breakdown
[Charts: Twitter classes (3381 labels), Facebook classes (3413 labels)]
19
Evaluation: Process
Mainly concerned with official and affiliate classes
Not interested in unrelated class
Modified 10-fold Cross Validation
20
Evaluation: Modified Cross Validation
1. Generate folds as per normal
2. Train classifier on training set as per normal
3. For each affiliate/official profile in test set:
   1. Input organization’s name to system
   2. Count number of correct results
4. Calculate precision/recall/F1 from counts
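The final counting step can be illustrated with a small sketch. The slides do not define how "correct results" map onto true positives, false positives, and false negatives, so the set-based mapping below is an assumption, as are the function names and the use of profile IDs keyed by organization name.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple:
    """Standard precision, recall, and F1 from aggregate counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


def evaluate_class(results: dict, ground_truth: dict) -> tuple:
    """Aggregate counts for one class (e.g. official) over one test fold.

    results:      organization name -> set of profile IDs the system
                  assigned to the class when queried with that name
    ground_truth: organization name -> set of profile IDs that truly
                  belong to the class
    """
    tp = fp = fn = 0
    for organization, expected in ground_truth.items():
        returned = results.get(organization, set())
        tp += len(returned & expected)   # correct results
        fp += len(returned - expected)   # returned but wrong
        fn += len(expected - returned)   # missed
    return precision_recall_f1(tp, fp, fn)
```

Running this once per fold and per class (official, affiliate), then averaging across the folds, would give figures comparable to an ordinary cross-validation report.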
21
Evaluation: Baseline
Normalized Edit Distance: Username/Display Name and Query
Emulates searching the networks manually without examining each profile in detail
22
Results & Discussion: Twitter
23
Results & Discussion: Facebook
24
Discussion
Baseline performs well for official class on Facebook
Username and display name alone are good indicators for this class
Other features still help, but not as much
25
Discussion: Facebook Characteristics
Many profile types: people, pages, places, etc.
Finding official pages is simplified
But: finding affiliates requires more effort
26
Discussion: Facebook Characteristics
Facebook doesn’t require a “username” to be specified for pages
  Pages without one just use an ID instead
Auto-generated pages also only have IDs, and use a name from Wikipedia/other sources
28
Limitations
Ground truth proportions: expand and/or balance
Limited number of profiles retrieved for classification
29
Future Work
Support additional networks
Examine post content
“Preferential” classification