Published by Jonah Bates. Modified over 9 years ago.
1
Collating Social Network Profiles
2
Objective: System
3
Objective: Company Name (input) → System → Social Network Profiles (output)
4
Record Linkage + Identity
5
Agenda: Introduction; Objective; Contrast to Existing Work; Work Done; Baseline System; Individual Network Approach; Machine Learning Experiments; Next Steps, Q&A
6
Baseline System
7
Ground Truth: Two networks (Facebook and Twitter); top seventy 2013 Fortune 500 companies
8
Baseline Algorithm: 1. Take company name. 2. Search Facebook/Twitter API using it. 3. Return first result from each.
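The three baseline steps can be sketched as below. This is a minimal illustration, not the presenter's code: the `SearchFn` signature, the `baseline_lookup` helper, and the two stand-in search functions with canned results are all assumptions made for the example. A real system would replace the stand-ins with calls to each network's search API.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical search signature: takes a query string and returns profile
# identifiers in the order the network's own search ranks them.
SearchFn = Callable[[str], List[str]]


def baseline_lookup(company: str, searchers: Dict[str, SearchFn]) -> Dict[str, Optional[str]]:
    """Search each network for the company name and keep only the first hit."""
    results: Dict[str, Optional[str]] = {}
    for network, search in searchers.items():
        hits = search(company)
        # Step 3 of the baseline: return the first result, or None if empty.
        results[network] = hits[0] if hits else None
    return results


# Stand-in search functions with made-up results (illustration only).
def fake_facebook_search(query: str) -> List[str]:
    if query == "Walmart":
        return ["facebook.com/walmart", "facebook.com/walmartfans"]
    return []


def fake_twitter_search(query: str) -> List[str]:
    return ["@Walmart"] if query == "Walmart" else []


SEARCHERS = {"facebook": fake_facebook_search, "twitter": fake_twitter_search}
print(baseline_lookup("Walmart", SEARCHERS))
```

Passing the search functions in as a dictionary keeps the baseline logic independent of any particular network client, which also makes it easy to test offline.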
9
Baseline Performance
10
Individual Network Approach
11
New Approach: Score profiles based on Edit Distance (Company Name – Username, Company Name – Display Name) and Relative Popularity
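One way to combine these three signals is sketched below. The slides do not preserve the actual scoring formula, so everything beyond "edit distance" and "relative popularity" is an assumption: the `Profile` fields, the normalization of edit distance into a 0 to 1 similarity, the use of follower count relative to the most popular candidate, and the equal default weights are all hypothetical choices.

```python
from dataclasses import dataclass
from typing import List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1]; 1.0 means identical strings."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


@dataclass
class Profile:  # hypothetical profile record
    username: str
    display_name: str
    followers: int


def score_profiles(company: str, candidates: List[Profile],
                   weights: Tuple[float, float, float] = (1.0, 1.0, 1.0)
                   ) -> List[Tuple[float, Profile]]:
    """Rank candidates by a weighted sum of the three signals (assumed form)."""
    w_user, w_name, w_pop = weights
    top = max((p.followers for p in candidates), default=0)
    scored = []
    for p in candidates:
        popularity = p.followers / top if top else 0.0
        score = (w_user * similarity(company, p.username)
                 + w_name * similarity(company, p.display_name)
                 + w_pop * popularity)
        scored.append((score, p))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


candidates = [Profile("ibmfans", "IBM Fan Club", 10), Profile("ibm", "IBM", 1000)]
best_score, best = score_profiles("IBM", candidates)[0]
print(best.username, round(best_score, 3))
```

Varying the `weights` tuple is one way to search for a best-performing combination of the signals.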
12
[Figure: profile showing Display Name and Username]
14
Scoring
15
Best Performing Combination
16
Machine Learning Experiments
17
Freebase Ground Truth: 397,071 Business Operations; 1,422 with a social media presence; 917 with Facebook, 687 with Twitter; 598 with both; 553 with valid profiles
18
Training Set: 553 Correct; 553 Incorrect; 1,106 Total
19
Cross Validation Results
Classifier                 Test | Train   Train | Test
Linear Regression          0.734          0.707
Gaussian Naïve Bayes       0.972          0.956
Multinomial Naïve Bayes    0.511          0.506
Bernoulli Naïve Bayes      0.720          0.701
Decision Tree              0.954          0.935
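For readers unfamiliar with the procedure, the sketch below shows k-fold cross-validation of a classifier on a balanced set of correct and incorrect company-profile pairs. It is an illustration under stated assumptions, not a reproduction of the experiment: the slides name neither the library, the features, the number of folds, nor the metric, and the meaning of the two score columns is not established. Here the features are two made-up similarity values, the data is synthetic, the metric is accuracy, and `TinyGaussianNB` is a hypothetical stdlib-only stand-in for a Gaussian Naïve Bayes classifier.

```python
import math
import random
from typing import Dict, List, Sequence, Tuple

Row = Sequence[float]


class TinyGaussianNB:
    """Minimal Gaussian Naive Bayes, for illustration only."""

    def fit(self, X: List[Row], y: List[int]) -> "TinyGaussianNB":
        self.stats: Dict[int, Tuple[float, List[Tuple[float, float]]]] = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            params = []
            for col in zip(*rows):
                mean = sum(col) / len(col)
                # Small constant keeps the variance strictly positive.
                var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-9
                params.append((mean, var))
            self.stats[label] = (len(rows) / len(X), params)
        return self

    def _log_posterior(self, label: int, x: Row) -> float:
        prior, params = self.stats[label]
        total = math.log(prior)
        for value, (mean, var) in zip(x, params):
            total += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        return total

    def predict(self, X: List[Row]) -> List[int]:
        return [max(self.stats, key=lambda lab: self._log_posterior(lab, x)) for x in X]


def accuracy(model: TinyGaussianNB, X: List[Row], y: List[int], idx: List[int]) -> float:
    preds = model.predict([X[i] for i in idx])
    return sum(p == y[i] for p, i in zip(preds, idx)) / len(idx)


def k_fold_accuracy(X: List[Row], y: List[int], k: int = 5, seed: int = 0) -> Tuple[float, float]:
    """Return (mean accuracy on training folds, mean accuracy on held-out folds)."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    train_scores, test_scores = [], []
    for held_out in folds:
        held = set(held_out)
        train_idx = [i for i in indices if i not in held]
        model = TinyGaussianNB().fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        train_scores.append(accuracy(model, X, y, train_idx))
        test_scores.append(accuracy(model, X, y, held_out))
    return sum(train_scores) / k, sum(test_scores) / k


# Synthetic, balanced data: label 1 = correct pair, label 0 = incorrect pair.
# Features are hypothetical [username similarity, display-name similarity].
rng = random.Random(1)
X = ([[rng.gauss(0.9, 0.05), rng.gauss(0.8, 0.1)] for _ in range(50)]
     + [[rng.gauss(0.3, 0.1), rng.gauss(0.3, 0.1)] for _ in range(50)])
y = [1] * 50 + [0] * 50
train_acc, test_acc = k_fold_accuracy(X, y, k=5)
print(f"train accuracy: {train_acc:.3f}, held-out accuracy: {test_acc:.3f}")
```

Synthetic pairs this cleanly separated are far easier than real profile data, which is consistent with the stated next step of providing harder training examples.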
22
Next Steps: Improve training set (provide harder examples); incorporate more profile data; build system around classifiers