Fighting Fire With Fire: Crowdsourcing Security Solutions on the Social Web Christo Wilson Northeastern University



High Quality Sybils and Spam
 We tend to think of spam as "low quality"
 What about high quality spam and Sybils?
 Example spam message: "MaxGentleman is the bestest male enhancement system avalable." [sic]
 Sybil profiles use fake stock photographs


Black Market Crowdsourcing
 Large and profitable
  Growing exponentially in size and revenue in China
  $1 million per month on just one site
  Cost effective: $0.21 per click
 Starting to grow in the US and other countries
  Mechanical Turk, Freelancer
  Twitter follower markets
 Huge problem for existing security systems
  Little to no automation to detect
  Turing tests fail

Crowdsourcing Sybil Defense
 Defenders are losing the battle against OSN Sybils
 Idea: build a crowdsourced Sybil detector
  Leverage human intelligence
  Scalable
 Open questions
  How accurate are users?
  What factors affect detection accuracy?
  Is crowdsourced Sybil detection cost effective?

User Study
 Two groups of users
  Experts – CS professors, master's students, and PhD students
  Turkers – crowdworkers from Mechanical Turk and Zhubajie
 Three ground-truth datasets of full user profiles
  Renren – given to us by Renren Inc.
  Facebook US and India – crawled
   Legitimate profiles – 2 hops from our own profiles
   Suspicious profiles – stock profile images (also used by spammers)
   Banned suspicious profiles = Sybils

Study Interface (screenshot): testers browse screenshots of profiles (links cannot be clicked) and classify each one, answering "Real or fake?" and "Why?". A progress bar and navigation buttons let testers skip around and revisit profiles.

Experiment Overview

Dataset         | Sybil | Legit. | Test Group     | # of Testers | Profiles per Tester
Renren          | 100   | 100    | Chinese Expert | 24           | 100
                |       |        | Chinese Turker | 418          | 10
Facebook US     | 32    | 50     | US Expert      | 40           | 50
                |       |        | US Turker      | 299          | 12
Facebook India  | 50    | 49     | India Expert   | 20           | 100
                |       |        | India Turker   | –            | –

 Facebook data was crawled; Renren data came from Renren Inc.
 There were fewer experts, so each expert classified more profiles

Individual Tester Accuracy
 Experts prove that humans can be accurate: 80% of experts have >90% accuracy
 Turkers are not so good – they need extra help

Accuracy of the Crowd
 Treat each classification by each tester as a vote
 Majority makes the final decision

Dataset         | Test Group     | False Positives | False Negatives
Renren          | Chinese Expert | 0%              | 3%
                | Chinese Turker | 0%              | 63%
Facebook US     | US Expert      | 0%              | 10%
                | US Turker      | 2%              | 19%
Facebook India  | India Expert   | 0%              | 16%
                | India Turker   | 0%              | 50%

 False positive rates are excellent – almost zero
 Experts perform okay; turkers miss lots of Sybils (high false negatives)
 What can be done to improve accuracy?
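The majority-vote aggregation can be sketched as follows; the vote labels, data layout, and helper names here are illustrative, not from the talk:

```python
from collections import Counter

def majority_vote(votes):
    """Return the label ('sybil' or 'legit') chosen by most testers."""
    return Counter(votes).most_common(1)[0][0]

def error_rates(profiles):
    """Compute (false positive rate, false negative rate) over
    (votes, ground_truth) pairs. A false positive is a legitimate
    profile voted 'sybil'; a false negative is a Sybil voted 'legit'."""
    fp = fn = n_legit = n_sybil = 0
    for votes, truth in profiles:
        decision = majority_vote(votes)
        if truth == "legit":
            n_legit += 1
            fp += decision == "sybil"
        else:
            n_sybil += 1
            fn += decision == "legit"
    return fp / n_legit, fn / n_sybil

# Toy example: one legit profile and two Sybils, three votes each.
profiles = [
    (["legit", "legit", "sybil"], "legit"),  # correctly kept
    (["sybil", "sybil", "legit"], "sybil"),  # correctly caught
    (["legit", "legit", "legit"], "sybil"),  # missed Sybil -> false negative
]
fp_rate, fn_rate = error_rates(profiles)     # -> (0.0, 0.5)
```

Because each profile's decision depends only on its own votes, this aggregation parallelizes trivially across millions of profiles.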

How Many Classifications Do You Need?
 Plots of false positive and false negative rates vs. number of classifications, for China, India, and the US
 Only need 4-5 classifications per profile to converge
 Fewer classifications = less cost

Eliminating Inaccurate Turkers
 Most workers are >40% accurate
 Removing inaccurate turkers eliminates only a subset of workers (<50%)
 Dramatic improvement: false negatives drop from 60% to 10%
 Getting rid of inaccurate turkers is a no-brainer

How to turn our results into a system?
1. Scalability
  OSNs with millions of users
2. Performance
  Improve turker accuracy
  Reduce costs
3. Privacy
  Preserve user privacy when giving data to turkers

System Architecture
 Filtering layer: social network heuristics and user reports flag suspicious profiles
  Leverages existing techniques and helps the system scale
 Crowdsourcing layer: turker selection routes suspicious profiles to accurate turkers, and very accurate turkers re-check difficult cases before a profile is rejected as a Sybil
  Filters out inaccurate turkers
  Maximizes the usefulness of high-accuracy turkers
 Continuous quality control locates malicious workers

Trace-Driven Simulations
 Simulate 2,000 profiles
 Error rates drawn from survey data
 Vary 4 parameters: number of accurate turkers, number of very accurate turkers, classifications per profile, and the controversial-range threshold
 Results: average 6 classifications per profile, <1% false positives, <1% false negatives
 Results++: average 8 classifications per profile, <0.1% false positives, <0.1% false negatives
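A minimal Monte Carlo sketch of such a trace-driven simulation. The per-vote accuracies, the 40-60% "controversial" band, and the escalation rule are illustrative assumptions; in the talk these values come from the survey data:

```python
import random

def simulate(n_profiles=2000, votes=5, turker_acc=0.85,
             expert_acc=0.99, band=(0.4, 0.6), seed=1):
    """Simulate two-tier crowdsourced Sybil detection: profiles whose
    vote fraction falls in the controversial band are escalated to
    very accurate turkers. Returns (fp_rate, fn_rate)."""
    rng = random.Random(seed)
    fp = fn = 0
    for i in range(n_profiles):
        is_sybil = i % 2 == 0                       # half Sybil, half legit
        correct = sum(rng.random() < turker_acc for _ in range(votes))
        frac = correct / votes
        if band[0] <= frac <= band[1]:              # controversial: escalate
            decided_correctly = rng.random() < expert_acc
        else:                                       # clear majority decides
            decided_correctly = frac > 0.5
        if not decided_correctly:
            fp += not is_sybil
            fn += is_sybil
    return fp / (n_profiles / 2), fn / (n_profiles / 2)

fp_rate, fn_rate = simulate()   # both rates land well under 1% here
```

Even with these toy parameters, escalating only the controversial band keeps the average number of classifications close to the initial vote count while driving both error rates down, which is the effect the slide's Results/Results++ numbers illustrate.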

Estimating Cost
 Estimated cost for a real-world social network: Tuenti
  12,000 profiles to verify daily
  14 full-time employees at minimum wage ($8 per hour) = $890 per day
 Crowdsourced Sybil detection
  20 sec/profile, 8-hour day
  50 turkers at $1 per hour = $400 per day
 Cost with malicious turkers
  Estimate that 25% of turkers are malicious
  63 turkers at $1 per hour = $504 per day
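The arithmetic behind these figures can be checked with a small calculator. The 6 classifications per profile matches the simulation result earlier in the talk; the function and variable names are mine:

```python
import math

def daily_cost(profiles_per_day, votes_per_profile, secs_per_profile,
               hours_per_day, wage_per_hour, malicious_fraction=0.0):
    """Return (turkers needed, daily cost in dollars) to classify
    all profiles, padding the workforce to absorb malicious turkers
    whose work must be discarded."""
    classifications = profiles_per_day * votes_per_profile
    per_turker = hours_per_day * 3600 // secs_per_profile   # 1,440 at 20 s
    turkers = math.ceil(classifications / per_turker)
    turkers = math.ceil(turkers * (1 + malicious_fraction))
    return turkers, turkers * hours_per_day * wage_per_hour

turkers, cost = daily_cost(12_000, 6, 20, 8, 1)
# -> 50 turkers, $400/day, matching the slide
bad_turkers, bad_cost = daily_cost(12_000, 6, 20, 8, 1,
                                   malicious_fraction=0.25)
# -> 63 turkers, $504/day, matching the slide
```

At 20 seconds per profile, one turker produces 1,440 classifications per 8-hour day; 12,000 profiles at 6 votes each is 72,000 classifications, hence 50 turkers, still well under Tuenti's $890/day for full-time employees.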

Takeaways
 Humans can differentiate between real and fake profiles
  Crowdsourced Sybil detection is feasible
 Designed a crowdsourced Sybil detection system
  False positives and negatives <1%
  Resistant to infiltration by malicious workers
  Sensitive to user privacy
  Low cost
  Augments existing security systems

Questions?

Survey Fatigue
 Plots of per-profile classification time for US Experts and US Turkers
 US Experts show no fatigue; for US Turkers, fatigue matters
 All testers speed up over time

Sybil Profile Difficulty
 Some Sybils are stealthier than others – a few profiles are really difficult
 Experts perform well even on the most difficult Sybils
 Experts catch more tough Sybils than turkers do

Preserving User Privacy
 Showing profiles to crowdworkers raises privacy issues
 Solution: reveal profile information in context
  Diagram: crowdsourced evaluation sees public profile information, while friend-only profile information is visible only to the profile owner's friends