Fighting Fire With Fire: Crowdsourcing Security Threats and Solutions on the Social Web Gang Wang, Christo Wilson, Manish Mohanlal, Ben Y. Zhao Computer.

Presentation transcript:

Fighting Fire With Fire: Crowdsourcing Security Threats and Solutions on the Social Web Gang Wang, Christo Wilson, Manish Mohanlal, Ben Y. Zhao Computer Science Department, UC Santa Barbara.

A Little Bit About Me
- 3rd year, UCSB
- Intern at MSR Redmond, 2011
- Intern at LinkedIn (Security Team)
Research interests:
- Security and privacy
- Online social networks
- Crowdsourcing
- Data-driven analysis and modeling

Recap: Threats on the Social Web
- Social spam is a serious problem
  - 10% of wall posts with URLs on Facebook are spam
  - 70% phishing
- Sybils underlie many attacks on Online Social Networks
  - Spam, spear phishing, malware distribution
  - Sybils blend completely into the social graph
- Existing countermeasures are ineffective
  - Blacklists only catch 28% of spam
  - Sybil detectors from the literature do not work

Sybil Accounts on Facebook
- In-house estimates
  - Early 2012: 54 million
  - August 2012: 83 million
  - 8.7% of the user base
- Fake likes
  - VirtualBagel: useless site, 3,000 likes in 1 week
  - 75% from Cairo, age
- Sybil attacks at large scale
- Advertisers are fleeing Facebook

Sybil Accounts on Twitter
- 92% of Newt Gingrich's followers are Sybils
- Russian political protests on Twitter
  - 25,000 Sybils sent 440,000 tweets
  - 1 million Sybils controlled overall
- Chart annotations (Followers): 4,000 new followers/day; 100,000 new followers in 1 day
- Twitter is vital infrastructure
- Sybils are usurping Twitter for political ends

Talk Outline
1. Malicious crowdsourcing sites – crowdturfing [WWW'12]
   - Spam and Sybils generated by real people
   - Huge threat in China
   - Growing threat in the US
2. Crowdsourced Sybil detection [NDSS'13]
   - If attackers can do it, why not defenders?
   - Can humans detect Sybils?
   - Is this cost effective?
   - Design a crowdsourced Sybil detection system
   - User study

Outline
- Intro
- Crowdturfing
  - Crowdsourcing overview
  - What is crowdturfing?
  - How bad is it?
  - Crowdturfing in the US
- Crowdsourced Sybil Detection
- Conclusion

High Quality Sybils and Spam
- We tend to think of spam as “low quality”
- What about high quality spam and Sybils?
- Open questions
  - What is the scope of this problem?
  - Generated manually or mechanically?
  - What are the economics?
Example (FAKE profile using stock photographs): Gang Wang: “MaxGentleman is the bestest male enhancement system avalable.”

Black Market Crowdsourcing
- Amazon's Mechanical Turk
  - Admins remove spammy jobs
- Black market crowdsourcing websites
  - Spam and fake accounts, generated by real people
  - Major force in China, expanding in the US and India
- Crowdturfing = Crowdsourcing + Astroturfing

Crowdturfing Workflow
- Customers
  - Initiate campaigns
  - May be legitimate businesses
- Agents
  - Manage campaigns and workers
  - Verify completed tasks
- Workers
  - Complete tasks for money
  - Control Sybils on other websites
- Flow between the parties: campaigns, tasks, reports

Crowdturfing in China

| Site | Active Since | Total Campaigns | Workers | Reports | $ for Workers | $ for Site |
|---|---|---|---|---|---|---|
| Zhubajie | Nov. | K | 169K | 6.3M | $2.4M | $595K |

Figure: campaigns and dollars over time (Jan. 08 to Jan. 11) for Zhubajie and Sandaha.

Spreading Spam on Weibo
- 50% of campaigns reach > users
- 8% reach >1 million users
- Campaigns reach huge audiences
- How effective are these campaigns?

How Effective is Crowdturfing?
- Initiate our own campaigns as a customer
  - 4 benign ad campaigns promoting real e-commerce sites
  - All clicks route through our measurement server
- Campaign "Vacation": advertise for a discount vacation through a travel agent

| Target | Cost Per Click |
|---|---|
| Weibo | $0.21 |
| QQ | $0.09 |
| Forums | $0.90 |

- For comparison, web display ads: CPC = $0.01
- Travel agency reported sales statistics
  - 2 sales/month before our campaign
  - 11 sales within 24 hours after our campaign
  - Each trip sells for $1500!

Crowdturfing in America

| Type | US Site | % Crowdturfing |
|---|---|---|
| Legit | Mechanical Turk | 12% |
| Black Market | MinuteWorkers | 70% |
| Black Market | MyEasyTasks | 83% |
| Black Market | Microworkers | 89% |
| Black Market | ShortTasks | 95% |

- Other studies support these findings
  - Freelancer: 28% spam jobs
  - Bulk OSN accounts, likes, spam
  - Connections to botnet operators
- Poultry Markets
  - $20 for 1000 followers
  - Ponzi scheme

Takeaways
- Identified a new threat: crowdturfing
  - Growing exponentially in size and revenue in China
  - $1 million per month on just one site
  - Cost effective: $0.21 per click
- Starting to grow in the US and other countries
  - Mechanical Turk, Freelancer
  - Twitter follower markets
- Huge problem for existing security systems
  - Little to no automation to detect
  - Turing tests fail

Outline
- Intro
- Crowdturfing
- Crowdsourced Sybil Detection
  - Open questions
  - User study
  - Accuracy analysis
  - System design
- Conclusion

Crowdsourcing Sybil Defense
- Defenders are losing the battle against OSN Sybils
- Idea: build a crowdsourced Sybil detector
  - Leverage human intelligence
  - Scalable
- Open questions
  - How accurate are users?
  - What factors affect detection accuracy?
  - Is crowdsourced Sybil detection cost effective?

User Study
- Two groups of users
  - Experts – CS professors, masters, and PhD students
  - Turkers – crowdworkers from Mechanical Turk and Zhubajie
- Three ground-truth datasets of full user profiles
  - Renren – given to us by Renren Inc.
  - Facebook US and India – crawled
    - Legitimate profiles – 2 hops from our own profiles
    - Suspicious profiles – stock profile images
    - Banned suspicious profiles = Sybils
Image labels: Stock Picture, Crowdturfing Site

Survey interface (screenshot)
- Progress indicator
- Two modes: classifying profiles and browsing profiles
- Screenshot of each profile (links cannot be clicked)
- Questions per profile: Real or fake? Why?
- Navigation buttons: testers may skip around and revisit profiles

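Experiment Overview

| Dataset | # of Profiles (Sybil) | # of Profiles (Legit.) | Test Group | # of Testers | Profiles per Tester |
|---|---|---|---|---|---|
| Renren | 100 | | Chinese Expert | 24 | 100 |
| | | | Chinese Turker | 418 | 10 |
| Facebook US | 32 | 50 | US Expert | 40 | 50 |
| | | | US Turker | 299 | 12 |
| Facebook India | 50 | 49 | India Expert | 20 | 100 |
| | | | India Turker | | |

Annotations: Renren rows use data from Renren; Facebook rows use crawled data; there are fewer experts, with more profiles per expert.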

Individual Tester Accuracy
- Experts: "Awesome! 80% of experts have >90% accuracy!"
- Turkers: "Not so good :("
- Experts prove that humans can be accurate
- Turkers need extra help…

Accuracy of the Crowd
- Treat each classification by each tester as a vote
- Majority makes final decision

| Dataset | Test Group | False Positives | False Negatives |
|---|---|---|---|
| Renren | Chinese Expert | 0% | 3% |
| Renren | Chinese Turker | 0% | 63% |
| Facebook US | US Expert | 0% | 10% |
| Facebook US | US Turker | 2% | 19% |
| Facebook India | India Expert | 0% | 16% |
| Facebook India | India Turker | 0% | 50% |

- Almost zero false positives: false positive rates are excellent
- Experts perform okay
- Turkers miss lots of Sybils: turkers need extra help against false negatives
- What can be done to improve accuracy?
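The voting rule on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' code: the profile IDs, the toy votes, and the choice to break ties in favor of "legit" are all assumptions made for the example.

```python
from collections import Counter

def majority_vote(votes):
    """Return 'sybil' if strictly more than half of the votes say 'sybil'.

    Ties fall back to 'legit' (an assumption; the slide does not say).
    """
    counts = Counter(votes)
    return "sybil" if counts["sybil"] > len(votes) / 2 else "legit"

def error_rates(decisions, truth):
    """False positives: legit profiles flagged as Sybils.
    False negatives: Sybils labeled legit."""
    legit = [p for p in truth if truth[p] == "legit"]
    sybil = [p for p in truth if truth[p] == "sybil"]
    fp = sum(decisions[p] == "sybil" for p in legit) / len(legit)
    fn = sum(decisions[p] == "legit" for p in sybil) / len(sybil)
    return fp, fn

# Hypothetical votes: each list holds one vote per tester for that profile.
votes_by_profile = {
    "p1": ["sybil", "sybil", "legit"],
    "p2": ["legit", "legit", "sybil"],
    "p3": ["legit", "sybil", "legit"],  # a Sybil that the crowd misses
    "p4": ["legit", "legit", "legit"],
}
truth = {"p1": "sybil", "p2": "legit", "p3": "sybil", "p4": "legit"}

decisions = {p: majority_vote(v) for p, v in votes_by_profile.items()}
fp_rate, fn_rate = error_rates(decisions, truth)
print(decisions, fp_rate, fn_rate)
```

In this toy data the crowd produces no false positives but misses one of the two Sybils, which mirrors the pattern in the table: errors concentrate in false negatives.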

Eliminating Inaccurate Turkers
- Most workers are >40% accurate
- Dramatic improvement: from 60% to 10% false negatives
- Only a subset of workers are removed (<50%)
- Getting rid of inaccurate turkers is a no-brainer
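One way to picture the filtering step is to score each worker on profiles whose labels are already known and keep only those above a cutoff. The sketch below assumes that setup; the worker IDs, the answers, and the 0.6 cutoff are hypothetical and are not taken from the study.

```python
def worker_accuracy(answers, truth):
    """Fraction of one worker's answers that match the known labels."""
    correct = sum(truth[p] == label for p, label in answers.items())
    return correct / len(answers)

def filter_workers(all_answers, truth, min_accuracy):
    """Keep only workers whose accuracy meets the cutoff."""
    return {w: a for w, a in all_answers.items()
            if worker_accuracy(a, truth) >= min_accuracy}

def crowd_decision(all_answers, profile):
    """Majority vote over the remaining workers (ties count as 'legit')."""
    votes = [a[profile] for a in all_answers.values() if profile in a]
    return "sybil" if votes.count("sybil") > len(votes) / 2 else "legit"

# Hypothetical ground truth and worker answers.
truth = {"p1": "sybil", "p2": "legit", "p3": "sybil"}
all_answers = {
    "w1": {"p1": "sybil", "p2": "legit", "p3": "sybil"},  # 3 of 3 correct
    "w2": {"p1": "legit", "p2": "legit", "p3": "legit"},  # 1 of 3 correct
    "w3": {"p1": "legit", "p2": "legit", "p3": "legit"},  # 1 of 3 correct
    "w4": {"p1": "sybil", "p2": "legit", "p3": "legit"},  # 2 of 3 correct
}

before = crowd_decision(all_answers, "p1")
kept = filter_workers(all_answers, truth, min_accuracy=0.6)
after = crowd_decision(kept, "p1")
print(before, sorted(kept), after)
```

Before filtering, the two inaccurate workers cancel the accurate ones and the Sybil p1 slips through; after filtering, the remaining workers catch it.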

How Many Classifications Do You Need?
Figure: false negatives and false positives versus number of classifications, for China, India, and the US.
- Only need 4-5 classifications to converge
- Few classifications = less cost

How to turn our results into a system?
1. Scalability
   - OSNs with millions of users
2. Performance
   - Improve turker accuracy
   - Reduce costs
3. Preserve user privacy when giving data to turkers

System Architecture (diagram)
- Filtering Layer: social network heuristics and user reports yield suspicious profiles
  - Leverage existing techniques
  - Help the system scale
- Crowdsourcing Layer: turker selection divides all turkers into accurate turkers and very accurate turkers, whose classifications identify Sybils
  - Filter out inaccurate turkers (rejected)
  - Maximize usefulness of high accuracy turkers
  - Continuous quality control: locate malicious workers
- Other diagram label: OSN employee

Trace Driven Simulations
- Simulate 2000 profiles
- Error rates drawn from survey data
- Vary 4 parameters
  - Parameter table labels: Accurate Turkers, Very Accurate Turkers, Classifications, Threshold, Controversial Range (surviving value: 20-50%)
- Results
  - Average 6 classifications per profile
  - <1% false positives
  - <1% false negatives
- Results++
  - Average 8 classifications per profile
  - <0.1% false positives
  - <0.1% false negatives
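A rough sketch of what such a simulation could look like follows. It is not the authors' simulator: it assumes fixed per-vote error rates rather than rates drawn from survey data, and it assumes that profiles whose share of "sybil" votes lands inside a controversial range are re-classified by the very accurate turkers. The 20-50% band echoes the fragment that survives on the slide, but treating it as the controversial range, and every other parameter value, is an assumption.

```python
import random

def sybil_vote_fraction(is_sybil, n_votes, error_rate, rng):
    """Fraction of n_votes saying 'sybil' when each voter errs independently."""
    sybil_votes = 0
    for _ in range(n_votes):
        wrong = rng.random() < error_rate
        says_sybil = is_sybil != wrong  # a wrong vote flips the true label
        sybil_votes += says_sybil
    return sybil_votes / n_votes

def simulate(n_profiles, sybil_fraction, acc_err, very_acc_err,
             n_votes, threshold, controversial, seed=0):
    """Return (false positive rate, false negative rate) for one run."""
    rng = random.Random(seed)
    low, high = controversial
    fp = fn = n_sybil = n_legit = 0
    for _ in range(n_profiles):
        is_sybil = rng.random() < sybil_fraction
        frac = sybil_vote_fraction(is_sybil, n_votes, acc_err, rng)
        if low <= frac <= high:
            # Controversial result: escalate to the very accurate turkers.
            frac = sybil_vote_fraction(is_sybil, n_votes, very_acc_err, rng)
        flagged = frac >= threshold
        if is_sybil:
            n_sybil += 1
            fn += not flagged
        else:
            n_legit += 1
            fp += flagged
    return fp / max(n_legit, 1), fn / max(n_sybil, 1)

# Hypothetical parameter values, chosen only to exercise the code.
fp_rate, fn_rate = simulate(
    n_profiles=2000, sybil_fraction=0.5, acc_err=0.2, very_acc_err=0.05,
    n_votes=5, threshold=0.5, controversial=(0.2, 0.5),
)
print(fp_rate, fn_rate)
```

Sweeping `n_votes`, `threshold`, and `controversial` over a grid is the natural next step if the goal is to find settings that push both error rates below a target.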

Estimating Cost
- Estimated cost in a real-world social network: Tuenti
  - 12,000 profiles to verify daily
  - 14 full-time employees
  - Annual salary 30,000 EUR (~$20 per hour)
  - $2240 per day
- Crowdsourced Sybil detection
  - 20 sec/profile, 8-hour day
  - 50 turkers
  - Facebook wage ($1 per hour)
  - $400 per day
- Cost with malicious turkers
  - Estimate that 25% of turkers are malicious
  - 63 turkers
  - $1 per hour
  - $504 per day
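The arithmetic behind these figures can be checked with a short script. The slide does not state how many classifications per profile it assumes; using the average of 6 from the simulation results reproduces the 50-turker figure, so that value is an assumption here, as is reading the malicious-turker case as 25% extra head count.

```python
import math

SECONDS_PER_PROFILE = 20
HOURS_PER_DAY = 8
PROFILES_PER_DAY = 12_000

def daily_cost(n_people, hourly_wage):
    """Cost of n_people working a full day at the given hourly wage."""
    return n_people * HOURS_PER_DAY * hourly_wage

def turkers_needed(classifications_per_profile):
    """Turkers required to classify every profile the given number of times."""
    total_seconds = (PROFILES_PER_DAY * classifications_per_profile
                     * SECONDS_PER_PROFILE)
    return math.ceil(total_seconds / (HOURS_PER_DAY * 3600))

employees = daily_cost(14, 20)            # full-time staff at ~$20/hour
turkers = turkers_needed(6)               # assumed 6 classifications/profile
crowd = daily_cost(turkers, 1)            # $1/hour
padded = math.ceil(turkers * 1.25)        # assumed 25% extra for malicious turkers
crowd_padded = daily_cost(padded, 1)

print(employees, turkers, crowd, padded, crowd_padded)
```

Under these assumptions the crowdsourced approach costs roughly a fifth of the in-house team per day, even after padding for malicious workers.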

Takeaways
- Humans can differentiate between real and fake profiles
- Crowdsourced Sybil detection is feasible
- Designed a crowdsourced Sybil detection system
  - False positives and negatives <1%
  - Resistant to infiltration by malicious workers
  - Sensitive to user privacy
  - Low cost
  - Augments existing security systems

Outline
- Intro
- Crowdturfing
- Crowdsourced Sybil Detection
- Conclusion
  - Summary of my work
  - Future work

Key Contributions
1. Identified novel threat: crowdturfing
   - End-to-end spam measurements from customers to the web
   - Insider knowledge of social spam
2. Novel defense: crowdsourced Sybil detection
   - User study proves feasibility of this approach
   - Built an accurate, scalable system
   - Possible deployment in real OSNs – LinkedIn and RenRen

Ongoing Work
1. Twitter follower markets
   - Locate customers who purchase Twitter followers in bulk
   - Study the un-follow dynamics of customers
   - Develop systems to detect customers in the wild
2. Sybil detection using server-side clickstreams
   - Build click models based on clickstream logs
   - Extract click patterns of Sybil and normal users
   - Develop systems to detect Sybils

Questions? Thank you!

Potential Project Ideas
- Malware distribution in cellular networks
  - Identify malware-related cellular network traffic
  - Coordinated malware distribution campaigns
  - Feature-based detection
- Advertising traffic analysis on mobile apps
  - Characterize ad traffic
  - How effective are app-displayed ads at getting click-throughs?
  - Is malware delivered through ads?

Preserving User Privacy
- Showing profiles to crowdworkers raises privacy issues
- Solution: reveal profile information in context
Diagram labels: Crowdsourced Evaluation, Public Profile Information, Friend-Only Profile Information, Friends

Clickstream Sybil Detection
- Clickstream detection of Sybils
  1. Absolute number of clicks
  2. Time between clicks
  3. Page traversal order
- Challenges
  - Real-time
  - Massive scalability
  - Low overhead
Figure: transition diagrams for the Sybil clickstream (Initial, Friend Invite, Share, Browse Profiles, Final) and the normal clickstream (Initial, Photo, Share, Message, Friend Invite, Browse Profiles, Final), with transition percentages on the edges.
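The page traversal order feature can be illustrated by estimating transition probabilities between page types from session logs, which is one plausible reading of the diagrams on this slide. The sessions below are invented for the example, and the function name and the explicit "Initial"/"Final" padding are assumptions rather than details from the talk.

```python
from collections import defaultdict

def transition_probabilities(sessions):
    """Estimate P(next page | current page) from a list of click sessions.

    Each session is padded with 'Initial' and 'Final' states so that entry
    and exit behavior is captured too.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        path = ["Initial"] + session + ["Final"]
        for src, dst in zip(path, path[1:]):
            counts[src][dst] += 1
    probs = {}
    for src, dsts in counts.items():
        total = sum(dsts.values())
        probs[src] = {dst: n / total for dst, n in dsts.items()}
    return probs

# Hypothetical sessions for one account type.
sessions = [
    ["Friend Invite", "Friend Invite", "Browse Profiles"],
    ["Friend Invite"],
    ["Photo", "Browse Profiles"],
    ["Friend Invite", "Friend Invite"],
]

probs = transition_probabilities(sessions)
print(probs["Initial"])
print(probs["Friend Invite"])
```

Comparing the resulting tables for known Sybils and known normal users shows which transitions differ most, and those differences are candidate features for a detector.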

Are Workers Real People?
Figure: worker activity by time of day on ZBJ and SDH, annotated Late Night/Early Morning, Work Day/Evening, Lunch, and Dinner.

Crowdsourced Sybil Detection
- How to detect crowdturfed Sybils?
  - Blur the line between real and fake
  - Difficult to detect algorithmically
- Anecdotal evidence that people can spot Sybils
  - 75% of friend requests from Sybils are rejected
- Can people distinguish real from fake in general?
  - User studies: experts, turkers, undergrads
  - What features give Sybils away?
  - Are certain Sybils tougher than others?
- Integration of human and machine intelligence

Survey Fatigue
Figure panels: US Experts, US Turkers
Annotations: No fatigue; Fatigue matters; All testers speed up over time

Sybil Profile Difficulty
- Some Sybils are more stealthy
- Experts perform well on most difficult Sybils
- Experts catch more tough Sybils than turkers
Figure annotation: Really difficult profiles