Detecting and Characterizing Social Spam Campaigns Hongyu Gao, Jun Hu, Christo Wilson, Zhichun Li, Yan Chen and Ben Y. Zhao Northwestern University, US.

Slides:



Advertisements
Similar presentations
Wenke Lee and Nick Feamster Georgia Tech Botnet and Spam Detection in High-Speed Networks.
Advertisements

Wenke Lee and Nick Feamster Georgia Tech Botnet and Spam Detection in High-Speed Networks.
WHO AM I?  Founder of Gray Hat Web, an award winning web design & marketing agency  We focus on strategic planning, conversion optimization, and .
Detecting Social Spam Campaigns on Twitter Zi Chu & Haining Wang The College of William & Mary Indra Widjaja Bell Laboratories, Alcatel-Lucent, USA Presented.
Pete Bohman Adam Kunk.  Introduction  Related Work  System Overview  Indexing Scheme  Ranking  Evaluation  Conclusion.
FRAppE: Detecting Malicious Facebook Applications
ABUSING BROWSER ADDRESS BAR FOR FUN AND PROFIT - AN EMPIRICAL INVESTIGATION OF ADD-ON CROSS SITE SCRIPTING ATTACKS Presenter: Jialong Zhang.
DSPIN: Detecting Automatically Spun Content on the Web Qing Zhang, David Y. Wang, Geoffrey M. Voelker University of California, San Diego 1.
Design and Evaluation of a Real- Time URL Spam Filtering Service Kurt Thomas, Chris Grier, Justin Ma, Vern Paxson, Dawn Song University of California,
Asking Questions on the Internet
Hongyu Gao, Tuo Huang, Jun Hu, Jingnan Wang.  Boyd et al. Social Network Sites: Definition, History, and Scholarship. Journal of Computer-Mediated Communication,
Personalizing Web Page Recommendation via Collaborative Filtering and Topic-Aware Markov Model Qingyan Yang, Ju Fan, Jianyong Wang, Lizhu Zhou Database.
UNDERSTANDING VISIBLE AND LATENT INTERACTIONS IN ONLINE SOCIAL NETWORK Presented by: Nisha Ranga Under guidance of : Prof. Augustin Chaintreau.
Leveraging Personal Knowledge for Robust Authentication Systems Mentor: Danfeng Yao Anitra Babic Chestnut Hill College Computer Science Department.
BotGraph: Large Scale Spamming Botnet Detection Yao Zhao Yinglian Xie *, Fang Yu *, Qifa Ke *, Yuan Yu *, Yan Chen and Eliot Gillum ‡ EECS Department,
Unconstrained Endpoint Profiling (Googling the Internet)‏ Ionut Trestian Supranamaya Ranjan Aleksandar Kuzmanovic Antonio Nucci Northwestern University.
Verma - ICISS 2014 R easoning M ining NLP Defense Rakesh M. Verma ReMiND Laboratory Catching Classical and Hijack-based Phishing Attacks.
Using the Engaging Networks tools Ghazal Vaghedi Toronto February 21, 2012 #12ENCONF.
Towards Online Spam Filtering in Social Networks Hongyu Gao, Yan Chen, Kathy Lee, Diana Palsetia and Alok Choudhary Lab for Internet and Security Technology.
GAYATRI SWAMYNATHAN, CHRISTO WILSON, BRYCE BOE, KEVIN ALMEROTH AND BEN Y. ZHAO UC SANTA BARBARA Do Social Networks Improve e-Commerce? A Study on Social.
1 New : Create your own message starting from scratch 2 New From Template: add professionally designed templates provided exclusively by Gorilla Contact.
272: Software Engineering Fall 2012 Instructor: Tevfik Bultan Lecture 17: Code Mining.
資安新聞簡報 報告者:劉旭哲、曾家雄. Spam down, but malware up 報告者:劉旭哲.
Tag-based Social Interest Discovery
«Tag-based Social Interest Discovery» Proceedings of the 17th International World Wide Web Conference (WWW2008) Xin Li, Lei Guo, Yihong Zhao Yahoo! Inc.,
Detecting Spammers on Social Networks Gianluca Stringhini, Christopher Kruegel, Giovanni Vigna (University of California) Annual Computer Security Applications.
WARNINGBIRD: A Near Real-time Detection System for Suspicious URLs in Twitter Stream.
Authors: Gianluca Stringhini Christopher Kruegel Giovanni Vigna University of California, Santa Barbara Presenter: Justin Rhodes.
University of California at Santa Barbara Christo Wilson, Bryce Boe, Alessandra Sala, Krishna P. N. Puttaswamy, and Ben Zhao.
John P., Fang Yu, Yinglian Xie, Martin Abadi, Arvind Krishnamurthy University of California, Santa Cruz USENIX SECURITY SYMPOSIUM, August, 2010 John P.,
Discovery of Emergent Malicious Campaigns in Cellular Networks Nathaniel Boggs, Wei Wang, Suhas Mathur, Baris Coskun, Carol Pincock © 2013 AT&T Intellectual.
Network and Systems Security By, Vigya Sharma (2011MCS2564) FaisalAlam(2011MCS2608) DETECTING SPAMMERS ON SOCIAL NETWORKS.
Suspended Accounts in Retrospect: An Analysis of Twitter Spam Kurt Thomas, Chris Grier, Vern Paxson, Dawn Song University of California, Berkeley International.
Reporter: Li, Fong Ruei National Taiwan University of Science and Technology 9/19/2015Slide 1 (of 32)
Behavior-based Spyware Detection By Engin Kirda and Christopher Kruegel Secure Systems Lab Technical University Vienna Greg Banks, Giovanni Vigna, and.
User Browsing Graph: Structure, Evolution and Application Yiqun Liu, Yijiang Jin, Min Zhang, Shaoping Ma, Liyun Ru State Key Lab of Intelligent Technology.
Understanding Cross-site Linking in Online Social Networks Yang Chen 1, Chenfan Zhuang 2, Qiang Cao 1, Pan Hui 3 1 Duke University 2 Tsinghua University.
Introduction to Text and Web Mining. I. Text Mining is part of our lives.
UOS 1 Ontology Based Personalized Search Zhang Tao The University of Seoul.
Jhih-sin Jheng 2009/09/01 Machine Learning and Bioinformatics Laboratory.
Uncovering Social Network Sybils in the Wild Zhi YangChristo WilsonXiao Wang Peking UniversityUC Santa BarbaraPeking University Tingting GaoBen Y. ZhaoYafei.
Microblogs: Information and Social Network Huang Yuxin.
By Gianluca Stringhini, Christopher Kruegel and Giovanni Vigna Presented By Awrad Mohammed Ali 1.
Spamming Botnets: Signatures and Characteristics Yinglian Xie, Fang Yu, Kannan Achan, Rina Panigrahy, Geoff Hulten, and Ivan Osipkov. SIGCOMM, Presented.
SocialTube: P2P-assisted Video Sharing in Online Social Networks
User Interactions in Social Networks and their Implications Christo Wilson, Bryce Boe, Alessandra Sala, Krishna P. N. Puttaswamy, Ben Y. Zhao (UC Santa.
Studying Spamming Botnets Using Botlab
The Koobface Botnet and the Rise of Social Malware Kurt Thomas David M. Nicol
Search Engine using Web Mining COMS E Web Enhanced Information Mgmt Prof. Gail Kaiser Presented By: Rupal Shah (UNI: rrs2146)
Detecting and Characterizing Social Spam Campaigns Yan Chen Lab for Internet and Security Technology (LIST) Northwestern Univ.
Bloom Cookies: Web Search Personalization without User Tracking Authors: Nitesh Mor, Oriana Riva, Suman Nath, and John Kubiatowicz Presented by Ben Summers.
Social Turing Tests: Crowdsourcing Sybil Detection Gang Wang, Manish Mohanlal, Christo Wilson, Xiao Wang Miriam Metzger, Haitao Zheng and Ben Y. Zhao Computer.
Identifying “Best Bet” Web Search Results by Mining Past User Behavior Author: Eugene Agichtein, Zijian Zheng (Microsoft Research) Source: KDD2006 Reporter:
A Framework for Detection and Measurement of Phishing Attacks Reporter: Li, Fong Ruei National Taiwan University of Science and Technology 2/25/2016 Slide.
Don’t Follow me : Spam Detection in Twitter January 12, 2011 In-seok An SNU Internet Database Lab. Alex Hai Wang The Pensylvania State University International.
Spamming Botnets: Signatures and Characteristics Yinglian Xie, Fang Yu, Kannan Achan, Rina Panigrahy, Microsoft Research, Silicon Valley Geoff Hulten,
Fabricio Benevenuto, Gabriel Magno, Tiago Rodrigues, and Virgilio Almeida Universidade Federal de Minas Gerais Belo Horizonte, Brazil ACSAC 2010 Fabricio.
1 Patterns of Cascading Behavior in Large Blog Graphs Jure Leskoves, Mary McGlohon, Christos Faloutsos, Natalie Glance, Matthew Hurst SDM 2007 Date:2008/8/21.
Ariel Fuxman, Panayiotis Tsaparas, Kannan Achan, Rakesh Agrawal (2008) - Akanksha Saxena 1.
Seamlessly customize and update content for each and every location.
Gross Niv Analyzing Spammer’s Social Networks for Fun and Profit
CSCE 548 Student Presentation Ryan Labrador
Uncovering Social Spammers: Social Honeypots + Machine Learning
Hummingbird: Privacy at the time of Twitter
Are these Ads Safe: Detecting Hidden A4acks through Mobile App-Web Interfaces Vaibhav Rastogi, Rui Shao, Yan Chen, Xiang Pan, Shihong Zou, and Ryan Riley.
Lab for Internet and Security Technology Yan Chen
Dieudo Mulamba November 2017
Binghui Wang, Le Zhang, Neil Zhenqiang Gong
Unconstrained Endpoint Profiling (Googling the Internet)‏
Yingze Wang and Shi-Kuo Chang University of Pittsburgh
Presentation transcript:

Detecting and Characterizing Social Spam Campaigns Hongyu Gao, Jun Hu, Christo Wilson, Zhichun Li, Yan Chen and Ben Y. Zhao Northwestern University, US Northwestern / Huazhong Univ. of Sci & Tech, China University of California, Santa Barbara, US NEC Laboratories America, Inc., US

Background 2

3 Benign post1 Benign post2 Benign post1 Benign post2 Benign post1 Benign post2 Benign post3 Benign post1 Benign post2 Benign post3 Benign post1 Benign post2 Benign post1 Benign post2 … … … … … … … … … … … … … …

4 Secret admirer reveald. Go here to find out who …

5 Contributions Conduct the largest scale experiment on Facebook to confirm spam campaigns. –3.5M user profiles, 187M wall posts. Uncover the attackers’ characteristics. –Mainly use compromised accounts. –Mostly conduct phishing attack. Release the confirmed spam URLs, with posting times. – –

66 Detection System Design Validation Malicious Activity Analysis Conclusions Roadmap

7 System Overview Identify coordinated spam campaigns in Facebook. –Templates are used for spam generation.

8 Build Post Similarity Graph –A node: an individual wall post –An edge: connect two “similar” wall posts Check out funny.com Go to evil.com!

Wall Post Similarity Metric Spam wall post model: 9 A textual description: A destination URL: hey see your love compatibility ! go here yourlovecalc. com (remove spaces) hey see your love compatibility ! go here (remove spaces) yourlovecalc. com

10 Wall Post Similarity Metric Condition 1: –Similar textual description. Guess who your secret admirer is?? Go here nevasubevd. blogs pot. co m (take out spaces) Guess who your secret admirer is??” Visit: yes-crush. com (remove spaces) Establish an edge! Guess who your secret admirer is?? Go here (take out spaces) “Guess who ”, “uess who y”, “ess who yo”, “ss who you”, “s who your”, “ who your ”, “who your s”, “ho your se”, … , , , , … , , , , … Guess who your secret admirer is??” Visit: yes-crush. com (remove spaces) “Guess who ”, “uess who y”, “ess who yo”, “ss who you”, “s who your”, “who your s”, “ho your se”, “o your sec”, … , , , , … , , , , …

11 Wall Post Similarity Metric Condition 2: –Same destination URL. secret admirer revealed. goto yourlovecalc. com (remove the spaces) hey see your love compatibility ! go here yourlovecalc. com (remove spaces) Establish an edge!

12 Extract Wall Post Campaigns Intuition: Reduce the problem of identifying potential campaigns to identifying connected subgraphs. AB CB AC B

13 Locate Spam Campaigns Distributed: campaigns have many senders. Bursty: campaigns send fast. Wall post campaign Distributed? NO Benign YES Bursty? NO Benign YES Malicious

14 Detection System Design Validation Malicious Activity Analysis Conclusions Roadmap

15 Validation Dataset: –Leverage unauthenticated regional network. –Wall posts already crawled from prior study. –187M wall posts in total, 3.5M recipients. –~2M wall posts with URLs. Detection result: –~200K malicious wall posts (~10%).

16 Validation Focused on detected URLs. Adopted multiple validation steps:  URL de-obfuscation  3 rd party tools  Redirection analysis  Keyword matching  URL grouping  Manual confirmation

17 Validation Step 1: Obfuscated URL –URLs embedded with obfuscation are malicious. –Reverse engineer URL obfuscation methods: Replace ‘.’ with “dot” : 1lovecrush dot com Insert white spaces : abbykywyty. blogs pot. co m

18 Validation Step 2: Third-party tools –Use multiple tools, including: McAfee SiteAdvisor Google’s Safe Browsing API Spamhaus Wepawet (a drive-by-download analysis tool) …

19 Validation Step 3: Redirection analysis –Commonly used by the attackers to hide the malicious URLs. URL1 URL M URL1

20 Experimental Evaluation The validation result.

21 Detection System Design Validation Malicious Activity Analysis Conclusions Roadmap

22 Malicious Activity Analysis Spam URL Analysis Spam Campaign Analysis Malicious Account Analysis Temporal Properties of Malicious Activity

23 Spam Campaign Topic Analysis CampaignSummarized wall post descriptionPost # CrushSomeone likes you RingtoneInvitation for free ringtones Love-calcTest the love compatibility …… … Identifying attackers’ social engineering tricks:

24 Categorize the attacks by attackers’ goals. Spam Campaign Goal Analysis Phishing #1: for money Phishing #2: for info

25 Sampled manual analysis: Malicious Account Analysis Account behavioral analysis:

26 Counting all wall posts, the curves for malicious and benign accounts converge. Malicious Account Analysis

27 Detection System Design Validation Malicious Activity Analysis Conclusions Roadmap

28 Conclusions Conduct the largest scale spam detection and analysis on Facebook. –3.5M user profiles, 187M wall posts. Make interesting discoveries, including: –Over 70% of attacks are phishing attacks. –Compromised accounts are prevailing.

29 Thank you! Project webpage: Spam URL release:

30 Bob Dave Chuck Bob’s Wall That movie was fun! From: Dave That movie was fun! Check out funny.com From: Chuck Check out funny.com Go to evil.com! From: Chuck Go to evil.com! Chuck

31 Benign post1 Benign post2 Benign post1 Benign post2 Malicious p1 Malicious p2 Benign post1 Benign post2 Benign post3 Malicious p1 Benign post1 Benign post2 Benign post3 Malicious p1 Benign post1 Benign post2 Benign post1 Benign post2 Malicious p1 Malicious p2 … … … … … … … … … … … … … … … … … … … …

32 Data Collection Based on “wall” messages crawled from Facebook (crawling period: Apr. 09 ~ Jun. 09 and Sept. 09). Leveraging unauthenticated regional networks, we recorded the crawled users’ profile, friend list, and interaction records going back to January 1, M wall posts with 3.5M recipients are used in this study.

33 Filter posts without URLs Assumption: All spam posts should contain some form of URL, since the attacker wants the recipient to go to some destination on the web. Example (without URL): Kevin! Lol u look so good tonight!!! Filter out

34 Filter posts without URLs Assumption: All spam posts should contain some form of URL, since the attacker wants the recipient to go to some destination on the web. Example (with URL): Further process Um maybe also this: Guess who your secret admirer is?? Go here nevasubevd\t. blogs pot\t.\tco\tm (take out spaces)

35 Extract Wall Post Clusters A sample wall post similarity graph and the corresponding clustering result (for illustrative purpose only)

36 Locate Malicious Clusters (5, 1.5hr) is found to be a good (n, t) value. Slightly modifying the value only have minor impact on the detection result. A relaxed threshold of (4, 6hr) only result in 4% increase in the classified malicious cluster.

37 Experimental Validation Step 5: URL grouping –Groups of URLs exhibit highly uniform features. Some have been confirmed as “malicious” previously. The rest are also considered as “malicious”. –Human assistance is involved in identifying such groups. Step 6: Manual analysis –We leverage Google search engine to confirm the malice of URLs that appear many times in our trace.

38 URL Analysis 3 different URL formats (with e.g.): –Link: –Plain text: mynewcrsh.com –Obfuscated: nevasubevu. blogs pot. co m Type# of URLs # of Wall Posts Avg # of Wall posts per URL Total #15,484199,782N/A Obfuscated6.5%25.3%50.3 Plaintext3.8%6.7%22.9 Hypertext link89.7%68.0%9.8

39 URL Analysis 4 different domain types (with e.g.): –Content sharing service: imageshack.us –URL shortening service: tinyurl.org –Blog service: blogspot.com –Other: yes-crush.com Type# of URLs# of Wall Posts ContentShare2.8%4.8% URL-short0.7%5.0% Blogs55.6%15.8% Other40.9%74.4%

40 Spam Campaign Temporal Analysis

41 Account Analysis The CDF of interaction ratio. Malicious accounts exhibit higher interaction ratio than benign ones.

42 Wall Post Hourly Distribution The hourly distribution of benign posts is consistent with the diurnal pattern of human, while that of malicious posts is not.