BOTNET JUDO : Fighting Spam with Itself

BOTNET JUDO : Fighting Spam with Itself Andreas Pitsillidis Kirill Levchenko Christian Kreibich Chris Kanich Geoffrey M. Voelker Vern Paxson Nicholas Weaver Stefan Savage Presented by: Mohan Krishna Karanam

Agenda Introduction: What is Spam?; Background and Related Work; Current Antispam Approaches; Spamming Botnets; Template-based Spam; Signature Generator; Evaluation; Conclusion

What is spam? Terminology: Unsolicited Commercial Email, Unsolicited Bulk Email. Examples

Introduction How do spammers operate? They exploit small time advantages to deliver large overall gains. How does the receiver block spam? By installing filters that block spam based on pattern detection (signatures). How do the filters work? How much do you know about your opponent's next move, and how quickly can you act on it? It is all about actions and counter-reactions.

Introduction What is a Botnet? Components of a botnet: command and control, fast-flux DNS, zombie computers. Botnet: a number of Internet-connected computers that communicate with other similar machines and coordinate their actions through command and control or by passing messages to one another. Botnets have often been used to send spam email or to participate in distributed denial-of-service attacks. The word botnet is a combination of the words robot and network. The term is usually used with a negative or malicious connotation. Botnets can be legal or illegal (en.wikipedia.org/wiki/Botnet).

Introduction Types of attacks using a botnet: Distributed Denial-of-Service attacks, Spyware, E-mail Spam, Click fraud, Bitcoin Mining

E-mail Spam Target: mail servers. Bots generate mail in bulk, with attractive content that misleads the user. Spam emails are generated from a template. Templates are designed so that the messages derived from them are not identical, which lets the spam slip past spam filters. Many varied templates are in use, and detection becomes difficult because it needs manual effort. The email filter cannot tell that a message is spam because the content varies from message to message. It is difficult for both [Sender] and [Content] filters to detect the spam, so alternative methods are required to combat botnet spam.

Anti spam approaches Content based, Sender based, Template based. Popular preventive measures include anti-virus software, passive OS fingerprinting, network-based approaches (null routing) and spam filtering. A template-based filter is derived from the templates spammers use, and can potentially identify spam emails. Obtaining those templates can require significant manual effort to reverse engineer each unique protocol. This motivates the black-box approach.

Content based spam The oldest and best-known filtering technique. Filters based on the textual content. Textual content may include: message body, anomalous headers, undesirable messages. Transition from manual configuration to systems based on supervised learning.

Sender Based spam Focuses on means by which spam is delivered. Any IP address delivering spam is highly likely to repeat this act and unlikely to send any legitimate communication. Makes use of blacklists to store IP addresses of internet hosts which send spam. Disadvantages: Scalability and increase in number of hosts.
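As an illustration of the sender-based idea, here is a minimal Python sketch of a blacklist lookup. The blacklist contents are hypothetical (the addresses come from the reserved documentation ranges) and the function name `is_blacklisted` is an assumption made for this example; real deployments typically query shared blacklists rather than a local set.

```python
import ipaddress

# Hypothetical blacklist; the addresses are from reserved documentation ranges.
BLACKLIST = {
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("203.0.113.7/32"),
}

def is_blacklisted(sender_ip: str) -> bool:
    """Return True if the sending host falls inside any blacklisted network."""
    addr = ipaddress.ip_address(sender_ip)
    return any(addr in net for net in BLACKLIST)
```

The scalability concern on the slide shows up directly here: every newly compromised host needs its own entry, so the list grows with the botnet.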

BOTNET Judo: Black Box approach Individual botnet instances are executed in a controlled environment and the spam messages they attempt to send are analyzed online. Goal: produce a comprehensive regular expression that captures all messages generated from a template and avoids extraneous matches, that is, zero false positives against non-spam mail while matching all spam based on the same template.

Background work A general approach for inferring e-mail generation templates (e.g., mail header anomalies, subject lines, URLs). A concrete algorithm that generates an initial high-quality regular expression and can update it in a fraction of a second if needed. Empirical tests against live botnet spam that demonstrate its effectiveness and identify the requirements for practical use.

Template based Spam Uses macros to personalize each message and to avoid spam filtering based on message content. The figure shows a regex signature generated from 1000 message instances.

Template Elements Macros are of two types: Noise Macros : Used to randomize strings for which there are few semantic rules Dictionary Macros : Used for content that must be semantically appropriate in context.

Real time Filtering Macros can be converted to regular expressions. Noise macros become repeated character classes. Dictionary macros become a disjunction of the dictionary elements. These regexes then match all spam generated from the template. Obtaining templates directly is highly time consuming, as it involves reverse engineering the botnet 'command and control' channel. This is overcome by using the 'Template Inference Algorithm'.
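The macro-to-regex conversion described above can be sketched in a few lines of Python. The helper names `noise_macro` and `dictionary_macro`, the sample template, and the dictionary words are all assumptions made for illustration; they are not taken from any real botnet template.

```python
import re

def noise_macro(charset: str, min_len: int, max_len: int) -> str:
    """Noise macro -> repeated character class, e.g. [a-z0-9]{4,8}."""
    return f"[{charset}]{{{min_len},{max_len}}}"

def dictionary_macro(words: list[str]) -> str:
    """Dictionary macro -> disjunction of the (escaped) dictionary entries."""
    return "(?:" + "|".join(re.escape(w) for w in words) + ")"

# Hypothetical template: "Subject: <dictionary word> deals <noise string>"
signature = re.compile(
    "Subject: "
    + dictionary_macro(["Hot", "Best", "New"])
    + " deals "
    + noise_macro("a-z0-9", 4, 8)
)
```

With this signature, `signature.fullmatch("Subject: Best deals x9k2q")` succeeds, while a subject using a word outside the dictionary, or a noise string shorter than four characters, does not match.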

Architecture of the Judo system The Judo filtering system has three components: a bot farm, a signature generator, and a spam filter.

System Assumptions: The proposed spam filtering system operates on a number of assumptions. First, we assume that bots compose spam using a template system as described. Second, we rely on spam campaigns employing a small number of templates at any point in time. Third, the system assumes that the first few messages of a new template do not appear intermixed with messages from other new templates.

Assumptions for the Judo system Spam is composed by bots. The experiment relies on spam campaigns employing a small number of templates at any point in time. The Judo system assumes that the first few messages of a new template do not appear intermixed with messages from other new templates.

The Signature Generator Template Inference: Anchors; Macros (Dictionary Macros, Micro-anchors, Noise Macros). Leveraging Domain Knowledge: Header Filtering; Special tokens. Signature Update: Second Chance Mechanism; Pre-Clustering. Execution Time.
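To make the notion of anchors concrete, the following Python sketch finds substrings shared by every message in a sample and splits recursively around them. It is a brute-force illustration under stated assumptions, not the authors' implementation: the function names, the minimum anchor length `min_len`, and the sample messages are all hypothetical.

```python
def longest_common_substring(messages: list[str], min_len: int) -> str:
    """Longest substring of messages[0] present in every message.

    Returns '' when no common substring of at least min_len characters exists.
    """
    first = messages[0]
    for length in range(len(first), min_len - 1, -1):
        for start in range(len(first) - length + 1):
            candidate = first[start:start + length]
            if all(candidate in m for m in messages[1:]):
                return candidate
    return ""

def find_anchors(messages: list[str], min_len: int = 3) -> list[str]:
    """Recursively split every message around the longest shared substring."""
    anchor = longest_common_substring(messages, min_len)
    if not anchor:
        return []
    # Split each message at the first occurrence of the anchor.
    left = [m[:m.index(anchor)] for m in messages]
    right = [m[m.index(anchor) + len(anchor):] for m in messages]
    return find_anchors(left, min_len) + [anchor] + find_anchors(right, min_len)

sample = [
    "Buy pills at http://abc.example now",
    "Buy watch at http://xyz.example now",
    "Buy gold at http://q1.example now",
]
anchors = find_anchors(sample)
```

The text between consecutive anchors is what varies from message to message, which is where the dictionary and noise macros would be inferred.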

Signature Update A set of signatures is needed, rather than a single signature, because several templates may be in use at the same time. Training buffer: there is a trade-off between signature selectivity and training time. Two additional mechanisms handle extreme cases: the Second Chance Mechanism and Pre-Clustering.
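The interplay between the signature set and the training buffer can be sketched as follows. This is a simplified model under assumptions: the class name `SignatureSet`, the `infer` callback, and the `toy_infer` stand-in (which only uses a common prefix) are hypothetical and much cruder than the template inference the slides describe.

```python
import os
import re
from typing import Callable

class SignatureSet:
    """Holds the active signatures plus a buffer of unmatched messages."""

    def __init__(self, k: int, infer: Callable[[list[str]], str]):
        self.k = k                # training buffer size
        self.infer = infer        # turns k messages into one regex string
        self.signatures = []      # compiled regexes, one per inferred template
        self.buffer = []          # messages no signature has matched yet

    def observe(self, message: str) -> bool:
        """Return True if a signature matches; otherwise buffer the message."""
        if any(sig.fullmatch(message) for sig in self.signatures):
            return True
        self.buffer.append(message)
        if len(self.buffer) >= self.k:
            # Buffer is full: infer a new signature and start over.
            self.signatures.append(re.compile(self.infer(self.buffer)))
            self.buffer = []
        return False

def toy_infer(messages: list[str]) -> str:
    """Stand-in for template inference: shared prefix followed by anything."""
    return re.escape(os.path.commonprefix(messages)) + ".*"
```

A small `k` deploys a signature quickly but from little evidence; a large `k` sees more of the template before committing, which is the selectivity versus training time trade-off noted above.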

Second Chance Mechanism Used to mitigate the effects of a small training buffer. Developed to combine the advantage of fast signature deployment with the eventual benefits of dictionaries. If a message fails to match an existing signature, it is re-checked against versions of the existing signatures consisting only of anchors. If one matches, the corresponding signature is updated. The update is performed incrementally, without rerunning the inference algorithm. Pre-Clustering: A large training buffer may intermix messages from different templates, resulting in an amalgamated signature. To mitigate the effects of overly large training buffers (potentially mixed REs), unclassified messages are clustered using skeleton signatures. A skeleton signature is akin to an anchor signature, but is built with a larger minimum anchor size and is therefore more permissive. Skeleton signatures are used to sort incoming messages prior to running Judo on them. This is similar to the second chance mechanism, but with a larger allowable anchor size.
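A minimal Python sketch of the second chance idea follows, assuming a deliberately tiny signature model with one dictionary slot between two anchors. The class name `Signature`, its fields, and the example strings are hypothetical.

```python
import re

class Signature:
    """Toy signature: prefix anchor, one dictionary slot, suffix anchor."""

    def __init__(self, prefix: str, suffix: str, words: set[str]):
        self.prefix, self.suffix, self.words = prefix, suffix, set(words)
        # Anchors-only form: the dictionary slot accepts any non-empty text.
        self.anchors_only = re.compile(
            re.escape(prefix) + "(.+)" + re.escape(suffix)
        )

    def full(self) -> re.Pattern:
        """Full signature: anchors plus the current dictionary disjunction."""
        alternatives = "|".join(re.escape(w) for w in sorted(self.words))
        return re.compile(
            re.escape(self.prefix) + "(?:" + alternatives + ")" + re.escape(self.suffix)
        )

    def matches(self, message: str) -> bool:
        return self.full().fullmatch(message) is not None

    def second_chance(self, message: str) -> bool:
        """Re-check against anchors only; on success, extend the dictionary."""
        m = self.anchors_only.fullmatch(message)
        if m is None:
            return False
        self.words.add(m.group(1))  # incremental update, no re-inference
        return True
```

A signature trained on only a few messages will have an incomplete dictionary. The anchors-only re-check lets it absorb a previously unseen dictionary word without collecting a fresh training buffer.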

Pre-Clustering Large training buffers intermix messages from different templates, resulting in an amalgamated signature. Pre-clustering is used to mitigate the effects of a large training buffer. Skeleton signature: if a message fails to match a full signature, an attempt is made to assign it to a training buffer using a skeleton signature. Once the buffer is full, a skeleton regex is generated. Several skeleton regexes are then combined to form a full signature ready for use.
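The sorting step can be sketched in Python as below, assuming each skeleton signature is just an ordered list of anchors joined by wildcards. The function names, the cluster labels, and the sample messages are hypothetical, and the sketch covers only the assignment of messages to training buffers, not the generation of skeletons.

```python
import re
from collections import defaultdict

def skeleton_regex(anchors: list[str]) -> re.Pattern:
    """Permissive signature: the anchors in order, anything in between."""
    return re.compile(".*".join(re.escape(a) for a in anchors), re.DOTALL)

def pre_cluster(messages: list[str], skeletons: dict[str, re.Pattern]) -> dict[str, list[str]]:
    """Assign each message to the buffer of the first skeleton it matches."""
    buffers = defaultdict(list)
    for msg in messages:
        for name, skel in skeletons.items():
            if skel.search(msg):
                buffers[name].append(msg)
                break
        else:
            # No skeleton matched: hold the message as unclassified.
            buffers["unclassified"].append(msg)
    return dict(buffers)
```

Keeping one training buffer per skeleton means the inference step sees messages from a single template at a time, which avoids the amalgamated signatures described above.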

Evaluation and Methodology Signature Safety Testing Methodology; Single Template Inference; Multiple Template Inference; Real-world Deployment; False Positives; Response Time; Other Content-based Approaches

Single template inference Results The Template Inference Algorithm is the heart of the Judo system. Generated 5000 instances of spam from a 'Storm' bot, using templates gained through reverse engineering. A 0% false positive rate was achieved at k=1000, where k is the size of the training buffer.

Multi Template inference Results The false negative rate decreases as the classification delay d increases. The false negative rate increases as k increases, which may seem counterintuitive (in the previous experiment, increasing k decreased the false negative rate).
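For reference, the two error rates used in these results can be computed as in the Python sketch below. The function name `rates` and the toy messages are assumptions for illustration and do not reproduce the experiment's data.

```python
import re

def rates(signatures: list[re.Pattern], spam: list[str], ham: list[str]) -> tuple[float, float]:
    """Return (false negative rate, false positive rate) for a signature set."""
    def hit(msg: str) -> bool:
        return any(sig.search(msg) for sig in signatures)

    false_negatives = sum(not hit(m) for m in spam)  # spam that slipped through
    false_positives = sum(hit(m) for m in ham)       # legitimate mail blocked
    return false_negatives / len(spam), false_positives / len(ham)
```

The false negative rate is measured over spam only and the false positive rate over legitimate mail only, so the two can move independently as k and d change.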

DYNAMIC BEHAVIOUR

Real world deployment Results Worst case: Rustock is the only source of false positives, at 1 in 12,500 messages. All others produced 0 total false positives in the corpora.

Response Time The algorithm takes under 10 seconds to run in almost all cases. The remaining time is spent building up the required training sets. The deployed Rustock bots required only 20 hours to send 932,474 messages, a spamming rate of 194 messages per minute for each bot.
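As a quick consistency check on these figures, the short Python calculation below derives the aggregate sending rate and the number of bots it implies. The bot count is not stated on the slide; it is only what the three quoted numbers suggest when taken together.

```python
total_messages = 932_474   # messages sent, from the slide
hours = 20                 # elapsed time, from the slide
per_bot_rate = 194         # messages per minute per bot, from the slide

# Aggregate rate across all bots, in messages per minute.
aggregate_rate = total_messages / (hours * 60)

# Number of bots implied by the per-bot rate (an inference, not a quoted figure).
implied_bots = aggregate_rate / per_bot_rate
```

The aggregate rate works out to roughly 777 messages per minute, which at 194 messages per minute per bot corresponds to about four bots.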

Conclusion Judo is an attractive system which generates highly specialized regexes. Judo also proves that it is practical to generate high-quality spam content signatures by observing the output of the bot instances.

Queries and Discussion

THANK YOU