Using Failure Information Analysis to Detect Enterprise Zombies
Zhaosheng Zhu, Vinod Yegneswaran, Yan Chen
Lab of Internet and Security Technology, Northwestern University

Presentation transcript:

1 Using Failure Information Analysis to Detect Enterprise Zombies
Zhaosheng Zhu, Vinod Yegneswaran, Yan Chen
Lab of Internet and Security Technology, Northwestern University; SRI International

2 Motivation
- Increasing prevalence and sophistication of malware
- Current solutions are a day late and a dollar short
  - NIDS
  - Firewalls
  - AV systems
- Conficker is a great example!
  - Over 10M hosts infected across variants A/B/C

3 Related Work
- BotHunter [Usenix Security 2007]
  - Dialog Correlation Engine to detect enterprise bots
  - Models lifecycle of bots:
    - Inbound Scan / Exploit / Egg download / C&C / Outbound Scans
  - Relies on Snort signatures to detect different phases
- Rishi [HotBots 07]: Detects IRC bots based on nickname patterns
- BotSniffer [NDSS 08]
  - Uses spatio-temporal correlation to detect C&C activity
- BotMiner [Usenix Security 08]
  - Combines clustering with BotHunter and BotSniffer heuristics
- Focus on successful bot communication patterns

4 Objective and Approach
- Develop a complement to existing network defenses to improve their resilience and robustness
  - Signature independent
  - Malware family independent: no prior knowledge of malware semantics or C&C mechanisms needed
  - Malware class independent (detects more than bots)
- Key idea: Failure Information Analysis
  - Observation: malware communication patterns result in abnormally high failure rates
  - Correlates network and application failures at multiple points

5 Outline
- Motivations and Key Idea
- Empirical Failure Pattern Study: Malware and Normal Applications
- Netfuse Design
- Evaluations
- Conclusions

6 Malware Failure Patterns
- Empirical survey of 32 malware instances with long-lived traces (5 to 8 hours)
  - SRI honeynet, spamtrap and Offensive Computing
  - Spyware, HTTP botnet, IRC botnet, P2P botnet, Worm
- Application protocols studied:
  - DNS, HTTP, FTP, SMTP, IRC
- 24/32 generated failures
- 18/32 generated DNS failures
  - Mostly NXDOMAINs
  - DNS failures are part of normal behavior for some bots like Kraken and Conficker (which generate a new list of C&C rendezvous points every day)

7 Malware Failure Patterns (2)
- SMTP failures are part of most spam bots
  - Storm, Bobax etc.
  - 550: recipient address rejected
- HTTP failures
  - Generated by worms: Virut (DoS attacks) and Weby
  - Weby contacts remote servers to get configuration info
- IRC failures
  - Channel removed from a public IRC server
  - Channel is full due to too many bots

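8 (Table) Columns: MALWARE | CLASS | DNS rate | HTTP rate | ICMP rate | SMTP rate | TCP rate

CLASS          MALWARE
SPYWARE        Look2me, Wsnpoem
HTTP BOTNET    Bobax, Kraken
IRC BOTNET     Agobot, Gobot, Sdbot I+II, Spybot I/II/III, Wootbot, Webloit
P2P BOTNET     Nugache, Storm I/II
WORM           Allaple, Grum, Kwbot, Mytob, Netsky, Protoride, Virut, Weby

(Per-malware rate values are missing from the transcript; the only surviving fragment is "515" in the spyware rows.)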

9 Normal Applications Studied
- Webcrawler
  - news.sohu.com, amazon.com, bofa.com, imdb.com
- P2P
  - BitTorrent, eMule
- Video
  - Youtube.com
- HTTP 304/Not Modified errors whitelisted

10 Normal Applications Studied
- For video traffic, no transport-layer failures
- At the application level, only “HTTP 304/Not Modified” failures

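11 Normal Application Failure Patterns
(Table) Columns: Application | HTTP hourly rate | ICMP | TCP (# ports / hourly rate)
Rows: Sohu.com, Amazon.com, Imdb.com, Bofa.com, BitTorrent, eMule
Surviving cell fragments (most values are missing from the transcript):
- Sohu.com, Amazon.com, Imdb.com, Bofa.com: /0.041/1.41/0.21/0.9
- BitTorrent, eMule: /333839/370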

12 Empirical Analysis Summary
- High-volume failures are good indicators of malware
- DNS failures (NXDomain messages) are common among malware
- Malware failures tend to be persistent
- Malware failure patterns tend to be repetitive (low entropy), while those of normal applications are not

13 Outline
- Motivations and Key Idea
- Empirical Failure Pattern Study: Malware and Normal Applications
- Netfuse Design
- Evaluations
- Conclusions

14 Netfuse Design
- Netfuse: a behavior-based network monitor
  - Correlates network and application failures
  - Wireshark and L7 filters for protocol parsing
  - Multi-point failure monitoring
- Netfuse components
  - FIA (Failure Information Analysis) Engine
  - DNSMon
  - SVM-based Correlation Engine
  - Clustering

15 Multi-point Deployment
(Diagram labels: Enterprise Network, DNSMon, Gateway, FIA, Failure Scores, SVM Correlation, Clustering)

16 FIA Engine
- Wireshark: open source protocol analyzer / dissector
  - Analyzes online and offline pcap captures
  - Supports most protocols
  - Uses port numbers to choose dissectors
- Augment Wireshark with L7 protocol signatures
  - Automated decoding with payload signatures
  - Sample sig for HTTP:
    http/(0\.9|1\.0|1\.1) [1-5][0-9][0-9] [\x09-\x0d -~]*(connection:|content-type:|content-length:|date:)|post [\x09-\x0d -~]* http/[01]\.[019]
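The payload-signature idea on this slide can be tried out with a few lines of Python. This is only an illustrative sketch, not Netfuse code: the function name `classify_payload`, the case-insensitive matching, and the rule that a status code of 400 or above counts as a failure are all assumptions made for the example.

```python
import re

# HTTP payload signature as quoted on the slide.
# Assumption: matching is case-insensitive.
HTTP_SIG = re.compile(
    rb"http/(0\.9|1\.0|1\.1) [1-5][0-9][0-9] [\x09-\x0d -~]*"
    rb"(connection:|content-type:|content-length:|date:)"
    rb"|post [\x09-\x0d -~]* http/[01]\.[019]",
    re.IGNORECASE,
)

# Helper pattern (not from the slides) to pull out a response status code.
STATUS_LINE = re.compile(rb"^HTTP/\d\.\d (\d{3})")


def classify_payload(payload: bytes) -> str:
    """Label the first bytes of a flow as 'http-failure', 'http' or 'unknown'.

    Hypothetical rule: a response with status >= 400 is treated as a failure.
    """
    if not HTTP_SIG.search(payload):
        return "unknown"
    status = STATUS_LINE.match(payload)
    if status and int(status.group(1)) >= 400:
        return "http-failure"
    return "http"


if __name__ == "__main__":
    sample = b"HTTP/1.1 404 Not Found\r\nContent-Type: text/html\r\n\r\n"
    print(classify_payload(sample))
```

Because the signature looks at the payload rather than the port, the same check applies to HTTP traffic on non-standard ports, which is the reason the slide gives for augmenting port-based dissectors.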

17 DNSMon
- DNS servers are typically located inside enterprise networks
- Suspicious domain lookups can’t be tracked back to the original clients from gateway traces
  - Especially true for NXDomain lookups
  - DNS caching
- DNSMon tracks traffic between clients and the resolving DNS server
  - More comprehensive view of failure activity
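To make the DNSMon point concrete, the sketch below attributes NXDomain responses to the clients that asked for them, which is only possible when traffic is observed between clients and the resolver. The input format (tuples of client IP, queried name, response code) and the function name `nxdomain_by_client` are assumptions for illustration; the slides do not describe DNSMon's actual data model.

```python
from collections import Counter, defaultdict

NXDOMAIN = 3  # DNS response code 3 means "name does not exist"


def nxdomain_by_client(responses):
    """Count NXDomain responses per client and collect the failed names.

    `responses` is an iterable of (client_ip, qname, rcode) tuples, a
    hypothetical record format for traffic seen between clients and the
    resolving DNS server.
    """
    counts = Counter()
    names = defaultdict(set)
    for client, qname, rcode in responses:
        if rcode == NXDOMAIN:
            counts[client] += 1
            names[client].add(qname.lower().rstrip("."))
    return counts, names


if __name__ == "__main__":
    observed = [
        ("10.0.0.5", "abcxyz1.example.", 3),
        ("10.0.0.5", "abcxyz2.example.", 3),
        ("10.0.0.9", "www.example.com.", 0),
    ]
    counts, names = nxdomain_by_client(observed)
    print(counts.most_common(), dict(names))
```

At the gateway, the same failed lookups would all appear to come from the enterprise resolver, so the per-client counts above could not be computed there.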

18 Correlation Engine
- Integrates four failure scores
  - Composite Failure Score
  - Failure Divergence Score
  - Failure Entropy Score
  - Failure Persistence Score
    - Malware failures tend to be long-lived
- SVM-based correlation using Weka

19 Composite Failure Score
- Estimates severity of each host based on failure volume
- Consider hosts with
  - Large # of application failures (e.g., > 15 per min), or
  - TCP RST, ICMP failures > 2 std. dev. from mean of all hosts
- Compute weighted failure score based on failure frequency of protocol
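The host-selection rule on this slide can be sketched as follows. The thresholds (15 application failures per minute, 2 standard deviations) come from the slide; everything else is assumed for illustration, including the function names, the input dictionaries, and the use of caller-supplied per-protocol weights, since the slides do not give the exact weighting formula.

```python
from statistics import mean, pstdev


def candidate_hosts(app_fail_per_min, transport_fail, app_threshold=15.0, k=2.0):
    """Select hosts worth scoring.

    app_fail_per_min: host -> application-level failures per minute
    transport_fail:   host -> count of TCP RST and ICMP failures
    A host qualifies if it exceeds the application threshold, or if its
    transport failures lie more than k standard deviations above the mean
    over all hosts.
    """
    values = list(transport_fail.values())
    mu = mean(values) if values else 0.0
    sigma = pstdev(values) if values else 0.0
    selected = set()
    for host in set(app_fail_per_min) | set(transport_fail):
        if app_fail_per_min.get(host, 0.0) > app_threshold:
            selected.add(host)
        elif transport_fail.get(host, 0) > mu + k * sigma:
            selected.add(host)
    return selected


def composite_score(fail_counts, weights):
    """Weighted sum of per-protocol failure counts for one host.

    The weights are hypothetical inputs; the original weighting is not
    specified in the slides.
    """
    return sum(weights.get(proto, 0.0) * n for proto, n in fail_counts.items())


if __name__ == "__main__":
    transport = {f"h{i}": 1 for i in range(1, 10)}
    transport["h10"] = 100
    print(sorted(candidate_hosts({"h1": 20.0}, transport)))
    print(composite_score({"dns": 10, "tcp": 4}, {"dns": 0.5, "tcp": 0.25}))
```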

20 Failure Persistence Score
- Motivated by the observation that malware failures tend to be long-lived
- Split time horizon into N parts and compute number of parts where failure occurs
- In our experiments N = 24
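A minimal sketch of the persistence computation described above, with N = 24 as on the slide. Returning the fraction of parts that contain a failure (rather than the raw count) is an assumption, as are the function name and the use of numeric timestamps.

```python
def persistence_score(failure_times, start, end, n_parts=24):
    """Fraction of the n_parts equal time slices in [start, end) that
    contain at least one failure.

    failure_times: iterable of numeric timestamps (e.g. seconds).
    Normalising by n_parts is an assumption; the slide only says to count
    the parts in which a failure occurs.
    """
    if end <= start or n_parts <= 0:
        raise ValueError("need end > start and n_parts > 0")
    width = (end - start) / n_parts
    hit = set()
    for t in failure_times:
        if start <= t < end:
            # Clamp guards against floating-point rounding at the upper edge.
            hit.add(min(int((t - start) / width), n_parts - 1))
    return len(hit) / n_parts


if __name__ == "__main__":
    day = 24 * 3600
    steady = [h * 3600 + 10 for h in range(24)]  # one failure every hour
    burst = [10, 20, 30]                          # three failures in one hour
    print(persistence_score(steady, 0, day), persistence_score(burst, 0, day))
```

A host that fails once per hour all day scores higher than one that produces the same or a larger number of failures in a single short burst, which matches the "long-lived" intuition on the slide.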

21 Failure Divergence Score
- Measures degree of uptick in a host’s failure profile
  - Newly infected hosts would demonstrate strong and positive dynamics
- EWMA algorithm
  - α = 0.5
- For each host, protocol and date, compute the difference between expected and actual value
- Add the divergence of each protocol for that host
- Normalize by dividing by the maximum divergence value over all hosts
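The steps on this slide can be sketched as below, using an exponentially weighted moving average with α = 0.5 as the expected value. Several details are assumptions because the slides do not specify them: seeding the average with the first day's count, clamping negative totals to zero (only upticks are of interest), and the nested dictionary used as input.

```python
def ewma_divergence(daily_counts, alpha=0.5):
    """Actual minus expected failure count for the most recent day.

    daily_counts: failure counts, oldest first; the last entry is the day
    being scored. The expectation is an EWMA over the earlier days, seeded
    with the first day's count (an assumption).
    """
    if len(daily_counts) < 2:
        return 0.0
    expected = float(daily_counts[0])
    for x in daily_counts[1:-1]:
        expected = alpha * x + (1 - alpha) * expected
    return daily_counts[-1] - expected


def divergence_scores(history, alpha=0.5):
    """history: host -> protocol -> list of daily failure counts.

    Sums the per-protocol divergences of each host, then divides by the
    largest total over all hosts. Negative totals are clamped to 0, which
    is an assumption rather than something stated on the slide.
    """
    raw = {
        host: sum(ewma_divergence(series, alpha) for series in protos.values())
        for host, protos in history.items()
    }
    top = max(raw.values(), default=0.0)
    if top <= 0:
        return {host: 0.0 for host in raw}
    return {host: max(value, 0.0) / top for host, value in raw.items()}


if __name__ == "__main__":
    history = {
        "newly-infected": {"dns": [2, 2, 2, 10], "tcp": [0, 0, 4]},
        "steady": {"dns": [4, 4, 4, 4]},
        "mild": {"dns": [1, 1, 1, 4]},
    }
    print(divergence_scores(history))
```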

22 Failure Entropy Score
- Measures degree of diversity in a host’s failure profile
  - Malware failures tend to be redundant (low diversity)
- TCP: track server/port distribution of each client receiving failures
- DNS: track domain name diversity
- HTTP/SMTP/FTP: track failure types and host names
- Ignore ICMP
- Compute weighted average failure entropy score
  - Protocols that dominate failure volume of a host get higher weights
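One way to read the entropy score is as a volume-weighted average of per-protocol Shannon entropies. The sketch below follows that reading; the choice of Shannon entropy in bits, the weighting by each protocol's share of the host's failures, and the input format are assumptions, since the slides name the idea but not the formula.

```python
import math
from collections import Counter


def shannon_entropy(items):
    """Shannon entropy (in bits) of the empirical distribution of items."""
    counts = Counter(items)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def failure_entropy_score(failures_by_protocol):
    """Weighted average entropy of one host's failures.

    failures_by_protocol: protocol name -> list of failure features, for
    example (server, port) pairs for TCP or domain names for DNS.
    ICMP is ignored, as on the slide. Each protocol is weighted by its
    share of the host's failure volume (an assumed weighting).
    """
    usable = {
        proto: feats
        for proto, feats in failures_by_protocol.items()
        if proto.lower() != "icmp" and feats
    }
    total = sum(len(feats) for feats in usable.values())
    if total == 0:
        return 0.0
    return sum(len(f) / total * shannon_entropy(f) for f in usable.values())


if __name__ == "__main__":
    repetitive = {"dns": ["cc.example"] * 8}
    diverse = {"dns": ["a.example", "b.example", "c.example", "d.example"]}
    print(failure_entropy_score(repetitive), failure_entropy_score(diverse))
```

A host that keeps failing on the same name or server gets a score near zero, while a host whose failures are spread over many distinct targets scores higher.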

23 Outline
- Motivations and Key Idea
- Empirical Failure Pattern Study: Malware and Normal Applications
- Netfuse Design
- Evaluations
- Conclusions

24 Evaluation Traces
- Malware I: 24 malware traces from failure pattern study
- Malware II: 5 new malware families (Peacomm, Mimail, Rbot, Bifrose, Kraken) + 3 trained families
  - Run for 8 to 10 hours each
- Malware III: 242 traces selected from 5000 malware sandbox traces based on duration & trace size
- Institute Traces: benign traces from a well-administered Class B (/16) network with hundreds of machines (5-day and 12-day)

25 Evaluation Methodology
(Diagram labels: 5-day Institute Trace; 12-day Institute Trace; Malware Trace I: training and testing; Malware Trace II: testing; Malware Trace III: testing)

26 Detection Rate

27 False Positive Rate

28 Performance Summary
- Detection rate > 92% for traces I/II
- Detection rate under 40% for trace III
  - Trace includes many types of malware, including adware with failure patterns similar to benign applications
  - Traces are short, many under 15 mins
- False positive rate < 5%

29 Clustering Results

Malware    Size (pkts)    (heading lost)    (heading lost)
Peacomm    (missing)      3/3               100%
Bifrose    30635          3/3               100%
Mimail     (missing)      3/3               100%
Kraken     49505          3/3               100%
Sdbot      (missing)      3/3               100%
Spybot     79750          3/3               100%
Rbot       (missing)      3/3               100%
Weby       9000           3/3               100%

- Cluster detected hosts based on their failure profile
- 24 instances belong to 8 different types of malware

30 Conclusions
- Failure Information Analysis
  - Signature-independent methodology for detecting infected enterprise hosts
- Netfuse system
  - Four components: FIA Engine, DNSMon, Correlation Engine, Clustering
  - Correlation metrics: Composite Failure Score, Divergence Score, Failure Entropy Score, Persistence Score
- Useful complement to existing network defenses