Automated Worm Fingerprinting

Slides:

Advertisements

Similar presentations

Wenke Lee and Nick Feamster Georgia Tech Botnet and Spam Detection in High-Speed Networks.

Advertisements

Wenke Lee and Nick Feamster Georgia Tech Botnet and Spam Detection in High-Speed Networks.

Bitmap algorithms for flow counting – Internet Measurement Conference, October 2003 Bitmap Algorithms for Counting Active Flows on High Speed Links Cristian.

Code-Red : a case study on the spread and victims of an Internet worm David Moore, Colleen Shannon, Jeffery Brown Jonghyun Kim.

New Directions in Traffic Measurement and Accounting Cristian Estan – UCSD George Varghese - UCSD Reviewed by Michela Becchi Discussion Leaders Andrew.

CS 443 Advanced OS Fabián E. Bustamante, Spring 2005 Automated Worm Fingerprinting Sumeet Singh, Cristian Estan, George Varghese and Stefan Savage Presenter:

Automated Worm Fingerprinting [Singh, Estan et al] Internet Quarantine: Requirements for Self- Propagating Code [Moore, Shannon et al] David W. Hill CSCI.

Firewalls and Intrusion Detection Systems

 Looked at some research approaches to: o Evaluate defense effectiveness o Stop worm from spreading from a given host o Defend a circle of friends against.

 Population: N=100,000  Scan rate  = 4000/sec, Initially infected: I 0 =10  Monitored IP space 2 20, Monitoring interval:  = 1 second Infected hosts.

 Well-publicized worms  Worm propagation curve  Scanning strategies (uniform, permutation, hitlist, subnet) 1.

Incremental Network Programming for Wireless Sensors NEST Retreat June 3 rd, 2004 Jaein Jeong UC Berkeley, EECS Introduction Background – Mechanisms of.

5/1/2006Sireesha/IDS1 Intrusion Detection Systems (A preliminary study) Sireesha Dasaraju CS526 - Advanced Internet Systems UCCS.

Deterministic Memory- Efficient String Matching Algorithms for Intrusion Detection Nathan Tuck, Timothy Sherwood, Brad Calder, George Varghese Department.

Worms: Taxonomy and Detection Mark Shaneck 2/6/2004.

“On Scalable Attack Detection in the Network” Ramana Rao Kompella, Sumeet Singh, and George Varghese Presented by Nadine Sundquist.

Automated Worm Fingerprinting Sumeet Singh, Cristian Estan, George Varghese, and Stefan Savage Manan Sanghi.

1 The Spread of the Sapphire/Slammer Worm D. Moore, V. Paxson, S. Savage, C. Shannon, S. Staniford, N. Weaver Presented by Stefan Birrer.

Internet Quarantine: Requirements for Containing Self-Propagating Code David Moore et. al. University of California, San Diego.

Hash, Don’t Cache: Fast Packet Forwarding for Enterprise Edge Routers Minlan Yu Princeton University Joint work with Jennifer.

FIREWALL TECHNOLOGIES Tahani al jehani. Firewall benefits  A firewall functions as a choke point – all traffic in and out must pass through this single.

George Varghese (based on Cristi Estan’s work) University of California, San Diego May 2011 Internet traffic measurement: from packets to insight.

Lucent Technologies – Proprietary Use pursuant to company instruction Learning Sequential Models for Detecting Anomalous Protocol Usage (work in progress)

FIREWALL Mạng máy tính nâng cao-V1.

Automated Worm Fingerprinting

SIGCOMM 2002 New Directions in Traffic Measurement and Accounting Focusing on the Elephants, Ignoring the Mice Cristian Estan and George Varghese University.

Scalable and Efficient Data Streaming Algorithms for Detecting Common Content in Internet Traffic Minho Sung Networking & Telecommunications Group College.

A Virtual Honeypot Framework Author: Niels Provos Published in: CITI Report 03-1 Presenter: Tao Li.

CSCI 530 Lab Intrusion Detection Systems IDS. A collection of techniques and methodologies used to monitor suspicious activities both at the network and.

15-744: Computer Networking L-23 Worms. 2 Overview Worm propagation Worm signatures.

1 Figure 4-16: Malicious Software (Malware) Malware: Malicious software Essentially an automated attack robot capable of doing much damage Usually target-of-opportunity.

Click to add Text Automated Worm Fingerprinting Sumeet Singh, Cristian Estan, George Varghese and Stefan Savage Department of Computer Science and Engineering.

DoWitcher: Effective Worm Detection and Containment in the Internet Core S. Ranjan et. al in INFOCOM 2007 Presented by: Sailesh Kumar.

Automated Worm Fingerprinting Sumeet Singh, Cristian Estan, George Varghese, and Stefan Savage.

IEEE Communications Surveys & Tutorials 1st Quarter 2008.

Heuristics to Classify Internet Backbone Traffic based on Connection Patterns Wolfgang John and Sven Tafvelin Dept. of Computer Science and Engineering.

Internet Quarantine: Requirements for Containing Self-Propagating Code David Moore, Colleen Shannon, Geoffrey M.Voelker, Stefan Savage University of California,

1 Very Fast containment of Scanning Worms By: Artur Zak Modified by: David Allen Nicholas Weaver Stuart Staniford Vern Paxson ICSI Nevis Netowrks ICSI.

Search Worms, ACM Workshop on Recurring Malcode (WORM) 2006 N Provos, J McClain, K Wang Dhruv Sharma

D 陳怡安 R 解巽評 R 高榮泰 IEEE/ACM TRANSACTIONS ON NETWORKING OCTOBER 2006 Cristian Estan, George Varghese, Member, IEEE, and Michael Fisk.

Polygraph: Automatically Generating Signatures for Polymorphic Worms Presented by: Devendra Salvi Paper by : James Newsome, Brad Karp, Dawn Song.

Intrusion Detection Systems Paper written detailing importance of audit data in detecting misuse + user behavior 1984-SRI int’l develop method of.

1 Modeling, Early Detection, and Mitigation of Internet Worm Attacks Cliff C. Zou Assistant professor School of Computer Science University of Central.

Automated Worm Fingerprinting Authors: Sumeet Singh, Cristian Estan, George Varghese and Stefan Savage Publish: OSDI'04. Presenter: YanYan Wang.

Slammer Worm By : Varsha Gupta.P 08QR1A1216.

Role Of Network IDS in Network Perimeter Defense.

Polygraph: Automatically Generating Signatures for Polymorphic Worms Authors: James Newsome (CMU), Brad Karp (Intel Research), Dawn Song (CMU) Presenter:

Cosc 4765 Antivirus Approaches. In a Perfect world The best solution to viruses and worms to prevent infected the system –Generally considered impossible.

Vigilante: End-to-End Containment of Internet Worms Manuel Costa, Jon Crowcroft, Miguel Castro, Antony Rowstron, Lidong Zhou, Lintao Zhang and Paul Barham.

Behrouz A. Forouzan TCP/IP Protocol Suite, 3rd Ed.

Internet Quarantine: Requirements for Containing Self-Propagating Code

TMG Client Protection 6NPS – Session 7.

POLYGRAPH: Automatically Generating Signatures for Polymorphic Worms

EN Lecture Notes Spring 2016

Empirically Characterizing the Buffer Behaviour of Real Devices

Data Streaming in Computer Networking

DNS-based Detection of Computer Worms in an Enterprise Environment

Chapter 14 User Datagram Protocol (UDP)

Chap 10 Malicious Software.

Taking Down the Internet

Qun Huang, Patrick P. C. Lee, Yungang Bao

Intrusion Prevention Systems

Local Worm Detection using Honeypots Justin Miller Jan 25, 2007

Chap 10 Malicious Software.

Delivery, Forwarding, and Routing of IP Packets

CS 6290 Many-core & Interconnect

Automated Worm Fingerprinting

Transport Layer Identification of P2P Traffic

Session 20 INST 346 Technologies, Infrastructure and Architecture

Introduction to Internet Worm

Presentation transcript:

Automated Worm Fingerprinting Sumeet Singh, Cristian Estan, George Varghese, and Stefan Savage

Introduction Problem: how to react quickly to worms? CodeRed 2001 Infected ~360,000 hosts within 11 hours Sapphire/Slammer (376 bytes) 2002 Infected ~75,000 hosts within 10 minutes

Existing Approaches Detection Characterization Ad hoc intrusion detection Characterization Manual signature extraction Isolates and decompiles a new worm Look for and test unique signatures Can take hours or days

Existing Approaches Containment Updates to anti-virus and network filtering products

Earlybird Automatically detect and contain new worms Two observations Some portion of the content in existing worms is invariant Rare to see the same string recurring from many sources to many destinations

Earlybird Automatically extract the signature of all known worms Also Blaster, MyDoom, and Kibuv.B hours or days before any public signatures were distributed Few false positives

Background and Related Work Almost all IPs were scanned by Slammer < 10 minutes Limited only by bandwidth constraints

The SQL Slammer Worm: 30 Minutes After “Release” This information is from the Cooperative Association for Internet Data Analysis and the University of California at San Diego. - Infections doubled every 8.5 seconds - Spread 100X faster than Code Red - At peak, scanned 55 million hosts per second.

Network Effects Of The SQL Slammer Worm At the height of infections Several ISPs noted significant bandwidth consumption at peering points Average packet loss approached 20% South Korea lost almost all Internet service for period of time Financial ATMs were affected Some airline ticketing systems overwhelmed

Signature-Based Methods Pretty effective if signatures can be generated quickly For CodeRed, 60 minutes For Slammer, 1 – 5 minutes

Worm Detection Three classes of methods Scan detection Honeypots Behavioral techniques

Scan Detection Look for unusual frequency and distribution of address scanning Limitations Not suited to worms that spread in a non-random fashion (i.e. emails, IM, P2P apps) Based on a target list Spread topologically

Scan Detection More limitations Detects infected sites Does not produce a signature

Honeypots Monitored idle hosts with untreated vulnerabilities Used to isolate worms Limitations Manual extraction of signatures Depend on quick infections

Behavioral Detection Looks for unusual system call patterns Sending a packet from the same buffer containing a received packet Can detect slow moving worms Limitations Needs application-specific knowledge Cannot infer a large-scale outbreak

Characterization Process of analyzing and identifying a new worm Current approaches Use a priori vulnerability signatures Automated signature extraction

Vulnerability Signatures Example Slammer Worm UDP traffic on port 1434 that is longer than 100 bytes (buffer overflow) Can be deployed before the outbreak Can only be applied to well-known vulnerabilities

Some Automated Signature Extraction Techniques Allows viruses to infect decoy programs Extracts the modified regions of the decoy Uses heuristics to identify invariant code strings across infected instances

Some Automated Signature Extraction Techniques Limitation Assumes the presence of a virus in a controlled environment

Some Automated Signature Extraction Techniques Honeycomb Find longest common subsequences among sets of strings found in messages Autograph Uses network-level data to infer worm signatures Limitations Scale and full distributed deployments

Containment Mechanism used to deter the spread of an active worm Host quarantine Via IP ACLs on routers or firewalls String-matching Connection throttling On all outgoing connections

Host Quarantine Preventing an infected host from talking to other hosts Via IP ACLs on routers or firewalls

Defining Worm Behavior Content invariance Portions of a worm are invariant (e.g. the decryption routine) Content prevalence Appears frequently on the network Address dispersion Distribution of destination addresses more uniform to spread fast

Finding Worm Signatures Traffic pattern is sufficient for detecting worms Relatively straightforward Extract all possible substrings Raise an alarm when FrequencyCounter[substring] > threshold1 SourceCounter[substring] > threshold2 DestCounter[substring] > threshold3

Practical Content Sifting Characteristics Small processing requirements Small memory requirements Allows arbitrary deployment strategies

Estimating Content Prevalence Finding the packet payloads that appear at least x times among the N packets sent During a given interval

Estimating Content Prevalence Table[payload] 1 GB table filled in 10 seconds Table[hash[payload]] 1 GB table filled in 4 minutes Tracking millions of ants to track a few elephants Collisions...false positives

Multistage Filters stream memory Array of counters Hash(Pink) As with sample and hold, we start with an empty stream memory. But besides this we also have an array of counters initialized to all 0s. When a packet comes in, it is hashed to one of the counters based on the stream identifier and the counter is incremented. [Singh et al. 2002]

Multistage Filters packet memory Array of counters Hash(Green) Packets from different streams usually hash to different counters.

Multistage Filters packet memory Array of counters Hash(Green) But packets from the same stream always hash to the same counter. This gives us a nice invariant. The value of the counter a stream hashes to will always be at least as large as the stream itself. So if we set a threshold which is three in this example, we are sure that by then time a stream reaches the threshold, the counter iit hashes to does too.

Multistage Filters packet memory Now a yellow packet.

Multistage Filters packet memory Collisions are OK We see here that the violet packet hashed to the same counter as the pink one. We do nothing about these collisions. This makes it very easy to implement the counters efficiently in hardware or software.

Multistage Filters Reached threshold packet memory Insert packet1 1 Now green sends its third packet. We see that the counter reaches the threshold and create an entry for the green stream which will count all of its packets from here on.

Multistage Filters packet memory packet1 1 Brown packet.

Multistage Filters packet memory packet1 1 packet2 1 Now, pink got an entry, but it shouldn’t have, because it’s only at its second packet. The collision with violet is what caused the trouble. We could reduce the number of collisions by increasing the number of counters, but this isn’t a very scalable solution since we would have to increase the number of counters linearly with the number of streams.

Multistage Filters packet memory Stage 1 Stage 2 packet1 1 What we can do is add another stage of counters that uses an independent hash function and operates in parallel with the first one. We change the rule for adding an entry to the stream memory so that we only add an entry if the counters at both stages reach the threshold. Green will have no problem achieving this because it will push both counters high. Pink is not so lucky though, because its counter at the second stage doesn’t let it get an entry. Our analysis shows that the number of small streams passing the filter decreases exponentially with the number of stages. No false negatives! Stage 2

Gray = all prior packets Conservative Updates Gray = all prior packets Conservative update is a modification of the way multistage filters operate that accounts for spectacular performance improvements. Let’s revisit how this three stage filter would normally operate (gray represents all previous packets). When the yellow packet comes in it is hashed to one counter at each stage and all three counters are incremented.

Conservative Updates Redundant If we take a look at these counters we can see that yellow sent just this one packet. We know this based on the counter from the second stage. So even if we don’t increment the other two counters, we don’t violate the invariant that all counters are at least as large as the yellow stream. But does this improve the performance of the filter? Very much. What happens is that by updating the counters conservatively we give less help to other streams to reach the threshold and thus significantly reduce the number of small streams passing the filter. So, this is how conservative update would work in this case.

Conservative Updates Of course, if we had more than one counter with the minimum value we would increment them all.

Detecting Common Strings Cannot afford to detect all substrings Maybe can afford to detect all strings with a small fixed length

Detecting Common Strings Cannot afford to detect all substrings Maybe can afford to detect all strings with a small fixed length A horse is a horse, of course, of course F1 = (c1p4 + c2p3 + c3p2 + c4p1 + c5) mod M

Detecting Common Strings Cannot afford to detect all substrings Maybe can afford to detect all strings with a small fixed length A horse is a horse, of course, of course F2 = (c2p4 + c3p3 + c4p2 + c5p1 + c6) mod M F1 = (c1p4 + c2p3 + c3p2 + c4p1 + c5) mod M

Detecting Common Strings Cannot afford to detect all substrings Maybe can afford to detect all strings with a small fixed length F2 = (c2p4 + c3p3 + c4p2 + c5p1 + c6) mod M = (c1p5 + c2p4 + c3p3 + c4p2 + c5p1 + c6 - c1p5) mod M = (pF1 + c6 - c1p5) mod M

Detecting Common Strings Cannot afford to detect all substrings Maybe can afford to detect all strings with a small fixed length Still too expensive…

Estimating Address Dispersion Not sufficient to count the number of source and destination pairs e.g. send a mail to a mailing list Two sources—mail server and the sender Many destinations Need to track the distinct source and destination IP addresses For each substring

Bitmap counting – direct bitmap Set bits in the bitmap using hash of the flow ID of incoming packets HASH(green)=10001001 [Estan et al. 2003]

Bitmap counting – direct bitmap Different flows have different hash values HASH(blue)=00100100

Bitmap counting – direct bitmap Packets from the same flow always hash to the same bit HASH(green)=10001001

Bitmap counting – direct bitmap Collisions OK, estimates compensate for them HASH(violet)=10010101

Bitmap counting – direct bitmap HASH(orange)=11110011

Bitmap counting – direct bitmap HASH(pink)=11100000

Bitmap counting – direct bitmap As the bitmap fills up, estimates get inaccurate HASH(yellow)=01100011

Bitmap counting – direct bitmap Solution: use more bits HASH(green)=10001001

Bitmap counting – direct bitmap Solution: use more bits Problem: memory scales with the number of flows HASH(blue)=00100100

Bitmap counting – virtual bitmap Solution: a) store only a portion of the bitmap b) multiply estimate by scaling factor

Bitmap counting – virtual bitmap HASH(pink)=11100000

Bitmap counting – virtual bitmap Problem: estimate inaccurate when few flows active HASH(yellow)=01100011

Bitmap counting – multiple bmps Solution: use many bitmaps, each accurate for a different range

Bitmap counting – multiple bmps HASH(pink)=11100000

Bitmap counting – multiple bmps HASH(yellow)=01100011

Bitmap counting – multiple bmps Use this bitmap to estimate number of flows

Bitmap counting – multiple bmps Use this bitmap to estimate number of flows

Bitmap counting – multires. bmp OR OR Problem: must update up to three bitmaps per packet Solution: combine bitmaps into one

Bitmap counting – multires. bmp HASH(pink)=11100000

Bitmap counting – multires. bmp HASH(yellow)=01100011

Multiresolution Bitmaps Still too expensive to scale Scaled bitmap Recycles the hash space with too many bits set Adjusts the scaling factor according

Too CPU-Intensive A packet with 1,000 bytes of payload Needs 960 fingerprints for string length of 40 Prone to Denial-of-Service attacks

CPU Scaling Obvious approach: sampling Solution: value sampling - Random sampling may miss many substrings Solution: value sampling Track only certain substrings e.g. last 6 bits of fingerprint are 0 P(not tracking a worm) = P(not tracking any of its substrings)

CPU Scaling Example Track only substrings with last 6 bits = 0s String length = 40 P(finding a 100-byte signature) = 55% P(finding a 200-byte signature) = 92% P(finding a 400-byte signature) = 99.64%

Putting It Together Address Dispersion Table key src cnt dest cnt header payload substring fingerprints AD entry exist? update counters else update counter key cnt cnt > prevalence threshold? create AD entry counters > dispersion threshold? report key as suspicious worm Content Prevalence Table

Putting It Together Sample frequency: 1/64 String length: 40 Use 4 hash functions to update prevalence table Multistage filter reset every 60 seconds

System Design Two major components Sensors An aggregator Sift through traffic for a given address space Report signatures An aggregator Coordinates real-time updates Distributes signatures

Implementation and Environment Written in C and MySQL (5,000 lines) rrd-tools library for graphical reporting PHP scripting for administrative control Prototype executes on a 1.6Ghz AMD Opteron 242 1U Server Linux 2.6 kernel

EarlyBird Processes 1TB of traffic per day Can keep up with 200Mbps of continuous traffic

Parameter Tuning Prevalence threshold: 3 Address dispersion threshold Very few signatures repeat Address dispersion threshold 30 sources and 30 destinations Reset every few hours Reduces the number of reported signatures down to ~25,000

Parameter Tuning Tradeoff between and speed and accuracy Can detect Slammer in 1 second as opposed to 5 seconds With 100x more reported signatures

Performance 200Mbps Can be pipelined and parallelized for achieve 40Gbps

Memory Consumption Prevalence table Address dispersion table 4 stages Each with ~500,000 bins (8 bits/bin) 2MB total Address dispersion table 25K entries (28 bytes each) < 1 MB Total: < 4MB

Trace-Based Verification Two main sources of false positives 2,000 common protocol headers e.g. HTTP, SMTP Whitelisted SPAM e-mails BitTorrent Many-to-many download

False Negatives So far none Detected every worm outbreak

Inter-Packet Signatures An attacker might evade detection by splitting an invariant string across packets With 7MB extra, EarlyBird can keep per flow states and fingerprint across packets

Live Experience with EarlyBird Detected precise signatures CodeRed variants MyDoom mail worm Sasser Kibvu.B

Variant Content Polymorphic viruses Semantically equivalent but textually distinct code Invariant decoding routine

Extensions Self configuration Slow worms

Containment How to handle false positives? If too aggressive, EarlyBird becomes a target for DoS attacks An attacker can fool the system to block a target message

Coordination Trust of deployed servers Validation Policy

Conclusions EarlyBird is a promising approach To detect unknown worms real-time To extract signatures automatically To detect SPAMs with minor changes Wire-speed signature learning is viable