15-446 Networked Systems Practicum Lecture 14 – Worms/Viruses/Botnets 1.

Slides:



Advertisements
Similar presentations
(Distributed) Denial of Service Nick Feamster CS 4251 Spring 2008.
Advertisements

Code-Red : a case study on the spread and victims of an Internet worm David Moore, Colleen Shannon, Jeffery Brown Jonghyun Kim.
By Hiranmayi Pai Neeraj Jain
Web Defacement Anh Nguyen May 6 th, Organization Introduction How Hackers Deface Web Pages Solutions to Web Defacement Conclusions 2.
Automated Worm Fingerprinting [Singh, Estan et al] Internet Quarantine: Requirements for Self- Propagating Code [Moore, Shannon et al] David W. Hill CSCI.
 Population: N=100,000  Scan rate  = 4000/sec, Initially infected: I 0 =10  Monitored IP space 2 20, Monitoring interval:  = 1 second Infected hosts.
Worms: Taxonomy and Detection Mark Shaneck 2/6/2004.
UNCLASSIFIED Secure Indirect Routing and An Autonomous Enterprise Intrusion Defense System Applied to Mobile ad hoc Networks J. Leland Langston, Raytheon.
Automated Worm Fingerprinting Sumeet Singh, Cristian Estan, George Varghese, and Stefan Savage Manan Sanghi.
Worm Defense. Outline  Internet Quarantine: Requirements for Containing Self-Propagating Code  Netbait: a Distributed Worm Detection Service  Midgard.
Lesson 9-Securing a Network. Overview Identifying threats to the network security. Planning a secure network.
How to Own the Internet in your spare time Ashish Gupta Network Security April 2004.
Internet Quarantine: Requirements for Containing Self-Propagating Code David Moore et. al. University of California, San Diego.
Henric Johnson1 Chapter 10 Malicious Software Henric Johnson Blekinge Institute of Technology, Sweden
Introduction to Honeypot, Botnet, and Security Measurement
Computer Security: Principles and Practice First Edition by William Stallings and Lawrie Brown Lecture slides by Lawrie Brown Chapter 7 – Malicious Software.
Malicious Software Malicious Software Han Zhang & Ruochen Sun.
1 Chap 10 Malicious Software. 2 Viruses and ”Malicious Programs ” Computer “Viruses” and related programs have the ability to replicate themselves on.
© 2007 Cisco Systems, Inc. All rights reserved.Cisco Public ITE PC v4.0 Chapter 1 1 Basic Security Networking for Home and Small Businesses – Chapter 8.
B OTNETS T HREATS A ND B OTNETS DETECTION Mona Aldakheel
Network and Internet Security SYSTEM SECURITY. Virus Countermeasures Antivirus approach ◦Ideal solution: Prevention ◦Not allowing the virus to infect.
Computer Security: Principles and Practice First Edition by William Stallings and Lawrie Brown Lecture slides by Lawrie Brown Chapter 8 – Denial of Service.
Malicious Code Brian E. Brzezicki. Malicious Code (from Chapter 13 and 11)
Internet Worms Brad Karp UCL Computer Science CS GZ03 / th December, 2007.
Automated Worm Fingerprinting
BY ANDREA ALMEIDA T.E COMP DON BOSCO COLLEGE OF ENGINEERING.
Speaker:Chiang Hong-Ren Botnet Detection by Monitoring Group Activities in DNS Traffic.
“How to 0wn the Internet in Your Spare Time” Nathanael Paul Malware Seminar September 7, 2004.
Honeypot and Intrusion Detection System
Active Worms CSE 4471: Information Security 1. Active Worm vs. Virus Active Worm –A program that propagates itself over a network, reproducing itself.
1 How to 0wn the Internet in Your Spare Time Authors: Stuart Staniford, Vern Paxson, Nicholas Weaver Publication: Usenix Security Symposium, 2002 Presenter:
1 Chap 10 Virus. 2 Viruses and ”Malicious Programs ” Computer “Viruses” and related programs have the ability to replicate themselves on an ever increasing.
Virus Detection Mechanisms Final Year Project by Chaitanya kumar CH K.S. Karthik.
15-744: Computer Networking L-23 Worms. 2 Overview Worm propagation Worm signatures.
1 Figure 4-16: Malicious Software (Malware) Malware: Malicious software Essentially an automated attack robot capable of doing much damage Usually target-of-opportunity.
How to Own the Internet in Your Spare Time (Stuart Staniford Vern Paxson Nicholas Weaver ) Giannis Kapantaidakis University of Crete CS558.
Click to add Text Automated Worm Fingerprinting Sumeet Singh, Cristian Estan, George Varghese and Stefan Savage Department of Computer Science and Engineering.
Chapter 10 Malicious software. Viruses and ” Malicious Programs Computer “ Viruses ” and related programs have the ability to replicate themselves on.
Modeling Worms: Two papers at Infocom 2003 Worms Programs that self propagate across the internet by exploiting the security flaws in widely used services.
Automated Worm Fingerprinting Sumeet Singh, Cristian Estan, George Varghese, and Stefan Savage.
IEEE Communications Surveys & Tutorials 1st Quarter 2008.
Week 10-11c Attacks and Malware III. Remote Control Facility distinguishes a bot from a worm distinguishes a bot from a worm worm propagates itself and.
1 Very Fast containment of Scanning Worms By: Artur Zak Modified by: David Allen Nicholas Weaver Stuart Staniford Vern Paxson ICSI Nevis Netowrks ICSI.
Worm Defense Alexander Chang CS239 – Network Security 05/01/2006.
Search Worms, ACM Workshop on Recurring Malcode (WORM) 2006 N Provos, J McClain, K Wang Dhruv Sharma
Viruses a piece of self-replicating code attached to some other code – cf biological virus both propagates itself & carries a payload – carries code to.
n Just as a human virus is passed from person from person, a computer virus is passed from computer to computer. n A virus can be attached to any file.
Chapter 19 – Malicious Software What is the concept of defense: The parrying of a blow. What is its characteristic feature: Awaiting the blow. —On War,
Mobile Code and Worms By Mitun Sinha Pandurang Kamat 04/16/2003.
A Case Study on Computer Worms Balaji Badam. Computer worms A self-propagating program on a network Types of Worms  Target Discovery  Carrier  Activation.
1 Modeling, Early Detection, and Mitigation of Internet Worm Attacks Cliff C. Zou Assistant professor School of Computer Science University of Central.
Automated Worm Fingerprinting Authors: Sumeet Singh, Cristian Estan, George Varghese and Stefan Savage Publish: OSDI'04. Presenter: YanYan Wang.
Slammer Worm By : Varsha Gupta.P 08QR1A1216.
How to 0wn the Internet In Your Spare Time Authors Stuart Staniford, Vern Paxson, Nicholas Weaver Published Proceedings of the 11th USENIX Security Symposium.
Page 1 Viruses. Page 2 What Is a Virus A virus is basically a computer program that has been written to perform a specific set of tasks. Unfortunately,
Malicious Programs (1) Viruses have the ability to replicate themselves Other Malicious programs may be installed by hand on a single machine. They may.
@Yuan Xue Worm Attack Yuan Xue Fall 2012.
Protecting Computers From Viruses and Similarly Programmed Threats Ryan Gray COSC 316.
Firewalls. Overview of Firewalls As the name implies, a firewall acts to provide secured access between two networks A firewall may be implemented as.
Cosc 4765 Antivirus Approaches. In a Perfect world The best solution to viruses and worms to prevent infected the system –Generally considered impossible.
The Worms Crawl In The Worms Crawl Out David Andersen.
MALWARE.
Internet Quarantine: Requirements for Containing Self-Propagating Code
Viruses and Other Malicious Content
Automated Worm Fingerprinting
Chap 10 Malicious Software.
Brad Karp UCL Computer Science
Chap 10 Malicious Software.
Automated Worm Fingerprinting
Introduction to Internet Worm
Presentation transcript:

Networked Systems Practicum Lecture 14 – Worms/Viruses/Botnets 1

Outline Worms Worm Defense Botnet/Viruses 2

What is a Computer Worm? Self replicating network program Exploit vulnerabilities to infect remote machines Victim machines continue to propagate infection Three main stages Detect new targets Attempt to infect new targets Activate code on victim machine Difference w/ computer virus? No human intervention required 3 Network

Why Worry About Worms? Speed Much faster than viruses CRv2: 14 hours for victims Slammer: 10 minutes for victims Faster than human reaction Highly malicious payloads DDoS or data corruption 4

Some Major Worms WormYearStrategyVictimsOther Notes Morris1988 Topological scanning 6KFirst major autonomous worm Code Red2001 Random scanning ~300KFirst recent "fast" worm Nimda2001 Local scanning ~200KLocal subnet scanning Effective mix of techniques Slammer2003 Random scanning >75KSpread worldwide in 10 minutes MyDoom2004 Topological scanning <15KFirst “Zero Day” Worm Conficker2008 Random scanning >15M?Largest infection, capability of updates 5

Threat Model Traditional High-value targets Insider threats Worms & Botnets Automated attack of millions of targets Value in aggregate, not individual systems Threats: Software vulnerabilities; naïve users 6

... and it's profitable Botnets used for Spam (and more spam)? Credit card theft DDoS extortion Flourishing Exchange market Spam proxying: 3-10 cents/host/week 25k botnets: $40k - $130k/year Also for stolen account compromised machines, credit cards, identities, etc. (be worried)? 7

Why is this problem hard? Monoculture: little “genetic diversity” in hosts Instantaneous transmission: Almost entire network within 500ms Slow immune response: human scales (10x- 1Mx slower!)? Poor hygiene: Out of date / misconfigured systems; naïve users Intelligent designer... of pathogens Near-Anonymitity 8

Code Red I v1 July 12th, 2001 Exploited a known vulnerability in Microsoft’s Internet Information Server (IIS) Buffer overflow in a rarely used URL decoding routine – published June 18 th 1 st – 19 th of each month: attempts to spread Random scanning of IP address space 99 propagation threads, 100 th defaced pages on server Static random number generator seed Every worm copy scans the same set of addresses  Linear growth 9

Code Red I v1 20 th – 28 th of each month: attacks DDOS attack against ( Memory resident – rebooting the system removes the worm However, could quickly be reinfected 10

Code Red I v2 July 19th, 2001 Largely same codebase – same author? Ends website defacements Fixes random number generator seeding bug Scanned address space grew exponentially 359,000 hosts infected in 14 hours Compromised almost all vulnerable IIS servers on internet 11

Analysis of Code Red I v2 Random Constant Spread model Constants N = total number of vulnerable machines K = initial compromise rate, per hour T = Time at which incident happens Variables a = proportion of vulnerable machines compromised t = time in hours 12

Analysis of Code Red I v2 N = total number of vulnerable machines K = initial compromise rate, per hour T = Time at which incident happens Variables a = proportion of vulnerable machines compromised t = time in hours “Logistic equation” Rate of growth of epidemic in finite systems when all entities have an equal likelihood of infecting any other entity 13

Code Red I v2 – Plot K = 1.8 T = 11.9 Hourly probe rate data for inbound port 80 at the Chemical Abstracts Service during the initial outbreak of Code Red I on July 19th,

Improvements: Localized scanning Observation: Density of vulnerable hosts in IP address space is not uniform Idea: Bias scanning towards local network Used in CodeRed II P=0.50: Choose address from local class-A network (/8) P=0.38: Choose address from local class-B network (/16) P=0.12: Choose random address Allows worm to spread more quickly 15

Code Red II (August 2001) Began : August 4th, 2001 Exploit : Microsoft IIS webservers (buffer overflow) Named “Code Red II” because : It contained a comment stating so. However the codebase was new. Infected IIS on windows 2000 successfully but caused system crash on windows NT. Installed a root backdoor on the infected machine. 16

Improvements: Multi-vector Idea: Use multiple propagation methods simultaneously Example: Nimda IIS vulnerability Bulk s Open network shares Defaced web pages Code Red II backdoor Onset of Nimda Time (PDT) 18 September, 2001 HTTP connections/second seen at LBNL (only confirmed Nimda attacks) 1/2 hour 17

Better Worms: Hit-list Scanning Worm takes a long time to “get off the ground” Worm author collects a list of, say, vulnerable machines Worm initially attempts to infect these hosts 18

How to build Hit-List Stealthy randomized scan over number of months Distributed scanning via botnet DNS searches – e.g. assemble domain list, search for IP address of mail server in MX records Web crawling spider similar to search engines Public surveys – e.g. Netcraft Listening for announcements – e.g. vulnerable IIS servers during Code Red I 19

Better Worms: Permutation scanning Problem: Many addresses are scanned multiple times Idea: Generate random permutation of all IP addresses, scan in order Hit-list hosts start at their own position in the permutation When an infected host is found, restart at a random point Can be combined with divide-and-conquer approach 20 H0H0 H4H4 H1H1 H3H3 H2H2 H 1 (Restart)

Warhol Worm Simulation shows that employing the two previous techniques, can attack 300,000 hosts in less than 15 minutes Conventional = 10 scans/sec Fast Scanning = 100 scans/sec Warhol = 100 scans/sec, Permutation scanning and 10,000 entry hit list 21

More on Warhol worm 22

Flash worms A flash worm would start with a hit list that contains most/all vulnerable hosts Realistic scenario: Complete scan takes 2h with an OC-12 Internet warfare? Problem: Size of the hit list 9 million hosts  36 MB Compression works: 7.5MB Can be sent over a 256kbps DSL link in 3 seconds Extremely fast: Full infection in tens of seconds! 23

Surreptitious worms Idea: Hide worms in inconspicuous traffic to avoid detection Leverage P2P systems? High node degree Lots of traffic to hide in Proprietary protocols Homogeneous software Immense size (30,000,000 Kazaa downloads!) 24

Example Outbreak: SQL Slammer (2003) Single, small UDP packet exploit (376 b)‏ First ~1min: classic random scanning Doubles # of infected hosts every ~8.5sec (In comparison: Code Red doubled in 40min)‏ After 1min, starts to saturate access b/w Interferes with itself, so it slows down By this point, was sending 20M pps Peak of 55 million IP 3min 90% of Internet scanned in < 10mins Infected ~100k or more hosts 25

Stuxnet Worm The first worm for control systems Discovered in June 2010 Attack SCADA systems using Siemens WinCC/PCS 7 software Not only spying but also reprogram programmable logic controllers (PLCs) Four zero-day attacks used Infection includes Iran (62K) and China (6M?) Nation-wide support cyberwarefare? 26

Prevention Get rid of the or permute vulnerabilities (e.g., address space randomization) makes it harder to compromise Block traffic (firewalls) only takes one vulnerable computer wandering between in & out or multi-homed, etc. Keep vulnerable hosts off network incomplete vuln. databases & 0-day worms Slow down scan rate Allow hosts limited # of new contacts/sec. Can slow worms down, but they do still spread Quarantine Detect worm, block it 27

Outline Worms Worm Defense Botnet/Viruses 28

Context Worm Detection Scan detection Honeypots Host based behavioral detection Payload-based 29

Worm: Countermeasures Signature-based worm scan filtering Vulnerable to polymorphic worms Scan detection High scanning activity to identify victims Scanning with high failure rate compared to legitimate users (DNS) TCP RST, ICMP Unreachable Two dimensions: time, space Rate limiting, rate halting False positive (Index crawler, NAT, etc.) Disruption to legitimate services Not applicable to UDP based propagation 30

Worm behavior Content Invariance Limited polymorphism e.g. encryption key portions are invariant e.g. decryption routine Content Prevalence invariant portion appear frequently Address Dispersion # of infected distinct hosts grow overtime reflecting different source and dest. addresses 31

Signature Inference Content prevalence: Autograph, EarlyBird, etc. Assumes some content invariance Pretty reasonable for starters. Goal: Identify “attack” substrings Maximize detection rate Minimize false positive rate 32

Content Sifting For each string w, maintain prevalence(w): Number of times it is found in the network traffic sources(w): Number of unique sources corresponding to it destinations(w): Number of unique destinations corresponding to it If thresholds exceeded, then block(w) 33

Issues How to compute prevalence(w), sources(w) and destinations(w) efficiently ? Scalable Low memory and CPU requirements Real time deployment over a Gigabit link 34

Estimating Content Prevalence Table[payload] 1 GB table filled in 10 seconds Table[hash[payload]] 1 GB table filled in 4 minutes Tracking millions of ants to track a few elephants Collisions...false positives 35

stream memory Array of counters Hash(Pink) Multistage Filters 36

packet memory Array of counters Hash(Green) Multistage Filters 37

packet memory Array of counters Hash(Green) Multistage Filters 38

packet memory Multistage Filters 39

packet memory Collisions are OK Multistage Filters 40

packet memory packet1 1 Insert Reached threshold Multistage Filters 41

packet memory packet1 1 Multistage Filters 42

packet memory packet1 1 packet2 1 Multistage Filters 43

Stage 2 packet memory packet1 1 Stage 1 Multistage Filters No false negatives! (guaranteed detection) 44

Gray = all prior packets Conservative Updates 45

Redundant Conservative Updates 46

Conservative Updates 47

Value Sampling The problem: s-b+1 substrings Solution: Sample But: Random sampling is not good enough Trick: Sample only those substrings for which the fingerprint matches a certain pattern 48

sources(w) & destinations(w) Address Dispersion Counting distinct elements vs. repeating elements Simple list or hash table is too expensive Key Idea: Bitmaps Trick : Scaled Bitmaps 49

Bitmap counting – direct bitmap HASH(green)= Set bits in the bitmap using hash of the flow ID of incoming packets 50

Bitmap counting – direct bitmap HASH(blue)= Different flows have different hash values 51

Bitmap counting – direct bitmap HASH(green)= Packets from the same flow always hash to the same bit 52

Bitmap counting – direct bitmap HASH(violet)= Collisions OK, estimates compensate for them 53

Bitmap counting – direct bitmap HASH(orange)=

Bitmap counting – direct bitmap HASH(pink)=

Bitmap counting – direct bitmap HASH(yellow)= As the bitmap fills up, estimates get inaccurate 56

Bitmap counting – direct bitmap Solution: use more bits HASH(green)=

Bitmap counting – direct bitmap Solution: use more bits Problem: memory scales with the number of flows HASH(blue)=

Bitmap counting – virtual bitmap Solution: a) store only a portion of the bitmap b) multiply estimate by scaling factor 59

Bitmap counting – virtual bitmap HASH(pink)=

Bitmap counting – virtual bitmap HASH(yellow)= Problem: estimate inaccurate when few flows active 61

Bitmap counting – multiple bmps Solution: use many bitmaps, each accurate for a different range 62

Bitmap counting – multiple bmps HASH(pink)=

Bitmap counting – multiple bmps HASH(yellow)=

Bitmap counting – multiple bmps Use this bitmap to estimate number of flows 65

Bitmap counting – multiple bmps Use this bitmap to estimate number of flows 66

Bitmap counting – multires. bmp Problem: must update up to three bitmaps per packet Solution: combine bitmaps into one OROR OROR 67

HASH(pink)= Bitmap counting – multires. bmp 68

Bitmap counting – multires. bmp HASH(yellow)=

Multiresolution Bitmaps Still too expensive to scale Scaled bitmap Recycles the hash space with too many bits set Adjusts the scaling factor according 70

Scaled Bitmap Idea: Subsample the range of hash space How it works? multiple bitmaps each mapped to progressively smaller and smaller portions of the hash space. bitmap recycled if necessary. Result Roughly 5 time less memory + actual estimation of address dispersion 71

Putting It Together headerpayload substring fingerprints keysrc cntdest cnt AD entry exist? update counters keycnt else update counter cnt > prevalence threshold? create AD entry Content Prevalence Table Address Dispersion Table counters > dispersion threshold? report key as suspicious worm 72

Putting It Together Sample frequency: 1/64 String length: 40 Use 4 hash functions to update prevalence table Multistage filter reset every 60 seconds 73

Parameter Tuning Prevalence threshold: 3 Very few signatures repeat Address dispersion threshold 30 sources and 30 destinations Reset every few hours Reduces the number of reported signatures down to ~25,000 74

Parameter Tuning Tradeoff between and speed and accuracy Can detect Slammer in 1 second as opposed to 5 seconds With 100x more reported signatures 75

False Negatives in EB False Negatives Very hard to prove... Earlybird detected all worm outbreaks reported on security lists over 8 months EB detected all worms detected by Snort (signature-based IDS)? And some that weren't 76

False Positives in EB Common protocol headers HTTP, SMTP headers p2p protocol headers Non-worm epidemic activity Spam BitTorrent (!)‏ Solution: Small whitelist... 77

Outline Worms Worm Defense Botnet/Viruses 78

... and it's profitable Botnets used for Spam (and more spam)? Credit card theft DDoS extortion Flourishing Exchange market Spam proxying: 3-10 cents/host/week 25k botnets: $40k - $130k/year Also for stolen account compromised machines, credit cards, identities, etc. (be worried)? 79

Botnet A group of zombie computers under the remote control of an attacker via a command and control (C&C) server C&C Server Attacker Zombies Target Sites Command Attack C&C Server 80

Botnet Countermeasure Detecting new botnets by using honeypots, analyzing spam pools, capturing group activities in DNS Sinkholing or nullrouting C&C server connections and cleaning zombies 81

Outline Worms Worm Defense Botnet/Viruses 82

Malicious Code Many types of malicious code Virus, worm, botnet, spyware, spam, etc. Who writes this and why? Challenge (for fun) Fame (for pride) Business (for money) Black markets for attacks (DDoS and spams) and info(credit cards, vulnerabilities) Ideology (for activism) Hactivism, cyberterrorism, cyberwarefare 83

What is a Computer Virus? Program that spreads itself by infecting (modifying) an executable file and making copies of itself 84

Components 1.Propagation mechanism Sharing infected file with other computers USB drive, attachment, and shared folders Executing infected file  Infect other computers and spread infection 2.Trigger Time/condition when payload is activated 3.Payload Damage existing files Extort sensitive information Consume computer’s resources 85

Infected File 1Insert document in fax machine. (Program entry-point) 2Dial the phone number. 3Hit the SEND button on the fax. 4Wait for completion. If a problem occurs, go back to step 1. 5End task. 86 1Skip to step 6. 2Dial the phone number. 3Hit the SEND button on the fax. 4Wait for completion. If a problem occurs, go back to step 1. 5End task. 6VIRUS instructions 7Insert document in fax machine and go to step 2. Before After Nachenberg, Computer Virus-Antivirus Coevolution, CACM 1997

Propagation Virus replicates when infected file is executed Task is not entirely automated User makes the first step Virus copies malicious code to other files Jump instruction to malicious code is added Why are Windows-based viruses most prolific? Largest population Why write a virus if only a few people are infected? 87

Simple Virus program V := {goto main; ; subroutine infect-executable := {loop: file := get-random-executable-file; if (first-line-of-file = ) then goto loop else prepend V to file; } subroutine trigger-pulled := subroutine do-damage := main: {infect-executable; if trigger-pulled then do-damage; goto next;} next: … } 88 Stallings, Chapter 7.2

Detection Infected file has a larger size than initial version of file Scanners record files lengths and searches for changes Virus can easily bypass detection through compression (packing) 89 PP V P PP V

Detection (cont’d) Virus signature Same structure and bit pattern for uniquely identifying a virus 90 New malicious code signatures, Symantec 2010

More Advanced Viruses Encrypted viruses Prevent “signature” to detect virus via encryption  Polymorphic viruses  Change virus code to prevent signature 91

Detection of Encrypted Virus A different encryption key is generated for each new infection Therefore, encrypted virus body appears different in each infected file Antivirus can no longer parse virus body for the virus signatures Still pattern matching possible Still identical copy of decryption routine 92

Detection of Polymorphic Viruses Advanced encrypted virus Formerly, constant decryption routine Now, mutable decryption routine unique crypto code generated for each copy No more signatures in code 93

Generic Decryption (GD) Technology Signatures still present in decrypted code Let the virus do the work for you Emulate code in controlled environment Periodically, scan virtual memory for virus signatures 94

GD Limitations How long to emulate? Emulator must include software versions and processor hardware and CPU may have different machine language level 95