UCSD Potemkin Honeyfarm
Jay Chen, Ranjit Jhala, Chris Kanich, Erin Kenneally, Justin Ma, David Moore, Stefan Savage, Colleen Shannon, Alex Snoeren, Amin Vahdat, Erik VandeKieft, George Varghese, Geoff Voelker, Michael Vrable
Network Telescopes
- Infected host scans for other vulnerable hosts by randomly generating IP addresses
- Network Telescope: monitor large range of unused IP addresses; it will receive scans from infected hosts
- Very scalable: UCSD monitors 17M+ addresses (/8 + /16s)
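As a rough illustration of why a telescope this size sees random scanners quickly, the arithmetic below considers only the /8 and assumes (this is an assumption, not a claim from the slides) that a worm picks each target uniformly at random from the full 32-bit IPv4 space.

```latex
% Probability that a single uniformly random probe lands in a /8 telescope
\[
p = \frac{2^{24}}{2^{32}} = 2^{-8} = \frac{1}{256} \approx 0.39\%
\]
% Expected number of probes until the first hit (geometric distribution)
\[
\mathbb{E}[\text{probes until first hit}] = \frac{1}{p} = 256
\]
```

Each additional /16 contributes another 2^16 = 65,536 addresses, which is how a /8 (16,777,216 addresses) plus some /16s reaches the 17M+ figure. Worms with biased target selection would not follow this estimate.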
Telescopes + Active Responders
- Problem: Telescopes are passive, can’t respond to TCP handshake
  - Is a SYN from a host infected by CodeRed or Welchia? Dunno.
  - What does the worm payload look like? Dunno.
- Solution: proxy responder
  - Stateless: TCP SYN/ACK (Internet Motion Sensor), per-protocol responders (iSink)
  - Stateful: Honeyd
  - Can differentiate and fingerprint payload
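The stateless SYN/ACK idea can be sketched as follows. This is a minimal illustration, not the Internet Motion Sensor or iSink implementation: the `Segment` type, the `stateless_synack` function, and the hash-derived initial sequence number are all assumptions made for the example. The point is only that a reply can be computed from the inbound SYN alone, which is enough to coax the sender into transmitting its first payload bytes.

```python
import hashlib
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Segment:
    """Hypothetical, simplified view of a TCP segment (not a real packet parser)."""
    src: str
    dst: str
    sport: int
    dport: int
    seq: int
    ack: int
    flags: FrozenSet[str]


def stateless_synack(syn: Segment, secret: bytes = b"demo-secret") -> Segment:
    """Build the SYN/ACK for an inbound SYN without storing any per-flow state."""
    if syn.flags != frozenset({"SYN"}):
        raise ValueError("expected a bare SYN")
    # Derive our initial sequence number from the 4-tuple, so nothing is remembered.
    four_tuple = f"{syn.src}:{syn.sport}>{syn.dst}:{syn.dport}".encode()
    isn = int.from_bytes(hashlib.sha256(secret + four_tuple).digest()[:4], "big")
    return Segment(
        src=syn.dst,
        dst=syn.src,
        sport=syn.dport,
        dport=syn.sport,
        seq=isn,
        ack=(syn.seq + 1) % 2**32,  # acknowledge the SYN, with 32-bit wraparound
        flags=frozenset({"SYN", "ACK"}),
    )
```

Because the reply depends only on the inbound segment and a fixed secret, the responder needs no per-connection memory, which is what lets this approach keep telescope-like scalability.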
HoneyNets
- Problem: don’t know what worm/virus would do? No code ever executes after all.
- Solution: redirect scans to real “infectible” hosts (honeypots)
  - Individual hosts or VM-based: Collapsar, HoneyStat, Symantec
  - Can reduce false positives/negatives with host-analysis (e.g., TaintCheck, Vigilante, Minos) and behavioral/procedural signatures
- Challenges
  - Scalability
  - Liability (honeywall)
  - Isolation (2000 IP addrs -> 40 physical machines)
  - Detection (VMware detection code in the wild)
The Scalability/Fidelity tradeoff
[Figure: approaches placed on a spectrum between "Most Scalable" and "Highest Fidelity". Labels: Network Telescopes (passive); Telescopes + Responders (iSink, Internet Motion Sensor); VM-based Honeynet; Live Honeypot; "Nada".]
Potemkin: A large scale high-fidelity honeyfarm
- Goal: emulate significant fraction of Internet hosts (10M+)
- Multiplex large address space on smaller # of servers
- Temporal & spatial multiplexing
[Figure: architecture diagram. Labels: Global Internet; 64x /16 advertised; GRE Tunnels; Gateway; MGMT; VM; Physical Honeyfarm Servers.]
Reference: Scalability, Fidelity, and Containment in the Potemkin Virtual Honeyfarm, Vrable, Ma, Chen, Moore, VandeKieft, Snoeren, Voelker, and Savage, SOSP 2005
UCSD Honeyfarm Approach
- Make VMs very, very cheap
  - Create one (or more) VM per packet on demand
- Deploy many types of VM systems
  - Plethora of OSes, versions, configurations
- Monitor VM behavior
  - Decide benign or malicious
  - Benign: Quickly terminate, recycle resources
  - Malicious: Track propagation, save for offline analysis, etc.
- Assumes common case that most traffic is benign
- Key issues for remainder of talk
  1) Scaling
  2) Containment
Scaling
- Naïve approach: one machine per IP address
  - 1M addresses = 1M hosts = $2B+ investment
- However, most of these resources would be wasted
- Claim: should be possible to make do with 5-6 orders of magnitude less
Resulting philosophy
- Only commit the minimal resources needed, and only when you need them
- Address space multiplexing
  - Late-bind the assignment of IP addresses to physical machines (on-demand assumption of identity)
- Physical resource multiplexing
  - Multiple VMs per physical machine
  - Exploit memory coherence
  - Delta virtualization (allows ~1000 VMs per physical machine)
  - Flash cloning (low-latency creation of on-demand VMs)
Address space multiplexing
- For a given unused address range and service time distribution, most addresses are idle
[Figure: plot for a /16 network with 500ms service time.]
- But most of these are horizontal port scans!
The value of scan filtering
- Heuristic: no more than one (srcip, dstport, protocol) tuple per 60 seconds
[Figure: plot with "Max" and "Mean" series.]
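The heuristic above can be sketched in a few lines. This is an illustrative sketch rather than the Potemkin gateway code: the `ScanFilter` class and its `admit` method are hypothetical names, and measuring the 60-second window from the last admitted packet is an assumption, since the slide does not say how the window is anchored.

```python
from typing import Dict, Tuple

Key = Tuple[str, int, str]  # (srcip, dstport, protocol)


class ScanFilter:
    """Admit at most one packet per (srcip, dstport, protocol) per time window."""

    def __init__(self, window_s: float = 60.0) -> None:
        self.window_s = window_s
        self._last_admitted: Dict[Key, float] = {}

    def admit(self, srcip: str, dstport: int, protocol: str, now: float) -> bool:
        """Return True if the packet should be passed on to the honeyfarm."""
        key = (srcip, dstport, protocol)
        last = self._last_admitted.get(key)
        if last is not None and now - last < self.window_s:
            return False  # same source already probed this port recently
        self._last_admitted[key] = now
        return True
```

The destination address is deliberately not part of the key, so a horizontal scan (one source sweeping many addresses on the same port) collapses to a single admitted packet per window.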
Implementation
- Gateway (Click-based) terminates inbound GRE tunnels
- Maintains external IP address -> type mapping
  - i.e., should be a Windows XP box w/ IIS version 5, etc.
- Mapping made concrete when packet arrives
  - Flow entry created and packet dispatched to type-compatible physical host
  - VMM on host creates new VM with target IP address
- VM and flow mapping GC’d after system determines that no state change occurred
- Bottom line: 3 orders of magnitude savings
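The late-binding steps above can be sketched as a toy model. Everything here is hypothetical scaffolding for illustration (the `Host` and `Gateway` classes, the VM naming scheme, and the explicit `collect` call standing in for garbage collection); the real gateway is Click-based and is not shown in these slides.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, List


@dataclass
class Host:
    """A physical server specialized to one software type (hypothetical model)."""
    name: str
    vm_type: str
    vms: Dict[str, str] = field(default_factory=dict)  # target IP -> VM id
    _ids: count = field(default_factory=count, repr=False)

    def create_vm(self, ip: str) -> str:
        self.vms[ip] = f"{self.name}/vm{next(self._ids)}"
        return self.vms[ip]


class Gateway:
    def __init__(self, type_of_ip: Dict[str, str], hosts: List[Host]) -> None:
        self.type_of_ip = type_of_ip      # external IP -> desired system type
        self.hosts = hosts
        self.flows: Dict[str, Host] = {}  # IP -> host currently impersonating it

    def dispatch(self, dst_ip: str) -> str:
        """Bind dst_ip to a VM only when a packet for it actually arrives."""
        host = self.flows.get(dst_ip)
        if host is None:
            wanted = self.type_of_ip[dst_ip]
            # Raises StopIteration if no host of the wanted type exists.
            host = next(h for h in self.hosts if h.vm_type == wanted)
            host.create_vm(dst_ip)
            self.flows[dst_ip] = host
        return host.vms[dst_ip]

    def collect(self, dst_ip: str) -> None:
        """Drop the VM and flow entry, e.g. once no state change was observed."""
        host = self.flows.pop(dst_ip, None)
        if host is not None:
            host.vms.pop(dst_ip, None)
```

Addresses that never receive traffic never consume a VM, and an address whose VM was collected simply gets a fresh one on the next packet.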
Physical resource multiplexing
- Can create multiple VMs per host, but expensive
  - Memory: address spaces for each VM (100s of MB)
  - In principle, limit for VMware = 64 VMs; practical limit less
  - Overhead: initializing new VM wasteful
- Claim: can support 100s-1000 VMs per host by specializing hosts and VMM
  - Specialize each host to software type
  - Maintain reference image of active system of that type
  - Flash cloning: instantiate new VMs via copying reference image
  - Delta virtualization: share state COW for new VMs (state proportional to difference from reference image)
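To make the "state proportional to difference from reference image" point concrete, here is a toy copy-on-write model. It is only a sketch: `ReferenceImage` and `CloneVM` are hypothetical names, pages are plain byte strings, and nothing here reflects how the VMM actually manages page tables.

```python
from typing import Dict, Sequence, Tuple


class ReferenceImage:
    """Read-only memory image of a booted system, shared by all clones."""

    def __init__(self, pages: Sequence[bytes]) -> None:
        self.pages: Tuple[bytes, ...] = tuple(pages)


class CloneVM:
    """A clone that stores only the pages it has modified (its delta)."""

    def __init__(self, ref: ReferenceImage) -> None:
        self.ref = ref
        self.private: Dict[int, bytes] = {}  # page number -> private copy

    def read(self, page_no: int) -> bytes:
        # Fall through to the shared image unless this clone wrote the page.
        return self.private.get(page_no, self.ref.pages[page_no])

    def write(self, page_no: int, data: bytes) -> None:
        if not 0 <= page_no < len(self.ref.pages):
            raise IndexError("page out of range")
        self.private[page_no] = data  # the first write creates the private copy

    def unique_pages(self) -> int:
        return len(self.private)
```

Creating a clone copies nothing up front, and a clone that handles a benign probe and touches a handful of pages costs only those pages, which is what makes hundreds of VMs per host plausible in this model.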
How much unique memory does a VM need?
Potemkin VMM implementation
- Xen-based, using new shadow translate mode
  - New COW architecture being incorporated back into Xen (VT compatible)
- Clone manager instantiates frozen VM image and keeps it resident in physical memory
  - Flash clone memory instantiated via eager copy of PTE pages and lazy faulting of data pages (moving to lazy + profile-driven eager pre-copy)
  - RAM disk or Parallax FS for COW disks
- Overhead: currently takes ~300ms to create new VM
  - Highly unoptimized (e.g., includes Python invocation)
  - Goal: Pre-allocated VMs can be invoked in ~5ms
Containment
- Key issue: 3rd-party liability and contributory damages
  - Honeyfarm = worm accelerator
  - Worse, I knowingly allowed my hosts to be infected (premeditated negligence)
- Export policy: tradeoffs between risk and fidelity
  - Block all outbound packets: no TCP connections
  - Only allow outbound packets to hosts that previously sent a packet: no outbound DNS, no botnet updates
  - Allow outbound, but “scrub”: is this a best practice?
- In the end, need fairly flexible policy capabilities
  - Could do whole talk on interaction between technical & legal drivers
- But it gets more complex…
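The three export policies listed above can be written down as a small decision function. This is a sketch under stated assumptions: the `Policy`, `Action`, and `ExportGate` names are invented for the example, and "scrubbing" is left as an opaque action because the slides do not define what it involves.

```python
from enum import Enum
from typing import Set


class Policy(Enum):
    BLOCK_ALL = "block all outbound packets"
    REPLY_ONLY = "outbound only to hosts that previously sent a packet"
    SCRUB = "allow outbound, but scrub"


class Action(Enum):
    DROP = "drop"
    FORWARD = "forward"
    SCRUB_AND_FORWARD = "scrub and forward"


class ExportGate:
    """Decides what happens to a packet leaving the honeyfarm."""

    def __init__(self, policy: Policy) -> None:
        self.policy = policy
        self.inbound_peers: Set[str] = set()  # external hosts that contacted us

    def note_inbound(self, src_ip: str) -> None:
        self.inbound_peers.add(src_ip)

    def outbound(self, dst_ip: str) -> Action:
        if self.policy is Policy.BLOCK_ALL:
            return Action.DROP
        if self.policy is Policy.REPLY_ONLY:
            # Replies complete handshakes; anything unsolicited (DNS, botnet
            # updates) is dropped, which is the fidelity cost noted above.
            return Action.FORWARD if dst_ip in self.inbound_peers else Action.DROP
        return Action.SCRUB_AND_FORWARD
```

Under `REPLY_ONLY`, a VM can answer the host that probed it but cannot reach a third party, which shows the risk versus fidelity tradeoff in its simplest form.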
Internal reflection
- If outbound packet not permitted to real Internet, it can be sent back through gateway
  - New VM generated to assume target address (honeyfarm emulates external Internet)
  - Allows causal detection (A->B->C->D) and can dramatically reduce false positives
- However, creates new problem: is there only one version of IP address A?
  - Yes: single “universe” inside honeyfarm
    - No isolation between infections
    - Also allows cross-contamination (liability rears its head again)
  - No: how are packets routed internally?
Causal address space aliasing
- A new packet i destined for address t creates a new universe U_it
- Each VM created by actions rooted at t is said to exist in the same universe, and a single export policy is shared
- In essence, the 32-bit IP address space is augmented with a universe-id that provides aliasing
- Universes are closed; no leaking
- What about symbiotic infections? (e.g., Nimda)
  - When a universe is created it can be made open to multiple outside influences
  - Common use: a fraction of all traffic is directed to a shared universe with draconian export rules
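One way to picture the universe-id is as an extra key alongside the IP address. The sketch below is an assumption-laden toy: the `Honeyfarm` class, its method names, and the integer universe ids are hypothetical, and open or shared universes are not modeled.

```python
from itertools import count
from typing import Dict, Tuple


class Honeyfarm:
    """VMs are keyed by (universe_id, ip), so one IP can exist once per universe."""

    def __init__(self) -> None:
        self._next_universe = count(1)
        self.vms: Dict[Tuple[int, str], str] = {}

    def external_packet(self, dst_ip: str) -> int:
        """A new inbound packet creates a fresh universe rooted at dst_ip."""
        uid = next(self._next_universe)
        self.vms[(uid, dst_ip)] = f"vm[{uid}:{dst_ip}]"
        return uid

    def reflect(self, uid: int, dst_ip: str) -> str:
        """An outbound packet reflected inward stays inside its own universe."""
        key = (uid, dst_ip)
        if key not in self.vms:
            self.vms[key] = f"vm[{uid}:{dst_ip}]"
        return self.vms[key]
```

Two infections that both try to reach the same address get two separate VMs, so neither can contaminate the other, while packets within one causal chain keep reaching the same VM.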
Overall challenges for honeyfarms
- Depends on worms scanning it
  - What if they don’t scan that range (smart bias)?
  - What if they propagate via email, IM? (doable, but privacy issues)
- Camouflage
  - Honeypot detection software exists… perfect virtualization tough
- It doesn’t necessarily reflect what’s happening on your network (can’t count on it for local protection)
- Hence, there is a need for both honeyfarm and in-situ approaches
Summary
- Potemkin: High-fidelity, scalable honeyfarm
  - Fidelity: New virtual host per packet
  - Scalability: 10M IP addresses -> 100 physical machines
- Approach
  - Address multiplexing: late-bind IPs to VMs (10^3:1)
  - Physical multiplexing: VM coherence, state sharing
    - Flash cloning: Clone from reference image (milliseconds)
    - Delta virtualization: Copy-on-write memory, disk (100+ VMs per host)
- Containment
  - Risk vs. fidelity: Rich space of export policies in gateway
- Challenges
  - Attracting attacks, camouflage, denial-of-service
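As a consistency check on the summary figures, the overall ratio of addresses to machines factors into the two multiplexing gains. The decomposition below is a back-of-the-envelope reading of the numbers on this slide, not a result quoted from the talk.

```latex
\[
\frac{10^{7}\ \text{IP addresses}}{10^{2}\ \text{physical machines}}
  = 10^{5}
  = \underbrace{10^{3}}_{\text{address multiplexing}}
    \times
    \underbrace{10^{2}}_{\text{VMs per host}}
\]
```

With the ~1000 VMs per host claimed earlier for delta virtualization, the same product would reach 10^6, which matches the "5-6 orders of magnitude" claim from the Scaling slide.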