Slide 1: Surviving Internet Catastrophes
Flavio Junqueira, Alejandro Hevia, Ranjita Bhagwan, Keith Marzullo, and Geoffrey M. Voelker
Hot Topics in Operating Systems (HotOS '03); USENIX Annual Technical Conference (USENIX '05)
University of California, San Diego, 2004

Slide 2: A typical day of an Internet worm…
(Cartoon: Host A runs the "Widows" OS; Host B runs the "Sux" OS. The worm exploits a vulnerability in the Widows OS, compromising Host A and its data, but not the Sux OS, so Host B survives.)

Slide 3: Outline
- Introduction
- System model
- Host diversity
- Searching for replica sets: heuristics and simulations
- The Phoenix Recovery System: implementation, security issues, prototype evaluation
- Conclusions

Slide 4: Setting the stage
- Past worm outbreaks
  - Code Red (2001): compromised over 359,000 hosts
  - Nimda (2001): multiple forms of infection
  - Slammer (2003): fastest worm in history (90% of vulnerable hosts in 10 minutes)
  - Witty (2004): first to carry a malicious payload
- Coping with worms
  - Containment is hard [Moore03]; not possible if human intervention is required
  - Automatic detection [Singh04]; problem: network evasion
  - Recovering from catastrophes [HotOS03]; goal: minimize data loss

Slide 5: Defining the problem
- Why are Internet pathogens successful? Shared vulnerabilities
  - Vulnerability: a design or implementation flaw in a software system
- How do we make data survivable? Replicate it
  - Informed replication: choose replica sets based on shared vulnerabilities
- How do we identify sets of shared vulnerabilities? Through common software systems
  - Leverage the diversity of the Internet

Slide 6: Challenges
- Understand the limitations and the appropriate settings
- Quantify diversity
- Search for replica sets in a way that is scalable, balances load, and keeps replica sets small

Slide 7: System model
- A set of hosts (H); a host fails by losing its state
- A set of attributes (A); an attribute is a software system (an operating system or an application)
- A configuration: one operating system plus a set of applications
- A set of configurations, one per host
(Figure: hosts, their attributes/software systems, and the resulting configurations.)

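To make the model concrete, here is a minimal Python sketch of hosts, attributes, and configurations; the class and field names are illustrative, not taken from the paper.

```python
# Minimal sketch of the system model: an attribute is a software system,
# a configuration is one OS plus a set of applications, and each host has
# exactly one configuration. Names are illustrative only.
from dataclasses import dataclass

@dataclass(frozen=True)
class Configuration:
    os: str                         # the single operating-system attribute
    apps: frozenset = frozenset()   # non-OS attributes (services/applications)

    def attributes(self):
        """All attributes of this configuration (OS plus applications)."""
        return {self.os} | set(self.apps)

@dataclass
class Host:
    name: str
    config: Configuration

# Example: two hosts with different OSes that share one service.
h1 = Host("A", Configuration("Windows", frozenset({"httpd", "smtp"})))
h2 = Host("B", Configuration("Linux", frozenset({"httpd", "sshd"})))
```
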
Slide 8: Cores
- A set S ⊆ H is a core iff, for every attribute a ∈ A', at least one host in S does not have a in its configuration
- Ideally A' = A, so that an exploit of any single attribute leaves at least one core member unaffected
(Figure: example cores over the hosts and configurations of the previous slide.)

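Under this reading of the definition (for every attribute in A', some core member lacks it), a coverage check might look like the following sketch. It reuses the Host/Configuration classes from the previous block and assumes A' is the client's own attribute set and that the client belongs to its core; the helper names are mine.

```python
# Sketch of coverage and the core test, assuming A' = the client's attributes
# and that the client itself belongs to its core (it keeps the original copy).
def is_covered(attribute, hosts):
    """An attribute is covered if at least one host does not have it, so an
    exploit of that attribute leaves at least one copy of the data alive."""
    return any(attribute not in h.config.attributes() for h in hosts)

def coverage(client, core):
    """Fraction of the client's attributes covered by the client plus its core."""
    attrs = client.config.attributes()
    members = [client] + list(core)
    return sum(is_covered(a, members) for a in attrs) / len(attrs)

def is_core(client, core):
    """True iff every attribute of the client is covered."""
    return coverage(client, core) == 1.0
```
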
Slide 9: Host diversity
- Diversity: the distribution of configurations; it is skewed, not uniform
- Study of the UCSD network using the nmap tool
  - Port scans: detect open ports
  - OS fingerprinting: infer the OS from error messages
- Total number of scanned devices: 11,963, of which 2,963 are general-purpose hosts (port data + OS)
- Conservative assumptions
  - Same open port = runs the same service
  - OS versions are ignored

Slide 10: Top 10 operating systems and services

  OS           Share    Service            Share
  Windows      54.1%    netbios-ssn        55.3%
  Solaris      10.1%    epmap              50.4%
  Mac OS X     10.0%    microsoft-ds       39.0%
  Linux        10.0%    sshd               30.7%
  Mac OS        6.9%    sunrpc             25.3%
  FreeBSD       2.2%    active directory   24.8%
  IRIX          2.0%    smtp               19.4%
  HP-UX         1.1%    httpd              18.0%
  BSD/OS        0.9%    ftpd               17.8%
  Tru64 Unix    0.7%    printer            15.6%

Slide 11: Configuration distribution
- The distribution of configurations is skewed
- 50% of the hosts are accounted for by: 20% of configurations (All), 15% (Multiple), 8% (Top 100)
(Figure: cumulative distribution of hosts over configurations.)

Slide 12: Visualizing diversity
- Qualitative view of the configuration data
- More diversity across operating systems
- Still a fair amount of diversity within the same OS

Slide 13: Searching for cores
- The practical problem: determining replica sets
- Our approach: find cores
  - Computing a core of optimal size is NP-complete, so we use heuristics
- Each host acts as both client and server
  - Client: requests cores
  - Server: participates in cores
- A core consists of the host that requests it (which holds the original copy) plus the replicas

Slide 14: Basic idea
(Figure: four example configurations drawn from the attribute set, and the possible cores that cover them.)

Slide 15: Representing advertised configurations
- Container abstraction (a code sketch follows this slide)
  - Containers (B): one for each operating system in A
  - Each container b ∈ B has a set SB(b) of sub-containers, one for each non-OS attribute in A
- A host h advertises its configuration by associating itself with every sub-container s ∈ SB(b) such that
  - b is the container for the OS of h, and
  - s is the sub-container in SB(b) for some attribute of h

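A minimal sketch of the container abstraction, reusing the Host examples from the system-model block; the class is illustrative, not the paper's implementation.

```python
# Containers: one per operating system; inside each, one sub-container per
# non-OS attribute, listing the hosts that advertise that attribute.
from collections import defaultdict

class Containers:
    def __init__(self):
        # containers[os][app] -> hosts advertising OS `os` and application `app`
        self.containers = defaultdict(lambda: defaultdict(list))

    def advertise(self, host):
        """Place the host in every sub-container s of its OS's container b
        such that s corresponds to one of the host's applications."""
        b = self.containers[host.config.os]
        for app in host.config.apps:
            b[app].append(host)

# Usage: index the whole population once; the heuristics then draw hosts from it.
index = Containers()
for h in (h1, h2):
    index.advertise(h)
```
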
Slide 16: Container abstraction
(Figure: example containers and sub-containers for a small set of configurations.)

Slide 17: Heuristics
- Random: ignore configurations; choose a number n of hosts from H at random
- Uniform (a code sketch follows this slide)
  I. Different OS
     1. Choose a container b at random (other than the client's)
     2. Choose a sub-container sb at random from b
     3. Choose a host at random from sb
  II. Same OS (the container b where the client h is placed)
     1. Choose a sub-container sb at random from b
     2. Choose a host at random from sb
- Weighted: containers are weighted by the number of hosts they contain
- Doubly-weighted: sub-containers are also weighted

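A minimal sketch of the Uniform heuristic over the container index above; the function names and the handling of empty sub-containers are my own simplifications.

```python
import random

def pick_from(container, exclude=None, rng=random):
    """Choose a random sub-container, then a random host from it (or None)."""
    apps = list(container)
    if not apps:
        return None
    hosts = [h for h in container[rng.choice(apps)] if h is not exclude]
    return rng.choice(hosts) if hosts else None

def uniform_core(client, index, rng=random):
    """Uniform heuristic: one host from a random container with a different OS,
    plus one host from the client's own container."""
    core = []
    # I. Different OS: random other container, random sub-container, random host.
    other_oses = [os for os in index.containers if os != client.config.os]
    if other_oses:
        h = pick_from(index.containers[rng.choice(other_oses)], rng=rng)
        if h is not None:
            core.append(h)
    # II. Same OS: random sub-container of the client's container, random host.
    h = pick_from(index.containers[client.config.os], exclude=client, rng=rng)
    if h is not None:
        core.append(h)
    return core
```
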
Slide 18: Simulations
- Population: the 2,963 general-purpose hosts; in one run, each host computes a core
- Questions
  - How much replication is needed?
  - How many other hosts does a particular host have to service?
  - How well do the chosen cores protect hosts?
- Metrics (a code sketch follows this slide)
  - Average core size: core size averaged across all hosts
  - Maximum load: the maximum number of other hosts that any host services
  - Average coverage: coverage is the percentage of attributes covered in a core

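The three metrics could be computed over one run as in the sketch below, which reuses uniform_core and coverage from the earlier blocks; again, the names are illustrative only.

```python
from collections import Counter

def run_metrics(hosts, index):
    """One run: every host computes a core, then report the three metrics."""
    cores = {h.name: uniform_core(h, index) for h in hosts}

    # Average core size: the client itself counts as a member of its own core.
    avg_core_size = sum(len(c) + 1 for c in cores.values()) / len(hosts)

    # Maximum load: the largest number of other hosts any single host services.
    load = Counter(member.name for c in cores.values() for member in c)
    max_load = max(load.values(), default=0)

    # Average coverage across all hosts.
    avg_coverage = sum(coverage(h, cores[h.name]) for h in hosts) / len(hosts)

    return avg_core_size, max_load, avg_coverage
```
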
Slide 19: A sample run
- Random: better load balance, but worse coverage and worse core size
- Load is too high for the other heuristics
- Proposed modification: limit the load of each host (intuition: force load balance; sketched below)
  - Each host services at most L other hosts
  - L is the load limit, or simply the "limit"

  Heuristic   Core size   Coverage   Load
  Random      5           0.977      12
  Uniform     2.56        0.9997     284
  Weighted    2.64        0.9995     84
  DWeighted   2.58        0.9997     91

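A minimal sketch of the load-limit modification: hosts that already service L other hosts are skipped when cores are built. It assumes `load` is a collections.Counter shared across the run, and builds on pick_from from the heuristic sketch above.

```python
import random

def pick_from_with_limit(container, load, L, exclude=None, rng=random):
    """Like pick_from, but never return a host that already services L others."""
    apps = list(container)
    rng.shuffle(apps)
    for app in apps:
        eligible = [h for h in container[app]
                    if h is not exclude and load[h.name] < L]
        if eligible:
            chosen = rng.choice(eligible)
            load[chosen.name] += 1   # the chosen host now services one more core
            return chosen
    return None
```
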
Slide 20: Core size
- Random: core size increases linearly with the load; this is intrinsic to the heuristic
- Other heuristics: core size stays below 3, and for many hosts a single replica suffices

Slide 21: Coverage
- Lower bound on the limit: 2, dependent on the diversity
- Uniform: a limit of at least 3 is needed to achieve three-nines (0.999) coverage
- Weighted: achieves three-nines coverage for limit values of at least 2
- Random: needs a core size of at least 9 to achieve the same coverage

Slide 22: Uncovered hosts
- The share of hosts that are not fully covered is small
  - Uniform: slightly over 1% at limit 3; around 0.5% for limits above 4
  - Weighted: around 0.5%
  - Random: needs core sizes greater than 8 to achieve similar results

Slide 23: Load variance
- Downside of Uniform: the worst load variance
- Variance is similar across heuristics for small values of the limit
- The load limit forces a better distribution

Slide 24: Summary of simulation results
- How many replicas are needed? Around 1.3 on average
- How many other hosts does a particular host have to service?
  - Uniform: 3 for good coverage
  - Weighted: 2 for good coverage
- How well do the chosen cores protect hosts?
  - Uniform: coverage greater than 0.999 for L ≥ 3
  - Weighted: coverage greater than 0.999 for L ≥ 2
- The Uniform heuristic is simpler; the weighted heuristics balance load better

Slide 25: Translating to real pathogens
- Uniform with limit > 3 tolerates, with high probability, attacks on a single attribute
- Previous worms exploited one or more vulnerabilities on a single platform
- Our approach tolerates
  - attacks on vulnerabilities in the same software system, possibly cross-platform
  - attacks on vulnerabilities in different software systems on the same platform
  - attacks on vulnerabilities in different software systems, cross-platform
- The approach is extensible

Slide 26: Exploits on k attributes
- Illustrated with k = 2, using a variant of Uniform (sketched below):
  1. Client c chooses a host h with a different OS
  2. Find a core for c using Uniform
  3. Find a core for h using Uniform
  4. Combine the two cores to form a 2-resilient core

  L    2-cov   1-cov   Core size
  5    0.76    0.86    4.18
  6    0.86    0.92    4.58
  7    0.95    0.99    5.00
  8    0.97    1.00    5.11
  9    0.98    1.00    5.16
  10   0.98    1.00    5.17

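A minimal sketch of the k = 2 variant listed above, reusing uniform_core; it assumes, as on slide 13, that a host's own core includes that host.

```python
import random

def two_resilient_core(client, hosts, index, rng=random):
    """Pair the client with a host running a different OS, build a Uniform core
    for each, and combine them into a 2-resilient core."""
    # 1. Client c chooses a host h with a different OS.
    others = [h for h in hosts if h.config.os != client.config.os]
    if not others:
        return uniform_core(client, index, rng=rng)
    partner = rng.choice(others)
    # 2-4. Find a core for c and for h, then combine the two cores.
    combined = {partner.name: partner}   # the partner's core includes itself
    for member in uniform_core(client, index, rng=rng) + uniform_core(partner, index, rng=rng):
        combined[member.name] = member
    return list(combined.values())
```
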
Slide 27: The Phoenix Recovery System
- Backs up data on cores
- Requirement: the set of operating systems and applications is not known in advance
- Built with the Macedon framework on top of the Pastry DHT
- Advertising configurations: containers map to zones of the identifier space, sub-containers to sub-zones (a possible mapping is sketched below)
- OS hint lists handle empty zones; the lists do not need to be accurate

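One possible way to realize the container-to-zone mapping over a Pastry-style 160-bit identifier space is sketched below; this scheme (OS hashed into the top bits, application into the remaining bits) is my own illustration, not the encoding Phoenix actually uses.

```python
import hashlib

def _hash_bits(text, bits):
    """Hash a string to a `bits`-bit integer."""
    digest = hashlib.sha1(text.encode()).digest()
    return int.from_bytes(digest, "big") % (1 << bits)

def advertisement_key(os_name, app, os_bits=16, app_bits=144):
    """160-bit key whose top bits select the OS zone and whose remaining bits
    select the application sub-zone, so one OS's advertisements form a
    contiguous region (zone) of the identifier space."""
    return (_hash_bits(os_name, os_bits) << app_bits) | _hash_bits(app, app_bits)

def advertisement_keys(host):
    """All keys under which a host would advertise its configuration."""
    return [advertisement_key(host.config.os, app) for app in host.config.apps]
```
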
Slide 28: Protocol
(Figure: the Phoenix backup and recovery protocol.)

Slide 29: Security in Phoenix
- Built from standard security primitives
- Security goals
  - Data privacy: no host other than the owner of the data can obtain any partial information from the data stored on a server host
  - Data integrity: any tampering with the backup data is detectable by the client host
  - Data availability: if a client stores data on an honest server, it is eventually able to recover its data
- Two modes
  - Basic: software libraries only
  - Enhanced: requires devices such as smartcards
- Phoenix cannot prevent servers from acting maliciously; it relies on proofs of operations

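As an illustration of the basic-mode privacy and integrity goals (not Phoenix's actual cryptographic protocol), a client could encrypt and authenticate its backup before handing it to core members; the sketch below uses the third-party `cryptography` package's Fernet authenticated encryption.

```python
from cryptography.fernet import Fernet, InvalidToken

def prepare_backup(plaintext: bytes, key: bytes) -> bytes:
    """Client side: encrypt-and-authenticate the payload, so servers learn
    nothing about the data (privacy)."""
    return Fernet(key).encrypt(plaintext)

def restore_backup(token: bytes, key: bytes) -> bytes:
    """Client side after recovery: raises InvalidToken if any server tampered
    with the stored data (integrity)."""
    return Fernet(key).decrypt(token)

# Usage: the key never leaves the data owner; core members only ever see `token`.
key = Fernet.generate_key()
token = prepare_backup(b"my file contents", key)
assert restore_backup(token, key) == b"my file contents"
```
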
Slide 30: Prototype evaluation
- Deployed on PlanetLab
- Total number of hosts: 63 (62 PlanetLab hosts plus 1 UCSD host)
- Configurations were set manually: 63 chosen at random from the 2,963 surveyed configurations

Slide 31: Evaluation results
- Simulated attack; parameters:
  - Backup file: 5 MB
  - L = 3
  - Interval between announcements: 120 s
  - Target: Windows hosts (60% of the population)
- The attack caused the targeted hosts to crash almost simultaneously
- All hosts recovered: 35 in about 100 s on average; 3 took several minutes because of transient network failures

        Core size      Coverage     Load var.
  L     Imp.   Sim.    Imp.  Sim.   Imp.   Sim.
  3     2.12   2.22    1     1      1.65   1.94
  5     2.10   2.23    1     1      2.88   2.72
  7     2.10   2.22    1     1      4.44   3.33

  (Imp. = implementation, Sim. = simulation)

Slide 32: Conclusions
- Informed replication: replica sets based on attributes; for Internet catastrophes, the attributes are software systems
- Survivable data at a low replication cost: core size is less than 3 on average, and hosts service at most 3 other hosts
- The diversity study shows the approach is realistic
- Side effects of the load-limit scheme: it upper-bounds the amount of work any host has to perform, and it constrains the damage from individual malicious behavior

Slide 33: Future work
- Real deployment
  - Tune the current prototype
  - Security features to cope with real threats
- More data sets to determine diversity
- A mechanism to monitor resource usage
- Combining informed replication with other approaches to cooperative backup and with other types of attributes (e.g., resource utilization)