Download presentation
Presentation is loading. Please wait.
Published byPercival McCoy Modified over 9 years ago
1
Resource-Freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense) (Venkat)anathan Varadarajan, Thawan Kooburat, Benjamin Farley, Thomas Ristenpart, and Michael Swift 1 D EPARTMENT OF C OMPUTER S CIENCES
2
Public Clouds (EC2, Azure, Rackspace, …) VM Multi-tenancy Different customers’ virtual machines (VMs) share same server Why multi-tenancy? Improved resource utilization VM 2
3
Implications of Multi-tenancy VMs share many resources – CPU, cache, memory, disk, network, etc. Virtual Machine Managers (VMM) – Goal: Provide Isolation Deployed VMMs don’t perfectly isolate VMs – Side-channels [Ristenpart et al. ’09, Zhang et al. ’12] 3 Today: Performance degraded by other customers VM VMM
4
Contention in Xen 4 Local Xen Testbed MachineIntel Xeon E5430, 2.66 Ghz CPU2 packages each with 2 cores Cache Size6MB per package VM Non-work-conserving CPU scheduling Work-conserving scheduling 3x-6x Performance loss Higher cost
5
This work: Greedy customer can recover performance by interfering with other tenants Resource-Freeing Attack What can a tenant do? 5 Pack up VM and move (See our SOCC 2012 paper) … but, not all workloads cheap to move VM Ask provider for better isolation … requires overhaul of the cloud
6
Resource-freeing attacks (RFAs) What is an RFA? RFA case studies 1.Two highly loaded web server VMs 2.Last Level Cache (LLC) bound VM and highly loaded webserver VM Demonstration on Amazon EC2 6
7
The Setting Victim: – One or more VMs – Public interface (eg, http) Beneficiary: – VM whose performance we want to improve Helper: – Mounts the attack Beneficiary and victim fighting over a target resource Helper 7 VM Victim Beneficiary
8
Example: Network Contention 8 Net Clients What can you do? Victim Beneficiary Local Xen Test bed
9
Ways to Reduce Contention? Break into victim VM and disable it 9 Net Clients Local Xen Test bed But: Requires knowledge of vulnerability Drastic Easy to detect Helper Victim Beneficiary The good: frees up resources used by victim
10
Ways to Reduce Contention? Do a simple DoS attack? This may NOT free up target resources 10 Net Clients Local Xen Test bed Backfires: May increase the contention Helper SYN flood Victim Beneficiary
11
Recipe for a Successful RFA Shift resource away from the target resource towards the bottleneck resource 11 Shift resource usage via public interface Proportion of Network usage CPU intensive dynamic pages Static pages Proportion of CPU usage Push towards CPU bottleneck Reduce target resource usage Limits
12
An RFA in Our Example 12 Net Helper CGI Request CPU Utilization Clients Result in our testbed: Increases beneficiary’s share of bandwidth No RFA: 1800 page requests/sec W/ RFA: 3026 page requests/sec 50% 85% share of bandwidth
13
Shared CPU Cache: – Ubiquitous: Almost all workloads need cache – Hardware controlled: Not easily isolated via software – Performance Sensitive: High performance cost! 13 Resource-freeing attacks 1) Send targeted requests to victim 2) Shift resources use from target to a bottleneck Can we mount RFAs when target resource is CPU cache? Can we mount RFAs when target resource is CPU cache?
14
Cache Contention 14 RFA Goal
15
Case Study: Cache vs. Network Victim : Apache webserver hosting static and dynamic (CGI) web pages Beneficiary: Synthetic cache bound workload (LLCProbe) Target Resource: Cache No cache isolation: – ~3x slower when sharing cache with webserver 15 Net Cache $$$ Clients Local Xen Test bed Victim Beneficiary Core
16
Net Cache vs. Network Victim webserver frequently interrupts, pollutes the cache – Reason: Xen gives higher priority to VM consuming less CPU time Cache 16 Clients $$$ Cache state time line Beneficiary starts to run Core decreased cache efficiency Webserver receives a request Heavily loaded web server cache state
17
Net Cache vs. Network w/ RFA RFA helps in two ways: 1.Webserver loses its priority. 2.Reducing the capacity of webserver. Cache 17 Clients $$$ Cache state time line Core Helper Heavily loaded webserver requests under RFA CGI Request Beneficiary starts to run Webserver receives a request Heavily loaded web server cache state
18
RFA: Performance Improvement 18 RFA intensities – time in ms per second 196% slowdown 86% slowdown 60% Performance Improvement 60% Performance Improvement
19
RFA Effect on Interruptions Beneficiary: LLCProbe 19 40% 85 % x +
20
RFA Effect on Victim’s capacity Decreases with increasing RFA intensity 20
21
Instance typem1.small # of co-resident pairs9 (23 total instances) Machine typeIntel Xeon E5507 with 4MB LLC Experiments on Amazon EC2 VM 21 VM Multiple Accounts Co-resident VMs from our accounts: Stand-ins for victim and beneficiary Separate instances for helper and web clients No direct interact with any other customers Indirect interaction akin to normal usage cases VM
22
LLCProbe Synthetic Benchmark RFA improved performance of LLCProbe on all experimental EC2 instances! Highest performance improvement of 13%, recovering 33% of performance lost. 22 Average performance improvement: 6%
23
mcf from SPEC-CPU 23 10% slowdown 6% slowdown 3% performance improvement = 35% reduction in performance loss On average RFA improved performance across all SPEC workloads!
24
Discussion: Practical Aspects RFA case studies used CPU intensive CGI requests – Alternative: DoS vulnerabilities (Eg. hash-collision attacks) Identifying co-resident victims – Easy on most clouds (Co-resident VMs have predictable internal IP addresses) No public interface? – Paper discusses possibilities for RFAs 24 VM
25
Conclusion Resource-Freeing Attacks – Interfere with victim to shift resource use – Proof-of-concept of efficacy in public clouds Open questions: – Other RFAs? – Countermeasures: Detection, stricter isolation, smarter scheduling? 25 VM
26
References [MMSys10] Sean K. Barker and Prashant Shenoy. “Empirical evaluation of latency-sensitive application performance in the cloud.” In MMSys, 2010. [Security10] Thomas Moscibroda and Onur Mutlu. “Memory performance attacks: Denial of memory service in multi-core systems.” In Usenix Security Symposium, 2007. [CCS09] T. Ristenpart, E. Tromer, H. Shacham, and S. Savage. “Hey, you, get off my cloud: exploring information leakage in third party compute clouds.” In CCS, 2009. 26
27
Backup Slides 27
28
Discussion: Countermeasures Detection? – May be hard to differentiate RFA from legitimate Stricter Isolation? – Works but expensive Contention-aware scheduling – Not yet used in public IaaS 28
29
Discussion: Economies Cost of RFA – Helper instance, and – RFA traffic. Co-resident helper – An efficient implementation of helper can run inside the attacker’s VM. – Current helper implementation consumes 15 Kbps of network bandwidth and a CPU utilization of 0.7%. Multiplex Singe Helper Instance for many beneficiaries. Note: Currently, internal EC2 network traffic is free-of- cost. 29
30
Identifying Co-resident VMs Identifying the public interface: – Predictable numerical distance between internal IP addresses in public clouds. – Identifying port used by the victim application (standard ports like http(s), etc.). 30
31
Experiment: Measuring Resource Contention Synthetic workloads 31
32
Other RFAs RFAs are not limited to the presented case studies. LLC vs. Disk – Sending spurious, random disk requests asynchronously to create a bottleneck for the shared disk resource. Memory vs. Disk – Similarly to the above RFA 32
33
Discussion: More on Practical Aspects Work-conserving vs. Non-work-conserving schedulers – It is expected that public cloud environment manage resources in a non-work-conserving fashion. – Eg. Net vs. Net RFA won’t work on Amazon EC2. Simulated client workload – What is the effect of RFA in the presence of multiple independent client requests originating from numerous clients? 33
34
N/W Core cache memory Disk Hypervisor Dom0 VM Xen Internals Domain-0 – Privileged Domain, direct access to I/O devices. – All I/O requests goes through Dom-0 Xen scheduler internal – Boost priority for interactive workloads Incoming request 34
35
Experiment: Measuring Resource Contention On a local Xen test bed Local Xen Test bed VM N/W Core VM LLC memory Disk VM MachineIntel Xeon E5430, 2.66 Ghz Packages2, 2 cores per package LLC Size6MB per package LLC Not all resources conflict Some have huge performance degradation 35
36
Boost Priority and Interruptions Victim: WebserverBeneficiary: LLCProbe 95% < 30% 40% 85% Fewer interruptions Higher cache efficiency 36
37
Demonstration on EC2 Problem #1: Achieving Co-residence – Launching multiple instances simultaneously from two or more accounts. Problem #2: Verifying Co-residency – Numerical distance between internal IP addresses [CCS09]. – Faster packet round-trip times. – Using resource contention experiments. 37
38
Normalized Performance on EC2 Baseline Higher is better Aggregate performance degradation is within 5 performance points 6% On an average all SPEC workloads benefitted from RFA 38
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.