Published by Eustace Williamson. Modified over 9 years ago.
1
SAN FRANCISCO, CA, USA
Adaptive Energy-efficient Resource Sharing for Multi-threaded Workloads in Virtualized Systems
Can Hankendi, Ayse K. Coskun
Boston University, Electrical and Computer Engineering Department
This project has been partially funded by:
2
Computing in Heterogeneous, Autonomous 'N' Goal-oriented Environments
Energy Efficiency in Computing Clusters
Energy-related costs are among the biggest contributors to the total cost of ownership.
Consolidating multiple workloads on the same physical node improves energy efficiency.
(Source: International Data Corporation (IDC), 2009)
3
Multi-threaded Applications in the Cloud
HPC applications are expected to shift towards cloud resources.
Resource allocation decisions significantly affect the energy efficiency of server nodes.
Energy efficiency is a function of application characteristics.
4
Outline
- Background
- Methodology
- Adaptive Resource Sharing
- Results
- Conclusions
5
Background
Cluster-level VM Management
- Consolidation policies across server nodes
- VM migration techniques
[Srikantaiah, HotPower’08] [Bonvin, CCGrid’11]
Node-level Management: Recent Co-scheduling Policies
- Co-scheduling contrasting workloads
- Balancing performance events across nodes
  - Cache misses
  - IPC
  - Bus accesses
[Dhiman, ISLPED’09] [Bhadauria, ICS’10]
- Co-scheduling based on thread communication
- Identifying best thread mixes to co-schedule
[Frachtenberg, TPDS’05] [McGregor, IPDPS’05]
6
Virtualized System Setup
12-core AMD Magny Cours server
- Two 6-core dies attached side by side in the same package
- Private L1 and L2 caches for each core
- 6 MB shared L3 cache for each 6-core die
Virtualized through the VMware vSphere 5 ESXi hypervisor
- 2 virtual machines (VMs) with Ubuntu Server guest OS
7
Methodology: Measurement Setup
System-level power measurements, sampled every 1 s
Performance counter collection through vmkperf, sampled every 1 s
- Counters: CPU cycles, retired instructions, L3-cache misses
VM-level CPU and memory utilization data collection through esxtop, sampled every 2 s
[Figure: measurement setup showing system-level power measurement, logger, esxtop, and vmkperf]
8
Parallel Workloads
PARSEC 2.1 benchmark suite [Bienia et al., 2008]
Benchmark | Application | IPC | Memory Acc.
blackscholes | Financial Analysis | Low
bodytrack | Computer Vision | High | Medium
canneal | VLSI Design | Low | High
dedup | Enterprise Storage | Medium | Low
ferret | Similarity Search | Medium | Low
freqmine | Data Mining | High | Low
swaptions | Financial Analysis | High | Low
streamcluster | Data Mining | Low | High
vips | Media Processing | High | Low
x264 | Media Processing | Medium
9
Tracking Parallel Phases
consolmgmt: consolidation management interface
- Synchronizes the ROI (region of interest) of multiple workloads
[Figure: consolmgmt, parsecmgmt, and hooks.c; timelines for Benchmark A and Benchmark B with Input (Serial) and Output (Serial) phases; calls shown: roi-Trigger(), sleep(), start-Logging(), end-Logging()]
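The idea of synchronizing the ROI of co-scheduled workloads can be illustrated with a small simulation. This is only a sketch: the slides do not show how consolmgmt is implemented, so the barrier below is an assumed stand-in for roi-Trigger(), and `run_benchmark`, `run_synchronized`, and the serial-phase durations are hypothetical.

```python
import threading
import time


def run_benchmark(name, serial_input_s, roi_barrier, roi_starts):
    """Simulate one benchmark: a serial input phase, then a synchronized ROI start."""
    time.sleep(serial_input_s)            # serial input phase; its length differs per benchmark
    roi_barrier.wait()                    # assumed stand-in for roi-Trigger(): wait for all workloads
    roi_starts[name] = time.monotonic()   # logging would start here, at the common ROI start


def run_synchronized(serial_times):
    """Run all benchmarks concurrently and return each one's ROI start time."""
    roi_barrier = threading.Barrier(len(serial_times))
    roi_starts = {}
    threads = [
        threading.Thread(target=run_benchmark, args=(name, t, roi_barrier, roi_starts))
        for name, t in serial_times.items()
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return roi_starts


# Hypothetical serial-phase lengths: B needs longer to load its input than A.
roi_starts = run_synchronized({"A": 0.05, "B": 0.20})
skew = abs(roi_starts["A"] - roi_starts["B"])
print(f"ROI start skew: {skew * 1000:.1f} ms")
```

Without the barrier, benchmark A would enter its parallel phase about 0.15 s before B, and the measured interval would mix serial and parallel behavior.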
10
Performance Impact of Consolidation
Consolidating multiple workloads can degrade performance due to resource contention.
Virtualization provides performance isolation by managing memory and NUMA node affinities.
With a native OS, performance variation is 2.5x higher.
[Figure: Average throughput of Streamcluster when co-scheduled with another PARSEC benchmark]
11
Outline
- Background
- Methodology
- Adaptive Resource Sharing
- Results
- Conclusions
12
Impact of Application Selection
Previous co-scheduling policies focus on application selection to improve energy efficiency.
Application selection is based on balancing memory operations and CPU usage.
[Figure: panels A, B, C, D]
13
Predicting Power Efficiency
To improve energy efficiency, we need to allocate more CPU resources to power-efficient workloads.
The IPC*CPU Utilization metric shows a strong correlation with power efficiency.
[Figure: axis label: IPC*CPU Utilization]
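As a rough illustration, the metric can be computed from the counters listed in the measurement setup (CPU cycles and retired instructions) together with a VM's CPU utilization. The function name and the choice to express utilization as a fraction between 0 and 1 are assumptions; the slides do not state how utilization is normalized.

```python
def ipc_cpu_utilization(retired_instructions, cpu_cycles, cpu_utilization):
    """Return IPC * CPU utilization for one sampling interval.

    cpu_utilization is assumed to be a fraction in [0, 1]; the slides do not
    specify the normalization.
    """
    if cpu_cycles <= 0:
        raise ValueError("cpu_cycles must be positive")
    if not 0.0 <= cpu_utilization <= 1.0:
        raise ValueError("cpu_utilization must be in [0, 1]")
    ipc = retired_instructions / cpu_cycles   # instructions per cycle
    return ipc * cpu_utilization


# Hypothetical counter values for one interval: IPC = 1.5 at 50% utilization.
metric = ipc_cpu_utilization(retired_instructions=3e9, cpu_cycles=2e9, cpu_utilization=0.5)
print(metric)
```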
14
Application Classification
The IPC*CPU Utilization metric is used to classify applications according to their power efficiency levels.
We use a density-based clustering algorithm (DBSCAN) to group applications by power efficiency class.
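A minimal one-dimensional DBSCAN sketch shows how applications might be grouped by a single metric value. The metric values, `eps`, and `min_pts` below are hypothetical; the slides do not give the parameters or the measured values used in the study.

```python
def dbscan_1d(values, eps, min_pts):
    """Cluster scalar values with DBSCAN; returns one label per value (-1 = noise)."""
    n = len(values)
    labels = [None] * n                    # None = not yet visited

    def neighbors(i):
        # All points within eps of point i, including i itself.
        return [j for j in range(n) if abs(values[i] - values[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = -1                 # provisionally noise; may become a border point
            continue
        cluster += 1                       # i is a core point: start a new cluster
        labels[i] = cluster
        queue = list(nbrs)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster        # former noise point becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_pts:     # j is also a core point: keep expanding
                queue.extend(j_nbrs)
    return labels


# Hypothetical IPC*CPU Utilization values for seven applications.
metric_values = [0.20, 0.25, 0.30, 1.00, 1.05, 1.10, 2.50]
labels = dbscan_1d(metric_values, eps=0.15, min_pts=2)
print(labels)  # → [0, 0, 0, 1, 1, 1, -1]
```

DBSCAN does not need the number of classes up front, which suits a setting where the number of power efficiency groups is not known in advance.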
15
Application Classification
[Figure: VM configuration for the benchmarks; Case 1 and Case 2 each show VM0 and VM1 on ESXi]
16
Reconfiguring Resource Allocations
CPU hot-plugging: adding/removing vCPUs during runtime
- Cons: removing a vCPU is not supported in some OSes
Resource allocation adjustment: allocating/limiting CPU resources for VMs
- Pros: fine granularity (the resource allocation unit is MHz)
Both techniques have low overhead (less than 1%).
[Figure: Resource Configuration Comparison]
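To make the MHz-granularity adjustment concrete, the sketch below splits a CPU budget between VMs in proportion to a per-class weight. The proportional rule, the weights, and the 24000 MHz budget are all assumptions for illustration; the slides do not state the allocation rule, and the code does not call any VMware API.

```python
def cpu_limits_mhz(total_mhz, weight_by_vm):
    """Split a CPU budget (in MHz) across VMs in proportion to their weights.

    The proportional split is an assumed policy, not the one from the slides.
    """
    total_weight = sum(weight_by_vm.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return {
        vm: round(total_mhz * weight / total_weight)
        for vm, weight in weight_by_vm.items()
    }


# Hypothetical weights: VM0 runs a more power-efficient class than VM1.
limits = cpu_limits_mhz(total_mhz=24000, weight_by_vm={"VM0": 3, "VM1": 1})
print(limits)  # → {'VM0': 18000, 'VM1': 6000}
```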
17
Reconfiguration Runtime Behavior
Resource allocation limits can be dynamically adjusted according to application classes.
CPU allocation limits can be effectively reconfigured within a second.
18
Results
The proposed approach improves throughput-per-watt by up to 25%, and by 9% on average.
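For reference, a throughput-per-watt comparison can be computed as below. This is a sketch under assumptions: throughput is taken as work units completed per second, and the input numbers are invented for illustration only; they are not measurements from the study.

```python
def throughput_per_watt(work_units, elapsed_s, avg_power_w):
    """Throughput (work units per second) divided by average power (watts)."""
    if elapsed_s <= 0 or avg_power_w <= 0:
        raise ValueError("elapsed_s and avg_power_w must be positive")
    return (work_units / elapsed_s) / avg_power_w


def improvement_percent(baseline, proposed):
    """Relative improvement of proposed over baseline, in percent."""
    return (proposed - baseline) / baseline * 100.0


# Invented numbers: the same work finishes faster at slightly higher power.
baseline = throughput_per_watt(work_units=1000, elapsed_s=100, avg_power_w=200)
proposed = throughput_per_watt(work_units=1000, elapsed_s=80, avg_power_w=220)
gain = improvement_percent(baseline, proposed)
print(f"{gain:.1f}% improvement in throughput-per-watt")
```

The example shows why the metric is useful: a configuration that draws more power can still come out ahead if it shortens the run enough.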
19
Results
We generate 50 workload sets, each consisting of 10 randomly selected PARSEC applications.
Set 1: 4x blackscholes, 2x vips, 1x bodytrack, 1x freqmine, 1x streamcluster, 1x swaptions
Set 2: 3x canneal, 3x ferret, 2x bodytrack, 1x dedup, 1x vips
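Workload sets of this shape could be generated as sketched below. Sampling with replacement is inferred from the example sets, which contain repeated applications; the uniform distribution and the fixed seed are assumptions.

```python
import random
from collections import Counter

# The ten PARSEC 2.1 benchmarks listed on the Parallel Workloads slide.
PARSEC_BENCHMARKS = [
    "blackscholes", "bodytrack", "canneal", "dedup", "ferret",
    "freqmine", "swaptions", "streamcluster", "vips", "x264",
]


def generate_workload_sets(num_sets=50, set_size=10, seed=0):
    """Return num_sets workload sets, each a Counter of application -> copies.

    Applications are drawn uniformly with replacement (an assumption).
    """
    rng = random.Random(seed)              # fixed seed so the sets are reproducible
    return [
        Counter(rng.choices(PARSEC_BENCHMARKS, k=set_size))
        for _ in range(num_sets)
    ]


workload_sets = generate_workload_sets()
first = workload_sets[0]
print(", ".join(f"{count}x {app}" for app, count in first.most_common()))
```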
20
Results
Across the 50 workload sets, the proposed resource sharing technique improves throughput-per-watt by 12% on average compared with co-scheduling techniques based on application selection.
21
Conclusions & Future Work
Consolidation is a powerful technique to improve energy efficiency in data centers.
Energy efficiency of parallel workloads varies significantly depending on application characteristics.
Adaptive VM configuration for parallel workloads improves energy efficiency by 12% on average over existing co-scheduling algorithms.
Future research directions include:
- Investigating the effect of memory allocation decisions on energy efficiency
- Utilizing application-level instrumentation to explore power/energy optimization opportunities
- Expanding the application space
22
Performance Comparison