Heracles: Improving Resource Efficiency at Scale ISCA’15 Stanford University Google, Inc.


1 Heracles: Improving Resource Efficiency at Scale ISCA’15 Stanford University Google, Inc.

2 Outline
Introduction
Design
  ◦ Isolation Mechanisms
  ◦ Controllers
Evaluation
Conclusion

3 Motivation
Average server utilization in most datacenters is low, ranging from 10% to 50%.
  ◦ It is difficult to consolidate latency-critical (LC) services onto a subset of highly utilized servers.
Server utilization can be increased by launching best-effort (BE) tasks on the same server as a latency-critical job.

4 Motivation (Cont.)
Previous work tends to protect latency-critical (LC) workloads, but at the cost of reducing the opportunities for higher utilization through co-location.

5 Goal
Eliminate service-level objective (SLO) violations at all levels of load for the LC job while maximizing the throughput of BE tasks.

6 Heracles
A real-time, feedback-based controller
  ◦ Enables the safe co-location of best-effort (BE) tasks alongside a latency-critical (LC) service.
  ◦ Ensures that LC jobs meet their latency targets while maximizing the resources given to BE tasks.

7 Heracles (Cont.)
  ◦ Four hardware and software isolation mechanisms.
    ▪ Hardware: shared cache partitioning, fine-grained power/frequency settings.
    ▪ Software: core isolation, network traffic control.

8 Isolation Mechanisms (Software)
Core isolation
  ◦ Pin each workload to a set of cores using cpuset cgroups.
  ◦ Speed of (re)allocation: tens of milliseconds.
Network traffic
  ◦ Limit the outgoing bandwidth of BE tasks using Linux traffic control.
  ◦ No limit on the LC job.
  ◦ Takes effect within hundreds of milliseconds.
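As a rough illustration of the two software mechanisms above, the sketch below formats a core list in the syntax that a cpuset cgroup's `cpuset.cpus` file accepts and builds the `tc` command lines for an HTB class that caps egress bandwidth. The helper names, the class ID `1:20`, and the choice to return command strings instead of running them are assumptions made for illustration. The slides do not show Heracles' actual configuration, and a filter that steers BE traffic into the class is omitted.

```python
from typing import List


def cpuset_list(cores: List[int]) -> str:
    """Format core IDs in cpuset list syntax, e.g. [0, 1, 2, 3, 8] -> '0-3,8'."""
    ranges = []
    for core in sorted(set(cores)):
        if ranges and core == ranges[-1][1] + 1:
            ranges[-1][1] = core          # extend the current run of cores
        else:
            ranges.append([core, core])   # start a new run
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


def be_egress_limit_cmds(dev: str, rate_mbit: int) -> List[str]:
    """Build tc commands that create an HTB class capped at rate_mbit.

    Hypothetical layout: BE traffic would be steered into class 1:20 by a
    filter (not shown); traffic that matches no class is left unshaped,
    which corresponds to placing no limit on the LC job.
    """
    return [
        f"tc qdisc add dev {dev} root handle 1: htb",
        f"tc class add dev {dev} parent 1: classid 1:20 "
        f"htb rate {rate_mbit}mbit ceil {rate_mbit}mbit",
    ]


if __name__ == "__main__":
    print(cpuset_list([8, 0, 1, 2, 3]))
    for cmd in be_egress_limit_cmds("eth0", 500):
        print(cmd)
```

Returning strings keeps the sketch safe to run without root privileges; a real deployment would write the core list into the BE cgroup's `cpuset.cpus` file and execute the `tc` commands.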

9 Isolation Mechanisms (Hardware)
LLC isolation
  ◦ Cache Allocation Technology (CAT) in recent Intel chips.
    ▪ Uses way-partitioning to define non-overlapping partitions in the LLC.
    ▪ Takes effect in a few milliseconds.
  ◦ Implement a software monitor to track the memory bandwidth usage of LC and BE jobs.
    ▪ Scale down the number of cores for BE jobs if LC jobs do not receive sufficient bandwidth.
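Way-partitioning can be pictured as assigning each job a bitmask over the cache ways. The sketch below splits a cache into two non-overlapping, contiguous masks, since CAT capacity bitmasks must be contiguous. The function name, the 20-way example, and the decision to place the LC partition in the high-order ways are illustrative assumptions, not details taken from the slides.

```python
from typing import Tuple


def split_llc_ways(total_ways: int, lc_ways: int) -> Tuple[int, int]:
    """Return (lc_mask, be_mask): contiguous, non-overlapping way bitmasks.

    Each job must receive at least one way, so lc_ways must lie strictly
    between 0 and total_ways.
    """
    if not 0 < lc_ways < total_ways:
        raise ValueError("each partition needs at least one way")
    be_ways = total_ways - lc_ways
    be_mask = (1 << be_ways) - 1               # low-order ways go to BE tasks
    lc_mask = ((1 << lc_ways) - 1) << be_ways  # remaining high-order ways go to LC
    return lc_mask, be_mask


if __name__ == "__main__":
    lc, be = split_llc_ways(total_ways=20, lc_ways=16)
    print(f"LC mask: {lc:#x}, BE mask: {be:#x}")
```

Growing or shrinking the BE partition then amounts to recomputing the two masks with a different `lc_ways` value.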

10 Isolation Mechanisms (Hardware) (Cont.)
Power isolation
  ◦ Uses CPU frequency monitoring, Running Average Power Limit (RAPL), and per-core DVFS.
  ◦ Takes effect within a few milliseconds.
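One way to combine these three pieces is a small feedback step: read package power (for example from RAPL), check whether the LC cores still run at their guaranteed frequency, and nudge the BE cores' DVFS setting accordingly. The sketch below is a guess at that shape; the 90% power threshold, the 100 MHz step, and the frequency bounds are all assumed values, and reading the sensors is left out.

```python
def next_be_freq_mhz(
    power_w: float,
    tdp_w: float,
    lc_freq_mhz: int,
    lc_guaranteed_mhz: int,
    be_freq_mhz: int,
    step_mhz: int = 100,   # assumed DVFS step
    min_mhz: int = 1200,   # assumed lowest BE frequency
    max_mhz: int = 2300,   # assumed highest BE frequency
) -> int:
    """Return the next frequency cap for BE cores (all thresholds hypothetical)."""
    near_power_cap = power_w > 0.9 * tdp_w
    if near_power_cap and lc_freq_mhz < lc_guaranteed_mhz:
        # The LC job is being slowed by the power budget: take power from BE cores.
        return max(min_mhz, be_freq_mhz - step_mhz)
    if not near_power_cap and lc_freq_mhz >= lc_guaranteed_mhz:
        # Headroom exists and the LC job is unaffected: give BE cores more frequency.
        return min(max_mhz, be_freq_mhz + step_mhz)
    return be_freq_mhz  # otherwise hold steady


if __name__ == "__main__":
    print(next_be_freq_mhz(95.0, 100.0, 2000, 2300, 2000))
```

Calling this once per control interval yields a gradual adjustment, which fits a mechanism that takes effect within a few milliseconds.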

11 Design Approach
An optimization problem
  ◦ Maximize utilization under the constraint that the SLO must be met.
Heracles
  ◦ Decomposes the high-dimensional optimization problem into many smaller, independent problems.
    ▪ Decouples the interference sources.
  ◦ Monitors latency, latency slack, and load.
    ▪ Adjusts the BE job allocation accordingly.
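The monitoring loop described above can be sketched as a single decision step driven by latency slack and load. Here slack is taken to be the fraction of the latency target that remains unused, and the load and slack thresholds are placeholder values chosen for illustration; the names `Decision`, `latency_slack`, and `top_level_step` are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass

LOAD_HIGH = 0.85   # assumed: above this load, BE tasks are paused
LOAD_LOW = 0.80    # assumed: below this load, BE tasks may run
SLACK_GROW = 0.10  # assumed: minimum slack before BE may receive more resources


@dataclass(frozen=True)
class Decision:
    be_enabled: bool   # whether BE tasks may run at all
    be_may_grow: bool  # whether sub-controllers may give BE more resources


def latency_slack(target_ms: float, measured_ms: float) -> float:
    """Fraction of the latency target still unused; negative means an SLO violation."""
    return (target_ms - measured_ms) / target_ms


def top_level_step(slack: float, load: float, be_enabled: bool) -> Decision:
    """One iteration of a feedback loop over latency slack and load."""
    if slack < 0 or load > LOAD_HIGH:
        return Decision(be_enabled=False, be_may_grow=False)
    if load < LOAD_LOW:
        be_enabled = True
    # Between LOAD_LOW and LOAD_HIGH the previous state is kept (hysteresis).
    return Decision(be_enabled, be_enabled and slack >= SLACK_GROW)


if __name__ == "__main__":
    slack = latency_slack(target_ms=100.0, measured_ms=70.0)
    print(top_level_step(slack, load=0.5, be_enabled=False))
```

Keeping this step free of any resource-specific logic mirrors the decomposition idea: per-resource sub-controllers only need to know whether BE tasks are enabled and whether they are allowed to grow.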

12 System Diagram

13 High-level Controller

14 Core & Memory Sub-controller

15 Max Load under SLO

16 Power and Network Sub-controller

17 Evaluation
Two sets of experiments
  ◦ Co-locate LC applications with BE tasks on a single server.
  ◦ Measure the end-to-end latency of websearch on tens of servers.
    ▪ BE tasks are also running.
Effective Machine Utilization (EMU)
  ◦ LC throughput + BE throughput
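For the EMU sum to be meaningful, the two throughputs need a common scale. The sketch below assumes each throughput is normalized to the peak that the job achieves when running alone on the machine, so EMU is a fraction that can exceed 1.0 when co-location is very effective. The function name, parameter names, and example numbers are illustrative.

```python
def effective_machine_utilization(
    lc_tput: float, lc_peak: float, be_tput: float, be_peak: float
) -> float:
    """EMU = normalized LC throughput + normalized BE throughput.

    Assumption: each peak is the throughput of that job running alone.
    """
    if lc_peak <= 0 or be_peak <= 0:
        raise ValueError("peak throughputs must be positive")
    return lc_tput / lc_peak + be_tput / be_peak


if __name__ == "__main__":
    # Hypothetical numbers: LC at 40% of its peak, BE at 50% of its peak.
    print(effective_machine_utilization(400.0, 1000.0, 50.0, 100.0))
```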

18 Workloads
Three Google production LC workloads:
  ◦ websearch
  ◦ ml_cluster
    ▪ Real-time text clustering using machine learning
  ◦ memkeyval
    ▪ In-memory key-value store
Run LC workloads with benchmarks that stress a single shared resource.
  ◦ Stream-LLC, Stream-DRAM, cpu-pwr, iperf, brain, and streetview.

19 Latency of LC Applications

20 EMU

21 Shared Resource Utilization

22 Websearch in Cluster

23 Conclusion
Heracles
  ◦ A heuristic, feedback-based system that manages four isolation mechanisms so that a latency-critical workload can be co-located with batch jobs without SLO violations.
  ◦ Evaluation on real hardware demonstrates an average utilization of 90% across all evaluated scenarios without any SLO violations for the latency-critical job.

24 Interference Analysis
Three Google production LC workloads:
  ◦ websearch
  ◦ ml_cluster
    ▪ Real-time text clustering using machine learning
  ◦ memkeyval
    ▪ In-memory key-value store
Run LC workloads with synthetic benchmarks that stress each resource in isolation.


