Chih-Hsun Chou Daniel Wong Laxmi N. Bhuyan


1 DynSleep: Fine-grained Power Management for a Latency-Critical Data Center Application
Chih-Hsun Chou, Daniel Wong, Laxmi N. Bhuyan

2 Outline
Background & Motivation: Data Center Workload Characteristics. Existing Approaches.
DynSleep
Prototype with Memcached
Experimental Evaluation

3 Data Center Latency-Critical Workloads Characteristics
Server utilization: Lightly loaded. Short-term variability.
Request processing: ON/OFF execution pattern. Non-deterministic.
Poor energy efficiency at low server utilization.

4 Power Saving Opportunities
Target QoS is defined at peak load. Low-utilization servers create latency slack. This slack can be exploited for power saving.
[Figure: tail latency under light load compared with the target tail latency; the gap between them is the latency slack.]

5 Existing Approaches
DVFS: reducing the processing rate. Limited room for down scaling. Limited power saving. Per-core control is not common.
Frequency (GHz)     2.7    2.4    2.1    1.8    1.5    1.2
Voltage (V)         0.99   0.96   0.94   0.92   0.90   0.88
Active Power (W)    3.42   2.93   2.49   2.05   1.68   1.31
56% frequency reduction, 13% voltage reduction, 61% power reduction.
Sleep States: limited by the length of idle periods.
State   State transition time   Target Residency   Power
C0      N/A                                        (3~3.5 W)
C1      1 μs                                       1.2 W
C3      59 μs                   156 μs             0.13 W
C6      89 μs                   300 μs             0 W
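To make the "limited by the length of idle periods" point concrete, here is a minimal sketch of choosing a sleep state from an expected idle length, using the target residency and power values in the sleep-state table. The function name `deepest_state` is hypothetical, and the table gives no target residency for C1, so treating C1 as usable for any idle period is an assumption.

```python
# (state, target residency in µs, power in W), taken from the sleep-state table.
# The table lists no target residency for C1; 0 µs here is an assumption.
SLEEP_STATES = [
    ("C1", 0.0, 1.2),
    ("C3", 156.0, 0.13),
    ("C6", 300.0, 0.0),
]


def deepest_state(idle_us):
    """Return the deepest state whose target residency fits the idle period."""
    chosen = "C0"  # stay active if nothing fits
    for name, residency_us, _power_w in SLEEP_STATES:
        if idle_us >= residency_us:
            chosen = name
    return chosen


for idle in (100, 200, 400):
    print(idle, deepest_state(idle))
```

Under these assumptions, idle periods shorter than 156 μs never get past the shallow state, which is why scattered short idle periods leave idle power high.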

6 Observations
Short idle periods cause high idle power. Traffic is variable. Fine-grained control is needed over both the time and space domains.
Our Solution: DynSleep

7 DynSleep: Overview
Utilizing per-core sleep states (space domain).
Postponing request service: transform scattered idle periods into one longer period suitable for a deep sleep state (reduces idle power).
Dynamically determining the core wake-up time: satisfy the target tail latency constraint (time domain).
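The idea of postponing service while still meeting the tail latency target can be sketched as a latest-wake-up-time calculation. This is an illustrative simplification, not the authors' algorithm: it assumes a fixed, known per-request service time and first-come-first-served processing, and the function name and the 50 µs service time are hypothetical. The 89 µs wake-up latency is the C6 transition time and 686 µs is the target latency quoted elsewhere in the talk.

```python
def latest_wakeup_us(arrivals_us, target_us, service_us, wake_latency_us):
    """Latest time the core may be woken so every queued request meets its deadline.

    Request i (0-based, in arrival order) finishes at
    start + (i + 1) * service_us and must finish by arrival + target_us.
    """
    if not arrivals_us:
        return None  # nothing queued: the core can keep sleeping
    latest_start = min(
        arrival + target_us - (i + 1) * service_us
        for i, arrival in enumerate(sorted(arrivals_us))
    )
    return latest_start - wake_latency_us


TARGET_US = 686        # target tail latency from the evaluation slide
WAKE_LATENCY_US = 89   # C6 transition time from the sleep-state table
SERVICE_US = 50        # hypothetical per-request service time

queue = []
for arrival in (0, 10, 20):  # hypothetical arrival times in µs
    queue.append(arrival)
    print(len(queue), latest_wakeup_us(queue, TARGET_US, SERVICE_US, WAKE_LATENCY_US))
```

Each new arrival can only pull the wake-up time earlier, so the value is recomputed per request; a real implementation would also have to handle a wake-up time that is already in the past.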

8 DynSleep: Example
[Figure: timeline starting at t=0. Requests R1, R2, and R3 arrive at t=A1, t=A2, and t=A3, each bounded by its Target Tail Latency; wake-up times are marked W1 and W3.]

9 DynSleep: Power consumption behavior
[Figure: power state over time for Baseline and DynSleep; states shown are Active, Shallow sleep, and Deep sleep.]

10 Case Study: Memcached
[Figure: clients send requests to the Memcached server; each worker thread runs Libevent, Read and Parse Data (fd), Request Processing (req, result), and Send Response.]
libevent monitors network sockets through epoll for request arrivals. Threads and requests are independent.

11 Memcached with DynSleep
[Figure labels: Clients; Libevent Thread; Request Processing Thread; Libevent; Read and Parse Data (fd, fd1, fd2); DynSleep Manager; DynSleep Calculator; Register/Update Timer; Thread Communication; Request Processing (req, result); Send Response; Sleep signal; Wakeup signal.]
Two separate threads. A core is woken up by the wakeup signal.
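The two-thread split can be illustrated with a small sketch in which one thread queues requests and a separate processing thread blocks until it receives a wakeup signal. This is only an analogy in Python, not the Memcached prototype: the class name `RequestWorker`, the use of `threading.Event` as the wakeup signal, and the upper-casing stand-in for request processing are all assumptions.

```python
import queue
import threading


class RequestWorker:
    """Processing thread that sleeps until a wakeup signal arrives."""

    def __init__(self):
        self.pending = queue.Queue()     # requests parked while the worker sleeps
        self.wakeup = threading.Event()  # stand-in for the wakeup signal
        self.results = []
        self.thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        while True:
            self.wakeup.wait()   # blocked here, analogous to the core sleeping
            self.wakeup.clear()
            while True:          # drain everything queued during the sleep
                try:
                    req = self.pending.get_nowait()
                except queue.Empty:
                    break
                if req is None:  # sentinel: shut the worker down
                    return
                self.results.append(req.upper())  # placeholder for processing


worker = RequestWorker()
worker.thread.start()
for req in ("get k1", "get k2"):  # front-end thread parks incoming requests
    worker.pending.put(req)
worker.pending.put(None)
worker.wakeup.set()               # timer expiry would send this signal
worker.thread.join(timeout=2.0)
print(worker.results)
```

Requests are always queued before the signal is set, and the worker drains the queue after clearing the signal, so a wakeup cannot be lost between the two steps.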

12 Evaluation: Experiment Setup
A client server and a request processing server connected over 10G Ethernet. Intel Xeon E V2 12-core processor, which supports only per-core DFS. On-chip energy sensors with a 1 kHz sampling rate.

13 Evaluation: Power Saving
At low to medium load, larger latency slack leads to high power saving for DynSleep. At high load, DynSleep's power saving is comparable to that of the DVFS scheme. DynSleep significantly outperforms the per-core DFS scheme.

14 Evaluation: Latency Distribution
Load 0.3. 95th-percentile latency: Baseline 187 µs, DFS 448 µs, DynSleep 665 µs. Target: 686 µs.
The baseline has about 500 µs of latency slack. A large gap remains in the DFS (DVFS) scheme because of the limited VF down scaling. DynSleep effectively closes the gap by postponing request processing.
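The slack figures follow directly from the numbers on this slide. A short worked calculation, with variable names chosen only for illustration:

```python
TARGET_US = 686  # target tail latency at load 0.3
p95_us = {"Baseline": 187, "DFS": 448, "DynSleep": 665}

# Remaining latency slack = target minus measured 95th-percentile latency.
slack_us = {scheme: TARGET_US - latency for scheme, latency in p95_us.items()}

for scheme, slack in slack_us.items():
    print(f"{scheme}: {slack} µs of slack left")
```

The baseline leaves 499 µs unused (the "about 500 µs" on the slide), DFS still leaves 238 µs, and DynSleep leaves 21 µs.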

15 Evaluation: Load Changes
At low load, DFS cannot fully utilize the latency slack. At high load, DFS lacks responsiveness and frequently violates the constraint. DynSleep responds instantaneously to load changes because it updates at the request level.

16 Conclusion
The major source of energy inefficiency is idle power, caused by non-deterministic and short idle periods.
We propose DynSleep: Reshape the idle period pattern. Utilize deep sleep states. Dynamically wake up to meet the strict QoS constraint.
Our memcached prototype demonstrates up to 65% power saving.

