
1 MittOS: Supporting Millisecond Tail Tolerance with Fast Rejecting SLO-Aware OS Interface
Mingzhe Hao, Huaicheng Li, Michael Hao Tong, Chrisma Pakha, Riza O. Suminto, Cesar A. Stuardo, Andrew A. Chien, and Haryadi S. Gunawi

2 SOSP’17 Millisecond Matters!
Amazon: “every 100ms of latency costs 1% in sales”
Tabb Group: “broker could lose as much as $4 million in revenues per millisecond if its electronic trading platform was only 5ms behind the competition”
Google: “extra 500ms in search page generation time dropped traffic by 20%”

3 Millisecond Tail Latency
Parallel task #1: 10ms. Parallel task #2: 30ms (IO contention). Completion: 30ms.

4 Current Tail-Tolerant Mechanisms
1. Speculation (most popular): wait on the straggler, then send a backup. [Diagram: Wait, Backup, Completion 20ms]
2. Cloning: introduces 2x workload.
3. Snitching: does not work when burstiness fluctuates at the ms level.

5 Must Wait! vs. No Wait?
Must wait: the app waits, then sends a backup. Completion: 20ms.
No wait: the OS checks the disk queue and fast-rejects ("My disk is busy! Try elsewhere"), and the app fails over without waiting. Completion: 10ms + network hop.

6 OS can see “everything” and tell app when it is busy
Use case: "I want < 20ms latency."
1. App sets SLO = 20ms.
2. App calls ret = read(.., SLO).
3. OS checks the disk queue.
4. OS rejects fast (no wait).
5. App: if (ret == Reject) // failover
Latency = 10ms + network hop.
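The use case above can be sketched in Python. This is only an illustration under stated assumptions: `Replica`, `slo_read`, `REJECT`, and `read_with_failover` are hypothetical names, the predicted wait is a plain number rather than something derived from a real disk queue, and the fallback used when every replica rejects is a choice made for this sketch.

```python
REJECT = object()  # hypothetical sentinel standing in for the slide's "Reject"


class Replica:
    """Toy replica whose OS exposes a predicted queue wait (hypothetical model)."""

    def __init__(self, name, predicted_wait_ms, data):
        self.name = name
        self.predicted_wait_ms = predicted_wait_ms
        self.data = data

    def slo_read(self, key, slo_ms):
        # Fast reject: answer immediately when the SLO cannot be met.
        if self.predicted_wait_ms > slo_ms:
            return REJECT
        return self.data[key]


def read_with_failover(replicas, key, slo_ms):
    """Try replicas in order, failing over without waiting on a reject."""
    for replica in replicas:
        ret = replica.slo_read(key, slo_ms)
        if ret is not REJECT:
            return replica.name, ret
    # Assumption of this sketch: if every replica rejects, read from the
    # last one with the SLO disabled so the request still completes.
    last = replicas[-1]
    return last.name, last.slo_read(key, float("inf"))
```

With one busy and one idle replica, `read_with_failover([busy, idle], "k", 20)` returns the idle replica's value after a single immediate reject, which is the "10ms + network hop" path on the slide.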

7 MittOS Principles
SLO-aware interface. Reject fast. Transparent about busyness.
PC era: the OS is best effort (cannot reject IOs). DC era: less-busy replicas are available.
ret = read(.., < 20ms); if (ret == Reject) // failover

8 Challenge
App: ret = read(.., < 20ms). OS: should I reject this IO?
What is the OS queue policy? FIFO, elevator, CFQ, etc.
What is the device type?
Prediction depends on queue policy and device type.

9 Contribution
MittOS principle: support a fast-rejecting, SLO-aware interface.
Fast reject interface plus MittOS latency prediction for disk, Open-Channel SSD, and OS cache.
MittOS-powered (+50 LOC) vs. state of the art: hedged requests, cloning, application timeout, etc.
Cut tail: 50% latency reduction above the 75th percentile.

10 Outline
Introduction; Design (Challenges, Solutions); Evaluation; Conclusion

11 Prediction
ret = read(.., < 20ms): how to predict latency before submitting to the device?
Between the head and the tail of the queue: how many IOs are in front? How long will they take?
Latency < SLO → Accept; Latency > SLO → Reject.
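A minimal sketch of the accept/reject decision, assuming a plain FIFO queue where the service time of each queued IO is already known. The function names and the FIFO assumption are illustrative and not taken from the talk; the slide does not say what happens when latency equals the SLO, so this sketch accepts in that case.

```python
def predict_latency_ms(queued_service_times_ms, own_service_ms):
    """Predicted latency: time for every IO in front, plus this IO's own time.

    Assumes FIFO ordering, so all queued IOs are served before the new one.
    """
    return sum(queued_service_times_ms) + own_service_ms


def should_accept(queued_service_times_ms, own_service_ms, slo_ms):
    """Accept when the predicted latency does not exceed the SLO."""
    return predict_latency_ms(queued_service_times_ms, own_service_ms) <= slo_ms
```

For example, with two 5ms IOs in front and a 5ms read, the prediction is 15ms, so a 20ms SLO is accepted; with three 10ms IOs in front it is 35ms and the read is rejected.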

12 Challenge #1: Modeling Queue Policy
Reject/Accept depends on queue policy.
[Figure: an IO with SLO < 20ms arrives behind outstanding IOs (10ms to 50ms) in high- and low-priority queues. Under FIFO, Elevator, and Elevator + CFQ, the predicted wait is either < 20ms (Accept) or > 20ms (Reject).]

13 Challenge #2: Device Type
Reject/Accept depends on device type.
Same FIFO queue: parallel channels and chips → < 20ms, Accept; single-spindle disk → > 20ms, Reject.

14 Challenge #2: Device Type
Idiosyncrasies of devices are mostly unrevealed.
Disk: the OS sees an elevator queue by IO offset (200, 250, 600, 700) with the new IO at the end of the queue (Reject?), but the device's scheduling algorithm (SSTF) re-sorts, so the IO is fast (Accept!). OS prediction incorrect!
Flash: the OS sees too many IOs queued (Reject!), but the device remaps to a fast chip (Accept!). OS prediction incorrect!

15 Outline
Introduction; Design (Challenges, Solutions); Evaluation; Conclusion

16 Reject/Latency Prediction
Reject? = f(SLO, queue policy, device type)
Queue policy: get from source code, e.g. CFQ, noop.
Device type, simple: profiling is enough (MittCFQ).
Device type, complicated: white-box knowledge required (MittSSD).

17 MittCFQ
OS (open-sourced CFQ): contains user group management, 3 service trees, 7 different IO priorities, ~4500 LOC. Reverse engineer: which tree/queue does each IO belong to? How many IOs are in front?
Disk (black box): scheduling? Seek latency? (depends on seek distance) Transfer time? (depends on IO size)

18 MittCFQ Profiling
Random seek, random read, collect latency:
for each interval in [ 100MB, 200MB, ..., 1GB ] do:
  for (startOffset = 0; startOffset < maxOffset; startOffset += interval) {
    for (endOffset = 0; endOffset < maxOffset; endOffset += interval) {
      for (size = 0; size < maxSize; size += sizeInterval) {
        start_ts = gettimeofday();
        seek(startOffset);
        read(endOffset, size);
        end_ts = gettimeofday();
        latency = end_ts - start_ts;   // elapsed time of seek + read
        print(endOffset - startOffset, size, latency);   // seek distance, IO size, latency
      }
    }
  }
2 disk models, 11-hour profiling + concurrent IO profiling.
Linear regression (scikit-learn): seek distance and IO size → latency. Infer SSTF scheduling.
1 million entries (30MB memory overhead) for a 1TB drive. Accurate prediction.
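The regression step can be illustrated with a small pure-Python least-squares fit. This is a stand-in for the scikit-learn model named on the slide, not the authors' code: the linear form latency = b0 + b1 * seek_distance + b2 * io_size, the function names, and the units are all assumptions made for this sketch.

```python
def fit_latency_model(samples):
    """Least-squares fit of latency = b0 + b1*seek_distance + b2*io_size.

    samples: iterable of (seek_distance, io_size, latency) profiling tuples.
    Returns [b0, b1, b2].
    """
    n = 3
    # Accumulate the normal equations (X^T X) b = X^T y for features [1, d, s].
    xtx = [[0.0] * n for _ in range(n)]
    xty = [0.0] * n
    for dist, size, latency in samples:
        row = (1.0, float(dist), float(size))
        for i in range(n):
            xty[i] += row[i] * latency
            for j in range(n):
                xtx[i][j] += row[i] * row[j]
    # Solve with Gauss-Jordan elimination and partial pivoting.
    aug = [xtx[i] + [xty[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                for c in range(col, n + 1):
                    aug[r][c] -= factor * aug[col][c]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def predict_latency(model, seek_distance, io_size):
    """Apply the fitted coefficients to one prospective IO."""
    b0, b1, b2 = model
    return b0 + b1 * seek_distance + b2 * io_size
```

Feeding it the (seek distance, size, latency) tuples printed by the profiling loop yields coefficients that can then be evaluated per IO at prediction time.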

19 MittSSD: Too complex to model!
The FTL is invisible to the OS: which channel/chip does an IO map to? Is it fast? Is it busy?
Write latency variability: pages on lower vs. upper MLC bits take 1ms vs. 2ms.
Dynamic GC is invisible.

20 MittSSD
Software-defined flash: FTL at host (LightNVM) on an Open-Channel SSD.
OS knows where IOs are mapped.
OS can see #outstanding IOs to every chip/channel.
OS knows page-level latencies (1ms vs. 2ms pages).
OS can track every single IO.
OS can capture all GCs.
Accurate prediction.
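As a rough sketch of why host-side visibility makes prediction tractable, the model below keeps the pending work per chip and rejects an IO whose chip is too busy. The class, its methods, and the idea of treating GC as just another submitted operation are assumptions for illustration, not the MittSSD implementation.

```python
class ChipQueueModel:
    """Tracks the outstanding work (in ms) queued on each flash chip."""

    def __init__(self, num_chips):
        self.pending_ms = [0.0] * num_chips

    def submit(self, chip, service_ms):
        # Called for every IO (and, in this sketch, every GC operation)
        # that the host maps to this chip.
        self.pending_ms[chip] += service_ms

    def complete(self, chip, service_ms):
        # Called when the operation finishes.
        self.pending_ms[chip] -= service_ms

    def should_reject(self, chip, service_ms, slo_ms):
        # Predicted latency = work already queued on the chip + this IO.
        return self.pending_ms[chip] + service_ms > slo_ms
```

Because the host FTL decides the chip mapping, the `chip` argument is known before the IO is issued, which is exactly the information a device-side FTL hides.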

21 Reject/Latency Prediction
Reject? = f(SLO, queue policy, device type)
Queue policy: reverse engineering based on source code.
Device type, simple: profiling is enough. MittCFQ, ~1800 LOC.
Device type, complicated: white-box knowledge required. MittSSD, ~1400 LOC, LightNVM + Open-Channel SSD.

22 Other Solved Challenges
Prediction overhead optimizations: avoids going through every IO in the queue; reduces overhead from O(n) to roughly O(1); shows < 5μs overhead for MittCFQ prediction and < 300ns for MittSSD prediction.
MittCache: prediction for OS cache.
Please refer to the paper!
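One way to avoid scanning the queue on every prediction is to maintain a single running value that is updated as IOs arrive. The sketch below tracks the time at which the device is expected to become free; it is an assumed technique chosen for illustration (with hypothetical names), and the talk does not say that MittOS does exactly this.

```python
class NextFreeTracker:
    """O(1) wait prediction for a FIFO device via a 'next free time' value."""

    def __init__(self):
        self.next_free_ms = 0.0  # when all accepted IOs are expected to finish

    def predicted_wait_ms(self, now_ms):
        # Constant time: no walk over the queued IOs.
        return max(0.0, self.next_free_ms - now_ms)

    def submit(self, now_ms, service_ms):
        # A new IO starts when the device frees up, or now if it is idle.
        start = max(now_ms, self.next_free_ms)
        self.next_free_ms = start + service_ms
```

Each accepted IO costs one comparison and one addition, and each prediction costs one subtraction, regardless of queue depth.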

23 Outline
Introduction; Design; Evaluation (Tail reduction, Latency prediction accuracy); Conclusion

24 MittCFQ-powered MongoDB
Setup: remote YCSB clients #1 to #20 issue get() to physical nodes #1 to #20. 3 replicas, so get() can fail over. Noisy neighbors based on EC2 data.
Metric: CDF of all get() request latencies (total 6 million data points).

25 Baseline
[Figure: latency CDF (percentile) of the baseline, from the 90th percentile up; "Better!" marks the improving direction.]
13ms at p95; > 40ms above p98.
Next slides: use a 13ms deadline SLO for Hedged and MittCFQ.

26 Clone
Tail reduction, but worse performance below p95.

27 Hedged Requests
(a) Sends first request. (b) Waits for 13ms timeout. (c) Sends secondary request. (d) Picks faster response.
Communications of the ACM, vol. 56 (2013), pp
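Steps (a) through (d) can be sketched with Python's `concurrent.futures`. This is an illustrative client-side sketch only: `hedged_get` and the representation of replicas as callables are assumptions, not part of any system named in the talk.

```python
import concurrent.futures as cf


def hedged_get(replicas, key, timeout_s):
    """Hedged request over two replicas.

    replicas: list of callables taking `key` (hypothetical replica clients).
    """
    pool = cf.ThreadPoolExecutor(max_workers=2)
    try:
        # (a) Send the first request.
        first = pool.submit(replicas[0], key)
        # (b) Wait up to the timeout for it to finish.
        done, _ = cf.wait([first], timeout=timeout_s)
        if done:
            return first.result()
        # (c) Timeout expired: send the secondary request.
        second = pool.submit(replicas[1], key)
        # (d) Return whichever response arrives first.
        done, _ = cf.wait([first, second], return_when=cf.FIRST_COMPLETED)
        return next(iter(done)).result()
    finally:
        # Do not block on the slower request.
        pool.shutdown(wait=False)
```

Note that the client always pays the full timeout in step (b) before the secondary is sent, which is the waiting cost that a fast reject avoids.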

28 Hedged
Wait, then secondary. Tail reduction with little extra workload.

29 MittCFQ
Hedged: wait, then secondary. MittCFQ: fast reject, then failover.
Cut ms tail! No-wait wins!

30 Tail amplified at Scale
A request fans out to S parallel get() calls (get() #1, get() #2, ..., get() #S).
[Figure: latency CDFs for MittCFQ, Hedged, and Base.]
Up to 2x speedup above p75. Tail amplified & more improvement space.
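The amplification effect follows from simple probability: if each get() is slow with probability p and the S calls are independent (an assumption of this sketch, as is the function name), the chance that at least one is slow is 1 - (1 - p)^S.

```python
def prob_request_hits_tail(p_slow, fanout):
    """Probability that at least one of `fanout` independent sub-requests is slow."""
    return 1.0 - (1.0 - p_slow) ** fanout
```

With p = 0.05, a single get() is slow 5% of the time, but a request that waits on 10 parallel get() calls is slow about 40% of the time, so the tail moves to lower percentiles as S grows.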

31 Accuracy Evaluation
MittCFQ (disk) and MittSSD (Open-Channel SSD).
Metrics:
False positive: IO rejected, but deadline is met.
False negative: deadline violated, but IO is not rejected.
5 real-world block-level traces: DAPPS, DTRS, TPCC, EXCH, LMBE.
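The two metrics can be computed from a replayed trace as in the sketch below. It assumes each record pairs the reject decision with the latency the IO actually experienced when allowed to run; the record layout and function name are hypothetical.

```python
def prediction_error_rates(records, slo_ms):
    """Return (false_positive_rate, false_negative_rate) over all IOs.

    records: list of (rejected, actual_latency_ms) pairs.
    """
    false_pos = 0  # rejected, but the deadline would have been met
    false_neg = 0  # not rejected, but the deadline was violated
    for rejected, actual_latency_ms in records:
        deadline_met = actual_latency_ms <= slo_ms
        if rejected and deadline_met:
            false_pos += 1
        elif not rejected and not deadline_met:
            false_neg += 1
    total = len(records)
    return false_pos / total, false_neg / total
```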

32 Accuracy Evaluation
Only < 1% inaccuracy!
Among incorrect cases: MittCFQ: < 3ms diff; MittSSD: < 1ms diff.

33 All in one
Riak; MongoDB + Filebench + Hadoop.
[Figure: results for MittCache, MittCFQ, and MittSSD; SLOs shown: 13ms, 20ms, 5ms.]

34 Conclusion
Fast reject (no-wait) interface + MittOS latency predictions.
[Figure: latency CDF, MittOS-powered apps vs. state of the art.] Cuts ms tail!
Thank you! Questions?

