Deconstructing Storage Arrays Timothy E. Denehy, John Bent, Florentina I. Popovici, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau University of Wisconsin,

1 Deconstructing Storage Arrays Timothy E. Denehy, John Bent, Florentina I. Popovici, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau University of Wisconsin, Madison

2 Gray-box Research Computer systems becoming more complex Transistors Lines of code Each component is becoming more complex Interactions between subsystems can affect Performance Reliability Power Security

3 Gray-box Research Interfaces remain the same Changes can be difficult and impractical Support multiple platforms or legacy systems Commercial acceptance for wide-spread adoption Hardware and software phenomenon IA-32 instruction set, POSIX OS, SCSI storage Problem: lack of information

4 Gray-box Solution Treat target system as a gray-box General characteristics are known Extract information from an existing interface e.g. determine cache contents Exploit information to control system behavior e.g. access cached data first

5 Gray-box Information Techniques Make assumptions about target system Observe system inputs and outputs Statistical methods Draw inferences about internal structure Microbenchmarks and probes Parameterize system components Observe system under controlled input

6 Gray-box Applications Gray-box techniques have been used to identify Memory hierarchy parameters [Saavedra and Smith] Processor cycle time [Staelin and McVoy] Low-level disk characteristics [Worthington et al.] Buffer cache replacement algorithms [Burnett et al.] File system data structures [Sivathanu et al.] storage array characteristics: Shear

7 Shear Software tool that automatically determines the important properties of a storage array Enables file system performance tuning with knowledge of storage array characteristics Acts as a management tool to help configure, monitor, and maintain storage arrays

8 Outline Introduction Shear Background Algorithm Case Studies Performance: Stripe-aligned Writes Management: Detecting Misconfiguration, Failure Conclusion

9 Shear Goals Determine storage array characteristics [Figure: logical blocks 0-31 as exposed through the SCSI interface]

10 Shear Goals Determine storage array characteristics Number of disks [Figure: logical blocks 0-31 as exposed through the SCSI interface]

11 Shear Goals Determine storage array characteristics Number of disks Chunk size [Figure: logical blocks 0-31 as exposed through the SCSI interface]

12 Shear Goals Determine storage array characteristics Number of disks Chunk size Layout and redundancy scheme [Figure: RAID-0 layout of logical blocks 0-31 across the disks, behind the SCSI interface]

13 Shear Goals Determine storage array characteristics Number of disks Chunk size Layout and redundancy scheme [Figure: RAID-1 layout of logical blocks 0-31 with mirrored copies, behind the SCSI interface]

14 Shear Goals Determine storage array characteristics Number of disks Chunk size Layout and redundancy scheme [Figure: RAID-5 layout of logical blocks with parity blocks (P), behind the SCSI interface]

15 Shear Motivation Performance Tune file systems to array characteristics Management Verify configuration Detect failure

16 Shear Techniques Microbenchmarks and probes Controlled, random access read and write patterns Measure response time of access patterns Measure steady-state performance Statistical clustering Automatically classify fast and slow regimes Identify patterns that utilize only a single disk
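The clustering step on this slide can be illustrated with a very small stand-in. The slides do not say which clustering method Shear uses, so the sketch below is only an assumption: it splits a set of response times into a fast and a slow group at the largest gap between sorted values. The function name `split_fast_slow` and the sample timings are hypothetical.

```python
def split_fast_slow(times):
    """Split measurements into fast and slow groups.

    Assumption: two regimes exist and are separated by the largest gap
    between neighbouring sorted values. This is a stand-in for whatever
    clustering the real tool applies.
    """
    ordered = sorted(times)
    if len(ordered) < 2:
        raise ValueError("need at least two measurements")
    gaps = [ordered[i + 1] - ordered[i] for i in range(len(ordered) - 1)]
    cut = gaps.index(max(gaps))
    # Place the threshold midway across the largest gap.
    threshold = (ordered[cut] + ordered[cut + 1]) / 2
    fast = [t for t in times if t <= threshold]
    slow = [t for t in times if t > threshold]
    return threshold, fast, slow


# Hypothetical response times in milliseconds.
threshold, fast, slow = split_fast_slow([5.1, 5.3, 4.9, 12.0, 5.0, 11.7])
```

In this reading, the slow group corresponds to access patterns that keep only a single disk busy.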

17 Shear Assumptions Storage array Layout follows a repeatable pattern Composed of homogeneous disks System Able to bypass the file system and buffer cache Little traffic from other processes

18 Outline Introduction Shear Background Algorithm Case Studies Performance: Stripe-aligned Writes Management: Detecting Misconfiguration, Failure Conclusion

19 Shear Algorithm Pattern size Chunk size Layout of chunks to disks Level of redundancy

20 Determining the Pattern Size Find the size of the layout's repeating pattern Not always the stripe size Choose a hypothetical pattern size Perform random reads at multiples of that distance Repeat for a range of pattern sizes Cluster results and identify actual pattern size
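The probing loop described on this slide can be tried against a simulated array. The sketch below is not the Shear implementation: it replaces timed disk reads with a toy cost model (the length of the longest per-disk queue), and `raid0_disk`, `probe_cost`, and `find_pattern_size` are hypothetical names. The simulated array matches the example that follows (RAID-0, 4 disks, 8 KB chunks).

```python
import random


def raid0_disk(offset_kb, disks=4, chunk_kb=8):
    """Disk that holds a given offset in a simulated RAID-0 array."""
    return (offset_kb // chunk_kb) % disks


def probe_cost(distance_kb, disk_of, requests=64, span=1000, seed=0):
    """Simulated completion time of random reads at multiples of a distance.

    Assumption: cost equals the longest per-disk queue, so a pattern that
    hits one disk only is the slowest possible.
    """
    rng = random.Random(seed)
    queue = {}
    for _ in range(requests):
        offset = rng.randrange(span) * distance_kb
        disk = disk_of(offset)
        queue[disk] = queue.get(disk, 0) + 1
    return max(queue.values())


def find_pattern_size(disk_of, candidates):
    """Smallest candidate distance whose reads all land on a single disk."""
    costs = {c: probe_cost(c, disk_of) for c in candidates}
    slowest = max(costs.values())
    size = min(c for c, cost in costs.items() if cost == slowest)
    return size, costs


size, costs = find_pattern_size(raid0_disk, range(2, 66, 2))
print(size, costs[8], costs[32])
```

Multiples of the true pattern size (here 64 KB) are just as slow as the pattern size itself, which is why the sketch takes the smallest slow candidate.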

21 Pattern Size Example RAID-0 4 Disks 8 KB Chunks

22 Pattern Size Example Testing 2 KB RAID-0 4 Disks 8 KB Chunks

23 Pattern Size Example Testing 4 KB RAID-0 4 Disks 8 KB Chunks

24 Pattern Size Example Testing 6 KB RAID-0 4 Disks 8 KB Chunks

25 Pattern Size Example Testing 8 KB RAID-0 4 Disks 8 KB Chunks

26 Pattern Size Example Testing 10 KB RAID-0 4 Disks 8 KB Chunks

27 Pattern Size Example Testing 12 KB RAID-0 4 Disks 8 KB Chunks

28 Pattern Size Example Testing 14 KB RAID-0 4 Disks 8 KB Chunks

29 Pattern Size Example Testing 16 KB RAID-0 4 Disks 8 KB Chunks

30 Pattern Size Example Testing 18 KB RAID-0 4 Disks 8 KB Chunks

31 Pattern Size Example Testing 20 KB RAID-0 4 Disks 8 KB Chunks

32 Pattern Size Example Testing 22 KB RAID-0 4 Disks 8 KB Chunks

33 Pattern Size Example Testing 24 KB RAID-0 4 Disks 8 KB Chunks

34 Pattern Size Example Testing 26 KB RAID-0 4 Disks 8 KB Chunks

35 Pattern Size Example Testing 28 KB RAID-0 4 Disks 8 KB Chunks

36 Pattern Size Example Testing 30 KB RAID-0 4 Disks 8 KB Chunks

37 Pattern Size Example Testing 32 KB RAID-0 4 Disks 8 KB Chunks

38 Pattern Size Example RAID-0 4 Disks 8 KB Chunks

39 Pattern Size Example RAID-0 4 Disks 8 KB Chunks Clustering result: actual pattern size 32 KB

40 Shear Algorithm Pattern size Chunk size Layout of chunks to disks Level of redundancy

41 Determining the Chunk Size Chunk size: the amount of data contiguously allocated to one disk Find the boundaries between disks Choose a hypothetical boundary offset Perform random reads on both sides of that offset Repeat for all offsets in the pattern size Cluster results and identify actual chunk size
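The boundary search can be sketched the same way on a simulated array. This is an illustration under assumptions, not the tool itself: reads alternate between the 2 KB step just before and just after each hypothetical boundary, cost is the longest per-disk queue, and all function names are hypothetical. A true chunk boundary spreads the reads over two disks, so it shows up as a fast offset.

```python
import random


def raid0_disk(offset_kb, disks=4, chunk_kb=8):
    """Disk that holds a given offset in a simulated RAID-0 array."""
    return (offset_kb // chunk_kb) % disks


def boundary_cost(boundary_kb, pattern_kb, disk_of,
                  step_kb=2, requests=64, seed=0):
    """Simulated cost of reads on both sides of a hypothetical boundary."""
    rng = random.Random(seed)
    queue = {}
    for i in range(requests):
        # Pick a random repetition of the pattern, then one side of the offset.
        base = rng.randrange(1, 1000) * pattern_kb + boundary_kb
        offset = base - step_kb if i % 2 == 0 else base
        disk = disk_of(offset)
        queue[disk] = queue.get(disk, 0) + 1
    return max(queue.values())


def find_chunk_size(pattern_kb, disk_of, step_kb=2):
    """Distance between the first two fast (two-disk) boundary offsets."""
    costs = {b: boundary_cost(b, pattern_kb, disk_of, step_kb)
             for b in range(0, pattern_kb, step_kb)}
    threshold = (min(costs.values()) + max(costs.values())) / 2
    boundaries = sorted(b for b, cost in costs.items() if cost < threshold)
    if len(boundaries) < 2:
        return pattern_kb, costs  # no internal boundary was detected
    return boundaries[1] - boundaries[0], costs


chunk_kb, costs = find_chunk_size(32, raid0_disk)
print(chunk_kb)
```

The midpoint threshold stands in for the clustering step; with real timings the fast and slow groups would be noisier.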

42 Chunk Size Example RAID-0 4 Disks 8 KB Chunks

43 Chunk Size Example Testing 0 KB RAID-0 4 Disks 8 KB Chunks

44 Chunk Size Example Testing 2 KB RAID-0 4 Disks 8 KB Chunks

45 Chunk Size Example Testing 4 KB RAID-0 4 Disks 8 KB Chunks

46 Chunk Size Example Testing 6 KB RAID-0 4 Disks 8 KB Chunks

47 Chunk Size Example Testing 8 KB RAID-0 4 Disks 8 KB Chunks

48 Chunk Size Example Testing 10 KB RAID-0 4 Disks 8 KB Chunks

49 Chunk Size Example Testing 12 KB RAID-0 4 Disks 8 KB Chunks

50 Chunk Size Example Testing 14 KB RAID-0 4 Disks 8 KB Chunks

51 Chunk Size Example Testing 16 KB RAID-0 4 Disks 8 KB Chunks

52 Chunk Size Example RAID-0 4 Disks 8 KB Chunks

53 Chunk Size Example RAID-0 4 Disks 8 KB Chunks Clustering result: actual chunk size 8 KB

54 Shear Algorithm Pattern size Chunk size Layout of chunks to disks Level of redundancy

55 Determining the Read Layout Find mapping of chunks to disks Choose a pair of chunks in the pattern Perform random reads to both chunks Repeat for all pairs of chunks Cluster results and identify chunks on same disk
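A minimal sketch of the pairwise test, again on a simulated array rather than real disks. The `ZIGZAG` table encodes the zig-zag example used in the following slides (chunks 0 and 7 on one disk, 1 and 6 on the next, and so on); the cost model and the function names are assumptions made for illustration.

```python
from itertools import combinations

# Simulated chunk-to-disk mapping for a 4-disk RAID-0 zig-zag layout.
ZIGZAG = {0: 0, 7: 0, 1: 1, 6: 1, 2: 2, 5: 2, 3: 3, 4: 3}


def pair_cost(a, b, disk_of_chunk, requests=64):
    """Simulated cost of reads alternating between two chunks."""
    queue = {}
    for i in range(requests):
        disk = disk_of_chunk(a if i % 2 == 0 else b)
        queue[disk] = queue.get(disk, 0) + 1
    return max(queue.values())


def find_read_layout(n_chunks, disk_of_chunk, requests=64):
    """Group chunks that behave as if they share a disk."""
    # A pair is "slow" when every request queues on one disk. With real
    # timings this test would be replaced by the clustering step.
    slow_pairs = {
        pair for pair in combinations(range(n_chunks), 2)
        if pair_cost(*pair, disk_of_chunk, requests) == requests
    }
    groups = []
    for chunk in range(n_chunks):
        for group in groups:
            if (group[0], chunk) in slow_pairs:
                group.append(chunk)
                break
        else:
            groups.append([chunk])
    return groups


print(find_read_layout(8, ZIGZAG.get))
```

Testing all pairs is quadratic in the number of chunks per pattern, which is acceptable here because the pattern is small.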

56 Read Layout Example 0 7 RAID-0 ZIG-ZAG 4 Disks 1 6 2 5 3 4

57 Read Layout Example 0 7 RAID-0 ZIG-ZAG 4 Disks 1 6 2 5 3 4 Testing { 0, 0 }

58 Read Layout Example RAID-0 ZIG-ZAG 4 Disks 1 6 2 5 3 4 Testing { 0, 1 } 0 7

59 Read Layout Example RAID-0 ZIG-ZAG 4 Disks 1 6 2 5 3 4 Testing { 0, 2 } 0 7

60 Read Layout Example RAID-0 ZIG-ZAG 4 Disks 1 6 2 5 3 4 Testing { 0, 3 } 0 7

61 Read Layout Example RAID-0 ZIG-ZAG 4 Disks 1 6 2 5 3 4 Testing { 0, 4 } 0 7

62 Read Layout Example RAID-0 ZIG-ZAG 4 Disks 1 6 2 5 3 4 Testing { 0, 5 } 0 7

63 Read Layout Example RAID-0 ZIG-ZAG 4 Disks 1 6 2 5 3 4 Testing { 0, 6 } 0 7

64 Read Layout Example 0 7 RAID-0 ZIG-ZAG 4 Disks 1 6 2 5 3 4 Testing { 0, 7 }

65 Read Layout Example 0 7 RAID-0 ZIG-ZAG 4 Disks 1 6 2 5 3 4 Testing { 1, 1 }

66 Read Layout Example 0 7 RAID-0 ZIG-ZAG 4 Disks 2 5 3 4 Testing { 1, 2 } 1 6

67 Read Layout Example 0 7 RAID-0 ZIG-ZAG 4 Disks 2 5 3 4 Testing { 1, 3 } 1 6

68 Read Layout Example 0 7 RAID-0 ZIG-ZAG 4 Disks 2 5 3 4 Testing { 1, 4 } 1 6

69 Read Layout Example 0 7 RAID-0 ZIG-ZAG 4 Disks 2 5 3 4 Testing { 1, 5 } 1 6

70 Read Layout Example 0 7 RAID-0 ZIG-ZAG 4 Disks 2 5 3 4 Testing { 1, 6 } 1 6

71 Read Layout Example 0 7 RAID-0 ZIG-ZAG 4 Disks 2 5 3 4 Testing { 1, 7 } 1 6

72 Read Layout Example 0 7 RAID-0 ZIG-ZAG 4 Disks 2 5 3 4 1 6

73 Read Layout Example RAID-0 ZIG-ZAG 4 Disks Clustering result: actual chunk groups { 0, 7 } { 1, 6 } { 2, 5 } { 3, 4 }

74 Shear Algorithm Pattern size Chunk size Layout of chunks to disks Level of redundancy

75 Determining Level of Redundancy Ratio of read to write bandwidth reveals the type of redundancy in the array Expected R/W ratios: RAID-0: 1 (no redundancy) RAID-1: 2 (mirroring) RAID-4: varies (examine write layout) RAID-5: 4 (parity)
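The ratio test lends itself to a short sketch. The expected ratios below are the ones listed on the slide; picking the nearest one is an assumption about how a measurement might be matched, and the bandwidth figures are hypothetical. RAID-4 is left out because the slide says its ratio varies and the write layout has to be examined instead.

```python
# Expected read/write bandwidth ratios, taken from the slide.
EXPECTED_RATIO = {"RAID-0": 1.0, "RAID-1": 2.0, "RAID-5": 4.0}


def guess_redundancy(read_bw, write_bw):
    """Return the level whose expected R/W ratio is closest to the measured one."""
    if write_bw <= 0:
        raise ValueError("write bandwidth must be positive")
    ratio = read_bw / write_bw
    level = min(EXPECTED_RATIO, key=lambda name: abs(EXPECTED_RATIO[name] - ratio))
    return level, ratio


# Hypothetical bandwidth measurements in MB/s.
print(guess_redundancy(80.0, 41.0))
print(guess_redundancy(100.0, 26.0))
```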

76 Shear Experience Shear has been applied to: Linux software RAID (poor RAID-5 parity updates); Adaptec hardware RAID controller (implements RAID-5 left-asymmetric layout) – RAID-0 – RAID-1 – Chained Declustering – RAID-4 – RAID-5 – P+Q

77 Outline Introduction Shear Background Algorithm Case Studies Performance: Stripe-aligned Writes Management: Detecting Misconfiguration, Failure Conclusion

78 RAID-5 Performance Small writes on RAID-5 are problematic: they require two reads, a parity calculation, and two writes Writing in full stripes is more efficient [Figure: RAID-5 layout of logical blocks with parity blocks (P)]
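The cost of a small write follows from how parity is maintained: the new parity is the old parity XOR the old data XOR the new data. The sketch below works on in-memory byte strings only, and the helper names and I/O counts are illustrative rather than taken from any RAID implementation.

```python
def xor_blocks(*blocks):
    """Byte-wise XOR of equally sized blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def small_write(stripe, parity, index, new_data):
    """Update one data block: read data, read parity, write data, write parity."""
    old_data = stripe[index]
    new_parity = xor_blocks(parity, old_data, new_data)
    new_stripe = stripe[:index] + [new_data] + stripe[index + 1:]
    return new_stripe, new_parity, 4  # two reads plus two writes


def full_stripe_write(new_blocks):
    """Write every data block at once: parity needs no reads."""
    parity = xor_blocks(*new_blocks)
    return list(new_blocks), parity, len(new_blocks) + 1  # writes only


stripe = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_blocks(*stripe)
stripe, parity, ios = small_write(stripe, parity, 1, b"\xaa\xbb")
```

With three data blocks, three separate small writes cost 12 disk operations in this model, while one full-stripe write costs 4.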

79 Stripe-aligned Writes Overcome RAID-5 small write problem Modified Linux disk scheduler Groups writes into full stripes Aligns writes along stripe boundaries Approximately 20 lines of code Experiment Hardware RAID-5, 4 disks, 16 KB chunks Create 100 files of varying sizes
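The alignment idea can be shown outside the kernel. The sketch below is not the roughly 20-line scheduler change mentioned on the slide; it is a user-space illustration with a hypothetical function name. It assumes the experiment's array (RAID-5, 4 disks, 16 KB chunks) holds three data chunks per stripe, giving 48 KB of data per stripe.

```python
def split_on_stripes(start_kb, length_kb, stripe_kb=48):
    """Cut a write into pieces that never cross a stripe boundary.

    Assumption: stripe_kb is the data capacity of one stripe, here
    3 data chunks of 16 KB on a 4-disk RAID-5 array.
    """
    pieces = []
    end = start_kb + length_kb
    while start_kb < end:
        # Next stripe boundary strictly after the current position.
        boundary = (start_kb // stripe_kb + 1) * stripe_kb
        stop = min(boundary, end)
        pieces.append((start_kb, stop - start_kb))
        start_kb = stop
    return pieces


pieces = split_on_stripes(40, 112)
full = [p for p in pieces if p[1] == 48]
print(pieces, len(full))
```

Pieces that cover a whole stripe can be written without reading old data or parity; only the unaligned head and tail still pay the small-write cost.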

80 Stripe-aligned Writes Experiment Simple modification has a large impact

81 Detecting Misconfigurations Software RAID, 4 Disks, 8 KB Chunks What if one disk is accidentally used twice? [Figure panels, correct configuration: RAID 5-LS, RAID 5-LA, RAID 5-RS, RAID 5-RA]

82 Detecting Misconfigurations [Figure panels, correct vs. misconfigured: RAID 5-LS, RAID 5-LA, RAID 5-RS, RAID 5-RA]

83 Detecting Failures Software RAID RAID-5 LS 10 disks 8 KB chunks

84 Detecting Failures Software RAID RAID-5 LS 10 disks 8 KB chunks Disk 5 fails

85 Outline Introduction Shear Background Algorithm Case Studies Performance: Stripe-aligned Writes Management: Detecting Misconfiguration, Failure Conclusion

86 Gray-box research Extract / exploit information from existing interfaces Shear Extracts information Automatically determines storage array properties Exploits information File system performance tuning Storage management

87 Questions? http://www.cs.wisc.edu/adsl/

