File System Benchmarking
Advanced Research Computing
Outline
- IO benchmarks: micro-benchmarks and synthetic benchmarks
- What is benchmarked
- Benchmark results for:
  - Shelter NFS server, client on hokiespeed
  - NetApp FAS 3240 server, clients on hokiespeed and blueridge
  - EMC Isilon X400, clients on blueridge
IO BENCHMARKING
IO Benchmarks
- Micro-benchmarks: measure one basic operation in isolation
  - Read and write throughput: dd, IOzone, IOR
  - Metadata operations (file create, stat, remove): mdtest
  - Good for: tuning an operation, system acceptance
- Synthetic benchmarks: a mix of operations that models real applications
  - Useful if they are good models of real applications
  - Examples: kernel build, kernel tar and untar, NAS BT-IO
IO Benchmark pitfalls
- Not measuring what you want to measure: results can be masked by various caching and buffering mechanisms
- Examples of different behaviors:
  - Sequential bandwidth vs. random IO bandwidth
  - Direct IO bandwidth vs. bandwidth in the presence of the page cache (in the latter case an fsync is needed)
  - Caching of file attributes: stat-ing a file on the same node on which it was written
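The page-cache pitfall above can be demonstrated directly with dd. This is a sketch only: the file path is a placeholder and the sizes are kept tiny so it runs anywhere.

```shell
# Buffered write: dd returns as soon as the data is in the page cache,
# so the reported bandwidth can be far higher than the disk can sustain.
dd if=/dev/zero of=./ddtest.bin bs=1M count=8 2>/dev/null

# conv=fsync flushes to stable storage before dd exits, so the timing
# includes the real write-back cost.
dd if=/dev/zero of=./ddtest.bin bs=1M count=8 conv=fsync 2>/dev/null

# oflag=direct bypasses the page cache entirely (O_DIRECT); not every
# filesystem supports it, hence the fallback message.
dd if=/dev/zero of=./ddtest.bin bs=1M count=8 oflag=direct 2>/dev/null \
  || echo "O_DIRECT not supported on this filesystem"

rm -f ./ddtest.bin
```

Comparing the elapsed times of the first two commands shows how much of the "bandwidth" is really just memory speed.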
What is benchmarked
What we measure is the combined effect of:
- The native file system on the NFS server (shelter)
- NFS server performance, which depends on factors such as enabling/disabling write-delay and the number of server threads
  - Too few threads: the client retries several times
  - Too many threads: server thrashing
- The network between the compute cluster and the NFS server
- NFS client and mount options
  - Synchronous or asynchronous
  - Attribute caching enabled or disabled
Micro-benchmarks
- IOzone – measures read/write bandwidth
  - Historical benchmark with the ability to test multiple readers/writers
- dd – measures read/write bandwidth
  - Tests file write/read
- mdtest – measures metadata operations per second
  - file/directory create/stat/remove
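Typical invocations of the three tools might look like the following. This is a sketch: file sizes are scaled down, and the iozone and mdtest calls are guarded because those tools (and an MPI launcher for mdtest) are assumed to be installed.

```shell
# dd: single-stream write bandwidth (only 64 MB here; the actual runs
# used multi-gigabyte files).
dd if=/dev/zero of=./bench.dat bs=1M count=64 conv=fsync 2>/dev/null

# IOzone: write/rewrite test (-i 0), 1 MB records, 64 MB file, 4 threads.
command -v iozone >/dev/null \
  && iozone -i 0 -r 1m -s 64m -t 4 \
  || echo "iozone not installed"

# mdtest: metadata rates; parameters match the mdtest slide below.
command -v mdtest >/dev/null && command -v mpirun >/dev/null \
  && mpirun -np 2 mdtest -z 1 -b 3 -I 256 -N 1 -i 5 \
  || echo "mdtest/mpirun not installed"

rm -f ./bench.dat
```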
mdtest – metadata test
- Measures the rate of file/directory create, stat, and remove operations
- mdtest creates a tree of files and directories
- Parameters used:
  - Tree depth: z = 1
  - Branching factor: b = 3
  - Number of files/directories per tree node: I = 256
  - Stat run from a different node than the create node: N = 1
  - Number of repeats of the run: i = 5
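The parameters above map onto an mdtest command line roughly as follows. This is a sketch: the target directory is a placeholder, and mdtest plus an MPI launcher are assumed to be installed (hence the guard).

```shell
# -z 1: tree depth, -b 3: branching factor, -I 256: items per tree node,
# -N 1: stat from a different rank than the one that created the item,
# -i 5: repeat the whole run 5 times.
command -v mdtest >/dev/null && command -v mpirun >/dev/null \
  && mpirun -np 2 mdtest -z 1 -b 3 -I 256 -N 1 -i 5 -d /tmp/mdtest \
  || echo "mdtest/mpirun not installed"

# With depth 1 and branching 3 the tree has 1 + 3 = 4 directory nodes,
# so each iteration creates 4 * 256 = 1024 files (and as many directories).
echo $(( (1 + 3) * 256 ))
```

The final echo prints 1024, the number of files touched per iteration under these parameters.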
Synthetic benchmarks
- tar-untar-rm – measures time
  - Tests creation and deletion of a large number of small files
  - Tests filesystem metadata creation and deletion
- NAS BT-IO – measures bandwidth and time doing IO
  - Solves a block tri-diagonal linear system arising from the discretization of the Navier-Stokes equations
Kernel source tar-untar-rm
- Run on 1 to 32 nodes
- Tarball size: 890 MB
- Total directories: 4732
- Max directory depth: 10
- Total files: 75984
- Max file size: 919 kB
- File size distribution:
  - <= 1 kB: 14490 files
  - <= 10 kB: 40190 files
  - <= 100 kB: 20518 files
  - <= 1 MB: 786 files
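The tar-untar-rm cycle can be sketched as below, with a tiny synthetic tree standing in for the ~890 MB kernel source; paths and sizes are illustrative only.

```shell
# Build a small directory tree of small files.
mkdir -p tree/a tree/b
for i in 1 2 3 4; do
    echo "data $i" > "tree/a/f$i"
    echo "data $i" > "tree/b/f$i"
done

time tar -czf tree.tar.gz tree   # tar: many small-file reads
rm -rf tree
time tar -xzf tree.tar.gz        # untar: many small-file creates (metadata-heavy)
time rm -rf tree tree.tar.gz     # rm: pure metadata deletes
```

On an NFS mount, the untar and rm phases are dominated by per-file metadata round trips rather than data bandwidth, which is what this test is designed to expose.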
NAS BT-I/O
- Test mechanism:
  - BT is a simulated CFD application that uses an implicit algorithm to solve the 3-dimensional compressible Navier-Stokes equations. The finite-difference solution is based on an Alternating Direction Implicit (ADI) approximate factorization that decouples the x, y and z dimensions. The resulting systems are block-tridiagonal with 5x5 blocks and are solved sequentially along each dimension.
  - BT-I/O tests different parallel I/O techniques in BT
- What it measures:
  - Multiple cores doing I/O to a single large file (blocking collective MPI calls mpi_file_write_at_all and mpi_file_read_at_all)
  - I/O timing percentage, total data written, I/O data rate
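Building and running BT-IO from the NPB MPI suite could look like this. A sketch only: the NPB source directory name and the "full" MPI-IO subtype follow NPB 3.x conventions, and the run is guarded since the suite may not be present.

```shell
# SUBTYPE=full selects the collective MPI-IO version (the
# mpi_file_*_at_all calls named above).
if [ -d NPB3.3-MPI ]; then
    ( cd NPB3.3-MPI \
      && make bt NPROCS=4 CLASS=C SUBTYPE=full \
      && mpirun -np 4 bin/bt.C.4.mpi_io_full )
else
    echo "NPB sources not present"
fi

# BT requires a square number of MPI ranks: the Class C run above uses
# 4 = 2*2, and the Class D runs later in the deck use 361 = 19*19.
echo $(( 19 * 19 ))
```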
SHELTER NFS RESULTS
dd throughput (MB/sec)
- Run on 1 to 32 nodes
- Two block sizes: 1 MB and 4 MB
- Three file sizes per block size: 1/5/15 GB for 1 MB blocks, 4/20/60 GB for 4 MB blocks

Block size  File size  Average  Median  Stdev
1M          1G         8.01     6.10    4.58
1M          5G         7.75     5.95    4.52
1M          15G        5.74     5.60    0.34
4M          4G         11.17    11.80   2.87
4M          20G        15.71    12.70   10.68
4M          60G        14.60    10.50   9.22
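Rows of the table above correspond to dd runs of roughly this shape. A sketch: the output path is a placeholder, and the counts are scaled down here so the snippet runs anywhere (the real runs wrote gigabytes).

```shell
# 1 MB blocks (a 1 GB file would be count=1024; scaled to 16 here).
# conv=fsync makes the timing include the flush to the server.
dd if=/dev/zero of=./ddfile bs=1M count=16 conv=fsync 2>/dev/null

# 4 MB blocks (a 4 GB file would be count=1024; scaled to 4 here).
dd if=/dev/zero of=./ddfile bs=4M count=4 conv=fsync 2>/dev/null

rm -f ./ddfile
```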
dd throughput (MB/sec)
IOZone write throughput
IOZone write vs read (single thread)
Mdtest file/directory create rate
Mdtest file/directory remove rate
Mdtest file/directory stat rate
Tar-untar-rm time (sec)

tar:
                    real    user  sys
Average             781.27  1.35  10.41
Median              —       1.66  13.08
Standard deviation  644.16  0.44  3.39

untar:
                    real   user  sys
Average             —      1.51  18.02
Median              —      —     17.90
Standard deviation  99.03  0.06  0.62

rm:
                    real    user  sys
Average             227.48  0.22  3.91
Median              216.28  —     3.87
Standard deviation  64.21   0.02  0.16
BT-IO Results

Attribute                                      Class C          Class D
Problem size                                   162 x 162 x 162  —
Iterations                                     200              250
Number of processes                            4                361
I/O timing percentage                          13.44            91.66
Total data written to a single file (MB)       —                —
I/O data rate (MB/sec)                         94.99            73.45
Data written/read per I/O instance,
  per processor, into a single file (MB/core)  42.5             7.5
NETAPP FAS 3240 RESULTS
Server and Clients
- NAS server: NetApp FAS 3240
- Clients running on two clusters: Hokiespeed and Blueridge
- Hokiespeed: Linux kernel compile, tar-untar, and rm tests were run both with nodes spread uniformly over racks and with consecutive nodes (rack-packed)
- Blueridge: Linux kernel compile, tar-untar, and rm tests were run on consecutive nodes
IOzone read and write throughput (KB/s)
Hokiespeed
dd bandwidth (MB/sec)
- Two node placement policies: packed on a rack, spread across racks
- Direct IO was used
- Two operations: read and write
- Two block sizes: 1 MB and 4 MB
- Three file sizes: 1 GB, 5 GB, 15 GB
- Results show throughput in MB/s
dd read throughput (MB/sec), 1MB blocks
Hokiespeed (nodes spread, nodes packed); BlueRidge (nodes packed)
dd read throughput (MB/sec), 4 MB blocks
Hokiespeed (nodes spread, nodes packed); BlueRidge (nodes packed)
dd write throughput (MB/sec), 1MB blocks
BlueRidge (nodes packed); Hokiespeed (nodes spread, nodes packed)
dd write throughput (MB/sec), 4MB blocks
Hokiespeed (nodes spread, nodes packed); BlueRidge (nodes packed)
Linux Kernel tests
- Two node placement policies: packed on a rack, spread across racks
- Operations:
  - Compile: make -j 12
  - Tar creation and extraction
  - Remove directory tree
- Results show times in seconds
Linux Kernel compile time (sec)

Hokiespeed, nodes spread:
nodes  real   user  sys
1      817    4968  1096
2      990    5014  1138
4      993    5223  1171
8      939    5143  1167
16     1318   5112  1198
32     2561   5087  1183
64     4985   5111  1209

BlueRidge, nodes packed:
nodes  real  user  sys
1      694   4589  951
2      1092  4572  993
4      2212  4631  1038
8      4451  4691  1073
16     5636  4716  1098
32     5999  4702  1111
64     6609  4699  1089

Hokiespeed, nodes packed:
nodes  real   user  sys
1      733    5001  1116
2      1546   5086  1233
4      3189   5146  1273
8      6343   5219  1317
16     9476   5251  1366
32     10012  5255  1339
Tar extraction time (sec)

Hokiespeed, nodes spread:
nodes  real  user  sys
1      143   1.05  9.5
2      125   0.98  9.4
4      144   1.04  9.8
8      149   —     —
16     216   1.08  10.4
32     399   1.23  12.5
64     809   1.42  15.0

BlueRidge, nodes packed:
nodes  real  user  sys
1      98    0.6   6.6
2      103   —     —
4      106   —     6.5
8      130   0.7   7.1
16     217   0.8   9.1
32     406   1.2   13
64     818   1.1   14

Hokiespeed, nodes packed:
nodes  real  user  sys
1      167   1.0   9.5
2      172   0.98  —
4      177   1.06  9.6
8      202   1.03  9.7
16     312   1.09  10.2
32     421   1.18  11.9
Rm execution time (sec)

Hokiespeed, nodes spread:
nodes  real  user  sys
1      20    0.12  2.5
2      21    0.15  2.7
4      25    0.16  2.8
8      33    0.17  —
16     123   0.22  3.7
32     284   0.24  4.0
64     650   0.27  4.4

BlueRidge, nodes packed:
nodes  real    user  sys
1      19.21   0.07  1.69
2      19.14   0.10  —
4      26.68   0.11  1.98
8      63.75   0.16  3.16
16     152.59  0.22  4.24
32     324.90  0.26  4.98
64     699.04  0.25  5.06

Hokiespeed, nodes packed:
nodes  real  user  sys
1      21    0.14  2.84
2      22    —     2.82
4      —     0.15  2.80
8      47    0.18  3.30
16     135   0.21  3.85
32     248   0.23  4.01
64     811   0.27  4.54
Uplink switch traffic, runs on hokiespeed
Nodes spread Nodes packed
Mdtest file/directory create rate
IO ops/sec for mdtest -z 1 -b 3 -I 256 -i 10 -N 1 (Hokiespeed and BlueRidge)
Mdtest file/directory remove rate
IO ops/sec for mdtest -z 1 -b 3 -I 256 -i 10 -N 1 (Hokiespeed and BlueRidge)
Mdtest file/directory stat rate
IO ops/sec for mdtest -z 1 -b 3 -I 256 -i 10 -N 1 (Hokiespeed and BlueRidge)
NAS BT-IO results

Class: D
Iterations: 250 (I/O after every 5 steps)
Number of jobs: 50
Total data size written/read: 6.5 TB (50 files of 135 GB each)

                                     HokieSpeed       BlueRidge
Nodes per job                        3                4
Total number of cores                1800             3200
Average I/O time (hours)             5.175 / 5.85     5.3 / 5.5
Average I/O time (% of total)        92.6 / 93.4      92.7 / 96.6
Average Mop/s/process                80.6 / 72        79.6 / 44.5
Average I/O rate per node (MB/s)     2.44 / 2.15      2.34 / 1.71
Total I/O rate (MB/s)                357.64 / 323.02  359.8 / 343.42
Uplink switch traffic for BT-IO on hokiespeed
The numbered boxes (1, 2, 3) mark the three NAS BT-IO runs. Red is write; green is read.
EMC Isilon X400 RESULTS
dd bandwidth (MB/sec)
- Runs on BlueRidge, no special node placement policy
- Direct IO was used
- Two operations: read and write
- Two block sizes: 1 MB and 4 MB
- Three file sizes: 1 GB, 5 GB, 15 GB
- Results show throughput in MB/s
dd read throughput (MB/sec), 1MB blocks
EMC Isilon NetApp
dd read throughput (MB/sec), 4 MB blocks
Isilon NetApp
dd write throughput (MB/sec), 1MB blocks
Isilon NetApp
dd write throughput (MB/sec), 4MB blocks
Isilon NetApp
Linux Kernel tests
- Runs on BlueRidge, no special node placement policy
- Direct IO was used
- Operations:
  - Compile: make -j 12
  - Tar creation and extraction
  - Remove directory tree
- Results show times in seconds
Linux Kernel compile time (sec)

Isilon:
nodes  real  user  sys
1      701   4584  957
2      1094  4558  989
4      2228  4631  1038
8      4642  4713  1084
16     5860  4723  1107
32     6655  4754  1120
64     7181  4760  1113

NetApp:
nodes  real  user  sys
1      694   4589  951
2      1092  4572  993
4      2212  4631  1038
8      4451  4691  1073
16     5636  4716  1098
32     5999  4702  1111
64     6609  4699  1089
Tar creation time (sec)

Isilon:
nodes  real  user  sys
1      32    0.50  4.45
2      —     0.51  4.54
4      —     0.47  4.39
8      —     0.48  4.38
16     33    0.49  4.28
32     35    —     4.19
64     57    —     4.20

NetApp:
nodes  real  user  sys
1      30    0.51  4.50
2      —     0.49  4.46
4      34    0.50  4.51
8      41    —     4.45
16     62    0.54  —
32     116   0.60  4.83
64     238   0.89  7.10
Tar extraction time (sec)

Isilon:
nodes  real  user  sys
1      230   0.65  10.1
2      234   0.62  10.3
4      237   0.63  10.4
8      255   0.64  10.5
16     300   0.67  10.9
32     431   0.74  11.8
64     754   0.87  14.1

NetApp:
nodes  real  user  sys
1      98    0.6   6.6
2      103   —     —
4      106   —     6.5
8      130   0.7   7.1
16     217   0.8   9.1
32     406   1.2   13
64     818   1.1   14
Rm execution time (sec)

Isilon:
nodes  real  user  sys
1      110   0.23  4.76
2      113   0.24  4.80
4      124   —     4.82
8      158   —     4.85
16     234   0.25  4.93
32     340   0.26  4.99
64     655   —     5.27

NetApp:
nodes  real  user  sys
1      19.2  0.07  1.69
2      19.1  0.10  —
4      26.7  0.11  1.98
8      63.7  0.16  3.16
16     152   0.22  4.24
32     324   0.26  4.98
64     699   0.25  5.06
IOZone write throughput (KB/s) Isilon
Buffered IO/BlueRidge Direct IO/BlueRidge
IOZone read throughput (KB/s) Isilon
Buffered IO/BlueRidge Direct IO/BlueRidge
Iozone write throughput (KB/s)
Isilon/BlueRidge NetApp/HokieSpeed
IOzone read throughput (KB/s)
Isilon/BlueRidge NetApp/HokieSpeed
Thank you.