1
Squeezing Every Drop of Ink out of Ceph
By Bryan Stillwell - June 28, 2016
2
Agenda
Welcome (food/drinks)
Ceph updates
Presentation
Q&A
Discussions
3
Ceph Updates
Current release: Jewel (10.2.2)
  CephFS stable
  RGW: re-architected multisite code
  RGW: NFS access
  RBD: asynchronous mirroring
  RADOS: BlueStore preview (no checksums)
Current dev series: Kraken (Fall 2016)
  BlueStore stable (default in Luminous)
  Compression support
  Modify/update support for erasure-coded pools
4
RADOS Bench
rados bench -p rbd 60 write -b 4096 -t 16

Total time run:
Total writes made:
Write size:
Bandwidth (MB/sec):
Stddev Bandwidth:
Max bandwidth (MB/sec): 7.625
Min bandwidth (MB/sec): 0
Average Latency:
Stddev Latency:
Max latency:
Min latency:

rados -p rbd cleanup --prefix bench
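Not on the original slide: if the write run is started with --no-cleanup, the benchmark objects stay in the pool, and rados bench can then replay sequential and random reads against them before running the cleanup command above. A minimal sketch against the same rbd pool:

rados bench -p rbd 60 write -b 4096 -t 16 --no-cleanup
rados bench -p rbd 60 seq -t 16
rados bench -p rbd 60 rand -t 16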
5
FIO RBD engine
Flexible I/O tester (written by Jens Axboe)
Needs Ceph devel packages to build with RBD support

$ rbd create testvol --size 65536
$ ./fio --ioengine=rbd --clientname=admin --pool=rbd \
    --rbdname=testvol --rw=write --bs=4k --iodepth=16 \
    --numjobs=1 --size=60G --runtime=60 \
    --group_reporting --name=test
6
FIO RBD engine

test: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=rbd, iodepth=16
fio-2.12
Starting 1 process
rbd engine: RBD version: 0.1.9
Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/5248KB/0KB /s] [0/1312/0 iops] [eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=150137: Tue Jun 28 22:16:
  write: io=316672KB, bw=5277.8KB/s, iops=1319, runt= 60009msec
    slat (usec): min=0, max=780, avg= 1.74, stdev=10.33
    clat (msec): min=6, max=33, avg=12.10, stdev= 1.79
     lat (msec): min=6, max=33, avg=12.10, stdev= 1.79
    clat percentiles (usec):
     |  1.00th=[ 8768],  5.00th=[ 9536], 10.00th=[10176], 20.00th=[10688],
     | 30.00th=[11200], 40.00th=[11584], 50.00th=[11968], 60.00th=[12224],
     | 70.00th=[12608], 80.00th=[13248], 90.00th=[14272], 95.00th=[15296],
     | 99.00th=[17792], 99.50th=[18816], 99.90th=[20608], 99.95th=[23680],
     | 99.99th=[33536]
    bw (KB /s): min= 4736, max= 5748, per=100.00%, avg= , stdev=212.39
    lat (msec) : 10=8.47%, 20=91.34%, 50=0.18%
  cpu          : usr=0.40%, sys=0.07%, ctx=9703, majf=0, minf=70
  IO depths    : 1=6.2%, 2=12.5%, 4=25.0%, 8=50.0%, 16=6.2%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=94.1%, 8=0.0%, 16=5.9%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=79168/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
  WRITE: io=316672KB, aggrb=5277KB/s, minb=5277KB/s, maxb=5277KB/s, mint=60009msec, maxt=60009msec

Disk stats (read/write):
    dm-0: ios=0/999, merge=0/0, ticks=0/264, in_queue=264, util=0.15%, aggrios=0/445, aggrmerge=0/584, aggrticks=0/88, aggrin_queue=88, aggrutil=0.14%
  sda: ios=0/445, merge=0/584, ticks=0/88, in_queue=88, util=0.14%
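The run above is a sequential 4k write test; the same rbd engine invocation can be pointed at random I/O for comparison. A sketch under the same assumptions as the slide (testvol exists, admin keyring is readable); only --rw and --name change:

$ ./fio --ioengine=rbd --clientname=admin --pool=rbd \
    --rbdname=testvol --rw=randwrite --bs=4k --iodepth=16 \
    --numjobs=1 --size=60G --runtime=60 \
    --group_reporting --name=randtest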
7
COSBench
Good for testing RGW (also supports librados)
8
CBT (Ceph Benchmarking Tool)
Automated testing of different configurations
Recent Ceph Tech Talk on CBT by Kyle Bader
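Not from the slide: CBT is driven by a YAML file that describes the cluster and the benchmarks to run, and is invoked roughly like this (the archive path and the mycluster.yaml name are placeholders):

# run the benchmarks described in the YAML file and archive the results
./cbt.py --archive=/tmp/cbt-results ./mycluster.yaml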
9
Single node cluster
Building block
Removes networking from the equation
Test disk controller limits

ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt
sed -ie 's/\(step chooseleaf firstn 0 type\) host/\1 osd/' crushmap.txt
crushtool -c crushmap.txt -o crushmap.bin
ceph osd setcrushmap -i crushmap.bin
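An alternative not shown on the slide: if the single-node cluster hasn't been created yet, setting the default chooseleaf type in ceph.conf avoids hand-editing the CRUSH map, since the initial CRUSH rule will then replicate across OSDs instead of hosts:

# ceph.conf, [global] section, set before cluster creation
osd crush chooseleaf type = 0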
10
Journals
Don't co-locate them except on SSDs
Test your HDD:SSD ratio to make sure the SSDs don't become a bottleneck
Ensure your SSDs have good direct I/O performance (see the fio sketch below)
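A common way to check that (not from the slide; /dev/sdX is a placeholder) is a small-block O_DIRECT/sync write test with fio. Note this writes directly to the raw device, so only point it at a scratch SSD:

# destructive: writes directly to the raw device
fio --filename=/dev/sdX --direct=1 --sync=1 --rw=write --bs=4k \
    --numjobs=1 --iodepth=1 --runtime=60 --time_based --name=journal-test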
11
JBOD Mode (aka HBA Mode)
Enable this on RAID controllers
Can see a 2x performance improvement
12
Memory allocators
tcmalloc (currently linked against)
jemalloc

Preload trick:
# /etc/default/ceph
TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=
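To confirm which allocator your Ceph binaries are actually linked against (a quick check, not from the slide; adjust the binary path for your distro):

ldd /usr/bin/ceph-osd | grep -E 'tcmalloc|jemalloc'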
13
CPU Governors
Definitely affects SSD clusters
Set to performance for best results (watch your power usage)
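One way to check and change the governor (not from the slide; cpupower ships in a distro-specific package such as linux-tools or kernel-tools):

# show the current governor for every core
cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
# switch all cores to the performance governor
cpupower frequency-set -g performance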
14
Process pinning
Use on large systems with multiple HBAs
Great talk at OpenStack Austin: "Designing for High Performance Ceph at Scale" by John Benton & James Saint-Rossy
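A hedged sketch of the idea (not from the slide; the NUMA node and OSD id are placeholders): find which NUMA node owns each HBA (e.g. with hwloc's lstopo), then start each OSD with its CPUs and memory bound to that node:

# run OSD 12 in the foreground, pinned to NUMA node 0
numactl --cpunodebind=0 --membind=0 ceph-osd -i 12 -f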
15
BlueStore
Biggest thing to happen to Ceph in a long time
Double performance on the same hardware (removes the double-write penalty)
Good Ceph Tech Talk last week by Sage
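At the time of this talk (Jewel), BlueStore was still an experimental preview; trying it out looked roughly like the following (hedged, check the release notes for your exact version; /dev/sdX is a placeholder):

# ceph.conf on the OSD hosts: Jewel-era experimental gate for BlueStore
enable experimental unrecoverable data corrupting features = bluestore rocksdb

# prepare an OSD with a BlueStore backend instead of FileStore
ceph-disk prepare --bluestore /dev/sdX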