CEPH: Ceph installation using ceph-deploy on CentOS 7
Initial Setup
- 10 nodes, all running CentOS 7
- SSH keys set up between hosts
- Firewall disabled (out of laziness)
- SELinux disabled
- /etc/hosts synced across the servers (node names reflect their purpose below: ceph-mon1, ceph-osd1, etc.)
- Had to add: yum install redhat-lsb-core
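For reference, the synced /etc/hosts would look something like the fragment below. Only ceph-mon1's address (172.28.55.6) appears later in this document; the OSD addresses here are made-up examples.

```
# /etc/hosts fragment pushed to every node; ceph-mon1's address matches
# the monmap shown later, the OSD addresses are hypothetical
172.28.55.6   ceph-mon1
172.28.55.11  ceph-osd1
172.28.55.12  ceph-osd2
172.28.55.13  ceph-osd3
172.28.55.14  ceph-osd4
172.28.55.15  ceph-osd5
```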
Install ceph-deploy
wget http://download.ceph.com/rpm/el7/noarch/ceph-release-1-1.el7.noarch.rpm
rpm -ivh ceph-release-1-1.el7.noarch.rpm
yum install ceph-deploy
Setup the Systems with CEPH
Before we go deploying and configuring Ceph, we need to install the RPMs on the nodes we'll be using. This just installs Ceph on the nodes; you'll see a lot of debug output as the process progresses.
# Note on OpenHPC; Perform the following if you've already installed OpenStack/Liberty
# $ yum-config-manager --enable epel
# $ yum-config-manager --disable centos-openstack-liberty
# Also, make sure the hostname on each node matches ceph-mon1, ceph-osd{1,2,3} etc.
# not even sure I needed the --release arg; works sequentially, room for improvement!
ceph-deploy install --release hammer ceph-osd{1,2,3,4,5}
ceph-deploy install --release hammer ceph-mon1
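As the comment above notes, the installs run one node at a time; a small loop over the node names tidies this up. A sketch, printing the commands rather than running them (drop the echo to execute for real):

```shell
# Print one ceph-deploy install per node; remove the echo to run for real.
for node in ceph-mon1 ceph-osd{1,2,3,4,5}; do
  echo ceph-deploy install --release hammer "$node"
done
```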
# Note: if you don't change the node names, the services will start up incorrectly; stop/remove them and run ceph-deploy again. Your process IDs will be different.
# hostname ceph-mon1
# systemctl | grep ceph
# systemctl stop ceph-mon.head.1461319199.566181162.service
# systemctl stop ceph-mon.ceph-mon1.1461320175.884595648.service
# systemctl disable ceph-mon.ceph-mon1.1461320175.884595648.service
# systemctl disable ceph-mon.head.1461319199.566181162.service
# ceph-deploy --overwrite-conf mon create ceph-mon1
Create the cluster and setup mon node(s)
ceph-deploy new ceph-mon1
ceph-deploy mon create ceph-mon1
# if I had multiple mons:
ceph-deploy --cluster rbdcluster new ceph-mon{1,2,3}
Before you can provision a host to run OSDs or metadata servers, you must gather the monitor keys and the OSD and MDS bootstrap keyrings. To gather keys, enter the following:
ceph-deploy gatherkeys ceph-mon1
# when no longer using ceph-deploy, or when restarting the install: ceph-deploy forgetkeys
Create the OSDs
Check the disks and prepare them all for Ceph OSD installation
ceph-deploy disk list ceph-osd1
ceph-deploy disk zap ceph-osd1:sd{a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z,aa,ab,ac,ad,ae,af,ag,ah,ai}
ceph-deploy osd prepare ceph-osd1:sd{a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v}:/dev/sdd # the last entry is the journal for the OSDs, usually an SSD
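The node:disk:journal triplets above come from plain shell brace expansion; you can preview exactly what ceph-deploy will receive without touching the cluster. A hypothetical three-disk example:

```shell
# Expand a few node:disk:journal arguments without running ceph-deploy
printf '%s\n' ceph-osd1:sd{a,b,c}:/dev/sdd
# ceph-osd1:sda:/dev/sdd
# ceph-osd1:sdb:/dev/sdd
# ceph-osd1:sdc:/dev/sdd
```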
# on one occasion, the prepare didn't work (firewall issue), so follow-up commands had to be issued:
# ceph-deploy osd activate ceph-osd1:sdc1:sda1 # and note, prepare would have stated /dev/sda (block device), and activate would have /dev/sda1 (partition)
# ceph-deploy osd activate ceph-osd2:sdc1:sda1
# ceph-deploy osd activate ceph-osd3:sdc1:sda1
Monitor the system while loading;
[root@ceph-mon1 ~]# ceph -w
cluster 9180ea1b-1342-479c-b7b4-63296f195d1c
health HEALTH_WARN
too few PGs per OSD (1 < min 30)
monmap e1: 1 mons at {ceph-mon1=172.28.55.6:6789/0}
election epoch 2, quorum 0 ceph-mon1
osdmap e1001: 122 osds: 121 up, 121 in
pgmap v3263: 64 pgs, 1 pools, 0 bytes data, 0 objects
10714 MB used, 659 TB / 659 TB avail
64 active+clean
# lots of output about balancing
2016-03-03 17:19:51.843294 mon.0 [INF] pgmap v3272: 4160 pgs: 440 creating, 2844 creating+peering, 789 creating+activating, 87 active+clean; 0 bytes data, 10737 MB used, 659 TB / 659 TB avail
2016-03-03 17:19:53.261868 mon.0 [INF] pgmap v3273: 4160 pgs: 2 active, 1421 creating+peering, 2507 creating+activating, 230 active+clean; 0 bytes data, 10772 MB used, 659 TB / 659 TB avail
2016-03-03 17:19:54.332496 mon.0 [INF] pgmap v3274: 4160 pgs: 2 active, 1351 creating+peering, 2573 creating+activating, 234 active+clean; 0 bytes data, 10773 MB used, 659 TB / 659 TB avail
You'll need to give Ceph a while to balance. Check the output of the following to verify all OSDs are up;
[root@ceph-mon1 ~]# ceph osd tree
ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 659.89941 root default
-2 120.11981 host ceph-osd1
0 5.45999 osd.0 up 1.00000 1.00000
1 5.45999 osd.1 up 1.00000 1.00000
2 5.45999 osd.2 up 1.00000 1.00000
3 5.45999 osd.3 up 1.00000 1.00000
4 5.45999 osd.4 up 1.00000 1.00000
5 5.45999 osd.5 up 1.00000 1.00000
6 5.45999 osd.6 up 1.00000 1.00000
7 5.45999 osd.7 up 1.00000 1.00000
8 5.45999 osd.8 up 1.00000 1.00000
9 5.45999 osd.9 up 1.00000 1.00000
10 5.45999 osd.10 up 1.00000 1.00000
11 5.45999 osd.11 up 1.00000 1.00000
12 5.45999 osd.12 up 1.00000 1.00000
13 5.45999 osd.13 up 1.00000 1.00000
14 5.45999 osd.14 up 1.00000 1.00000
15 5.45999 osd.15 up 1.00000 1.00000
16 5.45999 osd.16 up 1.00000 1.00000
17 5.45999 osd.17 up 1.00000 1.00000
18 5.45999 osd.18 up 1.00000 1.00000
19 5.45999 osd.19 up 1.00000 1.00000
20 5.45999 osd.20 up 1.00000 1.00000
21 5.45999 osd.21 up 1.00000 1.00000
-3 125.57980 host ceph-osd2
22 5.45999 osd.22 up 1.00000 1.00000
23 5.45999 osd.23 up 1.00000 1.00000
24 5.45999 osd.24 up 1.00000 1.00000
25 5.45999 osd.25 up 1.00000 1.00000
26 5.45999 osd.26 up 1.00000 1.00000
27 5.45999 osd.27 up 1.00000 1.00000
28 5.45999 osd.28 up 1.00000 1.00000
29 5.45999 osd.29 up 1.00000 1.00000
30 5.45999 osd.30 up 1.00000 1.00000
31 5.45999 osd.31 up 1.00000 1.00000
32 5.45999 osd.32 up 1.00000 1.00000
33 5.45999 osd.33 up 1.00000 1.00000
34 5.45999 osd.34 up 1.00000 1.00000
35 5.45999 osd.35 up 1.00000 1.00000
36 5.45999 osd.36 up 1.00000 1.00000
39 5.45999 osd.39 up 1.00000 1.00000
42 5.45999 osd.42 up 1.00000 1.00000
46 5.45999 osd.46 up 1.00000 1.00000
49 5.45999 osd.49 up 1.00000 1.00000
52 5.45999 osd.52 up 1.00000 1.00000
55 5.45999 osd.55 up 1.00000 1.00000
58 5.45999 osd.58 up 1.00000 1.00000
61 5.45999 osd.61 up 1.00000 1.00000
-4 136.24992 host ceph-osd3
37 5.45000 osd.37 up 1.00000 1.00000
40 5.45000 osd.40 up 1.00000 1.00000
43 5.45000 osd.43 up 1.00000 1.00000
45 5.45000 osd.45 up 1.00000 1.00000
48 5.45000 osd.48 up 1.00000 1.00000
51 5.45000 osd.51 up 1.00000 1.00000
54 5.45000 osd.54 up 1.00000 1.00000
57 5.45000 osd.57 up 1.00000 1.00000
60 5.45000 osd.60 up 1.00000 1.00000
63 5.45000 osd.63 up 1.00000 1.00000
65 5.45000 osd.65 up 1.00000 1.00000
67 5.45000 osd.67 up 1.00000 1.00000
69 5.45000 osd.69 up 1.00000 1.00000
71 5.45000 osd.71 up 1.00000 1.00000
88 5.45000 osd.88 up 1.00000 1.00000
91 5.45000 osd.91 up 1.00000 1.00000
94 5.45000 osd.94 up 1.00000 1.00000
96 5.45000 osd.96 up 1.00000 1.00000
97 5.45000 osd.97 up 1.00000 1.00000
98 5.45000 osd.98 up 1.00000 1.00000
99 5.45000 osd.99 up 1.00000 1.00000
101 5.45000 osd.101 up 1.00000 1.00000
103 5.45000 osd.103 up 1.00000 1.00000
105 5.45000 osd.105 up 1.00000 1.00000
107 5.45000 osd.107 up 1.00000 1.00000
-5 141.69992 host ceph-osd4
38 5.45000 osd.38 up 1.00000 1.00000
41 5.45000 osd.41 up 1.00000 1.00000
44 5.45000 osd.44 up 1.00000 1.00000
47 5.45000 osd.47 up 1.00000 1.00000
50 5.45000 osd.50 up 1.00000 1.00000
53 5.45000 osd.53 up 1.00000 1.00000
56 5.45000 osd.56 up 1.00000 1.00000
59 5.45000 osd.59 up 1.00000 1.00000
62 5.45000 osd.62 up 1.00000 1.00000
64 5.45000 osd.64 up 1.00000 1.00000
66 5.45000 osd.66 up 1.00000 1.00000
68 5.45000 osd.68 up 1.00000 1.00000
70 5.45000 osd.70 up 1.00000 1.00000
72 5.45000 osd.72 up 1.00000 1.00000
73 5.45000 osd.73 up 1.00000 1.00000
75 5.45000 osd.75 up 1.00000 1.00000
76 5.45000 osd.76 up 1.00000 1.00000
77 5.45000 osd.77 up 1.00000 1.00000
79 5.45000 osd.79 up 1.00000 1.00000
81 5.45000 osd.81 up 1.00000 1.00000
83 5.45000 osd.83 up 1.00000 1.00000
85 5.45000 osd.85 up 1.00000 1.00000
87 5.45000 osd.87 up 1.00000 1.00000
90 5.45000 osd.90 up 1.00000 1.00000
93 5.45000 osd.93 up 1.00000 1.00000
95 5.45000 osd.95 up 1.00000 1.00000
-6 136.24992 host ceph-osd5
78 5.45000 osd.78 up 1.00000 1.00000
80 5.45000 osd.80 up 1.00000 1.00000
82 5.45000 osd.82 up 1.00000 1.00000
84 5.45000 osd.84 up 1.00000 1.00000
86 5.45000 osd.86 up 1.00000 1.00000
89 5.45000 osd.89 up 1.00000 1.00000
92 5.45000 osd.92 up 1.00000 1.00000
100 5.45000 osd.100 up 1.00000 1.00000
102 5.45000 osd.102 up 1.00000 1.00000
104 5.45000 osd.104 up 1.00000 1.00000
106 5.45000 osd.106 up 1.00000 1.00000
108 5.45000 osd.108 up 1.00000 1.00000
109 5.45000 osd.109 up 1.00000 1.00000
110 5.45000 osd.110 up 1.00000 1.00000
111 5.45000 osd.111 up 1.00000 1.00000
112 5.45000 osd.112 up 1.00000 1.00000
113 5.45000 osd.113 up 1.00000 1.00000
114 5.45000 osd.114 up 1.00000 1.00000
115 5.45000 osd.115 up 1.00000 1.00000
116 5.45000 osd.116 up 1.00000 1.00000
117 5.45000 osd.117 up 1.00000 1.00000
118 5.45000 osd.118 up 1.00000 1.00000
119 5.45000 osd.119 up 1.00000 1.00000
120 5.45000 osd.120 up 1.00000 1.00000
121 5.45000 osd.121 up 1.00000 1.00000
74 0 osd.74 down 0 1.00000
And check ceph status for the health of the ceph cluster
[root@ceph-mon1 ~]# ceph status
cluster 9180ea1b-1342-479c-b7b4-63296f195d1c
health HEALTH_WARN
too few PGs per OSD (1 < min 30)
monmap e1: 1 mons at {ceph-mon1=172.28.55.6:6789/0}
election epoch 2, quorum 0 ceph-mon1
osdmap e1001: 122 osds: 121 up, 121 in
pgmap v3263: 64 pgs, 1 pools, 0 bytes data, 0 objects
10714 MB used, 659 TB / 659 TB avail
64 active+clean
Setup the PGs (Placement Groups)
At this stage we are still getting a warning about the PGs (placement groups). The default pg_num is 64, and we have over 120 OSDs in this configuration. The placement groups page, http://docs.ceph.com/docs/master/rados/operations/placement-groups/, suggests we need a much larger PG number, so let's increase it (or add another pool).
How many PGs you need for a pool:
Total PGs = (OSDs * 100) / Replicas
[root@ceph-mon1 ~]# ceph osd stat
osdmap e1004: 122 osds: 121 up, 121 in
Applying the formula gives us: (121 * 100) / 3 = 4033
Now round this value up to the next power of 2; this gives you the number of PGs you should have for a pool with a replication size of 3 and 121 OSDs in the entire cluster.
Final Value = 4096 PG
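The formula and the power-of-two rounding can be sanity-checked with a quick shell snippet:

```shell
# Total PGs = (OSDs * 100) / Replicas, rounded up to the next power of 2
osds=121 replicas=3
pgs=$(( osds * 100 / replicas ))      # 4033
pow=1
while [ "$pow" -lt "$pgs" ]; do pow=$(( pow * 2 )); done
echo "$pow"                           # 4096
```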
Check the current (Default) pools
[root@ceph-mon1 ~]# ceph osd lspools
0 rbd,
[root@ceph-mon1 ~]# ceph osd dump | grep rbd
pool 0 'rbd' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
[root@ceph-mon1 ~]# ceph osd pool get rbd pg_num
pg_num: 64
From the above we can see;
- replication factor 3 (default value)
- pg size 64 (default)
Create a new pool (with the optimal PGs)
Let's create another pool (with a large pg_num), then give it a few minutes and check the health again;
[root@ceph-mon1 ~]# ceph osd pool create pooldp 4096
pool 'pooldp' created
[root@ceph-mon1 ~]# ceph -w
cluster 9180ea1b-1342-479c-b7b4-63296f195d1c
health HEALTH_WARN
2527 pgs peering
4096 pgs stuck inactive
4096 pgs stuck unclean
monmap e1: 1 mons at {ceph-mon1=172.28.55.6:6789/0}
election epoch 2, quorum 0 ceph-mon1
osdmap e1004: 122 osds: 121 up, 121 in
pgmap v3271: 4160 pgs, 2 pools, 0 bytes data, 0 objects
10721 MB used, 659 TB / 659 TB avail
2527 creating+peering
1219 creating
350 creating+activating
64 active+clean
2016-03-03 17:19:48.468333 mon.0 [INF] pgmap v3270: 4160 pgs: 1389 creating, 2371 creating+peering, 336 creating+activating, 64 active+clean; 0 bytes data, 10720 MB used, 659 TB / 659 TB avail
2016-03-03 17:19:50.586989 mon.0 [INF] pgmap v3271: 4160 pgs: 1219 creating, 2527 creating+peering, 350 creating+activating, 64 active+clean; 0 bytes data, 10721 MB used, 659 TB / 659 TB avail
2016-03-03 17:19:51.843294 mon.0 [INF] pgmap v3272: 4160 pgs: 440 creating, 2844 creating+peering, 789 creating+activating, 87 active+clean; 0 bytes data, 10737 MB used, 659 TB / 659 TB avail
2016-03-03 17:19:53.261868 mon.0 [INF] pgmap v3273: 4160 pgs: 2 active, 1421 creating+peering, 2507 creating+activating, 230 active+clean; 0 bytes data, 10772 MB used, 659 TB / 659 TB avail
2016-03-03 17:19:54.332496 mon.0 [INF] pgmap v3274: 4160 pgs: 2 active, 1351 creating+peering, 2573 creating+activating, 234 active+clean; 0 bytes data, 10773 MB used, 659 TB / 659 TB avail
2016-03-03 17:19:56.689413 mon.0 [INF] pgmap v3275: 4160 pgs: 2 active, 756 creating+peering, 3084 creating+activating, 318 active+clean; 0 bytes data, 10784 MB used, 659 TB / 659 TB avail
2016-03-03 17:19:58.209627 mon.0 [INF] pgmap v3276: 4160 pgs: 38 active, 148 creating+peering, 2457 creating+activating, 1517 active+clean; 0 bytes data, 10823 MB used, 659 TB / 659 TB avail
2016-03-03 17:19:59.274988 mon.0 [INF] pgmap v3277: 4160 pgs: 50 active, 4 creating+peering, 2401 creating+activating, 1705 active+clean; 0 bytes data, 10829 MB used, 659 TB / 659 TB avail
2016-03-03 17:20:01.569975 mon.0 [INF] pgmap v3278: 4160 pgs: 50 active, 2 creating+peering, 2009 creating+activating, 2099 active+clean; 0 bytes data, 10834 MB used, 659 TB / 659 TB avail
2016-03-03 17:20:02.814976 mon.0 [INF] pgmap v3279: 4160 pgs: 13 active, 454 creating+activating, 3693 active+clean; 0 bytes data, 10860 MB used, 659 TB / 659 TB avail
2016-03-03 17:20:03.886496 mon.0 [INF] pgmap v3280: 4160 pgs: 39 creating+activating, 4121 active+clean; 0 bytes data, 10868 MB used, 659 TB / 659 TB avail
2016-03-03 17:20:06.025433 mon.0 [INF] pgmap v3281: 4160 pgs: 34 creating+activating, 4126 active+clean; 0 bytes data, 10869 MB used, 659 TB / 659 TB avail
2016-03-03 17:20:07.187744 mon.0 [INF] pgmap v3282: 4160 pgs: 6 creating+activating, 4154 active+clean; 0 bytes data, 10870 MB used, 659 TB / 659 TB avail
2016-03-03 17:20:08.283504 mon.0 [INF] pgmap v3283: 4160 pgs: 4160 active+clean; 0 bytes data, 10870 MB used, 659 TB / 659 TB avail
^C
[root@ceph-mon1 ~]# ceph status
cluster 9180ea1b-1342-479c-b7b4-63296f195d1c
health HEALTH_OK
monmap e1: 1 mons at {ceph-mon1=172.28.55.6:6789/0}
election epoch 2, quorum 0 ceph-mon1
osdmap e1004: 122 osds: 121 up, 121 in
pgmap v3283: 4160 pgs, 2 pools, 0 bytes data, 0 objects
10870 MB used, 659 TB / 659 TB avail
4160 active+clean
[root@ceph-mon1 ~]#
Good, we now have a healthy Ceph cluster!
Benchmark your Ceph Cluster (RADOS / Objects)
Ceph includes the rados bench command, designed specifically to benchmark a RADOS storage cluster. To use it, create a storage pool and then use rados bench to perform a write benchmark, as shown below.
The rados command is included with Ceph.
[root@ceph-mon1 ~]# ceph osd pool create scbench 100 100
pool 'scbench' created
[root@ceph-mon1 ~]# rados bench -p scbench 60 write --no-cleanup
Maintaining 16 concurrent writes of 4194304 bytes for up to 60 seconds or 0 objects
Object prefix: benchmark_data_ceph-mon1_8522
sec Cur ops started finished avg MB/s cur MB/s last lat avg lat
0 0 0 0 0 0 - 0
1 16 24 8 31.9888 32 0.488635 0.679065
2 16 48 32 63.9853 96 0.925952 0.710474
3 16 75 59 78.652 108 0.800361 0.705261
4 16 98 82 81.9867 92 0.89478 0.693056
5 16 124 108 86.3872 104 0.482141 0.674745
6 16 146 130 86.6546 88 0.50371 0.670405
7 16 171 155 88.5595 100 0.38974 0.664332
8 16 196 180 89.9882 100 0.800197 0.678677
9 16 216 200 88.8775 80 0.219502 0.662146
10 16 240 224 89.5888 96 0.560353 0.681194
11 16 266 250 90.898 104 0.590875 0.674103
12 16 290 274 91.3225 96 0.362163 0.677343
13 16 315 299 91.9892 100 0.45172 0.671456
14 16 341 325 92.8464 104 1.52322 0.668701
15 16 364 348 92.7895 92 0.342974 0.672035
16 16 386 370 92.4896 88 0.415546 0.668979
17 16 414 398 93.6366 112 0.380463 0.671389
18 16 438 422 93.7675 96 0.507864 0.667457
19 16 465 449 94.5161 108 0.955241 0.662551
2016-03-03 18:07:07.368978min lat: 0.219502 max lat: 2.02651 avg lat: 0.659791
sec Cur ops started finished avg MB/s cur MB/s last lat avg lat
20 16 490 474 94.7898 100 0.326693 0.659791
21 16 515 499 95.0373 100 0.454321 0.659829
22 16 542 526 95.626 108 0.484665 0.656761
23 16 566 550 95.6419 96 0.438458 0.656717
24 16 591 575 95.8231 100 0.367075 0.653703
25 16 614 598 95.6689 92 0.344375 0.654561
26 16 644 628 96.6044 120 0.489909 0.653584
27 16 668 652 96.5816 96 0.345536 0.653503
28 16 693 677 96.7033 100 0.440913 0.654526
29 16 715 699 96.403 88 0.387208 0.652397
30 16 742 726 96.7892 108 1.08176 0.653314
31 16 762 746 96.2474 80 0.497605 0.649928
32 16 789 773 96.6143 108 0.495238 0.65265
33 16 816 800 96.959 108 0.335065 0.653864
34 16 839 823 96.8129 92 0.833682 0.650921
35 16 866 850 97.1322 108 0.750654 0.65168
36 16 888 872 96.8783 88 0.628799 0.651708
37 16 911 895 96.7461 92 0.387538 0.651389
38 16 938 922 97.042 108 0.339795 0.653065
39 16 964 948 97.2201 104 0.488341 0.652364
2016-03-03 18:07:27.371173min lat: 0.219502 max lat: 2.02651 avg lat: 0.651527
sec Cur ops started finished avg MB/s cur MB/s last lat avg lat
40 16 989 973 97.2894 100 0.502734 0.651527
41 16 1013 997 97.2577 96 1.05962 0.652523
42 16 1035 1019 97.0371 88 0.470472 0.652811
43 16 1061 1045 97.1988 104 1.00319 0.652995
44 16 1085 1069 97.1714 96 0.817875 0.651415
45 16 1113 1097 97.5006 112 0.58483 0.65208
46 16 1139 1123 97.6417 104 0.275317 0.650259
47 16 1162 1146 97.5215 92 0.334055 0.649269
48 16 1190 1174 97.8229 112 0.486993 0.649918
49 16 1211 1195 97.5406 84 0.60307 0.649903
50 16 1237 1221 97.6696 104 0.492317 0.651083
51 16 1263 1247 97.7936 104 0.74506 0.65062
52 16 1285 1269 97.6051 88 1.03266 0.649712
53 16 1310 1294 97.6501 100 0.426457 0.649229
54 16 1335 1319 97.6934 100 0.552609 0.649796
55 16 1361 1345 97.8079 104 0.423608 0.649596
56 16 1380 1364 97.4184 76 0.965162 0.651008
57 16 1408 1392 97.674 112 0.523899 0.650617
58 16 1430 1414 97.5071 88 0.417439 0.650343
59 16 1454 1438 97.4814 96 0.467891 0.650477
2016-03-03 18:07:47.373053min lat: 0.219502 max lat: 2.02651 avg lat: 0.651066
sec Cur ops started finished avg MB/s cur MB/s last lat avg lat
60 16 1481 1465 97.6565 108 0.939101 0.651066
Total time run: 60.510059
Total writes made: 1482
Write size: 4194304
Bandwidth (MB/sec): 97.967
Stddev Bandwidth: 17.5802
Max bandwidth (MB/sec): 120
Min bandwidth (MB/sec): 0
Average Latency: 0.652469
Stddev Latency: 0.27032
Max latency: 2.02651
Min latency: 0.219502
Which seems about right for a 1 Gb link;
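A rough sanity check on that number: a 1 Gb/s link tops out at 125 MB/s before protocol overhead, so the ~98 MB/s sustained write bandwidth reported above is plausible:

```shell
# Theoretical ceiling of a 1 Gb/s link, in MB/s (1000 Mbit / 8)
link_mbit=1000
ceiling=$(( link_mbit / 8 ))                         # 125 MB/s
measured=98                                          # roughly the bandwidth reported above
echo "$(( measured * 100 / ceiling ))% of line rate" # 78% of line rate
```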
This creates a new pool named 'scbench' and then performs a write benchmark for 60 seconds. Notice the --no-cleanup option, which leaves behind some data. The output gives you a good indicator of how fast your cluster can write data.
Two types of read benchmarks are available: seq for sequential reads and rand for random reads. To perform a read benchmark, use the commands below:
rados bench -p scbench 10 seq # hmm this crashes on hammer?!
rados bench -p scbench 10 rand # this one runs ok
You can also add the -t parameter to increase the concurrency of reads and writes (defaults to 16 threads), or the -b parameter to change the size of the object being written (defaults to 4 MB). It's also a good idea to run multiple copies of this benchmark against different pools, to see how performance changes with multiple clients.
You can clean up the benchmark data left behind by the write benchmark with this command:
rados -p scbench cleanup
Benchmark your Ceph Cluster (RBD/Block)
Let's set up the block device benchmark;
[root@ceph-mon1 ~]# ceph osd pool create rbdbench 100 100
pool 'rbdbench' created
[root@ceph-mon1 ~]# rbd create image01 --size 1024 --pool rbdbench
[root@ceph-mon1 ~]# rbd map image01 --pool rbdbench --name client.admin
/dev/rbd0
[root@ceph-mon1 ~]# ls /dev/rbd
rbd/ rbd0
[root@ceph-mon1 ~]# ls /dev/rbd/rbdbench/image01 -lah
lrwxrwxrwx 1 root root 10 Mar 3 18:25 /dev/rbd/rbdbench/image01 -> ../../rbd0
[root@ceph-mon1 ~]# mkfs.ext4 -m0 /dev/rbd0
# snip
[root@ceph-mon1 ~]# mkdir /mnt/ceph-block-device
[root@ceph-mon1 ~]# mount /dev/rbd/rbdbench/image01 /mnt/ceph-block-device
[root@ceph-mon1 ~]# df -h
# snip
/dev/rbd0 976M 2.6M 958M 1% /mnt/ceph-block-device
And run the rbd tests;
[root@ceph-mon1 ~]# rbd bench-write image01 --pool=rbdbench
bench-write io_size 4096 io_threads 16 bytes 1073741824 pattern seq
SEC OPS OPS/SEC BYTES/SEC
1 13023 13039.19 53408507.58
2 26663 13116.23 53724063.44
3 41471 13829.29 56644767.46
4 55679 13754.19 56337163.38
5 69292 13695.71 56097642.82
6 82949 13956.94 57167624.18
7 96818 14124.95 57855797.69
8 110524 13692.56 56084731.58
9 125270 13974.68 57240275.09
10 137981 13874.42 56829606.72
11 151636 13722.28 56206447.11
12 164513 13541.25 55464962.50
13 177053 13421.49 54974406.85
14 190050 13032.00 53379086.79
15 204131 13260.21 54313806.63
16 218337 13346.96 54669135.15
17 233748 13847.05 56717512.07
18 246706 13930.60 57059725.75
19 260079 13985.05 57282745.63
elapsed: 19 ops: 262144 ops/sec: 13395.91 bytes/sec: 54869667.12
Let's try fio (which now has rbd support); we'll need to build the latest version with rbd support:
yum groupinstall 'Development tools'
yum install librbd1.x86_64 librbd1-devel.x86_64
yum install zlib-devel
yum install git
git clone git://git.kernel.dk/fio.git
cd fio/
./configure
# ensure the following;
Rados Block Device engine yes
rbd_invalidate_cache yes
make
With fio built and ready;
[root@ceph-mon1 fio]# cat rbd.fio
[global]
ioengine=rbd
clientname=admin
pool=rbdbench
rbdname=image01
rw=randwrite
bs=4k
[rbd_iodepth32]
iodepth=32
[root@ceph-mon1 fio]# ./fio ./rbd.fio
rbd_iodepth32: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=rbd, iodepth=32
fio-2.6-21-g8116
Starting 1 process
rbd engine: RBD version: 0.1.9
Jobs: 1 (f=1): [w(1)] [15.6% done] [0KB/3244KB/0KB /s] [0/811/0 iops] [eta 10m:48s]