CEPH: Ceph on the Blades

Environment

Dependencies

LINUX KERNEL

Ceph Kernel Client: We currently recommend:

  • v3.6.6 or later in the v3.6 stable series
  • v3.4.20 or later in the v3.4 stable series
  • btrfs: If you use the btrfs file system with Ceph, we recommend using a recent Linux kernel (v3.5 or later).

Testing Environment

OS: CentOS release 6.3 (Final)
Firewall Disabled

Server nodes
uname -a

Linux Blade3 2.6.32-279.el6.x86_64 #1 SMP Fri Jun 22 12:19:21 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

Client node
uname -a

Linux Blade8 3.8.8-1.el6.elrepo.x86_64 #1 SMP Wed Apr 17 16:47:58 EDT 2013 x86_64 x86_64 x86_64 GNU/Linux
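
The stock CentOS 6 kernel (2.6.32, as on the server nodes above) is older than the versions recommended for the Ceph kernel client, which is why the client blade runs an ELRepo mainline kernel. On CentOS 6 such a kernel can be installed roughly as follows; the elrepo-release RPM file name below is only an example and changes between releases:

rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
rpm -Uvh http://www.elrepo.org/elrepo-release-6-5.el6.elrepo.noarch.rpm   # example file name; check elrepo.org for the current one
yum --enablerepo=elrepo-kernel -y install kernel-ml
# make the new kernel the default entry in /boot/grub/grub.conf, then reboot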

Install the Ceph Bobtail release

On all the nodes

rpm --import 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc'
su -c 'rpm -Uvh http://ceph.com/rpm-bobtail/el6/x86_64/ceph-release-1-0.el6.noarch.rpm'
yum -y install ceph
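
To confirm that the Bobtail packages landed on every node, check the installed version (the exact point release reported depends on what the repository is serving at the time):

ceph -v
rpm -q ceph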

Configuration

Create the Ceph Configuration File

  • Location: /etc/ceph/ceph.conf
  • To be copied to all the nodes (server nodes and clients); see the copy example after the file listing.
[global]
	auth cluster required = none
	auth service required = none
	auth client required = none
[osd]
	osd journal size = 1000
	filestore xattr use omap = true
	osd mkfs type = ext4
	osd mount options ext4 = user_xattr,rw,noexec,nodev,noatime,nodiratime
[mon.a]
	host = blade3
	mon addr = <IP of blade3>:6789
[mon.b]
	host = blade4
	mon addr = <IP of blade4>:6789
[mon.c]
	host = blade5
	mon addr = <IP of blade5>:6789
[osd.0]
	host = blade3
[osd.1]
	host = blade4
[osd.2]
	host = blade5
[mds.a]
	host = blade3
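
A minimal way to push the file from the node where it was edited (blade3 here) to the rest of the cluster, assuming root SSH access to each blade:

for node in blade4 blade5 blade8; do   # add any further client blades, e.g. blade6
    scp /etc/ceph/ceph.conf root@${node}:/etc/ceph/ceph.conf
done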

Create the Ceph daemon working directories

  • The location and naming convention of the directories should be strictly followed.
ssh blade3 mkdir -p /var/lib/ceph/osd/ceph-0
ssh blade4 mkdir -p /var/lib/ceph/osd/ceph-1
ssh blade5 mkdir -p /var/lib/ceph/osd/ceph-2
ssh blade3 mkdir -p /var/lib/ceph/mon/ceph-a
ssh blade4 mkdir -p /var/lib/ceph/mon/ceph-b
ssh blade5 mkdir -p /var/lib/ceph/mon/ceph-c
ssh blade3 mkdir -p /var/lib/ceph/mds/ceph-a
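
The config above specifies ext4 mkfs/mount options for the OSDs, but since no devs are listed, the directories just created sit on whatever filesystem backs /var/lib/ceph. If each OSD should instead live on a dedicated disk, its data directory can be backed by an ext4 mount using the same options; a sketch, with /dev/sdb1 standing in for the blade's actual data disk:

mkfs.ext4 /dev/sdb1                       # placeholder device; use the blade's real data disk
mount -o user_xattr,rw,noexec,nodev,noatime,nodiratime /dev/sdb1 /var/lib/ceph/osd/ceph-0
echo '/dev/sdb1 /var/lib/ceph/osd/ceph-0 ext4 user_xattr,rw,noexec,nodev,noatime,nodiratime 0 0' >> /etc/fstab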

Run the mkcephfs command from a server node

  • Execute the following from a server node
mkcephfs -a -c /etc/ceph/ceph.conf -k ceph.keyring

Start the Ceph Cluster

  • Execute the following from a node that has passwordless SSH access to the other server nodes (a key setup sketch follows below).
service ceph -a start
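
If passwordless SSH is not already set up, root keys can be distributed along these lines from the controlling node (blade3 in this setup); this assumes root logins over SSH are permitted on the blades:

ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa   # skip if a key already exists
for node in blade4 blade5; do
    ssh-copy-id root@${node}
done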

Issues

  • The following issue has come up quite often when starting the cluster:
[root@Blade3 ~]# service ceph -a start
=== mon.a ===
=== mon.b ===
=== mon.c ===
=== mds.a ===
=== osd.0 ===
Starting Ceph osd.0 on blade3...
starting osd.0 at :/0 osd_data /var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal
=== osd.1 ===
Starting Ceph osd.1 on blade4...
global_init: unable to open config file from search list /tmp/ceph.conf.33b54ef1fee10259e92480001532cf78
failed: 'ssh blade4 ulimit -n 8192;  /usr/bin/ceph-osd -i 1 --pid-file /var/run/ceph/osd.1.pid -c /tmp/ceph.conf.33b54ef1fee10259e92480001532cf78 '
  • The remote server node is not picking up the correct temporary config file name.
  • Executing the following on the failed node should start the OSD daemon; in this case, run it for blade4.
  • Substitute the correct name of the ceph.conf file from the /tmp directory on that node.
ssh blade4 ulimit -n 8192;  /usr/bin/ceph-osd -i 1 --pid-file /var/run/ceph/osd.1.pid -c /tmp/ceph.conf.<XXX>

Verify Cluster Health

ceph status

 health HEALTH_OK
   monmap e1: 3 mons at {a=100.100.0.3:6789/0,b=100.100.0.4:6789/0,c=100.100.0.5:6789/0}, election epoch 8,  quorum 0,1,2 a,b,c
   osdmap e22: 3 osds: 3 up, 3 in
    pgmap v1198: 1544 pgs: 1544 active+clean; 15766 MB data, 40372 MB used, 100 GB / 147 GB avail
   mdsmap e9: 1/1/1 up {0=a=up:active}
ceph osd tree

# id    weight  type name       up/down reweight
-1      3       root default
-3      3               rack unknownrack
-2      1                       host blade3
0       1                               osd.0   up      1
-4      1                       host blade4
1       1                               osd.1   up      1
-5      1                       host blade5
2       1                               osd.2   up      1

Performance

Local disk benchmark

1G Ethernet (Blade6)

[root@Blade6 test]# dd if=/dev/zero of=here bs=1G count=2 oflag=direct
2+0 records in
2+0 records out
2147483648 bytes (2.1 GB) copied, 4.47646 s, 480 MB/s
[root@Blade6 test]# dd if=/dev/zero of=here bs=1G count=2 oflag=direct
2+0 records in
2+0 records out
2147483648 bytes (2.1 GB) copied, 4.53849 s, 473 MB/s
[root@Blade6 test]# dd if=/dev/zero of=here bs=1G count=2 oflag=direct
2+0 records in
2+0 records out
2147483648 bytes (2.1 GB) copied, 4.50215 s, 477 MB/s

10G Ethernet (Blade8)

[root@Blade8 test]# dd if=/dev/zero of=here bs=1G count=2 oflag=direct
2+0 records in
2+0 records out
2147483648 bytes (2.1 GB) copied, 4.58078 s, 469 MB/s
[root@Blade8 test]# dd if=/dev/zero of=here bs=1G count=2 oflag=direct
2+0 records in
2+0 records out
2147483648 bytes (2.1 GB) copied, 3.85319 s, 557 MB/s
[root@Blade8 test]# dd if=/dev/zero of=here bs=1G count=2 oflag=direct
2+0 records in
2+0 records out
2147483648 bytes (2.1 GB) copied, 4.51147 s, 476 MB/s

Evaluating the network

IPERF

1G Ethernet vs. 10G Ethernet: results not yet recorded.
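
A typical iperf throughput test between a server blade and the client looks roughly like this (iperf2 syntax; the addresses are placeholders for whichever interface is being measured):

# on a server blade (e.g. blade3): listen for test traffic
iperf -s

# on the client blade: run once against the 1G address and once against the 10G address
iperf -c <IP of blade3 on the 1G interface> -t 30
iperf -c <IP of blade3 on the 10G interface> -t 30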

NETCAT

1G Ethernet vs. 10G Ethernet: results not yet recorded.
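
A raw TCP throughput check with netcat and dd can be run along these lines; note that listener syntax differs between netcat variants (some builds need nc -l -p 12345):

# on the receiving blade: discard everything arriving on port 12345
nc -l 12345 > /dev/null

# on the sending blade: push 1 GB through the link under test and read dd's transfer rate
dd if=/dev/zero bs=1M count=1024 | nc <IP of receiving blade> 12345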

Ceph Benchmarks

1G Ethernet vs. 10G Ethernet: results not yet recorded.
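
A reasonable starting point is RADOS-level testing with rados bench, plus repeating the dd used for the local-disk numbers against a CephFS mount on the client; the pool name, duration and mount point below are only examples:

# 60-second RADOS write benchmark against the default data pool, keeping the objects
rados bench -p data 60 write --no-cleanup

# sequential read benchmark over the objects written above
rados bench -p data 60 seq

# mount CephFS on the client (cephx is disabled in this cluster) and rerun the dd test
mkdir -p /mnt/ceph
mount -t ceph <IP of blade3>:6789:/ /mnt/ceph
dd if=/dev/zero of=/mnt/ceph/here bs=1G count=2 oflag=direct   # drop oflag=direct if the kernel client refuses direct I/O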