CEPH: Ceph on the Blades

From Define Wiki

Revision as of 11:59, 3 May 2013

Environment

Dependencies

LINUX KERNEL

Ceph Kernel Client: We currently recommend:

  • v3.6.6 or later in the v3.6 stable series
  • v3.4.20 or later in the v3.4 stable series
  • btrfs: If you use the btrfs file system with Ceph, we recommend using a recent Linux kernel (v3.5 or later).
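The kernel recommendation above can be checked against the output of `uname -r`. The following is a minimal sketch, not part of the original procedure. The function names `version_ge` and `kernel_ok` are hypothetical, the comparison relies on GNU `sort -V`, and it assumes that any stable series newer than v3.6 also satisfies the recommendation.

```shell
# version_ge A B: succeed when version A >= version B (uses GNU `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# kernel_ok RELEASE: succeed when RELEASE (as printed by `uname -r`)
# meets the kernel client recommendation listed above.
kernel_ok() {
    ver=${1%%-*}    # strip the distro suffix: 3.8.8-1.el6.elrepo.x86_64 -> 3.8.8
    case "$ver" in
        3.4.*) version_ge "$ver" 3.4.20 ;;
        3.6.*) version_ge "$ver" 3.6.6 ;;
        *)     version_ge "$ver" 3.7 ;;   # assumption: newer series are acceptable
    esac
}
```

On a node, `kernel_ok "$(uname -r)" && echo "kernel client OK"` would report the result. With the kernels listed under Testing Environment, the Blade8 release string passes and the Blade3 release string does not.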

Testing Environment

OS: CentOS release 6.3 (Final)
Firewall Disabled

Server nodes
uname -a

Linux Blade3 2.6.32-279.el6.x86_64 #1 SMP Fri Jun 22 12:19:21 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

Client node
uname -a

Linux Blade8 3.8.8-1.el6.elrepo.x86_64 #1 SMP Wed Apr 17 16:47:58 EDT 2013 x86_64 x86_64 x86_64 GNU/Linux

Install the Ceph Bobtail release

On all the nodes

rpm --import 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc'
su -c 'rpm -Uvh http://ceph.com/rpm-bobtail/el6/x86_64/ceph-release-1-0.el6.noarch.rpm'
yum -y install ceph

Configuration

Create the Ceph Configuration File

  • Location: /etc/ceph/ceph.conf
  • Copy this file to all the nodes (server nodes and clients)
[global]
	auth cluster required = none
	auth service required = none
	auth client required = none
[osd]
	osd journal size = 1000
	filestore xattr use omap = true
	osd mkfs type = ext4
	osd mount options ext4 = user_xattr,rw,noexec,nodev,noatime,nodiratime
[mon.a]
	host = blade3
	mon addr = <IP of blade3>:6789
[mon.b]
	host = blade4
	mon addr = <IP of blade4>:6789
[mon.c]
	host = blade5
	mon addr = <IP of blade5>:6789
[osd.0]
	host = blade3
[osd.1]
	host = blade4
[osd.2]
	host = blade5
[mds.a]
	host = blade3
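
Copying the file to every node can be scripted from the host entries in the file itself. This is a rough sketch under stated assumptions: `list_hosts`, `push_conf`, and the `DRY_RUN` variable are hypothetical names introduced here, and passwordless SSH to each host is assumed. Client nodes are not listed in ceph.conf, so they still need to be copied to separately.

```shell
# list_hosts FILE: print each distinct "host = NAME" value in FILE, one per line.
list_hosts() {
    sed -n 's/^[[:space:]]*host[[:space:]]*=[[:space:]]*\([^[:space:]]*\).*$/\1/p' "$1" | sort -u
}

# push_conf FILE: copy FILE to the same path on every host it names.
# Set DRY_RUN=1 to print the scp commands instead of running them.
push_conf() {
    for h in $(list_hosts "$1"); do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "scp $1 $h:$1"
        else
            scp "$1" "$h:$1"
        fi
    done
}
```

Running `DRY_RUN=1 push_conf /etc/ceph/ceph.conf` first shows which hosts would receive the file before anything is copied.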

Create the Ceph daemon working directories

  • The location and naming convention of the directories should be strictly followed.
ssh blade3 mkdir -p /var/lib/ceph/osd/ceph-0
ssh blade4 mkdir -p /var/lib/ceph/osd/ceph-1
ssh blade5 mkdir -p /var/lib/ceph/osd/ceph-2
ssh blade3 mkdir -p /var/lib/ceph/mon/ceph-a
ssh blade4 mkdir -p /var/lib/ceph/mon/ceph-b
ssh blade5 mkdir -p /var/lib/ceph/mon/ceph-c
ssh blade3 mkdir -p /var/lib/ceph/mds/ceph-a
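
The naming convention used above is /var/lib/ceph/<daemon type>/<cluster name>-<daemon id>. The sketch below generates the same commands from a list, assuming the default cluster name "ceph". The helper name `daemon_dir` and the host:type:id list format are hypothetical, chosen only for illustration. The script prints the commands rather than running them.

```shell
# daemon_dir TYPE ID: print the working directory for one daemon,
# e.g. "osd 0" -> /var/lib/ceph/osd/ceph-0 (cluster name assumed to be "ceph").
daemon_dir() {
    echo "/var/lib/ceph/$1/ceph-$2"
}

# Print (not run) the mkdir command for each daemon in the layout above.
# Each entry is host:type:id; edit the list to match your ceph.conf.
for entry in blade3:osd:0 blade4:osd:1 blade5:osd:2 \
             blade3:mon:a blade4:mon:b blade5:mon:c blade3:mds:a; do
    host=${entry%%:*}
    rest=${entry#*:}
    echo "ssh $host mkdir -p $(daemon_dir "${rest%%:*}" "${rest#*:}")"
done
```

After reviewing the printed commands, the output can be piped to `sh` to create the directories.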

Run the mkcephfs command from a server node

  • Execute the following from a server node
mkcephfs -a -c /etc/ceph/ceph.conf -k ceph.keyring

Start the Ceph Cluster

  • Execute the following from a node that has passwordless SSH access to the other server nodes.
service ceph -a start

Issues

  • The following issue occurs quite often when starting the cluster:
[root@Blade3 ~]# service ceph -a start
=== mon.a ===
=== mon.b ===
=== mon.c ===
=== mds.a ===
=== osd.0 ===
Starting Ceph osd.0 on blade3...
starting osd.0 at :/0 osd_data /var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal
=== osd.1 ===
Starting Ceph osd.1 on blade4...
global_init: unable to open config file from search list /tmp/ceph.conf.33b54ef1fee10259e92480001532cf78
failed: 'ssh blade4 ulimit -n 8192;  /usr/bin/ceph-osd -i 1 --pid-file /var/run/ceph/osd.1.pid -c /tmp/ceph.conf.33b54ef1fee10259e92480001532cf78 '
  • The init script is not picking up the correct file name on the remote server node.
  • Starting the OSD daemon by hand on the failed node should work. For example, in this case run the following for blade4.
  • Substitute the correct name of the ceph.conf file from the /tmp directory on that node. The quotes make both commands run on blade4 rather than locally.
ssh blade4 'ulimit -n 8192; /usr/bin/ceph-osd -i 1 --pid-file /var/run/ceph/osd.1.pid -c /tmp/ceph.conf.<XXX>'
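
Finding the temporary configuration file name can also be scripted on the failed node. This is only a sketch: `newest_conf` is a hypothetical helper, and it assumes that the most recently modified ceph.conf.* file in the directory is the one the init script copied over, which should be confirmed by inspecting the file before use.

```shell
# newest_conf DIR: print the most recently modified ceph.conf.* file in DIR,
# or fail (non-zero exit, no output) if there is none.
newest_conf() {
    f=$(ls -t "$1"/ceph.conf.* 2>/dev/null | head -n 1)
    [ -n "$f" ] && echo "$f"
}
```

On blade4 this could be used as `conf=$(newest_conf /tmp) && /usr/bin/ceph-osd -i 1 --pid-file /var/run/ceph/osd.1.pid -c "$conf"`.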

Verify Cluster Health

ceph status

   health HEALTH_OK
   monmap e1: 3 mons at {a=100.100.0.3:6789/0,b=100.100.0.4:6789/0,c=100.100.0.5:6789/0}, election epoch 8, quorum 0,1,2 a,b,c
   osdmap e22: 3 osds: 3 up, 3 in
   pgmap v1198: 1544 pgs: 1544 active+clean; 15766 MB data, 40372 MB used, 100 GB / 147 GB avail
   mdsmap e9: 1/1/1 up {0=a=up:active}

ceph osd tree

# id    weight  type name       up/down reweight
-1      3       root default
-3      3               rack unknownrack
-2      1                       host blade3
0       1                               osd.0   up      1
-4      1                       host blade4
1       1                               osd.1   up      1
-5      1                       host blade5
2       1                               osd.2   up      1
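
The checks above can be automated by parsing the `ceph status` text. The sketch below is an illustration only: `health_ok` and `osds_all_up` are hypothetical helper names, and the pattern assumes the osdmap line format shown in the sample output above, which may differ in other Ceph releases.

```shell
# health_ok: read `ceph status` (or `ceph health`) output on stdin and
# succeed only if it reports HEALTH_OK.
health_ok() {
    grep -q 'HEALTH_OK'
}

# osds_all_up: read `ceph status` output on stdin and succeed only if the
# osdmap line shows every OSD both up and in.
osds_all_up() {
    # Extract "<total> <up> <in>" from e.g. "osdmap e22: 3 osds: 3 up, 3 in".
    set -- $(sed -n 's/.*osdmap e[0-9]*: \([0-9]*\) osds: \([0-9]*\) up, \([0-9]*\) in.*/\1 \2 \3/p')
    [ $# -eq 3 ] && [ "$1" -eq "$2" ] && [ "$1" -eq "$3" ]
}
```

Typical use would be `ceph status | health_ok` and `ceph status | osds_all_up`, each of which returns a zero exit status when the check passes.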

Performance

1G Ethernet     10G Ethernet
Example         Example
Example         Example
Example         Example