Install Ceph Hammer


Set up Ceph Repos

On all Ceph nodes, add the keys. To install the release.asc key, execute the following:

sudo rpm --import 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc'

To install the autobuild.asc key, execute the following (QA and developers only):

sudo rpm --import 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/autobuild.asc'

Add Ceph extras: for RPM packages, add the Ceph extras repository to /etc/yum.repos.d (e.g., as ceph-extras.repo). Some Ceph packages (e.g., QEMU) must take priority over standard packages, so make sure priority=2 is set (see the note on the yum priorities plugin after the repository definitions below).

[ceph-extras]
name=Ceph Extras Packages
baseurl=http://ceph.com/packages/ceph-extras/rpm/centos6/$basearch
enabled=1
priority=2
gpgcheck=1
type=rpm-md
gpgkey=https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc

[ceph-extras-noarch]
name=Ceph Extras noarch
baseurl=http://ceph.com/packages/ceph-extras/rpm/centos6/noarch
enabled=1
priority=2
gpgcheck=1
type=rpm-md
gpgkey=https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc

[ceph-extras-source]
name=Ceph Extras Sources
baseurl=http://ceph.com/packages/ceph-extras/rpm/centos6/SRPMS
enabled=1
priority=2
gpgcheck=1
type=rpm-md
gpgkey=https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc
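
The priority= lines above only take effect if the yum priorities plugin is installed, which is not always the case on a minimal CentOS 6 system. A quick, hedged check (the EL6 package name is assumed to be yum-plugin-priorities):

yum install -y yum-plugin-priorities
grep enabled /etc/yum/pluginconf.d/priorities.conf   # should report enabled = 1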

Add the Ceph repo: for major releases, add a Ceph entry to the /etc/yum.repos.d directory. Create a ceph.repo file:

[ceph]
name=Ceph packages for $basearch
baseurl=http://ceph.com/rpm-hammer/el6/$basearch
enabled=1
priority=2
gpgcheck=1
type=rpm-md
gpgkey=https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc

[ceph-noarch]
name=Ceph noarch packages
baseurl=http://ceph.com/rpm-hammer/el6/noarch
enabled=1
priority=2
gpgcheck=1
type=rpm-md
gpgkey=https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc

[ceph-source]
name=Ceph source packages
baseurl=http://ceph.com/rpm-hammer/el6/SRPMS
enabled=0
priority=2
gpgcheck=1
type=rpm-md
gpgkey=https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc
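
Before installing, it is worth confirming that yum actually sees the new repositories; a minimal sanity check, assuming the repo IDs defined above:

yum clean all
yum repolist enabled | grep -i ceph   # expect ceph, ceph-noarch and the ceph-extras repos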

Install EPEL repository

## RHEL/CentOS 6 64-Bit ##
# wget http://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
# rpm -ivh epel-release-6-8.noarch.rpm

Install other packages: install the third-party dependencies required by Ceph:

# yum install -y snappy leveldb gdisk python-argparse gperftools-libs

Install Ceph:

yum install ceph
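
A quick way to confirm that the Hammer packages were pulled in from the Ceph repo rather than an older distro build:

ceph --version   # should report a 0.94.x (Hammer) release
rpm -q ceph      # shows the exact package version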

Deploy Ceph: on ceph-node1, add the first MON. Create a directory for Ceph and create your Ceph cluster configuration file:

# mkdir /etc/ceph
# touch /etc/ceph/ceph.conf
Generate an FSID for your Ceph cluster (e.g. 792cda6c-af73-46c7-a60b-89d8aa8cf2cb):
# uuidgen
Create the Ceph config file /etc/ceph/ceph.conf as:
[global]
fsid = {cluster-id}
mon initial members = {hostname}[, {hostname}]
mon host = {ip-address}[, {ip-address}]

#All clusters have a front-side public network.
#If you have two NICs, you can configure a back side cluster 
#network for OSD object replication, heart beats, backfilling,
#recovery, etc.
public network = {network}[, {network}]
#cluster network = {network}[, {network}] 

#Clusters require authentication by default.
auth cluster required = cephx
auth service required = cephx
auth client required = cephx

#Choose reasonable numbers for your journals, number of replicas
#and placement groups.
osd journal size = {n}
osd pool default size = {n}  # Write an object n times.
osd pool default min size = {n} # Allow writing n copies in a degraded state.
osd pool default pg num = {n}
osd pool default pgp num = {n}

#Choose a reasonable crush leaf type.
#0 for a 1-node cluster.
#1 for a multi node cluster in a single rack
#2 for a multi node, multi chassis cluster with multiple hosts in a chassis
#3 for a multi node cluster with hosts across racks, etc.
osd crush chooseleaf type = {n}
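
For the pg num / pgp num values, a common rule of thumb (not a hard requirement) is (number of OSDs * 100) / replica count, rounded up to the next power of two. A rough sketch for the 3-OSD layout used later in this guide, assuming 3 replicas:

# (3 OSDs * 100) / 3 replicas = 100, rounded up to the next power of two = 128
echo $(( (3 * 100) / 3 ))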

An example ceph.conf:

[global]
fsid = 792cda6c-af73-46c7-a60b-89d8aa8cf2cb
public network = 172.28.0.0/16

#Choose reasonable numbers for your journals, number of replicas
#and placement groups.
osd journal size = 1024
osd pool default min size = 1
osd pool default pg num = 128
osd pool default pgp num = 128

[mon]
mon initial members = ceph-node1
mon host = ceph-node1,ceph-node2,ceph-node3
mon addr = 172.28.1.89,172.28.0.228,172.28.0.177

[mon.ceph-node1]
host = ceph-node1
mon addr = 172.28.1.89

Create a keyring for your cluster and generate a monitor secret key as follows:

# ceph-authtool --create-keyring /tmp/ceph.mon.keyring --gen-key -n mon. --cap mon 'allow *'

Create a client.admin user and add the user to the keyring:

# ceph-authtool --create-keyring /etc/ceph/ceph.client.admin.keyring --gen-key -n client.admin --set-uid=0 --cap mon 'allow *' --cap osd 'allow *' --cap mds 'allow'

Add the client.admin key to ceph.mon.keyring:

# ceph-authtool /tmp/ceph.mon.keyring --import-keyring /etc/ceph/ceph.client.admin.keyring

Create the first monitor daemon:

monmaptool --create --add ceph-node1 172.28.1.89 --fsid 792cda6c-af73-46c7-a60b-89d8aa8cf2cb /tmp/monmap
mkdir /var/lib/ceph/mon/ceph-ceph-node1
ceph-mon --mkfs -i ceph-node1 --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring
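
Before starting the service, it can help to verify that the monmap contains the expected FSID and monitor address, and that the monitor store was created:

monmaptool --print /tmp/monmap
ls /var/lib/ceph/mon/ceph-ceph-node1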

start the ceph service:

[root@ceph-node1 ceph]# service ceph start
=== mon.ceph-node1 === 
Starting Ceph mon.ceph-node1 on ceph-node1...
2015-05-14 12:22:22.356497 7f61f29b97a0 -1 WARNING: 'mon addr' config option 172.28.1.89:0/0 does not match monmap file
         continuing with monmap configuration
Starting ceph-create-keys on ceph-node1...
[root@ceph-node1 ceph]# ceph status 
    cluster 792cda6c-af73-46c7-a60b-89d8aa8cf2cb
     health HEALTH_ERR
            64 pgs stuck inactive
            64 pgs stuck unclean
            no osds
     monmap e1: 1 mons at {ceph-node1=172.28.1.89:6789/0}
            election epoch 2, quorum 0 ceph-node1
     osdmap e1: 0 osds: 0 up, 0 in
      pgmap v2: 64 pgs, 1 pools, 0 bytes data, 0 objects
            0 kB used, 0 kB / 0 kB avail
                  64 creating

Creating OSD:

ceph-disk list | grep unknown
parted /dev/sdb mklabel GPT
parted /dev/sdc mklabel GPT
parted /dev/sdd mklabel GPT
ceph-disk prepare --cluster ceph --cluster-uuid 792cda6c-af73-46c7-a60b-89d8aa8cf2cb --fs-type xfs /dev/sdb
ceph-disk prepare --cluster ceph --cluster-uuid 792cda6c-af73-46c7-a60b-89d8aa8cf2cb --fs-type xfs /dev/sdc
ceph-disk prepare --cluster ceph --cluster-uuid 792cda6c-af73-46c7-a60b-89d8aa8cf2cb --fs-type xfs /dev/sdd
lsblk
ceph-disk activate /dev/sdb1
ceph-disk activate /dev/sdc1
ceph-disk activate /dev/sdd1
ceph -s
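
To confirm that the three OSDs were created and placed in the CRUSH map, the following checks can be used (output will vary with your hardware):

ceph osd tree   # should list osd.0, osd.1 and osd.2 under ceph-node1
ceph osd stat   # e.g. "3 osds: 3 up, 3 in"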

Now the ceph status:

[root@ceph-node1 ceph]# ceph -s
    cluster 792cda6c-af73-46c7-a60b-89d8aa8cf2cb
     health HEALTH_WARN
            59 pgs degraded
            64 pgs stuck unclean
            59 pgs undersized
            too few PGs per OSD (21 < min 30)
     monmap e1: 1 mons at {ceph-node1=172.28.1.89:6789/0}
            election epoch 2, quorum 0 ceph-node1
     osdmap e15: 3 osds: 3 up, 3 in; 37 remapped pgs
      pgmap v20: 64 pgs, 1 pools, 0 bytes data, 0 objects
            101936 kB used, 2094 GB / 2094 GB avail
                  32 active+undersized+degraded+remapped
                  27 active+undersized+degraded
                   5 active
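
The "too few PGs per OSD" warning above comes from the single default pool still having only 64 placement groups. A hedged way to clear it, assuming the only pool at this point is the default rbd pool (PG counts can only be increased, never decreased):

ceph osd pool set rbd pg_num 128
ceph osd pool set rbd pgp_num 128
ceph -s   # the warning should clear once the new PGs are created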

Copy keyring to other nodes:

scp /etc/ceph/ceph.c* ceph-node2:/etc/ceph
scp /etc/ceph/ceph.c* ceph-node3:/etc/ceph

Now, on the other nodes, you should be able to issue

ceph -s


Scale up the cluster:

Adding Monitor: on ceph-node2:

mkdir -p /var/lib/ceph/mon/ceph-ceph-node2 /tmp/ceph-node2
vim /etc/ceph/ceph.conf
ceph auth get mon. -o /tmp/ceph-node2/monkeyring
ceph mon getmap -o /tmp/ceph-node2/monmap
ceph-mon -i ceph-node2 --mkfs --monmap /tmp/ceph-node2/monmap --keyring /tmp/ceph-node2/monkeyring
service ceph start
ceph mon add ceph-node2 172.28.0.228:6789
ceph -s

On ceph-node3:

mkdir -p /var/lib/ceph/mon/ceph-ceph-node3 /tmp/ceph-node3
vim /etc/ceph/ceph.conf
ceph auth get mon. -o /tmp/ceph-node3/monkeyring
ceph mon getmap -o /tmp/ceph-node3/monmap
ceph-mon -i ceph-node3 --mkfs --monmap /tmp/ceph-node3/monmap --keyring /tmp/ceph-node3/monkeyring
service ceph start
ceph mon add ceph-node3 172.28.0.177:6789
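
With all three monitors added, verify that they have formed a quorum, for example:

ceph mon stat   # should show 3 mons with quorum 0,1,2
ceph quorum_status --format json-pretty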


Configuring NTP: On ceph-node1

chkconfig ntpd on
ssh ceph-node2 chkconfig ntpd on
ssh ceph-node3 chkconfig ntpd on
ntpdate pool.ntp.org
ssh ceph-node2 ntpdate pool.ntp.org
ssh ceph-node3 ntpdate pool.ntp.org
/etc/init.d/ntpd start
ssh ceph-node2 /etc/init.d/ntpd start
ssh ceph-node3 /etc/init.d/ntpd start
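
Monitors are sensitive to clock skew, so it is worth confirming that all nodes are actually syncing; a quick check on each node:

ntpq -p                              # at least one peer should be selected (marked with *)
ceph health detail | grep -i clock   # shows any clock skew warnings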

Adding OSDs: follow the same Creating OSD steps that were used on ceph-node1.

Add Ceph to OpenStack

Install the prerequisite packages on openstack1:

yum install qemu libvirt
sudo yum install python-rbd

On ceph-node1: set up the SSH key and copy the Ceph-related configs to openstack1:

ssh-copy-id openstack1
cd /etc/yum.repos.d/
scp ceph.repo openstack1:`pwd`
cd /etc/ceph/
scp ceph.conf ceph.client.admin.keyring openstack1:`pwd`

Make sure the repo details, the keyrings, ceph.conf and the output of ceph -s are all correct on openstack1. Then, on openstack1, create the Ceph auth entries:

ceph auth get-or-create client.cinder mon 'allow r' osd 'allow class-read object_prefix rbd_children, allow rwx pool=volumes, allow rwx pool=vms, allow rx pool=images'
ceph auth get-or-create client.glance mon 'allow r' osd 'allow class-read object_prefix rbd_children, allow rwx pool=images'
ceph auth get-or-create client.cinder-backup mon 'allow r' osd 'allow class-read object_prefix rbd_children, allow rwx pool=backups'
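
The Glance and Cinder services also need their keyrings on disk, and the libvirt secret step below reads the cinder key from a client.cinder.key file. A hedged sketch of how those files can be produced on openstack1 (file names and ownership follow the usual Glance/Cinder conventions and are assumptions here):

ceph auth get-or-create client.glance -o /etc/ceph/ceph.client.glance.keyring
chown glance:glance /etc/ceph/ceph.client.glance.keyring
ceph auth get-or-create client.cinder -o /etc/ceph/ceph.client.cinder.keyring
ceph auth get-or-create client.cinder-backup -o /etc/ceph/ceph.client.cinder-backup.keyring
chown cinder:cinder /etc/ceph/ceph.client.cinder*.keyring
ceph auth get-key client.cinder > client.cinder.key   # consumed by virsh secret-set-value below
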
Create a UUID and a libvirt secret for the cinder key (the UUID must match the one in secret.xml below and in the Cinder/Nova configuration later):

uuidgen
vim secret.xml
virsh secret-define --file secret.xml
virsh secret-set-value --secret 457eb676-33da-42ec-9a8c-9293d545c337 --base64 $(cat client.cinder.key) && rm client.cinder.key secret.xml

Content for secret.xml:

<secret ephemeral='no' private='no'>
  <uuid>457eb676-33da-42ec-9a8c-9293d545c337</uuid>
  <usage type='ceph'>
    <name>client.cinder secret</name>
  </usage>
</secret>
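
To verify that libvirt stored the secret and its value, something like the following can be used:

virsh secret-list                                             # should list 457eb676-33da-42ec-9a8c-9293d545c337
virsh secret-get-value 457eb676-33da-42ec-9a8c-9293d545c337   # prints the base64-encoded cinder key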

Configure OpenStack to use Ceph

Configuring Glance

Glance can use multiple back ends to store images. To use Ceph block devices by default, configure Glance as follows.

Edit /etc/glance/glance-api.conf and add the following under the [DEFAULT] and [glance_store] sections:

[DEFAULT]
...
default_store = rbd
...
[glance_store]
stores = rbd
rbd_store_pool = images
rbd_store_user = glance
rbd_store_ceph_conf = /etc/ceph/ceph.conf
rbd_store_chunk_size = 8

If you want to enable copy-on-write cloning of images, also add under the [DEFAULT] section:

show_image_direct_url = True

Note that this exposes the back end location via Glance’s API, so the endpoint with this option enabled should not be publicly accessible.

Disable the Glance cache management to avoid images getting cached under /var/lib/glance/image-cache/, assuming your configuration file has flavor = keystone+cachemanagement:

[paste_deploy]
flavor = keystone

Configuring Cinder

OpenStack requires a driver to interact with Ceph block devices. You must also specify the pool name for the block device. On your OpenStack node, edit /etc/cinder/cinder.conf and add:

volume_driver = cinder.volume.drivers.rbd.RBDDriver
rbd_pool = volumes
rbd_ceph_conf = /etc/ceph/ceph.conf
rbd_flatten_volume_from_snapshot = false
rbd_max_clone_depth = 5
rbd_store_chunk_size = 4
rados_connect_timeout = -1
glance_api_version = 2

If you’re using cephx authentication, also configure the user and uuid of the secret you added to libvirt as documented earlier:

rbd_user = cinder
rbd_secret_uuid = 457eb676-33da-42ec-9a8c-9293d545c337

Note that if you are configuring multiple cinder back ends, glance_api_version = 2 must be in the [DEFAULT] section.

Configuring Cinder Backup

OpenStack Cinder Backup requires a specific daemon, so don't forget to install it (see the install commands after the configuration block below). On your Cinder Backup node, edit /etc/cinder/cinder.conf and add:

backup_driver = cinder.backup.drivers.ceph
backup_ceph_conf = /etc/ceph/ceph.conf
backup_ceph_user = cinder-backup
backup_ceph_chunk_size = 134217728
backup_ceph_pool = backups
backup_ceph_stripe_unit = 0
backup_ceph_stripe_count = 0
restore_discard_excess_bytes = true
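
As noted above, the cinder-backup daemon has to be installed and enabled separately. A hedged sketch for this CentOS/RDO setup (the package name is an assumption; the service name matches the restart section later on this page):

yum install -y openstack-cinder   # cinder-backup ships with the main Cinder package on RDO
chkconfig openstack-cinder-backup on
service openstack-cinder-backup start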

Configuring Nova to attach Ceph RBD block devices

In order to attach Cinder devices (either normal block or by issuing a boot from volume), you must tell Nova (and libvirt) which user and UUID to refer to when attaching the device. libvirt will refer to this user when connecting and authenticating with the Ceph cluster.

rbd_user = cinder
rbd_secret_uuid = 457eb676-33da-42ec-9a8c-9293d545c337

These two flags are also used by the Nova ephemeral backend.

Configuring Nova

In order to boot all the virtual machines directly into Ceph, you must configure the ephemeral backend for Nova.

It is recommended to enable the RBD cache in your Ceph configuration file (enabled by default since Giant). Moreover, enabling the admin socket brings a lot of benefits when troubleshooting. Having one socket per virtual machine using a Ceph block device helps when investigating performance problems and/or unexpected behavior.

This socket can be accessed like this:

ceph daemon /var/run/ceph/ceph-client.cinder.19195.32310016.asok help

Now, on every compute node, edit your Ceph configuration file:

[client]
    rbd cache = true
    rbd cache writethrough until flush = true
    admin socket = /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok

In Juno, the Ceph block device settings were moved under the [libvirt] section. On every Compute node, edit /etc/nova/nova.conf under the [libvirt] section and add:

[libvirt]
images_type = rbd
images_rbd_pool = vms
images_rbd_ceph_conf = /etc/ceph/ceph.conf
rbd_user = cinder
rbd_secret_uuid = 457eb676-33da-42ec-9a8c-9293d545c337

It is also a good practice to disable file injection. While booting an instance, Nova usually attempts to open the rootfs of the virtual machine and inject values such as the password, SSH keys, etc. directly into the filesystem. However, it is better to rely on the metadata service and cloud-init.

On every Compute node, edit /etc/nova/nova.conf and add the following under the [libvirt] section:

inject_password = false
inject_key = false
inject_partition = -2

To ensure a proper live-migration, use the following flags:

live_migration_flag="VIR_MIGRATE_UNDEFINE_SOURCE,VIR_MIGRATE_PEER2PEER,VIR_MIGRATE_LIVE,VIR_MIGRATE_PERSIST_DEST"

Restart OpenStack

To activate the Ceph block device driver and load the block device pool name into the configuration, you must restart OpenStack. Thus, for Red Hat based systems such as this one, execute these commands on the appropriate nodes:

sudo service openstack-glance-api restart
sudo service openstack-nova-compute restart
sudo service openstack-cinder-volume restart
sudo service openstack-cinder-backup restart

Test OpenStack with Ceph:

source /root/keystonerc_admin
cinder create --display-name ceph-volume01 --display-description "test ceph storage" 10
cinder list
rados -p volumes ls
ceph -s
yum install -y wget
wget http://cloud-images.ubuntu.com/precise/current/precise-server-cloudimg-amd64-disk1.img
glance image-create --name="ubuntu-precise-image" --is-public=True --disk-format=qcow2 --container-format=ovf < precise-server-cloudimg-amd64-disk1.img
glance image-list
rados -p images ls
rados df
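
As an extra check, the Cinder volume and the Glance image should now show up as RBD images in their respective pools, for example:

rbd -p volumes ls   # expect something like volume-<uuid> for ceph-volume01
rbd -p images ls    # expect the ID of ubuntu-precise-image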