OpenStack: Setting up MapR 5.0.0 CentOS Sahara cluster

Following is a description of the steps used to set up a Sahara MapR 5.0.0 CentOS cluster on OpenStack Liberty. Some of the steps include modifications specific to the Keele University OpenStack deployment.

Upload MapR Image to Glance

Some pre-built Sahara images are available at http://sahara-files.mirantis.com/images/upstream/liberty/. Since Sahara deployments in Liberty don't offer cloud-init, we need to modify the qcow2 image before uploading it to Glance. For this, we use a tool called guestfish. The following changes are made to the image:

  • (Keele University infra only) Add a proxy for the yum package manager and shell sessions, and add custom DNS servers
$ guestfish --rw -a sahara-liberty-mapr-5.0.0-centos-6.6.qcow2
<fs> run
<fs> mount /dev/sda1 /
<fs> vi /etc/yum.conf
# Add the below line
proxy=http://wwwcache.keele.ac.uk:8080

<fs> write /etc/dhcp/dhclient.conf "prepend domain-name-servers 160.5.40.3, 160.5.169.62;"

In the MapR plugin shipped with Sahara 3.0.2 (OpenStack Liberty), some Oozie-related bugs were encountered, specifically:

  • When the MapR packages are installed after the instances are spawned, warden.oozie.conf is not placed in the correct directory, so the warden process fails to recognize the presence of Oozie, which ultimately puts the cluster into the Error state. To overcome this, we place the warden.oozie.conf file in the correct location in the image. A bug was filed for this in the Sahara project on Launchpad. More details here: https://bugs.launchpad.net/bugs/1607704
<fs> mkdir -p /opt/mapr/conf/conf.d
<fs> write /opt/mapr/conf/conf.d/warden.oozie.conf ""
<fs> vi /opt/mapr/conf/conf.d/warden.oozie.conf

# Write the below contents
services=oozie:1:cldb
service.displayname=Oozie
service.command.start=/opt/mapr/oozie/oozie-4.1.0/bin/oozied.sh start
service.command.stop=/opt/mapr/oozie/oozie-4.1.0/bin/oozied.sh stop
service.command.type=BACKGROUND
service.command.monitorcommand=/opt/mapr/oozie/oozie-4.1.0/bin/oozied.sh status
service.port=11000
service.ui.port=11000
service.uri=/oozie
service.logs.location=/opt/mapr/oozie/oozie-4.1.0/logs
service.process.type=JAVA
service.env="MAPR_MAPREDUCE_MODE=default"
  • At the end of the configuration phase, Sahara restarts Oozie. Before reaching the Running state, Oozie sits in a standby state for some time, and Sahara waits only 60s for this transition. Even on a large instance, Oozie may not start within 60s. Hence, the default value of the timeout parameter in the _wait_for_status function is changed from 60 to 180 in the file /usr/lib/python2.7/site-packages/sahara/plugins/mapr/domain/node_process.py, as sketched below.
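
Note that this change is made on the host running the Sahara services, not inside the guest image. A minimal sketch using sed, assuming the default appears literally as timeout=60 in the function signature (verify against your installed Sahara version first):

# On the Sahara controller: raise the Oozie status-wait timeout from 60s to 180s
$ sudo sed -i 's/timeout=60/timeout=180/' \
    /usr/lib/python2.7/site-packages/sahara/plugins/mapr/domain/node_process.py
# Restart the Sahara services to pick up the change (service names vary by distribution)
$ sudo systemctl restart openstack-sahara-api openstack-sahara-engine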

When the image is ready, upload it to Glance.
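
For example, with the Liberty-era glance CLI (the image name here is illustrative):

# Upload the modified qcow2 image to Glance
$ glance image-create --name sahara-liberty-mapr-5.0.0-centos-6.6 \
    --disk-format qcow2 --container-format bare --progress \
    --file sahara-liberty-mapr-5.0.0-centos-6.6.qcow2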

Register the image with Sahara

In the Data Processing tab of the Horizon dashboard, register the image uploaded to Glance with the plugin tags 'mapr' & '5.0.0.mrv2'. For CentOS images, configure the user as 'cloud-user'. For Ubuntu images, the user would be 'ubuntu'.
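
The same registration can be done with the python-saharaclient CLI; this is a sketch assuming the Liberty-era command names, with the image ID as a placeholder:

# Register the Glance image with Sahara and tag it for the MapR plugin
$ sahara image-register --id <glance-image-id> --username cloud-user
$ sahara image-add-tag --id <glance-image-id> --tag mapr
$ sahara image-add-tag --id <glance-image-id> --tag 5.0.0.mrv2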

Create node group templates

A single template with all processes running on one node can be created, or the processes can be distributed across multiple node group templates. More information about the kinds of processes required can be found at http://docs.openstack.org/developer/sahara/userdoc/mapr_plugin.html

Choose the instance flavor accordingly, and do choose a floating IP pool.
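
As a sketch, an all-in-one node group template could be created from the CLI as follows. The process names are illustrative and should be verified against the MapR plugin documentation linked above; the flavor and network IDs are placeholders:

$ cat mapr-all-in-one.json
{
    "name": "mapr-all-in-one",
    "plugin_name": "mapr",
    "hadoop_version": "5.0.0.mrv2",
    "flavor_id": "<flavor-id>",
    "floating_ip_pool": "<external-network-id>",
    "node_processes": ["ZooKeeper", "Webserver", "CLDB", "FileServer",
                       "ResourceManager", "NodeManager", "HistoryServer",
                       "Oozie"]
}
$ sahara node-group-template-create --json mapr-all-in-one.json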

Create cluster template

Create a cluster template and set the count of node group instances which you want to run in your cluster. Edit any MapR process parameters if you want to.
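
A corresponding CLI sketch, with the node group template ID as a placeholder:

$ cat mapr-cluster-template.json
{
    "name": "mapr-500-template",
    "plugin_name": "mapr",
    "hadoop_version": "5.0.0.mrv2",
    "node_groups": [
        {
            "name": "all-in-one",
            "node_group_template_id": "<node-group-template-id>",
            "count": 1
        }
    ]
}
$ sahara cluster-template-create --json mapr-cluster-template.json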

Launch Cluster

Launch a cluster using the cluster template created above. Use the same base image as registered with Sahara earlier. Choose an SSH access key and the private network.
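
From the CLI, the launch looks roughly like this; all IDs are placeholders:

$ cat mapr-cluster.json
{
    "name": "mapr-500-cluster",
    "plugin_name": "mapr",
    "hadoop_version": "5.0.0.mrv2",
    "cluster_template_id": "<cluster-template-id>",
    "default_image_id": "<glance-image-id>",
    "user_keypair_id": "<keypair-name>",
    "neutron_management_network": "<private-network-id>"
}
$ sahara cluster-create --json mapr-cluster.json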

Note: By default, Sahara uses Heat to launch clusters. This may cause clusters to get stuck in the Waiting state. If this happens, the infrastructure_engine parameter in /etc/sahara/sahara.conf can be changed to 'direct', so that Sahara communicates with Nova directly.
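
The setting lives in the [DEFAULT] section; restart the Sahara services after changing it:

# /etc/sahara/sahara.conf
[DEFAULT]
infrastructure_engine = direct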