Bright:Deploying Hadoop

First steps

  • Before any installation, make sure you have the appropriate licence to install Hadoop nodes.
  • Preferably, create a new Node Category that will include all the Hadoop nodes (namenodes, datanodes, etc.); a cmsh sketch follows this list.
  • Install the cm-hortonworks-hadoop package through Bright:
 sudo yum install cm-hortonworks-hadoop
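
A minimal cmsh sketch for the Node Category step, assuming the stock "default" category is cloned and that the name hadoop-nodes is just an example:

 [root@head-Boston ~]# cmsh
 [head-Boston]% category
 [head-Boston->category]% clone default hadoop-nodes
 [head-Boston->category*[hadoop-nodes*]]% commit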

Installing from CMGUI

  • Go to the "Hadoop HDFS" tab and click "Add HDFS Instance"
 [Screenshot: hadoop-install0.png]
  • Provide the instance name, Java installation path and Hadoop data path:
 [Screenshot: hadoop-install1.png]
  • Provide the Hadoop distribution and make sure the tarball names are correct:
 [Screenshot: hadoop-install2.png]
  • Select the namenodes and the YARN server host:
 [Screenshot: hadoop-install3.png]
  • Select the category of the Hadoop nodes:
 [Screenshot: hadoop-install4.png]
  • Select any other optional nodes to act as data nodes:
 [Screenshot: hadoop-install5.png]
  • Select the ZooKeeper nodes (at least 3):
 [Screenshot: hadoop-install6.png]
  • Select the HBase nodes:
 [Screenshot: hadoop-install7.png]
  • Review the configuration file and save it if needed:
 [Screenshot: hadoop-install8.png]

After clicking "Next" on the last step, the installation will start.
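
Once the installation has started, the new instance can also be checked from the command line; a quick sketch, assuming this Bright version exposes a hadoop mode in cmsh:

 cmsh -c "hadoop; list"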

Installing from terminal

  • If you have an XML Hadoop configuration file, just run:
 cm-hadoop-setup -c <config>.xml

Some example configuration files are located in /cm/local/apps/cluster-tools/hadoop/conf/
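
A typical workflow, sketched with a placeholder name since the exact template files vary per Bright release:

 cp /cm/local/apps/cluster-tools/hadoop/conf/<template>.xml /root/my-hadoop.xml
 # edit the instance name, node lists, paths and tarball names to match your cluster
 cm-hadoop-setup -c /root/my-hadoop.xml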

  • If you don't have a configuration file, run:
 cm-hadoop-setup

Then follow the on-screen instructions, which are the same as the CMGUI ones. A configuration file will then be created.
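
As the log below shows, the interactive run writes the generated file to /tmp/hadoop-<instance>.xml; copy it somewhere permanent if you want to reuse it with -c later.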

Log from Install

# went through the cm-hadoop-setup ncurses setup, created the file below
[root@head-Boston hadoop]# cp /tmp/hadoop-ubs-test.xml /home/david/hadoop-scripts/
[root@head-Boston hadoop]# cm-hadoop-setup -c /home/david/hadoop-scripts/hadoop-ubs-test.xml 
Reading config from file '/home/david/hadoop-scripts/hadoop-ubs-test.xml'... done.
Hadoop flavor 'Hortonworks', release '2.6.0.2.2.0.0-2041'
Hadoop is already installed in /cm/shared/apps/hadoop/Hortonworks/2.6.0.2.2.0.0-2041
Will now configure instance 'ubs-test'
ZooKeeper is already installed in /cm/shared/apps/hadoop/Hortonworks/zookeeper-3.4.6.2.2.0.0-2041
HBase is already installed in /cm/shared/apps/hadoop/Hortonworks/hbase-0.98.4.2.2.0.0-2041-hadoop2
Creating module file... done.
Configuring Hadoop instance on local filesystem and images... done.
Updating images...
Updating images... done.
Creating Hadoop instance in cmdaemon...
Creating Hadoop instance in cmdaemon... done.
Formatting HDFS...
Formatting HDFS... done.
Waiting for NameNode service...
Waiting for NameNode service... done.
Waiting for DataNodes to come up...
Waiting for DataNodes to come up... done.
Waiting for NameNode to be ready...
Waiting for NameNode to be ready... done.
Setting up HDFS...
Setting up HDFS... done.
Setting up HBase hosts...
Setting up HBase hosts... done.
Setting up YARN hosts...
Setting up YARN hosts... done.
Waiting for Hadoop to be ready for validation test...
Waiting for Hadoop to be ready for validation test... done.
Validating Hadoop setup...
Validating Hadoop setup... done.
Checking ZooKeeper installation...
Checking ZooKeeper installation... done.
Checking HBase installation...
Checking HBase installation... done.
Installation successfully completed.
Finished.
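# grant the user access to the new HDFS instance via cmsh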
[root@head-Boston ~]# cmsh 
[head-Boston]% user
[head-Boston->user]% use user david
[head-Boston->user[david]]% set hadoophdfsaccess ubs-test; commit 
[Ctrl+d]
[david@head-Boston ~]$ ssh openstack1
Last login: Sat Jun  6 15:42:30 2015 from head-boston.cm.cluster
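# load the instance's environment modules for hadoop, zookeeper and hbase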
[david@openstack1 ~]$ module load hadoop/ubs-test/Hortonworks/2.6.0.2.2.0.0-2041 
[david@openstack1 ~]$ module load zookeeper/ubs-test/Hortonworks/3.4.6.2.2.0.0-2041 
[david@openstack1 ~]$ module load hbase/ubs-test/Hortonworks/0.98.4.2.2.0.0-2041-hadoop2 
[david@openstack1 ~]$ cd $HADOOP_PREFIX/bin
[david@openstack1 bin]$ ./hadoop fs -ls /
Found 6 items
drwx------   - root  hadoop          0 2015-06-06 16:13 /accumulo
drwx------   - hbase hadoop          0 2015-06-06 16:13 /hbase
drwxrwxrwt   - hdfs  hadoop          0 2015-06-06 16:18 /staging
drwxr-xr-x   - hdfs  hadoop          0 2015-06-06 20:59 /system
drwxrwxrwt   - hdfs  hadoop          0 2015-06-06 16:12 /tmp
drwxr-xr-x   - hdfs  hadoop          0 2015-06-06 21:32 /user
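# run the bundled MapReduce pi example as a quick smoke test (1 map, 5 samples)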
[david@openstack1 bin]$ hadoop jar /cm/shared/apps/hadoop/ubs-test/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.2.2.0.0-2041.jar pi 1 5
Number of Maps  = 1
Samples per Map = 5
Wrote input for Map #0
Starting Job
15/06/06 21:35:42 INFO client.RMProxy: Connecting to ResourceManager at openstack3.cm.cluster/173.16.0.55:8032
15/06/06 21:35:42 INFO input.FileInputFormat: Total input paths to process : 1
15/06/06 21:35:42 INFO mapreduce.JobSubmitter: number of splits:1
15/06/06 21:35:42 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1433603618985_0005
15/06/06 21:35:42 INFO impl.YarnClientImpl: Submitted application application_1433603618985_0005
15/06/06 21:35:43 INFO mapreduce.Job: The url to track the job: http://openstack3:8088/proxy/application_1433603618985_0005/
15/06/06 21:35:43 INFO mapreduce.Job: Running job: job_1433603618985_0005
15/06/06 21:35:49 INFO mapreduce.Job: Job job_1433603618985_0005 running in uber mode : false
15/06/06 21:35:49 INFO mapreduce.Job:  map 0% reduce 0%
15/06/06 21:35:54 INFO mapreduce.Job:  map 100% reduce 0%
15/06/06 21:36:00 INFO mapreduce.Job:  map 100% reduce 100%
15/06/06 21:36:00 INFO mapreduce.Job: Job job_1433603618985_0005 completed successfully
15/06/06 21:36:00 INFO mapreduce.Job: Counters: 49
	File System Counters
		FILE: Number of bytes read=28
		FILE: Number of bytes written=219263
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=277
		HDFS: Number of bytes written=215
		HDFS: Number of read operations=7
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=3
	Job Counters 
		Launched map tasks=1
		Launched reduce tasks=1
		Data-local map tasks=1
		Total time spent by all maps in occupied slots (ms)=2938
		Total time spent by all reduces in occupied slots (ms)=3030
		Total time spent by all map tasks (ms)=2938
		Total time spent by all reduce tasks (ms)=3030
		Total vcore-seconds taken by all map tasks=2938
		Total vcore-seconds taken by all reduce tasks=3030
		Total megabyte-seconds taken by all map tasks=3008512
		Total megabyte-seconds taken by all reduce tasks=3102720
	Map-Reduce Framework
		Map input records=1
		Map output records=2
		Map output bytes=18
		Map output materialized bytes=28
		Input split bytes=159
		Combine input records=0
		Combine output records=0
		Reduce input groups=2
		Reduce shuffle bytes=28
		Reduce input records=2
		Reduce output records=0
		Spilled Records=4
		Shuffled Maps =1
		Failed Shuffles=0
		Merged Map outputs=1
		GC time elapsed (ms)=62
		CPU time spent (ms)=2090
		Physical memory (bytes) snapshot=447819776
		Virtual memory (bytes) snapshot=2117320704
		Total committed heap usage (bytes)=402653184
	Shuffle Errors
		BAD_ID=0
		CONNECTION=0
		IO_ERROR=0
		WRONG_LENGTH=0
		WRONG_MAP=0
		WRONG_REDUCE=0
	File Input Format Counters 
		Bytes Read=118
	File Output Format Counters 
		Bytes Written=97
Job Finished in 18.315 seconds
Estimated value of Pi is 4.00000000000000000000
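# with only 1 map and 5 samples the pi estimate is coarse; larger arguments improve it
# swap the cluster-tools packages for a newer build (assumed here to supply cm-hadoop-tests.sh)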
[david@openstack1 ~]$ sudo su - 
[root@openstack1 ~]# rpm -e cluster-tools-slave-7.0-41_cm7.0.noarch
[root@openstack1 ~]# yum install cluster-tools-7.0-4764_cm7.0.noarch
[root@openstack1 ~]# exit
[david@openstack1 hadoop]$ ./cm-hadoop-tests.sh ubs-test
hadoop/ubs-test/Hortonworks/2.6.0.2.2.0.0-2041
hbase/ubs-test/Hortonworks/0.98.4.2.2.0.0-2041-hadoop2

# the tests run for a while; progress can be followed through the GUI

# Let's uninstall


[root@head-Boston ~]# cm-hadoop-setup -u ubs-test
Requested removal of Hadoop instance 'ubs-test'.
Stopping all Hadoop services for instance 'ubs-test'...
Stopping all Hadoop services for instance 'ubs-test'... done.
Instance 'ubs-test' found in cmdaemon. Removing...
Unable to remove ubs-test, the following objects depend on it:
Remove failed but no errors
Instance 'ubs-test' found in cmdaemon. Removing... FAILED.
Waiting a few seconds before removing files from local fs...
Waiting a few seconds before removing files from local fs... done.
Removing Hadoop instance 'ubs-test'

Removing:
		/etc/hadoop/ubs-test
		/var/log/hadoop/ubs-test
		/var/run/hadoop/ubs-test
		/tmp/hadoop/ubs-test/
		/etc/hadoop/ubs-test/zookeeper
		/var/lib/zookeeper/ubs-test
		/var/log/zookeeper/ubs-test
		/var/run/zookeeper/ubs-test
		/etc/hadoop/ubs-test/hbase
		/var/log/hbase/ubs-test
		/var/run/hbase/ubs-test
		/var/lib/hadoop/ubs-test/
		/etc/init.d/hadoop-ubs-test-*
Updating images...
Updating images... done.
Instance 'ubs-test' removed from images.
Removing log files.
Module file(s) deleted.
Removal successfully completed.
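
A quick way to confirm the cleanup, assuming environment modules are still used on the nodes, is that the deleted module files should no longer be listed:

 module avail 2>&1 | grep ubs-test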