Hortonworks HDP: Automated Installation using Ambari

Revision as of 10:54, 11 July 2014 by David

Initial Setup

The following must be set up:

  • CentOS 6.5 base installation with working repositories
  • FQDN names for all the nodes; check with hostname -f, and make sure to set up /etc/hosts and /etc/sysconfig/network
  • Passwordless ssh between hosts
  • Iptables stopped on all nodes
  • NTP running and clocks in sync across all nodes
  • SELinux disabled (check with getenforce)
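The checklist above can be sketched as a read-only script run on every node. This is just an illustration, not part of the HDP documentation; checks are skipped when the relevant tool is absent, and nothing here changes the system.

```shell
# prereq-check.sh -- read-only sanity checks for the prerequisites above.
# Passwordless ssh can be checked separately with: ssh -o BatchMode=yes <host> true
fails=0

# FQDN: hostname -f should return a fully-qualified name (i.e. contain a dot).
fqdn=$(hostname -f 2>/dev/null)
case "$fqdn" in
  *.*) echo "OK   FQDN: $fqdn" ;;
  *)   echo "FAIL FQDN not set (check /etc/hosts and /etc/sysconfig/network)"
       fails=$((fails + 1)) ;;
esac

# SELinux: getenforce should report Disabled (skipped if the tool is absent).
if command -v getenforce >/dev/null 2>&1; then
  if [ "$(getenforce)" != "Disabled" ]; then
    echo "FAIL SELinux is not disabled"
    fails=$((fails + 1))
  fi
fi

# iptables: should not be running (skipped if the tool is absent).
if command -v iptables >/dev/null 2>&1; then
  service iptables status >/dev/null 2>&1 && echo "WARN iptables appears to be running"
fi

# NTP: ntpstat exits 0 when the clock is synchronised (skipped if absent).
if command -v ntpstat >/dev/null 2>&1; then
  ntpstat >/dev/null 2>&1 || echo "WARN clock not synchronised via NTP"
fi

echo "$fails check(s) failed"
```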

File Systems / Partitioning

Use the following information to help you set up the file system partitions on master and slave nodes in an HDP cluster.

Partitioning Recommendations for All Nodes

  • Root partition: OS and core program files
  • Swap: Size 2X system memory

Partitioning Recommendations for Slave Nodes

Hadoop Slave node partitions: Hadoop should have its own partitions for Hadoop files and logs. Drives should be formatted with XFS, ext4, or ext3, in that order of preference. Don't use LVM; it adds latency and causes a bottleneck.

On slave nodes only, all Hadoop partitions should be mounted individually from drives as "/grid/[0-n]".

Hadoop Slave Node Partitioning Configuration Example

swap - 96 GB (2x memory for a 48 GB system)
/ (root) - 20 GB (ample room for existing files, future log file growth, and OS upgrades)
/grid/0/ - [full disk GB] first partition for Hadoop to use for local storage
/grid/1/ - second partition for Hadoop to use
/grid/2/ - ...
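As an illustration of the layout above, the /grid mounts might look like this in /etc/fstab. The device names (sdb1, sdc1, sdd1) are assumptions for a three-data-disk node; noatime is a common Hadoop tuning that avoids access-time writes on data disks.

```
/dev/sdb1  /grid/0  xfs  defaults,noatime  0 0
/dev/sdc1  /grid/1  xfs  defaults,noatime  0 0
/dev/sdd1  /grid/2  xfs  defaults,noatime  0 0
```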

Redundancy (RAID) Recommendations

  • Master nodes -- Configured for reliability (RAID 10, dual Ethernet cards, dual power supplies, etc.)
  • Slave nodes -- RAID is not necessary, as failure on these nodes is managed automatically by the cluster. All data is stored across at least three different hosts, and therefore redundancy is built-in. Slave nodes should be built for speed and low cost.

Recommended Maximum Open File Descriptors

The recommended maximum number of open file descriptors is 10000 or more. To check the current soft and hard limits for open file descriptors, execute the following shell commands:

 ulimit -Sn 
 ulimit -Hn
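If the reported limits are too low, the usual way to raise them persistently on CentOS is an entry in /etc/security/limits.conf such as the following (the wildcard applies to all users; 10000 matches the recommendation above). A new login session is needed for the change to take effect.

```
*  soft  nofile  10000
*  hard  nofile  10000
```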

Install the Ambari Server

Pull down the repo (Instructions here: http://hortonworks.com/hdp/downloads/# and here: http://docs.hortonworks.com/HDPDocuments/Ambari-1.6.0.0/bk_using_Ambari_book/content/ambari-chap2-1.html)

  wget http://public-repo-1.hortonworks.com/ambari/centos6/1.x/updates/1.6.0/ambari.repo
  cp ambari.repo /etc/yum.repos.d

Check things out and install

  yum repolist
  yum install ambari-server

Run through the Ambari setup

[root@hadoop-head yum.repos.d]# ambari-server setup
Using python  /usr/bin/python2.6
Setup ambari-server
Checking SELinux...
SELinux status is 'disabled'
Customize user account for ambari-server daemon [y/n] (n)? 
Adjusting ambari-server permissions and ownership...
Checking iptables...
Checking JDK...
[1] - Oracle JDK 1.7
[2] - Oracle JDK 1.6
[3] - Custom JDK
==============================================================================
Enter choice (1): 
JDK already exists, using /var/lib/ambari-server/resources/jdk-7u45-linux-x64.tar.gz
Installing JDK to /usr/jdk64
Successfully installed JDK to /usr/jdk64/jdk1.7.0_45
JCE Policy archive already exists, using /var/lib/ambari-server/resources/UnlimitedJCEPolicyJDK7.zip
Completing setup...
Configuring database...
Enter advanced database configuration [y/n] (n)? 
Default properties detected. Using built-in database.
Checking PostgreSQL...
Configuring local database...
Connecting to local database...done.
Configuring PostgreSQL...
Backup for pg_hba found, reconfiguration not required
Ambari Server 'setup' completed successfully.

Then start the server

[root@hadoop-head yum.repos.d]# ambari-server start 
Using python  /usr/bin/python2.6
Starting ambari-server
Ambari Server running with 'root' privileges.
Organizing resource files at /var/lib/ambari-server/resources...
Server PID at: /var/run/ambari-server/ambari-server.pid
Server out at: /var/log/ambari-server/ambari-server.out
Server log at: /var/log/ambari-server/ambari-server.log
Ambari Server 'start' completed successfully.

At this stage you can point your web browser at http://headnode:8080/ (Ambari serves plain HTTP by default) and log in with admin/admin
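The same check can be done from the shell via Ambari's REST API. The hostname headnode and the admin/admin credentials come from the text above; /api/v1/clusters is a standard Ambari REST endpoint, and the command degrades gracefully when the server is not reachable.

```shell
# Query the Ambari REST API; fall back to a message if curl is missing
# or the server is unreachable.
out=$( (command -v curl >/dev/null 2>&1 \
        && curl -s -u admin:admin http://headnode:8080/api/v1/clusters) \
       || echo "Ambari server not reachable" )
echo "$out"
```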

Reset the Ambari Configuration

If any of the information provided was incorrect, or you want to go back and start again, run the following (note that after a reset you will typically need to run ambari-server setup again before starting):

 ambari-server stop
 ambari-server reset
 ambari-server start