Difference between revisions of "MapR: Installation"

From Define Wiki

Revision as of 15:13, 26 March 2015

The MapR quick installer automates the process of configuring a Hadoop cluster and installing MapR software based on node type. You can install the MapR distribution for Hadoop on a set of nodes from any machine that can connect to the nodes. Using the quick installer, you can configure each node in a MapR cluster as one of the following types:

  • Control Nodes: Control nodes manage the operation of the cluster. Control nodes host the ZooKeeper, CLDB, JobTracker, ResourceManager, and Webserver services. One control node also hosts the HistoryServer.
  • Data Nodes: Data nodes host the NodeManager, TaskTracker, and FileServer services. These nodes store data, run YARN applications and MapReduce jobs, and process table data.
  • Control-as-Data Nodes: Control-as-data nodes combine control and data node functionality. This node type is appropriate for small clusters.
  • Client Nodes: Client nodes provide controlled user access to the cluster.

Ecosystem Component Installation

In addition to installing the core components of the MapR Hadoop distribution, the MapR quick installer supports installation of Apache Spark, Hive, and HBase. To install the Spark and Hive ecosystem components, you must use the quick installer configuration file. You can also use the configuration file to install HBase. Alternatively, when you run the quick installer in interactive mode, the installer asks whether you want HBase or MapR-DB installed. Entering y at these prompts instructs the installer to install HBase and/or MapR-DB during the installation process.

Installation Steps

To successfully install MapR using the quick installer, complete the following steps:

  1. Make sure your installation machine and nodes meet all of the prerequisites.
  2. Prepare for the installation and set up the installation machine.
  3. Run the quick installer.
  4. Complete the post-installation steps.

Prerequisites

Verify that your installation machine and the nodes that you plan to install MapR on meet the required prerequisites.

Installation Machine Prerequisites

The machine from which you run the quick installer must run one of the following operating systems:

  • Ubuntu 12.04 or later
  • RedHat (with the EPEL repository installed) 6.1 or later
  • CentOS (with the EPEL repository installed) 6.1 or later
  • SuSE 11SP2
    To install from a machine running SuSE, you must create a symbolic link named libssl.so.10 that points to libssl.so.1.0.0 under /usr/lib64 before you install.
    Example:
      cd /usr/lib64
      ln -s libssl.so.1.0.0 libssl.so.10
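
The symbolic link step can be made safe to re-run. The following is a minimal sketch, not part of the MapR tooling: the function name ensure_libssl_link is hypothetical, and it assumes libssl.so.1.0.0 is already present in the target directory (/usr/lib64 by default, as in the example above).

```shell
#!/bin/sh
# Create the libssl.so.10 link only if it does not exist yet.
# The directory defaults to /usr/lib64; pass another path to try it elsewhere.
ensure_libssl_link() {
    lib_dir="${1:-/usr/lib64}"
    target="libssl.so.1.0.0"
    link="libssl.so.10"

    if [ ! -e "$lib_dir/$target" ]; then
        echo "missing $lib_dir/$target" >&2
        return 1
    fi
    if [ -e "$lib_dir/$link" ] || [ -L "$lib_dir/$link" ]; then
        echo "exists"
        return 0
    fi
    # Same effect as: cd "$lib_dir" && ln -s libssl.so.1.0.0 libssl.so.10
    ( cd "$lib_dir" && ln -s "$target" "$link" ) && echo "created"
}
```

Writing to /usr/lib64 normally requires root, so on a real system the function would be run from a root shell.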

MapR Node Prerequisites

The nodes that you install MapR on must meet the following prerequisites:

  • Java 1.7 or 1.8
  • Python 2.6 or later
  • The operating system on each node must have the package dependencies listed below. The quick installer should install these dependencies automatically. If it does not, you can install them manually. For RedHat and CentOS, you must have the EPEL repository installed for the quick installer to install the dependencies automatically.
    1. Ubuntu/SuSE package dependencies:
      python-pycurl
      openssl
      sshpass
    2. RedHat/CentOS package dependencies:
      python-pycurl
      libselinux-python
      openssl
      sshpass
      openssh-clients
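
If the quick installer does not install the dependencies automatically, a quick check on each node shows what is missing. This is a rough sketch under stated assumptions: the function names are hypothetical, and the mapping from package names to command names (for example, openssh-clients providing ssh) is an assumption. It does not check versions, and it does not cover python-pycurl or libselinux-python, which are Python modules rather than commands.

```shell
#!/bin/sh
# Print the name of every listed command that is not found on PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# Check the commands that roughly correspond to the node prerequisites.
# Assumption: java, python, openssl, sshpass, and ssh are the command names
# provided by the prerequisite packages on this distribution.
check_node_prereqs() {
    missing=$(missing_commands java python openssl sshpass ssh)
    if [ -n "$missing" ]; then
        # Unquoted on purpose so the names are joined onto one line.
        echo "missing:" $missing
        return 1
    fi
    echo "ok"
}
```

Running check_node_prereqs on a node prints ok, or a single line that lists the commands to install manually.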

Installing the EPEL Repository

If you need to install the EPEL repository, complete the following steps:

  1. Download the version of the EPEL repository that corresponds to the version of your operating system:
  2. Issue the following command to install the EPEL repository, replacing <version> with the EPEL version:
    • Syntax
      rpm -Uvh epel-release-<version>*.rpm
    • Example
      rpm -Uvh epel-release-6*.rpm
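
Before installing, it can be useful to check whether EPEL is already present. The sketch below assumes the repository package is registered under the name epel-release, which is the name used in the rpm file above; the function name epel_status is hypothetical.

```shell
#!/bin/sh
# Report whether the epel-release package is installed on this machine.
epel_status() {
    if ! command -v rpm >/dev/null 2>&1; then
        echo "no-rpm"        # rpm is not on PATH (not an RPM-based system)
    elif rpm -q epel-release >/dev/null 2>&1; then
        echo "installed"
    else
        echo "missing"       # run the rpm -Uvh step described above
    fi
}
```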

Before You Run the Quick Installer

Before you run the quick installer to install MapR on your cluster, verify that you have completed all of the preparation tasks and set up the installation machine.

Preparing for Installation

Verify that you have completed the following preparation tasks before you set up the installation machine:

  • Determine the number of control nodes. The MapR installer supports one or three control nodes. Three control nodes are typically sufficient for clusters up to approximately 100 nodes.
  • Determine the data and client nodes. The MapR installer supports an arbitrary number of data or client nodes.
  • Ensure all nodes have internet access. This applies to online installation only.
  • Ensure access to a local repository of MapR packages and to Linux distribution repositories. This applies to offline installation only.
  • Decide if you will install Spark or Hive. If you decide to install Apache ecosystem projects, like Spark or Hive, you must install using the configuration file.
  • Verify that all nodes you plan to install on are configured to have the same login information. If you are using the quick installer in interactive mode, described later in this document, verify that all the nodes have the same disks for use by the MapR Hadoop Platform.
  • Identify disks to allocate to the MapR file system. For each node in the cluster, you must identify which disks you want to allocate to the MapR file system. If the same set of disks and partitions applies to all nodes in the cluster, you can use interactive mode for the installer. To specify a distinct set of disks and partitions for individual cluster nodes, you need to use the configuration file. The installer’s interactive mode and configuration files are discussed in depth later in this page.
    Use the lsblk or fdisk -l commands to determine the full path for the disks that you plan to use.
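
The choice between interactive mode and the configuration file follows from two of the tasks above: whether you want Spark or Hive, and whether every node uses the same disks. The following sketch encodes that decision. The function name and the input format (one text file per node, listing one disk path per line) are hypothetical and are not something the installer itself reads.

```shell
#!/bin/sh
# Usage: choose_install_mode <yes|no> <disk-list-file>...
#   first argument: "yes" if Spark or Hive will be installed
#   remaining arguments: one file per node, one disk path per line
choose_install_mode() {
    ecosystem="$1"
    shift
    if [ "$ecosystem" = "yes" ]; then
        echo "config-file"       # Spark and Hive require the configuration file
        return 0
    fi
    reference=""
    have_reference=0
    for f in "$@"; do
        disks=$(sort "$f")       # compare disk sets regardless of line order
        if [ "$have_reference" -eq 0 ]; then
            reference="$disks"
            have_reference=1
        elif [ "$disks" != "$reference" ]; then
            echo "config-file"   # nodes differ, so per-node disk lists are needed
            return 0
        fi
    done
    echo "interactive"           # interactive mode is an option
}
```

The per-node files can be written by hand from the output of lsblk or fdisk -l on each node. A result of interactive means that interactive mode is possible, not that it is required.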

Setting Up the Installation Machine

Complete the following steps to set up the installation machine:

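  1. Download the mapr-setup file for the MapR version that you plan to install. The following examples use the wget utility to download mapr-setup for MapR version 4.0.2. You can also download mapr-setup for MapR v4.0.1.
    • Ubuntu
      wget http://package.mapr.com/releases/v4.0.2/ubuntu/mapr-setup
    • RedHat/CentOS
      wget http://package.mapr.com/releases/v4.0.2/redhat/mapr-setup
    • SuSE
      wget http://package.mapr.com/releases/v4.0.2/suse/mapr-setup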
  2. Navigate to the directory where you downloaded mapr-setup, and enable execute permissions with the following command:
    chmod 755 mapr-setup
  3. Run mapr-setup to unpack the installer files to the /opt/mapr-installer directory. The user running mapr-setup must have write access to the /opt and /tmp directories. You can execute mapr-setup with sudo privileges:
    sudo ./mapr-setup
    The system extracts the installer and copies the setup files to /opt/mapr-installer. The system then prompts you to run /opt/mapr-installer/bin/install to begin the installation process. Follow the guidelines in the Using the MapR Quick Installer section.
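
The setup steps can be scripted. The sketch below is illustrative only. It assumes the download layout http://package.mapr.com/releases/<version>/<distro>/mapr-setup, where <distro> is ubuntu, redhat (also used for CentOS), or suse. Whether versions other than v4.0.2 follow the same layout is an assumption, and both function names are hypothetical.

```shell
#!/bin/sh
# Build the mapr-setup download URL for a distribution and version.
mapr_setup_url() {
    distro="$1"                  # ubuntu, redhat, or suse
    version="${2:-v4.0.2}"
    case "$distro" in
        ubuntu|redhat|suse) ;;
        *) echo "unsupported distro: $distro" >&2; return 1 ;;
    esac
    echo "http://package.mapr.com/releases/$version/$distro/mapr-setup"
}

# Print each listed directory that the current user cannot write to.
unwritable_dirs() {
    for dir in "$@"; do
        [ -w "$dir" ] || echo "$dir"
    done
}

# Example session (not executed here):
#   wget "$(mapr_setup_url redhat)"
#   chmod 755 mapr-setup
#   unwritable_dirs /opt /tmp     # any output means sudo is needed
#   sudo ./mapr-setup
```

If unwritable_dirs /opt /tmp prints nothing, the current user already has the write access that mapr-setup needs and sudo is optional.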

Note:

This installer enables password-authenticated ssh login, which remains enabled after installation. To disable password authentication for ssh manually after installation, add the following line to the sshd_config file and restart ssh:
    PasswordAuthentication no
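
The manual change described in the note can be sketched as a small function. This is an illustration under stated assumptions, not a MapR-provided tool: the function name is hypothetical, the default path /etc/ssh/sshd_config is the usual location but may differ on your system, and the file is assumed to contain no Match blocks (a line appended after a Match block would apply only inside that block).

```shell
#!/bin/sh
# Set "PasswordAuthentication no" in an sshd_config file.
# Active (uncommented) PasswordAuthentication lines are dropped and a single
# explicit setting is appended, so the function can be re-run safely.
disable_password_auth() {
    conf="${1:-/etc/ssh/sshd_config}"
    [ -f "$conf" ] || { echo "no such file: $conf" >&2; return 1; }
    tmp=$(mktemp) || return 1
    # grep -v exits non-zero when it selects no lines; that is not an error here.
    grep -v -i -E '^[[:space:]]*PasswordAuthentication[[:space:]]' "$conf" > "$tmp" || true
    echo "PasswordAuthentication no" >> "$tmp"
    cat "$tmp" > "$conf"         # overwrite in place to keep owner and permissions
    rm -f "$tmp"
}
```

After editing the real file as root, sshd -t checks the configuration for syntax errors. The restart command depends on the distribution; service ssh restart on Ubuntu and service sshd restart on RedHat/CentOS are the likely forms, but verify this for your system. Make sure key-based login works before you disable passwords.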