Difference between revisions of "MapR: Installation"

From Define Wiki

Revision as of 15:03, 26 March 2015

The MapR quick installer automates the process of configuring a Hadoop cluster and installing MapR software based on node type. You can install the MapR distribution for Hadoop on a set of nodes from any machine that can connect to the nodes. Using the quick installer, you can configure each node in a MapR cluster as one of the following types:

  • Control Nodes: Control nodes manage the operation of the cluster. Control nodes host the ZooKeeper, CLDB, JobTracker, ResourceManager, and Webserver services. One control node also hosts the HistoryServer.
  • Data Nodes: Data nodes host the NodeManager, TaskTracker, and FileServer services. These nodes store data, run YARN applications and MapReduce jobs, and process table data.
  • Control-as-Data Nodes: Control-as-data nodes combine control and data node functionality. This node type is appropriate for small clusters.
  • Client Nodes: Client nodes provide controlled user access to the cluster.

Ecosystem Component Installation

In addition to installing the core components of the MapR Hadoop distribution, the MapR quick installer supports installation of Apache Spark, Hive, and HBase. To install the Spark and Hive ecosystem components, you must use the quick installer configuration file. You can also use the configuration file to install HBase. Alternatively, when you run the quick installer in interactive mode, the installer asks whether you want HBase or MapR-DB installed; entering y at a prompt instructs the installer to install that component during the installation process.

Installation Steps

To successfully install MapR using the quick installer, complete the following steps:

  1. Make sure your installation machine and nodes meet all of the prerequisites.
  2. Prepare for the installation and set up the installation machine.
  3. Run the quick installer.
  4. Complete the post-installation steps.

Prerequisites

Verify that your installation machine and the nodes that you plan to install MapR on meet the required prerequisites.

Installation Machine Prerequisites

The machine from which you run the quick installer must run one of the following operating systems:

  • Ubuntu 12.04 or later
  • RedHat (with the EPEL repository installed) 6.1 or later
  • CentOS (with the EPEL repository installed) 6.1 or later
  • SuSE 11SP2
    To install from a machine running SuSE, you must create a symbolic link named libssl.so.10 that points to libssl.so.1.0.0 under /usr/lib64 before you install.
    Example:
      cd /usr/lib64
      ln -s libssl.so.1.0.0 libssl.so.10
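
The symbolic link step can be made safe to repeat. The following is a minimal sketch, not part of the MapR tooling: the function name ensure_libssl_link and its optional directory argument are illustrative assumptions. It creates the link only if the target library exists and nothing is already in place under the link name.

```shell
# Create the libssl.so.10 compatibility link only when it is safe to do so.
# The directory defaults to /usr/lib64, as named in the text above; the
# function name and the optional argument are hypothetical.
ensure_libssl_link() {
    lib_dir="${1:-/usr/lib64}"
    target="libssl.so.1.0.0"
    link="libssl.so.10"

    # The link is useless if the library it should point to is missing.
    if [ ! -e "$lib_dir/$target" ]; then
        echo "missing $lib_dir/$target" >&2
        return 1
    fi

    # Leave an existing file or link alone rather than overwriting it.
    if [ -e "$lib_dir/$link" ] || [ -L "$lib_dir/$link" ]; then
        return 0
    fi

    # Relative target, equivalent to: cd "$lib_dir" && ln -s "$target" "$link"
    ln -s "$target" "$lib_dir/$link"
}
```

Run it as root with no arguments on the SuSE installation machine (ensure_libssl_link), or pass a scratch directory to try it out without touching /usr/lib64.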

MapR Node Prerequisites

The nodes that you install MapR on must meet the following prerequisites:

  • Java 1.7 or 1.8
  • Python 2.6 or later
  • The operating system on each node must meet the listed package dependencies. The quick installer should install these dependencies automatically. If not, you can install them manually. For RedHat and CentOS, you must have the EPEL repository installed for the quick installer to install the dependencies automatically.
    1. Ubuntu/SUSE Package Dependencies:
      python-pycurl
      openssl
      sshpass
    2. RedHat/CentOS Package Dependencies:
      python-pycurl
      libselinux-python
      openssl
      sshpass
      openssh-clients
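
If you want to confirm the prerequisites on a node before running the installer, a quick check along these lines may help. It is a rough sketch under one assumption: that a command being present on PATH is a reasonable proxy for its package being installed. The function name missing_commands is illustrative, and the check does not cover Python libraries such as python-pycurl or libselinux-python.

```shell
# Print each named command that is not found on PATH, one per line.
# Hypothetical helper; it checks commands, not package-manager records.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}
```

For example, missing_commands java python openssl sshpass ssh prints nothing on a node where all five commands are available, and otherwise prints the names that still need to be installed.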

Installing the EPEL Repository

If you need to install the EPEL repository, complete the following steps:

  1. Download the version of the EPEL repository that corresponds to the version of your operating system.
  2. Issue the following command to install the EPEL repository, replacing <version> with the EPEL version:
    • Syntax
      rpm -Uvh epel-release-<version>*.rpm
    • Example
      rpm -Uvh epel-release-6*.rpm
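
The example above uses 6 for a 6.x operating system. Assuming the EPEL version matches the major version of the operating system release, the file pattern can be derived as in the sketch below. The function name epel_package_glob is illustrative and not part of any MapR or EPEL tool.

```shell
# Print the EPEL package file pattern for an OS release such as "6.5".
# Assumption: the EPEL version equals the OS major version.
epel_package_glob() {
    major="${1%%.*}"    # keep everything before the first dot
    printf '%s\n' "epel-release-${major}*.rpm"
}
```

Calling epel_package_glob 6.5 prints epel-release-6*.rpm, which is the pattern used in the example command.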

Before You Run the Quick Installer

Before you run the quick installer to install MapR on your cluster, verify that you have completed all of the preparation tasks and set up the installation machine.

Preparing for Installation

Verify that you have completed the following preparation tasks before you set up the installation machine:

  • Determine the number of control nodes. The MapR installer supports one or three control nodes. Three control nodes are typically sufficient for clusters up to approximately 100 nodes.
  • Determine the data and client nodes. The MapR installer supports an arbitrary number of data or client nodes.
  • Ensure all nodes have internet access. For online installation only.
  • Ensure access to a local repository of MapR packages and Linux distribution repositories. For offline installation only.
  • Decide if you will install Spark or Hive. If you decide to install Apache ecosystem projects, like Spark or Hive, you must install using the configuration file.
  • Verify that all nodes you plan to install on are configured to have the same login information. If you are using the quick installer in interactive mode, described later in this document, verify that all the nodes have the same disks for use by the MapR Hadoop Platform.
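
Because the installer supports only one or three control nodes, it can be worth checking a planned node list before you start. The following sketch assumes you pass the control node hostnames as arguments; the function name check_control_nodes and the hostnames in the usage note are hypothetical.

```shell
# Succeed only when exactly one or three control node names are given.
check_control_nodes() {
    case "$#" in
        1|3)
            return 0
            ;;
        *)
            echo "expected 1 or 3 control nodes, got $#" >&2
            return 1
            ;;
    esac
}
```

For example, check_control_nodes node1 node2 node3 succeeds, while check_control_nodes node1 node2 prints an error and returns a non-zero status.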