Hadoop: Set up a single-host test system
Tests were performed on a single Calxeda SoC running Ubuntu 12.10.
Prerequisites
Install Java/JRE
apt-get update
apt-get install default-jre
Setup Passwordless Access
Set up passwordless SSH for the user that will run Hadoop. This example uses root; a separate hadoop user should really be set up instead.
ssh-keygen -t rsa
# don't enter a passphrase; press Enter twice at the passphrase prompts to leave it blank
cd ~/.ssh
cat id_rsa.pub >> authorized_keys
chmod 600 authorized_keys
Install Hadoop
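Before moving on to the download, it may be worth confirming the SSH setup, since the Hadoop start scripts log in to localhost over SSH. The sketch below only checks file modes; `has_mode` and `check_ssh_perms` are hypothetical helper names, the 700/600 modes are the conventional ones rather than a strict requirement, and `stat -c` assumes GNU coreutils as shipped with Ubuntu.

```shell
# Hypothetical helper: succeed if file $1 has exactly the octal mode $2.
# Uses the GNU coreutils form of stat (assumption: Ubuntu or similar).
has_mode() {
    [ "$(stat -c '%a' "$1")" = "$2" ]
}

# Hypothetical helper: warn if the .ssh directory or authorized_keys
# do not have the conventional restrictive modes.
check_ssh_perms() {
    dir="${1:-$HOME/.ssh}"
    has_mode "$dir" 700 || echo "warning: $dir is not mode 700"
    has_mode "$dir/authorized_keys" 600 || echo "warning: $dir/authorized_keys is not mode 600"
}
```

Calling `check_ssh_perms` with no argument inspects `~/.ssh` and prints nothing when both modes match. Running `ssh localhost` once by hand should then log in without asking for a password (the first connection will still ask to confirm the host key).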
Get latest stable release
The latest release is available from: http://ftp.heanet.ie/mirrors/www.apache.org/dist/hadoop/common/stable/
wget http://ftp.heanet.ie/mirrors/www.apache.org/dist/hadoop/common/stable/hadoop-1.0.3.tar.gz
cd /opt
tar zxvf /path/to/download/hadoop-1.0.3.tar.gz
Setup Config Files
All files referenced below are relative to /opt/hadoop-1.0.3.
conf/core-site.xml:
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
conf/hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
conf/mapred-site.xml:
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
</property>
</configuration>
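The three XML files above can be spot-checked from the shell after editing. This is a minimal sketch, not a real XML parser: `get_prop` is a hypothetical helper that assumes each `<name>` and `<value>` tag sits on its own line, as in the files shown here.

```shell
# Hypothetical helper: print the <value> that follows the <name> equal to $2
# in the Hadoop config file $1. Assumes one tag per line.
get_prop() {
    awk -v want="$2" '
        /<name>/  { n = $0; gsub(/.*<name>|<\/name>.*/, "", n); hit = (n == want) }
        /<value>/ { if (hit) { v = $0; gsub(/.*<value>|<\/value>.*/, "", v); print v; exit } }
    ' "$1"
}
```

For example, `get_prop conf/core-site.xml fs.default.name` run from /opt/hadoop-1.0.3 should print the HDFS URI configured above, and an empty result suggests the property is missing or misspelled.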
conf/hadoop-env.sh:
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-armhf/jre
Format the namenode
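The JAVA_HOME path in conf/hadoop-env.sh is specific to the ARM build used here, so on another machine it is worth confirming before formatting. One possible way, sketched below, is to resolve the `java` symlink and strip the trailing `/bin/java`; `java_home_from` is a hypothetical helper, and `readlink -f` assumes GNU coreutils.

```shell
# Hypothetical helper: derive a JAVA_HOME candidate from the path of a
# java executable by resolving symlinks and dropping the trailing /bin/java.
java_home_from() {
    real=$(readlink -f "$1") || return 1
    dirname "$(dirname "$real")"
}
```

With this defined, `java_home_from "$(command -v java)"` prints a candidate directory that can be compared with the value exported above.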
./bin/hadoop namenode -format
Verify Hadoop
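Listing the test programs works on its own, but actually running one needs the Hadoop daemons to be up. In Hadoop 1.x these are normally started with bin/start-all.sh, and a single-host setup is expected to run NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker. The sketch below checks a process listing for those names; `missing_daemons` is a hypothetical helper, and `jps` ships with a JDK rather than the JRE installed earlier, so its availability here is an assumption.

```shell
# Start the daemons first (run from /opt/hadoop-1.0.3):
#   ./bin/start-all.sh

# Hypothetical helper: read jps-style output on stdin and print the name of
# every expected Hadoop 1.x daemon that does not appear in it.
missing_daemons() {
    out=$(cat)
    for d in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
        # -w matches whole words, so SecondaryNameNode does not count as NameNode
        printf '%s\n' "$out" | grep -qw "$d" || echo "$d"
    done
}

# Usage, assuming jps is installed:
#   jps | missing_daemons
```

No output means all five daemons were found. Any name that is printed points to the corresponding log file under logs/ as the next place to look.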
Check available tests
root@cal4:/opt/hadoop-1.0.3$ ./bin/hadoop jar hadoop-test-1.0.3.jar
An example program must be given as the first argument.
Valid program names are:
DFSCIOTest: Distributed i/o benchmark of libhdfs.
DistributedFSCheck: Distributed checkup of the file system consistency.
MRReliabilityTest: A program that tests the reliability of the MR framework by injecting faults/failures
TestDFSIO: Distributed i/o benchmark.
dfsthroughput: measure hdfs throughput
filebench: Benchmark SequenceFile(Input|Output)Format (block,record compressed and uncompressed), Text(Input|Output)Format (compressed and uncompressed)
loadgen: Generic map/reduce load generator
mapredtest: A map/reduce test check.
mrbench: A map/reduce benchmark that can create many small jobs
nnbench: A benchmark that stresses the namenode.
testarrayfile: A test for flat files of binary key/value pairs.
testbigmapoutput: A map/reduce program that works on a very big non-splittable file and does identity map/reduce
testfilesystem: A test for FileSystem read/write.
testipc: A test for ipc.
testmapredsort: A map/reduce program that validates the map-reduce framework's sort.
testrpc: A test for rpc.
testsequencefile: A test for flat files of binary key value pairs.
testsequencefileinputformat: A test for sequence file input format.
testsetfile: A test for flat files of binary key/value pairs.
testtextinputformat: A test for text input format.
threadedmapbench: A map/reduce benchmark that compares the performance of maps with multiple spills over maps with 1 spill
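Any of the program names listed above can be passed as the first argument after the jar. As a rough sketch, the wrapper below builds that command line for a chosen test. `hadoop_test` and the `DRY_RUN` switch are hypothetical conveniences, and the install path is the one assumed throughout this page.

```shell
# Install location used on this page; override by exporting HADOOP_HOME.
HADOOP_HOME="${HADOOP_HOME:-/opt/hadoop-1.0.3}"

# Hypothetical wrapper: run one of the listed test programs with its arguments.
# With DRY_RUN=1 the command is printed instead of executed.
hadoop_test() {
    set -- "$HADOOP_HOME/bin/hadoop" jar "$HADOOP_HOME/hadoop-test-1.0.3.jar" "$@"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Example: a small TestDFSIO write run (file size is given in MB):
#   hadoop_test TestDFSIO -write -nrFiles 2 -fileSize 10
```

Starting with small file counts and sizes seems sensible on a single node, since dfs.replication is 1 and all I/O lands on the same disk.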