Bright's Hadoop Tests

Automatic tests:

cd /cm/local/apps/cluster-tools/hadoop/
./cm-hadoop-tests.sh <instance> # instance is boston-hdfs in our case
# this only worked on the compute nodes after removing cluster-tools-slave and then installing cluster-tools (Hadoop is set up on the 4 compute nodes, not on the head node)
rpm -e cluster-tools-slave-7.0-41_cm7.0.noarch
yum install cluster-tools-7.0-4764_cm7.0.noarch
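
The package workaround and the test run can be combined into a small wrapper script. This is only a sketch, assuming the package versions above and that cm-hadoop-tests.sh reports failures on its output; the instance name is ours and the log path is an arbitrary choice:

#!/bin/bash
# Sketch: apply the cluster-tools workaround on a compute node, then run Bright's Hadoop tests.
# Assumes the package versions listed above; instance name and log path are local choices.
INSTANCE=boston-hdfs
if rpm -q cluster-tools-slave-7.0-41_cm7.0.noarch >/dev/null 2>&1; then
    rpm -e cluster-tools-slave-7.0-41_cm7.0.noarch        # remove the slave package if present
fi
yum install -y cluster-tools-7.0-4764_cm7.0.noarch        # install the full cluster-tools package
cd /cm/local/apps/cluster-tools/hadoop/
./cm-hadoop-tests.sh "$INSTANCE" 2>&1 | tee /tmp/cm-hadoop-tests.log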

End-user test runs:

  • First, give the user access to the Hadoop instance:
 [bright70->user[fred]]% set hadoophdfsaccess boston-hdfs; commit
  • Then the user can run the tests like this:
[fred@bright70 ~]$ module load hadoop/hdfs1/Hortonworks/2.6.0.2.2.0.0-2041
[fred@bright70 ~]$ module load zookeeper/hdfs1/Hortonworks/3.4.6.2.2.0.0-2041
[fred@bright70 ~]$ module load hbase/hdfs1/Hortonworks/0.98.4.2.2.0.0-2041-hadoop2
[fred@bright70 ~]$ hadoop jar $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.2.2.0.0-2041.jar pi 1 5
...
Job Finished in 19.732 seconds
Estimated value of Pi is 4.00000000000000000000
[fred@bright70 ~]$ hadoop jar $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.2.2.0.0-2041.jar pi <numMappers> <numSamplesPerMapper>
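
With only 1 mapper and 5 samples the quasi-Monte Carlo estimate is very coarse: if every sample lands inside the quarter circle the estimate comes out as exactly 4.0, which is what happened in the run above. More mappers and samples converge toward 3.14159. A sketch of a slightly larger run, assuming the same modules and jar path as above (the 16 and 100000 values are arbitrary):

[fred@bright70 ~]$ hadoop jar $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.2.2.0.0-2041.jar pi 16 100000
# 16 map tasks, 100000 samples per task; expect an estimate close to 3.1415...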

List of All Tests Available

[david@openstack1 ~]$ hadoop jar /cm/shared/apps/hadoop/ubs-test/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.2.2.0.0-2041.jar -h
Unknown program '-h' chosen.
Valid program names are:
  aggregatewordcount: An Aggregate based map/reduce program that counts the words in the input files.
  aggregatewordhist: An Aggregate based map/reduce program that computes the histogram of the words in the input files.
  bbp: A map/reduce program that uses Bailey-Borwein-Plouffe to compute exact digits of Pi.
  dbcount: An example job that count the pageview counts from a database.
  distbbp: A map/reduce program that uses a BBP-type formula to compute exact bits of Pi.
  grep: A map/reduce program that counts the matches of a regex in the input.
  join: A job that effects a join over sorted, equally partitioned datasets
  multifilewc: A job that counts words from several files.
  pentomino: A map/reduce tile laying program to find solutions to pentomino problems.
  pi: A map/reduce program that estimates Pi using a quasi-Monte Carlo method.
  randomtextwriter: A map/reduce program that writes 10GB of random textual data per node.
  randomwriter: A map/reduce program that writes 10GB of random data per node.
  secondarysort: An example defining a secondary sort to the reduce.
  sort: A map/reduce program that sorts the data written by the random writer.
  sudoku: A sudoku solver.
  teragen: Generate data for the terasort
  terasort: Run the terasort
  teravalidate: Checking results of terasort
  wordcount: A map/reduce program that counts the words in the input files.
  wordmean: A map/reduce program that counts the average length of the words in the input files.
  wordmedian: A map/reduce program that counts the median length of the words in the input files.
  wordstandarddeviation: A map/reduce program that counts the standard deviation of the length of the words in the input files.
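
For a heavier end-to-end check, the teragen, terasort and teravalidate programs from the same jar can be chained. A minimal sketch, assuming the jar path from the listing above and that the user has a writable home directory in HDFS; the row count and directory names are arbitrary:

# Sketch: TeraSort round trip using the examples jar listed above.
JAR=/cm/shared/apps/hadoop/ubs-test/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.2.2.0.0-2041.jar
hadoop jar "$JAR" teragen 1000000 tera-in            # generate 1,000,000 rows of 100 bytes each
hadoop jar "$JAR" terasort tera-in tera-out          # sort them
hadoop jar "$JAR" teravalidate tera-out tera-report  # check that the output is globally sorted
hadoop fs -cat tera-report/part-r-00000              # any validation errors are listed here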