== Bright's Hadoop Tests ==

=== Automatic tests ===
<syntaxhighlight>
cd /cm/local/apps/cluster-tools/hadoop/
./cm-hadoop-tests.sh <instance>   # instance is boston-hdfs in our case
# Note: this only worked on the nodes after removing cluster-tools-slave and then
# installing cluster-tools (Hadoop is set up on 4 compute nodes, not on the head node):
rpm -e cluster-tools-slave-7.0-41_cm7.0.noarch
yum install cluster-tools-7.0-4764_cm7.0.noarch
</syntaxhighlight>
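If you are not sure which instance name to pass, the Hadoop instances Bright knows about can be listed from cmsh. A sketch, assuming the hadoop mode of Bright 7.0's cmsh; prompts and instance names will differ per site:
<syntaxhighlight>
[root@bright70 ~]# cmsh
[bright70]% hadoop
[bright70->hadoop]% list
</syntaxhighlight>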
=== End user test runs ===

* First, give the user access to the Hadoop instance:
<syntaxhighlight>
[bright70->user[fred]]% set hadoophdfsaccess boston-hdfs; commit
</syntaxhighlight>
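For context, that prompt comes from cmsh's user mode. The same change as a fuller session might look like this (a sketch; the asterisks are how cmsh marks an object with uncommitted changes, and fred/boston-hdfs are this site's names):
<syntaxhighlight>
[root@bright70 ~]# cmsh
[bright70]% user use fred
[bright70->user[fred]]% set hadoophdfsaccess boston-hdfs
[bright70->user*[fred*]]% commit
</syntaxhighlight>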
* Then the user can run the tests like this:
<syntaxhighlight>
[fred@bright70 ~]$ module load hadoop/hdfs1/Hortonworks/2.6.0.2.2.0.0-2041
[fred@bright70 ~]$ module load zookeeper/hdfs1/Hortonworks/3.4.6.2.2.0.0-2041
[fred@bright70 ~]$ module load hbase/hdfs1/Hortonworks/0.98.4.2.2.0.0-2041-hadoop2
[fred@bright70 ~]$ hadoop jar $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.2.2.0.0-2041.jar pi 1 5
...
Job Finished in 19.732 seconds
Estimated value of Pi is 4.00000000000000000000
[fred@bright70 ~]$ hadoop jar $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.2.2.0.0-2041.jar pi <numMappers> <numSamplesPerMapper>
</syntaxhighlight>
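The general form is on the last line. With only 1 mapper and 5 samples the quasi-Monte Carlo estimate is very coarse (hence the 4.0 above); more mappers and samples converge toward pi and also exercise more of the cluster. For example, with illustrative values:
<syntaxhighlight>
# 16 map tasks, 100000 samples each -- slower, but a much tighter estimate
[fred@bright70 ~]$ hadoop jar $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.2.2.0.0-2041.jar pi 16 100000
</syntaxhighlight>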
* Do some basic HDFS checks (lots more info here: http://wiki.bostonlabs.co.uk/w/index.php?title=Hortonworks_HDP:_Using_the_command_line_to_manage_files_on_HDFS):
<syntaxhighlight>
[david@openstack1 bin]$ cd $HADOOP_PREFIX/bin
[david@openstack1 bin]$ ./hadoop fs -ls /
Found 5 items
drwx------ - root hadoop 0 2015-06-06 15:02 /accumulo
drwx------ - hbase hadoop 0 2015-06-06 15:28 /hbase
drwxrwxrwt - hdfs hadoop 0 2015-06-06 15:44 /staging
drwxrwxrwt - hdfs hadoop 0 2015-06-06 15:01 /tmp
drwxr-xr-x - hdfs hadoop 0 2015-06-06 15:42 /user
</syntaxhighlight>
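Beyond listing the root, a quick write/read round trip confirms that HDFS accepts data. A minimal sketch, assuming the user has a writable HDFS home directory (/user/david here); the paths are illustrative:
<syntaxhighlight>
# create a test directory, copy a file in, read it back, then clean up
[david@openstack1 bin]$ ./hadoop fs -mkdir -p /user/david/smoketest
[david@openstack1 bin]$ ./hadoop fs -put /etc/hosts /user/david/smoketest/
[david@openstack1 bin]$ ./hadoop fs -cat /user/david/smoketest/hosts
[david@openstack1 bin]$ ./hadoop fs -rm -r /user/david/smoketest
</syntaxhighlight>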
=== List of All Tests Available ===
<syntaxhighlight>
[david@openstack1 ~]$ hadoop jar /cm/shared/apps/hadoop/ubs-test/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.2.2.0.0-2041.jar -h
Unknown program '-h' chosen.
Valid program names are:
aggregatewordcount: An Aggregate based map/reduce program that counts the words in the input files.
aggregatewordhist: An Aggregate based map/reduce program that computes the histogram of the words in the input files.
bbp: A map/reduce program that uses Bailey-Borwein-Plouffe to compute exact digits of Pi.
dbcount: An example job that count the pageview counts from a database.
distbbp: A map/reduce program that uses a BBP-type formula to compute exact bits of Pi.
grep: A map/reduce program that counts the matches of a regex in the input.
join: A job that effects a join over sorted, equally partitioned datasets
multifilewc: A job that counts words from several files.
pentomino: A map/reduce tile laying program to find solutions to pentomino problems.
pi: A map/reduce program that estimates Pi using a quasi-Monte Carlo method.
randomtextwriter: A map/reduce program that writes 10GB of random textual data per node.
randomwriter: A map/reduce program that writes 10GB of random data per node.
secondarysort: An example defining a secondary sort to the reduce.
sort: A map/reduce program that sorts the data written by the random writer.
sudoku: A sudoku solver.
teragen: Generate data for the terasort
terasort: Run the terasort
teravalidate: Checking results of terasort
wordcount: A map/reduce program that counts the words in the input files.
wordmean: A map/reduce program that counts the average length of the words in the input files.
wordmedian: A map/reduce program that counts the median length of the words in the input files.
wordstandarddeviation: A map/reduce program that counts the standard deviation of the length of the words in the input files.
</syntaxhighlight>
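Of these, the teragen/terasort/teravalidate trio makes a convenient end-to-end benchmark: generate input, sort it, verify the result. A sketch using the same examples jar as above; the row count and the /user/fred/... paths are illustrative:
<syntaxhighlight>
# generate 1,000,000 rows of 100 bytes each (~100 MB of input)
[fred@bright70 ~]$ hadoop jar $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.2.2.0.0-2041.jar teragen 1000000 /user/fred/tera-in
# sort the generated data
[fred@bright70 ~]$ hadoop jar $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.2.2.0.0-2041.jar terasort /user/fred/tera-in /user/fred/tera-out
# check the output really is sorted; a report is written to the last directory
[fred@bright70 ~]$ hadoop jar $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.2.2.0.0-2041.jar teravalidate /user/fred/tera-out /user/fred/tera-report
</syntaxhighlight>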