Unit tests on Hadoop Cluster

I am new to writing unit tests for Hadoop.

According to https://wiki.apache.org/hadoop/HowToDevelopUnitTests

the Hadoop unit tests are all designed to run on a local machine rather than on a full-scale Hadoop cluster.

However, I see that Hadoop QA https://issues.apache.org/jira/secure/ViewProfile.jspa?name=hadoopqa also runs unit tests when it validates a patch for an issue, for example:

https://issues.apache.org/jira/browse/YARN-2459?focusedCommentId=14115586&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14115586

So does this mean it runs the unit tests only on a single-node / local setup when validating patches?

I want to write some unit tests for an HA (high availability) environment, so I need to test in a cluster setup.

posted Jan 5, 2015 by anonymous

Similar Questions

+2 votes

How do I customize data placement on the DataNodes (DNs) of a Hadoop cluster?

Suppose we change the default block size to 32 MB and the replication factor to 1. The Hadoop cluster consists of 4 DNs and the input data size is 192 MB. I want to place the data on the DNs as follows: DN1 and DN2 hold 2 blocks each (32 + 32 = 64 MB), and DN3 and DN4 hold 1 block each (32 MB). Is this possible? How can it be accomplished?
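As a quick sanity check of the numbers in this question, the sketch below only verifies the arithmetic: 192 MB at a 32 MB block size gives 6 blocks, and the requested layout (2 + 2 + 1 + 1) accounts for all 6. The variable and function names are illustrative, not Hadoop APIs, and the sketch does not show how to control block placement.

```python
import math

# Values taken from the question above.
BLOCK_SIZE_MB = 32
REPLICATION = 1
INPUT_SIZE_MB = 192


def block_count(input_mb, block_mb):
    """Number of blocks needed to hold input_mb of data (last block may be partial)."""
    return math.ceil(input_mb / block_mb)


# Requested number of blocks per DataNode (hypothetical labels from the question).
desired = {"DN1": 2, "DN2": 2, "DN3": 1, "DN4": 1}

# Total block replicas the cluster must store.
total_blocks = block_count(INPUT_SIZE_MB, BLOCK_SIZE_MB) * REPLICATION

# Blocks and megabytes accounted for by the requested layout.
placed_blocks = sum(desired.values())
placed_mb = {dn: n * BLOCK_SIZE_MB for dn, n in desired.items()}

print("blocks needed:", total_blocks)
print("blocks placed:", placed_blocks)
print("MB per DataNode:", placed_mb)
```

The requested layout is at least consistent with the data size; whether HDFS can be made to honor it is the open part of the question.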

+1 vote

Two Hadoop nodes on the same machine while the second machine is not joining the cluster

I have a test cluster of two machines, with Hadoop installed on both. I have configured the cluster, but in the admin UI (as in the picture below) I see that two nodes are running on the same master machine and that the other machine has no Hadoop node.

On the master machine the following services are running:

~$ jps
26310 ResourceManager
27593 Jps
26216 DataNode
26135 NameNode
26557 NodeManager
26701 JobHistoryServer

On the slave machine:

~$ jps
2614 DataNode
2920 Jps
2707 NodeManager

I don't know why the slave is not joining the cluster (it did before). I tried shutting down all servers on both machines, formatting HDFS, and restarting everything, but that did not help. Any help figuring out what is causing this behavior is appreciated.

+1 vote

How to stop a MapReduce job running on a Hadoop cluster from the terminal?

To run a job we use the command
$ hadoop jar example.jar inputpath outputpath
If the job takes too long and we want to stop it midway, which command should be used? Or is there another way to do it?

+1 vote

Can we run a MapReduce job from the Eclipse IDE on a Hadoop cluster in fully distributed mode?

A MapReduce job can be run as a jar file from the terminal or directly from the Eclipse IDE. When a job is run as a jar file from the terminal, it uses multiple JVMs and all the resources of the cluster. Does the same happen when it is run from the IDE? I have run a job both ways, and it takes less time from the IDE than as a jar file from the terminal.

0 votes

How many daemon processes can we run on a Hadoop cluster?

...
