Two Hadoop nodes on the same machine while the second machine is not joining the cluster

+1 vote

I have a test cluster of two machines, and Hadoop is installed on both of them. I have configured the Hadoop cluster, but in the admin UI (as in the picture below) I see that two nodes are running on the same master machine and that the other machine has no Hadoop node.

On the master machine, the following services are running:

~$ jps
26310 ResourceManager
27593 Jps
26216 DataNode
26135 NameNode
26557 NodeManager
26701 JobHistoryServer

On the slave machine:

~$ jps
2614 DataNode
2920 Jps
2707 NodeManager

I don't know why the slave is not joining the cluster (it did before). I tried shutting down all servers on both machines, formatting HDFS, and then restarting everything, but that did not help. Any help figuring out what is causing this behavior is appreciated.

posted Jun 16, 2015 by anonymous

+2 votes

Has anyone gotten these errors before? Please help.

ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: error processing WRITE_BLOCK operation  src: /xxxxxxxx:39000 dst: /xxxxxx:50010

at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.(
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(
2015-01-11 04:13:21,846 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: IOException in offerService
WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Slow BlockReceiver write packet to mirror took 657ms (threshold=300ms)

+1 vote

Assume I have a machine on the same network as a Hadoop 2 cluster but separate from it.

My understanding is that by setting certain elements of the config file or local XML files to point to the cluster, I can launch a job without having to log into the cluster, move my jar to HDFS, and start the job from the cluster's Hadoop machine.

Does this work? What parameters do I need to set? Where is the jar file? What issues would I see if the machine is running Windows with Cygwin installed?

+1 vote

To run a job we use the command
$ hadoop jar example.jar inputpath outputpath
If the job takes too long and we want to stop it midway, which command is used? Or is there any other way to do that?
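
One possible approach on a Hadoop 2 (YARN) cluster is the `yarn application -kill <application id>` command. The sketch below only assembles and optionally runs that command from Python; the helper names `build_kill_command` and `kill_application` and the application ID in the comment are hypothetical, and it assumes the `yarn` client is on the PATH of the machine that runs it.

```python
import subprocess


def build_kill_command(application_id):
    """Return the argument list for `yarn application -kill <id>`.

    YARN application IDs start with "application_"; anything else is
    rejected here so a typo does not reach the cluster.
    """
    if not application_id.startswith("application_"):
        raise ValueError("expected a YARN application ID, got: " + application_id)
    return ["yarn", "application", "-kill", application_id]


def kill_application(application_id):
    """Run the kill command and return its exit code (assumes `yarn` is installed)."""
    completed = subprocess.run(build_kill_command(application_id), check=False)
    return completed.returncode


# Example with a made-up ID (this line only builds the command, it runs nothing):
# build_kill_command("application_1434400000000_0001")
```

The ID of a running job can be looked up with `yarn application -list` or in the ResourceManager web UI.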

+2 votes

Suppose we change the default block size to 32 MB and the replication factor to 1. Suppose the Hadoop cluster consists of 4 DataNodes (DNs) and the input data size is 192 MB. Now I want to place the data on the DNs as follows: DN1 and DN2 hold 2 blocks (32 + 32 = 64 MB) each, and DN3 and DN4 hold 1 block (32 MB) each. Is this possible? How can it be accomplished?
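
As a quick sanity check of the numbers in this question, the following sketch counts the blocks and confirms that the desired layout accounts for all of them. It only checks the arithmetic; it says nothing about whether or how HDFS can be made to place blocks this way. The function names are hypothetical.

```python
def block_count(file_size_mb, block_size_mb):
    """Number of blocks needed for a file (the last block may be partial)."""
    return -(-file_size_mb // block_size_mb)  # ceiling division on integers


def layout_is_consistent(file_size_mb, block_size_mb, replication, blocks_per_node):
    """True if the per-node block counts add up to blocks * replication."""
    total_replicas = block_count(file_size_mb, block_size_mb) * replication
    return sum(blocks_per_node.values()) == total_replicas


# Layout from the question: 192 MB input, 32 MB blocks, replication factor 1.
desired_layout = {"DN1": 2, "DN2": 2, "DN3": 1, "DN4": 1}
blocks = block_count(192, 32)
consistent = layout_is_consistent(192, 32, 1, desired_layout)
```

Here 192 MB / 32 MB gives 6 blocks, and 2 + 2 + 1 + 1 = 6, so the requested distribution at least covers every block exactly once.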