What configuration parameters cause a Hadoop 2.x job to run on the cluster? [CLOSED]

Assume I have a machine on the same network as a Hadoop 2 cluster but separate from it.

My understanding is that by setting certain elements of the config file or local XML files to point to the cluster, I can launch a job without having to log into the cluster, move my jar to HDFS, and start the job from the cluster's Hadoop machine.

Does this work? What parameters do I need to set? Where does the jar file go? What issues would I see if the machine is running Windows with Cygwin installed?

closed with the note: Problem Solved
posted Apr 25, 2014 by Luv Kumar

What version of Hadoop are you using? (YARN or no YARN)

To answer your question: yes, it's possible and simple. All you need to do is have the Hadoop JARs on the classpath, along with the relevant configuration files on the same classpath pointing to the Hadoop cluster. Most often people simply copy core-site.xml, yarn-site.xml, etc. from the actual cluster to the application classpath, and then you can run the job straight from an IDE.
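A quick way to confirm the copied files are actually visible to the application is to ask the class loader for them before submitting anything. The following is only a minimal sketch using the plain JDK, not part of Hadoop: the class name ClusterConfigCheck and the method missingFrom are hypothetical, and the file list is just the three files named in this thread. In a typical Hadoop 2 setup these files carry settings such as fs.defaultFS (core-site.xml), the ResourceManager address (yarn-site.xml), and mapreduce.framework.name (mapred-site.xml), but check your own cluster's copies rather than relying on that.

```java
import java.net.URL;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ClusterConfigCheck {

    // Files named in this thread; mapred-site.xml matters only for MapReduce jobs.
    static final List<String> EXPECTED =
            Arrays.asList("core-site.xml", "yarn-site.xml", "mapred-site.xml");

    // Returns the entries of 'names' that the given class loader cannot locate.
    static List<String> missingFrom(ClassLoader loader, List<String> names) {
        List<String> missing = new ArrayList<>();
        for (String name : names) {
            URL location = loader.getResource(name);
            if (location == null) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Use the loader that loaded this class, i.e. the application classpath.
        ClassLoader loader = ClusterConfigCheck.class.getClassLoader();
        List<String> missing = missingFrom(loader, EXPECTED);
        if (missing.isEmpty()) {
            System.out.println("All cluster configuration files are on the classpath.");
        } else {
            System.out.println("Not found on the classpath: " + missing);
        }
    }
}
```

If a file is reported as not found, the job may fall back to default settings instead of the cluster's, so it is worth running this with the same classpath the IDE uses to launch the job.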

I'm not a Windows user, so I'm not sure about that second part of the question.

Thank you for your answer.
1) I am using YARN.
2) So presumably dropping core-site.xml and yarn-site.xml into user.dir works. Do I need mapred-site.xml as well?

Yes, if you are running MapReduce (MR) jobs.

