Hadoop client setup to access remote cluster


I have set up an HDP 2.3 cluster on Linux (CentOS). Now I am trying to have my ETL programs access this cluster from a Windows environment.
Should I set up Apache Hadoop on a local Windows machine or server? What setup do I need? What goes into core-site.xml (should it mention my remote HDFS URL)?
Any pointers would be helpful.

posted Oct 7, 2015 by anonymous
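
For the core-site.xml part of the question, a client usually needs at least the `fs.defaultFS` property pointing at the remote NameNode. The sketch below generates such a minimal file. The host name `namenode.example.com` and port `8020` are placeholders assumed for illustration, not values from the post; use whatever your cluster's own core-site.xml says.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Hypothetical values: replace with your cluster's NameNode host and RPC port.
NAMENODE_HOST = "namenode.example.com"
NAMENODE_PORT = 8020  # assumed port; check the cluster's own core-site.xml


def build_core_site(host: str, port: int) -> str:
    """Return the text of a minimal client-side core-site.xml."""
    configuration = ET.Element("configuration")
    prop = ET.SubElement(configuration, "property")
    # fs.defaultFS tells the Hadoop client which file system URI to use by default.
    ET.SubElement(prop, "name").text = "fs.defaultFS"
    ET.SubElement(prop, "value").text = f"hdfs://{host}:{port}"
    # Pretty-print so the result is readable when saved to disk.
    return minidom.parseString(ET.tostring(configuration)).toprettyxml(indent="  ")


core_site_xml = build_core_site(NAMENODE_HOST, NAMENODE_PORT)
print(core_site_xml)
```

Saving the output as core-site.xml in the directory that `HADOOP_CONF_DIR` points to on the Windows client, and then running `hdfs dfs -ls /`, is one way to check that the client reaches the remote cluster. Copying the cluster's own core-site.xml and hdfs-site.xml to the client is a common alternative to writing the file by hand.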


Similar Questions

+3 votes

Hadoop multi-tenant cluster setup

How does Hadoop provide multi-tenancy using schedulers? Or, in simple terms: what are the steps to configure a multi-tenant Hadoop cluster?
Here multi-tenancy means that different users can run their applications (similar or different) in such a way that each user is completely unaware of the others, no user can interfere with another user's data in HDFS, so the data stays secure, and each user gets a fair share of resources to run applications in parallel.
Also, is there any way to verify that cluster tenants can run their applications easily, without outside intervention, while their data stays secure and safe in HDFS?

+1 vote

How to stop a mapreduce job from terminal running on Hadoop Cluster?

To run a job we use the command
$ hadoop jar example.jar inputpath outputpath
If the job is taking too long and we want to stop it partway through, which command should we use? Or is there another way to do that?

+1 vote

How to upgrade hadoop cluster?

I want to upgrade my cluster. In the docs, one of the steps is to back up the NameNode's dfs.namenode.name.dir directory.
I have two directories defined in hdfs-site.xml. Should I back up both of them, or just one?

<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:///data/namespace/1,file:///data/namespace/2</value>
</property>

+1 vote

What configuration parameters cause a Hadoop 2.x job to run on the cluster?

Assume I have a machine on the same network as a Hadoop 2 cluster, but separate from it.

My understanding is that by setting certain elements of the config file or local XML files to point to the cluster, I can launch a job without having to log into the cluster, move my jar to HDFS, and start the job from one of the cluster's Hadoop machines.

Does this work? What parameters do I need to set? Where does the jar file go? What issues would I see if the machine is running Windows with Cygwin installed?

+3 votes

Ways to manage user accounts on hadoop cluster when using kerberos security

From the documentation and code: "when kerberos is enabled, all tasks are run as the end user (e.g. as user "joe" and not as hadoop user "mapred") using the task-controller (which is setuid root and, when it runs, does a setuid/setgid etc. to Joe and his groups). For this to work, user "joe" linux account has to be present on all nodes of the cluster."

In an environment with a large and dynamic user population, it is not practical to add every end user to every node of the cluster (and to drop the user again when the end user is deactivated, etc.).

What other options are there to get this working? I am assuming that if the users are in LDAP, using PAM for LDAP could solve the issue. Any other suggestions?

...
