
Run my own application master on a specific node in a YARN cluster

+2 votes
329 views

First of all, I'm using Hadoop-2.6.0. I want to launch my own app master on a specific node in a YARN cluster in order to open a server on a predetermined IP address and port. To that end, I wrote a driver program in which I created a ResourceRequest object, called the setResourceName method to set a hostname, and attached it to an ApplicationSubmissionContext object by calling the setAMContainerResourceRequest method.
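
A minimal sketch of the driver logic just described, for reference (the class name, hostname, and resource figures are illustrative, not from the original post):

 import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
 import org.apache.hadoop.yarn.api.records.Priority;
 import org.apache.hadoop.yarn.api.records.Resource;
 import org.apache.hadoop.yarn.api.records.ResourceRequest;
 import org.apache.hadoop.yarn.util.Records;

 public class AmPlacement {
     // Build a ResourceRequest that pins the AM container to one host and
     // attach it to the application submission context.
     static void requestAmOnNode(ApplicationSubmissionContext appContext,
                                 String hostname) {
         ResourceRequest amReq = Records.newRecord(ResourceRequest.class);
         amReq.setResourceName(hostname);       // e.g. "node01"
         amReq.setCapability(Resource.newInstance(1024, 1));
         amReq.setNumContainers(1);
         amReq.setPriority(Priority.newInstance(0));
         amReq.setRelaxLocality(false);         // don't fall back to other nodes
         appContext.setAMContainerResourceRequest(amReq);
     }
 }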

I tried several times but couldn't launch the app master on the specified node. After searching the code, I found that RMAppAttemptImpl overrides what I set in the ResourceRequest, as follows:

 // Currently, following fields are all hard code,
 // TODO: change these fields when we want to support
 // priority/resource-name/relax-locality specification for AM containers
 // allocation.
 appAttempt.amReq.setNumContainers(1);
 appAttempt.amReq.setPriority(AM_CONTAINER_PRIORITY);
 appAttempt.amReq.setResourceName(ResourceRequest.ANY);
 appAttempt.amReq.setRelaxLocality(true);

Is there another way to launch a container for an application master on a specific node in Hadoop-2.6.0?

posted Mar 30, 2015 by anonymous
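
Not from the original thread, but one possible workaround to note: Hadoop 2.6.0 ships node labels as an alpha feature, and ApplicationSubmissionContext carries a node-label expression that applies to the AM container. Assuming labels are configured on the cluster and a label (here hypothetically named "am-node") has been added and mapped to exactly the target node via yarn rmadmin, a sketch would be:

 // Hypothetical sketch: assumes node labels are configured and the label
 // "am-node" is assigned to just the one node the AM should land on.
 appContext.setNodeLabelExpression("am-node");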


Similar Questions
+1 vote

I'm trying to get the start and finish times for all the jobs that have run on a YARN cluster.

yarn application -list -appStates ALL

will get me most of the details of the jobs, but not the times. However, I can parse its output for the application IDs and then run

yarn application -status $ID

on each application ID to get output that I can parse for the times.

However, this involves making lots of connections to YARN, so it is relatively slow. Is there a single command I can use to get all this information?
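
Not from the thread, but one way to get everything in a single call is the YarnClient Java API, whose application reports include the start and finish times; a sketch:

 import java.util.Date;

 import org.apache.hadoop.yarn.api.records.ApplicationReport;
 import org.apache.hadoop.yarn.client.api.YarnClient;
 import org.apache.hadoop.yarn.conf.YarnConfiguration;

 public class AppTimes {
     public static void main(String[] args) throws Exception {
         YarnClient yarnClient = YarnClient.createYarnClient();
         yarnClient.init(new YarnConfiguration());
         yarnClient.start();
         try {
             // One client call instead of one CLI invocation per application.
             for (ApplicationReport app : yarnClient.getApplications()) {
                 System.out.printf("%s start=%s finish=%s%n",
                         app.getApplicationId(),
                         new Date(app.getStartTime()),
                         new Date(app.getFinishTime()));
             }
         } finally {
             yarnClient.stop();
         }
     }
 }

The ResourceManager REST API (/ws/v1/cluster/apps) exposes the same startedTime/finishedTime fields in one HTTP request, if a script is preferable.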

+1 vote

How can I track a job failure on a node, or a list of nodes, using the YARN APIs? I can get the list of long-running jobs using the YARN client API, but I need to go further, down to the AM, NM, and task attempts for map or reduce.
Say I have a job that has been running for a long time (about 4 hours), possibly because of some task failures.

Please provide the sequence of APIs, or any reference.
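
A sketch of one possible drill-down using the YarnClient API (not from the thread; note that container reports for finished attempts generally require the application history / timeline service to be enabled):

 import org.apache.hadoop.yarn.api.records.ApplicationAttemptReport;
 import org.apache.hadoop.yarn.api.records.ApplicationReport;
 import org.apache.hadoop.yarn.api.records.ContainerReport;
 import org.apache.hadoop.yarn.client.api.YarnClient;
 import org.apache.hadoop.yarn.conf.YarnConfiguration;

 public class FailureDrilldown {
     public static void main(String[] args) throws Exception {
         YarnClient yarn = YarnClient.createYarnClient();
         yarn.init(new YarnConfiguration());
         yarn.start();
         try {
             // Application -> attempts -> containers, recording which node
             // each container ran on and how it exited.
             for (ApplicationReport app : yarn.getApplications()) {
                 for (ApplicationAttemptReport attempt :
                         yarn.getApplicationAttempts(app.getApplicationId())) {
                     for (ContainerReport c :
                             yarn.getContainers(attempt.getApplicationAttemptId())) {
                         System.out.printf("%s node=%s exit=%d diag=%s%n",
                                 c.getContainerId(),
                                 c.getAssignedNode(),
                                 c.getContainerExitStatus(),
                                 c.getDiagnosticsInfo());
                     }
                 }
             }
         } finally {
             yarn.stop();
         }
     }
 }

Individual map/reduce task attempts sit a level below YARN containers; for those, the MapReduce JobHistory server is the place to query.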

+2 votes

I happened to run into this interesting scenario:

I had some Mahout seq2sparse jobs that I originally ran in parallel in distributed mode. Because the input files are so small, running them locally is actually much faster, so I switched them to local mode. But I run 10 of these jobs in parallel, and when all 10 Mahout jobs run together, everything becomes very slow. Is there existing code that takes a desired shell script, and possibly some archive files (which could contain the jar file, or C++-generated executable code), and runs it on the cluster? I understand that I could use the YARN API to code such a thing, but it would be nice if I could just take it and run it from the shell.
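
Not from the thread, but this is close to what the DistributedShell example application bundled with YARN does: it runs a given shell command or script in N containers. An illustrative invocation (the jar path and option values are placeholders; run the Client with -help to see the exact options for your version):

 yarn jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-*.jar \
   org.apache.hadoop.yarn.applications.distributedshell.Client \
   -jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-*.jar \
   -shell_script ./run-seq2sparse.sh \
   -num_containers 10 \
   -container_memory 2048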

+1 vote

A MapReduce job can be run as a jar file from the terminal or directly from the Eclipse IDE. When a job is run as a jar file from the terminal, it uses multiple JVMs and all the resources of the cluster. Does the same thing happen when we run it from the IDE? I have run a job both ways, and it takes less time from the IDE than as a jar file from the terminal.
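
A likely explanation, not stated in the thread: unless the cluster's configuration files are on its classpath, a job launched from the IDE falls back to the LocalJobRunner, a single in-process JVM with no job-submission or container-launch overhead, which is why small jobs finish faster there. A quick way to check which mode is in effect, as a sketch:

 import org.apache.hadoop.conf.Configuration;

 public class FrameworkCheck {
     public static void main(String[] args) {
         Configuration conf = new Configuration();
         // "local" means LocalJobRunner (one JVM); "yarn" means the job
         // is actually submitted to the cluster.
         System.out.println(conf.get("mapreduce.framework.name", "local"));
     }
 }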

+1 vote

I keep encountering an error when running Nutch on Hadoop YARN:

AttemptID:attempt_1423062241884_9970_m_000009_0 Timed out after 600 secs

Some info on my setup: I'm running a 64-node cluster with Hadoop 2.4.1. Each node has 4 cores, 1 disk, and 24 GB of RAM; the namenode/resourcemanager machine has the same specs, only with 8 cores.

I am pretty sure one of these parameters is tied to the threshold I'm hitting:

yarn.am.liveness-monitor.expiry-interval-ms 
yarn.nm.liveness-monitor.expiry-interval-ms 
yarn.resourcemanager.nm.liveness-monitor.interval-ms 

but I would like to understand why.

The issue usually appears under heavier load, and most of the time the next attempts succeed. Also, if I restart the Hadoop cluster, the error goes away for some time.
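
Worth noting, though not confirmed in the thread: the "Timed out after 600 secs" message typically corresponds to mapreduce.task.timeout (default 600000 ms), which fires when a task reports no progress for that long; the liveness-monitor settings above govern AM/NM heartbeats to the ResourceManager instead. A sketch of raising it, with an illustrative value and job name:

 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.mapreduce.Job;

 public class TimeoutTweak {
     public static void main(String[] args) throws Exception {
         Configuration conf = new Configuration();
         // Illustrative: raise the no-progress timeout from the 600 s default
         // to 20 minutes. The cleaner fix is for long-running record processing
         // to call context.progress() so the AM sees the task is alive.
         conf.setLong("mapreduce.task.timeout", 1200000L);
         Job job = Job.getInstance(conf, "nutch-step");
         // ... remaining job setup elided ...
     }
 }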

...