
What is the JobTracker's role in Hadoop?

+1 vote
posted Jul 11, 2017 by Karthick.c


1 Answer

+1 vote

The JobTracker is the service within Hadoop that farms out MapReduce tasks to specific nodes in the cluster, ideally the nodes that have the data, or at least are in the same rack.

1: Client applications submit jobs to the JobTracker.
2: The JobTracker talks to the NameNode to determine the location of the data.
3: The JobTracker locates TaskTracker nodes with available slots at or near the data.
4: The JobTracker submits the work to the chosen TaskTracker nodes.
5: The TaskTracker nodes are monitored. If they do not submit heartbeat signals often enough, they are deemed to have failed.
6: When a TaskTracker fails, its work is rescheduled on a different TaskTracker. A TaskTracker will also notify the JobTracker when an individual task fails. The JobTracker then decides what to do: it may resubmit the task elsewhere, it may mark that specific record as something to avoid, or it may even blacklist the TaskTracker as unreliable.
7: When the work is completed, the JobTracker updates its status.
8: Client applications can poll the JobTracker for information.
The JobTracker is a single point of failure for the Hadoop MapReduce service: if it goes down, all running jobs are halted.
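The heartbeat-based failure detection in step 5 can be sketched in plain Java. This is a minimal simulation of the idea, not Hadoop's actual implementation; the class and method names (HeartbeatMonitor, isAlive) are illustrative only.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of heartbeat-based failure detection: a tracker that
// has not reported within the timeout window is deemed to have failed,
// and its tasks become eligible for rescheduling on a live tracker.
public class HeartbeatMonitor {
    private final long timeoutMillis;
    private final Map<String, Long> lastHeartbeat = new HashMap<>();

    public HeartbeatMonitor(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // A TaskTracker reports in; record the time of its heartbeat.
    public void heartbeat(String trackerId, long now) {
        lastHeartbeat.put(trackerId, now);
    }

    // A tracker is considered alive only if it has ever reported and
    // its last heartbeat is within the timeout window.
    public boolean isAlive(String trackerId, long now) {
        Long last = lastHeartbeat.get(trackerId);
        return last != null && (now - last) <= timeoutMillis;
    }

    public static void main(String[] args) {
        HeartbeatMonitor monitor = new HeartbeatMonitor(10_000); // 10s timeout
        monitor.heartbeat("tracker-1", 0);
        monitor.heartbeat("tracker-2", 0);
        monitor.heartbeat("tracker-1", 9_000); // tracker-2 goes silent
        System.out.println(monitor.isAlive("tracker-1", 15_000)); // true
        System.out.println(monitor.isAlive("tracker-2", 15_000)); // false
    }
}
```

In the real JobTracker the timeout and heartbeat interval are configurable, and a failed tracker's in-flight tasks are re-queued rather than simply flagged, but the core check is this same timestamp comparison.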

answered Jul 11, 2017 by Ajay Kumar
Similar Questions
+1 vote

A MapReduce job can be run as a jar file from the terminal or directly from the Eclipse IDE. When a job runs as a jar file from the terminal, it uses multiple JVMs and all the resources of the cluster. Does the same thing happen when we run it from the IDE? I have run a job both ways, and it takes less time in the IDE than as a jar file from the terminal.

+1 vote

Assume I have a machine on the same network as a Hadoop 2 cluster but separate from it.

My understanding is that by setting certain elements of the config file or local XML files to point to the cluster, I can launch a job without having to log into the cluster, move my jar to HDFS, and start the job from the cluster's Hadoop machine.

Does this work? What parameters do I need to set? Where does the jar file go? What issues would I see if the machine is running Windows with Cygwin installed?
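(For reference, the client-side configuration this question describes typically amounts to pointing a few standard Hadoop 2 properties at the cluster. The snippet below is a sketch; the hostnames are placeholders, and the ports shown are common defaults that may differ on a given cluster.)

```xml
<!-- core-site.xml / mapred-site.xml / yarn-site.xml entries on the
     client machine; namenode-host and resourcemanager-host are
     placeholders for the cluster's actual hostnames. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode-host:8020</value>
  </property>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>resourcemanager-host:8032</value>
  </property>
</configuration>
```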

+1 vote

After upgrading to Hadoop 2 (YARN), I found that mapred.jobtracker.taskScheduler.maxRunningTasksPerJob no longer works. Is that right?

One workaround is to use a queue to limit it, but it is not easy to control from the job submitter.

Is there any way to limit the concurrently running mappers per job? Any documents or pointers?
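(For reference: if I recall correctly, later Hadoop 2 releases added a per-job property for exactly this, settable by the job submitter rather than through queues. Verify the property name and minimum version against your distribution's mapred-default.xml before relying on it.)

```xml
<!-- Per-job cap on concurrently running map tasks; 0 means no limit. -->
<property>
  <name>mapreduce.job.running.map.limit</name>
  <value>20</value>
</property>
```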

+2 votes

I submit an MR job through Hive, but it fails when it runs stage-2. Why? It seems to be a permission problem, but I do not know which directory causes it.

Application application_1388730279827_0035 failed 1 times due to AM Container for appattempt_1388730279827_0035_000001 exited with exitCode: -1000 due to: EPERM: Operation not permitted
    at Method)
    at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(
    at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(
    at org.apache.hadoop.fs.FileSystem.primitiveMkdir(
    at org.apache.hadoop.fs.DelegateToFileSystem.mkdir(
    at org.apache.hadoop.fs.FilterFs.mkdir(
    at org.apache.hadoop.fs.FileContext$
    at org.apache.hadoop.fs.FileContext$
    at org.apache.hadoop.fs.FileContext$FSLinkResolver.resolve(
    at org.apache.hadoop.fs.FileContext.mkdir(
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.initDirs(
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$
Failing this attempt.
Failing the application.