
What is the JobTracker's role in Hadoop?

+1 vote
posted Jul 11, 2017 by Karthick.c


1 Answer

+1 vote

The JobTracker is the service within Hadoop that farms out MapReduce tasks to specific nodes in the cluster, ideally the nodes that have the data, or at least are in the same rack.

1: Client applications submit jobs to the JobTracker.
2: The JobTracker talks to the NameNode to determine the location of the data.
3: The JobTracker locates TaskTracker nodes with available slots at or near the data.
4: The JobTracker submits the work to the chosen TaskTracker nodes.
5: The TaskTracker nodes are monitored. If they do not submit heartbeat signals often enough, they are deemed to have failed.
6: When a TaskTracker fails, its work is rescheduled on a different TaskTracker. A TaskTracker will also notify the JobTracker when an individual task fails. The JobTracker then decides what to do: it may resubmit the task elsewhere, it may mark that specific record as something to avoid, or it may even blacklist the TaskTracker as unreliable.
7: When the work is completed, the JobTracker updates its status.
8: Client applications can poll the JobTracker for information.
The JobTracker is a single point of failure for the Hadoop MapReduce service: if it goes down, all running jobs are halted.
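The heartbeat-based failure detection in step 5 can be sketched in plain Java. This is a minimal simulation of the idea, not Hadoop's actual implementation; the class and method names (HeartbeatMonitor, isAlive) are illustrative only.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of heartbeat-based failure detection: a tracker that
// has not reported within the timeout window is deemed to have failed,
// and its tasks become eligible for rescheduling on a live tracker.
public class HeartbeatMonitor {
    private final long timeoutMillis;
    private final Map<String, Long> lastHeartbeat = new HashMap<>();

    public HeartbeatMonitor(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // A TaskTracker reports in; record the time of its heartbeat.
    public void heartbeat(String trackerId, long now) {
        lastHeartbeat.put(trackerId, now);
    }

    // A tracker is considered alive only if it has ever reported and
    // its last heartbeat is within the timeout window.
    public boolean isAlive(String trackerId, long now) {
        Long last = lastHeartbeat.get(trackerId);
        return last != null && (now - last) <= timeoutMillis;
    }

    public static void main(String[] args) {
        HeartbeatMonitor monitor = new HeartbeatMonitor(10_000); // 10s timeout
        monitor.heartbeat("tracker-1", 0);
        monitor.heartbeat("tracker-2", 0);
        monitor.heartbeat("tracker-1", 9_000); // tracker-2 goes silent
        System.out.println(monitor.isAlive("tracker-1", 15_000)); // true
        System.out.println(monitor.isAlive("tracker-2", 15_000)); // false
    }
}
```

In the real JobTracker the timeout and heartbeat interval are configurable, and a failed tracker's in-flight tasks are re-queued rather than simply flagged, but the core check is this same timestamp comparison.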

answered Jul 11, 2017 by Ajay Kumar
Similar Questions
+1 vote

A MapReduce job can be run as a jar file from the terminal or directly from the Eclipse IDE. When a job runs as a jar file from the terminal, it uses multiple JVMs and all the resources of the cluster. Does the same thing happen when we run it from the IDE? I have run a job both ways, and it takes less time in the IDE than as a jar file from the terminal.

+1 vote

Assume I have a machine on the same network as a Hadoop 2 cluster but separate from it.

My understanding is that by setting certain elements of the config file or local XML files to point to the cluster, I can launch a job without having to log into the cluster, move my jar to HDFS, and start the job from the cluster's Hadoop machine.

Does this work? What parameters do I need to set? Where does the jar file go? What issues would I see if the machine is running Windows with Cygwin installed?
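(For reference, the client-side configuration this question describes typically amounts to pointing a few standard Hadoop 2 properties at the cluster. The snippet below is a sketch; the hostnames are placeholders, and the ports shown are common defaults that may differ on a given cluster.)

```xml
<!-- core-site.xml / mapred-site.xml / yarn-site.xml entries on the
     client machine; namenode-host and resourcemanager-host are
     placeholders for the cluster's actual hostnames. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode-host:8020</value>
  </property>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>resourcemanager-host:8032</value>
  </property>
</configuration>
```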

+1 vote

After upgrading to Hadoop 2 (YARN), I found that mapred.jobtracker.taskScheduler.maxRunningTasksPerJob no longer works. Is that right?

One workaround is to use a queue to limit it, but it is not easy to control from the job submitter.

Is there any way to limit the concurrently running mappers per job? Any documents or pointers?
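(For reference: if I recall correctly, later Hadoop 2 releases added a per-job property for exactly this, settable by the job submitter rather than through queues. Verify the property name and minimum version against your distribution's mapred-default.xml before relying on it.)

```xml
<!-- Per-job cap on concurrently running map tasks; 0 means no limit. -->
<property>
  <name>mapreduce.job.running.map.limit</name>
  <value>20</value>
</property>
```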

+2 votes

I submit an MR job through Hive, but it fails when it runs stage-2. Why? It seems to be a permission problem, but I do not know which directory causes it.

Application application_1388730279827_0035 failed 1 times due to AM Container for appattempt_1388730279827_0035_000001 exited with exitCode: -1000 due to: EPERM: Operation not permitted
    at Method)
    at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(
    at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(
    at org.apache.hadoop.fs.FileSystem.primitiveMkdir(
    at org.apache.hadoop.fs.DelegateToFileSystem.mkdir(
    at org.apache.hadoop.fs.FilterFs.mkdir(
    at org.apache.hadoop.fs.FileContext$
    at org.apache.hadoop.fs.FileContext$
    at org.apache.hadoop.fs.FileContext$FSLinkResolver.resolve(
    at org.apache.hadoop.fs.FileContext.mkdir(
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.initDirs(
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$
Failing this attempt.
Failing the application.