JobTracker and TaskTracker

JobTracker and TaskTracker are the two essential processes involved in MapReduce execution in MRv1 (Hadoop version 1). Both processes are obsolete in MRv2 (Hadoop version 2), where they are replaced by the ResourceManager, ApplicationMaster and NodeManager daemons.

JobTracker –

The JobTracker process runs on a separate node, usually not on a DataNode.
JobTracker is an essential daemon for MapReduce execution in MRv1. It is replaced by ResourceManager/ApplicationMaster in MRv2.
JobTracker receives requests for MapReduce execution from the client.
JobTracker talks to the NameNode to determine the location of the data.
JobTracker finds the best TaskTracker nodes to execute tasks based on data locality (proximity of the data) and the slots available to execute a task on a given node.
JobTracker monitors the individual TaskTrackers and reports the overall status of the job back to the client.
The JobTracker process is critical to the Hadoop cluster in terms of MapReduce execution.
When the JobTracker is down, HDFS will still be functional, but new MapReduce jobs cannot be started and existing MapReduce jobs will be halted.
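
To make the locality and slot-based selection described above concrete, here is a minimal sketch in Python. It is not the actual Hadoop scheduler: the `Tracker` class, the `pick_tracker` function and the host names are hypothetical, and the sketch ignores rack awareness and everything else a real JobTracker takes into account. It only shows the idea of preferring a node that already holds the data and has a free slot.

```python
from dataclasses import dataclass


@dataclass
class Tracker:
    """Hypothetical stand-in for a TaskTracker node (not a Hadoop class)."""
    host: str
    free_slots: int


def pick_tracker(block_hosts, trackers):
    """Return the tracker to run a task on, or None if no slot is free.

    block_hosts: set of host names that store the task's input block
                 (the kind of information the NameNode provides).
    """
    # Only trackers with at least one free slot can accept a task.
    candidates = [t for t in trackers if t.free_slots > 0]
    if not candidates:
        return None
    # Data-local trackers sort first (False < True), then most free slots.
    return min(candidates,
               key=lambda t: (t.host not in block_hosts, -t.free_slots))


trackers = [Tracker("node1", 0), Tracker("node2", 2), Tracker("node3", 4)]
chosen = pick_tracker({"node1", "node2"}, trackers)
print(chosen.host)  # node2: it holds the data and has a free slot
```

In this example node1 holds the data but has no free slot, so the task goes to node2. If no data-local node had a free slot, the sketch would fall back to the node with the most free slots.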

TaskTracker –

TaskTracker runs on DataNodes, usually on all of them.
TaskTracker is replaced by NodeManager in MRv2.
Mapper and Reducer tasks are executed on DataNodes administered by TaskTrackers.
TaskTrackers are assigned Mapper and Reducer tasks to execute by the JobTracker.
TaskTracker is in constant communication with the JobTracker, signalling the progress of the tasks in execution.
TaskTracker failure is not considered fatal. When a TaskTracker becomes unresponsive, the JobTracker assigns the tasks it was executing to another node.
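
The failure handling described above can be sketched roughly as follows. This is an illustration under stated assumptions, not Hadoop code: `reassign_lost_tasks`, the dictionaries and the 600-second timeout are all hypothetical choices made for the example. The idea is that the JobTracker tracks when it last heard from each TaskTracker and moves tasks away from any tracker that has been silent for too long.

```python
def reassign_lost_tasks(last_heartbeat, assignments, now, timeout=600.0):
    """Return a new task -> tracker mapping with unresponsive trackers removed.

    last_heartbeat: dict of tracker name -> time of last heartbeat (seconds)
    assignments:    dict of task name -> tracker name
    timeout:        illustrative value; a tracker silent for longer than
                    this is treated as unresponsive
    """
    alive = [t for t, seen in last_heartbeat.items() if now - seen <= timeout]
    if not alive:
        raise RuntimeError("no responsive TaskTracker left")

    # Count the tasks already held by each responsive tracker.
    load = {t: 0 for t in alive}
    for tracker in assignments.values():
        if tracker in load:
            load[tracker] += 1

    updated = {}
    for task, tracker in assignments.items():
        if tracker not in load:
            # Tracker is unresponsive: move the task to the least-loaded one.
            tracker = min(alive, key=lambda t: load[t])
            load[tracker] += 1
        updated[task] = tracker
    return updated


heartbeats = {"node1": 1000.0, "node2": 300.0, "node3": 990.0}
tasks = {"map_0": "node1", "map_1": "node2", "reduce_0": "node2"}
print(reassign_lost_tasks(heartbeats, tasks, now=1005.0))
# {'map_0': 'node1', 'map_1': 'node3', 'reduce_0': 'node1'}
```

Here node2 has not sent a heartbeat for over 600 seconds, so its two tasks are spread across node1 and node3 while the job keeps running.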


Big Data In Real World

We are a group of Big Data engineers who are passionate about Big Data and related technologies. We have designed, developed, deployed and maintained Big Data applications ranging from batch to real-time streaming platforms. We have seen a wide range of real-world big data problems and implemented some innovative and complex (or simple, depending on how you look at it) solutions.
