
Ravi Namboori Equinix on Hadoop MapReduce Execution

An overview of how Hadoop runs a MapReduce job, explained by Ravi Namboori, a data center expert. The presentation walks through the process step by step: job submission, job initialization, task assignment and heartbeats, task execution, and the task runner. More Hadoop presentations by Ravi Namboori are also available.


Presentation Transcript


  1. How Hadoop Runs a MapReduce Job. Presentation by Ravi Namboori. Visit us: http://ravinamboori.net

  2. Topics Covered: • Flow diagram • Job submission process • Job initialization • Task assignment & heartbeat • Task execution • Task runner

  3. [Flow diagram] Image source: Computaholics. http://ravinamboori.net

  4. Job Submission Process. The job submission process, implemented by JobClient’s submitJob() method: • Asks the jobtracker for a new job ID (step 2) • Checks the output specification of the job • Computes the input splits for the job. If the splits cannot be computed (because the input paths don’t exist, for example), the job is not submitted and an error is thrown to the MapReduce program. http://ravinamboori.net
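
The three checks on this slide can be sketched as a toy model in Python. This is not Hadoop's JobClient code: `ToyJobTracker`, `compute_splits` and `submit_job` are hypothetical names, local paths stand in for the shared filesystem, and each input file becomes exactly one split, which is a simplification. The output check is modeled as "the output directory must not already exist", which is one of the conditions Hadoop's file-based output formats enforce.

```python
import itertools
import os


class ToyJobTracker:
    """Hypothetical stand-in for the jobtracker: hands out IDs, records jobs."""

    def __init__(self):
        self._counter = itertools.count(1)
        self.submitted = []

    def get_new_job_id(self):
        return f"job_{next(self._counter):04d}"


def compute_splits(input_paths):
    """Simplified: one split per input file; fail if any path is missing."""
    for path in input_paths:
        if not os.path.exists(path):
            raise FileNotFoundError(f"Input path does not exist: {path}")
    return [(path, os.path.getsize(path)) for path in input_paths]


def submit_job(tracker, input_paths, output_dir):
    job_id = tracker.get_new_job_id()            # ask for a new job ID
    if os.path.exists(output_dir):               # check the output specification
        raise FileExistsError(f"Output directory already exists: {output_dir}")
    splits = compute_splits(input_paths)         # compute the input splits
    tracker.submitted.append((job_id, splits))   # only now is the job submitted
    return job_id
```

Because both checks raise before the job is recorded, a missing input path leaves `tracker.submitted` unchanged, mirroring "the job is not submitted and an error is thrown".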

  5. Job Initialization. When the JobTracker receives a call to its submitJob() method, it puts the job into an internal queue, from where the job scheduler picks it up and initializes it. Initialization involves creating an object to represent the job being run, which encapsulates its tasks.
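
A minimal Python sketch of the queue-then-initialize pattern described here. All names (`ToyJobQueue`, `ToyJobInProgress`, `initialize_next`) and the task ID format are hypothetical. The sketch assumes one map task per input split and a configured number of reduce tasks, a detail the slide does not spell out.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ToyJobInProgress:
    """Hypothetical object representing a running job and its tasks."""
    job_id: str
    map_tasks: list = field(default_factory=list)
    reduce_tasks: list = field(default_factory=list)


class ToyJobQueue:
    def __init__(self):
        self._pending = deque()   # internal FIFO queue of submitted jobs

    def submit_job(self, job_id, num_splits, num_reduces):
        # Submission only enqueues; no task objects exist yet.
        self._pending.append((job_id, num_splits, num_reduces))

    def initialize_next(self):
        """What the scheduler does when it picks up the next queued job."""
        job_id, num_splits, num_reduces = self._pending.popleft()
        job = ToyJobInProgress(job_id)
        job.map_tasks = [f"{job_id}_m_{i:06d}" for i in range(num_splits)]
        job.reduce_tasks = [f"{job_id}_r_{i:06d}" for i in range(num_reduces)]
        return job
```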

  6. Task Assignment & Heartbeat. Tasktrackers run a simple loop that periodically sends heartbeat method calls to the jobtracker. Heartbeats tell the jobtracker that a tasktracker is alive. Tasktrackers have a fixed number of slots for map tasks and for reduce tasks. The default scheduler fills empty map task slots before reduce task slots: if the tasktracker has at least one empty map task slot, the jobtracker will select a map task; otherwise, it will select a reduce task. http://ravinamboori.net
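
The map-before-reduce rule can be illustrated with a small Python model. `ToyTaskTracker`, `ToyScheduler` and `heartbeat` are hypothetical names, not Hadoop classes, and the sketch hands out at most one task per heartbeat. One assumption beyond the slide: if a map slot is free but no map tasks remain, the sketch falls through to reduce tasks.

```python
from dataclasses import dataclass


@dataclass
class ToyTaskTracker:
    name: str
    free_map_slots: int
    free_reduce_slots: int


class ToyScheduler:
    def __init__(self, map_tasks, reduce_tasks):
        self.pending_maps = list(map_tasks)
        self.pending_reduces = list(reduce_tasks)
        self.alive = set()

    def heartbeat(self, tracker):
        """Handle one heartbeat; return the task assigned, or None."""
        self.alive.add(tracker.name)          # the heartbeat signals liveness
        # Empty map slots are filled before reduce slots.
        if tracker.free_map_slots > 0 and self.pending_maps:
            tracker.free_map_slots -= 1
            return self.pending_maps.pop(0)
        if tracker.free_reduce_slots > 0 and self.pending_reduces:
            tracker.free_reduce_slots -= 1
            return self.pending_reduces.pop(0)
        return None
```

For a tracker with one free map slot and one free reduce slot, the first heartbeat yields a map task, the second a reduce task, and the third nothing, since both slots are then occupied.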

  7. Task Execution. Once the tasktracker has been assigned a task, it: • Localizes the job JAR by copying it from the shared filesystem to the tasktracker’s filesystem • Copies any files the application needs from the distributed cache to the local disk (step 8)
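
Localization amounts to copying files from shared storage to a task-local directory before anything runs. The Python sketch below uses two ordinary directories as stand-ins for the shared filesystem and the tasktracker's disk. The function name `localize_task` and its parameters are hypothetical.

```python
import shutil
from pathlib import Path


def localize_task(shared_dir, local_dir, job_jar, cache_files):
    """Copy the job JAR and any cache files from shared to local storage."""
    shared, local = Path(shared_dir), Path(local_dir)
    local.mkdir(parents=True, exist_ok=True)   # task-local working directory
    copied = []
    for name in [job_jar, *cache_files]:       # JAR first, then cache files
        shutil.copy(shared / name, local / name)
        copied.append(local / name)
    return copied
```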

  8. Task Runner. TaskRunner launches a new Java Virtual Machine (step 9) to run each task in (step 10). http://ravinamboori.net
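
The one-process-per-task idea can be shown by analogy in Python, with a fresh interpreter playing the role of the new JVM. The helper `run_task_in_child` is hypothetical. The point of the sketch is that the task's exit status is observed from outside, so a failing task does not take the launching process down with it.

```python
import subprocess
import sys


def run_task_in_child(task_source):
    """Run task code in a fresh interpreter; return (exit code, stdout)."""
    result = subprocess.run(
        [sys.executable, "-c", task_source],   # new process per task
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout.strip()
```

A task that exits with a nonzero status is reported back as a return code rather than raising in the parent.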

  9. THANKS Presentation by Ravi Namboori Visit us: http://ravinamboori.in
