Lecture #6 MapReduce (II)

CS492 Special Topics in Computer Science: Distributed Algorithms and Systems

MapReduce Assumptions
• Hardware
  • Components are reliable
  • Components are homogeneous
• Software
  • It’s correct
• Network
  • Latency is zero
  • Bandwidth is infinite
  • It’s secure
• Overall system
  • Configuration is stable
  • There is one administrator

MapReduce Execution Overview

Question of the Day What goes on underneath?

Step #1 The MapReduce library splits the input files into M pieces (64 MB each), then starts up many copies of the program.
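The splitting in Step #1 can be sketched as simple offset arithmetic. This is a minimal illustration, not the library's actual code: the function name `compute_splits` and the 200 MB example file are assumptions made for the example.

```python
def compute_splits(file_size, split_size=64 * 1024 * 1024):
    """Return (offset, length) pairs that cover a file of file_size bytes.

    Hypothetical helper: each pair describes one input piece for one map task.
    """
    splits = []
    offset = 0
    while offset < file_size:
        # The last piece may be shorter than split_size.
        length = min(split_size, file_size - offset)
        splits.append((offset, length))
        offset += length
    return splits


# A 200 MB input yields three full 64 MB pieces plus one 8 MB piece.
splits = compute_splits(200 * 1024 * 1024)
M = len(splits)
```

Here M is derived from the input size, so a larger input produces more map tasks rather than bigger ones.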

Step #2 One special copy of the program (the master) assigns work to the remaining copies (the workers). There are M map tasks and R reduce tasks to assign.

Step #3 A worker assigned a map task runs the Map function on its input split. The output is buffered in memory.

Step #4 Periodically, the buffered output is written to local disk, partitioned into R regions by the partitioning function. The locations of these regions are passed on to the master.
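A common choice for the partitioning function in Step #4 is hash(key) mod R. The sketch below assumes string keys and uses `zlib.crc32` as the hash so that results are the same in every process; the names `partition` and `partition_buffer` are made up for the example.

```python
import zlib


def partition(key, R):
    """Map an intermediate key to one of R regions via hash(key) mod R."""
    # crc32 is an assumption here; any deterministic hash would do.
    return zlib.crc32(key.encode("utf-8")) % R


def partition_buffer(pairs, R):
    """Distribute buffered (key, value) pairs into R regions."""
    regions = [[] for _ in range(R)]
    for key, value in pairs:
        regions[partition(key, R)].append((key, value))
    return regions


buffered = [("the", 1), ("cat", 1), ("the", 1), ("sat", 1)]
regions = partition_buffer(buffered, R=3)
```

Because the region depends only on the key, all pairs with the same key end up in the same region, and therefore at the same reduce worker.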

Step #5 When a reduce worker is notified by the master about these locations, it uses RPC to read the buffered data from the local disks of the map workers. The reduce worker then sorts the data by intermediate key.

Step #6 The reduce worker iterates over the sorted data and, for each unique key, passes the key and its values to the Reduce function.
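The sort-then-reduce work of a reduce worker can be sketched in a few lines. This is an in-memory illustration under assumptions: the fetched pairs fit in memory, `reduce_fn` is a word-count style reducer, and `run_reduce_task` is a hypothetical name.

```python
from itertools import groupby
from operator import itemgetter


def reduce_fn(key, values):
    """Example Reduce function: sum all values seen for a key."""
    return key, sum(values)


def run_reduce_task(fetched_pairs, reduce_fn):
    """Sort pairs by key, then call reduce_fn once per unique key."""
    ordered = sorted(fetched_pairs, key=itemgetter(0))
    results = []
    # groupby only merges adjacent equal keys, which is why the sort is needed.
    for key, group in groupby(ordered, key=itemgetter(0)):
        results.append(reduce_fn(key, (value for _, value in group)))
    return results


pairs = [("the", 1), ("cat", 1), ("the", 1), ("sat", 1), ("the", 1)]
output = run_reduce_task(pairs, reduce_fn)
# output == [("cat", 1), ("sat", 1), ("the", 3)]
```

Sorting first brings all occurrences of a key together, so the reducer sees each unique key exactly once.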

Step #7 When all Map and Reduce tasks are complete, the master wakes up the user program

Fault Tolerance
• Master detects worker failures
  • How?
• What if a Map worker dies?
• What if a Reduce worker dies after completing its task?
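One way to answer these questions, following the design described in the original MapReduce paper: the master pings workers periodically and treats a worker as failed if it does not reply in time. Completed map tasks on a failed worker are rerun because their output sits on that worker's local disk, while completed reduce tasks are not, since their output is already in the global file system. The sketch below assumes a simple task table of dictionaries; all names and the timeout value are illustrative.

```python
IDLE, IN_PROGRESS, COMPLETED = "idle", "in_progress", "completed"


def find_failed_workers(last_reply, now, timeout):
    """Return workers whose last ping reply is older than timeout seconds."""
    return [w for w, t in last_reply.items() if now - t > timeout]


def handle_worker_failure(tasks, failed_worker):
    """Reset to idle every task that must be rerun; return their ids."""
    rescheduled = []
    for task in tasks:
        if task["worker"] != failed_worker:
            continue
        # Completed map output lived on the failed worker's local disk.
        lost_map_output = task["kind"] == "map" and task["state"] == COMPLETED
        if task["state"] == IN_PROGRESS or lost_map_output:
            task["state"] = IDLE
            task["worker"] = None
            rescheduled.append(task["id"])
    return rescheduled


tasks = [
    {"id": 0, "kind": "map", "state": COMPLETED, "worker": "w1"},
    {"id": 1, "kind": "reduce", "state": COMPLETED, "worker": "w1"},
    {"id": 2, "kind": "reduce", "state": IN_PROGRESS, "worker": "w1"},
    {"id": 3, "kind": "map", "state": COMPLETED, "worker": "w2"},
]
failed = find_failed_workers({"w1": 0.0, "w2": 9.0}, now=10.0, timeout=5.0)
rerun = handle_worker_failure(tasks, failed[0])
```

In this example the completed reduce task (id 1) keeps its state, while the completed map task (id 0) and the in-progress reduce task (id 2) go back to idle.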

Locality
• Task assignment by the Master
• Mapping between the input file and workers?
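A locality-aware assignment can be sketched as follows, assuming the master knows which machines store a replica of each input split. The function name `assign_map_task` and the worker names are hypothetical, and the fallback here is simply "any idle worker", which is cruder than picking a worker near a replica.

```python
def assign_map_task(replica_hosts, idle_workers):
    """Pick an idle worker for a split, preferring one that stores the data.

    Returns (worker, is_local). worker is None when nobody is idle.
    """
    for worker in idle_workers:
        if worker in replica_hosts:
            return worker, True  # input can be read from local disk
    if idle_workers:
        return idle_workers[0], False  # input must be read over the network
    return None, False


choice = assign_map_task({"w3", "w7"}, ["w1", "w7", "w3"])
# choice == ("w7", True)
```

The point of the preference is to save network bandwidth: a local read does not cross the network at all.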

Backup Tasks How to deal with stragglers?
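The usual answer is backup tasks: when the job is close to completion, the master schedules duplicate executions of the tasks still in progress, and a task counts as done when either copy finishes. The sketch below is an assumption-laden illustration: the 0.9 threshold and the name `pick_backup_tasks` are invented for the example.

```python
def pick_backup_tasks(tasks, threshold=0.9):
    """Return ids of in-progress tasks to duplicate once the job is nearly done."""
    if not tasks:
        return []
    done = sum(1 for t in tasks if t["state"] == "completed")
    if done / len(tasks) < threshold:
        return []  # too early; backups would waste resources
    return [t["id"] for t in tasks if t["state"] == "in_progress"]


# 19 of 20 tasks are finished; the last one is a possible straggler.
tasks = [{"id": i, "state": "completed"} for i in range(19)]
tasks.append({"id": 19, "state": "in_progress"})
backups = pick_backup_tasks(tasks)
# backups == [19]
```

Waiting until the end keeps the extra cost small, since only a few tasks are duplicated.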

Refinements
• Partitioning functions
• Ordering guarantees
• Combiner function
• Input and output types
• Side-effects
• Skipping bad records
• Local execution
• Status information
• Counters
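Of these refinements, the combiner function is easy to illustrate. A combiner partially merges map output on the map worker before it is sent over the network. The word-count sketch below assumes the Reduce operation is a sum, which is commutative and associative; `map_fn` and `combine` are names chosen for the example.

```python
from collections import defaultdict


def map_fn(line):
    """Example Map function: emit (word, 1) for every word in a line."""
    return [(word, 1) for word in line.split()]


def combine(pairs):
    """Merge pairs with the same key locally by summing their values."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return sorted(totals.items())


raw = map_fn("to be or not to be")
combined = combine(raw)
# raw has 6 pairs; combined == [("be", 2), ("not", 1), ("or", 1), ("to", 2)]
```

Here six pairs shrink to four before leaving the map worker; with heavily repeated keys the saving is much larger.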

Reading for next class
• “Lessons from Giant-Scale Services” by Eric Brewer, IEEE Internet Computing, July-August 2001
• “The Google File System” by Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung, SOSP 2003, NY
• Short quiz on “Lessons ...”
