
MapReduce : Simplified Data Processing on Large Clusters



  1. MapReduce: Simplified Data Processing on Large Clusters Süleyman Fatih GİRİŞ 500512009

  2. CONTENT 1. Introduction 2. Programming Model • 2.1 Example • 2.2 More Examples 3. Implementation • 3.1 Execution Overview • 3.2 Master Data Structures • 3.3 Fault Tolerance • 3.4 Backup Tasks

  3. CONTENT 4. Refinements • 4.1 Partitioning Function • 4.2 Combiner Function • 4.3 Input and Output Types • 4.4 Side-effects • 4.5 Skipping Bad Records 5. Experience 6. Conclusions

  4. 1. Introduction • MapReduce is a programming model and an associated implementation for processing and generating large data sets. • This allows programmers without any experience with parallel and distributed systems to easily utilize the resources of a large distributed system.

  5. 1. Introduction • Inspired by the map and reduce primitives present in Lisp and many other functional languages. • Map() -- applies a given function to each element of a list • Reduce() -- recombines the results through a given combining operation • Enables automatic parallelization and distribution of large-scale computations • High performance on large clusters

  6. 2. Programming Model • Takes a set of input key/value pairs, and produces a set of output key/value pairs • Expresses the computation with two functions: • Map and Reduce • Map takes an input pair and produces a set of intermediate key/value pairs • Reduce merges together the values sent from Map, to form a possibly smaller set of values
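The two user-supplied functions can be illustrated with the paper's canonical word-count task. This is a minimal local sketch (not Google's implementation); the tiny driver at the bottom stands in for the MapReduce library's grouping machinery, and all names here are illustrative:

```python
# Minimal sketch of the Map/Reduce programming model using word count.
# map_fn, reduce_fn, and the driver are illustrative, not the real library.
from collections import defaultdict

def map_fn(key, value):
    # key: document name (unused here); value: document contents.
    # Emits one intermediate (word, 1) pair per occurrence.
    for word in value.split():
        yield (word, 1)

def reduce_fn(key, values):
    # key: a word; values: all partial counts collected for that word.
    yield sum(values)

# Tiny local driver standing in for the MapReduce library:
intermediate = defaultdict(list)
for k, v in map_fn("doc1", "map reduce map"):
    intermediate[k].append(v)
counts = {k: next(reduce_fn(k, vs)) for k, vs in intermediate.items()}
# counts: {"map": 2, "reduce": 1}
```

Note how Reduce never sees individual pairs, only a key plus the full list of its intermediate values, which is exactly the interface described on the slide.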

  7. 2. Programming Model

  8. 2.1 Example • There are fruits, and the number of sold ones is given. • Apple 9 • Banana 10 • Apple 2 • Strawberry 5 • Strawberry 9 • Apple 4 • Strawberry 3 • Banana 8

  9. 2.1 Example • We split them into 2 pieces. • Apple 9 • Banana 10 • Apple 2 • Strawberry 5 • --------------------------------------------------------------- • Strawberry 9 • Apple 4 • Strawberry 3 • Banana 8

  10. 2.1 Example • Apple 11 • Banana 10 • Strawberry 5 • --------------------------------------------------------------- • Strawberry 12 • Apple 4 • Banana 8

  11. 2.1 Example • Apple 11 • Banana 10 • Strawberry 5 • --------------------------------------------------------------- • Strawberry 12 • Apple 4 • Banana 8 • Key-value pair

  12. 2.2 More Examples • Word Count • Distributed Grep • Count of URL Access Frequency • Reverse Web-Link Graph • Term-Vector per Host • Inverted Index • Distributed Sort

  13. 2.1 Example • As a result we have: • Apple 15 • Banana 18 • Strawberry 17
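The fruit example above can be reproduced end-to-end in a few lines. This sketch (illustrative code, not the real library) follows the slides exactly: two input splits, a per-split partial sum as shown on slides 10-11, then a final sum per key:

```python
# Sketch of the fruit example: split, per-split partial sums, final reduce.
from collections import defaultdict

splits = [
    [("Apple", 9), ("Banana", 10), ("Apple", 2), ("Strawberry", 5)],
    [("Strawberry", 9), ("Apple", 4), ("Strawberry", 3), ("Banana", 8)],
]

def partial_sum(split):
    # Per-split aggregation, as on slides 10-11 (Apple 11, Banana 10, ...).
    sums = defaultdict(int)
    for fruit, n in split:
        sums[fruit] += n
    return dict(sums)

# Collect each split's partial results under their keys...
intermediate = defaultdict(list)
for split in splits:
    for fruit, n in partial_sum(split).items():
        intermediate[fruit].append(n)

# ...and reduce each key's list of partial sums to the final total.
totals = {fruit: sum(ns) for fruit, ns in intermediate.items()}
# totals: {"Apple": 15, "Banana": 10 + 8 = 18, "Strawberry": 17}
```

Running it yields Apple 15, Banana 18, Strawberry 17, matching the slide.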

  14. 3. Implementation • Many different implementations of the MapReduce interface are possible • The right choice depends on the environment • We will mostly examine the implementation targeted to the computing environment in wide use at Google

  15. 3. Implementation • (1) Machines are typically dual-processor x86 machines running Linux, with 2-4 GB of memory per machine. • (2) Commodity networking hardware is used – typically either 100 megabits/second or 1 gigabit/second at the machine level.

  16. 3. Implementation • (3) A cluster consists of hundreds or thousands of machines, and therefore machine failures are common. • (4) Storage is provided by inexpensive IDE disks attached directly to individual machines. • (5) Users submit jobs to a scheduling system. Each job consists of a set of tasks, and is mapped by the scheduler to a set of available machines within a cluster.

  17. 3.1 Execution Overview • The Map invocations are distributed across multiple machines by automatically partitioning the input data into a set of M splits. • Reduce invocations are distributed by partitioning the intermediate key space into R pieces using a partitioning function (e.g., hash(key) mod R)

  18. 3.1 Execution Overview • 1. The MapReduce library in the user program first splits the input files into M pieces of typically 16 megabytes to 64 megabytes (MB) per piece

  19. 3.1 Execution Overview • 2. One of the copies of the program is special – the master. There are M map tasks and R reduce tasks to assign. The master picks idle workers and assigns each one a map task or a reduce task.

  20. 3.1 Execution Overview • 3. A worker who is assigned a map task reads the contents of the corresponding input split. It parses key/value pairs out of the input data and passes each pair to the user-defined Map function.

  21. 3.1 Execution Overview • 4. Periodically, the buffered pairs are written to local disk, partitioned into R regions by the partitioning function. The locations of these buffered pairs on the local disk are passed back to the master.

  22. 3.1 Execution Overview • 5. When a reduce worker is notified by the master about these locations, it uses remote procedure calls to read the buffered data from the local disks of the map workers.

  23. 3.1 Execution Overview • 6. When a reduce worker has read all intermediate data, it sorts it by the intermediate keys so that all occurrences of the same key are grouped together.
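Step 6 is a plain sort-then-group pass: sorting makes equal keys adjacent, so grouping is a single linear scan. A minimal sketch of that idea (illustrative, not the real worker code):

```python
# Sketch of step 6: sort intermediate pairs by key, then group equal
# keys so each key's values can be handed to Reduce together.
from itertools import groupby

pairs = [("b", 1), ("a", 2), ("b", 3), ("a", 4)]  # unsorted map output
pairs.sort(key=lambda kv: kv[0])                  # equal keys now adjacent
grouped = {k: [v for _, v in g]
           for k, g in groupby(pairs, key=lambda kv: kv[0])}
# grouped: {"a": [2, 4], "b": [1, 3]}
```

Note that `groupby` only merges *adjacent* equal keys, which is exactly why the sort must come first, and why the paper sorts rather than hashing here: the amount of intermediate data is typically too large to hold in an in-memory hash table.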

  24. 3.1 Execution Overview • 7. The reduce worker iterates over the sorted intermediate data and for each unique intermediate key encountered, it passes the key and the corresponding set of intermediate values to the user’s Reduce function.

  25. 3.1 Execution Overview • 8. When all map tasks and reduce tasks have been completed, the master wakes up the user program. At this point, the MapReduce call in the user program returns back to the user code.

  26. 3.2 Master Data Structures • For each map task and reduce task, the master stores: • State (idle, in-progress, or completed) • Identity of the worker machine (for non-idle tasks). • For each completed map task, the master stores the locations and sizes of the R intermediate file regions produced by the map task. • Updates to this location and size information are received as map tasks are completed.

  27. 3.3 Fault Tolerance • Worker Failure • If no response is received from a worker in a certain amount of time, the master marks the worker as failed. • Any map tasks completed by the worker are reset back to their initial idle state. • Completed map tasks are re-executed on a failure because their output is stored on the local disk of the failed machine. • Completed reduce tasks do not need to be re-executed since their output is stored in a global file system.

  28. 3.3 Fault Tolerance • Master Failure • Writes periodic checkpoints of the master data structures • If the master task dies, a new copy can be started from the last checkpointed state • Its failure is unlikely; therefore the current implementation aborts the MapReduce computation if the master fails

  29. 3.4 Backup Tasks • One of the causes that lengthens the total time taken for a MapReduce operation is a • Straggler: A machine that takes an unusually long time to complete one of the last few map or reduce tasks in the computation. • Solution: • When a MapReduce operation is close to completion, the master schedules backup executions of the remaining in-progress tasks • The task is marked as completed whenever either the primary or the backup execution completes.

  30. 4. Refinements • Although the basic functionality provided by simply writing Map and Reduce functions is sufficient for most needs, a few extensions are useful.

  31. 4.1 Partitioning Function • The users specify the number of reduce tasks/output files that they desire (R). • Data gets partitioned across these tasks using a partitioning function on the intermediate key. • A default partitioning function is provided that uses hashing (e.g. “hash(key) mod R”)
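The default "hash(key) mod R" rule can be sketched in a couple of lines. This is illustrative only: the slide does not name the hash function the real library uses, so `zlib.crc32` is an assumption chosen because it is deterministic across runs (unlike Python's built-in `hash` for strings):

```python
# Sketch of the default partitioner: hash(key) mod R maps every
# intermediate key to one of R reduce tasks. zlib.crc32 is an assumed
# stand-in for the library's unspecified hash function.
import zlib

R = 4  # number of reduce tasks / output files, chosen by the user

def partition(key: str) -> int:
    return zlib.crc32(key.encode()) % R

p = partition("apple")  # some value in 0..R-1
```

The property that matters is determinism: every occurrence of the same key, from any map task, must land in the same partition so that one reduce task sees all of its values.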

  32. 4.2 Combiner Function • All of the counts will be sent over the network to a single reduce task and then added together by the Reduce function to produce one number. • The user is allowed to specify an optional Combiner function that does partial merging of this data before it is sent over the network. • The Combiner function is executed on each machine that performs a map task
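For word count the Combiner can be the same logic as the reducer, run on the map side to shrink what crosses the network. A hedged sketch (illustrative names, not the real API):

```python
# Sketch of an optional Combiner: partial merging of word counts on the
# map worker before the pairs are sent over the network.
from collections import Counter

def combiner(pairs):
    # Same summing logic a word-count reducer would use.
    merged = Counter()
    for word, n in pairs:
        merged[word] += n
    return list(merged.items())

map_output = [("the", 1), ("the", 1), ("the", 1), ("fox", 1)]
combined = combiner(map_output)
# combined: [("the", 3), ("fox", 1)] -- 2 pairs cross the network, not 4
```

This works because addition is associative and commutative; for such functions, combining partial results locally cannot change the final reduce output, only the volume of intermediate data.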

  33. 4.3 Input and Output Types • The MapReduce library provides support for reading input data in several different formats • For example, “text” mode input treats each line as a key/value pair: • Key is the offset in the file and the value is the contents of the line. • Another common supported format stores a sequence of key/value pairs sorted by key.

  34. 4.4 Side-effects • In some cases, users of MapReduce have found it convenient to produce auxiliary files as additional outputs from their map and/or reduce operators • Typically the application writes to a temporary file and atomically renames this file once it has been fully generated.

  35. 4.5 Skipping Bad Records • There are bugs in user code that cause the Map or Reduce functions to crash deterministically on certain records. • Sometimes it is acceptable to ignore a few records (e.g. when doing statistical analysis on a large data set) • The MapReduce library detects which records cause deterministic crashes and skips these records in order to make forward progress.
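The idea can be sketched as a wrapper that drops a record whose Map call crashes instead of failing the whole job. This is a heavily simplified illustration: the real library detects crashes across worker processes and reports offending records to the master, whereas this sketch just catches an in-process exception; all names here are hypothetical:

```python
# Simplified sketch of skipping bad records: drop records whose Map
# call crashes, up to a limit, instead of aborting the computation.
def run_map_safely(map_fn, records, max_skips=10):
    out, skipped = [], 0
    for rec in records:
        try:
            out.extend(map_fn(rec))
        except Exception:
            skipped += 1          # real library: notify the master instead
            if skipped > max_skips:
                raise             # too many failures: surface the bug
    return out, skipped

def buggy_map(rec):
    return [(rec.lower(), 1)]     # crashes on non-string records

result, skipped = run_map_safely(buggy_map, ["A", None, "B"])
# result: [("a", 1), ("b", 1)], skipped: 1
```

The `max_skips` cap reflects the trade-off on the slide: losing a handful of records is acceptable for statistical workloads, but unbounded skipping would silently hide a real bug.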

  36. 5. Experience • It has been used across a wide range of domains within Google, including: • Large-scale machine learning problems • Clustering problems for the Google News and Froogle products • Extraction of data used to produce reports of popular queries (e.g. Google Zeitgeist) • Extraction of properties of web pages for new experiments and products (e.g. extraction of geographical locations from a large corpus of web pages for localized search) • Large-scale graph computations

  37. 5. Experience

  38. 6. Conclusions • The model is easy to use, even for programmers without experience with parallel and distributed systems • A large variety of problems are easily expressible as MapReduce computations • Scales to large clusters comprising thousands of machines

  39. 6. Conclusions • Restricting the programming model makes it easy to parallelize and distribute computations and to make such computations fault-tolerant • Network bandwidth is a limited resource; a number of optimizations in this system reduce the amount of data sent across the network • Redundant execution can be used to reduce the impact of slow machines, and to handle machine failures and data loss.

  40. THANKS FOR LISTENING
