
Data Parallel and Graph Parallel Systems for Large-scale Data Processing


Presentation Transcript


  1. Data Parallel and Graph Parallel Systems for Large-scale Data Processing Presenter: Kun Li

  2. Threads, Locks, and Messages • ML experts repeatedly solve the same parallel design challenges: • Implement and debug a complex parallel system • Tune for a specific parallel platform • Two months later the conference paper contains: “We implemented ______ in parallel.” • The resulting code: • is difficult to maintain • is difficult to extend • couples the learning model to the parallel implementation

  3. ... a better answer: Map-Reduce / Hadoop • Build learning algorithms on top of high-level parallel abstractions

  4. Motivation • Large-Scale Data Processing • Want to use 1000s of CPUs • But don’t want the hassle of managing them • MapReduce provides • Automatic parallelization & distribution • Fault tolerance • I/O scheduling • Monitoring & status updates

  5. Map/Reduce • map(key, val) is run on each item in the input set • emits new-key / new-val pairs • reduce(key, vals) is run for each unique key emitted by map() • emits final output

  6. Count word occurrences in docs • map(key=url, val=contents): for each word w in contents, emit (w, “1”) • reduce(key=word, values=uniq_counts): sum all “1”s in the values list and emit the result “(word, sum)” • Example: input “see bob throw” / “see spot run” → map output (see, 1) (bob, 1) (run, 1) (see, 1) (spot, 1) (throw, 1) → reduce output (bob, 1) (run, 1) (see, 2) (spot, 1) (throw, 1)
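A minimal runnable sketch of this word-count example in Python. The map_fn/reduce_fn signatures mirror the pseudocode above; the run_mapreduce driver, which simulates the shuffle in memory, is illustrative and is not the Hadoop API.

    from collections import defaultdict

    def map_fn(url, contents):
        # Emit (word, 1) for every word in the document.
        for word in contents.split():
            yield (word, 1)

    def reduce_fn(word, counts):
        # Sum all the 1s emitted for this word.
        yield (word, sum(counts))

    def run_mapreduce(inputs, mapper, reducer):
        # Shuffle: group intermediate values by key, as the framework would.
        groups = defaultdict(list)
        for key, val in inputs.items():
            for k, v in mapper(key, val):
                groups[k].append(v)
        # Reduce: one call per unique intermediate key.
        return [out for k, vs in groups.items() for out in reducer(k, vs)]

    docs = {"doc1": "see bob throw", "doc2": "see spot run"}
    print(run_mapreduce(docs, map_fn, reduce_fn))
    # [('see', 2), ('bob', 1), ('throw', 1), ('spot', 1), ('run', 1)]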

  7. Grep • Input consists of (url+offset, single line) • map(key=url+offset, val=line): • If contents matches regexp, emit (line, “1”) • reduce(key=line, values=uniq_counts): • Don’t do anything; just emit line

  8. Reverse Web-Link Graph • Map • For each URL linking to target, … • Output <target, source> pairs • Reduce • Concatenate list of all source URLs • Output: <target, list(source)> pairs
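Under the same assumptions, the reverse web-link graph fits the run_mapreduce driver from the word-count sketch above; the input format (page mapped to its list of outgoing links) is illustrative.

    def link_map(source_url, outgoing_links):
        # For each URL this page links to, emit a <target, source> pair.
        for target in outgoing_links:
            yield (target, source_url)

    def link_reduce(target, sources):
        # Concatenate the list of all source URLs pointing at this target.
        yield (target, list(sources))

    web = {"a.com": ["b.com", "c.com"], "b.com": ["c.com"]}
    print(run_mapreduce(web, link_map, link_reduce))
    # [('b.com', ['a.com']), ('c.com', ['a.com', 'b.com'])]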

  9. Job Processing • (Figure: a JobTracker coordinating TaskTrackers 0–5 on a “grep” job) • Client submits the “grep” job, indicating code and input files • JobTracker breaks the input file into k chunks (in this case 6) and assigns work to TaskTrackers • After map(), TaskTrackers exchange map output to build the reduce() keyspace • JobTracker breaks the reduce() keyspace into m chunks (in this case 6) and assigns work • reduce() output may go to NDFS

  10. Execution

  11. Parallel Execution

  12. Refinement: Locality Optimization • Master scheduling policy: • Asks GFS for locations of replicas of input file blocks • Map tasks are scheduled so that a GFS input block replica is on the same machine or the same rack • Effect • Thousands of machines read input at local disk speed • Without this, rack switches limit the read rate • Combiner • Useful for saving network bandwidth (a sketch follows below)
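A sketch of what a combiner buys: it pre-aggregates one mapper's local output before the shuffle, so far fewer pairs cross the network. The combiner function name and list-based interface are illustrative, not Hadoop's Combiner API.

    from collections import defaultdict

    def combiner(local_map_output):
        # Collapse this mapper's pairs locally; for word count, many
        # ("the", 1) pairs become a single ("the", n) before shipping.
        partial = defaultdict(int)
        for word, count in local_map_output:
            partial[word] += count
        return list(partial.items())

    pairs = [("the", 1), ("the", 1), ("the", 1), ("cat", 1)]
    print(combiner(pairs))  # [('the', 3), ('cat', 1)] -- 4 pairs shrink to 2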

  13. Map-Reduce for Data-Parallel ML • Excellent for large data-parallel tasks! • Data-Parallel (Map Reduce): Feature Extraction, Cross Validation, Computing Sufficient Statistics • Graph-Parallel: Label Propagation, Lasso, Belief Propagation, Kernel Methods, Tensor Factorization, PageRank, Neural Networks, Deep Belief Networks • Is there more to Machine Learning?

  14. Properties of Graph Parallel Algorithms • Dependency Graph • Factored Computation • Iterative Computation • (Figure: “What I Like” depends on “What My Friends Like”)

  15. Map-Reduce for Data-Parallel ML • Excellent for large data-parallel tasks! • Data-Parallel (Map Reduce): Feature Extraction, Cross Validation, Computing Sufficient Statistics • Graph-Parallel (Map Reduce?): Label Propagation, Lasso, Belief Propagation, Kernel Methods, Tensor Factorization, PageRank, Neural Networks, Deep Belief Networks

  16. Why not use Map-Reduce for Graph Parallel Algorithms?

  17. Data Dependencies • Map-Reduce does not efficiently express dependent data • User must code substantial data transformations • Costly data replication • (Figure: Map-Reduce treats data as independent rows)

  18. Iterative Algorithms • Map-Reduce does not efficiently express iterative algorithms • (Figure: each iteration sweeps all data across CPUs and ends at a barrier, so the slowest processor bounds every iteration)

  19. MapAbuse: Iterative MapReduce • Only a subset of the data needs computation in each iteration, yet every pass processes all of it • (Figure: repeated full-data passes separated by barriers)

  20. MapAbuse: Iterative MapReduce • System is not optimized for iteration: every iteration pays a startup penalty and a disk penalty, since intermediate data is written back to disk between passes

  21. Map-Reduce for Data-Parallel ML • Excellent for large data-parallel tasks! • Data-Parallel (Map Reduce): Feature Extraction, Cross Validation, Computing Sufficient Statistics • Graph-Parallel (GraphLab): SVM, Lasso, Belief Propagation, Kernel Methods, Tensor Factorization, PageRank, Neural Networks, Deep Belief Networks

  22. The GraphLab Framework • Graph Based Data Representation • Update Functions (User Computation) • Scheduler • Consistency Model

  23. Data Graph • A graph with arbitrary data (C++ objects) associated with each vertex and edge. • Graph: • Social Network • Vertex Data: • User profile text • Current interest estimates • Edge Data: • Similarity weights

  24. Implementing the Data Graph • Multicore Setting • In Memory • Relatively straightforward • vertex_data(vid) → data • edge_data(vid, vid) → data • neighbors(vid) → vid_list • Challenge: • Fast lookup, low overhead • Solution: • Dense data structures • Fixed Vdata & Edata types • Immutable graph structure (see the sketch below)
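GraphLab's data graph is implemented in C++; the class below is only a Python stand-in sketching the three lookups listed above (vertex_data, edge_data, neighbors) over a dense vertex array, adjacency lists, and a structure that is treated as immutable once built.

    class DataGraph:
        # Hypothetical in-memory data graph (not the GraphLab API).
        def __init__(self, num_vertices):
            self._vdata = [None] * num_vertices              # vertex_data(vid)
            self._adj = [[] for _ in range(num_vertices)]    # neighbors(vid)
            self._edata = {}                                 # edge_data(u, v)

        def add_edge(self, src, dst, data=None):
            # Structure is built once up front, then never mutated.
            self._adj[src].append(dst)
            self._adj[dst].append(src)
            self._edata[(src, dst)] = data

        def vertex_data(self, vid):
            return self._vdata[vid]

        def set_vertex_data(self, vid, value):
            self._vdata[vid] = value

        def edge_data(self, u, v):
            return self._edata[(u, v)] if (u, v) in self._edata else self._edata[(v, u)]

        def neighbors(self, vid):
            return self._adj[vid]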

  25. The GraphLab Framework • Graph Based Data Representation • Update Functions (User Computation) • Scheduler • Consistency Model

  26. Update Functions • An update function is a user-defined program which, when applied to a vertex, transforms the data in the scope of the vertex:

    label_prop(i, scope) {
      // Get neighborhood data (Likes[i], Wij, Likes[j]) from scope
      // Update the vertex data
      // Reschedule neighbors if needed
      if Likes[i] changes then reschedule_neighbors_of(i);
    }
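To make the pseudocode concrete, here is a hedged Python version built on the DataGraph sketch from slide 24. Treating Likes[i] as a single score recomputed as a weighted average of its neighbors is an illustrative choice, as is the scheduler.add_task interface.

    def label_prop(vid, graph, scheduler, tolerance=1e-3):
        # Gather: read Likes[j] and edge weights Wij from the vertex's scope.
        nbrs = graph.neighbors(vid)
        if not nbrs:
            return
        old_like = graph.vertex_data(vid)
        total_w = sum(graph.edge_data(vid, j) for j in nbrs)
        new_like = sum(graph.edge_data(vid, j) * graph.vertex_data(j)
                       for j in nbrs) / total_w
        # Apply: update the vertex data.
        graph.set_vertex_data(vid, new_like)
        # Scatter: reschedule neighbors only if Likes[i] changed enough.
        if abs(new_like - old_like) > tolerance:
            for j in nbrs:
                scheduler.add_task(j)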

  27. The GraphLab Framework • Graph Based Data Representation • Update Functions (User Computation) • Scheduler • Consistency Model

  28. The Scheduler • The scheduler determines the order in which vertices are updated. • (Figure: the scheduler streams vertices a–k to CPU 1 and CPU 2; updates may place vertices back into the scheduler) • The process repeats until the scheduler is empty.
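A single-threaded sketch of that loop, assuming the hypothetical FifoScheduler name and add_task interface used in the update-function sketch above; real GraphLab runs many worker threads pulling from the scheduler concurrently and offers several scheduling policies.

    from collections import deque

    class FifoScheduler:
        def __init__(self, initial_vertices):
            self._queue = deque(initial_vertices)
            self._pending = set(initial_vertices)

        def add_task(self, vid):
            # Called by update functions to reschedule a neighbor.
            if vid not in self._pending:
                self._pending.add(vid)
                self._queue.append(vid)

        def run(self, graph, update_fn):
            # Apply updates until the scheduler is empty (the stopping
            # rule stated on the slide).
            while self._queue:
                vid = self._queue.popleft()
                self._pending.discard(vid)
                update_fn(vid, graph, self)   # may add_task() more work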

  29. The GraphLab Framework • Graph Based Data Representation • Update Functions (User Computation) • Scheduler • Consistency Model

  30. Ensuring Race-Free Code • How much can computation overlap?

  31. GraphLab Ensures Sequential Consistency • For each parallel execution, there exists a sequential execution of update functions which produces the same result. • (Figure: a parallel schedule across CPU 1 and CPU 2 is equivalent over time to some single-CPU sequential schedule)

  32. Consistency Rules • Full Consistency: guaranteed sequential consistency for all update functions

  33. Full Consistency

  34. Obtaining More Parallelism • Full Consistency → Edge Consistency

  35. Edge Consistency • (Figure: CPU 1 and CPU 2 update different vertices; the data they both touch is only read, which is a safe read)

  36. Consistency Through R/W Locks • Read/Write locks: • Full Consistency: write-lock the vertex, its edges, and its neighbors • Edge Consistency: write-lock the vertex and its adjacent edges, read-lock the neighboring vertices • Locks are acquired in a canonical lock ordering to avoid deadlock
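A sketch of the locking discipline, assuming one lock per vertex. Plain threading.Lock objects stand in for reader/writer locks, and acquiring them in ascending vertex-id order plays the role of the canonical lock ordering that prevents deadlock between overlapping scopes.

    import threading

    NUM_VERTICES = 8
    locks = [threading.Lock() for _ in range(NUM_VERTICES)]  # one lock per vertex

    def lock_scope(vid, neighbors):
        # Edge consistency would write-lock vid and read-lock its neighbors;
        # here every lock is exclusive for simplicity.  Sorting the scope by
        # vertex id means two overlapping scopes always contend on their
        # shared vertices in the same order, so they can never deadlock.
        scope = sorted(set([vid] + list(neighbors)))
        for v in scope:
            locks[v].acquire()
        return scope

    def unlock_scope(scope):
        for v in reversed(scope):
            locks[v].release()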

  37. Consistency Through Scheduling • Edge Consistency Model: two vertices can be updated simultaneously if they do not share an edge • Graph Coloring: two vertices can be assigned the same color if they do not share an edge • Execute all vertices of one color, then a barrier, then the next color (Phase 1, barrier, Phase 2, barrier, Phase 3, …)
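A hedged sketch of the coloring approach: greedily color the graph, then update one color class per phase with a barrier in between. Vertices of the same color share no edge, so edge consistency holds without locks. The sequential inner loop stands in for a parallel-for across CPUs, and the update_fn(v, graph) signature is simplified.

    def greedy_color(graph, vertices):
        # Give each vertex the smallest color not used by an already-colored
        # neighbor; vertices with equal colors never share an edge.
        color = {}
        for v in vertices:
            taken = {color[n] for n in graph.neighbors(v) if n in color}
            color[v] = next(c for c in range(len(vertices) + 1) if c not in taken)
        return color

    def run_in_color_phases(graph, vertices, update_fn):
        color = greedy_color(graph, vertices)
        for phase in sorted(set(color.values())):
            batch = [v for v in vertices if color[v] == phase]
            # Each batch is an independent set, so a real runtime would fan
            # it out across CPUs; the end of this loop is the barrier
            # before the next phase.
            for v in batch:
                update_fn(v, graph)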
