
Parallel and Distributed Programming Models and Languages


Presentation Transcript


  1. Parallel and Distributed Programming Models and Languages 15-740/18-740 Computer Architecture In-Class Discussion Dong Zhou, Kun Li, Mike Ralph

  2. Why distributed computations? • Buzzword: Big Data • Take sorting as an example: how much data can be sorted in 60 seconds? • One computer can read ~60 MB/sec from one disk, i.e. only ~3.6 GB per minute • 2012 world record (Flat Datacenter Storage, Ed Nightingale et al.): 1470 GB • 256 heterogeneous nodes, 1033 disks • Google indexes 100 billion+ web pages

  3. Solution: use many nodes • Grid computing • Hundreds of supercomputers connected by high-speed networks • Cluster computing • Thousands or tens of thousands of PCs connected by high-speed LANs • 1000 nodes potentially give 1000x speedup

  4. Distributed computations are difficult to program • Sending data to/from nodes • Coordinating among nodes • Recovering from node failure • Optimizing for locality • Debugging • …

  5. MapReduce • A programming model for large-scale computations • Process large amounts of input, produce output • No side-effects or persistent state • MapReduce is implemented as a runtime library • Automatic parallelization • Load balancing • Locality optimization • Handling of machine failures

  6. MapReduce design • Input data is partitioned into M splits • Map: extract information from each split • Each map produces R partitions • Shuffle and sort • Bring the matching partition from each of the M map outputs to the same reducer • Reduce: aggregate, summarize, filter or transform • Output is in R result files
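
  To make the dataflow concrete, here is a minimal single-process sketch of this design in Python (illustrative only; the name run_mapreduce is ours, and the real runtime spreads map tasks over many machines and moves partitions across the network during the shuffle):

    from collections import defaultdict

    def run_mapreduce(splits, map_fn, reduce_fn, R):
        # Map phase: each of the M input splits contributes to R partitions,
        # with the partition chosen by a hash of the intermediate key.
        partitions = [defaultdict(list) for _ in range(R)]
        for split in splits:                      # one map task per split
            for k, v in split:
                for k2, v2 in map_fn(k, v):
                    partitions[hash(k2) % R][k2].append(v2)
        # Shuffle and sort: grouping by key happens in the dicts above;
        # sorting the keys mimics the sorted order reducers see.
        # Reduce phase: one reducer per partition -> R result lists ("files").
        return [[out for k2 in sorted(part) for out in reduce_fn(k2, part[k2])]
                for part in partitions]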

  7. More specifically • Programmer specifies two methods • map(k, v) → <k', v'>* • reduce(k', <v'>*) → <k'', v''>* • All v' with same k' are reduced together • Usually also specify: • partition(k', total partitions) → partition for k' • often a simple hash of the key
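
  A sketch of what that partition hook might look like (hypothetical helpers; the default really is just a hash of the key modulo the number of reduce partitions, though a production system would use a stable hash rather than Python's per-process-salted hash()):

    def default_partition(key, total_partitions):
        # Default partitioner: a simple hash of the key.
        return hash(key) % total_partitions

    def host_partition(url_key, total_partitions):
        # A custom partitioner can exploit structure in the key, e.g. send
        # all URLs from the same host to the same reducer / output file.
        host = url_key.split('/')[2] if '://' in url_key else url_key
        return hash(host) % total_partitions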

  8. Runtime

  9. MapReduce is widely applicable • Distributed grep • Distributed clustering • Web link graph reversal • Detecting approx. duplicate web pages • …
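
  Two of these applications fit the model in just a few lines each; a hedged Python sketch (function names and the ERROR pattern are ours, following the examples given in the MapReduce paper):

    import re

    PATTERN = re.compile(r"ERROR")          # the pattern being grepped for

    def grep_map(filename, line):
        # Distributed grep: emit a line only if it matches the pattern.
        if PATTERN.search(line):
            yield (line, "")

    def grep_reduce(line, values):
        # Identity reduce: copy matching lines straight to the output.
        yield (line, "")

    def reverse_map(page_url, outgoing_links):
        # Link graph reversal: emit (target, source) for every link.
        for target in outgoing_links:
            yield (target, page_url)

    def reverse_reduce(target, sources):
        # All pages that link to `target` are now grouped together.
        yield (target, list(sources))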

  10. Dryad • Similar goals to MapReduce • Focus on throughput, not latency • Automatic management of scheduling, distribution, fault tolerance • Computations expressed as a graph • Vertices are computations • Edges are communication channels • Each vertex can have several input and output edges
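
  A minimal sketch of such a job graph in Python (our own toy representation, not Dryad's actual API): vertices hold arbitrary compute functions, and the input lists stand in for the channels.

    class Vertex:
        # One node of the job graph: arbitrary app code plus input edges.
        def __init__(self, name, compute):
            self.name = name
            self.compute = compute      # callable: list of inputs -> output
            self.inputs = []            # upstream vertices (edges = channels)

    def run_dag(sinks):
        # Run vertices in dependency order via memoized recursion; a real
        # job manager instead schedules ready vertices onto cluster machines.
        done = {}
        def run(v):
            if v.name not in done:
                done[v.name] = v.compute([run(u) for u in v.inputs])
            return done[v.name]
        return [run(s) for s in sinks]

    # Tiny example: two reader vertices feeding one aggregating vertex.
    a = Vertex("read_a", lambda _: [1, 2, 3])
    b = Vertex("read_b", lambda _: [4, 5])
    total = Vertex("sum", lambda ins: sum(x for xs in ins for x in xs))
    total.inputs = [a, b]
    print(run_dag([total]))             # -> [15]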

  11. Why use a dataflow graph? • Many programs can be represented as a distributed dataflow graph • The programmer may not have to know this • "SQL-like" queries: LINQ • Dryad will run them for you

  12. Runtime • Vertices (V) run arbitrary app code • Vertices exchange data through files, TCP pipes, etc. • Vertices communicate with the JM to report status • Daemon process (D) executes vertices • Job Manager (JM) consults name server (NS) to discover available machines • JM maintains the job graph and schedules vertices

  13. Job = directed acyclic graph • Data flows from inputs, through processing vertices, to outputs • Vertices are connected by channels (file, pipe, shared memory)

  14. Advantages of DAG over MapReduce • Big jobs are more efficient with Dryad • MapReduce: a big job runs as more than one MR stage • Reducers of each stage write to replicated storage • Output of each reduce: 2 network copies, 3 disks • Dryad: each job is represented with a single DAG • Intermediate vertices write to local files • …

  15. Pig Latin • High-level procedural abstraction of MapReduce • Contains SQL-like primitives • Example:

    good_urls = FILTER urls BY pagerank > 0.2;
    groups = GROUP good_urls BY category;
    big_groups = FILTER groups BY COUNT(good_urls) > 10^6;
    output = FOREACH big_groups GENERATE category, AVG(good_urls.pagerank);

  • Plus user-defined functions (UDFs)
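
  For intuition, the same query written directly in Python over (url, category, pagerank) tuples might look like the sketch below (our own illustration; Pig instead compiles each step into MapReduce jobs):

    from collections import defaultdict

    def big_group_averages(urls):
        good_urls = [u for u in urls if u[2] > 0.2]            # FILTER
        groups = defaultdict(list)                             # GROUP BY
        for url, category, pagerank in good_urls:
            groups[category].append(pagerank)
        return {cat: sum(prs) / len(prs)                       # AVG per group
                for cat, prs in groups.items()
                if len(prs) > 10**6}                           # FILTER by COUNT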

  16. Value • Reduces development time • Procedural vs. declarative • Overhead/performance costs worthwhile? • Analogy: Pig Latin is to MapReduce as C/C++ is to assembly

  17. Green-Marl • High-level graph analysis language/compiler • Uses basic data types and graph primitives • Built-in graph functions • BFS, reverse BFS (RBFS), DFS • Uses domain-specific optimizations • Both architecture-independent and architecture-specific • Compiler translates Green-Marl to another high-level language (e.g., C++)
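
  For intuition, the traversal behind such a BFS primitive is the standard level-synchronous loop below (plain Python, our own sketch; Green-Marl's compiler would instead emit optimized, possibly parallel C++ from a single InBFS-style construct):

    from collections import deque

    def bfs_levels(adj, root):
        # adj: {node: [neighbors]}. Returns each reachable node's BFS level.
        level = {root: 0}
        frontier = deque([root])
        while frontier:
            u = frontier.popleft()
            for v in adj[u]:
                if v not in level:
                    level[v] = level[u] + 1
                    frontier.append(v)
        return level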

  18. Tradeoffs • Achieves speedups over hand-tuned parallel equivalents • Tested only on a single workstation • Only works with graph representations • Difficulty representing certain data sets and computations • Domain-specific vs. general-purpose languages • Future work: more architectures, user-defined data structures

  19. Questions and Discussion

  20. Example: count word frequencies in web pages • Input is files with one doc per record • Map parses a document into words • key = document URL • value = document contents • Example input record: ("doc1", "to be or not to be") • Output of map: ("to", "1"), ("be", "1"), ("or", "1"), ("not", "1"), ("to", "1"), ("be", "1")

  21. Example: count word frequencies in web pages • Reduce: computes the sum for each key • key = "be", values = ("1", "1") → "2" • key = "not", values = ("1") → "1" • key = "or", values = ("1") → "1" • key = "to", values = ("1", "1") → "2" • Output of reduce saved: ("to", "2"), ("be", "2"), ("or", "1"), ("not", "1")

  22. Example: pseudo-code

    Map(String input_key, String input_value):
        // input_key: document name
        // input_value: document contents
        for each word w in input_value:
            EmitIntermediate(w, "1");

    Reduce(String key, Iterator intermediate_values):
        // key: a word, same for input and output
        // intermediate_values: a list of counts
        int result = 0;
        for each v in intermediate_values:
            result += ParseInt(v);
        Emit(AsString(result));
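
  The pseudo-code above translates into runnable Python almost line for line (single-process sketch, just to check the logic against slides 20-21):

    from collections import defaultdict

    def word_count(docs):
        # docs maps document name -> contents, as in slide 20's example.
        counts = defaultdict(int)
        for contents in docs.values():
            for word in contents.split():    # Map: emit (word, "1") ...
                counts[word] += 1            # ... Reduce: sum per word.
        return dict(counts)

    print(word_count({"doc1": "to be or not to be"}))
    # -> {'to': 2, 'be': 2, 'or': 1, 'not': 1}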
