Ex-MATE: Data-Intensive Computing with Large Reduction Objects and Its Application to Graph Mining

  1. Ex-MATE: Data-Intensive Computing with Large Reduction Objects and Its Application to Graph Mining Wei Jiang and Gagan Agrawal

  2. Outline • Background • System Design of Ex-MATE • Parallel Graph Mining with Ex-MATE • Experiments • Related Work • Conclusion

  3. Outline • Background • System Design of Ex-MATE • Parallel Graph Mining with Ex-MATE • Experiments • Related Work • Conclusion

  4. Background (I) • Map-Reduce • Simple API: map and reduce • Easy to write parallel programs • Fault-tolerant for large-scale data centers • Performance? Always a concern for the HPC community • Generalized Reduction • First proposed in FREERIDE, developed at Ohio State (2001-2003) • Shares a similar processing structure with Map-Reduce • The key difference lies in a programmer-managed reduction object • Better performance?

  5. Map-Reduce Execution

  6. Comparing Processing Structures • Reduction Object represents the intermediate state of the execution • Reduce func. is commutative and associative • Sorting, grouping, and similar overheads are eliminated with the reduction function/object
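
The contrast with Map-Reduce can be made concrete with a small sketch. The code below is not the FREERIDE/MATE API; it is a minimal C++ illustration (using a hypothetical word-count task, with made-up names such as ReductionObject, reduce, and combine) of how a programmer-managed reduction object absorbs each input element directly, so no intermediate (key, value) pairs need to be sorted or grouped.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical illustration of the generalized-reduction structure:
// the reduction object holds the intermediate state, and each input
// element is reduced into it directly -- no intermediate (key, value)
// pairs, no sorting/grouping phase.
using ReductionObject = std::map<std::string, long>;  // word -> count

// reduce(): commutative and associative update of the reduction object.
void reduce(ReductionObject& ro, const std::string& word) {
    ro[word] += 1;
}

// combine(): merges per-thread / per-node reduction objects.
void combine(ReductionObject& global, const ReductionObject& local) {
    for (const auto& kv : local) global[kv.first] += kv.second;
}

ReductionObject process(const std::vector<std::vector<std::string>>& splits) {
    ReductionObject global;
    for (const auto& split : splits) {        // each split could be a thread or node
        ReductionObject local;
        for (const auto& word : split) reduce(local, word);
        combine(global, local);               // combination phase
    }
    return global;
}
```

Because reduce is commutative and associative, the per-thread objects can be merged in any order, which is what removes the sorting and grouping steps.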

  7. Our Previous Work • A comparative study between FREERIDE and Hadoop: • FREERIDE outperformed Hadoop by factors of 5 to 10 • Possible reasons: • Java vs. C++? HDFS overheads? Inefficiency of Hadoop? • API difference? • Developed MATE (Map-Reduce system with an AlternaTE API) on top of Phoenix from Stanford • Adopted Generalized Reduction • Focused on API differences • MATE improved on Phoenix by an average of 50% • Avoids the large set of intermediate pairs between Map and Reduce • Reduces memory requirements

  8. Extending MATE • Main issues of the original MATE: • Only works on a single multi-core machine • Requires datasets to reside in memory • Assumes the reduction object MUST fit in memory • This paper extends MATE to address these limitations • Focus on graph mining: an emerging class of applications • Requires large reduction objects as well as large-scale datasets • E.g., PageRank could have an 8GB reduction object! • Support for managing arbitrary-sized reduction objects • Also reads disk-resident input data • Evaluated Ex-MATE using PEGASUS • PEGASUS: a Hadoop-based graph mining system

  9. Outline • Background • System Design of Ex-MATE • Parallel Graph Mining with Ex-MATE • Experiments • Related Work • Conclusion

  10. System Design and Implementation • System design of Ex-MATE • Execution overview • Support of distributed environments • System APIs in Ex-MATE • One set provided by the runtime: operations on reduction objects • Another set defined or customized by the users: reduction, combination, etc. • Runtime in Ex-MATE • Data partitioning • Task scheduling • Other low-level details
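
To make the two API sets more tangible, here is an illustrative-only C++ sketch; the class and method names (ReductionObject, ExMateApp, reduction, combination, finalize) are assumptions for this outline, not the actual Ex-MATE headers.

```cpp
#include <cstddef>

// Set 1: operations on reduction objects, provided by the runtime.
// (Illustrative interface only -- not the real Ex-MATE API.)
class ReductionObject {
public:
    virtual ~ReductionObject() = default;
    virtual void* get(std::size_t offset) = 0;                    // access an element
    virtual void accumulate(std::size_t offset, double value) = 0; // update an element
};

// Set 2: functions defined or customized by the user.
class ExMateApp {
public:
    virtual ~ExMateApp() = default;
    // Fold one chunk of (possibly disk-resident) input into the reduction object.
    virtual void reduction(ReductionObject& ro, const void* chunk, std::size_t len) = 0;
    // Merge reduction objects produced by different threads/nodes.
    virtual void combination(ReductionObject& dst, const ReductionObject& src) = 0;
    // Optional post-processing of the final reduction object.
    virtual void finalize(ReductionObject& /*ro*/) {}
};
```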

  11. Ex-MATE Runtime Overview • Basic one-stage execution

  12. Implementation Considerations • Support for processing very large datasets • Partitioning function: partition and distribute data to a number of nodes • Splitting function: use the multi-core CPU on each node • Management of a large reduction object (R.O.): reduce disk I/O! • Outputs (R.O.) are updated in a demand-driven way • Partition the reduction object into splits • Inputs are re-organized based on data access patterns • Reuse an R.O. split as much as possible in memory • Example: Matrix-Vector Multiplication

  13. An MV-Multiplication Example • (figure: the input matrix is split into blocks such as (1,1), (1,2), and (2,1); each block is combined with the corresponding split of the input vector to update a split of the output vector)
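
The block-partitioned structure hinted at by the figure can be sketched in code. Under the assumption that the matrix is stored as blocks indexed by (row split, column split) and the output vector serves as the reduction object, a simplified single-node version of the reuse strategy from the previous slide might look like this (MatrixBlock, block_mv, and split_size are hypothetical names):

```cpp
#include <vector>

// Minimal sketch (not Ex-MATE code): the output vector plays the role of the
// reduction object and is processed one split at a time, so each split stays
// in memory while every matrix block that updates it is consumed.
struct MatrixBlock {
    int row_split, col_split;                 // block coordinates, e.g. (1,2)
    std::vector<std::vector<double>> a;       // dense block, for illustration only
};

std::vector<double> block_mv(const std::vector<MatrixBlock>& blocks,
                             const std::vector<std::vector<double>>& v_splits,
                             int num_row_splits, int split_size) {
    std::vector<double> result(num_row_splits * split_size, 0.0);
    // Outer loop over output-vector (reduction-object) splits: each split is
    // "loaded" once and reused for all blocks in its row of the matrix.
    for (int r = 0; r < num_row_splits; ++r) {
        std::vector<double> out_split(split_size, 0.0);
        for (const MatrixBlock& b : blocks) {
            if (b.row_split != r) continue;   // only blocks that update split r
            const std::vector<double>& in = v_splits[b.col_split];
            for (int i = 0; i < split_size; ++i)
                for (int j = 0; j < split_size; ++j)
                    out_split[i] += b.a[i][j] * in[j];
        }
        for (int i = 0; i < split_size; ++i)
            result[r * split_size + i] = out_split[i];
    }
    return result;
}
```

The point of the loop order is that out_split stays in memory for an entire row of blocks, matching the "reuse an R.O. split as much as possible" strategy on the previous slide.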

  14. Outline • Background • System Design of Ex-MATE • Parallel Graph Mining with Ex-MATE • Experiments • Related Work • Conclusion

  15. GIM-V for Graph Mining (I) • Generalized Iterative Matrix-Vector Multiplication (GIM-V) • First proposed at CMU • Similar to common MV multiplication: v'(i) = Σ_j m(i,j) × v(j) • Three operations in GIM-V: • combine2: combines m(i,j) and v(j); does not have to be a multiplication • combineAll: combines the n partial results for element i; does not have to be the sum • assign: assigns v(new) to v(i); the previous value of v(i) is updated by a new value • (in standard MV multiplication these are multiplication, sum, and assignment, respectively)
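
One way to read this slide is as a higher-order function: a GIM-V iteration parameterized by the three operations. The following C++ sketch is our framing, not PEGASUS or Ex-MATE code; Edge and gimv_iteration are illustrative names.

```cpp
#include <vector>

// One GIM-V iteration over a sparse matrix, parameterized by the three
// customized operations described on this slide.
struct Edge { int i, j; double m; };          // nonzero m(i, j)

template <typename Combine2, typename CombineAll, typename Assign>
std::vector<double> gimv_iteration(const std::vector<Edge>& edges,
                                   const std::vector<double>& v,
                                   Combine2 combine2, CombineAll combineAll,
                                   Assign assign) {
    // Gather partial results x(i, j) = combine2(m(i, j), v(j)) per row i.
    std::vector<std::vector<double>> partials(v.size());
    for (const Edge& e : edges)
        partials[e.i].push_back(combine2(e.m, v[e.j]));

    // v_new(i) = assign(v(i), combineAll(partial results for i)).
    std::vector<double> v_new(v.size());
    for (std::size_t i = 0; i < v.size(); ++i)
        v_new[i] = assign(v[i], combineAll(partials[i]));
    return v_new;
}
```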

  16. GIM-V for Graph Mining (II) • A set of graph mining applications fits into GIM-V • PageRank, Diameter Estimation, Finding Connected Components, Random Walk with Restart, etc. • Parallelization of GIM-V: • Using Map-Reduce in PEGASUS • A two-stage algorithm: two consecutive map-reduce jobs • Using Generalized Reduction in Ex-MATE • A one-stage algorithm: simpler code

  17. GIM-V Example: PageRank • PageRank is used by Google to calculate the relative importance of web pages • Direct implementation of GIM-V: v(j) is the ranking value • The three customized operations are multiplication (combine2), sum (combineAll), and assignment (assign)
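
Continuing the sketch from the previous GIM-V slide (it reuses the hypothetical Edge struct and gimv_iteration template defined there), the PageRank instantiation plugs in exactly the three operations named here; the damping-factor bookkeeping of full PageRank is left out to keep the plumbing visible.

```cpp
#include <numeric>
#include <vector>

// PageRank-style instantiation of the gimv_iteration sketch above:
// combine2 = multiplication, combineAll = sum, assign = take the new value.
// (Full PageRank also folds in a damping factor; omitted here for brevity.)
std::vector<double> pagerank_step(const std::vector<Edge>& edges,
                                  const std::vector<double>& rank) {
    auto combine2   = [](double m_ij, double v_j) { return m_ij * v_j; };
    auto combineAll = [](const std::vector<double>& xs) {
        return std::accumulate(xs.begin(), xs.end(), 0.0);
    };
    auto assign     = [](double /*v_old*/, double v_new) { return v_new; };
    return gimv_iteration(edges, rank, combine2, combineAll, assign);
}
```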

  18. GIM-V: Other Algorithms • Diameter Estimation: HADI is an algorithm to estimate the diameter of a given graph • The three customized operations are multiplication, bitwise-or, and bitwise-or • Finding Connected Components: HCC is a new algorithm to find the connected components of large graphs • The three customized operations are multiplication, minimal, and minimal
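
As a standalone illustration of the HCC instantiation (multiplication, minimal, minimal), the sketch below propagates the smallest component label to each node's neighbors once per iteration; GraphEdge and hcc_step are made-up names, and HADI would follow the same shape with bitstring values and bitwise-or in place of the minimum.

```cpp
#include <algorithm>
#include <climits>
#include <vector>

// HCC sketch: component labels start as the node IDs and shrink toward the
// smallest ID reachable in the component. combine2 = multiplication (with
// m(i,j) = 1 this just picks up comp[j]), combineAll = minimal, assign =
// minimal. For undirected graphs, edges are assumed stored in both directions.
struct GraphEdge { int i, j; };               // m(i, j) = 1

std::vector<long> hcc_step(const std::vector<GraphEdge>& edges,
                           const std::vector<long>& comp) {
    std::vector<long> partial_min(comp.size(), LONG_MAX);
    for (const GraphEdge& e : edges)          // combine2 then combineAll (min)
        partial_min[e.i] = std::min(partial_min[e.i], comp[e.j]);
    std::vector<long> next(comp.size());
    for (std::size_t i = 0; i < comp.size(); ++i)
        next[i] = std::min(comp[i], partial_min[i]);   // assign: min(old, new)
    return next;
}
```

Iterating hcc_step until the labels stop changing leaves every node carrying the smallest node ID in its connected component.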

  19. Parallelization of GIM-V (I) • Using Map-Reduce: Stage I • Map: Map M(i,j) and V(j) to reducer j

  20. Parallelization of GIM-V (II) • Using Map-Reduce: Stage I (cont.) • Reduce: map combine2(M(i,j), V(j)) to reducer i

  21. Parallelization of GIM-V (III) • Using Map-Reduce: Stage II • Map:

  22. Parallelization of GIM-V (IV) • Using Map-Reduce: Stage II (cont.) • Reduce:
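
Slides 21-22 showed Stage II only as figures, so the following C++ sketch of the full two-stage dataflow fills that part in as an assumption based on the GIM-V description earlier in the talk: Stage I joins M(i,j) with V(j) at reducer j and emits combine2 results keyed by row i; Stage II groups them by i and applies combineAll and assign. The in-memory maps stand in for the shuffles; this is an illustration, not PEGASUS code.

```cpp
#include <map>
#include <utility>
#include <vector>

// Dataflow sketch (not PEGASUS code) of the two-stage Map-Reduce GIM-V,
// shown here for the multiplication/sum/assignment instantiation.
struct MatEntry { int i, j; double m; };

std::vector<double> gimv_two_stage(const std::vector<MatEntry>& M,
                                   const std::vector<double>& V) {
    // Stage I map: route M(i, j) and V(j) to "reducer" j.
    std::map<int, std::pair<std::vector<MatEntry>, double>> by_j;
    for (const MatEntry& e : M) by_j[e.j].first.push_back(e);
    for (int j = 0; j < static_cast<int>(V.size()); ++j) by_j[j].second = V[j];

    // Stage I reduce: emit combine2(M(i, j), V(j)) keyed by row i.
    std::map<int, std::vector<double>> by_i;
    for (const auto& kv : by_j)
        for (const MatEntry& e : kv.second.first)
            by_i[e.i].push_back(e.m * kv.second.second);    // combine2

    // Stage II: group partials by i, then combineAll (sum) and assign.
    std::vector<double> out(V.size(), 0.0);
    for (const auto& kv : by_i) {
        double sum = 0.0;
        for (double x : kv.second) sum += x;                // combineAll
        out[kv.first] = sum;                                // assign
    }
    return out;
}
```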

  23. Parallelization of GIM-V (V) • Using Generalized Reduction in Ex-MATE: • Reduction:

  24. Parallelization of GIM-V (VI) • Using Generalized Reduction in Ex-MATE: • Finalize:
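
For contrast with the two-stage Map-Reduce version, here is a sketch of the one-stage shape: the output vector is the reduction object, each nonzero M(i,j) is reduced into slot i as it is read, and finalize handles the assign step. Slides 23-24 showed the actual code only in figures, so this is our reconstruction under the sum/assignment instantiation, not the real Ex-MATE implementation; GimvReduction and MatEntry2 are illustrative names.

```cpp
#include <vector>

// One-stage generalized-reduction sketch of GIM-V: no intermediate
// (key, value) pairs, just direct updates to the reduction object.
struct MatEntry2 { int i, j; double m; };

struct GimvReduction {
    std::vector<double> ro;                    // reduction object = output vector
    explicit GimvReduction(std::size_t n) : ro(n, 0.0) {}

    // reduction(): combine2 (multiplication) accumulated with combineAll (sum).
    void reduction(const MatEntry2& e, const std::vector<double>& v) {
        ro[e.i] += e.m * v[e.j];
    }

    // finalize(): the assign step -- here the new value simply replaces the
    // old one, so nothing more is needed; min- or bitwise-or-based
    // applications would fold in the old vector here instead.
    void finalize(const std::vector<double>& /*v_old*/) {}
};
```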

  25. Outline • Background • System Design of Ex-MATE • Parallel Graph Mining with Ex-MATE • Experiments • Related Work • Conclusion

  26. Experiment Design • Applications: • Three graph mining algorithms: • PageRank, Diameter Estimation, and Finding Connected Components • Evaluation: • Performance comparison with PEGASUS • PEGASUS provides a naïve version and an optimized version • Speedups with an increasing number of nodes • Scalability speedups with an increasing size of datasets • Experimental platform: • A cluster of multi-core CPU machines • Used up to 128 cores (16 nodes)

  27. Results: Graph Mining (I) • PageRank: 16GB dataset; a graph of 256 million nodes and 1 billion edges • (chart: avg. time per iteration in minutes vs. number of nodes; 10.0x speedup)

  28. Results: Graph Mining (II) • HADI: 16GB dataset; a graph of 256 million nodes and 1 billion edges • (chart: avg. time per iteration in minutes vs. number of nodes; 11.0x speedup)

  29. Results: Graph Mining (III) • HCC: 16GB dataset; a graph of 256 million nodes and 1 billion edges • (chart: avg. time per iteration in minutes vs. number of nodes; 9.0x speedup)

  30. Scalability: Graph Mining (IV) • HCC: 8GB dataset; a graph of 256 million nodes and 0.5 billion edges • (chart: avg. time per iteration in minutes vs. number of nodes; 1.7x and 1.9x speedups)

  31. Scalability: Graph Mining (V) • HCC: 32GB dataset; a graph of 256 million nodes and 2 billion edges • (chart: avg. time per iteration in minutes vs. number of nodes; 1.9x and 2.7x speedups)

  32. Scalability: Graph Mining (VI) • HCC: 64GB dataset; a graph of 256 million nodes and 4 billion edges • (chart: avg. time per iteration in minutes vs. number of nodes; 1.9x and 2.8x speedups)

  33. Observations • Performance trends are similar for all three applications • Consistent with the fact that all three applications are implemented using the GIM-V method • Ex-MATE outperforms PEGASUS significantly for all three graph mining algorithms • Reasonable speedups for different datasets • Better scalability for larger datasets with an increasing number of nodes

  34. Outline • Background • System Design of Ex-MATE • Parallel Graph Mining with Ex-MATE • Experiments • Related Work • Conclusion

  35. Related Work: Academia • Evaluation of Map-Reduce-like models in various parallel programming environments: • Phoenix-rebirth for large-scale multi-core machines • Mars for a single GPU • MITHRA for GPGPUs in heterogeneous platforms • Recent IDAV work for GPU clusters • Improvement of the Map-Reduce API: • Integrating pre-fetching and pre-shuffling into Hadoop • Supporting online queries • Enforcing less restrictive synchronization semantics between Map and Reduce

  36. Related Work: Industry • Google’s Pregel System: • Map-Reduce may not be well suited to graph operations • Pregel was proposed to target graph processing • Open-source version: the HAMA project in Apache • Variants of Map-Reduce: • Dryad/DryadLINQ from Microsoft • Sawzall from Google • Pig/Map-Reduce-Merge from Yahoo! • Hive from Facebook

  37. Outline • Background • System Design of Ex-MATE • Parallel Graph Mining with Ex-MATE • Experiments • Related Work • Conclusion

  38. Conclusion • Ex-MATE supports the management of reduction objects of arbitrary sizes • Deals with disk-resident reduction objects • Outperforms both the naïve and optimized PEGASUS implementations for all three graph mining applications • Has simpler code • Offers a promising alternative for developing efficient data-intensive applications • Uses GIM-V for parallelizing graph mining

  39. Thank You, and Acknowledgments • Questions and comments • Wei Jiang - jiangwei@cse.ohio-state.edu • Gagan Agrawal - agrawal@cse.ohio-state.edu • This project was supported by:
