Cloud MapReduce: A MapReduce Implementation on top of a Cloud Operating System

Huan Liu, Dan Orban

Accenture Technology Labs

9962161 江嘉福

100062228 徐光成

100062229 章博遠

2011, 11th IEEE/ACM International Symposium on


OUTLINE

I. Introduction

II. Cloud MapReduce Architecture & Implementation

III. Pros & Cons of Cloud MapReduce

IV. Experimental Evaluation

V. Conclusions & Future Work

VI. References


INTRODUCTION

1. What is a cloud OS?

2. Challenges posed by a cloud OS

3. What is Cloud MapReduce?

4. Advantages of Cloud MapReduce


What is a Cloud OS?

1. Manages the low-level cloud resources

2. Presents a high-level interface to application programmers

3. Key difference from a traditional OS: a cloud OS must be scalable

[Figure 1]


Challenges posed by a cloud OS

1. Scalability comes at a price.

2. A system must trade off data consistency, system availability, and tolerance to network partition.

[Figure 2]


What is Cloud MapReduce?

1. Implements the MapReduce programming model

2. Runs on a cloud OS that scales horizontally

3. Works with eventual consistency

4. Overcomes the limitations of the cloud OS
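
The MapReduce programming model named in point 1 can be illustrated with a minimal, single-process sketch of word count. This is not the Cloud MapReduce code: the function names (`map_word_count`, `reduce_word_count`, `run_mapreduce`) are hypothetical, and the in-memory grouping step stands in for whatever shuffle mechanism a real implementation uses.

```python
from collections import defaultdict
from typing import Callable, Iterable, Iterator


def map_word_count(_doc_id: str, text: str) -> Iterator[tuple[str, int]]:
    """Map: emit (word, 1) for every word in one input record."""
    for word in text.lower().split():
        yield word, 1


def reduce_word_count(word: str, counts: Iterable[int]) -> tuple[str, int]:
    """Reduce: sum all counts emitted for one word."""
    return word, sum(counts)


def run_mapreduce(
    records: Iterable[tuple[str, str]],
    map_fn: Callable[[str, str], Iterator[tuple[str, int]]],
    reduce_fn: Callable[[str, Iterable[int]], tuple[str, int]],
) -> dict[str, int]:
    """Run map, group intermediate pairs by key, then run reduce per key."""
    groups: dict[str, list[int]] = defaultdict(list)
    for key, value in records:
        for out_key, out_value in map_fn(key, value):
            groups[out_key].append(out_value)  # in-memory stand-in for the shuffle
    return dict(reduce_fn(key, values) for key, values in groups.items())


word_counts = run_mapreduce(
    [("doc1", "the cloud"), ("doc2", "The OS")],
    map_word_count,
    reduce_word_count,
)
```

The user only writes the map and reduce functions; distribution, grouping, and fault handling are left to the framework.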


Advantages of Cloud MapReduce

1. Incremental scalability:

It can scale incrementally in the number of computing nodes.

2. Symmetry and decentralization:

Every node has the same set of responsibilities.

3. Heterogeneity:

Nodes can have varying computation capacity.


Cloud MapReduce Architecture and Implementation

1. The architecture

2. Cloud challenges

3. General solution approaches


The Architecture


Cloud challenges & general solution approaches

1. Long latency

2. Horizontal scaling

3. A node cannot tell when a queue is created for the first time


Cloud challenges & general solution approaches (cont'd)

4. Duplicate messages

5. Potential node failure

6. Nondeterministic eventual consistency windows
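
The duplicate-message challenge arises when a queue service offers at-least-once delivery, so the same message can reach a consumer more than once. One common way to tolerate that, sketched here under assumptions, is to tag each message with a unique ID and have the consumer skip IDs it has already processed. The slides do not spell out the mechanism Cloud MapReduce uses; the `DedupConsumer` class and the `producer_id` and `seq` fields are hypothetical names for illustration only.

```python
class DedupConsumer:
    """Sums values per key while ignoring messages it has already seen.

    Hypothetical sketch: each message is a dict carrying a producer ID and
    a per-producer sequence number, which together identify it uniquely.
    """

    def __init__(self) -> None:
        self._seen: set[tuple[str, int]] = set()
        self.totals: dict[str, int] = {}

    def handle(self, message: dict) -> bool:
        """Process one message; return False if it was a duplicate."""
        message_id = (message["producer_id"], message["seq"])
        if message_id in self._seen:
            return False  # already counted, so drop the redelivery
        self._seen.add(message_id)
        key, value = message["key"], message["value"]
        self.totals[key] = self.totals.get(key, 0) + value
        return True


consumer = DedupConsumer()
deliveries = [
    {"producer_id": "map-0", "seq": 0, "key": "cloud", "value": 1},
    {"producer_id": "map-0", "seq": 1, "key": "os", "value": 1},
    {"producer_id": "map-0", "seq": 0, "key": "cloud", "value": 1},  # redelivered
    {"producer_id": "map-1", "seq": 0, "key": "cloud", "value": 1},
]
accepted = [consumer.handle(m) for m in deliveries]
```

Because redeliveries are dropped before they touch the totals, processing a message twice has the same effect as processing it once.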


Pros

  • 3,000 lines of Java code (LOC) vs. 285,375 LOC in Hadoop

  • Large & reliable FS

  • High bandwidth (fast read/write)

  • Single point of contact (high throughput)


Cons

  • Uses only the network (no local storage)

  • This can lead to a bottleneck


Evaluation

Almost twice as fast!


Evaluation

  • Hadoop: 385 s total; network and CPU are underutilized

  • CMR: 210 s; more efficient network and CPU usage


Evaluation

Wiki Word Count

  • Combiner:

    Hadoop - 747s

    CMR - 436s

  • No Combiner:

    Hadoop - 1733s

    CMR - 1247s
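
To put the Wiki Word Count timings on a common scale, the short sketch below turns each pair of running times into a speedup ratio (Hadoop time divided by CMR time). The numbers are copied from the slide; the `speedup` helper and the dictionary layout are only illustrative.

```python
# Running times in seconds, copied from the slide: (Hadoop, CMR).
WIKI_WORD_COUNT = {
    "combiner": (747, 436),
    "no combiner": (1733, 1247),
}


def speedup(hadoop_seconds: float, cmr_seconds: float) -> float:
    """How many times faster CMR finished compared with Hadoop."""
    return hadoop_seconds / cmr_seconds


for label, (hadoop_s, cmr_s) in WIKI_WORD_COUNT.items():
    print(f"{label}: CMR is {speedup(hadoop_s, cmr_s):.2f}x faster")
```

This works out to roughly 1.7x with a combiner and 1.4x without one.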


Evaluation

Amazon

  • Word Count on 400 GB of data using 100 nodes

  • Took approx. 1 hr

  • 983,152 requests -> $0.98

  • SimpleDB usage: 3.7 hrs -> $0.52
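
The two dollar figures can be reproduced with simple usage-based arithmetic. The unit prices below are assumptions, not stated on the slide: $0.01 per 10,000 queue requests and $0.14 per SimpleDB machine hour. They are chosen because they match the slide's totals, and the reading of "3.7 hrs" as SimpleDB machine hours is likewise an assumption.

```python
# ASSUMED unit prices (not given on the slide); they reproduce its totals.
QUEUE_USD_PER_10K_REQUESTS = 0.01
SIMPLEDB_USD_PER_MACHINE_HOUR = 0.14


def queue_cost(requests: int) -> float:
    """Cost in USD of a number of queue requests at the assumed price."""
    return requests / 10_000 * QUEUE_USD_PER_10K_REQUESTS


def simpledb_cost(machine_hours: float) -> float:
    """Cost in USD of SimpleDB machine hours at the assumed price."""
    return machine_hours * SIMPLEDB_USD_PER_MACHINE_HOUR


request_cost = round(queue_cost(983_152), 2)  # request count from the slide
database_cost = round(simpledb_cost(3.7), 2)  # hours from the slide
```

Under these assumed prices, the service charges for the whole 400 GB job come to well under two dollars.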


Evaluation

Comparison

  • Distributed Grep Word Count -> 13 GB of data

  • CMR: 962 seconds

  • Hadoop: 1,047 seconds

  • Results are almost the same. Why?

  • The task is more CPU-intensive


Evaluation

12 GB of data: 923,670 HTML files

  • Hadoop -> more than 6 hrs

  • CMR -> 297 seconds

  • Hadoop incurs high overhead from task creation


Conclusion

  • A system cannot simply be implemented as-is on a cloud OS

  • Doing so leads to poor performance

  • CMR techniques overcome cloud limitations

  • No performance degradation

  • The techniques can also be used for other systems


REFERENCES

Figure 1: http://techcrunch.com/

Figure 2: http://blog.csdn.net/zouqingfang/article/details/7269920

http://zh.wikipedia.org/

https://code.google.com/p/cloudmapreduce/

http://searchcloudcomputing.techtarget.com/definition/MapReduce

http://myblog-maurice.blogspot.tw/2012/08/nosqlcap.html
