
Outline - PowerPoint PPT Presentation





PowerPoint Slideshow about ' Outline' - wendi


Presentation Transcript
Outline

  • Introduction to Cloud Computing

  • Background on AWS and Motivation

  • Cost and Performance Evaluation

  • Conclusion


Cloud Computing Paradigm

  • Cloud “Utility” Providers: Amazon AWS, Azure, Cloudera, Google App Engine

  • Consumers: Companies, labs, schools, et al.

  • Consumers send algorithms and data to the providers, which return processed results


Promises of Cloud Computing

  • Allows us to consolidate machines and outsource computation and storage

  • Pay-as-you-go Computing

  • “Infinite” compute resources and storage



A Motivating Example

  • A service-oriented system that answers queries from a similar domain

  • Intermediate and final results can be cached and reused for future queries

  • Often present in workflow applications
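
The reuse of cached results described above can be sketched as follows. This is only a minimal illustration, assuming each query can be reduced to a hashable key; the names `ResultCache` and `get_or_compute` are hypothetical and do not come from the presentation.

```python
from typing import Any, Callable, Dict, Hashable


class ResultCache:
    """Keeps intermediate and final results so later queries can reuse them."""

    def __init__(self) -> None:
        self._store: Dict[Hashable, Any] = {}
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, key: Hashable, compute: Callable[[], Any]) -> Any:
        # Reuse a stored result when the same query has been seen before.
        if key in self._store:
            self.hits += 1
            return self._store[key]
        # Otherwise run the (expensive) computation once and remember it.
        self.misses += 1
        result = compute()
        self._store[key] = result
        return result


# Hypothetical usage: the second, identical query is served from the cache.
cache = ResultCache()
query_key = ("elevation_change", "tile_42")
first = cache.get_or_compute(query_key, lambda: sum(range(1000)))
second = cache.get_or_compute(query_key, lambda: sum(range(1000)))
```

In a workflow application the same wrapper could sit in front of each intermediate step, so that a later query sharing part of the workflow only recomputes the steps that are not yet stored.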


Leveraging the Cloud for Storage

  • Store and Cache Intermediate and Final Results in the Cloud

  • The Cloud has many options for data storage

    • Memory

    • Disks

    • Network Disks

    • Highly Available Persistent Storage

  • There are several tradeoffs in each option


Amazon Web Services (AWS)

  • A case study: AWS has emerged as one of the most widely used Cloud platforms

  • We consider caching and storage performance in three AWS Services:

    • Elastic Compute Cloud (EC2) Machine instances

    • Simple Storage Service (S3)

    • Elastic Block Store (EBS)


AWS Services: EC2

  • Elastic Compute Cloud (EC2)

    • Access to virtualized machines with varying capabilities (e.g., CPU cores, memory, disk space) depending on price.


AWS Services: EBS

  • Elastic Block Store (EBS)

    • Persistent network disks.

    • Must be mounted on an EC2 machine before use.

    • Users must specify a fixed size up front and format the volume with an appropriate file system.


AWS Services: S3

  • Simple Storage Service (S3)

    • Simple FTP-style API: GET, PUT, etc.

    • Highly available, reliable, and durable storage (but slower)

    • “Infinite capacity”

    • Not required to be used with EC2 machines.

    • Very inexpensive.



Tradeoffs Per Application and Service

  • Caching in-core (EC2-Memory)

    • Fast, but expensive

    • Small capacity; may need extra logic to coordinate a set of EC2 nodes

    • Data is volatile


Tradeoffs Per Application and Service

  • Caching on local disk (EC2-Disk)

    • Much slower than memory

    • Much more space

    • Data is still volatile


Tradeoffs Per Application and Service

  • Caching on Elastic Block Store (EC2-EBS)

    • Possibly slower than disk

    • Volume size is initially configured by application users

    • Data is persisted


Tradeoffs Per Application and Service

  • Caching on S3

    • Slowest option, but most reliable

    • No bound on size

    • Data is persisted



Experimental Application

  • Geospatial Application: Land Elevation Change

    • In general, 2 large matrices (DEM files) are retrieved, and their difference is returned

  • 500 unique requests

  • Requests are issued randomly

  • Eviction is not considered (we assume the cache/storage configuration stores all results)
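
The experimental setup above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' code: DEM data is represented here as nested lists of floats, requests are drawn uniformly at random, and the names `elevation_change` and `simulate_requests` are hypothetical.

```python
import random
from typing import List, Set, Tuple

Matrix = List[List[float]]


def elevation_change(earlier: Matrix, later: Matrix) -> Matrix:
    """Element-wise difference (later minus earlier) of two equally shaped DEM matrices."""
    same_shape = len(earlier) == len(later) and all(
        len(row_old) == len(row_new) for row_old, row_new in zip(earlier, later)
    )
    if not same_shape:
        raise ValueError("DEM matrices must have the same shape")
    return [
        [new - old for old, new in zip(row_old, row_new)]
        for row_old, row_new in zip(earlier, later)
    ]


def simulate_requests(num_unique: int, num_requests: int, seed: int = 0) -> Tuple[int, int]:
    """Issue random requests against a cache that never evicts; return (hits, misses)."""
    rng = random.Random(seed)
    cached: Set[int] = set()
    hits = misses = 0
    for _ in range(num_requests):
        request_id = rng.randrange(num_unique)  # assumption: uniform random choice
        if request_id in cached:
            hits += 1
        else:
            misses += 1
            cached.add(request_id)  # no eviction: every result stays cached
    return hits, misses


change = elevation_change([[1.0, 2.0], [3.0, 4.0]], [[2.0, 2.0], [5.0, 1.0]])
hits, misses = simulate_requests(num_unique=500, num_requests=2000)
```

Because nothing is evicted, the number of misses can never exceed the number of unique requests, so the hit rate grows as more requests are issued.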


Performance

  • We use 4 different DEM data sizes to test performance:

    • 1KB, 1MB, 5MB, 50MB

  • This means a full cache would hold

    • 500KB, 500MB, 2.5GB, 25GB
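
The full-cache sizes above follow from multiplying each data size by the 500 unique requests. The short check below assumes decimal units (1MB = 1000KB), which matches the rounded figures on the slide; the helper names are illustrative only.

```python
NUM_UNIQUE_REQUESTS = 500

# Data sizes from the slide, in KB, assuming decimal units (1MB = 1000KB).
DEM_SIZES_KB = {"1KB": 1, "1MB": 1_000, "5MB": 5_000, "50MB": 50_000}


def full_cache_size_kb(result_size_kb: int, num_results: int = NUM_UNIQUE_REQUESTS) -> int:
    """Size of a cache holding one result of the given size per unique request."""
    return result_size_kb * num_results


def format_size(size_kb: int) -> str:
    """Render a size in KB using the largest fitting decimal unit."""
    if size_kb >= 1_000_000:
        return f"{size_kb / 1_000_000:g}GB"
    if size_kb >= 1_000:
        return f"{size_kb / 1_000:g}MB"
    return f"{size_kb:g}KB"


full_cache_sizes = {
    label: format_size(full_cache_size_kb(size_kb))
    for label, size_kb in DEM_SIZES_KB.items()
}
```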






Cost Analysis

  • We next assess the costs versus the performance

  • Performance is measured as the relative speedup over the baseline DEM process execution, shown in Table 2

  • We project costs and speedup over 2000 and 200000 requests
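
As a rough sketch of the two quantities involved, relative speedup can be taken as baseline time divided by cached time, and cost per unit speedup as monthly cost divided by that speedup. Both definitions are assumptions about how the slides compute them, and the numbers in the example are made up for illustration, not taken from the evaluation.

```python
def relative_speedup(baseline_seconds: float, cached_seconds: float) -> float:
    """Speedup of the cached execution relative to the uncached baseline."""
    if baseline_seconds <= 0 or cached_seconds <= 0:
        raise ValueError("execution times must be positive")
    return baseline_seconds / cached_seconds


def cost_per_unit_speedup(monthly_cost: float, speedup: float) -> float:
    """Monthly cost divided by the speedup it buys (assumed definition)."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return monthly_cost / speedup


# Hypothetical values, chosen only to show the arithmetic.
example_speedup = relative_speedup(baseline_seconds=7.0, cached_seconds=2.0)
example_cost_ratio = cost_per_unit_speedup(monthly_cost=70.0, speedup=example_speedup)
```

A lower ratio means each unit of speedup is cheaper, which is the sense in which a configuration "makes better economic sense".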


Monthly Costs for Volatile Cache (1MB)

[Table: monthly costs for 2000 and 200000 I/O requests outside of AWS. Row and column labels did not survive extraction. Remaining values, in slide order: 3.5, 3.26, 3.6, 3.6, then the label "Speedup", then 267, 28, 347, 180.5.]

  • Cost per unit speedup is low when requests are high.

  • I/O costs are still low because of small data size


Monthly Costs for Volatile Cache (50MB)

[Table: monthly costs for 2000 and 200000 I/O requests outside of AWS. Row and column labels did not survive extraction. Remaining values, in slide order: 2.9, 3.3, then the label "Speedup", then 16.05, 31.66.]

  • Costs are now dominated by I/O due to large data size

  • In terms of performance, it makes more sense to use xlarge instances for large data sizes

  • A small instance makes better economic sense for a small number of requests


Monthly Costs for Persistent Cache (1MB)

[Table: monthly costs for 2000 and 200000 I/O requests outside of AWS. Row and column labels did not survive extraction. Remaining values, in slide order: 3.4, 3.62, 3.58, then the label "Speedup", then 30, 13.6, 134.]

  • S3 performance is comparable for a cache with small I/O requests

  • S3 makes better economic sense than EBS-based instances


Monthly Costs for Persistent Cache (50MB)

[Table: monthly costs for 2000 and 200000 I/O requests outside of AWS. Row and column labels did not survive extraction. Remaining values, in slide order: 2.59, 2.74, 3.19, then the label "Speedup", then 6.4, 11.09, 22.66.]

  • Interestingly, even with the low cost of S3, it still makes sense to use xlarge instances when I/O requests are high

  • S3 is still comparable, and makes better economic sense than EBS-based instances



Summary (1)

  • For smaller data (<= 5MB)

    • If request rate is low: Use small instance on-disk

    • If request rate is high: Use small instance in-memory

    • Although I/O is slow, the cost of using small instance is very low

  • If persistence is needed,

    • Use S3, and avoid EBS


Summary (2)

  • For larger data (>= 50MB and large cache sizes)

    • Use xlarge instances

    • Higher I/O rates

    • Larger memory and disk capacity

  • EBS may be considered in conjunction with xlarge instances for persistence

  • If performance is not an issue, but persistence and costs are, use S3
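
The recommendations on the two summary slides can be read as a small decision function. The sketch below is one possible encoding, not the authors' tool: the function name, parameter names, and returned strings are hypothetical, and sizes between 5MB and 50MB return `None` because the summary does not cover them.

```python
from typing import Optional


def recommend_cache(
    data_size_mb: float,
    high_request_rate: bool,
    needs_persistence: bool,
    performance_critical: bool = True,
) -> Optional[str]:
    """Map the summary-slide guidance to a storage recommendation."""
    if data_size_mb <= 5:
        # Smaller data: small instances are cheap; S3 is preferred over EBS for persistence.
        if needs_persistence:
            return "S3"
        return "small instance, in-memory" if high_request_rate else "small instance, on-disk"
    if data_size_mb >= 50:
        # Larger data: xlarge instances offer higher I/O rates and more capacity.
        if needs_persistence:
            return "xlarge instance with EBS" if performance_critical else "S3"
        return "xlarge instance"
    # The summary gives no guidance for sizes between 5MB and 50MB.
    return None


low_rate_small = recommend_cache(1, high_request_rate=False, needs_persistence=False)
persistent_large = recommend_cache(
    50, high_request_rate=True, needs_persistence=True, performance_critical=False
)
```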


Conclusion

  • Cloud offers many viable options for data storage and caching

  • We evaluated the cost-performance tradeoffs of these various options, and determined a roadmap for making clear decisions on resource usage


Thank you

