Dolly virtualization driven database provisioning for the cloud
Download
1 / 41

DOLLY: Virtualization-Driven Database Provisioning for the Cloud - PowerPoint PPT Presentation


  • 139 Views
  • Uploaded on

VEE 2011 paper available at: http://lass.cs.umass.edu/papers.html. DOLLY: Virtualization-Driven Database Provisioning for the Cloud. Emmanuel Cecchet Joint work with Rahul Singh, Upendra Sharma and Prashant Shenoy. PROVISIONING IN THE CLOUD. Virtualization No shared disk

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about ' DOLLY: Virtualization-Driven Database Provisioning for the Cloud' - durin


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
Dolly virtualization driven database provisioning for the cloud

VEE 2011 paper available at:

http://lass.cs.umass.edu/papers.html

DOLLY: Virtualization-Driven Database Provisioning for the Cloud

Emmanuel Cecchet

Joint work with Rahul Singh, Upendra Sharma and PrashantShenoy


Provisioning in the cloud
PROVISIONING IN THE CLOUD

  • Virtualization

  • No shared disk

  • Provisioning based on volume and machine load

Internet

Frontend/

Load balancer

App.

Servers

Provisioning

logic

Databases


Why is it hard to add a db replica
WHY IS IT HARD TO ADD A DB REPLICA?

Administration issues

Install/setup new database

Backup/Restore database content

Configure new replica

Synchronize live content

How much time does it take?

Depends on the database size

Depends on the backup/restore technique

From minutes to hours…

Dolly: VM cloning to blackbox the database


Dolly

  • Database replication in the Cloud

  • Provisioning with Dolly

  • Prototype & Evaluation


Spawning a replica
SPAWNING A REPLICA

Multi-master middleware-based replication

Three main phases

Backup

Restore

Replay

add replica

1

4

resynchronize

2

3

restore

snapshot

backup

Client SQL requests

Management console

Replication middleware

Transactional log

Load balancer

new replica

DB1

DB2


Spawning a replica with cloning
SPAWNING A REPLICA WITH CLONING

Backup & Restore replace by VM cloning

VM3

VM3

DB2

DB2

OS

OS

4

add replica

resynchronize

1

VM1

VM2

2

3

clone

clone

OS

OS

Client SQL requests

Management console

Replication middleware

Transactional log

Load balancer

DB1

DB2

3


Cloning backup restore in constant time
Cloning: Backup/Restore in constant time

  • Filesystem snapshot/copy is DB agnostic

  • Only depends on VM size


Dolly

  • Database replication in the Cloud

  • Provisioning with Dolly

  • Prototype & Evaluation


Modeling spawning time
MODELING SPAWNING TIME

Predictable backup and restore times are required

Replay time can be estimated from current write throughput (wt)

replica spawning time

backup

restore

replay

time

ri

bi

updates


When to snapshot
WHEN TO SNAPSHOT?

Time to spawn from a live replica

Time to spawn from an existing snapshot

Faster to take a new snapshot j to spawn a new replica than using old snapshot i if:backupj+restorej<restorei +replayi


Dolly overview
DOLLY OVERVIEW

Input

capacity prediction

write prediction

Output

schedule of snapshots

schedule of replica spawning

admission control if needed

Dolly

write predictions

capacity predictions

Snapshot scheduler

Capacity Provisioning

HA adjuster

Spawning options

Write throttling

Paused pool cleaner

Scheduler

start/stop

clone/ snapshot

write throttling/ read throttling

delete VM/ snapshot

Predictors

Admission Control

Management API

Monitoring

Free pool Manager

reclaim


Provisioning replicas
PROVISIONING REPLICAS

  • Dolly does not provide predictors

  • Dolly can work with any predictor (see [Eurosys09])

Workload

prediction

Capacity

prediction

Write

prediction


Cloud cost functions
CLOUD COST FUNCTIONS

Adapt the provisioning decisions to the cloud platform specifics

Cost can be $ on public cloud or time on private cloud


Provisioning replicas1
PROVISIONING REPLICAS

Parse capacity provisioning predictions

Decrease capacity by pausing VMs

Increasing capacity

Check if we can reuse a paused VM

Check if we can spawn from an existing snapshot

Choose cheapest options according to spawn_costfunction

Perform admission control if all replicas cannot be provisioned in time


Snapshot scheduling
SNAPSHOT SCHEDULING

How to snapshot?

Clone a paused VM

Pause an active VM to clone it

When to snapshot?

At time j whenbackupj+restorej<restorei+replayi

If new snapshot is scheduled, re-run capacity provisioning

Prediction window must have minimum size


Dolly

  • Database replication in the Cloud

  • Provisioning with Dolly

  • Prototype & Evaluation


Implementation
IMPLEMENTATION

C-JDBC/Sequoia replication middleware

OpenNebula Cloud management middleware

Cost functions

private cloud: minimize resource utilization time

Amazon EC2: minimize cost

VM4

VMclone

VM5

VM3

VM2

VM1

OS

OS

OS

OS

OS

OS

Dolly

Private

EC2

Recovery Log

Dump table

Log table

DB3snapshot

New replica

New replica

admission control

TPC-W

load injector

predictions

Sequoia driver

write throttling

SQL requests

add/remove replica snapshot/pause/…

Sequoia controller

JMX Management API

Scheduler

Backupers

Dolly OpenNebula

Loadbalancer

start/stop/ clone/…

DB1

DB2

DB3

OpenNebula

clone

clone

Backup server

or NAS


Implementation cost functions
IMPLEMENTATION – COST FUNCTIONS

Private cloud: minimize resource utilization

Amazon EC2: minimize cost


Workload description
WORKLOAD DESCRIPTION

TPC-W online bookstore e-commerce benchmark

Snapshot s0 available at t0



Reactive provisioning 2h snapshot
Reactive provisioning – 2h snapshot

replicas available

replica spawning triggered here


Dolly 30m prediction window
Dolly – 30m Prediction Window

Private Cloud

Amazon EC2

s2

s1

cheaper to leave instances online


Conclusion

VM cloning

Solves administration issues by blackboxing the database

Constant time backup/restore needed to predict replica spawning time

New provisioning algorithm

Decouples capacity provisioning from snapshot scheduling

Cost functions to optimize for cloud platform specifics

CONCLUSION

VEE 2011 paper available at:

http://lass.cs.umass.edu/papers.html





Dolly 10m prediction window
Dolly – 10m Prediction Window

Private Cloud

Amazon EC2

s1

s1

s2

s2


Spawning in a public cloud
SPAWNING IN A PUBLIC CLOUD

Storage decoupled from computing resource

Starting a new instance clones the volume

Vol2

Vol4

Vol2

Vol3

DB2

DB2

DB2

DB2

DB1

DB1

OS

OS

OS

OS

DB1

snapshot

stop

register

start

restart

Vol1

Vol1

Vol1

OS

OS

OS


Backup restore techniques
BACKUP/RESTORE TECHNIQUES

Database native tools

Vendor specific or 3rd party ETL

Understand database semantics

Filesystem copy

Low-level data copy

Need to know what to copy

VM cloning

Copies database content + configuration + OS

Unused space can be compressed



Backup restore performance 1 3
BACKUP/RESTORE PERFORMANCE (1/3)

Performance depends on database content


Backup restore performance 2 3
BACKUP/RESTORE PERFORMANCE (2/3)

  • File copy is the most effective for small databases


Backup restore performance 3 3
BACKUP/RESTORE PERFORMANCE (3/3)

  • VM cloning most effective on large databases



Dolly main algorithm
DOLLY MAIN ALGORITHM

Capacity provisioning depends on available snapshots

Snapshots scheduled according to capacity demand

Decouple capacity provisioning from snapshot scheduling

if (predictor.capacity_changes ||

predictor.write_workload_changes) {

do {

schedule = capacity_provisioning(predictions)

snapshot_schedule = snapshot_scheduling(predictions)

} while (snapshot_schedule schedules new snapshots)

scheduler.schedule(snapshot_schedule)

scheduler.schedule(capacity_schedule)

}

if (time since last operation > threshold) {

paused_pool_cleaner.release_old_paused_vms();

paused_pool_cleaner.delete_old_snapshots();

}


Releasing resources
RELEASING RESOURCES

Paused VMs

VM never re-used if cost to resume > cost to spawn from last snapshot

Snapshots

Old snapshots can be released based on cost to keep them around

Free server pool

Can reclaim servers with paused VMs when pool is empty


Tpc w evaluation
TPC-W EVALUATION

Multi-tier online bookstore benchmark

4GB VM for the database

Large EC2 instances from EBS volumes with CloudWatch


Phase 1 capacity provisioning
PHASE 1 – CAPACITY PROVISIONING

Pause VMs when capacity decreases

Resume VMs when capacity increases

Spawn VMs from snapshot when additional capacity required

Not cost effective to pause VMs for less than 1 hour with EC2


Phase 2 snapshot scheduling
PHASE 2 – SNAPSHOT SCHEDULING

Faster (but more costly) to spawn a new replica and take a new snapshot from it to spawn 3 replicas at d3

Faster to snapshot paused VM and spawn replica from it for d5

Cost of volume storage on EC2 more expensive than IO cost


Phase 3 capacity provisioning
PHASE 3 – CAPACITY PROVISIONING

Schedule new replica spawning using new snapshots

Next snapshot scheduling does not generate new snapshots and the algorithm terminates


Final scheduling
FINAL SCHEDULING

Different scheduling for private and public cloud

Minimize resource (energy) usage for private cloud

Minimize cost for for public cloud


ad