Schism: Graph Partitioning for OLTP Databases in a Relational Cloud
Implications for the design of GraphLab

Samuel Madden

MIT CSAIL

Director, Intel ISTC in Big Data

GraphLab Workshop 2012


The Problem with Databases

  • Tend to proliferate inside organizations

    • Many applications use DBs

  • Tend to be given dedicated hardware

    • Often not heavily utilized

  • Don’t virtualize well

  • Difficult to scale

    This is expensive & wasteful

    • Servers, administrators, software licenses, network ports, racks, etc …


RelationalCloud Vision

  • Goal: A database service that exposes self-serve usage model

    • Rapid provisioning: users don’t worry about DBMS & storage configurations

      Example:

  • User specifies type and size of DB and SLA (“100 txns/sec, replicated in US and Europe”)

  • User given a JDBC/ODBC URL

  • System figures out how & where to run user’s DB & queries


Before: Database Silos and Sprawl

(Diagram: Applications #1-#4, each with its own dedicated database and associated cost)

Must deal with many one-off database configurations

And provision each for its peak load


After: A Single Scalable Service

(Diagram: Apps #1-#4 multiplexed onto a single shared database service)

Reduces server hardware by aggressive workload-aware multiplexing

Automatically partitions databases across multiple HW resources

Reduces operational costs by automating service management tasks


What about virtualization?

(Chart: max throughput at 20:1 consolidation, our system vs. VMware ESXi, with all DBs equally loaded and with one DB 10x loaded)

  • Could run each DB in a separate VM

  • Existing database services (Amazon RDS) do this

    • Focus is on simplified management, not performance

  • Doesn’t provide scalability across multiple nodes

  • Very inefficient


Key Ideas in this Talk: Schism

  • How to automatically partition transactional (OLTP) databases in a database service

  • Some implications for GraphLab


System Overview

(System architecture diagram, highlighting Schism)

  • Not going to talk about:

    • Database migration

    • Security

    • Placement of data


This is your OLTP Database

Curino et al, VLDB 2010


This is your OLTP database on Schism


Schism

New graph-based approach to automatically partition OLTP workloads across many machines

Input: trace of transactions and the DB

Output: partitioning plan

Results: As good as or better than the best manual partitioning

Static partitioning – not automatic repartitioning.


Challenge: Partitioning

Goal: Linear performance improvement when adding machines

Requirement: independence and balance

Simple approaches (hash and range sketched below):

  • Total replication

  • Hash partitioning

  • Range partitioning
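
A minimal sketch of the hash and range baselines (illustrative, not from the talk), assuming tuples are mapped to partitions by a primary key:

    # Hash partitioning: spreads keys uniformly, but ignores which tuples
    # are accessed together.
    def hash_partition(key, num_partitions):
        return hash(key) % num_partitions

    # Range partitioning: keys below a boundary go to that partition;
    # keys past the last boundary go to the final partition.
    def range_partition(key, boundaries):
        for p, upper in enumerate(boundaries):
            if key < upper:
                return p
        return len(boundaries)

    print(hash_partition(12345, 4))                        # 1
    print(range_partition(12345, [10000, 20000, 30000]))   # 1

Neither scheme looks at which tuples are accessed together, which is what produces the distributed transactions discussed next.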


Partitioning Challenges

  • Transactions access multiple records? → distributed transactions or replicated data

  • Workload skew? → unbalanced load on individual servers

  • Many-to-many relations? → unclear how to partition effectively


Many-to-Many: Users/Groups


Distributed Txn Disadvantages

  • Require more communication: at least 1 extra message, maybe more

  • Hold locks for a longer time: increases the chance of contention

  • Reduced availability: failure if any participant is down


Example

Each transaction writes two different tuples.

  • Single partition: both tuples on 1 machine

  • Distributed: the 2 tuples on 2 machines

The same issue would arise in distributed GraphLab.


Schism Overview

  • Build a graph from a workload trace

    • Nodes: Tuples accessed by the trace

    • Edges: Connect tuples accessed in txn

  • Partition to minimize distributed txns

    • Idea: min-cut minimizes distributed txns

  • “Explain” partitioning in terms of the DB


Building a Graph
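
A minimal sketch of the construction described in the overview above, assuming the trace is a list of transactions, each given as the set of tuple identifiers it reads or writes; NetworkX is used purely as a convenient graph container:

    import itertools
    import networkx as nx

    def build_access_graph(trace):
        # One node per tuple; node weight counts accesses (used to balance
        # load), edge weight counts transactions co-accessing both tuples.
        g = nx.Graph()
        for txn in trace:
            for t in txn:
                g.add_node(t)
                g.nodes[t]["weight"] = g.nodes[t].get("weight", 0) + 1
            for a, b in itertools.combinations(sorted(txn), 2):
                old = g.get_edge_data(a, b, default={"weight": 0})["weight"]
                g.add_edge(a, b, weight=old + 1)
        return g

    # Tiny illustrative trace; tuple ids are (table, primary key) pairs.
    trace = [
        {("users", 1), ("groups", 7)},
        {("users", 1), ("users", 2), ("groups", 7)},
    ]
    g = build_access_graph(trace)
    print(g.edges(data=True))

The node and edge weights collected here feed the balance constraint and the min-cut objective in the partitioning step.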


Replicated Tuples


Partitioning

Use the METIS graph partitioner: min-cut partitioning with balance constraint

Node weight:

  • # of accesses → balance workload

  • data size → balance data size

Output: Assignment of nodes to partitions
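
A sketch of this step on a toy access graph; NetworkX's Kernighan-Lin bisection stands in here for METIS, which additionally handles k-way cuts and node-weight balancing:

    import networkx as nx
    from networkx.algorithms.community import kernighan_lin_bisection

    # A small stand-in for the access graph built from the trace.
    g = nx.Graph()
    g.add_edge(("users", 1), ("groups", 7), weight=2)
    g.add_edge(("users", 1), ("users", 2), weight=1)
    g.add_edge(("users", 2), ("groups", 7), weight=1)

    # Weighted min-cut-style 2-way split; heavier (frequently co-accessed)
    # edges are kept inside a partition where possible.
    part_a, part_b = kernighan_lin_bisection(g, weight="weight", seed=0)

    assignment = {t: 0 for t in part_a}
    assignment.update({t: 1 for t in part_b})
    print(assignment)   # e.g. {('users', 1): 0, ('groups', 7): 0, ('users', 2): 1}

The result is an assignment of tuple nodes to partitions, which the explanation phase then compresses into predicate rules.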


Graph Size Reduction Heuristics

Coalescing: tuples always accessed together → single node (lossless)

Blanket Statement Filtering: Remove statements that access many tuples

Sampling: Use a subset of tuples or transactions
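
As one concrete example, coalescing might look like the following sketch, which merges tuples touched by exactly the same set of transactions into a single group (using the transaction-id set as the grouping key is an illustrative choice):

    from collections import defaultdict

    def coalesce(trace):
        # Map each tuple to the set of transactions that access it.
        txns_per_tuple = defaultdict(set)
        for i, txn in enumerate(trace):
            for t in txn:
                txns_per_tuple[t].add(i)
        # Tuples with identical transaction sets always co-occur and can be
        # represented by a single (weighted) node without losing information.
        groups = defaultdict(list)
        for t, txn_ids in txns_per_tuple.items():
            groups[frozenset(txn_ids)].append(t)
        return list(groups.values())

    trace = [
        {("users", 1), ("groups", 7)},
        {("users", 1), ("users", 2), ("groups", 7)},
    ]
    print(coalesce(trace))   # ("users", 1) and ("groups", 7) collapse into one group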


Explanation Phase

Goal: Compact rules to represent partitioning

Classification problem: tuple attributes → partition mappings

(Figure: Users tuples and the partition each is assigned to)


Decision Trees

Machine learning tool for classification

Candidate attributes: attributes used in WHERE clauses

Output: predicates that approximate partitioning

Example (Users → Partition): IF (Salary > $12000) THEN P1 ELSE P2
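
A minimal sketch of this step with scikit-learn; the attribute values and partition labels are invented for illustration, and in the real system the candidate attributes come from WHERE clauses seen in the trace:

    from sklearn.tree import DecisionTreeClassifier, export_text

    # One row of candidate attributes per tuple (here just Salary), labeled
    # with the partition the graph partitioner assigned to that tuple.
    salaries = [[8000], [9500], [11000], [13000], [15000], [20000]]
    partitions = ["P2", "P2", "P2", "P1", "P1", "P1"]

    tree = DecisionTreeClassifier(max_depth=2)
    tree.fit(salaries, partitions)

    # The learned predicate approximates the partitioning, e.g.
    # Salary <= 12000 -> P2, otherwise P1.
    print(export_text(tree, feature_names=["Salary"]))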


Evaluation: Partitioning Strategies

Schism: Plan produced by our tool

Manual: Best plan found by experts

Replication: Replicate all tables

Hashing: Hash partition all tables


Benchmark Results: Simple

(Chart: % distributed transactions by partitioning strategy)


Benchmark Results: TPC

(Chart: % distributed transactions by partitioning strategy)


Benchmark Results: Complex

(Chart: % distributed transactions by partitioning strategy)


Implications for GraphLab (1)

  • Shared architectural components for placement, migration, security, etc.

  • Would be great to look at building a database-like store as a backing engine for GraphLab


Implications for GraphLab (2)

  • Data-driven partitioning

    • Can co-locate data that is accessed together

      • Edge weights can encode the frequency of reads/writes from adjacent nodes

    • Adaptively choose between replication and distribution depending on read/write frequency (sketched below)

    • Requires a workload trace and periodic repartitioning

    • If accesses are random, will not be a win

    • Requires heuristics to deal with massive graphs, e.g., ideas from GraphBuilder
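
A toy sketch of such an adaptive choice; the threshold and the decision rule are illustrative assumptions, not something prescribed by Schism or GraphLab:

    def placement_policy(reads, writes, replicate_ratio=10.0):
        # Replicate read-mostly data (cheap local reads, rare multi-copy
        # writes); keep write-heavy data on a single partition to avoid
        # distributed write transactions.
        if writes == 0 or reads / writes >= replicate_ratio:
            return "replicate"
        return "single-partition"

    print(placement_policy(reads=500, writes=3))    # replicate
    print(placement_policy(reads=20, writes=15))    # single-partition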


Implications for GraphLab (3)

  • Transactions and 2PC for serializability

    • Acquire locks as data is accessed, rather than acquiring read/write locks on all neighbors in advance

    • Introduces deadlock possibility

    • Likely a win if adjacent updates are infrequent, or not all neighbors accessed on each iteration

    • Could also be implemented using optimistic concurrency control schemes
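
A rough sketch of the lock-as-you-go idea; plain mutexes stand in for read/write locks, the lock-manager API is hypothetical, and a timeout serves as a crude deadlock breaker (abort and retry) rather than true deadlock detection:

    import threading

    class LockManager:
        # Hypothetical per-vertex lock table.
        def __init__(self):
            self._locks, self._guard = {}, threading.Lock()

        def acquire(self, vertex, timeout=0.1):
            with self._guard:
                lock = self._locks.setdefault(vertex, threading.Lock())
            # Failing to acquire within the timeout signals the caller to
            # abort its update and retry, breaking potential deadlocks.
            return lock.acquire(timeout=timeout)

        def release(self, vertex):
            self._locks[vertex].release()

    class LazyScope:
        # Lock vertex data on first touch instead of locking every
        # neighbor before the update function runs.
        def __init__(self, locks, data):
            self.locks, self.data, self.held = locks, data, []

        def read(self, vertex):
            if vertex not in self.held:
                if not self.locks.acquire(vertex):
                    raise TimeoutError(f"abort: could not lock {vertex}")
                self.held.append(vertex)
            return self.data[vertex]

        def done(self):
            for v in reversed(self.held):
                self.locks.release(v)

    # Usage: an update that only locks the neighbors it actually reads.
    locks, data = LockManager(), {"a": 1, "b": 2, "c": 3}
    scope = LazyScope(locks, data)
    total = scope.read("a") + scope.read("b")   # "c" is never locked
    scope.done()

This only pays the locking cost for the neighbors actually touched, which is where the win comes from when not all neighbors are accessed on each iteration.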


Schism

Automatically partitions OLTP databases as well as or better than experts

Graph partitioning combined with decision trees finds good partitioning plans for many applications

Suggests some interesting directions for distributed GraphLab; would be fun to explore!


Graph Partitioning Time


Collecting a Trace

Need a trace of statements and transaction ids (e.g., MySQL's general_log)

Extract read/write sets by rewriting statements into SELECTs

Can be applied offline: Some data lost
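
A sketch of the rewriting idea for two common cases; real SQL handling is far more involved, and both the regular expressions and the assumption that the key column is named id are illustrative shortcuts:

    import re

    def to_select(stmt):
        # Rewrite an UPDATE or DELETE into a SELECT returning the keys of
        # the tuples the statement would touch (its read/write set).
        m = re.match(r"UPDATE\s+(\w+)\s+SET\s+.*?\s+WHERE\s+(.*)", stmt, re.I | re.S)
        if m:
            return f"SELECT id FROM {m.group(1)} WHERE {m.group(2)}"
        m = re.match(r"DELETE\s+FROM\s+(\w+)\s+WHERE\s+(.*)", stmt, re.I | re.S)
        if m:
            return f"SELECT id FROM {m.group(1)} WHERE {m.group(2)}"
        return stmt   # SELECTs (and anything unrecognized) pass through as-is

    print(to_select("UPDATE users SET karma = karma + 1 WHERE id = 42"))
    # SELECT id FROM users WHERE id = 42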


Effect of Latency


Replicated Data

Read: Access the local copy

Write: Write all copies (distributed txn)

  • Add n + 1 nodes for each tuple (n = number of transactions accessing the tuple)

  • Connected as a star with edge weight = # of writes

  • Cutting a replication edge: cost = # of writes
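
One plausible reading of this construction in code, reusing the trace format from the graph-building sketch; the replica-node naming is an illustrative choice:

    import itertools
    import networkx as nx

    def add_replication_stars(trace, write_counts):
        # Each tuple t becomes a central node plus one replica node per
        # accessing transaction (n + 1 nodes). Star edges carry the tuple's
        # write count, so cutting one (i.e., replicating t to another
        # partition) costs the number of writes that must be propagated.
        h = nx.Graph()
        for i, txn in enumerate(trace):
            for t in txn:
                h.add_edge(t, (t, f"txn{i}"),
                           weight=write_counts.get(t, 0), kind="replication")
            # Co-access edges now connect the per-transaction replicas.
            for a, b in itertools.combinations(sorted(txn), 2):
                h.add_edge((a, f"txn{i}"), (b, f"txn{i}"), weight=1, kind="access")
        return h

    trace = [
        {("users", 1), ("groups", 7)},
        {("users", 1), ("users", 2), ("groups", 7)},
    ]
    h = add_replication_stars(trace, write_counts={("users", 1): 3})
    print(h.number_of_nodes(), h.number_of_edges())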


Partitioning Advantages

Performance:

  • Scale across multiple machines

  • More performance per dollar

  • Scale incrementally

Management:

  • Partial failure

  • Rolling upgrades

  • Partial migrations

