Inter-Transactional Parallelism for Persistent Distributed Shared Virtual Memory

- Implementation and Performance -

金 泰勇

Graduate School of Information Science and Electrical Engineering, Kyushu University

Department of Intelligent Systems


Outline

  • Introduction

    • Overview of WAKASHI

    • Network Of Workstations

    • Persistent Distributed Shared Virtual Memory

  • Generalized Distributed Lock Protocol

    • Algorithm

    • Related Work

    • Evaluation (M-OO7 benchmark)

  • Cost-based Distributed Transaction Coordinator

    • Architecture and Algorithm

    • Related Work

    • Evaluation (TPC-C benchmark)

  • Conclusion and Future Work


Introduction (1)

  • ShusseUo - An Object Database Management Group (ODMG) compliant Object Database System

[Figure: layers of ShusseUo above the Operating System, top to bottom]

  • WARASA: OQL Compiler, ODL Pre-Processor

  • INADA: ODMG Object Model, Persistent Object Manipulation Language (C++ Binding)

  • WAKASHI: Persistent Distributed Shared Virtual Memory, Transaction Management

  • Operating System

Introduction (2)

  • Network Of Workstations (NOW)

[Figure: a NOW made of three workstations, each with its own CPU and disk, connected by a Local Area Network]


Introduction (3)

  • Characteristics of NOW [Berkeley NOW group]

    • Better performance for sequential applications than an individual workstation

      • Most sequential applications can be divided into several independent parts

      • These parts can be executed in parallel on a NOW

    • Better price/performance for parallel applications than Massively Parallel Processors (MPP)

      We use a NOW as the hardware environment of the database server


Introduction (4)

  • Two Transactional Parallelisms

    • Inter-Transactional Parallelism

    • Intra-Transactional Parallelism

[Figure: transactions T1 and T2 running on a NOW; T1 is divided into T1a, T1b and T1c, and T2 into T2a and T2b]


Introduction (5)

  • Distributed Shared Virtual Memory (DSVM) [Kai Li 1989, Princeton University]

    • Hardware Level DSVM [DASH, KSR1]

    • Software Level DSVM [MUNIN, TreadMarks, WAKASHI]

[Figure: three workstations, each with its own memory and disk, sharing one DSVM space]


Introduction (6)

  • Persistent Distributed Shared Virtual Memory (PDSVM)

    • Transaction-integrated DSVM

      • All DSVM accesses are performed inside transactions

      • Data in the DSVM space is kept persistent

    • Problems PDSVM has to face

      • Utilizing resources efficiently

      • Decreasing the cost of communication among different sites

        Two main factors determine the communication cost:

        • Message size

        • Number of messages

          Cost(n KB message) < n × Cost(1 KB message)
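
The inequality holds whenever each message carries a fixed overhead on top of its size-dependent part. A minimal sketch, assuming a linear cost model; the function names and the overhead and per-KB constants are illustrative and are not WAKASHI measurements:

```python
def message_cost(size_kb, overhead=1.0, per_kb=0.1):
    """Assumed linear model: fixed per-message overhead plus a per-KB part."""
    return overhead + per_kb * size_kb


def compare(n_kb):
    """Return (cost of one n KB message, cost of n separate 1 KB messages)."""
    return message_cost(n_kb), n_kb * message_cost(1)


one_big, many_small = compare(8)
print(one_big < many_small)
```

Under this model the fixed overhead is paid once for the large message but n times for the small ones, which is why fewer, larger messages are cheaper.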


Introduction (7)

  • PDSVM implemented at WAKASHI

[Figure: DSVM mapping and disk mapping across primary sites and mirror sites]


Introduction (8)

  • PDSVM data access patterns and cost at the primary site

[Figure: Read(p) and Write(p) on page p at the primary site (primary read, primary write); steps shown: swap p into memory, swap p out to disk]


Introduction (8)

  • PDSVM data access patterns at the mirror site

[Figure: Read(p) and Write(p) on page p between the mirror site and the primary site; steps shown: remote_page_lock, page_transfer, swap p into memory, swap p out to disk]



Generalized Distributed Lock Protocol of WAKASHI

  • Lock Release

[Figure: timeline of transactions at the primary site and the mirror site, each delimited by transaction begin and transaction end, with one w(p) access and two r(p) accesses]

Message types:

  • Remote Page Lock

  • Page Transfer

  • All Page Lock Release

  • Remote Page Lock Forward


Generalized Distributed Lock Protocol of WAKASHI

  • Lock Retain

[Figure: timeline of transactions at the primary site and the mirror site, each delimited by transaction begin and transaction end, with one w(p) access and two r(p) accesses]


Generalized Distributed Lock Protocol of WAKASHI

  • Retain Mode (Commit Mode and Abort Mode)

[Figure: at Transaction Commit or Transaction Abort, a Read Lock or Write Lock goes either to Lock Release or, under a retain mode, to Read Lock Retain or Write Lock Retain]

Retain Mode:

LRL_LRL, RLRT_LRL, WLRT_LRL

LRL_RLRT, RLRT_RLRT, WLRT_RLRT
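
The six mode names can be read as a pair of lock states. A minimal sketch, assuming that the part before the underscore is the commit mode and the part after it is the abort mode, and that LRL, RLRT and WLRT abbreviate Lock Release, Read Lock Retain and Write Lock Retain; both readings are inferred from the slide labels rather than stated by the author:

```python
from enum import Enum


class LockState(Enum):
    LRL = "lock release"
    RLRT = "read lock retain"
    WLRT = "write lock retain"


# The six retain modes listed on the slide.
RETAIN_MODES = ["LRL_LRL", "RLRT_LRL", "WLRT_LRL",
                "LRL_RLRT", "RLRT_RLRT", "WLRT_RLRT"]


def lock_state_after(mode, committed):
    """Lock state of a page once the transaction ends (assumed name order)."""
    commit_part, abort_part = mode.split("_")
    return LockState[commit_part if committed else abort_part]


print(lock_state_after("WLRT_LRL", committed=True).value)
```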


Generalized Distributed Lock Protocol of WAKASHI

  • Attach retain mode to transaction

    Transaction_Begin(<h1, mode_1>, <h2, mode_2>, …);
    READ(h1, p1);
    READ(h2, p2);
    WRITE(h3, p3);
    Transaction_Commit();

  • Retain modes are decided when a transaction begins

  • <HID, RETAIN_MODE> attaches a retain mode to a heap

  • Attached retain modes are valid only for the pages accessed during the transaction
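
The three rules above can be sketched in a few lines. This is a hypothetical Python model, not the WAKASHI API: the `Transaction` class, its method names, and the default mode used for a heap without an attached mode (such as h3 in the slide's call sequence) are all assumptions made for illustration:

```python
class Transaction:
    def __init__(self, heap_modes, default_mode="LRL_LRL"):
        # heap_modes: {heap id: retain mode}, fixed when the transaction begins.
        self.heap_modes = dict(heap_modes)
        self.default_mode = default_mode  # assumption for heaps without a mode
        self.accessed = []  # (heap id, page id) pairs touched so far

    def read(self, heap, page):
        self._touch(heap, page)

    def write(self, heap, page):
        self._touch(heap, page)

    def _touch(self, heap, page):
        if (heap, page) not in self.accessed:
            self.accessed.append((heap, page))

    def commit(self):
        """Return the retain mode that applies to each accessed page."""
        return {(h, p): self.heap_modes.get(h, self.default_mode)
                for h, p in self.accessed}


t = Transaction({"h1": "RLRT_LRL", "h2": "WLRT_LRL"})
t.read("h1", "p1")
t.read("h2", "p2")
t.write("h3", "p3")
print(t.commit())
```

Only pages that were actually read or written appear in the result, which mirrors the rule that an attached mode is valid only for pages accessed during the transaction.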


Generalized Distributed Lock Protocol of WAKASHI

  • Related Work

    • Lazy Release Consistency (LRC) Protocol [Rice University, 1992]

      • Locks are managed by a lock manager

      • Client programs handle locks with two primitives: Acquire and Release

      • A lock is not released immediately when the client program releases it

      • According to their measurements, LRC performs better than the ordinary release consistency protocol for some applications

  • In LRC, the lock primitives are set explicitly by the client programmer

  • LRC is designed for distributed parallel computing applications


Generalized Distributed Lock Protocol of WAKASHI

  • Related Work

    • Cache Consistency Protocols in Client-Server Database Architectures

      • Caching 2 Phase Lock (C2PL) Approach [Franklin 1992]

        • A lock must be granted before a cached copy is accessed

        • The lock is released when the transaction ends

      • Callback (CB) Approach [Wisconsin Univ. 1992, 1997]

        All cached copies remain valid until a callback message arrives

        • Callback Read:

          When a cached copy is to be updated, callbacks are sent to the other clients that hold the copy for READ

        • Callback All:

          When a cached copy is to be accessed, callbacks are sent to the other clients holding copies that conflict with the access

      • Differences from GDL

        • The architectures are different

        • GDL supports more lock processing modes
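
The Callback Read idea can be illustrated with a toy server that remembers which clients cache each page. This is a simplified sketch written for this transcript, not the algorithm from the cited papers; the class and method names are hypothetical, and locking, transactions and message passing are left out:

```python
class CallbackServer:
    def __init__(self):
        self.readers = {}  # page -> set of clients holding a cached read copy

    def read(self, client, page):
        """Client caches the page; the copy stays valid until called back."""
        self.readers.setdefault(page, set()).add(client)

    def write(self, client, page):
        """Return the clients whose cached copies must be called back."""
        called_back = self.readers.get(page, set()) - {client}
        self.readers[page] = {client}
        return called_back


server = CallbackServer()
server.read("A", "p")
server.read("B", "p")
print(server.write("A", "p"))
```

A second write by the same client triggers no callbacks, because no other client holds a copy any more.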


Evaluation of GDL

  • Multi-User OO7

[Figure: Multi-User OO7 database. Labels: users, private modules (one per user), shared module, mega module with sub modules, module, assembly (7 levels), composite parts, atomic parts]


Evaluation of GDL

  • Transaction Types

    • Read Only: traverses a module without any update

    • Update: traverses a module and updates each atomic part

  • Operation Configuration Vector (OCV)

    • <Pr, Pw, Sr, Sw>

      • Pr/Pw is the probability (in percent) of a read/write operation occurring on private modules

      • Sr/Sw is the probability (in percent) of a read/write operation occurring on shared modules

    • OCV Types

      • Read Only <50, 0, 50, 0>

      • 10% Update <45, 5, 45, 5>

      • 50% Update <25, 25, 25, 25>
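
An OCV can be read as the weights of a four-way random choice. A minimal sketch of drawing the next operation from an OCV; the dictionary keys, the `next_operation` helper and the fixed seed are illustrative and not part of the M-OO7 benchmark code:

```python
import random

# <Pr, Pw, Sr, Sw> in percent, as listed on the slide.
OCV = {
    "read_only": (50, 0, 50, 0),
    "update_10": (45, 5, 45, 5),
    "update_50": (25, 25, 25, 25),
}

# Position i of an OCV is the weight of OPERATIONS[i].
OPERATIONS = (("private", "read"), ("private", "write"),
              ("shared", "read"), ("shared", "write"))


def next_operation(ocv, rng):
    """Draw (module kind, access kind) with the OCV entries as weights."""
    return rng.choices(OPERATIONS, weights=ocv, k=1)[0]


rng = random.Random(0)
print([next_operation(OCV["update_10"], rng) for _ in range(3)])
```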


Evaluation of GDL

  • Testbed

[Figure: Ultra5 workstations connected by 100 Mbit Ethernet]

Ultra5:

  • CPU: Super-Sparc (400 MHz)

  • Disk: IBM DJNA (22 GB)

  • Main Memory: 128 MB


Evaluation of GDL

  • Non-Clustering Plan

[Figure: six Ultra5 workstations and one database]

  • All modules are located in 1 heap

  • The heap is located at 1 site


Evaluation Result

  • Read Only


Evaluation Result

  • 10% Update


Evaluation Result

  • 50% Update


Evaluation of GDL

  • Clustering Plan

[Figure: seven Ultra5 workstations; one is labeled Shared Module and six are labeled Private Module]

  • Each module is located in 2 heaps (ReadOnly, Update)

  • Private modules are distributed over all of the sites

  • The shared module is allocated at 1 site


Evaluation Result

  • Read Only


Evaluation Result

  • Number of messages at Read Only


Evaluation Result

  • 10% Update


Evaluation Result

  • Number of messages at 10% Update


Evaluation Result

  • Number of remote page lock messages at 10% Update


Evaluation Result

  • Number of remote page lock forward messages at 10% Update


Evaluation Result

  • 50% Update


Evaluation Result

  • Number of messages at 50% Update


Evaluation Result

  • Number of remote page lock messages at 50% Update


Evaluation Result

  • Number of remote page lock forward messages at 50% Update



Cost-based Distributed Transaction Coordinator

  • Transaction Coordinator

    • Utilizes all of the workstations efficiently

    • Executes the transactions at lower cost

  • Transaction cost is decided by

    • The type of the transaction

    • The site where the transaction runs

[Figure: transactions are submitted to the Transaction Coordinator, which distributes them to four workstations]


Architecture

[Figure: transactions T1 to T4 enter the Cost-based Transaction Coordinator. Components shown: Transaction Pool, Transaction Scheduler, Database Distribution Manager, Load Information Manager, Execute Element Manager, Adapter, and Dispatchers connected to Execute Elements]

Functionalities of Execute Element

  • Execute the coordinated transactions

  • Collect the load information of the executed transactions

  • Feed the load information of the executed transactions back to the transaction coordinator

  • Transaction Placement Policy:

    • Decides how to coordinate a transaction when it is submitted to the Transaction Coordinator (TC)

  • Transaction Scheduling Policy:

    • Decides which blocked transaction in the Transaction Pool is executed at which site when a transaction finishes at an Execute Element


Cost-based Approaches

  • Cost-based Approach - 1

    • Static approach (CTC-Static)

      • Static Coordinator Description File (SCDF)

        T → S

        • T is the type ID of a transaction

        • S is the IP address of the host where a transaction of type T is executed

    • Transaction Placement Policy

      • Select an idle EE for the submitted transaction according to the SCDF

    • Transaction Scheduling Policy

      • Select the next blocked transaction, also according to the SCDF
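
The two CTC-Static policies can be sketched as a lookup table plus a pool of blocked transactions. This is a hypothetical model, assuming the SCDF is a plain mapping from type ID to host and that the coordinator tracks a count of idle EEs per host; the class and method names are not from the WAKASHI sources:

```python
from collections import deque


class StaticCoordinator:
    def __init__(self, scdf, idle_ees):
        self.scdf = scdf            # type ID -> host, as in the SCDF
        self.idle = dict(idle_ees)  # host -> number of idle EEs
        self.pool = deque()         # blocked (txn id, type ID), arrival order

    def submit(self, txn_id, txn_type):
        """Placement: run on the SCDF host if it has an idle EE, else block."""
        host = self.scdf[txn_type]
        if self.idle.get(host, 0) > 0:
            self.idle[host] -= 1
            return host
        self.pool.append((txn_id, txn_type))
        return None

    def finished(self, host):
        """Scheduling: an EE at host is free; pick a blocked txn mapped to it."""
        for item in self.pool:
            if self.scdf[item[1]] == host:
                self.pool.remove(item)
                return item[0]
        self.idle[host] = self.idle.get(host, 0) + 1
        return None


tc = StaticCoordinator({"T1": "S1", "T2": "S2"}, {"S1": 1, "S2": 1})
print(tc.submit(1, "T2"), tc.submit(2, "T2"), tc.finished("S2"))
```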


Cost-based Approaches

  • Cost-based Approach - 2

    • Transaction Priority Oriented approach (CTC-TPOA)

      • Transaction Placement Policy

        Look through all of the EEs to find an idle EE to execute the submitted transaction

      • Transaction Scheduling Policy

        Look through all of the EEs to find an idle EE to execute the blocked transaction with the earliest arrival time
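
Taken together, the two CTC-TPOA policies behave like a first-in, first-out queue in front of a set of interchangeable EEs. A minimal sketch under that reading; the `TpoaCoordinator` class and its method names are hypothetical:

```python
from collections import deque


class TpoaCoordinator:
    def __init__(self, ees):
        self.idle = deque(ees)  # idle Execute Elements, any site
        self.pool = deque()     # blocked transactions in arrival order

    def submit(self, txn):
        """Placement: use any idle EE, otherwise block the transaction."""
        if self.idle:
            return self.idle.popleft()
        self.pool.append(txn)
        return None

    def finished(self, ee):
        """Scheduling: give the freed EE to the earliest blocked transaction."""
        if self.pool:
            return self.pool.popleft(), ee
        self.idle.append(ee)
        return None


tc = TpoaCoordinator(["ee1"])
print(tc.submit("a"), tc.submit("b"), tc.finished("ee1"))
```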


Cost-based Approaches

  • Cost-based Approach - 3

    • Low-Cost Oriented approach (CTC-LCOA)

      • Priority Value (PV)

        PV(t, s) = Cost(t, s) - Preemption Factor(t)

        • Cost(t, s) is the cost of executing t at host s

        • If a transaction that arrived later than t is coordinated before t, the Preemption Factor of t is increased by k

      • Transaction Placement Policy

        The same as in CTC-TPOA

      • Transaction Scheduling Policy

        For the site whose EE has finished a transaction, look through all of the blocked transactions and coordinate the one with the lowest PV to that site

Distribution of Active EEs is fixed in CTC-LCOA


Related Work

  • Degree of Multi-Programming (DMP) based Algorithm [ObjectStore, 1991]

    • Limits the Multi-Programming Level (the number of concurrent transactions)

  • Feedback based Algorithms

    • Throughput Feedback based Algorithm [VLDB, 1991]

      A resource-contention-aided algorithm

    • Conflict Ratio Feedback based Adaptive Transaction Scheduling Algorithm [VLDB, 1992]

      A data-contention-aided algorithm

  • Resource Contention

    • The currently available resources do not satisfy the required resources

  • Data Contention

    • Excessive lock conflicts degrade performance significantly


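Evaluation (1)

  • TPC-C benchmark model

    • TPC-C is an Online Transaction Processing benchmark

    • Database Schema

[Figure: TPC-C schema. Tables: Warehouse, Districts, Customers, History, Order, New-Order, Order-Line, Stocks, Items. Cardinality labels: 10 districts per warehouse, 3,000 customers per district, 1+ history rows and 1+ orders per customer, 5-15 order lines and 0-1 new-orders per order, 100,000 stocks per warehouse, 100,000 items]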


Evaluation (2)

  • Transaction Types

    • New-Order (n/a)

    • Payment (43%)

    • Order-Status (4%)

    • Delivery (4%)

    • Stock-Level (4%)

      The measured throughput of New-Order (MQTH) is reported as the performance result.
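
The listed shares add up to 55%, which leaves 45% for New-Order. A minimal sketch of a workload generator built on that arithmetic; treating the remainder as the New-Order share is an assumption (the slide only says n/a), and the helper names are illustrative:

```python
import random

# Shares from the slide, in percent; New-Order is listed as n/a there.
LISTED_SHARE = {"Payment": 43, "Order-Status": 4, "Delivery": 4, "Stock-Level": 4}


def transaction_mix(listed):
    """Assumption: New-Order takes whatever the listed types leave over."""
    mix = dict(listed)
    mix["New-Order"] = 100 - sum(listed.values())
    return mix


def next_transaction(mix, rng):
    """Draw one transaction type with the mix percentages as weights."""
    names = list(mix)
    return rng.choices(names, weights=[mix[n] for n in names], k=1)[0]


mix = transaction_mix(LISTED_SHARE)
print(mix["New-Order"])
```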


Evaluation (3)

  • Testbed

[Figure: one Coordinator Site and 16 × Execute Element Sites, all Ultra5 workstations connected by 100 Mbit Ethernet]

Ultra5:

  • CPU: Super-Sparc (400 MHz)

  • Disk: IBM DJNA (22 GB)

  • Main Memory: 128 MB


Evaluation (4)

  • MQTH Result


Evaluation (5)

  • Rate of Primary Accessed Pages


Evaluation (6)

  • Distribution of Active EEs (MPL=32)


Evaluation (7)

  • Why is the distribution of active Execute Elements in CTC-Static unbalanced?

[Figure: the Transaction Coordinator receives T1, T2, T2, T2, T2, T3 and, following the SCDF (T1→S1, T2→S2, T3→S3), assigns them to Execute Elements (EE) at sites S1, S2 and S3; MPL=3]
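
One way to read the figure: because the SCDF pins every transaction type to a single site, a skewed mix of types becomes a skewed load. A small sketch that counts transactions per site for the sequence shown on the slide; the `load_per_site` helper is illustrative:

```python
from collections import Counter

# SCDF entries and submitted sequence as shown on the slide.
SCDF = {"T1": "S1", "T2": "S2", "T3": "S3"}
SUBMITTED = ["T1", "T2", "T2", "T2", "T2", "T3"]


def load_per_site(transactions, scdf):
    """Count how many submitted transactions the SCDF sends to each site."""
    return Counter(scdf[t] for t in transactions)


print(load_per_site(SUBMITTED, SCDF))
```

Four of the six transactions map to S2, so its Execute Elements stay busy while those at S1 and S3 have little to do.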


Conclusion and Future Work


Conclusion

  • WAKASHI is the lowest layer of ShusseUo, an Object Database System

  • WAKASHI supports Persistent Distributed Shared Virtual Memory (PDSVM)

  • Based on PDSVM, this thesis proposed two kinds of Inter-Transactional Parallelism implemented in WAKASHI

    • Generalized Distributed Lock Protocol (GDL):

      Proposed to decrease the communication overhead of acquiring page locks on PDSVM spaces

    • Cost-based Distributed Transaction Coordinator (DTC):

      Proposed to coordinate transactions in a NOW environment so that they execute at lower cost and the resources of the NOW are utilized efficiently

  • Both are resource-contention-aided algorithms, and they are evaluated with two benchmark models: Multi-User OO7 and TPC-C


Future Work (1)

  • Integrate GDL into the Transaction Coordinator

    • When a transaction is coordinated, the transaction coordinator automatically attaches an ideal retain mode

    • The retain mode is also tuned based on feedback information


Future Work (2)

  • Apply the Transaction Coordinator to Intra-Transactional Parallelism

    • The Transaction Coordinator should support transaction dependency

    • The page lock mechanism of WAKASHI should be modified

    • The deadlock detection mechanism of WAKASHI should be modified

[Figure: dependency graph of transactions T0, T1, T2, T3 and T4]

