
Inter-Transactional Parallelism for Persistent Distributed Shared Virtual Memory - Implementation and Performance -

金 泰勇

Graduate School of Information Science and Electrical Engineering, Kyushu University

Department of Intelligent Systems

Outline
  • Introduction
    • Overview of WAKASHI
    • Network of Workstations
    • Persistent Distributed Shared Virtual Memory
  • Generalized Distributed Lock Protocol
    • Algorithm
    • Related Work
    • Evaluation (Multi-User OO7 benchmark)
  • Cost-based Distributed Transaction Coordinator
    • Architecture and Algorithm
    • Related Work
    • Evaluation (TPC-C benchmark)
  • Conclusion and Future Work
Introduction (1)
  • ShusseUo - an Object Database Management Group (ODMG) compliant Object Database System

Layered architecture of ShusseUo (top to bottom):
  • WARASA: OQL Compiler, ODL Pre-Processor
  • INADA: ODMG Object Model, Persistent Object Manipulation Language (C++ Binding)
  • WAKASHI: Persistent Distributed Shared Virtual Memory, Transaction Management
  • Operating System

Introduction (2)
  • Network of Workstations (NOW)

[Figure: a NOW - several workstations, each with its own CPU and disk, connected by a Local Area Network]

Introduction (3)
  • Characteristics of NOW [Berkeley NOW group]
    • Better performance for sequential applications than an individual workstation
      • Most sequential applications can be divided into several independent parts
      • These parts can be executed in parallel on a NOW
    • Better price/performance for parallel applications than Massively Parallel Processors (MPP)

We use a NOW as the hardware environment of the database server.

Introduction (4)
  • Two kinds of transactional parallelism on a NOW
    • Inter-Transactional Parallelism: independent transactions (e.g., T1 and T2) run concurrently on different workstations
    • Intra-Transactional Parallelism: a single transaction is split into parts (e.g., T1 into T1a, T1b, T1c; T2 into T2a, T2b) that run in parallel

[Figure: transactions T1 and T2, with their sub-parts, executing on a NOW]

Introduction (5)
  • Distributed Shared Virtual Memory (DSVM) [Kai Li 1989, Princeton University]
    • Hardware-level DSVM [DASH, KSR1]
    • Software-level DSVM [MUNIN, TreadMarks, WAKASHI]

[Figure: a single DSVM space spanning the memories of several workstations, each with its own disk]

Introduction (6)
  • Persistent Distributed Shared Virtual Memory (PDSVM)
    • Transaction-integrated DSVM
      • All DSVM accesses take place inside transactions
      • Data in the DSVM space is kept persistent
    • Problems a PDSVM has to address
      • Utilizing resources efficiently
      • Reducing the cost of communication among sites

Two main factors determine the communication cost:

        • Message size
        • Number of messages

Cost(one n KB message) < n × Cost(one 1 KB message)
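To see why, assume a fixed per-message latency plus a per-KB transfer cost (illustrative numbers, not measurements from the thesis): at 1 ms per message plus 0.1 ms per KB, one 8 KB message costs 1.8 ms, while eight 1 KB messages cost 8.8 ms. Batching work into fewer, larger messages therefore lowers total communication cost, which motivates transferring whole pages and releasing locks in bulk.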

Introduction (7)
  • PDSVM as implemented in WAKASHI

[Figure: every site maps the DSVM space into its virtual memory (DSVM mapping); primary sites additionally map the heap onto disk (disk mapping), while mirror sites hold only memory copies]

Introduction (8)
  • PDSVM data access patterns and cost at the Primary Site

[Figure: Primary Read - Read(p) swaps page p into memory from disk; Primary Write - Write(p) swaps p into memory and later swaps p out to disk]

Introduction (8, cont.)
  • PDSVM data access patterns at the Mirror Site

[Figure: Read(p) or Write(p) at a mirror site sends a remote_page_lock request to the primary site, which answers with a page_transfer; the page is swapped into memory at the mirror site, and modified pages are swapped out to disk at the primary site]
Generalized Distributed Lock Protocol of WAKASHI
  • Lock Release

[Figure: timeline of a writing transaction w(p) at the primary site and reading transactions r(p) at two mirror sites; with lock release, every page lock is given back at transaction end, so each subsequent transaction must re-acquire its locks remotely]

Message types:
  • Remote Page Lock
  • Page Transfer
  • All Page Lock Release
  • Remote Page Lock Forward
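The slides do not show the message handling itself; below is a minimal sketch of the four message types and a release-mode page lock acquisition, assuming hypothetical names (MessageType, LockMessage, acquire_remote_lock) and stand-in transport helpers:

#include <cstdint>

// The four GDL message types listed above.
enum class MessageType : uint8_t {
    RemotePageLock,        // a mirror site asks for a page lock
    PageTransfer,          // the page image is shipped with the lock grant
    AllPageLockRelease,    // give back every page lock at transaction end
    RemotePageLockForward, // the primary forwards a request to the holder
};

enum class LockMode : uint8_t { Read, Write };

struct LockMessage {
    MessageType type;
    uint64_t    page_id;
    LockMode    mode;
    uint32_t    requester_site;
};

// Stand-ins for the real transport layer (declarations only).
void send_to_primary(const LockMessage& m);
LockMessage wait_for(MessageType expected);

// Release mode: each transaction re-acquires its page locks remotely,
// and one AllPageLockRelease message returns them all at commit.
void acquire_remote_lock(uint64_t page, LockMode mode, uint32_t self) {
    send_to_primary({MessageType::RemotePageLock, page, mode, self});
    LockMessage reply = wait_for(MessageType::PageTransfer);
    // ... map reply's page image into the DSVM space and continue
    (void)reply;
}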
Generalized Distributed Lock Protocol of WAKASHI
  • Lock Retain

[Figure: the same workload as above, but with lock retain the mirror sites keep their page locks across transaction boundaries, so the later r(p) transactions proceed without new remote lock messages]

Generalized Distributed Lock Protocol of WAKASHI
  • Retain Mode (a Commit Mode paired with an Abort Mode)

[Figure: at Transaction Commit or Transaction Abort, a held Read Lock or Write Lock transitions to one of Lock Release (LRL), Read Lock Retain (RLRT), or Write Lock Retain (WLRT)]

The resulting retain modes, written <commit mode>_<abort mode>:

LRL_LRL, RLRT_LRL, WLRT_LRL
LRL_RLRT, RLRT_RLRT, WLRT_RLRT

Generalized Distributed Lock Protocol of WAKASHI
  • Attach a retain mode to a transaction

Transaction_Begin( <h1, mode_1>, <h2, mode_2>, … );
READ(h1, p1);
READ(h2, p2);
WRITE(h3, p3);
Transaction_Commit();

  • Retain modes are decided when a transaction begins
  • A <HID, RETAIN_MODE> pair attaches a retain mode to a heap
  • Attached retain modes are valid only for the pages accessed during the transaction
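In the C++ binding this could be wrapped as follows; a minimal sketch, assuming hypothetical types (RetainMode, Transaction) around the Transaction_Begin/READ/WRITE/Transaction_Commit primitives shown above:

#include <cstdint>
#include <initializer_list>
#include <utility>

// Retain modes from the previous slide, written <commit mode>_<abort mode>:
// LRL = Lock Release, RLRT = Read Lock Retain, WLRT = Write Lock Retain.
enum class RetainMode {
    LRL_LRL, RLRT_LRL, WLRT_LRL,
    LRL_RLRT, RLRT_RLRT, WLRT_RLRT,
};

using HeapId = uint32_t;
using PageId = uint64_t;

// Hypothetical wrapper; methods map onto the WAKASHI primitives.
class Transaction {
public:
    // Retain modes are fixed at begin, one <HID, RETAIN_MODE> pair per heap.
    Transaction(std::initializer_list<std::pair<HeapId, RetainMode>> heaps);
    void read(HeapId h, PageId p);   // READ(h, p)
    void write(HeapId h, PageId p);  // WRITE(h, p)
    void commit();                   // Transaction_Commit()
};

void example(HeapId h1, HeapId h2, PageId p1, PageId p2, PageId p3) {
    // Keep write locks on h1's pages after commit; release h2's locks.
    Transaction t{{h1, RetainMode::WLRT_LRL}, {h2, RetainMode::LRL_LRL}};
    t.read(h1, p1);
    t.read(h2, p2);
    t.write(h1, p3);
    t.commit();  // the retain modes apply only to the pages touched above
}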
Generalized Distributed Lock Protocol of WAKASHI
  • Related Work
    • Lazy Release Consistency (LRC) Protocol [Rice University, 1992]
      • Locks are managed by a lock manager
      • In client programs, locks are handled by two primitives: Acquire and Release
      • Locks are not given back immediately when the client program releases them
      • Their measurements show that LRC outperforms the eager release consistency protocol on some applications
  • In LRC, the lock primitives are set explicitly by the client programmer
  • LRC is designed for distributed parallel computing applications
Generalized Distributed Lock Protocol of WAKASHI
  • Related Work
    • Cache Consistency Protocols in Client-Server Database Architectures
      • Caching 2-Phase Lock (C2PL) Approach [Franklin 1992]
        • A lock must be granted before a cached page is accessed
        • The lock is released when the transaction ends
      • Callback (CB) Approach [Wisconsin Univ. 1992, 1997]
        • Cached pages remain valid until a callback message arrives
        • Callback Read: when a cached page is to be updated, callbacks are sent to the other clients holding read copies
        • Callback All: when a cached page is to be accessed, callbacks are sent to the other clients holding copies that conflict with the access
      • Differences from GDL
        • The architectures are different
        • GDL supports more lock processing modes
Evaluation of GDL
  • Multi-User OO7 benchmark

[Figure: M-OO7 database structure - a Mega Module contains one Shared Module plus one Private Module per user; each module is a 7-level assembly hierarchy of sub-modules ending in composite parts and atomic parts]
Evaluation of GDL
  • Transaction Type
    • Read Only: traverse a module without any update
    • Update: traverse a module and update each atomic part
  • Operation Configuration Vector (OCV)
    • <Pr, Pw, Sr, Sw>
      • Pr/Pw is the probability of read/write operations occurring at private modules
      • Sr/Sw is the probability of read/write operations occurring at shared modules
    • OCV types (see the sketch below)
      • Read Only <50, 0, 50, 0>
      • 10% Update <45, 5, 45, 5>
      • 50% Update <25, 25, 25, 25>
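A minimal sketch of how an OCV could drive a client's operation mix (hypothetical names; not the actual benchmark driver):

#include <random>

// An Operation Configuration Vector <Pr, Pw, Sr, Sw>: probabilities (in %)
// of private-read, private-write, shared-read, and shared-write operations.
struct OCV { int pr, pw, sr, sw; };

enum class Op { PrivateRead, PrivateWrite, SharedRead, SharedWrite };

// Pick the next traversal according to the OCV, e.g. {45, 5, 45, 5}
// for the "10% Update" workload.
Op next_op(const OCV& v, std::mt19937& rng) {
    std::uniform_int_distribution<int> dist(1, v.pr + v.pw + v.sr + v.sw);
    int x = dist(rng);
    if (x <= v.pr)               return Op::PrivateRead;
    if (x <= v.pr + v.pw)        return Op::PrivateWrite;
    if (x <= v.pr + v.pw + v.sr) return Op::SharedRead;
    return Op::SharedWrite;
}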
Evaluation of GDL
  • Testbed

[Figure: 7 Sun Ultra5 workstations (Super-Sparc 400 MHz CPU, 128 MB main memory, IBM DJNA 22 GB disk) connected by 100 Mbit Ethernet]
Evaluation of GDL
  • Non-Clustering Plan
    • All of the modules are located in 1 heap
    • The heap is located at 1 site

[Figure: six Ultra5 client sites all accessing the database heap at one site]
Evaluation of GDL
  • Clustering Plan
    • Each module is located in 2 heaps (Read Only, Update)
    • Private modules are distributed across all of the sites
    • The shared module is allocated at 1 site

[Figure: seven Ultra5 sites; each site holds one private module, and one site additionally holds the shared module]
Evaluation Result
  • Number of messages, Read Only workload [chart]

Evaluation Result
  • Number of messages, 10% Update workload [chart]

Evaluation Result
  • Number of remote page lock messages, 10% Update workload [chart]

Evaluation Result
  • Number of remote page lock forward messages, 10% Update workload [chart]

Evaluation Result
  • Number of messages, 50% Update workload [chart]

Evaluation Result
  • Number of remote page lock messages, 50% Update workload [chart]

Evaluation Result
  • Number of remote page lock forward messages, 50% Update workload [chart]
Cost-based Distributed Transaction Coordinator
  • Transaction Coordinator
    • Utilizes all of the workstations efficiently
    • Executes the transactions at lower cost
  • Transaction cost is decided by
    • The type of the transaction
    • The site where the transaction runs

[Figure: incoming transactions flow into the Transaction Coordinator, which distributes them across the workstations]
Architecture

[Figure: the Cost-based Transaction Coordinator contains a Transaction Pool, a Transaction Scheduler, a Database Distribution Manager, a Load Information Manager, and an Execute Element Manager; an Adapter and per-site Dispatchers connect it to the Execute Elements, which receive submitted transactions T1 … T4]

Functionalities of an Execute Element:
  • Execute the coordinated transactions
  • Collect the load information of the executed transactions
  • Feed the load information of the executed transactions back to the Transaction Coordinator

  • Transaction Placement Policy: decides how to coordinate a transaction when it is submitted to the TC
  • Transaction Scheduling Policy: decides which blocked transaction in the Transaction Pool is executed at which site when a transaction finishes at an Execute Element
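The two policies can be captured as one interface that the approaches on the following slides implement; a minimal sketch with hypothetical names (the thesis's actual module boundaries may differ):

#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

using TxnTypeId = int;
using SiteId    = int;

struct Txn {
    TxnTypeId type;
    uint64_t  arrival_time;
};

class TransactionCoordinator {
public:
    virtual ~TransactionCoordinator() = default;
    // Placement: where should a newly submitted transaction run, if anywhere?
    virtual std::optional<SiteId>
    place(const Txn& t, const std::vector<SiteId>& idle_sites) = 0;
    // Scheduling: a site became idle; pick the index of the blocked
    // transaction in the pool that should run there, if any.
    virtual std::optional<std::size_t>
    schedule(SiteId site, const std::vector<Txn>& pool) = 0;
};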
Cost-based Approaches
  • Cost-based Approach 1
    • Static approach (CTC-Static)
      • Static Coordinator Description File (SCDF): entries of the form

T → S

        • T is the type ID of a transaction
        • S is the IP address of the host where transactions of type T are executed
      • Transaction Placement Policy
        • Select an idle EE for the submitted transaction according to the SCDF
      • Transaction Scheduling Policy
        • Select the next blocked transaction, also according to the SCDF
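A minimal sketch of CTC-Static placement against the SCDF mapping (hypothetical names; file parsing omitted):

#include <optional>
#include <string>
#include <unordered_map>
#include <unordered_set>

using TxnTypeId = int;
using SiteId    = std::string;  // host IP address, as in the SCDF

// SCDF: transaction type -> the site that executes that type.
using SCDF = std::unordered_map<TxnTypeId, SiteId>;

// CTC-Static placement: run the transaction on its SCDF site,
// but only if that site's Execute Element is currently idle.
std::optional<SiteId> place_static(const SCDF& scdf, TxnTypeId type,
                                   const std::unordered_set<SiteId>& idle) {
    auto it = scdf.find(type);
    if (it != scdf.end() && idle.count(it->second))
        return it->second;
    return std::nullopt;  // otherwise the transaction blocks in the pool
}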
Cost-based Approaches
  • Cost-based Approach 2
    • Transaction Priority Oriented approach (CTC-TPOA)
      • Transaction Placement Policy
        • Look through all of the EEs to find an idle EE to execute the submitted transaction
      • Transaction Scheduling Policy
        • Look through all of the EEs to find an idle EE to execute the blocked transaction with the earliest arrival time

Cost-based Approaches
  • Cost-based Approach 3
    • Low-Cost Oriented approach (CTC-LCOA)
      • Priority Value (PV)

PV(t, s) = Cost(t, s) − PreemptionFactor(t)

        • Cost(t, s) is the cost of executing t at host s
        • If a transaction that arrived later than t is coordinated before t, the preemption factor of t is increased by k
      • Transaction Placement Policy
        • Same as CTC-TPOA
      • Transaction Scheduling Policy
        • For the site whose EE has just finished a transaction, look through all of the blocked transactions and coordinate the one with the lowest PV to that site (see the sketch below)

Distribution of Active EEs is fixed in CTC-LCOA
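A minimal sketch of CTC-LCOA scheduling under the PV definition above (hypothetical names; cost() is a declared stand-in for the load-information-based estimate):

#include <cstddef>
#include <cstdint>
#include <limits>
#include <optional>
#include <vector>

using SiteId = int;

struct BlockedTxn {
    int      type;
    uint64_t arrival_time;
    double   preemption;  // the Preemption Factor of this transaction
};

// Cost(t, s): estimated cost of executing t at site s (declaration only).
double cost(const BlockedTxn& t, SiteId s);

// When site s becomes idle, pick the blocked transaction with the lowest
// PV(t, s) = Cost(t, s) - PreemptionFactor(t); returns its pool index.
std::optional<std::size_t>
schedule_lcoa(std::vector<BlockedTxn>& pool, SiteId s, double k) {
    if (pool.empty()) return std::nullopt;
    std::size_t best = 0;
    double best_pv = std::numeric_limits<double>::infinity();
    for (std::size_t i = 0; i < pool.size(); ++i) {
        double pv = cost(pool[i], s) - pool[i].preemption;
        if (pv < best_pv) { best_pv = pv; best = i; }
    }
    // Every earlier arrival passed over by this choice gains preemption k,
    // so a long-waiting transaction eventually wins and cannot starve.
    for (std::size_t i = 0; i < pool.size(); ++i)
        if (i != best && pool[i].arrival_time < pool[best].arrival_time)
            pool[i].preemption += k;
    return best;
}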

Related Work
  • Degree of Multi-Programming (DMP) based Algorithm [ObjectStore, 1991]
    • Limits the multi-programming level (the number of concurrent transactions)
  • Feedback based Algorithms
    • Throughput Feedback based Algorithm [VLDB, 1991] - a resource-contention-aided algorithm
    • Conflict Ratio Feedback based Adaptive Transaction Scheduling Algorithm [VLDB, 1992] - a data-contention-aided algorithm
  • Resource Contention: the currently available resources do not satisfy what the transactions require
  • Data Contention: excessive lock conflicts degrade performance significantly
Evaluation (1)
  • TPC-C benchmark model
    • TPC-C is an Online Transaction Processing benchmark
    • Database schema

[Figure: TPC-C schema - each Warehouse has 10 Districts and 100,000 Stocks; each District has 3,000 Customers; Customers have History entries (1+) and Orders (1+); each Order has 5-15 Order-Lines and 0-1 New-Order entries; 100,000 Items]
Evaluation (2)
  • Transaction Type
    • New-Order (n/a)
    • Payment (43%)
    • Order-Status (4%)
    • Delivery (4%)
    • Stock-Level (4%)

The measured throughput of New-Order transactions (MQTH) is reported as the performance result.

Evaluation (3)
  • Testbed

[Figure: one coordinator site and 16 Execute Element sites, each a Sun Ultra5 (Super-Sparc 400 MHz CPU, 128 MB main memory, IBM DJNA 22 GB disk), connected by 100 Mbit Ethernet]
Evaluation (4)
  • MQTH Result [chart]

Evaluation (5)
  • Rate of Primary-Accessed Pages [chart]

Evaluation (6)
  • Distribution of Active EEs (MPL = 32) [chart]
Evaluation (7)
  • Why is the distribution of active Execute Elements in CTC-Static unbalanced?

[Figure: with SCDF mappings T1→S1, T2→S2, T3→S3 and MPL = 3, an incoming stream T1, T2, T2, T2, T2, T3 funnels every T2 to the single site S2, leaving the other Execute Elements idle]
Conclusion
  • WAKASHI is the lowest layer of ShusseUo, an Object Database System
  • WAKASHI supports Persistent Distributed Shared Virtual Memory (PDSVM)
  • Based on PDSVM, this thesis proposed two mechanisms for inter-transactional parallelism, implemented in WAKASHI
    • Generalized Distributed Lock Protocol (GDL): proposed to decrease the communication overhead of acquiring page locks on PDSVM spaces
    • Cost-based Distributed Transaction Coordinator (DTC): proposed to coordinate transactions in a NOW environment so that they execute at lower cost and the NOW's resources are utilized efficiently
  • Both mechanisms are resource-contention-aided algorithms; they were evaluated with two benchmark models: Multi-User OO7 and TPC-C
Future Work (1)
  • Integrate GDL into the Transaction Coordinator
    • When a transaction is coordinated, the Transaction Coordinator automatically attaches an ideal retain mode
    • The retain mode is also tuned based on the feedback information
Future Work (2)
  • Apply the Transaction Coordinator to intra-transactional parallelism
    • The Transaction Coordinator should support transaction dependencies
    • The page lock mechanism of WAKASHI should be modified
    • The deadlock detection mechanism of WAKASHI should be modified

[Figure: a dependency graph of subtransactions T0 through T4]