Two techniques for improving distributed database performance


ICS 214B Presentation

Ambarish Dey

Vasanth Venkatachalam

March 18, 2004


Issues In Distributed Databases

  • fast communication among clients

    • data requested by a client can be located and transferred quickly

  • good utilization of client CPU and memory resources

  • removing I/O bottlenecks

    • reducing disk accesses

    • reducing communication with servers

  • increased scalability


Focus Of This Talk

  • two approaches for improving performance of distributed systems

  • client server caching (Franklin and Carey)

  • fast page transfer schemes (Mohan and Narang)

    • shared disk architecture

  • similarities


Client Server Caching

  • caching of data and locks at multiple clients

    • minimizes communication overhead between clients and servers

    • reduces contention for server resources

    • reduces contention for data

    • increases autonomy of clients


Existing Techniques

  • existing techniques for distributed data management fall into three categories

    • techniques that avoid caching

    • techniques that cache data but not locks

    • optimistic two-phase locking (O2PL)

      • O2PL-Invalidate (O2PL-I)

      • O2PL-Propagate (O2PL-P)

      • O2PL-Dynamic (O2PL-D)


Novel Techniques

  • callback locking

    • an alternate method of maintaining cache consistency

  • adaptive locking

    • a protocol that improves upon O2PL-D


Callback Locking

  • supports caching of data pages and non-optimistic caching of locks

  • locks obtained prior to data access

  • server issues callbacks for conflicting locks

  • no consistency maintenance operations in the commit phase


Techniques For Callback Locking

  • callback read (CB-Read)

    • caches only read locks

    • a lock is granted only after all callbacks have completed

    • on commit, updated pages are sent back to the server, but the client retains its copies and hence its read locks

  • callback all (CB-All)

    • write locks are cached at clients in addition to read locks

    • information about exclusive copies is stored at the client

    • server issues downgrade requests when it gets read lock requests for a page
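
The CB-Read behaviour described above can be sketched as a toy server-side lock table. This is a minimal illustration, not the authors' implementation: the class `CallbackReadServer`, its method names, and the page and client identifiers are all hypothetical, and the sketch assumes that called-back clients respond immediately and that write-write conflicts are not modelled.

```python
from collections import defaultdict


class CallbackReadServer:
    """Toy CB-Read lock table: read locks stay cached at clients."""

    def __init__(self):
        self.readers = defaultdict(set)  # page -> clients caching a read lock
        self.callbacks_sent = []         # (page, client) pairs, for inspection

    def read_lock(self, client, page):
        # Read locks are mutually compatible, so just record the holder.
        self.readers[page].add(client)

    def write_lock(self, client, page):
        # Call back every other cached read lock before granting the write lock.
        for other in sorted(self.readers[page] - {client}):
            self.callbacks_sent.append((page, other))
            self.readers[page].discard(other)  # client drops its copy and lock
        # All callbacks have completed: the writer is the only holder left.
        self.readers[page] = {client}

    def commit(self, client, page):
        # CB-Read: the client keeps its copy after commit, hence a read lock.
        self.readers[page].add(client)


server = CallbackReadServer()
server.read_lock("c1", "p7")
server.read_lock("c2", "p7")
server.write_lock("c1", "p7")   # triggers a callback to c2 only
server.commit("c1", "p7")
print(server.callbacks_sent)
print(sorted(server.readers["p7"]))
```

Note that nothing happens in `commit` beyond retaining the read lock, which mirrors the point that callback locking needs no consistency maintenance in the commit phase.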


Novel Techniques

  • callback locking

  • adaptive locking


The New Adaptive Heuristic

  • the O2PL variants differ in the actions they perform at remote sites once a lock has been obtained

  • propagate pages only when

    • the page is resident at the site when the consistency operation is attempted

    • the page was previously propagated to this site and has been re-accessed since then

    • the page was previously invalidated at the site and that invalidation was a mistake


Where We Are

  • client server caching

  • fast page transfer schemes

    • shared disk architecture

  • comparisons


Motivation

  • disk-based data sharing incurs significant overhead

  • example: system A wants to access a page owned by system B

    • the global lock manager (GLM) sends B a lock conflict message

    • B forces its log, as write-ahead logging (WAL) requires, and then writes the page to disk

    • B sends GLM a message to downgrade its lock, allowing A to read the page

    • A reads the page from disk

  • cost is 2 I/Os, 2 messages, and a log force
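
The stated cost can be checked by tallying the steps listed above. The step list and the labels `message`, `io`, and `log_force` are only an illustrative encoding of this slide, not something taken from the paper.

```python
from collections import Counter

# Each step of the disk-based transfer, tagged with the resource it consumes.
DISK_BASED_TRANSFER = [
    ("GLM sends B a lock conflict message", "message"),
    ("B forces its log (WAL)", "log_force"),
    ("B writes the page to disk", "io"),
    ("B sends GLM a lock downgrade message", "message"),
    ("A reads the page from disk", "io"),
]


def tally(steps):
    """Count how many steps consume each kind of resource."""
    return Counter(kind for _, kind in steps)


cost = tally(DISK_BASED_TRANSFER)
print(dict(cost))
```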


Alternative: Fast Page Transfer

  • systems transfer pages through message passing rather than disk I/O

  • improves performance by removing disk I/Os from the transfer path

  • requires buffer coherency protocols

  • requires special recovery protocols

    • what if a message is lost?

    • what if one or more systems fail?

  • four page transfer schemes

    • simple (disk-based), medium, fast, and superfast schemes


SuperFast Page Transfer

  • pages transferred from one system to another without writing them or their logs to disk

  • the final owner is responsible for writing the page to disk and for ensuring that the logs of all updates by all systems are written to disk

  • cost is 0 I/Os and 3 messages

  • how to deal with system failures?

  • how to preserve write-ahead logging?


Recovery

  • uses a merged log of all systems that have updated the page

  • recovery LSN (RLSN) is the earliest point in the merged log from which redo processing for a page has to start

    • initialized to HIGH (no recovery needed)

    • changed to the next LSN value when a page is locked in update mode

    • reset to HIGH after the updated page is written to disk

  • global lock manager adjusts RLSN value as it receives information from the systems
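
The RLSN life cycle above can be sketched as a small table kept at the global lock manager. The class `RlsnTable` and its method names are hypothetical. The sketch also assumes that only the first update lock since the last disk write moves the RLSN, which follows from the RLSN being the earliest redo point but is not stated explicitly on the slide.

```python
import math

HIGH = math.inf  # sentinel: no recovery needed for this page


class RlsnTable:
    """Toy GLM-side table of recovery LSNs, keyed by page ID."""

    def __init__(self):
        self.rlsn = {}

    def get(self, page):
        # Pages never locked for update are treated as HIGH.
        return self.rlsn.get(page, HIGH)

    def lock_for_update(self, page, next_lsn):
        # Assumption: later updates must not push the redo start point forward,
        # so the RLSN changes only if it is currently HIGH.
        if self.get(page) == HIGH:
            self.rlsn[page] = next_lsn

    def page_written(self, page):
        # The disk copy is current again, so no redo is needed.
        self.rlsn[page] = HIGH


table = RlsnTable()
table.lock_for_update("p1", 120)
table.lock_for_update("p1", 150)   # RLSN stays at the earlier value
print(table.get("p1"))
table.page_written("p1")
print(table.get("p1") == HIGH)
```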


Single System Failure

  • locking information preserved at the GLM

  • a single system is responsible for merging the logs and doing REDO processing for all pages on behalf of all failed systems

  • pages requiring REDO are those locked in update (U) mode whose RLSN < HIGH

  • the minimum of these RLSN values is the starting point in the merged log for the REDO pass

  • ARIES-style REDO, followed by UNDO

    • if a log record's LSN is greater than the page's LSN, reapply the log record
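
A minimal sketch of the REDO pass described above, under simplifying assumptions that are not in the slides: LSNs in the merged log are globally ordered, a log record is just an `(lsn, page_id)` pair, and "reapplying" a record only advances the page's LSN. The function names `redo_start` and `redo_pass` are hypothetical.

```python
import math

HIGH = math.inf


def redo_start(pages):
    """pages: {page_id: (lock_mode, rlsn)}. Return the minimum RLSN among
    pages locked in U mode with RLSN < HIGH, or None if no page needs REDO."""
    candidates = [rlsn for mode, rlsn in pages.values()
                  if mode == "U" and rlsn < HIGH]
    return min(candidates) if candidates else None


def redo_pass(merged_log, page_lsn, start):
    """merged_log: (lsn, page_id) pairs sorted by LSN.
    page_lsn: {page_id: LSN of the page version on disk}.
    Return the LSNs of the log records that were reapplied."""
    reapplied = []
    for lsn, page in merged_log:
        if lsn < start:
            continue  # before the REDO starting point
        if lsn > page_lsn.get(page, 0):
            page_lsn[page] = lsn  # stand-in for reapplying the update
            reapplied.append(lsn)
    return reapplied


pages = {"p1": ("U", 120), "p2": ("U", HIGH), "p3": ("S", 90)}
merged_log = [(100, "p1"), (120, "p1"), (130, "p2"), (140, "p1")]
page_lsn = {"p1": 100, "p2": 130}

start = redo_start(pages)
reapplied = redo_pass(merged_log, page_lsn, start)
print(start, reapplied)
```

In this example only `p1` needs REDO: `p2` was written to disk after its last update, and `p3` is not locked in U mode.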


Complex System Failure

  • the GLM crashes and at least one local lock manager (LLM) crashes, so locking information is lost

  • each system periodically checkpoints the global lock manager’s state

    • write a Begin_GLM_Checkpoint log record

    • request <pageID, RLSN> for all pages with RLSN not equal to HIGH

    • write these into an End_GLM_Checkpoint log record


Complex System Failure

  • find the minimum RLSN contained in the End_GLM_Checkpoint log record

  • start REDO processing at this RLSN, or at the LSN of the Begin_GLM_Checkpoint log record if all pages have an RLSN of HIGH

  • continue until the end of the log is reached

  • UNDO processing is done by the individual systems
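
The choice of REDO starting point after a complex failure can be sketched as follows. The function name and argument layout are hypothetical. The sketch assumes the End_GLM_Checkpoint record holds only `<pageID, RLSN>` pairs whose RLSN is not HIGH, as described above, so an empty list means every page had an RLSN of HIGH.

```python
def redo_start_from_checkpoint(begin_ckpt_lsn, end_ckpt_pairs):
    """begin_ckpt_lsn: LSN of the Begin_GLM_Checkpoint log record.
    end_ckpt_pairs: (page_id, rlsn) pairs from End_GLM_Checkpoint.
    Return the LSN at which REDO processing should start."""
    if not end_ckpt_pairs:
        # Every page had RLSN == HIGH when the checkpoint was taken.
        return begin_ckpt_lsn
    return min(rlsn for _, rlsn in end_ckpt_pairs)


print(redo_start_from_checkpoint(500, [("p1", 420), ("p4", 465)]))
print(redo_start_from_checkpoint(500, []))
```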


Preserving WAL

  • pages contain slots for attaching log information

    • <systemID, LSN>

  • when transferring a page, a system piggybacks the LSN of the latest log record it hasn’t written to disk

  • the final owner reads the slots and enforces WAL
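
The slot mechanism above can be sketched as follows. Everything here is a hypothetical illustration: the `Page` class, the `transfer` and `write_page` functions, and especially the `force_log` callback, which stands in for whatever mechanism the final owner uses to get another system's log forced to disk (the slides do not describe it).

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    page_id: str
    # systemID -> LSN of that system's latest log record not yet on disk
    slots: dict = field(default_factory=dict)


def transfer(page, sender, latest_unforced_lsn):
    """Sender piggybacks its latest unforced LSN onto the page being sent."""
    if latest_unforced_lsn is not None:
        page.slots[sender] = max(latest_unforced_lsn,
                                 page.slots.get(sender, 0))


def write_page(page, forced_upto, force_log):
    """Final owner enforces WAL: every system's log must be on disk through
    the LSN in its slot before the page itself is written."""
    for system, lsn in page.slots.items():
        if forced_upto.get(system, 0) < lsn:
            force_log(system, lsn)        # hypothetical log-force request
            forced_upto[system] = lsn
    page.slots.clear()                    # page is now safe to write


page = Page("p1")
transfer(page, "A", 40)
transfer(page, "B", 75)

forced_upto = {"A": 50}                   # A's log is already forced past 40
force_calls = []
write_page(page, forced_upto, lambda s, l: force_calls.append((s, l)))
print(force_calls)
```

Only system B's log needs to be forced in this example, because A's log was already on disk beyond the LSN recorded in its slot.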


Conclusion

  • the page transfer schemes incorporate ideas from client server caching for buffer coherency

    • central server maintains LSN information and transactions update this information when they commit

    • lock downgrading

  • caching and fast page transfer can coexist, but both share tradeoffs

    • overhead of maintaining cache/buffer coherency

    • overhead of recovery protocols

