Distributed database management systems

Evolution of DDBMS

  • Distributed database management systems (DDBMS)

    • Interconnected computer systems

    • Data/processing functions reside on multiple sites

  • 1970s: Centralized DBMS

  • 1980s: Social and Technical Changes

    • Ad hoc capability required

    • Decentralized management structure common

  • 1990s: New forces

    • Internet and the World Wide Web used for data access and distribution

    • Data analysis through data mining and data warehousing

DDBMS Advantages

  • Data located near site with greatest demand

  • Faster data access

  • Faster data processing

  • Growth facilitation

  • Improved communications

  • Reduced operating costs

  • User-friendly interface

  • Less danger of single-point failure

  • Processor independence

DDBMS Disadvantages

  • Complexity of management and control

  • Security

  • Lack of standards

  • Increased storage requirements

  • Greater difficulty in managing data environment

  • Increased training costs

Distributed Processing

Shares the database’s logical processing among physically independent, networked sites

Distributed Database

Stores a logically related database across physically independent sites

Distributed Database vs. Distributed Processing

  • Distributed processing

    • Does not require distributed database

    • May be based on a single database on single computer

    • Copies or parts of database processing functions must be distributed to all data storage sites

  • Distributed database

    • Requires distributed processing

  • Both

    • Require a network to connect components

Functions of DDBMS

  • Application/end user interface

  • Validation to analyze data requests

  • Transformation to determine request components

  • Query optimization to find the best access strategy

  • Mapping to determine the data location

  • I/O interface to read or write data

  • Formatting to prepare the data for presentation

  • Security to provide data privacy

  • Backup and recovery

  • DB Administration

  • Concurrency Control

  • Transaction Management

Fully Distributed Database Management System

DDBMS Components

  • Computer workstations

  • Network hardware and software components

  • Communications media

  • Transaction processor (TP)

    • Also called application processor (AP) or transaction manager (TM)

  • Data processor (DP)

    • Also called data manager (DM)

Distributed Database Components

Figure 10.5

DDBMS Protocols

  • Interface with network to transport data and commands between DPs and TPs

  • Synchronize data received from DPs and route to appropriate TPs

  • Ensure common database functions

    • Security

    • Concurrency control

    • Backup and recovery

Levels of Data and Process Distribution

Database systems can be classified based on process distribution and data distribution

Table 10.1

Single-Site Processing, Single-Site Data (SPSD)

  • All processing on single CPU or host computer

  • All data are stored on host computer disk

  • DBMS located on the host computer

  • DBMS accessed by dumb terminals

  • Typical of mainframe and minicomputer DBMSs

  • Typical of the first generation of single-user microcomputer databases

Multiple-Site Processing, Single-Site Data (MPSD)

  • Requires network file server

  • Applications accessed through LAN

  • Variation known as client/server architecture

Multiple-Site Processing, Multiple-Site Data (MPMD)

  • Fully distributed DDBMS with support for multiple DPs and TPs at multiple sites

    • Homogeneous

      • Integrate one type of centralized DBMS over the network

    • Heterogeneous

      • Integrate different types of centralized DBMSs over a network
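The three configurations above differ only in how many sites do processing and how many sites hold data. A minimal Python sketch of that classification follows. The function name `classify` and its integer parameters are assumptions made for illustration; they do not come from the slides. The slides name no configuration for one processing site with several data sites, so the sketch rejects that case rather than guess.

```python
def classify(processing_sites: int, data_sites: int) -> str:
    """Return the slide acronym for a configuration (hypothetical helper)."""
    if processing_sites < 1 or data_sites < 1:
        raise ValueError("need at least one processing site and one data site")
    if data_sites == 1:
        # All data on one site: single-site or multiple-site processing.
        return "SPSD" if processing_sites == 1 else "MPSD"
    if processing_sites > 1:
        # Data and processing are both spread over several sites.
        return "MPMD"
    # The slides name no configuration for this combination.
    raise ValueError("single-site processing with multiple-site data is not covered")
```

For example, `classify(3, 1)` describes several workstations sharing one file server, which the slides call MPSD.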

Distributed DB Transparency

  • Allows each end user to feel like the database’s only user

  • Hides complexities of distributed database

  • Transparency features

    • Distribution

    • Transaction

    • Failure

    • Performance

    • Heterogeneity
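Distribution transparency can be pictured as a catalog lookup that the user never sees. The Python sketch below is only an illustration under stated assumptions: the `SITES` and `CATALOG` dictionaries, the fragment names, and the `select_all` function are all hypothetical, and real sites would be reached over a network rather than held in memory.

```python
# Hypothetical in-memory "sites": each site maps a fragment name to its rows.
SITES = {
    "site_a": {"customer_east": [{"id": 1, "name": "Ann", "region": "east"}]},
    "site_b": {"customer_west": [{"id": 2, "name": "Bo", "region": "west"}]},
}

# Hypothetical distributed catalog: logical table -> list of (site, fragment).
CATALOG = {
    "customer": [("site_a", "customer_east"), ("site_b", "customer_west")],
}


def select_all(table: str) -> list:
    """Return every row of a logical table, hiding where its fragments live."""
    rows = []
    for site, fragment in CATALOG[table]:
        # The caller never names a site or fragment; the catalog supplies both.
        rows.extend(SITES[site][fragment])
    return rows
```

A caller writes `select_all("customer")` and receives rows from both sites without naming either one, which is the effect the transparency features aim for.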

Distributed Concurrency Control

  • Multisite, multiple-process operations more likely to create data inconsistencies and deadlocked transactions

  • Problems

    • Transaction committed by local DP

    • One DP could not commit transaction’s result

    • Yields inconsistent database

Two-Phase Commit Protocol

  • DO-UNDO-REDO protocol

    • Write-ahead protocol

    • Two kinds of nodes

      • Coordinator

      • Subordinates

  • Phases

    • Preparation

      • Coordinator sends message to all subordinates

      • Confirms all are ready to commit or abort

    • Final Commit

      • Ensures all subordinates have committed or aborted
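The two phases above can be sketched in a few lines of Python. This is a simplified illustration, not a production protocol: the class `Subordinate`, the flag `can_commit`, and the function `two_phase_commit` are hypothetical names, and the sketch leaves out logging to disk, timeouts, and recovery after a crash.

```python
class Subordinate:
    """A participant node; `can_commit` stands in for its local checks."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "INIT"

    def prepare(self):
        # Phase 1: vote. A real node would force its log to disk first
        # (write-ahead) before answering the coordinator.
        self.state = "PREPARED" if self.can_commit else "ABORTED"
        return self.can_commit

    def commit(self):
        self.state = "COMMITTED"

    def abort(self):
        self.state = "ABORTED"


def two_phase_commit(subordinates):
    """Coordinator logic: commit only if every subordinate votes yes."""
    # Phase 1 (preparation): collect a vote from every subordinate.
    votes = [s.prepare() for s in subordinates]
    decision = all(votes)
    # Phase 2 (final commit): send the single global decision to everyone.
    for s in subordinates:
        if decision:
            s.commit()
        else:
            s.abort()
    return "COMMIT" if decision else "ABORT"
```

If even one subordinate votes no, every subordinate ends in the aborted state, so no site is left holding a result that the others rejected.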

Performance Transparency and Query Optimization

  • Objective: Minimize total cost associated with execution of request

  • Main costs

    • Access time

    • Communication

    • CPU time

  • Basis for query optimization algorithms

    • Optimum execution order

    • Sites accessed to minimize communication costs

  • Dynamic or static optimization

  • Statistically based vs. rule-based query optimization algorithms
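The objective above amounts to comparing candidate plans by the sum of their three main costs. The Python sketch below shows that comparison under loud assumptions: the `Plan` class, the `cheapest` function, the plan descriptions, and every number are invented for illustration, and the costs are in arbitrary units.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    name: str
    access_time: float      # I/O cost of reading the data
    communication: float    # cost of shipping data between sites
    cpu_time: float         # processing cost

    def total_cost(self) -> float:
        return self.access_time + self.communication + self.cpu_time


def cheapest(plans):
    """Pick the plan with the lowest total cost."""
    return min(plans, key=lambda p: p.total_cost())


# Illustrative numbers only, in arbitrary cost units.
plans = [
    Plan("ship whole table to site A", access_time=4.0, communication=20.0, cpu_time=2.0),
    Plan("filter at site B, ship result", access_time=5.0, communication=3.0, cpu_time=3.0),
]
best = cheapest(plans)
```

With these made-up figures the second plan wins because filtering before shipping cuts the communication term, which is why the choice of sites matters so much in a distributed setting.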

Distributed Database Design

  • Partition database into fragments

    • Horizontal

    • Vertical

    • Mixed

  • Fragments to replicate

    • Storage of data copies at multiple sites

    • Fully replicated, partially replicated, and unreplicated databases

  • Data allocation

    • Where to locate data

    • Centralized, partitioned, replicated
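The three fragmentation strategies above can be shown on a small table. The Python sketch below is illustrative only: the `CUSTOMERS` rows, the column names, and the helper functions `horizontal_fragment` and `vertical_fragment` are made up for this example and do not come from the slides.

```python
# Illustrative rows; the table and column names are made up for this sketch.
CUSTOMERS = [
    {"id": 1, "name": "Ann", "state": "TN", "balance": 120.0},
    {"id": 2, "name": "Bo", "state": "FL", "balance": 0.0},
    {"id": 3, "name": "Cy", "state": "TN", "balance": 75.5},
]


def horizontal_fragment(rows, predicate):
    """Horizontal fragmentation: keep whole rows that satisfy a predicate."""
    return [dict(r) for r in rows if predicate(r)]


def vertical_fragment(rows, columns, key="id"):
    """Vertical fragmentation: keep a subset of columns plus the key,
    so the fragments can be rejoined later."""
    keep = [key] + [c for c in columns if c != key]
    return [{c: r[c] for c in keep} for r in rows]


tn_rows = horizontal_fragment(CUSTOMERS, lambda r: r["state"] == "TN")
names = vertical_fragment(CUSTOMERS, ["name"])
# Mixed fragmentation: a horizontal split followed by a vertical one.
tn_balances = vertical_fragment(tn_rows, ["balance"])
```

Each vertical fragment repeats the key column, since without it the original rows could not be reassembled by a join.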

Client/Server Advantages Over DDBMS

  • Client/server less expensive

  • Client/server solutions allow use of microcomputer’s GUI

  • More people with PC skills than mainframe skills

  • PC is well established in workplace

  • Numerous data analysis and query tools exist

  • Considerable cost advantages to off-loading application development

Client/Server Disadvantages

  • Creates more complex environment with different platforms

  • Increased number of users and sites creates security problems

  • Training issues become more complex and expensive
