Distributed Databases

reedtaylor

Presentation Transcript


  1. Distributed Databases and Replication

  2. Definitions • Distributed Database: A single logical database that is spread physically across computers in multiple locations that are connected by a data communications link. • Decentralized Database: A collection of independent databases on non-networked computers.

  3. Reasons for Distributed Database • Local business units want control over data. • Consolidate data across local databases for integrated decision making. • Reduce telecommunications costs. • Reduce the risk of telecommunications failures.

  4. Distributed Database Options • Fig. 11-1. (Slide 11-5) • Homogeneous - Same DBMS at each node. • Autonomous - Independent DBMSs. • Non-autonomous - Central, coordinating DBMS. • Heterogeneous - Different DBMSs at different nodes. • Gateways - Simple paths are created to other databases without the benefits of one logical database.

  5. Distributed database environments (adapted from Bell and Grimson, 1992)

  6. Distributed Database Options • Systems - Support some or all of the functionality of one logical database. • Full DBMS Functionality - All distributed database functions. • Partial-Multi-database - Some distributed database functions. • Federated - Supports local databases for unique data requests. • Loose Integration - Local databases have their own schemas. • Tight Integration - Local databases use a common schema. • Unfederated - Requires all access to go through a central, coordinating module.

  7. Homogeneous, Non-Autonomous Database • Fig. 11-2. • Data is distributed across all the nodes. • Same DBMS at each node. • All data is managed by the distributed DBMS (no exclusively local data). • All access is through one global schema. • The global schema is the union of all the local schemas.
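
To illustrate the idea of a global schema as the union of the local schemas, here is a minimal Python sketch. The node names, table names, and the `build_global_schema` and `locate` helpers are hypothetical and not taken from the slides; a real distributed DBMS keeps this information in its own catalog.

```python
# Hypothetical local schemas: node name -> {table name: [column names]}.
local_schemas = {
    "node_a": {"customer": ["id", "name"]},
    "node_b": {"order": ["id", "customer_id", "total"]},
}

def build_global_schema(local_schemas):
    """Union all local schemas, remembering which nodes hold each table."""
    global_schema = {}
    for node, tables in local_schemas.items():
        for table, columns in tables.items():
            entry = global_schema.setdefault(table, {"columns": [], "nodes": []})
            for column in columns:
                if column not in entry["columns"]:
                    entry["columns"].append(column)
            entry["nodes"].append(node)
    return global_schema

def locate(global_schema, table):
    """Return the nodes that store `table`; users query by table name only."""
    return global_schema[table]["nodes"]

global_schema = build_global_schema(local_schemas)
```

A query written against `global_schema` names only the table; the `locate` lookup stands in for the routing step the distributed DBMS would perform.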

  8. Focus on the Following Heterogeneous Environment • Fig. 11-3. • Data distributed across all the nodes. • Different DBMSs may be used at each node. • Local access is done using the local DBMS and schema. • Remote access is done using the global schema.

  9. Objectives and Trade-offs • Location Transparency - User does not have to know the location of the data. • Local Autonomy - Local site can operate with its database when central site is down. • Synchronous Distributed Database - All copies of the same data are always identical. • Asynchronous Distributed Database - Some data inconsistency is tolerated.
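
The difference between the synchronous and asynchronous options can be sketched in a few lines of Python. This is a toy, single-process model under stated assumptions: the `ReplicatedValue` class and its method names are hypothetical, and a real synchronous system would need a commit protocol across the network rather than a simple loop.

```python
from collections import deque

class ReplicatedValue:
    """One data item copied to several sites (toy model, no real networking)."""

    def __init__(self, sites):
        self.copies = {site: None for site in sites}
        self.pending = deque()  # updates not yet applied at remote sites

    def write_sync(self, value):
        # Synchronous: every copy is updated before the write returns.
        for site in self.copies:
            self.copies[site] = value

    def write_async(self, origin, value):
        # Asynchronous: update the originating site now, queue the rest.
        self.copies[origin] = value
        for site in self.copies:
            if site != origin:
                self.pending.append((site, value))

    def propagate(self):
        # Apply queued updates later, for example at a scheduled interval.
        while self.pending:
            site, value = self.pending.popleft()
            self.copies[site] = value

    def consistent(self):
        return len(set(self.copies.values())) == 1

item = ReplicatedValue(["east", "west"])
item.write_sync(10)
after_sync = item.consistent()
item.write_async("east", 20)
after_async = item.consistent()
item.propagate()
after_propagate = item.consistent()
```

Between `write_async` and `propagate` the two copies disagree, which is the tolerated inconsistency of the asynchronous option.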

  10. Advantages of Distributed Database • Increased reliability and availability. • Local control over data. • Modular growth. • Lower communication costs. • Faster response for certain queries.

  11. Disadvantages of Distributed Database • Software cost and complexity. • Processing overhead. • Data integrity exposure. • Slower response for certain queries.

  12. Options for Distributing a Database • Data replication. • Horizontal partitioning. • Vertical partitioning. • Combinations of the above.
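
As a rough illustration of the two partitioning options, the Python sketch below splits a small in-memory table both ways. The table, its column names, the site names, and the helper functions are all hypothetical. Horizontal partitioning puts whole rows at different sites; vertical partitioning puts different columns at different sites and repeats the primary key so the rows can be rejoined.

```python
# Hypothetical customer table; columns and values are illustrative only.
customers = [
    {"id": 1, "name": "Ada", "region": "east", "credit_limit": 5000},
    {"id": 2, "name": "Bo", "region": "west", "credit_limit": 1200},
    {"id": 3, "name": "Cy", "region": "east", "credit_limit": 800},
]

def horizontal_partition(rows, key):
    """Group whole rows by the value of `key`; each fragment keeps all columns."""
    fragments = {}
    for row in rows:
        fragments.setdefault(row[key], []).append(row)
    return fragments

def vertical_partition(rows, primary_key, column_groups):
    """Split columns across sites; every fragment repeats the primary key."""
    return {
        site: [{c: row[c] for c in [primary_key, *cols]} for row in rows]
        for site, cols in column_groups.items()
    }

def rejoin_vertical(fragments, primary_key):
    """Rebuild full rows by joining the vertical fragments on the primary key."""
    merged = {}
    for rows in fragments.values():
        for row in rows:
            merged.setdefault(row[primary_key], {}).update(row)
    return [merged[k] for k in sorted(merged)]

by_region = horizontal_partition(customers, "region")
by_columns = vertical_partition(
    customers, "id", {"sales": ["name", "region"], "finance": ["credit_limit"]}
)
```

Taking the union of the `by_region` fragments, or joining the `by_columns` fragments on `id`, gives back the original table.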

  13. Data Replication • Advantages - • Reliability. • Fast response. • May avoid complicated distributed transaction integrity routines (if replicated data is refreshed at scheduled intervals). • De-couples nodes (transactions proceed even if some nodes are down). • Reduced network traffic at prime time (if updates can be delayed).

  14. Data Replication • Disadvantages - • Additional requirements for storage space. • Additional time for update operations. • Complexity and cost of updating. • Integrity exposure: users may read incorrect data if replicated copies are not updated simultaneously. • Therefore, replication works better for non-volatile data.

  15. Types of Data Replication • Snapshot Replication - • Changes are periodically sent to a master site which sends an updated snapshot out to the other sites. • Near Real-Time Replication - • Broadcast update orders without requiring confirmation. • Pull Replication - • Each site controls when it wants updates.
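
The three replication types can be contrasted with a small Python simulation. This is only a sketch under stated assumptions: the `Master` and `Site` classes, the version counter, and the `reachable` set are hypothetical devices for the example, and no real network or DBMS is involved.

```python
import copy

class Master:
    """Master site that collects changes and hands out snapshots."""

    def __init__(self):
        self.data = {}
        self.version = 0

    def receive_changes(self, changes):
        self.data.update(changes)
        self.version += 1

    def snapshot(self):
        return self.version, copy.deepcopy(self.data)

class Site:
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.version = 0

    def load_snapshot(self, version, data):
        self.version, self.data = version, data

    def pull(self, master):
        # Pull replication: the site decides when to ask for updates.
        if master.version > self.version:
            self.load_snapshot(*master.snapshot())

def push_snapshot(master, sites):
    # Snapshot replication: the master sends a fresh snapshot to the sites.
    for site in sites:
        site.load_snapshot(*master.snapshot())

def broadcast(sites, key, value, reachable):
    # Near real-time replication: send the update, collect no confirmation,
    # so a site that is unreachable simply misses it.
    for site in sites:
        if site.name in reachable:
            site.data[key] = value

master = Master()
a, b = Site("a"), Site("b")
master.receive_changes({"price": 10})
push_snapshot(master, [a])      # only site a receives the snapshot
b_before_pull = dict(b.data)
b.pull(master)                  # site b catches up when it chooses to
broadcast([a, b], "price", 12, reachable={"a"})
```

After the broadcast, site `a` holds the new price and site `b` still holds the old one, which shows why unconfirmed broadcasts trade consistency for speed.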

  16. Issues in Data Replication Use • Data timeliness. • Useful if the DBMS cannot reference data from more than one node. • Batched updates can cause performance problems. • Updates are complicated by heterogeneous DBMSs or database designs. • Telecommunications speeds may limit mass updates.
