
Distributed Databases

Distributed Databases: business needs for distributed databases, introduction to distributed databases, the subscriber / publisher model, snapshots, transactional replication, merge replication, dissimilar databases, implementing a distributed DB, design implications, and advantages and disadvantages.

tomko

Presentation Transcript


  1. Distributed Databases • Business needs for distributed databases • Introduction to distributed databases • Subscriber / Publisher Model • Snapshots • Transactional Replication • Merge Replication • Dissimilar Databases • Implementing Distributed DB • Design Implications • Advantages & Disadvantages

  2. Business Needs for Distributed Databases • The concept of a central database to handle all of the organization’s needs has several potential limitations • Geographically dispersed organization requires extensive database traffic • Large organization creates congestion at the server • Large volumes of data must be moved across the network • The entire organization can be vulnerable to a problem with a single server • Data communications interruptions can disrupt the entire organization’s operations

  3. Business Needs for Distributed Databases (cont.) • Central database limitations (cont.) • Dissimilar operating units create differing data access needs • Local units require autonomy over the design and implementation of DB systems • Information sharing across the organization still requires connectivity • Local unit DB designers will not be allowed to design against the entire DB

  4. Business Needs for Distributed Databases (cont.) • Central database limitations (cont.) • Mergers and acquisitions create ad-hoc integration of dissimilar DB systems • Different business units may have fully developed DB and applications on dissimilar platforms, DBMS, etc. • The organization still requires information sharing for organizational effectiveness • Rewriting the whole system in a single DB is impractical (or may take time to implement)

  5. Distributed Databases • Distributed Databases are characterized by decisions made regarding: • Distribution of data schema • All nodes share same schema or not • Update rights on objects (especially table data) • Latency / concurrency requirements • Commonality of DBMS

  6. Subscriber/Publisher Model • A subscriber / publisher model is often used to describe database updates • Nodes allowed to change data & objects are publishers • Nodes needing to be aware of changes are subscribers • Decisions are made on methods for making subscribers aware of changes and of getting changes to them • Near real time • On demand • Batch • On schedule
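
As a rough illustration of the publisher and subscriber roles above, the following Python sketch contrasts near-real-time delivery with batch delivery. The `Publisher` and `Subscriber` classes and the key names are hypothetical and exist only for this example; they are not part of any DBMS.

```python
from dataclasses import dataclass, field


@dataclass
class Subscriber:
    """A node that needs to be aware of changes (hypothetical class)."""
    name: str
    data: dict = field(default_factory=dict)

    def apply(self, changes: dict) -> None:
        self.data.update(changes)


@dataclass
class Publisher:
    """A node allowed to change data (hypothetical class)."""
    data: dict = field(default_factory=dict)
    subscribers: list = field(default_factory=list)
    pending: dict = field(default_factory=dict)

    def write(self, key, value, immediate=True):
        self.data[key] = value
        if immediate:
            # "Near real time": push the change to every subscriber now.
            for sub in self.subscribers:
                sub.apply({key: value})
        else:
            # "Batch" / "on schedule": hold the change until flush() is called.
            self.pending[key] = value

    def flush(self):
        """Deliver all held changes to every subscriber in one batch."""
        for sub in self.subscribers:
            sub.apply(self.pending)
        self.pending = {}
```

Calling `flush()` from a timer would model "on schedule" delivery, and calling it when a subscriber asks would model "on demand".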

  7. Snapshots • Distribution of databases (except in connecting existing databases) usually starts with a snapshot of all or part of a DB • Copy of structures, data, SP, triggers, etc. • The snapshot is distributed to all nodes • May be different snapshots to different nodes
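
A minimal sketch of taking a snapshot, using Python's built-in `sqlite3` module rather than SQL Server (an assumption made so the example is self-contained). `Connection.backup` copies the whole database, including tables, rows, and triggers; the table and trigger names are invented for illustration.

```python
import sqlite3


def take_snapshot(source: sqlite3.Connection) -> sqlite3.Connection:
    """Copy the whole source database into a new in-memory node."""
    node = sqlite3.connect(":memory:")
    source.backup(node)  # copies structures, data, and triggers
    return node


# A tiny "central" database with a table, an audit table, and a trigger.
central = sqlite3.connect(":memory:")
central.executescript("""
    CREATE TABLE price (item TEXT PRIMARY KEY, amount REAL);
    CREATE TABLE audit (item TEXT);
    CREATE TRIGGER log_price AFTER INSERT ON price
    BEGIN
        INSERT INTO audit VALUES (NEW.item);
    END;
    INSERT INTO price VALUES ('widget', 9.99);
""")

# The remote node starts life as a copy of the central site.
store = take_snapshot(central)
```

After the snapshot, the two copies evolve independently until some replication method brings them back in line.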

  8. A Scenario • Corporate HQ is the central site • Regional HQ or even ‘retail’ locations are remote sites • Remote sites execute frequent transactions • Q: What data is needed in each location for the organization’s business needs?

  9. Transactional Replication • In transactional replication, as each transaction is executed on any node it is ‘published’ to all subscribing nodes, which also execute the transaction • Data integrity rules are checked at each node • Violation of a data integrity rule at any node can roll back the transaction at all nodes • Data is kept relatively current at all nodes
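
The all-or-nothing behavior described above can be sketched with Python's `sqlite3` module. This is a simplified illustration, not a real two-phase commit protocol and not how SQL Server implements replication; the `stock` table and the helper names are assumptions for the example.

```python
import sqlite3


def make_node(start_qty):
    """Create one in-memory node with an integrity rule: qty may not go negative."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE stock (item TEXT PRIMARY KEY, qty INTEGER CHECK (qty >= 0))"
    )
    conn.execute("INSERT INTO stock VALUES ('widget', ?)", (start_qty,))
    conn.commit()
    return conn


def replicate(nodes, sql, params=()):
    """Run one statement on every node; commit everywhere or roll back everywhere."""
    try:
        for conn in nodes:
            conn.execute(sql, params)  # integrity rules are checked at each node
    except sqlite3.IntegrityError:
        for conn in nodes:
            conn.rollback()            # a violation at any node undoes all nodes
        return False
    for conn in nodes:
        conn.commit()
    return True


def qty(conn):
    return conn.execute("SELECT qty FROM stock WHERE item = 'widget'").fetchone()[0]
```

With one node holding 10 widgets and another holding 3, a replicated "subtract 5" fails the check on the second node and is rolled back on both, while "subtract 2" commits on both.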

  10. Transactional Replication (cont.) • Application (“business”) needs control urgency and frequency of updates • Some data is read only at some nodes • Price schedule might be set centrally and only read locally • Sales transactions are probably executed locally and rolled up centrally

  11. Transactional Replication • When is Transactional Replication appropriate? • Higher interaction between actions at nodes (easier to cause conflicts with out-of-date data) • Decision making requires updated information • Frequent changes can cause concurrency problems • Connectivity is not an issue • Detected problems can result in near-real-time rollbacks

  12. Merge Replication • In Merge Replication subscribers may receive a partition of the data • Certain rows • Only customers or employees in their region • Certain columns • Employee contact info but not salary info • Subscribers may add, update, or delete rows to which they have write access • Changes are committed (published) to the subscribers in a batch (merged back into the subscriber DB)
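
The row and column partitioning above can be sketched in a few lines of Python. The `partition` helper and the employee fields are hypothetical, chosen to mirror the slide's example of contact info without salary.

```python
def partition(rows, row_filter, columns):
    """Return only the rows that pass row_filter, trimmed to the given columns."""
    return [{col: row[col] for col in columns} for row in rows if row_filter(row)]


employees = [
    {"id": 1, "name": "Ann", "region": "east", "phone": "555-0101", "salary": 70000},
    {"id": 2, "name": "Bob", "region": "west", "phone": "555-0102", "salary": 65000},
]

# The east subscriber receives east employees only, and no salary column.
east_copy = partition(
    employees,
    row_filter=lambda row: row["region"] == "east",
    columns=["id", "name", "phone"],
)
```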

  13. Merge Replication (cont.) • System is able to detect when remote site copy of data has changed (including new records) • Changed data is marked for updating in central copy during merge
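
One simple way to picture the change detection and marking described above is a remote copy that records which keys it has touched since the last merge. The `RemoteCopy` class and `merge_into` function are hypothetical, and the sketch deliberately ignores conflict resolution (two sites changing the same row).

```python
class RemoteCopy:
    """A remote site's copy that marks rows changed since the last merge."""

    def __init__(self, snapshot):
        self.rows = dict(snapshot)
        self.changed = set()   # keys inserted or updated locally
        self.deleted = set()   # keys deleted locally

    def upsert(self, key, row):
        self.rows[key] = row
        self.changed.add(key)
        self.deleted.discard(key)

    def delete(self, key):
        self.rows.pop(key, None)
        self.changed.discard(key)
        self.deleted.add(key)


def merge_into(central, remote):
    """Apply only the marked changes to the central copy, then clear the marks."""
    for key in remote.changed:
        central[key] = remote.rows[key]
    for key in remote.deleted:
        central.pop(key, None)
    applied = len(remote.changed) + len(remote.deleted)
    remote.changed.clear()
    remote.deleted.clear()
    return applied
```

Only marked rows travel during the merge, which is what makes the batch approach workable over slow or intermittent links.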

  14. Merge Replication (cont.) • When is merge replication appropriate? • Few chances for node operations to create conflicts • Highly autonomous activities • Different lines of business • Infrequent changes requiring immediate awareness by all subscribers • Physical connectivity issues • May create more complex problems when a conflict does occur • Rolling back already committed transactions

  15. Dissimilar Databases • Distributed DB nodes may be dissimilar on two dimensions • DB architecture (table structure, field data types/names, etc.) • DBMS and OS (may not even be relational data) • “Messages” sent between nodes to inform them of updates must be translated somewhere • Imposes new layers of complexity for connectivity • SQL Server provides support for this process • Many third party products for logical integration

  16. Implementing DB Distribution • SQL Server comes with a wealth of distributed DB management tools • Specify publication schedules, rights, update frequencies, etc. • Manage conflicts when they occur and notify clients • Perform translations between DBMS • Perform translations between structures

  17. Design Implications • Some DB designs may change when the DB is replicated • Relationships may not be enforced in remote nodes because matching parent rules may not exist • GUID attributes may be needed for PKs since independently generated Identity attributes could conflict when rolled up • Triggers or constraints may be different • May violate locally but be OK globally • Vice-versa
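
The point about GUID primary keys can be demonstrated with Python's standard `uuid` module. In this sketch, two nodes that each number their rows 1, 2, 3 collide when rolled up, while randomly generated UUIDs do not. The helper names are hypothetical.

```python
import itertools
import uuid


def identity_factory():
    """Mimic an identity column: each node counts 1, 2, 3, ... on its own."""
    counter = itertools.count(1)
    return lambda: next(counter)


def make_sales(n, new_key):
    """Create n sales rows keyed by whatever new_key() returns."""
    return {new_key(): f"sale-{i}" for i in range(n)}


# Independently generated identity keys overlap across nodes.
node_a = make_sales(3, identity_factory())
node_b = make_sales(3, identity_factory())
colliding_keys = set(node_a) & set(node_b)

# GUID keys are generated independently, yet a collision is vanishingly unlikely.
guid_a = make_sales(3, lambda: str(uuid.uuid4()))
guid_b = make_sales(3, lambda: str(uuid.uuid4()))
rolled_up = {**guid_a, **guid_b}
```

Here `colliding_keys` contains every identity value both nodes issued, whereas `rolled_up` keeps all six GUID-keyed rows.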

  18. Database Distribution Advantages & Tradeoffs • Key advantages of distributed DB • Increased reliability • Local access and control • Modular growth • Lower communication costs • Faster response • What are the mechanisms that give rise to these advantages?

  19. Database Distribution Advantages & Tradeoffs (cont.) • Disadvantages of distributed DB • Software cost & complexity • Keeping data current • Maintaining data integrity • Integrating multiple sites and applications • Processing overhead • Slow response from poor design

  20. Distributed DBMS (cont.) • Distributed DBMS attempts to achieve “Location Transparency” • User or application will not need to know that the query is going to multiple nodes • User has one integrated DB schema • Distributed DBMS performs all network operations • Also seeks to achieve “Replication Transparency” • Replication operations are performed automatically • Manages multiple updates against different copies of replicated data
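
The two kinds of transparency above can be pictured as a facade that hides the nodes from the caller. The `DistributedDB` class below is a hypothetical in-memory sketch, with plain dictionaries standing in for nodes; a real distributed DBMS would also handle networking, failures, and concurrency.

```python
class DistributedDB:
    """Hypothetical facade: callers see one logical table, not the nodes behind it."""

    def __init__(self, nodes):
        self.nodes = nodes  # node name -> {key: row}

    def query(self, predicate):
        """Location transparency: fan the query out and combine the results."""
        found = {}
        for rows in self.nodes.values():
            for key, row in rows.items():
                if predicate(row):
                    found[key] = row
        return found

    def update(self, key, changes):
        """Replication transparency: apply the change to every copy of the row."""
        touched = 0
        for rows in self.nodes.values():
            if key in rows:
                rows[key].update(changes)
                touched += 1
        return touched
```

The caller never names a node: `query` decides where to look and `update` finds every replica on its own.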
