
Lecture 4: DBMS Architecture

Lecture 4: DBMS Architecture. Sept. 5, 2007 ChengXiang Zhai. Most slides are adapted from Kevin Chang’s lecture slides. DBMS Mission Statement. Simply: maintenance and computation of data But how to do it?. Data. Operations. Results. DBMS Architecture. User/Web Forms/Applications/DBA.


Presentation Transcript


  1. Lecture 4: DBMS Architecture Sept. 5, 2007 ChengXiang Zhai Most slides are adapted from Kevin Chang’s lecture slides

  2. DBMS Mission Statement • Simply: maintenance and computation of data • But how to do it? Data Operations Results

  3. DBMS Architecture • Clients: User / Web Forms / Applications / DBA, issuing queries and transactions • Query path: Query Parser → Query Rewriter → Query Optimizer → Query Executor • Transaction path: Transaction Manager, Logging & Recovery, Lock Manager • Shared layers: Files & Access Methods, Buffer Manager, Storage Manager • Main memory: Buffers, Lock Tables • Storage

  4. A Design Dilemma • To what extent should we reuse OS services? • Reuse as much as we can • Performance problem (inefficient) • Lack of control (incorrect crash recovery) • Replicating some OS functions (“mini OS”) • Have its own buffer pool • Directly manage record structures with files • …

  5. OS vs. DBMS Similarities? • What do they manage? • What do they provide?

  6. OS vs. DBMS: Similarities • Purpose of an OS: • managing hardware • presenting interface abstraction to applications • DBMS is in some sense an OS? • DBMS manages data • presenting interface abstraction to applications • Both as API for application development!

  7. Applications built upon DBMS • ERP: Enterprise Resource Planning • SAP, Baan, PeopleSoft, Oracle, IBM,... • CRM: Customer Relationship Management • E.piphany, Siebel, Vantive, Oracle, IBM, ... • SCM: Supply Chain Management • Trilogy, i2, Oracle, IBM, ... • A lot more in the Info Tech era: • e-business software • scientific data • multimedia • data analysis and decision support

  8. OS vs. DBMS: Related Concepts • Process management → What DB concepts? • process synchronization • deadlock handling • Storage management → What DB concepts? • virtual memory • file system • Protection and security → What DB concepts? • authentication • access control

  9. OS vs. DBMS: Differences?

  10. OS vs. DBMS: Differences • DBMS: Top-down to encapsulate high-level semantics! • Data • data with particular logical structures • Queries • query language with well defined operations • Transactions • transactions with ACID properties • OS: Bottom-up to present low-level hardware

  11. DBMS on top of OS: Relations vs. File system • Data object abstraction • file: array of characters • relation: set of tuples • Physical contiguity: • large DB files want clustering of blocks • sol1: managing raw disks by DBMS • sol2: simulate by managing free spaces in DBMS • Multiple trees (access methods) • file access: directory hierarchy (user access method) • block access: inodes • tuple access: DBMS indexes

  12. Problems with DBMS on top of OS • Buffer pool management • File system • Process management • Consistency control • Paged virtual memory

  13. Buffer Pool Management • LRU replacement • Query-aware replacement needed for performance • Examples: hash join, sort-merge join… • Prefetching • DBMS knows exactly which block is to be fetched next • Crash recovery • Need “selected force out”
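
The point about query-aware replacement can be illustrated with a small simulation. The sketch below is not from the slides: it assumes a repeated sequential scan (for example, the inner relation of a nested-loop join) over 4 pages with only 3 buffer frames, and compares plain LRU with an MRU policy that a DBMS could choose because it knows the access pattern. The function name `count_hits` and the page counts are hypothetical.

```python
from collections import OrderedDict

def count_hits(accesses, capacity, policy):
    """Simulate a buffer pool and return the number of buffer hits.

    policy is "lru" (evict least recently used) or "mru" (evict most recently used).
    """
    pool = OrderedDict()  # page id -> None, ordered from least to most recently used
    hits = 0
    for page in accesses:
        if page in pool:
            hits += 1
            pool.move_to_end(page)  # mark as most recently used
            continue
        if len(pool) >= capacity:
            # last=False pops the least recently used page, last=True the most recent
            pool.popitem(last=(policy == "mru"))
        pool[page] = None
    return hits

# Hypothetical workload: 4 pages scanned 5 times with only 3 buffer frames.
scan = [p for _ in range(5) for p in range(4)]
lru_hits = count_hits(scan, capacity=3, policy="lru")
mru_hits = count_hits(scan, capacity=3, policy="mru")
print("LRU hits:", lru_hits)
print("MRU hits:", mru_hits)
```

With LRU, every page is evicted just before it is needed again, so the scan never hits the buffer; MRU keeps most of the relation resident. An OS-level buffer manager that only offers LRU cannot make this choice.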

  14. Updating Semantics • Update emp.sal = 0.8*emp.sal if emp.sal > mgr.sal

      empname  sal  manager
      Smith    10k  Brown
      Jones    9k
      Brown    11k  Jones

      • what are the possible semantics? • INGRES solution: deferred updates • buffer updates in an intentions list before applying them (the list also serves as a redo log) • an example of “needing buffer knowledge in DBMS”, so perhaps not sensible to do buffer management (BM) totally in the OS
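
To see why the semantics are ambiguous, the following sketch contrasts in-place updates with deferred updates via an intentions list. It is an illustration under assumptions, not the INGRES implementation: the table is read as Smith (10k, manager Brown), Jones (9k, no manager), Brown (11k, manager Jones), salaries are plain integers, and the function names are hypothetical.

```python
def immediate_update(emps, order):
    """Apply the salary cut in place, visiting employees in the given order."""
    sal = {name: s for name, (s, _) in emps.items()}
    for name in order:
        mgr = emps[name][1]
        if mgr is not None and sal[name] > sal[mgr]:
            sal[name] = sal[name] * 8 // 10  # later rows see the changed salary
    return sal

def deferred_update(emps):
    """Evaluate the condition on the old values, then apply the intentions list."""
    intentions = []  # (name, new salary) pairs; could double as a redo log
    for name, (s, mgr) in emps.items():
        if mgr is not None and s > emps[mgr][0]:
            intentions.append((name, s * 8 // 10))
    sal = {name: s for name, (s, _) in emps.items()}
    for name, new_sal in intentions:
        sal[name] = new_sal
    return sal

# Assumed reading of the slide's table: name -> (salary, manager or None).
emps = {"Smith": (10000, "Brown"), "Jones": (9000, None), "Brown": (11000, "Jones")}

brown_first = immediate_update(emps, ["Brown", "Jones", "Smith"])
smith_first = immediate_update(emps, ["Smith", "Jones", "Brown"])
deferred = deferred_update(emps)
print(brown_first, smith_first, deferred)
```

If Brown is processed first, Brown's salary drops below Smith's and Smith is then cut as well; if Smith is processed first, Smith is untouched. The deferred version depends only on the original values, so its result does not depend on scan order.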

  15. As the data model and application context change, so does the DBMS architecture…

  16. Post-Relational DB Projects • Motivation: • RDBMS not powerful enough for non-administrative data-intensive applications such as: CAD/CAM, GIS… • Buzz terms: object-oriented, extensible • Sample projects • Postgres: U.C. Berkeley • Starburst: IBM Almaden – “highly extensible” • after System R (relational), R* (distributed) • ultimately finding its way into IBM DB2 UDB • Exodus: U. Wisconsin • not a complete DB; an OO-style storage manager toolkit • followed by Shore at Wisconsin, Predator at Cornell

  17. Quest for a Richer Model • Object-oriented data model • Extensible ADTs • Programming-language constructs

  18. ORDBMS vs. OODBMS • Question: How important is the relation? • ORDBMS: • RDBMS + OO features • query-based • OODBMS: • OO PL + database features (persistent objects) • programming-based • Meeting in the middle

  19. Stonebraker’s Matrix • Prediction: ORDBMS will dominate • evidence: big DB players are all on this side

                Simple Data   Complex Data
      Query     RDBMS         ORDBMS
      No Query  File System   OODBMS

  20. Object Orientation Concepts • Classes: • classes as types • encapsulation: interface + implementation • inheritance: building class hierarchies • Objects: • complex objects: • built from constructors, e.g., set-of, array, nested objs • object identity (OID): • system generated as unique object reference • enables (efficient) object linking and navigation

  21. POSTGRES Data Model POSTGRES data model: • OO constructs • classes as relations • object (class instance) = tuple • object-id = tuple-id • method = attribute or function of attributes • inheritance (multiple parents) • ADT constructs: • types • functions

  22. POSTGRES Functions • Arbitrary C functions • e.g.: overpaid(Employee) • arbitrary semantics-- not optimized • no fancy access methods-- typically sequential scan • Binary operators • “hints” to provide semantics • extensible access methods • extensible B+tree or user-defined index • PostQuel procedures • parameterized queries as functions • e.g.: sal-lookup(name): retrieve Emp.salary where Emp.name = name
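
The two kinds of functions on this slide can be mimicked with Python's built-in `sqlite3` module. This is only an analogy: SQLite and SQL stand in for POSTGRES and PostQuel, the sample rows and the 9500 threshold are invented, and `overpaid` here takes a salary rather than a whole Employee tuple.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Emp (name TEXT PRIMARY KEY, salary INTEGER)")
conn.executemany("INSERT INTO Emp VALUES (?, ?)",
                 [("Smith", 10000), ("Jones", 9000)])

def sal_lookup(name):
    """Parameterized query wrapped as a function; None if no such employee."""
    row = conn.execute("SELECT salary FROM Emp WHERE name = ?", (name,)).fetchone()
    return row[0] if row else None

def overpaid(salary):
    """Arbitrary user code: the engine cannot see inside it or use an index for it."""
    return int(salary > 9500)  # hypothetical threshold

# Register the Python function so it can be called from queries.
conn.create_function("overpaid", 1, overpaid)
overpaid_names = [r[0] for r in conn.execute(
    "SELECT name FROM Emp WHERE overpaid(salary) ORDER BY name")]

print(sal_lookup("Smith"), overpaid_names)
```

Because the engine knows nothing about what `overpaid` computes, it has to call it on every row, which matches the slide's remark that such functions typically force a sequential scan.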

  23. POSTGRES Storage System We were guided by a missionary zeal to do something different… • No-overwrite system • Logging: • old values are not overwritten-- no value logging necessary • log only needs to keep transaction state (commit/abort/going) • crash recovery-- how? • Vacuum-cleaner daemon to archive historical data • Advantages: • recovery is cheap • time travel is easy
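
A toy model can make the no-overwrite idea concrete. The sketch below is a deliberate simplification, not the POSTGRES design: versions are kept in an in-memory dictionary, transaction ids double as timestamps, and the class and method names are hypothetical. It shows why the log only needs transaction state and why recovery and time travel come cheaply.

```python
class NoOverwriteStore:
    def __init__(self):
        self.versions = {}  # key -> list of (xid, value), oldest first, never overwritten
        self.status = {}    # xid -> "going" | "commit" | "abort" (the only log kept)
        self.next_xid = 1

    def begin(self):
        xid = self.next_xid
        self.next_xid += 1
        self.status[xid] = "going"
        return xid

    def write(self, xid, key, value):
        # Append a new version; old values stay in place.
        self.versions.setdefault(key, []).append((xid, value))

    def commit(self, xid):
        self.status[xid] = "commit"

    def read(self, key, as_of=None):
        """Newest committed value, optionally as of transaction id `as_of`."""
        for xid, value in reversed(self.versions.get(key, [])):
            if self.status.get(xid) == "commit" and (as_of is None or xid <= as_of):
                return value
        return None

    def recover(self):
        """After a crash, unfinished transactions are simply marked aborted."""
        for xid, state in self.status.items():
            if state == "going":
                self.status[xid] = "abort"

store = NoOverwriteStore()
t1 = store.begin(); store.write(t1, "sal:Smith", 10000); store.commit(t1)
t2 = store.begin(); store.write(t2, "sal:Smith", 8000); store.commit(t2)
t3 = store.begin(); store.write(t3, "sal:Smith", 0)  # crash before commit
store.recover()
current = store.read("sal:Smith")
past = store.read("sal:Smith", as_of=t1)
print(current, past)
```

Recovery touches only the status table, with no undo or redo of data, and a historical read is the same lookup with an earlier cutoff. The cost, as in the sketch's `read`, is that readers must sift through versions to find the current one.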

  24. Storage System: Problems • Problems • flushing differential data by commit time can be costly • unless “stable” main memory • more costly than sequentially writing out logs • reads have to stitch together current picture • And, yes, there are lots of details unexplored or unexplained

  25. Questing for the Right Models Speaking about knowledge representation– The simple relational model is by far the only successful KR paradigm. When the relational model came along, the network guys resisted and their companies went under. … When the OO model came along, the relational guys absorbed its best, and their companies prospered again! -- Jeffrey Ullman

  26. What You Should Know • What are some major limitations of services provided by an OS in supporting a DBMS? • In response to such limitations, what does a DBMS do? • As the data model and task environment change, the architecture will also need to change

  27. Carry Away Messages • One size usually doesn’t fit all! • An OS is designed to serve all kinds of applications, so it’s not optimal for supporting a DBMS • Other examples: a search engine is designed to serve all kinds of people, so it’s not optimal for a particular person (personalized search) • When a problem is recognized, there are often opportunities for breakthroughs in multiple areas • DBMS could take over OS functions • OS could provide more opportunities for customization • From “day 1”, high efficiency has been the primary challenge/concern in designing and implementing a DBMS; reliability is another major concern • In contrast, “accuracy of answers” is at least as important as efficiency for a Web search engine • In the future, accuracy of answers will likely become more important for new applications of databases
