Introduction to Database Systems Chpt 1

Introduction to Database Systems Chpt 1. Instructor: Xintao Wu. http://www.sigmod.org/record/issues/0606/index.html. History. 60s C. Bachman GE network data model Late 60s IBM IMS hierarchical data model 70 E.Codd relational model

Presentation Transcript


  1. Introduction to Database Systems, Chpt 1. Instructor: Xintao Wu. Ramakrishnan & Gehrke

  2. http://www.sigmod.org/record/issues/0606/index.html

  3. History • 60s: C. Bachman, GE, network data model • Late 60s: IBM IMS, hierarchical data model • 70: E. Codd, relational model • 80s: SQL, IBM System R, transactions (J. Gray) • Late 80s-90s: DB2, Oracle, Informix, Sybase • 90s-10s: data warehouses, the Internet • 10s onward: Big Data, NoSQL/NewSQL (M. Stonebraker) • Turing award and Turing test? (Turing award list, Turing website)

  4. What Is a DBMS? • A database is a very large, integrated collection of data. • It models a real-world enterprise. • Entities (e.g., students, courses) • Relationships (e.g., Madonna is taking ITCS6160) • A Database Management System (DBMS) is a software package designed to maintain and utilize databases.

  5. Why Use a DBMS? • Data independence and efficient access. • Reduced application development time. • Data integrity and security. • Uniform data administration. • Concurrent access, recovery from crashes.

  6. Why Study Databases?? • Shift from computation to information • at the “low end”: scramble to webspace • at the “high end”: scientific applications • Datasets increasing in diversity and volume. • Digital libraries, interactive video, Human Genome project, EOS project • ... need for DBMS exploding • DBMS encompasses most of CS • OS, languages, theory, “A”I, multimedia, logic

  7. Data Models • A data model is a collection of concepts for describing data. • A schema is a description of a particular collection of data, using the given data model. • The relational model of data is the most widely used model today. • Main concept: relation, basically a table with rows and columns. • Every relation has a schema, which describes the columns, or fields.

  8. Levels of Abstraction • Many views, single conceptual (logical) schema and physical schema. • Views describe how users see the data. • Conceptual schema defines logical structure. • Physical schema describes the files and indexes used. • Schemas are defined using DDL; data is modified/queried using DML. [Figure: View 1, View 2 and View 3 sit on top of the Conceptual Schema, which sits on top of the Physical Schema.]

  9. Example: University Database • Conceptual schema: • Students(sid: string, name: string, login: string, age: integer, gpa: real) • Courses(cid: string, cname: string, credits: integer) • Enrolled(sid: string, cid: string, grade: string) • Physical schema: • Relations stored as unordered files. • Index on first column of Students. • External Schema (View): • Course_info(cid: string, enrollment: integer)
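
The university schema on this slide can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the slide gives no SQL, so the type mapping (string to TEXT, real to REAL), the index name students_sid_idx, the definition of Course_info as a count of Enrolled rows per course, and the sample rows (including the course ITCS6161) are all hypothetical.

```python
import sqlite3


def build_university_db():
    """Create the three relations, one index, and one view in memory."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Conceptual schema: the three relations from the slide.
        CREATE TABLE Students (sid TEXT, name TEXT, login TEXT, age INTEGER, gpa REAL);
        CREATE TABLE Courses  (cid TEXT, cname TEXT, credits INTEGER);
        CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);

        -- Physical schema: an index on the first column of Students
        -- (the index name is an assumption).
        CREATE INDEX students_sid_idx ON Students (sid);

        -- External schema: one plausible definition of Course_info.
        CREATE VIEW Course_info AS
            SELECT cid, COUNT(*) AS enrollment FROM Enrolled GROUP BY cid;
    """)
    return conn


conn = build_university_db()
# Hypothetical sample data, only to show the view at work.
conn.executemany(
    "INSERT INTO Enrolled VALUES (?, ?, ?)",
    [("s1", "ITCS6160", "A"), ("s2", "ITCS6160", "B"), ("s1", "ITCS6161", "A")],
)
course_info = conn.execute(
    "SELECT cid, enrollment FROM Course_info ORDER BY cid"
).fetchall()
print(course_info)
```

A user of Course_info sees only course ids and enrollment counts, not the Enrolled rows behind them, which is the point of an external schema.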

  10. Data Independence • Applications insulated from how data is structured and stored. • Logical data independence: Protection from changes in logical structure of data. • Physical data independence: Protection from changes in physical structure of data. • One of the most important benefits of using a DBMS!
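
Physical data independence can be illustrated with a small sqlite3 sketch: the application's query text stays the same and returns the same answer whether or not an index exists. The sample rows, the index name, and the helper lookup are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students (sid TEXT, name TEXT, login TEXT, age INTEGER, gpa REAL)"
)
# Hypothetical rows.
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?, ?, ?)",
    [("s1", "Ann", "ann@cs", 20, 3.4), ("s2", "Bob", "bob@cs", 21, 3.1)],
)

# The application only ever issues this query.
QUERY = "SELECT name FROM Students WHERE sid = ?"


def lookup(sid):
    """Run the unchanged application query."""
    return conn.execute(QUERY, (sid,)).fetchall()


before = lookup("s2")  # answered by scanning the table

# Change the physical schema: add an index on sid.
conn.execute("CREATE INDEX students_sid_idx ON Students (sid)")

after = lookup("s2")  # same query text, now free to use the index
print(before == after)
```

Only the access path changed; the application code did not.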

  11. Structure of a DBMS • A typical DBMS has a layered architecture. • The figure does not show the concurrency control and recovery components. • This is one of several possible architectures; each system has its own variations. [Figure: layers from top to bottom: Query Optimization and Execution; Relational Operators; Files and Access Methods; Buffer Management; Disk Space Management; DB. Annotation: these layers must consider concurrency control and recovery.]

  12. Transaction Management: ACID properties • Atomicity: All actions in the Xact happen, or none happen. • Consistency: If each Xact is consistent, and the DB starts consistent, it ends up consistent. • Isolation: Execution of one Xact is isolated from that of other Xacts. • Durability: If a Xact commits, its effects persist. • The Recovery Manager guarantees Atomicity & Durability.
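
Atomicity can be seen in a minimal sqlite3 sketch: a transfer that fails between its two writes is rolled back, so neither write survives. The Accounts table, the balances, and the fail_midway switch are hypothetical and exist only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO Accounts VALUES (?, ?)", [("A", 1000), ("B", 1000)])
conn.commit()


def transfer(src, dst, amount, fail_midway=False):
    """Move amount from src to dst as one transaction (all or nothing)."""
    try:
        conn.execute(
            "UPDATE Accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
        )
        if fail_midway:
            raise RuntimeError("simulated failure between the two writes")
        conn.execute(
            "UPDATE Accounts SET balance = balance - ? WHERE name = ?", (amount, src)
        )
        conn.commit()
    except RuntimeError:
        conn.rollback()  # undo the partial work: none of the actions happen


def balances():
    return dict(conn.execute("SELECT name, balance FROM Accounts"))


transfer("B", "A", 100, fail_midway=True)
after_failure = balances()  # unchanged: the first write was undone

transfer("B", "A", 100)
after_success = balances()  # both writes applied together
print(after_failure, after_success)
```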

  13. Motivation of concurrency control • Consistency • Isolation • Example • Two parallel transactions T1 and T2 • Serial execution • Execution with interleaving actions • Example shown on board

  14. Example • Consider two transactions (Xacts): T1: BEGIN A=A+100, B=B-100 END T2: BEGIN A=1.06*A, B=1.06*B END • Intuitively, the first transaction is transferring $100 from B’s account to A’s account. The second is crediting both accounts with a 6% interest payment. • There is no guarantee that T1 will execute before T2 or vice-versa, if both are submitted together. However, the net effect must be equivalent to these two transactions running serially in some order.
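
The two acceptable outcomes can be computed directly. The sketch below runs T1 and T2 serially in both orders, assuming hypothetical starting balances of 1000 in each account and using Decimal to avoid floating-point rounding.

```python
from decimal import Decimal

RATE = Decimal("1.06")


def t1(a, b):
    """T1: transfer $100 from B's account to A's account."""
    return a + 100, b - 100


def t2(a, b):
    """T2: credit both accounts with 6% interest."""
    return RATE * a, RATE * b


a0, b0 = Decimal(1000), Decimal(1000)  # hypothetical starting balances

t1_then_t2 = t2(*t1(a0, b0))
t2_then_t1 = t1(*t2(a0, b0))
print(t1_then_t2, t2_then_t1)
```

The two serial orders give different final balances, and both are acceptable. In each case the total is 2120, which is the starting 2000 plus 6% interest.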

  15. Example (Contd.) • Consider a possible interleaving (schedule):
        T1: A=A+100,            B=B-100
        T2:           A=1.06*A,          B=1.06*B
      • This is OK. But what about:
        T1: A=A+100,                      B=B-100
        T2:           A=1.06*A, B=1.06*B
      • The DBMS’s view of the second schedule:
        T1: R(A), W(A),                         R(B), W(B)
        T2:             R(A), W(A), R(B), W(B)
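
The difference between the two schedules can be checked numerically. The sketch below assumes one reading of the slide: the first schedule alternates T1 and T2 (T1 on A, T2 on A, T1 on B, T2 on B), while the second runs both of T2's steps between T1's two steps. The starting balances of 1000 are hypothetical.

```python
from decimal import Decimal

RATE = Decimal("1.06")


def run(schedule, a=1000, b=1000):
    """Apply (transaction, item) steps in order and return the final (A, B)."""
    state = {"A": Decimal(a), "B": Decimal(b)}
    for txn, item in schedule:
        if txn == "T1":
            # T1 adds 100 to A and subtracts 100 from B.
            state[item] += 100 if item == "A" else -100
        else:
            # T2 pays 6% interest on whichever item it touches.
            state[item] *= RATE
    return state["A"], state["B"]


serial_t1_t2 = [("T1", "A"), ("T1", "B"), ("T2", "A"), ("T2", "B")]
serial_t2_t1 = [("T2", "A"), ("T2", "B"), ("T1", "A"), ("T1", "B")]
schedule_1 = [("T1", "A"), ("T2", "A"), ("T1", "B"), ("T2", "B")]
schedule_2 = [("T1", "A"), ("T2", "A"), ("T2", "B"), ("T1", "B")]

serial_results = {run(serial_t1_t2), run(serial_t2_t1)}
first_ok = run(schedule_1) in serial_results
second_ok = run(schedule_2) in serial_results
print(first_ok, second_ok, run(schedule_2))
```

Under these assumptions the first schedule matches the serial order T1 then T2, while the second matches neither serial order: interest is paid on the $100 while it is counted in both accounts, so the total ends up $6 too high.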

  16. Motivation of recovery management • Atomicity: • Transactions may abort (“Rollback”). • Durability: • What if DBMS stops running? (Causes?) • Desired Behavior after system restarts: • T1, T2 & T3 should be durable. • T4 & T5 should be aborted (effects not seen). [Figure: timeline of T1 to T5 with a crash point; T1, T2 and T3 finish before the crash, T4 and T5 are still running.]
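
The desired behavior after restart amounts to a simple classification: transactions with a commit record before the crash must be durable, and the rest must be aborted. The sketch below shows only that classification, not an actual recovery algorithm; the log format and the record order are hypothetical.

```python
def classify_after_crash(log):
    """Split transactions into (durable, aborted) based on commit records.

    log is a list of (kind, txn) records written before the crash,
    where kind is "BEGIN" or "COMMIT".
    """
    started, committed = [], set()
    for kind, txn in log:
        if kind == "BEGIN":
            started.append(txn)
        elif kind == "COMMIT":
            committed.add(txn)
    durable = [t for t in started if t in committed]
    aborted = [t for t in started if t not in committed]
    return durable, aborted


# Hypothetical log: T1, T2, T3 commit before the crash; T4 and T5 do not.
log = [
    ("BEGIN", "T1"), ("BEGIN", "T2"), ("COMMIT", "T1"),
    ("BEGIN", "T3"), ("BEGIN", "T4"), ("COMMIT", "T2"),
    ("COMMIT", "T3"), ("BEGIN", "T5"),
]
durable, aborted = classify_after_crash(log)
print(durable, aborted)
```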

  17. Handling the Buffer Pool • Force every write to disk? • Poor response time. • But provides durability. • Steal buffer-pool frames from uncommitted Xacts? • If not, poor throughput. • If so, how can we ensure atomicity?
                   No Steal    Steal
        Force      Trivial
        No Force               Desired

  18. Databases make these folks happy ... • End users and DBMS vendors • DB application programmers • E.g. smart webmasters • Database administrator (DBA) • Designs logical/physical schemas • Handles security and authorization • Data availability, crash recovery • Database tuning as needs evolve • Must understand how a DBMS works!

  19. Summary • DBMS used to maintain, query large datasets. • Benefits include recovery from system crashes, concurrent access, quick application development, data integrity and security. • Levels of abstraction give data independence. • A DBMS typically has a layered architecture. • DBAs hold responsible jobs and are well-paid! • DBMS R&D is one of the broadest, most exciting areas in CS.
