1 / 27

Comp 3311 Database Management Systems

Comp 3311 Database Management Systems. Introduction Acknowledgement: Slides modified by Dr. Lei Chen using previous versions created by Prof Dimitris Papadias, Prof. Dik Lee and Dr. Wilfred Ng. Course info. L1: Schedule: Monday 1:30pm-2:50pm, Friday 9:00am-10:20am, Room 1103

rdowning
Download Presentation

Comp 3311 Database Management Systems

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Comp 3311 Database Management Systems Introduction Acknowledgement: Slides modified by Dr. Lei Chen using previous versions created by Prof Dimitris Papadias, Prof. Dik Lee and Dr. Wilfred Ng. COMP3311 Fall 2017 CSE, HKUST Slide 1

  2. Course info • L1: Schedule: Monday 1:30pm-2:50pm, Friday 9:00am-10:20am, Room 1103 • L2: Schedule: Wednesday, Friday 4:30pm-5:50pm, Room 2504 Instructor: Lei Chen • WWW page: https://www.cse.ust.hk/~leichen/comp3311/index.html • Midterm exam: Wed Sept 29thth(in class exam). • Exams will be with open books and notes. • Textbook Database System Concepts, A. Silberschatz, H. Korth, and S. Sudarshan. • Reference Database Management Systems, Raghu Ramakrishnan and Johannes Gehrke. • Grading Policy: 40% final, 26% midterm, 24% assignment +Bonus (in class question answers) • The first week of classes (Sept 4th) there will not be any tutorials or labs. COMP3311 Fall 2017 CSE, HKUST Slide 2

  3. Course Outline • E/R Model • Relational Model, Algebra • SQL • Functional Dependencies and Relational Database Design • Storage and File Systems • Tree and Hash Indexes • Query Processing and Implementation of Relational Operators • Query Optimization • Physical Database Design • Transactions • Concurrency Control Protocols • Database Recovery • Advanced Topics COMP3311 Fall 2017 CSE, HKUST Slide 3

  4. What is a Database Management System (DBMS) • Collection of interrelated data + Set of programs to access the data • DBMS contains information about a particular enterprise • DBMS provides an environment that is both convenient and efficient to use. • Database Applications: • Banking: all transactions • Airlines: reservations, schedules • Universities: registration, grades • Sales: customers, products, purchases • Manufacturing: production, inventory, orders, supply chain • Human resources: employee records, salaries, tax deductions • Databases touch all aspects of our lives COMP3311 Fall 2017 CSE, HKUST Slide 4

  5. Commercial Database Systems Company Product Remarks OracleOracle 8i, 9i, etc. World’s 2nd largest software company CEO, Larry Ellison, world’s 2nd richest IBMDB2, Universal Server World’s 2nd largest after Informix acquisition MicrosoftAccess, SQL Server Access comes with MS Office SybaseAdaptive Server CEO John Chen, grown up in HK, bought by SAP, 2010 InformixDynamic Server Acquired by IBM in 2001 COMP3311 Fall 2017 CSE, HKUST Slide 5

  6. DBMS vs File Systems • In the early days, database applications were built on top of file systems • Drawbacks of using file systems to store data: • Data redundancy and inconsistency • Multiple file formats, duplication of information in different files • Difficulty in accessing data • Need to write a new program to carry out each new task • Integrity problems • Integrity constraints (e.g. account balance > 0) become part of program code • Hard to add new constraints or change existing ones COMP3311 Fall 2017 CSE, HKUST Slide 6

  7. DBMS vs File Systems (cont) • Drawbacks of using file systems (cont.) • Atomicity of updates • Failures may leave database in an inconsistent state with partial updates carried out • E.g. transfer of funds from one account to another should either complete or not happen at all • Concurrent access by multiple users • Concurrent accesses needed for performance • Uncontrolled concurrent accesses can lead to inconsistencies • E.g. two people reading a balance and updating it at the same time • Security problems • DBMS offer automated solutions to all the above problems; they solve problems caused by different people writing different applications independently. COMP3311 Fall 2017 CSE, HKUST Slide 7

  8. Data Independence • One big problem in application development is the separation of applications from data • Do I have change my program when I … • replace my hard drive? • store the data in a b-tree instead of a hash file? • partition the data into two physical files (or merge two physical files into one)? • store salary as floating point number instead of integer? • develop other applications that use the same set of data? • add more data fields to support other applications? • Solution: introduce levels of abstraction. COMP3311 Fall 2017 CSE, HKUST Slide 8

  9. ARR CSE Dept Financial Office HKUST database Files on disks Three Levels of Abstraction view 1 view 2 view n ..……... Logical view Physical view COMP3311 Fall 2017 CSE, HKUST Slide 9

  10. Three Levels of Abstraction (cont.) • Physical level: describe how a record is stored on disks. • e.g., “Divide the customer records into 3 partitions and store them on disks 1, 2 and 3.” • Logical level: describes data stored in database, and the relationships among the data. Similar to defining a record type in Pascal or C:Type customer = record name: string; street: string; city: integer; end; • View level: Define a subset of the database for a particular application. Views can also hide information (e.g. salary) for security purposes. COMP3311 Fall 2017 CSE, HKUST Slide 10

  11. Instances and Schemas • Each level is defined by a schema, which describes the data at the corresponding level • A logical schema defines the logical structure of the database (e.g., set of customers and accounts and the relationship between them) • A physical schema defines the file formats and locations • A databaseinstance refers to the actual content of the database at a particular point in time. A database instance must conform to the corresponding schema COMP3311 Fall 2017 CSE, HKUST Slide 11

  12. Data Independence • Ability to modify a schema definition in one level without affecting a schema definition in the next higher level. • The interfaces between the various levels and components should be well defined so that changes in some parts do not seriously influence others. • Two levels of data independence: - Physical data independence (users are shielded from changes in the physical structure of the data) - Logical data independence (users are shielded from changes in the logical structure of the data) COMP3311 Fall 2017 CSE, HKUST Slide 12

  13. Application view View definitions logical Logical schema physical Physical schema Logical Data Independence Physical Data Independence COMP3311 Fall 2017 CSE, HKUST Slide 13

  14. Data on disk Program accessing data directly has to know: • first 4 bytes is an ID number • next 10 bytes is an employee name program 1129 1129 John Law John Law … … … … Schema Data on disk Student: ID: integer Name char(10) DBMS program An Example of Data Independence COMP3311 Fall 2017 CSE, HKUST Slide 14

  15. Data Models • A collection of tools for describing: • data • data relationships • data semantics • data constraints COMP3311 Fall 2017 CSE, HKUST Slide 15

  16. Entity-Relationship Model • Example of entity-relationship model social-security customer-street balance account-number customer-city customer-name DEPOSITOR CUSTOMER ACCOUNT COMP3311 Fall 2017 CSE, HKUST Slide 16

  17. customer- social- customer- customer- account- name security street city number Johnson 192-83-7465 Alma Palo Alto A-101 Smith 019-28-3746 North Rye A-215 Johnson 192-83-7465 Alma Palo Alto A-201 Jones 321-12-3123 Main Harrison A-217 Smith 019-28-3746 North Rye A-201 account-number balance A-101 500 A-201 900 A-215 700 A-217 750 Relational Model Example of tabular data in the relational model: COMP3311 Fall 2017 CSE, HKUST Slide 17

  18. Data Definition Language (DDL) • Specification notation for defining the database schema • Express what were in the previous two slides to the DBMS in a formal language • Data storage and definition language - special type of DDL in which the storage structure and access methods used by the database system are specified COMP3311 Fall 2017 CSE, HKUST Slide 18

  19. Data Manipulation Language (DML) • Language for accessing and manipulation the data organized by the appropriate data model • Two types of formal languages • Algebra (Procedural) - user specifies what data is required and how to get those data. • Calculus (Nonprocedural) - user specifies what data is required without specifying how to get those data • Commercial languages • SQL COMP3311 Fall 2017 CSE, HKUST Slide 19

  20. SQL • Most common language – used in all commercial DBMS • In addition to DML, also DDL and more. • Example SELECT Name FROM Students WHERE Dept = CSE COMP3311 Fall 2017 CSE, HKUST Slide 20

  21. Transaction Management • A transaction is a collection of operations that performs a single logical function in the database Example: ATM withdrawal Read account record Modify balance Write back modified record Give money to customer • Transaction-management component ensures that the database remains in a consistent (correct) state despite system failures (e.g. power failures and operating system crashes) and transaction failures. COMP3311 Fall 2017 CSE, HKUST Slide 21

  22. Transaction 1 Transaction 2 Conflicting read/write Concurrency-control Management • Concurrency-control manager controls the interaction among the concurrent transactions, to ensure the consistency of the database. COMP3311 Fall 2017 CSE, HKUST Slide 22

  23. Storage/Buffer Management • The storage manager is a module that provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system. • The buffer manager is responsible for fetching data from disk storage into main memory and deciding what data to cache in main memory. COMP3311 Fall 2017 CSE, HKUST Slide 23

  24. Database Administrator (DBA) • Coordinates all the activities of the database system; the database administrator has good understanding of the enterprise’s information resources and needs. • Database administrator’s duties include: • Schema definition • Specifying integrity constraints • Storage structure and access method definition • Schema and physical organization modification • Granting user authority to access the database • Acting as liaison with users • Monitoring performance and responding to changes in requirements Primary job of a database designer More system oriented COMP3311 Fall 2017 CSE, HKUST Slide 24

  25. Database Users • Users are differentiated by the way they expected to interact with the system • End users • invoke one of the existing application programs (e.g., print monthly sales report) • Interact with applications through GUI • Application programmers • Develop applications that interact with DBMS through DML calls • Sophisticated users • form requests in a database query language • mostly one-time ad hoc queries COMP3311 Fall 2017 CSE, HKUST Slide 25

  26. Overall System Architecture COMP3311 Fall 2017 CSE, HKUST Slide 26

  27. Application Architectures • Two-tier architecture: E.g. client programs using ODBC/JDBC to communicate with a database • Three-tier architecture: E.g. web-based applications, and applications built using “middleware” COMP3311 Fall 2017 CSE, HKUST Slide 27

More Related