1 / 60

Distributed Database Design

Distributed Database Design. COSC 5040 Week One. Outline. Introduction Course overview Database systems concepts Relational database model Structured query language (SQL). Database System Concept. Data Known facts Database A collection of related data

Download Presentation

Distributed Database Design

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Distributed Database Design COSC 5040 Week One

  2. Outline • Introduction • Course overview • Database systems concepts • Relational database model • Structured query language (SQL) Distributed Database Design

  3. Database System Concept • Data • Known facts • Database • A collection of related data • Database Management System (DBMS) • A software system to facilitate the defining, constructing, manipulating, and sharing of a computerized database • Database System • The DBMS software together with the data itself • Sometimes, the applications are also included Distributed Database Design

  4. Typical DBMS Functionality • Define a database • Construct and load the database • Manipulating the database • Querying, generating reports, insertions, deletions and modifications • Concurrent processing and sharing • Protection or security • Presentation and visualization Distributed Database Design

  5. Database System Environment Distributed Database Design

  6. Example of a Database • Figure 1.2 A database that stores student and course information. Distributed Database Design

  7. Example of a Database Distributed Database Design

  8. Database Manipulation • Database manipulation involves querying and updating • P. 9 • Examples of queries • Examples of updates Distributed Database Design

  9. Database Approach Characteristics • Self-describing nature of a database system • Meta-data • Insulation between programs and data, data abstraction • Program-data independence • Support of multiple views of the data • Virtual data • Sharing of data and multi-user transaction processing • Concurrency control Distributed Database Design

  10. Database Users • Actors on the scene • Database administrators (DBA) • Authorizing access to the database • Acquiring software, and hardware resources • Controlling and monitoring efficiency of operations • Database designers • Define content, structure, constraints, and functions or transactions • Communicate with the end-users • End-users • Queries, reports • Update the database content • Actors behind the scene Distributed Database Design

  11. Database Users Distributed Database Design

  12. Advantages of Database Approach • Controlling redundancy • Restricting unauthorized access • Providing persistent storage • Providing storage structures for efficient query processing • Providing backup and recovery • Providing multiple interfaces • Representing complex relationships among data • Enforcing integrity constraints • Drawing inferences and actions Distributed Database Design

  13. Historical Development • Early database applications • Hierarchical model • Network model • Relational model based systems • Object-oriented applications: OODBs and ORDBMSs • Web and e-commerce applications • Database for new applications Distributed Database Design

  14. Data Models • Data model • Data abstraction • A collection of concepts that can be used to describe the structure of a database • Entities, attributes, relationships • Data types, constraints • Categories of data models • Conceptual (high-level, semantic) data models • Implementation (representational) data models • Physical (low-level, internal) data models Distributed Database Design

  15. Schemas and Instances • Database schema • Description of a database • Schema diagram • Diagrammatic display of a database schema • Database state • Actual data in the database at a particular moment in time • Current set of occurrences or instances Distributed Database Design

  16. Schema Diagram Distributed Database Design

  17. Three-Schema Architecture Distributed Database Design

  18. Data Independence • Logical data independence • The capacity to change the conceptual schema without having to change the external schemas and their application programs • Physical data independence • The capacity to change the internal schema without having to change the conceptual schema Distributed Database Design

  19. DBMS Languages • Structured query language (SQL) • Data definition language (DDL) • To specify database conceptual schema • Data manipulation language (DML) • To specify database retrievals and updates • DBMS Interfaces • Stand-alone query language interfaces • Programmer interfaces for embedding DML in programming languages Distributed Database Design

  20. Database System Utilities • To perform certain functions such as: • Loading data stored in files into a database • Data conversion tools • Backing up the database periodically • Reorganizing database file structures • Report generation utilities • Performance monitoring utilities • Sorting, user monitoring, data compression • Data dictionary Distributed Database Design

  21. Client-Server Architectures • Centralized architecture • Client-server architecture • Client • Provide appropriate interfaces and a client-version of the system to access and utilize the server resources • Server • Provides services to clients • Database server provides database query and transaction services to clients Distributed Database Design

  22. Three Tier Client-Server Architecture Distributed Database Design

  23. Classification of DBMS • Based on data model • Relational • Network • Hierarchical • Object-oriented • Object-relational • Other classifications • Single-user vs. multi-user • Centralized vs. distributed Distributed Database Design

  24. Relational Model Concepts • The relational model is based on the concept of a relation • A relation is a mathematical concept based on the ideas of sets • Relation: A table of values • Contains a set of rows and columns Distributed Database Design

  25. Example of a Relation Distributed Database Design

  26. Definitions • The schema, or description of a relation • R (A1, A2, .....An) • CUSTOMER (Cust-id, Cust-name, Address, Phone#) • A tuple is an ordered set of values • Each value is derived from an appropriate domain • A domain is a set of atomic values • Data type or format • An attribute designates the role played by the domain Distributed Database Design

  27. Definitions • The relation is formed over a subset of the Cartesian product of the sets • Each set has values from a domain • That domain is used in a specific role which is the attribute name • Given R(A1, A2, .........., An) • r(R)  dom (A1) X dom (A2) X ....X dom(An) • R: schema of the relation • r of R: a specific "value" or population of R Distributed Database Design

  28. Example • Let R(A1, A2) • Let dom(A1) = {0,1} • Let dom(A2) = {a,b,c} • Then, for example: • r(R) = {<0,a> , <0,b> , <1,c> } • is one possible “state” or “population” or “extension” r of the relation R, defined over domains D1 and D2 • It has three tuples Distributed Database Design

  29. Definition Comparison Distributed Database Design

  30. Characteristics of Relations • Ordering of tuples in a relation r(R) • The tuples are not considered to be ordered • Ordering of values within each tuple • The attributes in R(A1, A2, ..., An) and the values in t=<v1, v2, ..., vn> are ordered • Values in a tuple • All values are considered atomic (indivisible) • A special null value is used to represent values that are unknown or inapplicable to certain tuples Distributed Database Design

  31. Relational Integrity Constraints • Constraints are conditions that must hold on all valid relation instances • Types of constraints • Domain constraints • Key constraints • Entity integrity constraints • Referential integrity constraints Distributed Database Design

  32. Key Constraints • Uniqueness • A set of attributes of R such that no two tuples in any valid relation instance r(R) will have the same value • Minimal • Removal of any attribute results in a set of attributes that is not a key • If a relation has several candidate keys, one is chosen to be the primary key • The primary key value is used to uniquely identify each tuple in a relation Distributed Database Design

  33. Foreign Key • A set of attributes in one relation that references the primary key in another relation • Same domain(s) • Value of foreign key either occurs as a value of primary key or is null Distributed Database Design

  34. Entity and Referential Integrity • Entity integrity constraint • No primary key value can be null • Referential integrity constraint • Foreign key value can be either an existing primary key value or a null value Distributed Database Design

  35. Update Operations • Update operations • Insert a tuple (p. 76) • Delete a tuple (p. 77) • Update a tuple (p. 78) • Maintain integrity constraints • Child insert restrict • Child update restrict • Parent update restrict • Parent delete restrict Distributed Database Design

  36. Relational Database Schema Distributed Database Design

  37. Distributed Database Design

  38. Exercise 3.16 Consider the following relations for a database that keeps track of student enrollment in courses and the books adopted for each course: STUDENT(SSN, Name, Major, Bdate) COURSE(Course#, Cname, Dept) ENROLL(SSN, Course#, Quarter, Grade) BOOK_ADOPTION(Course#, Quarter, Book_ISBN) TEXT(Book_ISBN, Book_Title, Publisher, Author) Specify the foreign keys for this schema, stating any assumptions you make. Distributed Database Design

  39. SQL • Structured query language (SQL) • SQL-86 or SQL1 • SQL-92 or SQL2 • SQL-99 or SQL3 • Comprehensive database language • Data definition (DDL) • Data manipulation (DML) • Query • Update Distributed Database Design

  40. Data Definition Language (DDL) • Used to CREATE, DROP, and ALTER the descriptions of the tables (relations) of a database • Data types • Numeric • Character string • Boolean • Data/time Distributed Database Design

  41. CREATE TABLE • Specifies its name, its attributes and their data types • A constraint NOT NULL may be specifiedCREATE TABLE DEPARTMENT ( DNAME VARCHAR(10) NOT NULL, DNUMBER INTEGER NOT NULL, MGRSSN CHAR(9), MGRSTARTDATE CHAR(9)); Distributed Database Design

  42. CREATE TABLE • Use the CREATE TABLE command for specifying • Primary key attributes • Secondary keys, and • Referential integrity constraints (foreign keys) CREATE TABLE DEPT ( DNAME VARCHAR(10) NOT NULL, DNUMBER INTEGER NOT NULL, MGRSSN CHAR(9), MGRSTARTDATE CHAR(9), PRIMARY KEY (DNUMBER), UNIQUE (DNAME), FOREIGN KEY (MGRSSN) REFERENCES EMP ); Distributed Database Design

  43. DROP TABLE and ALTER TABLE • Remove a relation (base table) and its definition DROP TABLE DEPENDENT; • Add an attribute to one of the base relations ALTER TABLE EMPLOYEE ADD JOB VARCHAR(12); Distributed Database Design

  44. Retrieval Queries in SQL • One basic statement for retrieving information from a database • SELECT statement • Basic form is a SELECT-FROM-WHERE block SELECT <attribute list> FROM <table list> WHERE <condition> Distributed Database Design

  45. Simple SQL Queries • Query 0: • Retrieve the birthdate and address of the employee whose name is 'John B. Smith' SELECT BDATE, ADDRESS FROM EMPLOYEE WHERE FNAME='John' AND MINIT='B' AND LNAME='Smith'; Distributed Database Design

  46. Simple SQL Queries • Query 1: • Retrieve the name and address of all employees who work for the 'Research' department SELECT FNAME, LNAME, ADDRESS FROM EMPLOYEE, DEPARTMENT WHERE DNAME='Research' AND DNUMBER=DNO; • DNAME='Research' is a selection condition • DNUMBER=DNO is a join condition Distributed Database Design

  47. Simple SQL Queries • Query 2: • For every project located in 'Stafford', list the project number, the controlling department number, and the department manager's last name, address, and birthdate SELECT PNUMBER, DNUM, LNAME, ADDRESS, BDATE FROM PROJECT, DEPARTMENT, EMPLOYEEWHERE DNUM=DNUMBER AND MGRSSN=SSN AND PLOCATION='Stafford'; • There are two join conditions • DNUM=DNUMBER relates a project to its controlling department • MGRSSN=SSN relates the controlling department to the employee who manages that department Distributed Database Design

  48. Aliases • A query that refers to the same name must qualify the attribute name with the relation name • Some queries need to refer to the same relation twice • Query 8: • For each employee, retrieve the employee's name, and the name of his or her immediate supervisorSELECT E.FNAME, E.LNAME, S.FNAME, S.LNAMEFROM EMPLOYEE E, EMPLOYEE SWHERE E.SUPERSSN=S.SSN; Distributed Database Design

  49. Unspecified Where-Clause • Query 9: • Retrieve the SSN values for all employees SELECT SSNFROM EMPLOYEE; • Query 10: • Retrieve the SSN and department name values for all employees SELECT SSN, DNAMEFROM EMPLOYEE, DEPARTMENT; Resulting CARTESIAN PRODUCT Distributed Database Design

  50. Use of Asterisk * • Q1C: SELECT * FROM EMPLOYEE WHERE DNO=5; • Q1D: SELECT * FROM EMPLOYEE, DEPARTMENT WHERE DNAME='Research' AND DNO=DNUMBER; To retrieve all the attribute values Distributed Database Design

More Related