Chapter 2 - PowerPoint PPT Presentation

paul
chapter 2 l.
Skip this Video
Loading SlideShow in 5 Seconds..
Chapter 2 PowerPoint Presentation
play fullscreen
1 / 45
Download Presentation
Chapter 2
141 Views
Download Presentation

Chapter 2

- - - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript

  1. Chapter 2 Chapter 2 Introduction to Databases

  2. Objectives Introduce databases and database programs  Compare file processing system and integrated databases Discuss various approaches to developing a database Describe relational data model and entity-relationship model (ER) Describe primary keys, foreign keys and referential integrity 2

  3. Objectives • Define entity relationships •  Describe data normalization process •  Describe relational database implementation • Create Microsoft Access Sample Database • Create Microsoft SQL Server 2005 Sample Database • Create Oracle Sample Database 3

  4. Databases, Database Programs and File Processing Systems A database is a structured collection of data stored in a computer The computer programs used to create, manage, and query databases are known as Database Management Systems (DBMS) File processing systems are precursors of the integrated database approach File processing systems are advantageous when data is static and applications are simple; it eliminates the need for an expensive DBMS 4

  5. Integrated Databases A better alternative to a file processing system is an integrated database approach. All data belonging to an organization is stored in a single database in an integrated database. A user may interact either directly with the DBMS or via a program written in a programming language such as C++, Java or Visual Basic. Integration implies a logical relationship, usually provided through a common column in the tables. 5

  6. Advantages of Integrated Databases Users can access the data simultaneously, create reports, and manipulate the data. All related data exists in a single database, therefore there is a minimal need of data duplication. Reducing data redundancy will lead to data consistency. DBA takes into account the needs of various departments and balances it against the overall need of the organization. 6

  7. Advantages of Integrated Databases Security is achieved through various means such as controlling access to the database through passwords, providing various levels of authorizations, data encryption, and restricted views Integrated Databases provide data independence. Data independence implies that even if the physical structure of a database changes the applications are allowed to access the database as before the changes were implemented. 7

  8. Disadvantages of Integrated Databases Exorbitant costs; the hardware, the software, and maintenance are expensive. Providing security, concurrency, integrity, and recovery adds further to this cost. DBMS consists of complex set of programs; trained personnel are needed to maintain an integrated database. 8

  9. Database Development ProcessSystems Development Life Cycle Approach Problem Identification – identify user requirements Project Planning - Establish scope of the project. Problem Analysis - Specify detailed requirements Logical Design – Determine screen designs, report layout designs, data models etc. Physical Design - Develop physical data structures Implementation – Code programs; perform testing. 9

  10. Database Development ProcessThrough Phased Process Planning and Analysis– includes requirement specifications, evaluating alternatives, determining input, output, and reports Conceptual Design – develop a conceptual schema based on the requirement specification Logical Design – Define the tables (entities) and fields (attributes). Identify primary and foreign key for each table. Define relationships between the tables. Physical Design – Develop physical data structures; specify file organization, and data storage etc. Implementation - Implement the physical design. Perform testing. Modify if necessary 10

  11. CSE_DEPT Sample Database • A computer science department sample database, CSE_DEPT, is used as a running example in this book • It is made up of five tables – • LogIn(user_name, pass_word. Faculty_id, student_id) • Faculty(faculty_id, name, office, phone, college, title, email) • Course (course_id, course, credit, classroom, schedule, enrollment, faculty_id) • Student (student_id, name, gpa, credits, major, schoolYear, email) • StudentCourse(s_course_id, student_id, course_id, credit, major) 11

  12. Table 2-1 LogIn 12

  13. Table 2-2 Faculty 13

  14. Table 2-3 Course 14

  15. Table 2-4 Student 15

  16. Table 2-5 StudentCourse 16

  17. Records and Fields in a Table

  18. Relational Data Model A model describes the structure of the database and various data relationships and constraints on the data Relational model implies that a user perceives the database as made up of relations, a database jargon for tables The most commonly used tool is Entity-Relationship Model (ER) 18

  19. Entity-Relationship Model (ER) The ER model is based on the perception that the real world is made up of entities The ER model is graphically depicted as Entity-Relationship diagrams (ERD) The three major components of ERD are entities, relationships, and the attributes. 19

  20. Entities, Relationships, and Attributes. An entity is a data object, either real or abstract, about which we want to collect information A database is made up of related entities An entity in an ER diagram translates into a table Natural association between the entities is referred to as relationship Each entity has properties called attributes 20

  21. Primary Key and Entity Integrity An attribute that uniquely identifies one and only one instance of an entity is called a primary key When a primary key consists of a combination of attributes, it is referred to as a composite key Entity integrity rule states that no attribute that is a member of the primary (composite) key may accept a null value When more than one attribute uniquely identify an instance of an entity, they are referred to as candidate keys 21

  22. Foreign Keys and Referential Integrity • Foreign keys are used to create relationships between tables • The values of an attribute in one table are required to match the values of primary key in another table • Foreign keys are created to enforce referential integrity; it implies that • you may not add a record to a table containing a foreign key unless there is a corresponding record in the related table to which it is logically linked • every value of foreign key in a table must match the primary key of a related table or be null 22

  23. Relationships The entities can have one-to-one, one-to-many, and many-to-many relationships A one-to-one (1:1) relationship occurs when one instance of entity A is related to only one instance of entity B. A one-to-many (1:M) relationship occurs when one instance of entity A is associated with zero, one, or many instances of entity B. However, entity B is associated with only one instance of entity A A many-to-many (M:N) relationship occurs when one instance of entity A is associated with zero, one, or many instances of entity B. And one instance of entity B is associated with zero, one, or many instance of entity A. 23

  24. One to One Relationship

  25. One to Many Relationship

  26. Many-to-Many relationship between Student and Course tables

  27. Relationships in the CSE-DEPT Database 27

  28. ER Notation

  29. Data Normalization Optimum database structure is achieved by eliminating redundancies, inefficiencies, update and deletion anomalies Data normalization is a progressive process The steps in the normalization process are called normal forms Each normal form progressively improves the database and makes it more efficient 29

  30. Normal Forms • First normal form (1NF) implies • that the values in each column of a table are atomic, that is there are no repeating groups of data • Second normal form (2NF) implies • that the table is already in 1NF and every non-key column is fully dependent upon the primary key • Third normal (3NF) form implies • that the table is already in 2NF and every non-key column is non- transitively dependent upon the primary key. In other words all non-key columns are mutually independent, but at the same time they are fully dependent upon the primary key only 30

  31. Unnormalized Faculty Table with Repeating Groups

  32. Normalized Faculty Table (1NF)

  33. Decompose Faculty Table (1NF) to 2 Tables - Faculty and Office (2NF)

  34. Old Course Table in 2 NF(Convert to 3NF)

  35. Old Course Table(2NF) Decomposed Into Course and Instructor Tables (3 NF) Course Instructor

  36. File Server and Client Server Databases • File Server - data is stored in a file and each user retrieves the data, displays the data, or modifies the data directly from or to the file • Client Server - the data is also stored in a file, however, all operations are mediated through a master program called a server • MS Access is a File Server database • Microsoft SQL Server and Oracle are Client Server databases • Advantages of Client Server databases • minimize chances of crashes • provide features for recovery • enforce security • better performance, and more efficient use of the network 36

  37. Microsoft Access Databases A Jet engine (Joint Engine Technology) allows the manipulation of MS Access databases The supporting software provides security, integrity, indexing, record locking etc Access database is made up of a number of components called objects which are stored in a single file referred to as database file A database container is created before creating a table; it is saved with the extension .mdb The data may be retrieved, updated or manipulated, without having to write complex programs, by writing SQL statements 37

  38. SQL Server Databases • The Microsoft SQL Server Database Engine provides a service for storing and processing data in either a relational (tabular) format or as XML documents • Various tasks performed by the Database Engine include: • Designing and creating a database to hold the relational tables or XML documents. • Accessing and modifying the data stored in the database. • Implementing Web sites and applications • Building procedures • Optimizing the performance of the database 38

  39. SQL Server Databases The SQL Server database is a complex entity, made up of multiple components; It is more complex than MS Access database SQL Server database can be stored on the disk using three types of files – primary data files, secondary data files, and transaction log files The database files have an extensions of .mdf The transaction log files carry .ldf extension. 39

  40. Components of SQL Server Databases Data Files - include tables, keys, views, stored procedures etc. Tables - the first objects created in a database Views - Views are virtual tables; they do not contain any data; they are stored as queries in the database and are executed when needed Stored Procedures – queries stored on the client machine or on the server; users may use same queries over and over again Keys and Relationships – primary keys ensure entity integrity; Foreign keys enforce referential integrity Indexes - used to find records, quickly and efficiently, in a table Transaction Log Files - keep track of transactions that are applied to the database 40

  41. Oracle Databases Architecturally more complex and contains more files than SQL Server database The Oracle DBMS comes in three levels: Enterprise, Standard, and Personal Oracle architecture is made up of several components including an Oracle server, Oracle instance and an Oracle database The Oracle server contains several files, processes, and memory structures An Oracle instance consists of background processes and memory structures; background processes perform input/output, monitor Oracle processes for better performance and reliability 41

  42. Components of Oracle Databases Data files -provide the actual physical storage for the data Tables – the data may be stored in a regular table, partitioned table, index-organized table, or clustered table Views - Views are like virtual tables and are used in a similar fashion as in the SQL Server databases Stored Procedures - functions & procedures may be saved as stored program units 42

  43. Components of Oracle Databases Indexes – may be partitioned or nonpartitioned; may consist of a single column or multiple columns; may be unique or non-unique Initialization Parameter files - contains a list of instance parameters, the name of the database the instance is associated with, name and location of control files and information about the undo segments Control Files - are read to determine if the database is in a valid state or not Password files - Oracle provides a password utility to create password file; used for authentication 43

  44. Chapter Summary A detailed discussion and analysis of the structure and components about databases are provided in this chapter. Some key technologies in developing and designing database are also given and discussed in this part. The procedure and components to develop a relational database are analyzed in detailed with some real data tables in our sample database CSE_DEPT. The process in developing and building a sample database is discussed in detailed with the following points: 44

  45. Chapter Summary - 2 Defining Relationships Normalizing the Data Implementing the Relational Database In the second part of this chapter, three sample databases that are developed with three popular database management systems such as Microsoft Access, SQL Server 2005 and Oracle Database 10g XE are provided in detail. All of these three sample databases will be used in the following chapters through out the whole book. 45