1 / 39

Chapter 1: Introduction

Chapter 1: Introduction. What is a database? shared file containing integrated data with controlled redundancy Often implemented as a group of related tables (examples on pages 4-5) Relationships between tables often implemented as other tables. Making Distinctions. Database :

didina
Download Presentation

Chapter 1: Introduction

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Chapter 1: Introduction • What is a database? • shared file containing integrated data with controlled redundancy • Often implemented as a group of related tables (examples on pages 4-5) • Relationships between tables often implemented as other tables.

  2. Making Distinctions • Database: • A collection of tables as in the previous slide • Database management system (DBMS): • Database plus tools to process requests, enforce integrity constraints, provide security, analyze usage, optimize access, etc. • Database application. • Software that accesses data from a database.

  3. Definition: • Relation: • 2-dimensional table having the following properties • Table entries are single valued or atomic. • This means that an entry does not consist of a structure more complex than strings, dates, etc. • For example, a column type cannot be another table or relation.

  4. entries in any one column are of the same type • Each column, also called attribute, has a name • order of columns & rows is unimportant • no two rows are identical

  5. NOTE: The purist may disagree with this definition as he or she will see a table as an implementation of a relation. • A relation is more of an abstract concept but is usually implemented using a table. • Some do use the terms interchangeably.

  6. Benefits • reduce redundancy and inconsistency. • e.g. most student information from the tables on page 5 is not replicated. • One fact is in one place. • Data is shared • Security is applied centrally

  7. languageindependent (COBOL, VB, C++, java, C#, etc) • Multiple applications • Easier to maintain integrity • data independence • allow data access without knowledge of its internal organization and structure

  8. Design: How many tables? • This is an important decision, and based on a set of rules known as normalization, which we will cover later. • However, the figure on page 18 illustrates a simple example related to an important issue.

  9. SQL: Structured Query Language • Used to extract information from one or more tables. • Can specify what you want without specifying how to get it. • Example on p. 9 • Can be standalone or embedded in application software.

  10. General format Select stuff From one or more tables Where conditions • Ranges from nearly trivial to fairly complex logic

  11. Some Definitions • metadata (Sysfiles) - also data dictionary. • Description of all tables in the database. • Example on p. 12. • More of a concern for a Database Administrator (DBA). • Know what it is - but we will not focus on it.

  12. Client/Server Environment • Database is stored on a server • Application software often runs on a client using languages such as C# and others. • Typically written by application programmers.

  13. Stored Procedure. • Procedures stored on a server. • Typically written by DB people. • Can be invoked by client applications. • Can be used for common activities used by multiple users. • Some DBAs may limit database access except through certain stored procedures. • It gives them more control.

  14. Trigger. • A special type of procedure that is invoked automatically when a certain action occurs. • Can be used to make sure needed data elements are updated due to user actions. • Example: A student adds a class and a trigger is activated to update tuition & fees. • The app that adds the student does not know about the trigger but the DBMS knows.

  15. Building a database: • Some terms: • Entity-Relationship (E-R) Diagrams. • Entitiesare somewhat like the classes you’ve designed in previous courses. • Relationships define how entities are related to one another. • Together they must reflect the reality as it is understood.

  16. Design phase: • Design entities, relationships, and constraints consistent with perceived reality. • Test phase: • Create tables, stored procedures, triggers, forms, reports, etc. consistent with the E-R diagram and test. • Implementation phase: • Put into production

  17. Early database models: • Hierarchical model • IMS (Information Management System) • Developed largely by IBM • Required all data be organized as a hierarchy • Tended to be awkward since not all realities are hierarchical in nature

  18. network model (CODASYL-Conference on Data Systems Languages) • Data organized into complex graph (network) data structures. • Application programs reflected the actual data structures. • Changes in design potentially affect ALL applications – costly!!

  19. Relational Model (dominant form today) • Object Model (not a commercial success). • See table on p. 21 for a general history – also the prose on p. 23. • We will NOT cover web-based databases/services since we already have a course for that.

  20. Relational Model • Edgar Codd’s landmark paper in CACM (Communications of the Association of Computing Machinery) A relational Model of data for large shared databases in 1970. • Codd, a mathematician working for IBM in San Jose CA, applied concepts of relational algebra to the problem of a “stored data bank”. • Paved the way for the development of the relational database.

  21. Mapping objects to relational databases. • E.g. how does an object oriented program access non-object-oriented data in a relational database? • http://www.agiledata.org/essays/mappingObjects.html • We will see how this works when we discuss ADO.NET

  22. Data Structures for databases: Appendix D • We will not focus on complex data structures but there are a few things you must be aware of. • This appendix is online in a zipped file at http://wps.prenhall.com/bp_kroenke_database_11/127/32761/8386898.cw/index.html

  23. Disk • Contains concentric magnetic tracks on each surface. • Each track is divided into sectors.

  24. Disk head moves radially inward and outward while the disk rotates. • Disks are SLOW and a potential bottleneck • Need to minimize disk head movement (seek time) for optimal performance. • Rotational delays (time for sector to rotate past the head) also a factor

  25. File Organizations • Linked List: • Database records, disk sectors, or clusters of sectors are maintained in a linear linked list. • Simple but can be very slow, especially for finding a record based on a key or index value. • i.e. find an employee record given the employee’s ID. • See pages D-3 and D-4 of the appendix.

  26. Indexes: • list of field values that identify records along with the location of that record. • List can be linear or some other structure. • e.g. a textbook often has a linear index at the end • Searching the index is a lot faster than searching through all of the content. • However, for many millions of records a linear index can still be inefficient.

  27. B-tree • hierarchical arrangement of index values. • Provides quick access & order to data. • See pages D-5 and D-6 of the appendix. • Typically each level would correspond to a sector or cluster. • Might have millions or records accessible via only a few index layers. • [http://technet.microsoft.com/en-us/library/cc917672.aspx

  28. Hash function • Index value (sometimes called a key) fed into a hash function which specifies where to store the entire record. • To locate a record, given its key, apply the hash function to the key value and the location is calculated.

  29. R is a record R.k is the value of R’s key field H is a hash function that calculates a location Hash table (database records) H (R.k) R is stored here

  30. Time to find a record can be independent of the number of records. • Assumes a good hash function and sufficient space. • Each are nontrivial and the subject of a course in data structures or algorithms.

  31. Indexes • Dense • 1 entry for each record • Useful if records are stored in random order • Non-dense • 1 index for a group of records (say 1 on a page) • Useful if records are maintained in order

  32. 3 Levels (views) of a database. • Internal • physical storage • conceptual, sometimes the DBA view • a collection of Base Tables • Base table is a table with a direct underlying storage structure. • Created from the E-R diagram • described by a data dictionary or metadata.

  33. External, sometimes user view • Collection of tables defined for a particular user using SQL. • They are called logical tables, virtual tables, or derived tables, or just view • The data in a view is presented to the user as a single table though it is actually derived from one or more base tables specified in its definition. • These logical tables do not exist in the same sense as a base table – there is NO direct underlying storage structure.

  34. A view can simplify the user’s view of the database and provide security by hiding certain parts of a base table. • Examples: • Table consisting of student with a given major or GPA value) • the single table in Fig 1-20 (page 18) could be a view derived by joining the two other base tables in the same figure.

  35. user DBA user user

  36. DBA (Database Administrator) • defines conceptual schema, internal schema, user liaison, security and integrity, backup/recovery, performance

  37. Microsoft SQL Server 2008 • Accessing SQL Server: • StartAll Programs Microsoft SQL Server 2008 R2SQL Server Management Studio. • You may see a window indicating the MS SQL Server Management Studio is configuring for first time use. • Just wait and be patient.

  38. In the Connect to Server window, select • Database Engine for the Server Type • ICSD for the server name • Windows Authentication for Authentication (These should all be defaults). • Then press the connect button. • In the Object Explorer pane (left side of screen), expand the Databasesfolder. • If you don’t see an Object Explorer Pane, select it from the View menu.

  39. There are four databases that start with “CS451” you should have read-only access to each one. I will use these during the semester. • To see the tables in one of them, expand the database folder and the subsequent Tables folder that appear. Right click on one of the table names and select Select Top 1000 Rows. • Test this and let me know of any access problems.

More Related