1 / 0

DataBase Systems

DataBase Systems. Lecture 21 Dr. Adam P. Anthony. Overview. Database Fundamentals System components Schemas Database Models The Relational Model Survey of Database Products Data Mining Social Issues. What is a Database?. Non-Technical: A computer program that stores data

abedi
Download Presentation

DataBase Systems

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. DataBase Systems

    Lecture 21 Dr. Adam P. Anthony
  2. Overview Database Fundamentals System components Schemas Database Models The Relational Model Survey of Database Products Data Mining Social Issues
  3. What is a Database? Non-Technical: A computer program that stores data Examples of Databases using this definition? Technical: A computer program that stores structured data in an organized fashion, and provides the ability to grant limited, or restricted access to different users based on their data requirements. Examples of Databases using this definition? Anything ruled out that we came up with above?
  4. Database Systems Initially: each department maintained their own “database” Sometimes just a text or Excel file Problems with duplication? Problems with sharing information? Now: One database holds everything, and each department is given access only to the data they need
  5. Schemas Schema  full description of all the database, including what information is provided and how it interacts Used to maintain and program the database Provides a nice overview of what data a business collects and (perhaps) how they use that data Subschema  Partial description of only a few relevant items Anything in a Subschema will also appear in the database’s Schema Based on providing only what the user needs to do their job Provides basic security: payroll record example
  6. The Database Management System Software applications can be enhanced by working with a database Especially on the web! (Google, Amazon, etc.) Database Management System (DBMS): has all the necessary code for organizing, manipulating data When application needs data, it just asks the DBMS to provide it
  7. More on Database Management Systems Database Management System (DBMS): A software layer that manipulates a database in response to requests from applications Schemas and subschemasare created, managed by the DBMS Can have a different subschema for each application that accesses the DBMS Distributed Database: A database stored on multiple machines Data is shared between each machine over a network DBMS will mask this organizational detail from its users Data independence: The ability to change the organization of a database without changing the application software that uses it
  8. Database Models Bits and bytes are hard to talk about! Want to reflect the organization of our data without thinking about the complex programming techniques needed to keep our data organized A Database Model provides a conceptual view of the database that gives us a ‘general idea’ of how the data is organized Flat File Model (not really a DB) Hierarchical Model Object-Oriented Model Relational Model
  9. The Relational Database Model Pretty Simple, on the surface A Relation is just a rectangular table of data An Attribute is any column in that table A Tuple is any row in the table There is complexity not in the table, but in how two different tables can interact If two relations have a common Attribute, then we can combine tuples from the two tables to get a more complex piece of data!
  10. A relation containing employee information
  11. Designing a Good Relational Database When we set up the DBMS, we could just put all our data in one relation, like a big Excel file! But that’s no fun! And, it can cause problems Redundancy: When the same data is repeated unnecessarily Deletion Anomalies: When deleting a tuple results in the loss of important information We can prevent some of these problems using a multi-relation design!
  12. A relation containing redundancy
  13. An employee database consisting of three relations
  14. The Employee Database Schema Employee Empl ID Name Address SSN Assignment Empl ID Job ID Start Date Term Date Job Job ID Job Title Skill Code Dept
  15. Combining Data From Multiple Tables In which departments has G. Jerry Smith worked?
  16. Designing a Relational Database Given a data problem: What are the entities/objects in this problem? How do those entities interact? Make one relational table for each entity Make one relational table for each interaction
  17. Identifying/Avoiding Redundancy Sometimes it’s obvious in the design Two columns with same/similar names Sometimes you need to focus on how the data will be used Common perpetrators: Addresses—more than one person can live at same address Organizations—more than one person belongs to a single organization Owners—a person/company can own more than one asset Creators—a creator may have many creations
  18. In-Class Example (from book) Chapter Review, question 18 (similar to assigned problem 20) Design a relational database containing information about music performers, their recordings, and the composers of the music they recorded (Avoid redundancies similar to those in Figure 9.4)
  19. SQL (Structured Query Language) Developed at IBM as a part of their Relational DBMS project Basic language humans can use to express, with precision, the data that they want from each (separate) relational table, and how it should be combined: select EmplId, Deptfrom ASSIGNMENT, JOBwhere ASSIGNMENT.JobId = JOB.JobId and ASSIGNMENT.TermData = “*”
  20. Survey of Database Products MySQL Microsoft Access Microsoft SQLServer Oracle DB
  21. MySQL Free, Open-Source database product Little support from creators, minimal software for managing the system Must know a great deal about database design fundamentals in order to use successfully on a large scale Can be done, but requires expert knowledge Good Choice for: temporary data solutions Zero-budget scenarios (but only if you have expert users to run it!) Hiring people to run the system costs money! Internet applications Facebook?? Poor Choice for: Non-Expert users MASSIVE data problems (but Facebook??)
  22. Microsoft Access Cheap, business-oriented database product Emphasis on user-friendliness, ease of development Packaged with ‘data templates’ that might automatically fit your data needs, no design required! very nice application-development features that are popular in the business world Limits on amount of data that can be stored in a single database Data may be as complex as on any other DBMS, just limited in size Good choice for: Small business settings Non-Expert Users Temporary data solutions Poor choice for: Large, enterprise-level data solutions Sharing data with a large number of people
  23. Microsoft SQLServer and Oracle DB Large, expensive enterprise-level DBMS’s Distributed DBMS support Provides both expert-level and non-technical tools for interaction with DBMS SQLServer more-so than Oracle Good For: Everything (if you can afford them!) Large jobs Bad for: Low-budget scenarios (except for SQLServer Express!)
  24. Data Mining Data Mining = Buzz word that covers any action that seeks to detect a previously unknown pattern from a large data set Encompasses many areas of Computer Science, Mathematics, Statistics Class Description: Divide data set into pre-determined groups (e.g., Mexican food lovers and Italian food lovers) Characterize each group by identifying common patterns “What type of person enjoys Mexican food?” Class Discrimination: Start with an initial data set, generate a class description for each group Given a new (incomplete) item, predict which group it will fall into: “Will this new customer prefer Mexican or Italian food?”
  25. Data Mining, Continued Cluster Analysis: Similar to Class Description but now the groups aren’t predetermined! Must find the most distinct groups AND describe them at the same time! Much more difficult Association Analysis Find patters between groups! Big in market research “Are Mexican food lovers also lovers of imported beers?”
  26. Social Impact of Databases Problems Massive amounts of personal data are being collected Often without knowledge or meaningful consent of affected people Data merging produces new, more invasive information Errors are widely disseminated and hard to correct Remedies Existing legal remedies often difficult to apply Negative publicity may be more effective
More Related