1 / 55

CSC443 Database Management

CSC443 Database Management. Course Introduction Professor Pepper adapted from presentations given by Professor Juliana Freire & Karl Aberer & Yan Chen & Silberschatz, Korth and Sudarshan. Today’s Goals. Course Overview Why study databases? Why use databases? Intro to Databases.

medina
Download Presentation

CSC443 Database Management

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CSC443 Database Management Course Introduction Professor Pepper adapted from presentations given by Professor Juliana Freire & Karl Aberer & Yan Chen & Silberschatz, Korth and Sudarshan

  2. Today’s Goals Course Overview • Why study databases? • Why use databases? • Intro to Databases

  3. Major Course Objectives • Design and diagram relational databases • Create Access and Oracle databases • Use SQL commands • Be able to design a good relational database • Know how to get information out of a database to answer any question

  4. Diagramming • Use Case • Class Diagram • Entity Relationship Diagram • Algebraic Relation Model

  5. Tools • Panther • Unix • Oracle 9.2.0.1.0 • FTP Explorer – register for trial • MS Access

  6. Books • Database System Concepts 5th Ed • Theory • Cross Reference for fourth ed • Oracle 9i Programming - A Primer • Practical examples • See course syllabus • Available in Library

  7. Learning Resources • Blackboard:my.adelphi.edu • Web site Database System Concepts: • www.db-book.com/ • My office hours: • Tuesday & Thursday 12:15-1:30; Wed 12-12:30 • Alumni 114 or Science Lab • My email: pepper@adelphi.edu • My phone: 516-747-2362 • My Web:www.adelphi.edu/~pepperk

  8. Adelphi Account Setup • Panther • Oracle • Blackboard • E-mail • Signin Sheet

  9. Projects / Grading • Projects: 40% • Access – 15 • Oracle - 25 • Homework assignments: 20% • Midterm: 20% • Final: 20%.

  10. Assignments • 2% dropped for anything 1 day late. • 10% dropped for anything 2 weeks late.

  11. Delivering assignments • Email • ftp • drop box • discussion board • mailbox in math department • E-mail me if making a change in delivery place. • forward your email from Adelphi

  12. What is a Database Management System? Database Management System = DBMS • A collection of files that store the data • A big program written by someone else that accesses and updates those files for you Relational DBMS = RDBMS • Data files are structured as relations (tables)

  13. Why Study Databases?

  14. What is behind this Web Site? • http://www.ticketmaster.com/ • Search on a large database • Specify search conditions • Many users • Updates • Access through a web interface Central to Modern Computer Science

  15. Database Systems: Then

  16. Database Systems: Today Field is developing quickly From Friendster.com on-line tour

  17. Other databases you may use Databases are EVERYWHERE

  18. Current Commercial Outlook • A major part of the software industry: • Oracle, IBM, Microsoft, Sybase • also Informix (now IBM), Teradata • smaller players: java-based dbms, devices, OO, … • Well-known benchmarks (esp. TPC) • Lots of related industries • data warehouse, document management, storage, backup, reporting, business intelligence, app integration • Relational products dominant and evolving • adapting for extensibility (user-defined types), adding native XML support. • Open Source coming on strong • MySQL, PostgreSQL, BerkeleyDB

  19. ? Why Study Databases?? • Need exploded • Corporate: retail swipe/clickstreams, “customer relationship mgmt”, “supply chain mgmt”, “data warehouses”, etc. • Scientific: digital libraries, Human Genome project, NASA Mission to Planet Earth, physical sensors, grid physics network

  20. Why study databases? • Data is valuable: • bank account records, tax records, student records… • Protect It! - no matter what • Hurricane • Flood • Human error

  21. Why study databases?Data often structured: • Example: Bank account records all follow the same structure • We can exploit this regular structure • To retrieve data in useful ways (that is, we can use a query language) • To store data efficiently

  22. Why Study Databases Summary • Central to modern computer science • Databases are everywhere • Commercially successful • Fast moving technology • Plethora of structured data that business and people need

  23. What is a database? • Whiteboard Exercise

  24. Database Definition • Database • – a very large, integrated collection of data. (the stuff) • Models a real-world enterprise • Entities (e.g., teams, games) • Relationships (e.g., The Forty-Niners are playing in The Superbowl) • Database Management System • – software that stores and manages databases (the tools)

  25. Database is better than simple file system because: • Data redundancy, inconsistency and isolation • Difficult to access • Integrity problems • Atomicity of updates (change one file and die before the other completes) • Multiple user issues

  26. So a Database Has: • representing information • data modeling • languages and systems for querying data • complex queries with real semantics* • over massive data sets • concurrency control for data manipulation • controlling concurrent access • ensuring transactional semantics • reliable data storage • maintain data semantics even if you pull the plug • * semantics: the meaning or relationship of meanings of a sign or set of signs

  27. Why Use a Database • Why use a database presentation

  28. What is in a database?

  29. Describing Data: Data Models • A data modelis a collection of concepts for describing data. • Aschemais a description of a particular collection of data, using a given data model. • A relation is the data stored in a certain schema • The relational model of datais the most widely used model today. • Entities and relations among them • Integrity constraints and business rules • Perspective dependent (warehouse & sales view item differently)

  30. Database Design The process of designing the general structure of the database: • Logical Design – Deciding on the database schema. • Business decision – What attributes • Computer Science decision – What relation schemas • Physical Design – Deciding on the physical layout of the database

  31. Data Models • A collection of tools for describing • Data • Data relationships • Data semantics • Data constraints • Relational model • Entity-Relationship data model (mainly for database design) • Object-based data models (Object-oriented and Object-relational) • Semistructured data model (XML) • Other older models: • Network model • Hierarchical model

  32. The Entity-Relationship Model • Models an enterprise as a collection of entities and relationships • Entity: a “thing” or “object” in the enterprise that is distinguishable from other objects • Described by a set of attributes • Relationship: an association among several entities • Represented diagrammatically by an entity-relationship diagram:

  33. Relational Model • ER for concept  map to Algebraic Relational Model • Relations (tables of possible data) • Instance (actual data at a given time) • Schema (description of those tables, their relations)

  34. Relational Model Terminology

  35. Relational Model Look • Notation: p(r) • p is called the selection predicate • Defined as:p(r) = {t | t  rand p(t)} Where p is a formula in propositional calculus consisting of termsconnected by :  (and),  (or),  (not)Each term is one of: <attribute> op <attribute> or <constant> where op is one of: =, , >, . <.  • Example of selection:branch_name=“Perryridge”(account)

  36. Object-Relational Data Models • Extend the relational data model by including object orientation and constructs to deal with added data types. • Allow attributes of tuples to have complex types, including non-atomic values such as nested relations. • Preserve relational foundations, in particular the declarative access to data, while extending modeling power. • Provide upward compatibility with existing relational languages.

  37. Design Goals • Design Goals: • Avoid redundant data • Ensure that relationships among attributes represented • Ensure constraints are properly modeled: updates • check for violation of database integrity constraints.

  38. Bad Design

  39. Queries • What the programmer sees

  40. Some Basic SQL Commands • Select – Get rows of data • * - everything • From – the name of the table (relation) will follow • Where – Only get the stuff that matches • Example: Select * from movies where theater = Loews • Exercise – • Write down the query to select all of your friends that live in NY State

  41. View 1 View 2 View 3 Conceptual Schema Physical Schema DB Example: University Database • Conceptual schema: • Students(sid: string, name: string, login: string, age: integer, gpa:real) • Courses(cid: string, cname:string, credits:integer) • Enrolled(sid:string, cid:string, grade:string) • External Schema (View): • Course_info(cid:string,enrollment: integer) • Physical schema: • Relations stored as unordered files. • Index on first column of Students. • Key to good performance

  42. View 1 View 2 View 3 Conceptual Schema Physical Schema DB Data Independence (levels of abstraction) • Applications insulated from how data is structured and stored. • Logical data independence: Protection from changes in logical structure of data – stablize views. • Physical data independence: Protection from changes in physical structure of data. • Q: Why are these particularly important for DBMS?

  43. Queries • Change and get data from a database • Run over data model • Easy & efficient • Not good for complex calculations • DML and DDL

  44. Data Manipulation Language (DML) • Language for accessing and manipulating the data organized by the appropriate data model • DML also known as query language • Two classes of languages • Procedural – user specifies what data is required and how to get those data • Declarative (nonprocedural) – user specifies what data is required without specifying how to get those data • SQL is the most widely used query language

  45. Data Definition Language (DDL) • Specification notation for defining the database schema Example: create tableaccount (account-numberchar(10),balanceinteger) • DDL compiler generates a set of tables stored in a data dictionary • Data dictionary contains metadata (i.e., data about data) • Database schema • Data storage and definition language • Specifies the storage structure and access methods used • Integrity constraints • Domain constraints • Referential integrity (references constraint in SQL) • Assertions • Authorization

  46. Count distinct  Join Proj Join Emp Asgn  Having Group(agg) Select  Emp Emp Queries - What does it look like? SELECT COUNT DISTINCT (E.eid) FROM Emp E, Proj P, Asgn A WHERE E.eid = A.eid AND P.pid = A.pid AND E.loc <> P.loc SELECT eid, ename, title FROM Emp E WHERE E.sal > $50K SELECT E.loc, AVG(E.sal) FROM Emp E GROUP BY E.loc HAVING Count(*) > 5 • System handles query plan generation & optimization; ensures correct execution. Employees Projects Assignments • Issues: view reconciliation, operator ordering, physical operator choice, memory management, access path (index) use, …

  47. SQL • SQL: widely used non-procedural language • Example: Find the name of the customer with customer-id 192-83-7465select customer.customer_namefrom customerwherecustomer.customer_id = ‘192-83-7465’ • Example: Find the balances of all accounts held by the customer with customer-id 192-83-7465selectaccount.balancefromdepositor, accountwheredepositor.customer_id = ‘192-83-7465’ anddepositor.account_number = account.account_number • Application programs generally access databases through one of • Language extensions to allow embedded SQL • Application program interface (e.g., ODBC/JDBC) which allow SQL queries to be sent to a database • For us: Oracle and Access SQL languages

  48. A Look underneath

  49. Concurrency Control • Concurrent execution of user programs: key to good DBMS performance. • Disk accesses frequent, pretty slow • Keep the CPU working on several programs concurrently. • Interleaving actions of different programs: trouble! • e.g., account-transfer & print statement at same time • DBMS ensures such problems don’t arise. • Users/programmers can pretend they are using a single-user system. (called “Isolation”) • Thank goodness! Don’t have to program “very, very carefully”.

More Related