1 / 46

CS 4 40 : Database Management Systems

CS 4 40 : Database Management Systems. 1: Introduction. Welcome to CS440 !. Arash Termehchy Assistant professor in the school of EECS Just moved here from Illinois Usable data exploration systems. Your turn: Name, field, DB background. Data management.

cianna
Download Presentation

CS 4 40 : Database Management Systems

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CS 440: Database Management Systems 1: Introduction

  2. Welcome to CS440! • Arash Termehchy • Assistant professor in the school of EECS • Just moved here from Illinois • Usable data exploration systems. • Your turn: • Name, field, DB background

  3. Data management • Modeling a large number of entities and relationships. • Called structured data • Formal (logical) model • Maintaining them on computational devices • Servers in the cloud, sensor networks, … • Keep them organizedaccording to model • Cope with failures • …

  4. Data management • Exploring entities and relationships efficiently, easily, and effectively • Where are the more affordable apartments in Portland? • Who is the most similar person to Alan? • How a virus will likely to spread in a population? • Make an informed and effective decision

  5. Why study data management? • Data is everywhere: • Business: financial analytics, … • Social: social network, data sharing, … • Personal: map apps, … • Science: spread of diseases, …

  6. Data management is valuable • According to McKinsey & Company’s: • $300 billion potential annual value to US health care • €250 billion potential annual value to Europe’s public sector • 60% potential increase in retailer’s operating margins • Data scienceis transforming the way we make decisions, make scientific discovery, … • Analyzing genetic data to find cures for diseases.

  7. Data management is challenging • According to McKinsey & Company’: • 30 billion data items shared on Facebook every month • 235 TB collected by the Library of Congress • 40% growth in the global data each year • 90% of world’s data was generated in the last two year! • Big data: huge, heterogeneous, evolving

  8. We study these challenges • How to get what we like from the data easily, effectively, and efficiently?

  9. Why should we learn these subjects? • Isn’t sufficient to know SQL? • Let companies that make database management systems to worry about these issues. • No! You will end up with: • A query that takes hundreds of hours to finish! • A database that contains negative salaries!

  10. Why should we learn these subjects? • Managing conventional data requires more: • Tuning databases, developing efficient data exploration programs, … • You may face unconventional data management scenarios • The data may be a big graph that is constantly evolving. • You may use data management ideas in your own work.

  11. Prerequisites • Good programming skills • CS 261 and CS 275 or equivalent • Contact instructor if you are not sure.

  12. Readings • Required: • Database Systems: The Complete Book, Hector Garcia Molina, Jeffry Ullman, and Jennifer Widom • Notes on the course website for subjects not covered by the textbook.

  13. Readings • Recommended: • Database Management Systems, Ragu Ramakrishnan and Johannes Gehrke • Foundations of Databases, Serge Abiteboul, Richard Hull, and Victor Vianu • Other useful readings on the course website.

  14. Grading Scheme • Assignments 40% • Project 60%

  15. Assignments • Written assignments • To understand the main concepts and methods. • Should be done individually. • Start soon!

  16. Project • A database centric application • Data Engineering effort. • Advanced feature • Different from CS 275 • Easier search (keyword search) • Nice visualization • …

  17. Project • Group 2 – 4 • Practice how to work in groups • Project definition is due in the third week of the class! • The data, application, and scope • 5% of total grade

  18. Basic Concepts • Database management system (DBMS): • A piece of software that simplifies and facilitates data management and exploration. • Database content • Data • Schema: information about data, meaning of the data Schema Data Salary: 10 Age: 10

  19. Physical Data Independence • Independence from physical details • File system, operating system, hardware, .. • Data models • The way that we see real-world data. • Relational data model: everything is a relation. • Declarative query language: SQL • Say what, not how

  20. Relational Database Management Physical Storage Conceptual Design Schema Relational Model Entity Relationship(ER) Model Files and Indexes

  21. Topics Physical Storage Conceptual Design Schema Relational Model Entity Relationship(ER) Model Files and Indexes

  22. Topics Physical Storage Conceptual Design Schema Relational Model Entity Relationship(ER) Model Files and Indexes

  23. Topics Modeling data and asking questions: Relational Model & Languages Physical Storage Conceptual Design Schema Relational Model Entity Relationship(ER) Model Files and Indexes

  24. Topics Organizing the data: Database design Physical Storage Conceptual Design Schema Relational Model Entity Relationship(ER) Model Files and Indexes

  25. Topics Keeping the data clean and meaningful: Integrity constraints Physical Storage Conceptual Design Schema Relational Model Entity Relationship(ER) Model Files and Indexes

  26. Topics Doing more than asking queries: Stored procedures, ORM Physical Storage Conceptual Design Schema Relational Model Entity Relationship(ER) Model Files and Indexes

  27. Topics Storing data in files: Storage Management Physical Storage Conceptual Design Schema Relational Model Entity Relationship(ER) Model Files and Indexes

  28. Topics Finding data in a big file really fast: Data access methods Physical Storage Conceptual Design Schema Relational Model Entity Relationship(ER) Model Files and Indexes

  29. Topics Translating complex queries to read & write: Query execution & optimization Physical Storage Conceptual Design Schema Relational Model Entity Relationship(ER) Model Files and Indexes

  30. Topics Coping with failure: Transaction Management Physical Storage Conceptual Design Schema Relational Model Entity Relationship(ER) Model Files and Indexes

  31. Topics Tuning Physical Storage Conceptual Design Schema Relational Model Entity Relationship(ER) Model Files and Indexes

  32. Relational Database Management Physical Storage Conceptual Design Schema Relational Model Entity Relationship(ER) Model Files and Indexes

  33. Conceptual Design • High level data model • Describe information in the database without worrying about implementation issues • ER model is the most popular tool for conceptual design • Invented by Peter Chen in 1976 • Provides an easy-to-use language: pictures • We review the basic stuff

  34. ER Model/ Diagram title category name price sells Publisher Book address buys employs Person name ssn address

  35. ER Model • Entity Set • An entity is distinctive real world object: cs540 textbook • An entity set is a collection of entities • Attribute • Belongs to an entity • Does not contain any other attribute: atomic • Atomic data types: string, integer, real, … Book Publisher title category price Book

  36. Relationship • Describe relationships between entity sets • Do not exists without entities • May have attributes employs Person Publisher startdate employs Person Publisher

  37. Relationship Multiplicity • One to one: • publisher - manager • Many to one • book – publisher • Many to many • publisher – person

  38. Book Purchase Store Person Multi-way Relationships • Relationships between more than two entity sets • Each entity set has a different role in the relationship seller buyer

  39. ER Model: Keys • Attribute(s) that uniquely identify entities • No standard way to annotate: usually underlined. • Each entity set must have a key • Why? • Relationships may also have keys Person name ssn address

  40. Topics Modeling data and asking questions: Relational Model & Languages Physical Storage Conceptual Design Schema Relational Model Entity Relationship(ER) Model Files and Indexes

  41. Relational Model • Relational model defines data organization and data retrieval/manipulation operations • It is easier to implement than ER model • It captures more details about the data

  42. An Example Relation name Attribute names Book: Title Price Category Year MySQL $102.1 computer 2001 Cell biology $201.69 biology 1954 French cinema $53.99 art 2002 NBA History $63.65 sport 2010 tuples

  43. Relational Model • Attributes • Atomic values • atomic types: string, integer, real, date, … • Each relation must have keys • Attributes without duplicate values • A relation does not contain duplicate tuples. • Reordering tuples does not change the relation. • Reordering attributes does not change the relation.

  44. Database Schema vs. Database Instance • Schema of a Relation • Names of the relation and their attributes. • E.g.: Person (Name, Address, SSN) • Types of the attributes • Constraints on the values of the attributes • Schema of the database • Set of relation schemata • E.g.: Person (Name, Address, SSN) Employment(Company, SSN)

  45. Database Schema vs. Database Instance • Schema: Book(Title, Price, Category, Year) • Instance: Title Price Category Year MySQL $102.1 computer 2001 Cell biology $201.69 biology 1954 French cinema $53.99 art 2002 NBA History $63.65 sport 2010

  46. Example Schema Beers(name, manf) Bars(name, addr, license) Drinkers( name,addr, phone) Likes(drinker, beer) Sells(bar, beer, price) Frequents(drinker, bar)

More Related