CS 540: Database Management Systems


Presentation Transcript


  1. CS 540: Database Management Systems 1: Introduction

  2. Welcome to CS540! • Arash Termehchy • Assistant professor in the school of EECS • Just moved here from Illinois • Usable data exploration systems. • Your turn: • Name, field, DB background

  3. Data management • Modeling a large number of entities and relationships. • Called structured data • Formal (logical) model • Maintaining them on computational devices • Servers in the cloud, sensor networks, … • Keep them organized according to the model • Cope with failures • …

  4. Data management • Exploring entities and relationships efficiently, easily, and effectively • Where are the more affordable apartments in Portland? • Who is the most similar person to Alan? • How is a virus likely to spread in a population? • Make an informed and effective decision

  5. Why study data management? • Data is everywhere: • Business: financial analytics, … • Social: social network, data sharing, … • Personal: map apps, … • Science: spread of diseases, …

  6. Data management is valuable • According to McKinsey & Company: • $300 billion potential annual value to US health care • €250 billion potential annual value to Europe’s public sector • 60% potential increase in retailers’ operating margins • Data science is transforming the way we make decisions, make scientific discoveries, … • Analyzing genetic data to find cures for diseases.

  7. Data management is challenging • According to McKinsey & Company: • 30 billion data items shared on Facebook every month • 235 TB collected by the Library of Congress • 40% growth in global data each year • 90% of the world’s data was generated in the last two years! • Big data: huge, heterogeneous, evolving

  8. We study these challenges • How to get what we like from the data easily, effectively, and efficiently?

  9. Why should we learn these subjects? • Isn’t it sufficient to know SQL? • Let the companies that make database management systems worry about these issues. • No! You will end up with: • A query that takes hundreds of hours to finish! • A database that contains negative salaries!
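The "negative salaries" problem on slide 9 is the kind of error an integrity constraint in the schema can catch. The following is a minimal sketch using Python's built-in sqlite3 module; the `employee` table, its columns, and the sample rows are hypothetical and not taken from the course.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: the CHECK constraint makes the DBMS reject negative salaries.
conn.execute(
    "CREATE TABLE employee (name TEXT, salary INTEGER CHECK (salary >= 0))"
)
conn.execute("INSERT INTO employee VALUES ('Ann', 50)")

try:
    conn.execute("INSERT INTO employee VALUES ('Bob', -5)")
    rejected = False
except sqlite3.IntegrityError:
    # SQLite reports a violated CHECK constraint as an IntegrityError.
    rejected = True

count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(rejected, count)
```

Without the CHECK clause the second insert would succeed silently, which is how a database ends up holding values that make no sense.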

  10. Why should we learn these subjects? • Managing conventional data requires more: • Tuning databases, developing efficient data exploration programs, … • You may face unconventional data management scenarios • The data may be a big graph that is constantly evolving. • You may use data management ideas in your own research.

  11. Prerequisites • Good programming skills • CS 261 and CS 275 or equivalent • Contact instructor if you are not sure.

  12. Readings • Required: • Database Systems: The Complete Book, Hector Garcia-Molina, Jeffrey Ullman, and Jennifer Widom • Foundations of Databases, Serge Abiteboul, Richard Hull, and Victor Vianu • Notes on the course website for subjects not covered by the textbook.

  13. Readings • Recommended: • Database Management Systems, Raghu Ramakrishnan and Johannes Gehrke • Other useful readings on the course website.

  14. Grading Scheme • Assignments 40% • Project 60%

  15. Assignments • Written assignments • To understand the main concepts and methods. • Should be done individually. • Start soon!

  16. Project • A data-centric application • Some research elements or heavy engineering effort. • Different from CS 275 or CS 440 • Groups of 1 to 4 • A list of possible projects is on the website. • Project definition is due in the third week of the class!

  17. Project • A list of possible projects is on the website. • Project definition is due in the third week of the class! • 5% of total grade

  18. Basic Concepts • Database management system (DBMS): • A piece of software that simplifies and facilitates data management and exploration. • Database content • Data • Schema: information about data, meaning of the data • (Figure: schema labels Salary and Age; data values 10 and 10)
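The schema/data distinction on slide 18 can be sketched with Python's built-in sqlite3 module. This is only an illustration: the `employee` table name is an assumption, and the columns and values simply mirror the Salary/Age figure on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema: information about the data (column names and types).
# The table name "employee" is hypothetical.
conn.execute("CREATE TABLE employee (salary INTEGER, age INTEGER)")

# Data: the stored values themselves.
conn.execute("INSERT INTO employee VALUES (10, 10)")

# PRAGMA table_info returns one row per column: (cid, name, type, notnull, dflt_value, pk).
schema = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(employee)")]
data = conn.execute("SELECT salary, age FROM employee").fetchall()

print("schema:", schema)
print("data:  ", data)
```

The two values in the data row are both 10; only the schema tells us that one is a salary and the other an age.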

  19. Physical Data Independence • Independence from physical details • File system, operating system, hardware, .. • Data models • The way that we see real-world data. • Relational data model: everything is a relation. • Declarative query language: SQL • Say what, not how
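"Say what, not how" can be made concrete by computing the same answer twice: once with an explicit loop and once with a SQL query. The sketch below uses Python's built-in sqlite3 module; the `person` table, the sample rows, and the index name are hypothetical.

```python
import sqlite3

# Hypothetical sample data: (name, age, salary).
rows = [("Ann", 30, 50), ("Bob", 45, 70), ("Cy", 28, 40)]

def names_over_procedural(rows, min_age):
    """'How': spell out the scan, the test, and the ordering ourselves."""
    result = []
    for name, age, _salary in rows:
        if age > min_age:
            result.append(name)
    return sorted(result)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, age INTEGER, salary INTEGER)")
conn.executemany("INSERT INTO person VALUES (?, ?, ?)", rows)

def names_over_declarative(conn, min_age):
    """'What': state the condition; the DBMS decides how to evaluate it."""
    cur = conn.execute(
        "SELECT name FROM person WHERE age > ? ORDER BY name", (min_age,)
    )
    return [r[0] for r in cur]

procedural = names_over_procedural(rows, 29)
declarative = names_over_declarative(conn, 29)

# Physical data independence: changing the physical design (adding an index)
# does not require any change to the query text, and the answer is the same.
conn.execute("CREATE INDEX idx_person_age ON person(age)")
after_index = names_over_declarative(conn, 29)

print(procedural, declarative, after_index)
```

The procedural version is tied to one in-memory layout, while the SQL text stays the same whether or not an index exists.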

  20. Relational Database Management • Conceptual Design: Entity Relationship (ER) Model • Schema: Relational Model • Physical Storage: Files and Indexes

  21. Topics • (Diagram: Conceptual Design: Entity Relationship (ER) Model; Schema: Relational Model; Physical Storage: Files and Indexes)

  22. Topics • (Same diagram)

  23. Topics • Modeling data and asking questions: Relational Model & Languages

  24. Topics • Exploring more than one database: Data Integration

  25. Topics • Asking vague queries: Approximate queries

  26. Topics • Storing data in files: Storage Management

  27. Topics • Finding data in a big file really fast: Data access methods

  28. Topics • Translating complex queries to read & write: Query execution & optimization

  29. Topics • Coping with failure: Transaction Management

  30. Topics • Tuning

  31. Relational Database Management • Conceptual Design: Entity Relationship (ER) Model • Schema: Relational Model • Physical Storage: Files and Indexes

  32. Conceptual Design • High-level data model • Describe information in the database without worrying about implementation issues • ER model is the most popular tool for conceptual design • Invented by Peter Chen in 1976 • Provides an easy-to-use language: pictures • We review the basics

  33. ER Model/Diagram • (Diagram: entity sets Book (title, category, price), Publisher (name, address), and Person (name, ssn, address); relationships sells, buys, and employs)

  34. ER Model • Entity Set • An entity is a distinct real-world object, e.g., the CS540 textbook • An entity set is a collection of entities • Attribute • Belongs to an entity • Does not contain any other attribute: atomic • Atomic data types: string, integer, real, … • (Diagram: entity sets Book and Publisher; Book has attributes title, category, and price)
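As a rough illustration of slide 34, an entity can be modeled as a record with atomic attributes and an entity set as a collection of such records. This is a sketch in Python, not part of the ER notation itself; the attribute names follow the Book diagram, and the sample titles and prices are made up.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Book:
    # Each attribute is atomic: a plain string or number, with no nested attributes.
    title: str
    category: str
    price: float

# An entity set is a collection of entities; these sample values are hypothetical.
books = {
    Book("Intro to Databases", "textbook", 80.0),
    Book("Graph Mining", "research", 55.5),
}

# Adding an entity that is already present leaves the set unchanged.
books.add(Book("Intro to Databases", "textbook", 80.0))

print(len(books))
```

Using a set rather than a list reflects that an entity set holds distinct entities.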

  35. Relationship • Describe relationships between entity sets • Do not exist without entities • May have attributes • (Diagram: employs relationship between Person and Publisher, with attribute startdate)
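The two points on slide 35, that a relationship may carry attributes and cannot exist without its entities, can be sketched as follows. The representation (publishers identified by name, persons by ssn) and all sample values are assumptions for illustration only.

```python
from typing import NamedTuple

class Employs(NamedTuple):
    publisher: str  # identifies a Publisher entity (by name, an assumption)
    person: str     # identifies a Person entity (by ssn, an assumption)
    startdate: str  # attribute of the relationship itself

# Hypothetical entity sets, reduced to their identifying values.
publishers = {"Acme Press"}
persons = {"111", "222"}

employs = [
    Employs("Acme Press", "111", "2020-01-15"),
    Employs("Ghost Books", "222", "2021-06-01"),  # refers to a publisher that does not exist
]

def dangling(relationships, publishers, persons):
    """Return relationship instances that refer to a missing entity."""
    return [
        r for r in relationships
        if r.publisher not in publishers or r.person not in persons
    ]

bad = dangling(employs, publishers, persons)
print(bad)
```

The `startdate` value belongs to neither the person nor the publisher alone; it describes the pairing, which is why it sits on the relationship.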

  36. Relationship Multiplicity • One to one: • publisher - manager • Many to one • book – publisher • Many to many • publisher – person
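The three multiplicities on slide 36 can be checked against a concrete set of pairs. The sketch below is a hypothetical helper, and the sample pairs are made up. Note that multiplicity is a constraint on every possible instance, so a check like this only shows which multiplicity a given instance is consistent with.

```python
from collections import defaultdict

def multiplicity(pairs):
    """Classify a binary relationship instance given as (left, right) pairs."""
    rights_of = defaultdict(set)
    lefts_of = defaultdict(set)
    for left, right in pairs:
        rights_of[left].add(right)
        lefts_of[right].add(left)
    # Each left entity is related to at most one right entity.
    left_to_one = all(len(s) <= 1 for s in rights_of.values())
    # Each right entity is related to at most one left entity.
    right_to_one = all(len(s) <= 1 for s in lefts_of.values())
    if left_to_one and right_to_one:
        return "one-to-one"
    if left_to_one:
        return "many-to-one"
    if right_to_one:
        return "one-to-many"
    return "many-to-many"

# Hypothetical instances following the examples on the slide.
publisher_manager = {("P1", "Alice"), ("P2", "Bob")}
book_publisher = {("DB Book", "P1"), ("AI Book", "P1"), ("OS Book", "P2")}
publisher_person = {("P1", "Ann"), ("P1", "Bob"), ("P2", "Ann")}

print(multiplicity(publisher_manager))
print(multiplicity(book_publisher))
print(multiplicity(publisher_person))
```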

  37. Multi-way Relationships • Relationships between more than two entity sets • Each entity set has a different role in the relationship • (Diagram: Purchase relationship among Book, Store in the seller role, and Person in the buyer role)

  38. ER Model: Keys • Attribute(s) that uniquely identify entities • No standard way to annotate: usually underlined. • Each entity set must have a key • Why? • Relationships may also have keys • (Diagram: Person with attributes name, ssn, address)
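One way to see why each entity set needs a key: without one, two different entities can be indistinguishable. The helper below is a hypothetical sketch that tests whether a set of attributes uniquely identifies the entities in a given instance; the Person values are made up. As with multiplicity, a key is a constraint on all instances, so passing this check on sample data is necessary but not sufficient.

```python
def is_key(entities, attrs):
    """Return True if no two entities agree on every attribute in attrs."""
    seen = set()
    for entity in entities:
        value = tuple(entity[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

# Two different people who share a name and an address (hypothetical values).
people = [
    {"name": "Alan", "ssn": "111", "address": "1 Main St"},
    {"name": "Alan", "ssn": "222", "address": "1 Main St"},
]

ssn_is_key = is_key(people, ["ssn"])
name_address_is_key = is_key(people, ["name", "address"])
print(ssn_is_key, name_address_is_key)
```

Here only `ssn` tells the two people apart, so a relationship such as employs could not refer to one of them unambiguously through name and address.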
