c20 0046 database management systems lecture 1 l.
Download
Skip this Video
Loading SlideShow in 5 Seconds..
C20.0046: Database Management Systems Lecture #1 PowerPoint Presentation
Download Presentation
C20.0046: Database Management Systems Lecture #1

Loading in 2 Seconds...

play fullscreen
1 / 40

C20.0046: Database Management Systems Lecture #1 - PowerPoint PPT Presentation


  • 152 Views
  • Uploaded on

C20.0046: Database Management Systems Lecture #1. Matthew P. Johnson Stern School of Business, NYU Spring, 2005. Personnel. Instructor: Matthew P. Johnson mjohnson@stern Office hours: tba, 8-171, KMC please visit! Tutor/TF/grader: tba….

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'C20.0046: Database Management Systems Lecture #1' - bradford


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
c20 0046 database management systems lecture 1

C20.0046: Database Management SystemsLecture #1

Matthew P. Johnson

Stern School of Business, NYU

Spring, 2005

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

personnel
Personnel
  • Instructor: Matthew P. Johnson
    • mjohnson@stern
    • Office hours: tba, 8-171, KMC
      • please visit!
  • Tutor/TF/grader: tba…

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

communications
Communications
  • Web page: http://pages.stern.nyu.edu/~mjohnson/dbms/
    • syllabus
    • course policies
    • may move in the future…
  • Blackboard web site
    • Some materials will be available here
    • Discussion board
      • send general-interest messages here to benefit all!
    • Go to http://sternclasses.nyu.edu
    • Click on C20.0046

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

acknowledgements
Acknowledgements
  • Thanks to Ramesh, Ullman, et al., Raghu and Johannes, Dan Suciu, Arthur Keller, David Kuijt for course materials
  • See classpage for other related, antecedent DBMS courses

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

what is a database
What Is a Database?
  • A very large, integrated collection of data.
  • Models real-world enterprise.
    • Entities
      • students, courses, instructors, TAs
    • Relationships
      • George is currently taking C20.0046
      • Dick is currently teaching C20.0046
      • Condi is currently TA-ing C20.0046 but took it last semester
  • A Database Management System (DBMS)is a software package designed to store and manage databases.

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

databases are everywhere ordering a pizza
Databases are everywhere: Ordering a pizza
  • Databases involved?
  • Pizza Hut’s DB
    • stores previous orders by customer
    • stores previous credit cards used
  • Credit card records
    • huge databases of (attempted) purchases
    • location, date, amount, parties
  • Got approved by credit-report companies
  • phone company’s records
    • Local Usage Details (“Pull his LUDs, Lenny.”)
  • Caller ID
    • ensures reported address matches destination

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

your wallet is full of db records
Your wallet is full of DB records
  • Driver’s license
  • Credit cards
  • NYUCard
  • Medical insurance card
  • Social security card
  • Gym membership
  • Money (serial numbers)
  • Maybe even photos (ids on back)

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

databases are everywhere
Databases are everywhere
  • Q: Websites backed by DBMSs?
    • retail: Amazon, etc.
      • data-mining: Page You Made
    • search engines: Google, etc.
    • directories: Internic, etc.
    • searchable DBs: IMDB, tvguide.com, etc.
  • Q: Non-web examples of DBMSs?
    • criminal/terrorist: TIA
    • airline bookings
    • NYPD’s CompStat
      • all serious crime stats by precinct
    • Retailers: Wal-Mart, etc.
      • when to re-order, purchase patterns, data-mining
    • Genomics!

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

example of a traditional db app
Example of a Traditional DB App

Suppose we are building a system

to store the information about:

  • checking accounts
  • savings accounts
  • account holders
  • state of each of each person’s accounts

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

can we do it without a dbms
Can we do it without a DBMS?

Sure we can! Start by storing the data in files:

checking.txt savings.txt customers.txt

Now write C or Java programs to implement specific tasks

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

doing it without a dbms
Doing it without a DBMS...
  • Transfer $100 from George’s savings to checking:

Write a C program to do the following:

  • Read savings.txt
  • Find&update the record “George”
  • balance -= 100
  • Write savings.txt
  • Read checking.txt
  • Find&update the record “George”
  • balance += 100
  • Write checking.txt

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

problems without an dbms
Problems without an DBMS...

1. System crashes:

  • Q: What is the problem ?
  • A: George lost his $100
  • Same problem even if reordered

2. Simultaneous access by many users

  • George and Dick visit ATMs at same
  • Lock checking.txt before each use–what is the problem?
  • Read savings.txt
  • Find&update the rec “George.”
  • Write savings.txt
  • Read checking.txt
  • Find&update the rec “George”
  • Write checking.txt

CRASH !

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

problems without an dbms13
Problems without an DBMS...

3. Large data sets (say 500GB)

    • Why is this a problem?
  • No indices
    • Finding “George” in huge flatfile is expensive
  • Modifications intractable without better data structures
    • “George”  “Georgie” is very expensive
    • Deletions are very expensive

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

problems without an dbms14
Problems without an DBMS...

5. Security?

  • File system may be insecure
  • File system security may be coarse

6. Application programming interface (API)?

  • suppose need other apps to access DB

7. How to interact with other DBMSs?

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

general problems to solve
General problems to solve
  • In building our own system, many Qs arise:
    • how do we store the data? (file organization, etc.)
    • how do we query the data? (write programs…)
    • make sure that updates don’t mess things up?
      • leave the DB “consistent”
    • provide different views on the data?
      • e.g., ATM user’s view v. bank teller’s view
    • how do we deal with crashes?
  • Too hard! Go buy a DBMS!
  • Q: How does a DBMS solve these problems?
  • A: See third part of course

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

big issue transaction processing
Big issue: Transaction processing
  • Grouping of several queries (or other database actions) into one transaction
  • ACID properties
    • Atomicity
      • all or nothing
    • Consistency
      • constraints on relationships
    • Isolation
      • concurrency control
      • Simulated solipsism
    • Durability
      • Crash recovery

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

atomicity durability
Atomicity & Durability
  • Saw how George lost $100 with makeshift software
  • A DBMS prevents this outcome
    • xacts are all or nothing
  • One idea: Keep a log (history) of all actions in set of xacts
    • Durability: Use log to redo or undo certain ops in crash recovery
    • Atomicity: don’t really commit changes until end
      • Then, all at once

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

isolation
Isolation
  • Concurrent execution is essential for performance.
    • Frequent, slow disk accesses
    •  don’t waste CPU – keep running
  • Interleaving actions of different user programs
  • can lead to inconsistency:
    • e.g., two programs simultaneously withdraw from the same account
  • DBMS ensures such problems don’t arise:
    • users can pretend they are using a single-user system.

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

isolation19
Isolation
  • Contrast with a file in two Notepads
    • Strategy: ignore multiple users
    • whichever saves last wins
    • first save is overwritten
  • Contrast with a file in two Words
    • Strategy: blunt isolation
    • One can edit
    • To the other it’s read-only

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

consistency
Consistency
  • Each xant (on a consistent DB) must leave it in a consistent state
    • can define integrity constraints
    • checks the defined claims about the data remain true

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

data models
Data Models
  • Any DBMS uses a data model: collection of concepts for describing data
  • Schema: description of partic set of data, using some data model
  • Relational data model: most widely used (by far) data model
    • Oracle, DB2, SQLServer, other SQL DBMSs
    • main concept: relation ~ table of rows & columns
    • a rel’s schema defines its fields

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

example university database
Example: university database
  • Conceptual schema:
    • Students(ssn: string, name: string, login: string, age: int, gpa: real)
    • Courses(cid: string, cname: string, credits: int)
    • Enrolled(sid:string, cid:string, grade: string)
  • Physical schema:
    • Relations stored as unordered text files.
    • Indices on first column of each rel
  • External Schema (View):
    • Course_info(ssn: string, name: string)
    • My_courses(cname: string, grade: string)

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

how the programmer sees the dbms
How the programmer sees the DBMS
  • Start with DDL to create tables:
  • Continue with DML to populate tables:

CREATE TABLE Students (

Name CHAR(30)

SSN CHAR(9) PRIMARY KEY NOT NULL,

Category CHAR(20)

) . . .

INSERT INTO Students

VALUES(‘Howard, ‘123456789’, ‘undergraduate’)

. . . .

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

how the programmer sees the dbms24
How the programmer sees the DBMS
  • Tables:
  • Still implemented as files, but behind the scenes can be quite complex

Takes:

Students:

Courses:

“data independence” = separate logical view from physical implementation

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

querying s tructured q uery l anguage
Querying: Structured Query Language
  • Find all the students who have taken C20.0046:
    • SELECT SSN

FROM Takes

WHERE CID=‘C20.0046’

  • Find all the students who C20.0046 last fall:
    • SELECT SSN

FROM Takes

WHERE CID=‘C20.0046’ AND Semester=‘Fall, 2003’

  • Find the students’ names:
    • SELECT Name

FROM Students, Takes

WHERE Students.SSN=Takes.SSN AND

CID=‘C20.0046’ AND Semester=‘Fall, 2003’

  • Query processor does this efficiently.

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

database industry
Database Industry
  • Relational databases are a great success of theoretical ideas.
    • based on most “theoretical” type of math there is: set theory
  • DBMS companies are among the largest software companies in the world.
  • Oracle, IBM (with DB2), Microsoft (SQL Server, Microsoft Access), etc.
  • Also opensource: MySQL, PostgreSQL, etc.
  • $20B+ industry.
  • XML (“semi-structured data”) also important
    • New lingua franca for exchanging data

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

the study of dbms
The Study of DBMS
  • Several aspects:
    • Modeling and design of databases
    • DBMS programming: querying and update
    • DBMS implementation
  • This course covers all three
    • though more time on first two
  • Also will look at some more advanced areas
    • XML, data-mining, LDAP?

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

databases are used by
Databases are used by
  • DB app programmers
    • desktop app programmers
    • web developers
  • Database administrators (DBAs)
    • design schemas
    • security/authorization
    • crash recovery
    • tuning
    • better paid than programmers!
  • Everyone else (perhaps indirectly)
  • “You may not be interested in databases, but databases are interested in you.” - Trotsky

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

course outline
Course outline
  • Database design:
    • Entity/Relationship models
    • Modeling constraints
  • The relational model:
    • Relational algebra
    • Transforming E/R models to relational schemas
  • SQL
    • Views and triggers

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

outline continued
Outline (Continued)
  • Connecting to a database from a programming language
  • Storage and indexing
  • Transactions
  • XML
  • Advanced topics
  • May change as course progresses
    • partly in response to audience

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

textbook
Textbook
  • Database Management Systems
  • by Raghu Ramakrishnan, Johannes Gehrke
    • 3 edition (August 14, 2002)

Available:

    • NYU bookstore
    • Amazon/BN (may be cheaper)
    • Amazon.co.uk (may be cheaper still)
    • Links on class page

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

sql readings
SQL Readings
  • Optional reference: SQL in a Nutshell
  • Online (free) SQL tutorials include:
    • A Gentle Introduction to SQL (http://sqlzoo.net/)
    • SQL for Web Nerds (http://philip.greenspun.com/sql/)

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

grading
Grading
  • Prerequisites: Programming experience
    • presumably C/C++/Java
  • Work & Grading:
    • Homework 30%: O(4)
    • Project: 30% - see below.
    • Midterm (closed book/notes): 15%
    • Final (closed book/notes): 20%
    • Class participation: 5%
    • Stern Curve
  • Class attendance is required
  • Absences will seriously affect your total grade

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

the project design end to end db app
The Project: design end-to-end DB app
  • data model
    • Identify entities (and fields), relationships
    • Identify resulting relations (tables)
  • creation of DB in Oracle/Mysql
    • Insertion of real(alistic) data
  • (web) app for accessing/modifying data
    • Identification of “interesting” questions to ask
    • Production of DBMS interface
  • Work in pairs/threes (start forming now)
  • Choose topic on your own: previous e.g.s online
  • Start forming your group today!

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

collaboration
Collaboration
  • Homework and exams done individually
  • Project done with your team members only
  • Non-cited use of others’ problem solutions, code, etc. = plagiarism
  • See Stern’s stern academic honesty policy
  • Contact me if you’re at all unclear before a particular case
  • Cite any materials used if you’re at all unclear after a particular case

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

on going feedback
On-going Feedback
  • Don’t wait until the end-of-semester course evals to complain or give feedback on how to improve course. (It’s too late for you then!)
  • Come see me early on during my office hours
    • or send me email with your concerns
  • “We’re in touch, so you be in touch.”

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

summary
Summary
  • DBMS used to maintain, query large datasets.
  • Benefits include recovery from system crashes, concurrent access, data integrity, security, and quick application development.
  • Database skills are critical in financial services, marketing and other business areas!

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

so what is this course about really
So what is this course about, really?

A bit of everything!

  • Languages: SQL, XPath, XQuery
  • Data modeling
  • Some theory!
    • Functional dependencies, normal forms
    • e.g., how to find most efficient schema for data
  • Algorithms and data structures (in the third part)
    • e.g., indices make data much faster to find – but how?
  • Lots of implementation and hacking for the project
  • Business DBMS examples/cases
  • Most importantly: how to meet real-world needs

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

for next time
For next time
  • Get the book
  • Read chapters 1 & 2

M.P. Johnson, DBMS, Stern/NYU, Spring 2005

for right now written survey
For right now: written survey
  • name
  • previous cs/is/math/logic courses
  • previous programming experience
  • career plans: programmer, DBA, MBA, etc.
  • why taking class

M.P. Johnson, DBMS, Stern/NYU, Spring 2005