CS411 Web site: http://www.cs.uiuc.edu/class/sp07/cs411/ Announcements Syllabus Policies Schedule Office hours
Teaching Staff Lawrence Angrave Tao Cheng Maryam Karimzadehgan Yoonkyong Lee Chengkai Li
My background Lawrence AngravePh.D. in Physics commercial– Enterprise applicationstwo software startupsdeveloped shrink-wrap and web applications application security– 'Hacking Databases' research– A.I. using Cognitive Science
Six degress of Kevin BaconDatabase ManagementSystems Where do we find DBMS?
CS411: Two perspectives on DBMS User perspective how to use a database system System perspective how to design and implement a database system
User Perspective conceptual data modeling, the relational and other data models, database schema design, relational algebra, and the SQL query language...
System perspective Data representation, indexing, query optimization and processing, transaction processing, concurrency control, and crash recovery ...
Prerequisite Must have data structure and algorithm background CS 225 or 400 equivalent Programming skills project will require lot of programming need PHP C++ or Java to do a good job at talking with databases you or your project group picks the language Other languages possible but your TA's / peers may not be able to help Knowing only C will require more work more difficult to talk in C to databases
Textbook Database Systems: The Complete Book, Hector Garcia-Molina, Jeffrey D. Ullman, & Jennifer D. Widom Good references Database Management Systems, Raghu Ramakrishnan an Johannes Gehrke, McGraw-Hill Database System Concepts, Abraham Silberschatz, Henry F. Korth, and S. Sudarshan Fundamentals of Database SystemsRamez Elmasri and Shamkant Navathe An Introduction to Database SystemsC. J. Date
Course Format For all students two 75-min lectures / week 4 homeworks project a midterm and a final exam 4credit option for graduate students Individual assignment
Lectures Lecture slides will be posted shortly after the lecture Lectures are important for guiding your reading of textbook (and will be covered in exams and homeworks) You can ask questions too :-)
Homework assignment Paper based, no programming Due by 11:30am of Thursday Feb 15th Hand-in: Trisha Benson's office 4322 SC Read relevant book sections (see CS411 website) Use newsgroup & TAs for help
Homework policy Late homework will not be accepted Solution will be posted soon after deadline Start it early :-) Ask questions/clarification before the due day.
Project Web-based DBMS application select an application that needs a database build a database application from start to finish bring the application to the web Significant amount of programming Multiple stages Submit some work at the end of each stage Will show a demo at semester end
Project Groups Groups of 3 or 4 students learn how to work in a group: valuable skills also use project group as study partners Form groups next week start by posting requests on the class newsgroup otherwise ... we will assign a group to you Grading all members receive same grade if someone drops out, the rest pick up the work
What does your neighbor know about programming and programming databases?
Midterm & final Midterm date after course drop date There's a review before each exam Check dates next week to make sure there's no conflict generally no makeup exams unless exceptional cases
Grading Breakdown (subject to minor changes) Quizes 5% Homework 20 % Project 25 % Final 30 % Midterm 20 % Re-grade: one week appeal window
Office Hours The best way to ask questions and clarifications if you're stuck Office hours every day See course website for schedule
Communications http://www.cs.uiuc.edu/class/sp07/cs411/ Newsgroup: class.cs411 (.i2cs) vitally important! make sure to check it daily for questions/clarifications announcements will appear here and the course web site If you have a question / problem talk to people in your group first post your question on newsgroup email TA411 @ cs.uiuc.edu go to office hours to talk to TA or instructor
Newsgroup: class.cs411 Designed for you and your peer to communicate and help one another please do not post solutions/admin-requests to the newsgroup TAs will monitor and try their best to help with your questions There can be many questions may not be able to answer all of them not good for more complex questions hence should come to office hours or email TA
class.cs411 Give and Take Pose questions & answer questions
Data Management Evolution Jim Gray: Evolution of Data Management. IEEE Computer 29(10): 38-46 (1996): Manual processing: -- 1900 Mechanical punched-cards: 1900-1955 Stored-program computer: sequential record processing: 1955-1970 Online navigational network DBs: 1965-1980 many applications still run today! Relational DB: 1980-1995 Post-relational and the Internet: 1995-
Database Management System (DBMS)? System for providing efficient, convenient, and safe multi-user storage of and access to massive amounts of persistent data
Example: Banking system Information on accounts, customers, balances, current interest rates, transaction histories... MASSIVE – many Tbs history of all transactions, check images... Far too big for memory PERSISTENT – data outlives programs that operate on it
MULTI-USER Access MULTI-USER: many people/programs accessing same database, or even same data, simultaneously - Need careful controls Alex @ ATM1: withdraw $100 from account #002 get balance from database if balance >= 100 then balance := balance - 100 dispense cash put new balance into database Bob @ ATM2: withdraw $50 from account #002 get balance from database if balance >= 50 then balance := balance - 50 dispense cash put new balance into database
MULTI-USER: What can go wrong? Initial balance = 200 Final balance = ??
Why direct implementation don’t work Storing data file system is limited size < 4GB (on 32 bit machines) when system crashes we may loose data password-based authorization insufficient Query & update need to write a new C++/Java program for every new query need to worry about performance (non-trivial)
Concurrency limited protection need to worry about interfering with other users need to offer different views to different users (e.g. registrar, students, professors) Schema change entails changing file formats need to rewrite virtually all applications Time for a Database
DBMS: More Requirements SAFE from system failures & malicious users CONVENIENT simple commands to - debit account, get balance, write statement, transfer funds, etc. also unpredicted queries should be easy EFFICIENT don't search all files in order to - get balance of one account, get all accounts with low balances, get large transactions, etc. massive data! DBMS's carefully tuned for performance
DBMS: A Software System Buy, install, set up for particular application Available for PC's, workstations, mainframes, supercomputers Major vendors: all are "relational" (or "object-relational") DBMS Oracle IBM (DB2) Microsoft (SQL Server, Access) Sybase
DBMS Architecture Query Parser Transaction Manager DDL Processor Query Rewriter Concurrency Control Logging & Recovery Query Optimizer Query Executor Records Indexes Lock Tables Buffer: data, indexes, log, etc Buffer Manager Storage Manager Storage User/Web Forms/Applications/DBA query transaction DDL commands Main Memory data, metadata, indexes, log, etc
Data Structuring:ModelSchemaData Model conceptual structuring of data stored in database eg. data is set of records, each with student-ID, name, address, courses, photo e.g. data is graph where nodes represent cities, edges represent airline routes
Data Structuring: Schema, Data Schema versus data schema: describes how data is to be structured defined at set-up time rarely changes also called "metadata" data is actual "instance" of database, changes rapidly vs. types and variables in programming languages
Data definition language (DDL) commands to set up and change schema CREATE ALTER DELETE Can also create triggers, assign rights, constraints...
Data Manipulation Language (DML) Commands to manipulate data in database: RETRIEVE, INSERT, DELETE, MODIFY Also called a "query language"
People DBMS user: queries/modifies data DBMS application designer set up schema, loads data, … DBMS administrator user management, performance tuning, … DBMS implementer: builds systems
Summary (and goals)
Getting the most out of 411? Read and think before class welcome to ask questions before class! Study and discuss with your peers discuss readings & assignments but write your own solution Use lectures to guide your study use it as a road-map for what’s important lectures do not cover everything
First ½ Topics: User Perspective Entity-Relationship Model Relational Model Relational Database Design Relational Algebra SQL and DBMS Functionalities: SQL Programming Queries and Updates Indexes and Views Constraints and Triggers
Second ½ Topics: System Perspective Storage and Representation Indexing Query Execution and Optimization Transaction Management
My goals for CS411 Good structure & at the right level for most students Theoretical + Practical Questions and roadblocks are dealt with swiftly You can it use as a launch-pad
Your outlook? 2007 "Useful for my next course" 2008 "My resume, employment, $$" 21st C = large amount of information processing "I have this idea for a startup ..."
Announcements Time to start forming project groups Next lecture: Thursday 2pm Read the book! Any questions / ideas? Please talk to us