Create Presentation
Download Presentation

Download Presentation

Temple University – CIS Dept. CIS616– Principles of Database Systems

Download Presentation
## Temple University – CIS Dept. CIS616– Principles of Database Systems

- - - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - - -

**Temple University – CIS Dept.CIS616– Principles of**Database Systems V. Megalooikonomou Relational Model (based on notes by Silberchatz,Korth, and Sudarshan and notes by C. Faloutsos at CMU)**Overview**• history • concepts • Formal query languages • relational algebra • rel. tuple calculus • rel. domain calculus**History**• before: records, pointers, sets etc • introduced by E.F. Codd (1923-2003) in 1970 • revolutionary!!! • first systems: 1977-8 (System R; Ingres) • Turing award in 1981**Concepts**• Database: a set of relations (= tables) • rows: tuples • columns: attributes (or keys) • superkey, candidate key, primary key**Example**Database:**Example: cont’d**Database: k-th attribute (Dk domain) rel. schema (attr+domains) tuple**Example: cont’d**rel. schema (attr+domains) instance**Example: cont’d**• Di: the domain of the I-th attribute (eg., char(10) • Formally: an instance is a subset of (D1 x D2 x …x Dn) rel. schema (attr+domains) instance**Example: cont’d**• superkey (eg., ‘ssn , name’): determines record • cand. key (eg., ‘ssn’, or ‘st#’): minimal superkey (no subset of it is a superkey) • primary key: one of the cand. keys**Another example**• Example: if Customer-name = {Jones, Smith, Curry, Lindsay}Customer-street = {Main, North, Park}Customer-city = {Harrison, Rye, Pittsfield} Then r = { (Jones, Main, Harrison), (Smith, North, Rye), (Curry, North, Rye), (Lindsay, Park, Pittsfield)} is a relation over Customer-name x Customer-street x Customer-city**Relations, tuples**• Relation Schema: • A1, A2, …, Anare attributes • R = (A1, A2, …, An ) is a relation schema E.g. Customer-schema = (customer-name, customer-street, customer-city) • r(R) is a relation on the relation schema R E.g. customer (Customer-schema) • Relations are unordered • Order of tuples is irrelevant (tuples may be stored in an arbitrary order)**Database**• A database consists of multiple relations • Information about an enterprise is broken up into parts, with each relation storing one part of the informationE.g.: account : stores information about accountsdepositor : stores information about which customer owns which account customer : stores information about customers • Storing all information as a single relation such as bank(account-number, balance, customer-name, ..)results in • repetition of information (e.g. two customers own an account) • the need for null values (e.g. represent a customer without an account) • Normalization theory (discuss later) deals with how to design relational schemas**Overview**• history • concepts • Formal query languages • relational algebra • rel. tuple calculus • rel. domain calculus**Formal query languages**• How do we collect information? • Eg., find ssn’s of people in cis331 • (recall: everything is a set!) • One solution: Relational algebra, i.e., set operators (procedural language) • Q1: Which operators?? • Q2: What is a minimal set of operators?**Relational operators**• . • . • . • set union U • set difference ‘-’**Example:**• Q: find all students (part or full time) • A: PT-STUDENT union FT-STUDENT**Observations:**• two tables are ‘union compatible’ if they have the same attributes (i.e., same arity: number of attributes and same ‘domains’) • Q: how about intersection ? U**Observations:**• A: redundant: • STUDENT intersection STAFF = STUDENT - (STUDENT - STAFF) STAFF STUDENT**Relational operators**• . • . • . • set union • set difference ‘-’ U**Other operators?**• E.g., find all students on ‘Main street’ • A: ‘selection’**Other operators?**• Notice: selection (and rest of operators) expect tables, and produce tables • --> can be cascaded!! • For selection, in general:**Selection - examples**• Find all ‘Smiths’ on ‘Forbes Ave’ ‘condition’ can be any boolean combination of ‘=‘, ‘>’, ‘>=‘, ...**Relational operators**• selection • . • . • set union • set difference R - S R U S**Relational operators**• selection picks rows - how about columns? • A: ‘projection’ - eg.: finds all the ‘ssn’ - removing duplicates**Relational operators**Cascading: ‘find ssn of students on ‘forbes ave’**Relational operators**• selection • projection • . • set union • set difference R - S R U S**Relational operators**Are we done yet? Q: Give a query we can not answer yet!**TAKES**SSN c-id grade 123 cis331 A 234 cis331 B Relational operators A: any query across two or more tables, eg., ‘find names of students in cis351’ Q: what extra operator do we need?? A: surprisingly, the cartesian product is enough!**Cartesian product**• E.g., dog-breeding: MALE x FEMALE • gives all possible couples = x**TAKES**SSN c-id grade 123 cis331 A 234 cis331 B so what? • Eg., how do we find names of students taking cis351?**Ssn**Name Address ssn cid grade 123 smith main str 123 cis331 A 234 jones forbes ave 123 cis331 A 123 smith main str 234 cis331 B 234 jones forbes ave 234 cis331 B Cartesian product • A:**Cartesian product**Ssn Name Address ssn cid grade 123 smith main str 123 cis331 A 234 jones forbes ave 123 cis331 A 123 smith main str 234 cis331 B 234 jones forbes ave 234 cis331 B**Ssn**Name Address ssn cid grade 123 smith main str 123 cis331 A 234 jones forbes ave 123 cis331 A 123 smith main str 234 cis331 B 234 jones forbes ave 234 cis331 B**FUNDAMENTALRelational operators**• selection • projection • cartesian product MALE x FEMALE • set union • set difference R - S R U S**Relational ops**• Surprisingly, they are enough, to help us answer almost any query we want!! • derived operators, for convenience • set intersection • join (theta join, equi-join, natural join) • ‘rename’ operator • division**Joins**• Equijoin:**Cartesian product**• A: Ssn Name Address ssn cid grade 123 smith main str 123 cis331 A 234 jones forbes ave 123 cis331 A 123 smith main str 234 cis331 B 234 jones forbes ave 234 cis331 B**Joins**• Equijoin: • theta-joins: generalization of equi-join - any condition**Joins**• very popular: natural join: RS • like equi-join, but it drops duplicate columns: STUDENT(ssn, name, address) TAKES(ssn, cid, grade)**Joins**• nat. join has 5 attributes Ssn Name Address ssn cid grade 123 smith main str 123 cis331 A 234 jones forbes ave 123 cis331 A 123 smith main str 234 cis331 B 234 jones forbes ave 234 cis331 B equi-join: 6**Natural Joins - nit-picking**• if no attributes in common between R, S: • nat. join -> cartesian product:**Overview - rel. algebra**• fundamental operators • derived operators • joins etc • rename • division • examples**rename op.**• Q: why? • A: • Shorthand (BEFORE can be a relational algebra expression) • self-joins; … • for example, find the grand-parents of ‘Tom’, given PC(parent-id, child-id)**rename op.**• PC(parent-id, child-id)**rename op.**• first, WRONG attempt: • (why? how many columns?) • Second WRONG attempt:**rename op.**• we clearly need two different names for the same table - hence, the ‘rename’ op.**Overview - rel. algebra**• fundamental operators • derived operators • joins etc • rename • division • examples**Division**• Rarely used, but powerful. • Suited for queries that include the phrase “for all” • Example: find suspicious suppliers, i.e., suppliers that supplied all the parts in A_BOMB**Division**• Observations: ~reverse of cartesian product • It can be derived from the 5 fundamental operators (!!) • How?