cs263 lec 4 enhanced e r models l.
Download
Skip this Video
Download Presentation
CS263 Lec. 4: Enhanced E-R Models

Loading in 2 Seconds...

play fullscreen
1 / 46

CS263 Lec. 4: Enhanced E-R Models - PowerPoint PPT Presentation


  • 94 Views
  • Uploaded on

CS263 Lec. 4: Enhanced E-R Models. Basic ER model introduced in the mid 1970s Since then business relationships and business data have become more complex Enhanced Entity Relationship (EER) model refers to the extension of the original ER model

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'CS263 Lec. 4: Enhanced E-R Models' - buck


Download Now An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
cs263 lec 4 enhanced e r models
CS263 Lec. 4: Enhanced E-R Models

Basic ER model introduced in the mid 1970s

Since then business relationships and business data have become more complex

Enhanced Entity Relationship (EER) model refers to the extension of the original ER model

Supertype/Subtype Relationships - model a general entity type (supertype) then subdivide it into several specialised entity types (subtypes)

Each subtype inherits from its supertype and may have special attributes of its own

representing supertypes and subtypes
Representing supertypes and subtypes
  • Supertype connected with a line to a circle, in turn connected by a line to each subtype (see Fig.)
  • ‘U’ shaped symbol on each line connecting a subtype to the circle indicates subtype is a subset of supertype, the direction of the relationship
  • Attributes shared by all entities associated with the supertype, whilst attributes that are unique to a particular subtype are associated with that subtype
e g organisation has 3 basic types of employees
E.G. Organisation has 3 basic types of employees:

Hourly: Employee_Number, Employee_Name, Address, Date_Hired, Hourly_Rate

Salaried: Employee_Number, Employee_Name, Address, Date_Hired, Annual_Salary, Stock_Option

Contract consultants: Employee_Number, Employee_Name, Address, Date_Hired, Contract_Number, Billing_Rate

Many attributes same across 3 types - could define supertype called EMPLOYEE, subtypes for HOURLY_EMPLOYEE, SALARIED_EMPLOYEE and CONSULTANT (see Fig)

slide5

Employee supertype with three subtypes

All employee subtypes will have emp_no., name, address, and date-hired

Each employee subtype will also have its own attributes

when to use supertype subtype relations
When to use supertype/subtype relations

Should consider using subtypes when either (or both) of the following conditions are present:

1. There are attributes that apply to some (but not all) of the instances of an entity type

2. The instances of a subtype participate in a relationship unique to that subtype

Both are true in the following Fig., where PATIENT has two subtypes: OUTPATIENT and RESIDENT PATIENT (the primary key is PATIENT_ID)

All patients have an Admit_Date and a Patient_Name

patient example
Patient example

Every patient is cared for by a RESPONSIBLE_PHYSICIAN who develops a treatment plan for the patient

Each subtype also has unique attributes. Outpatients have a Checkback_Date, whilst residents have a Date_Discharged and a unique relationship that assigns each patient to a bed (this is a mandatory relationship, and each bed may or may not be assigned to a patient)

slide8

Supertype/subtype relationships of patients

Both outpatients and resident patients are cared for by a responsible physician

Only resident patients are assigned to a bed

slide9
Generalization = process of defining more general entity type from set of more specialized entity types (BOTTOM-UP)
  • Specialization = process of defining one or more subtypes of supertype, and forming supertype/subtype relationships (TOP-DOWN)
  • Figs. Show Generalisation where we have 3 different entity types: CAR, TRUCK and MOTORCYCLE
  • Second Fig. The more general entity type VEHICLE is shown
  • MOTORCYCLE not shown as does not satisfy conditions for a subtype – only attributes of MOTORCYCLE are those common to all vehicles - no attributes specific to motorcycles. Further, MOTORCYCLE does not have a relationship to another entity type
slide10

Example of generalization

Three entity types: CAR, TRUCK, and MOTORCYCLE

All these types of vehicles have common attributes

slide11

Generalization to VEHICLE supertype

So we put the shared attributes in a supertype

Note: no subtype for motorcycle, since it has no unique attributes

slide12

Specialisation: entity type PART has identifier Part_No and attributes Description, Unit_price, Location, Qty_On_Hand, Routing_Number and Supplier

Some parts internally Manufactured Parts, others externally Purchased Parts (some obtained from both)

Some attributes apply to all parts, others such as Routing_Number depend on the source as apply only to Manufactured Parts - suggests PART should be specialised into subtypes MANUFACTURED_PART and PURCHASED_PART

Relationship ‘Supplies’ created between PURCHASED_PART and SUPPLIER - allows users to easily associate purchased parts with their suppliers

Attribute Unit_Price associated with the relationship ‘Supplies’ so price may vary between suppliers

slide13

Example of specialization

Entity type PART

Applies only to purchased parts

Only applies to manufactured parts

slide14

Specialization to MANUFACTURED PART and PURCHASED PART

Note: multivalued attribute was replaced by a relationship to another entity

Created 2 subtypes

slide15
Completeness Constraints: Whether an instance of a supertype must also be a member of at least one subtype. Has two possible rules:
  • 1:Total Specialization Rule: Yes (double line) In following Fig. A PATIENT must either be an OUTPATIENT or a RESIDENT PATIENT. Total specialisation is indicated by the double line extending from the PATIENT identity type to the circle
  • 2: Partial Specialization Rule: No (single line) An entity instance of the supertype is allowed not to belong to any subtype.
    • In the following Fig. If a VEHICLE is a car it will appear as an instance of CAR, and if a truck as an instance of TRUCK.
    • However, if the vehicle is a motorcycle it cannot appear as an instance of any subtype. This example of partial specialisation is specified by the single line from the VEHICLE supertype to the circle
slide16

Examples of completeness constraints

A patient must be either an outpatient or a resident patient

Total specialization rule

slide17

Partial specialization rule

A vehicle could be a car, a truck, or neither

slide18
Disjointness Constraints: Whether supertype may simultaneously be member of two (or more) subtypes. 2 possibilities
  • Disjoint Rule: Supertype can be only ONE of subtypes. Fig. shows PATIENT must be either an outpatient or a resident patient but not be both – specified by the letter ‘d’
  • Subclass of patient may change over time, but at any given time patient is of only one type
  • Overlap Rule: Supertype can simultaneously be member of more than one subtype. Some PARTs both Manufactured and Purchased.
  • Considering Part Number 4000, at given time may be 250 of this part, of which 100 manufactured and 150 purchased. Overlap specified by placing an ‘o’ in the circle
  • Double line suggests any part must be either a purchased part or a manufactured part, or may simultaneously be both
slide19

Examples of disjointness constraints

A patient can either be outpatient or resident, but not both

Disjoint rule

slide21
Subtype Discriminators: Attribute supertype whose values determine the target subtype(s)
  • Disjoint subtypes: simple attribute with alternative values indicating possible subtypes. In the Fig. new attribute ‘Employee_Type’ added to supertype to serve as subtype discriminator. 3 values ‘H’ = Hourly, ‘S’ = Salaried, ‘C’ = Consultant - assigned correct value when new employee added.
  • Expression “Employee_Type = “ (the LHS of a condition statement) placed next to line leading from supertype to circle, with the value of the attribute that selects the appropriate subtype placed adjacent to line leading to that subtype
slide22

A simple attribute with different possible values indicating the subtype

Subtype discriminator (disjoint rule)

subtype discriminators
Subtype discriminators
  • Overlapping – composite attribute whose subparts pertain to different subtypes. Each subpart contains boolean value indicating whether or not instance belongs to associated subtype.
  • In the Fig. a new attribute Part_Type added to PART - composite attribute with components ‘Manufactured?’ and ‘Purchased?’ Each of attributes is a boolean variable, can be combined as: Manufactures only = YN, Purchased only = NY, Purchased and manufactured = YY
slide24

A composite attribute with sub-attributes indicating “yes” or “no” to determine whether it is of each subtype

Subtype discriminator (overlap rule)

slide25

Defining supertype/subtype hierarchies: possible for any of the subtypes to have other subtypes defined on it

In University most general entity type would be PERSON (root) - attributes are SSN (Social Security Number – Identifier), Name, Address, Gender and Date_Of_Birth

Next we define all major subtypes of the root, here there are 3, EMPLOYEE, STUDENT and ALUMNUS (already graduated)

Person may belong to more than one subtype (such as ALUMNUS and EMPLOYEE) so overlap rule is used. Overlap allows for any overlap (3-way) so if certain combinations are not allowed a more refined supertype/subtype hierarchy would have to be developed to eliminate these

slide26

The next step is to evaluate whether any of the subtypes already defined qualify for further specialisation. Here employee has two subtypes, FACULTY and STAFF

Because there may be types of employee other than Faculty and Staff, the partial specialisation rule is indicated. However, such an employee cannot be both Faculty and Staff at the same time, so the disjoint rule is indicated in the circle

Two subtypes are defined for student: GRADUATE_STUDENT and UNDERGRADUATE STUDENT

Notice that total specialisation and the disjoint rule are specified

slide28
Entity clustering: EER diagrams difficult to read when there are too many entities and relationships
  • Entity cluster: set of one or more entity types and relationships grouped into a single abstract entity type
  • Entity-clustering - hierarchical decomposition of a macro-level view of the data model into finer and finer views, eventually resulting in the full, detailed data model.
  • First of Figs. shows completed data model with dashed lines drawn around possible entity clusters, second shows
  • final result of transforming this into an EER diagram of only entity clusters and relationships. Formed:
  • 1. By abstracting a supertype and its subtypes (see CUSTOMER)
  • 2. By combining directly related entity types and their relationships (SELLING_UNIT, ITEM, MATERIAL, MANUFACTURING)
slide29

Possible entity clusters for Pine Valley Furniture (PVF)

Related groups of entities could become clusters

slide30

EER diagram of PVF entity clusters

More readable, isn’t it?

slide31
SQL: The SELECT Statement: Is used for queries on single or multiple tables. It has the following clauses:
  • SELECT - Lists the columns (and expressions involving columns) from base tables or views to be projected into the table that will be the result of the command
  • FROM - Identifies the table(s) or view(s) from which columns will be chosen to appear in the result table, and includes the tables or views needed to join tables to process the query
  • WHERE - Includes the conditions for row selection within a single table or view, and the conditions between tables or views for joining
  • The first two are required, and the third is necessary when only certain table rows are to be retrieved, or multiple tables are to be joined
  • Following Fig. Shows how we can display product name and quantity from the PRODUCT view for all products costing less than $275
select example
SELECT Example
  • Find products with standard price less than $275
  • SELECT PRODUCT_NAME, STANDARD_PRICE
  • FROM PRODUCT_V
  • WHERE STANDARD_PRICE < 275

Comparison Operators in SQL

distinct and
Distinct and *

If the user does not want to see duplicate rows in the result, SELECT DISTINCT can be used, so SELECT DISTINCT PRODUCT_NAME would display a results table without duplicate rows

SELECT * (where * is a wildcard to indicate all columns) displays all columns from all the tables or views in the FROM clause

using expressions
Using expressions

These are mathematical manipulations of the data in the table

Some are stored functions, such as SUM or AVG

e.g. what is the average standard price for each product in inventory (to be put in the result AVERAGE)?

SELECT AVG (STANDARD_PRICE, AS AVERAGE

FROM PRODUCT_V;

slide35
Functions such as COUNT, MIN, MAX, SUM and AVG of specified columns in the column list of a SELECT command may be used to specify that the resulting answer table is to contain aggregated data instead of row-level data.
  • Using any of these aggregate functions will give a one-row answer. e.g. using the COUNT aggregate function to find totals, asking how many different items were ordered on order number 1004
  • SELECT COUNT(*) FROM ORDER_LINE_V
  • WHERE ORDER_ID = 1004;

Note: with aggregate functions you can’t have single-valued columns included in the SELECT clause

using functions
Using functions

It is easy to confuse the functions COUNT(*) and COUNT

COUNT(*) counts all rows selected by a query, regardless of whether any of the rows contain null values

COUNT tallies only those rows that contain a value, it ignores null values

SUM and AVG can only be used with numeric columns

COUNT, COUNT(*), MIN and MAX can be used with any data type. e.g. using MIN on a text column will find the lowest value in the column - the one which is closest to the beginning of the alphabet

slide37

Wildcards can also be used in the WHERE clause if an exact match is not possible

Here the keyword LIKE is paired with wildcard characters and usually a string containing the characters that are known to be the desired matches

The wildcard character ‘%’ is used to represent any collection of characters

e.g. using LIKE ‘%DESK’ when searching PRODUCT_DESCRIPTION will find all the different kinds of desks

The underscore ,_, is used to represent exactly once character

So using LIKE ‘_-drawer’ when searching PRODUCT_NAME will find any products with specified drawers, such as ‘3-drawer’, ‘5-drawer’ etc.

using boolean operators
Using Boolean operators
  • AND, OR, and NOT Operators for customizing conditions in WHERE clause. NOT is evaluated first, then AND, then OR

The following query lists product name, finish and unit price for all desks and all tables that cost more than $300 in the PRODUCT view:

  • SELECT PRODUCT_DESCRIPTION, PRODUCT_FINISH, STANDARD_PRICE
  • FROM PRODUCT_V
  • WHERE (PRODUCT_DESCRIPTION LIKE ‘%Desk’
  • OR PRODUCT_DESCRIPTION LIKE ‘%Table’)
  • AND UNIT_PRICE > 300;
in and not in lists
In and not in lists

To match a list of values, use IN

e.g. list all customers who live in the warmer American states:

SELECT CUSTOMER_NAME, CITY, STATE

FROM CUSTOMER_V

WHERE STATE IN (‘FL’, ’TX’, ‘CA’, ‘HI’);

slide40
Sorting results: ORDER BY- sorts the final result rows in ascending or descending order
  • e.g. list customer, city and state for all customers in the CUSTOMER view whose address is in Florida, Texas, California or Hawaii. List the customers alphabetically by state, and alphabetically by customer within each state: i.e. sort the results first by STATE, and within a state by CUSTOMER_NAME:
  • SELECT CUSTOMER_NAME, CITY, STATE
  • FROM CUSTOMER_V
  • WHERE STATE IN (‘FL’, ‘TX’, ‘CA’, ‘HI’)
  • ORDERBY STATE, CUSTOMER_NAME;
slide41
Sorting Results: Group by: groups rows in intermediate results table, values in those rows same for one or more columns
  • Useful paired with aggregates such as SUM or COUNT
  • Divides table into subsets (by groups). Then an aggregate function can be used to provide summary information
  • Scalar aggregate: a single value returned from SQL query using an aggregate function
  • Vector aggregate: multiple values returned from SQL query with aggregate function (via GROUP BY)
  • e.g. count number of customers with addresses in each state

SELECT STATE, COUNT(STATE)

FROM CUSTOMER_V

GROUP BY STATE;

group by
GROUP BY

It is also possible to nest groups within groups, the same logic is used as when sorting multiple items

e.g. count the number of customers with addresses in each city to which we deliver. List the cities by state:

SELECT STATE, CITY, COUNT(CITY)

FROM CUSTOMER_V

GROUP BY STATE, CITY;

In general, each column referenced in the SELECT statement must be referenced in the GROUP BY clause, unless the column is an argument for an aggregate function included in the SELECT clause

having
HAVING
  • The HAVING clause acts like a WHERE clause, but it identifies groups that meet a criterion, rather than rows. Therefore we usually see a HAVING clause following a GROUP BY
  • e.g. find only states with more than one customer:

SELECT STATE, COUNT(STATE)

FROM CUSTOMER_V

GROUP BY STATE

HAVING COUNT(STATE) > 1;

Using WHERE here would not work, because WHERE does not allow aggregates - also WHERE qualifies rows, whereas HAVING qualifies groups

using all 6 clauses
Using all 6 clauses

e.g. list the product finish and average standard price for each finish for selected finishes where the average standard price is less than $750:

SELECT PRODUCT_FINISH, AVG(STANDARD_PRICE)

FROM PRODUCT_V WHERE PRODUCT_FINISH IN(‘Cherry’, ‘Natural Ash’, ‘Natural Maple’, ‘White Ash’)

GROUP BY PRODUCT_FINISH

HAVING AVG(STANDARD_PRICE) < 750

ORDER BY PRODUCT_FINISH;

using all 6 clauses45
Using all 6 clauses

The following Fig. Shows the order in which SQL processes the clauses of a statement

Arrows indicate the paths that may or may not be followed

Only the SELECT and FROM clauses are mandatory

The processing order is different from the syntax order

As each clause is processed an intermediate results table is produced that will be used for the next clause (users do not see the intermediate tables, only the final results)

ad