chapter 6 relational data modeling concepts principles approaches
Download
Skip this Video
Download Presentation
Chapter 6: Relational Data Modeling -- CONCEPTS, PRINCIPLES & APPROACHES--

Loading in 2 Seconds...

play fullscreen
1 / 146

Chapter 6: Relational Data Modeling -- CONCEPTS, PRINCIPLES & APPROACHES-- - PowerPoint PPT Presentation


  • 203 Views
  • Uploaded on

Chapter 6: Relational Data Modeling -- CONCEPTS, PRINCIPLES & APPROACHES--. Data Warehouse Fundamentals. Paul Chen. www.cs522.com (Please reference white papers on Data Modeling at Seattle U teaching materials website). Topics. Levels of Modeling

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'Chapter 6: Relational Data Modeling -- CONCEPTS, PRINCIPLES & APPROACHES--' - julie


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
chapter 6 relational data modeling concepts principles approaches
Chapter 6: Relational Data Modeling-- CONCEPTS, PRINCIPLES & APPROACHES--

Data Warehouse Fundamentals

Paul Chen

www.cs522.com (Please reference white papers on Data Modeling

at Seattle U teaching materials website)

topics
Topics
  • Levels of Modeling
  • Relational Data Modeling—What is it? Types of Models and Pre-Modeling Activities
  • Understand Terms and Terminology –Tool Demonstration
  • Conceptual Data Modeling: What, Why, When, Who? Activity Description
  • Logical Data Modeling: What, Why, When, Who? Activity Description
  • Physical Data Modeling- An Overview

7 Prototypingand RAD

databases modeling
Databases & Modeling

Databases & Modeling

Type of

Database

New

Trend

Constructs

Characteristics

Relational

Database

ERD & EER

Row/

Column

Dimensional

Modeling

OLAP

DW

Multi-Dimensional

Database

Cube

Distributed

Component

Object Model

Distributed

Database

Client

Object

(DCOM)

XML

UML

Object-Oriented

Database

Object

Class

Diagram

Object = Data + Operations(Services);

Entity = Data only

topic 1 level of modeling
Topic 1: Level of Modeling

Descriptive: The dealer sold 200 cars last month.

Primarily Two Dimensional

Database System

Operational

(OLTP)

Explanatory: For every increase in 1 % in the interest,

auto sales decrease by 5 %.

Star Schema Cube

Traditional DW

(OLAP)

Predictive: Predictions about future buyer behavior.

Data Mining

Cube + sophisticated

analytical

tools

level of analytical processing
Level of Analytical Processing

Explanatory

“WHAT IF”

PROCESSING

ANALYZE WHAT

HAS PREVIOUSLY

OCCURRED TO

BRING ABOUT THE

CURRENT STATE

OF THE DATA

Predictive

Descriptive

SIMPLE QUERIES

& REPORTS

DETERMINE IF

ANY PATTERNS

EXIST BY REVIEWING

DATA RELATIONSHIPS

Statistical Analysis/Expert System/

Artificial Intelligence

Normalized

Tables

+

Dimensional

Tables

Classification & Value Prediction

Roll-up; Drill Down

Query

descriptive modeling
DESCRIPTIVE MODELING

Relational Data Modeling using ER Diagram

  • Conceptual Data Model (Analysis - Requirements Gathering; What’s it?)
  • Logical Data Model (Design-How is it?)
  • Physical Data Model (Implementation)
explanatory modeling
EXPLANATORYMODELING
  • Also calledDimensional Modelling (to be discussed in chapter 7)
  • Ways to derive the database component of a data warehouse
  • Every dimensional model (DM) is composed of one table with a composite primary key, called the fact table, and a set of smaller tables called dimension tables.
predictive modeling to be discussed in chapter 10
PREDICTIVE MODELING (to be discussed in chapter 10)
  • Similar to the human learning experience
    • Uses observations to form a model of the important characteristics of some phenomenon.
  • Uses generalizations of ‘real world’ and ability to fit new data into a general framework.
  • Can analyze a database to determine essential characteristics (model) about the data set.
topic 2 relational data modeling what s it
Topic 2: Relational Data Modeling-What’s it?

A data model is a collection of constructs, business

rules and sample data which together supports a

dynamic representation of real world objects and

events.

  • Constructs: entity-relationship diagrams (conceptual & logical or functional) and tables
  • Business rules: constraints such as referential integrity rules and operators (add, update, delete)
  • Sample Data for verification and prototyping: Verifying the accuracy of the model.
objectives
Objectives
  • Testing the real system before building it.
  • Assisting in understanding an organization’s data requirements.
  • Facilitating physical data base design.
types of models
Types of Models
  • Conceptual Data Model (Analysis - Requirements Gathering; What’s it?)
  • Logical Data Model (Design-How is it?)
  • Physical Data Model (Implementation)
pre modeling activities
Pre-modeling Activities
  • Data collection phase
  • Facilitation techniques (for ex. JAD session)
  • Roles and responsibilities
  • Tools and repository
  • Naming standards
  • Modeling convention (What methods to use)
  • Data protection/backup and recovery procedures
data collection phase
Data Collection Phase *
  • Sampling and Investigating Hard Data
  • The Needs for Sampling:

Containing costs; Speeding up the data gathering;

Improving effectiveness; Reducing bias

  • Sampling Design -Four steps:

1. Determine the data to be collected or described

2. Determine the population to be sampled

3. Choose the type of sample

4. Decide on the sample size

* Please review chapter 5

kinds of information sought in investigation
Kinds of Information Sought in Investigation

Types of hard Data (other than interviewing and

Observation)

  • Memos
  • Signs on bulletins boards or in work areas
  • Corporate Web sites
  • Manuals
  • Policy handbooks
  • Record layout
retina scan
Retina Scan

“That recent Tom Cruise movie, Minority Report, shows

advertising that targets each individual consumer as they pass by the signage. That’s the extreme, but I can

see it going that way,” said St. Denis.

five steps in interview preparation
Five Steps in Interview Preparation
  • Reading background material
  • Establishing interview objectives
  • Deciding when to interview
  • Preparing the interviewee
  • Deciding on question type and structure
two types of questions open end questions vs closed questions
Two Types of Questions – Open-End Questions vs. Closed Questions

Closed interview questions

Such as “ How many subordinates do you have?

Benefits:

  • Getting to relevant data
  • Keeping control over the interview

Drawbacks:

  • Failing to obtain rich detail
  • Intimidating the interviewee
three basic ways of structuring interviews
Three Basic ways of Structuring Interviews
  • Pyramid Structure: Starting from closed questions, then gradually expand into open territory.
  • Funnel Structure: The reverse of pyramid structure approach.
  • Diamond-Shaped: A combination of the two above structures.
jad joint application development
JAD (Joint Application Development)
  • JAD sessions (also called facilitated session) are used to gather information and feedback and confirm the results of requirements gathering.
  • JAD sessions replace the traditional way of conducting a series of interviews on a one-to-one basis with the users.

Advantages: Achieving consensus during the session when

multiple sources of information exist, raising and

addressing issues or assigning them for resolution, and

immediately confirming information.

topic 3 understand terms and terminology
Topic 3: Understand Terms and Terminology
  • Independent entity
  • Dependent entity
  • Associative Entity
  • Identifying relationship
  • Non-identifying relationship
understand terms and terminology
Understand Terms and Terminology
  • Identifier

An attribute distinctly identifies each occurrence of an entity.

For ex., bank account Id. , and student Id.

  • Association (relationships)

An association is a relationship between two or more entities.

Employee works for company

Part has item

understand terms and terminology1
Understand Terms and Terminology
  • Cardinality(the form of relationship)

Associations occur in there forms:

one-to-one; one-to-many; many-to-many

  • Tables

A table is a two-dimensional representation of data consisting of columns and rows.

  • Primary Key

Used to identify entities.

Unique identification for a row in a table.

Allow no nulls and no duplicates.

May be system assigned.

understand terms and terminology2
Understand Terms and Terminology
  • Foreign key

A foreign key is one or more data elements whose value is based on the primary identifier of another entity, thus allowing the system to ‘join’ and get related information from other entities. The ‘joining’ of different entities in this manner eliminates the need of data repetition and redundancy.

  • Normalization

A technique to make sure that the data in a logical model is defined once and only once. Normalization helps minimum data redundancy, and minimize update abnormalities.

topic 4 conceptual data modeling what
Topic 4: Conceptual Data Modeling: What?

What is it?

1.  It is a conceptual representation of data without concern for its logical (functional) or physical aspects.

2. It is a set of high-level business data models which provides a framework for the data modeling activities at the next level.

conceptual data modeling when
Conceptual Data Modeling: When?

When should it be done?

1. In support of the data requirements of a process model under development at the corresponding level.

or

2. Outside the system application lifecycle on a department, division, or company wide basis.

conceptual data modeling who
Conceptual Data Modeling: Who?

Who should do it?

1. The group responsible for assuring that data structure reflects business policies and rules.

2. It should be a joint effort between the owners and custodians of the data, the users of the data, and the analysts.

conceptual data modeling why
Conceptual Data Modeling: Why?
  • Documents the type of data (information) which must be represented in a system independent of specific application, organizations, or technology.
  • Maximizes data sharing; minimizing redundancy.
  • Provides foundations for physical database design.
  • Describes the unique business enterprise specifically.
conceptual data modeling why1
Conceptual Data Modeling: Why?
  • Outside of application life cycle on a company-wide basis.
  • Data modeling expresses inherent associations which are the most part, independent of anyone one application.
  • Data entities change very little even through the way they are used can change for each application.
  • A complete maintained conceptual data model should shorten the requirements definition phases of system development life cycle.
data modeling approach
Data Modeling: Approach
  • Data partitioning

Use a top-down approach to define the data requirements of a system. The purpose is to divide and conquer (from subject to entity), and to evolve from the conceptual level to logical level until physical database is derived.

  • Standard deliverables

For each of the levels, there is a set of standard deliverables that must be produced. The documentation items must be well defined so that the data at each level is well understood.

data partitioning via modeling
Data Partitioning Via Modeling

(How)

(What, Why, Who, Where)

Subjects

Conceptual Level

Technical considerations

Entities

Relationships

Logical Level

Data Elements

Frequencies

Physical

Level

Data Definition Language -DDL(create, Alter, drop tables)

Data Manipulation Language (select, insert, delete, update)

conceptual data modeling activity description
Conceptual Data Modeling --Activity Description

1. Define the system boundary – Data Context Diagram.

2. Partition a subject into entities.

3. Discover entities -super-type and subtype; part and

whole.

4. Define associations between entities.

5. Define major attributes.

6. Define unique identifier for the entity.

7. Assign cardinalities and set up relationships between

the entities.

8. Validate the model with the users.

1 define the system boundary by subject context diagram
1. Define the System Boundary by Subject Context Diagram
  • A subject (a group of entities with strong affinity) is a

class of data objects representing the mission and

resources of the organization.

  • “Subjects” provides mechanisms for controlling how

much of a model a reader (user, analyst, manager) is able

to consider and comprehend at a time. For a small system,

go directly to define entities.

For ex: Library (subject) decomposed into book, member,

and account. (entities).

atm erd subject context level
ATM ERD –Subject Context Level

Customer

ATM

uses

Has

Owns

Bank

Consortium

Account

Affiliated

Bank

Holds

Consists

of

data subject context diagram
Data (Subject) Context Diagram
  • A Data context diagram is a special case of ERD in which a single diagram represents the problem domain in terms of data requirements.

For example: The ATM diagram illustrates and highlights

Several important characteristics of the system:

  • The people and organization with which the

system communicates.

  • It documents the significant connections (relationships) between the data objects within as well as outside the problem domain.
2 partition a subject into entities
2. Partition a Subject into Entities

Criteria for partitioning a subject into entities

are:

1. The entities included in a subject all tend to describe the subject and have a strong affinity with the subject.

2. The entities of a subject should be of equal importance, as measured by the range, complexities or importance of their data.

3. Each entity should belong to only one subject.

3 discover entities
3. Discover Entities
  • Identifying data objects via business event analysis

An individual stimulus from one data object to another is an event. Events are discovered by investigating the external influences that act upon a system, and the data transformation that occurs within a system that converts inputs into outputs.Thus, source and target data objects that interact with an event can be identified. For example, an event “customer buys a product” as depicted below is initiated by the data object “customer” who requires a response from the data object “sale”, “product”, and “payment”.

slide37
Customer

Sale

Customer buys

A Product

Payment

Product

types of entities
Types of Entities

To find potential entities (entities are nouns), look for:

  • Objects can be generalized or specialized
  • The entities of a subject should be of equal importance, as measured by the range, complexities or importance of their data.
  • Each entity should belong to only one subject
subtype and super type
Subtype and Super-type

Faculty

Full Time

Faculty

Part Time

Faculty

Part (day)

Time

Faculty

Part (Night)Time

Faculty

generalization specialization
Generalization & Specialization

Aircraft

Specialization

Generalization

Commercial

Military

747

777

B52

B-1B

4 define associations between entities
4. Define AssociationsBetween Entities

Use a verb to describe associations between two or

more entities that the user wants to keep track of. Each

relationship must be specific as possible so that its

meaning is clear.

An individual owns a building.

Owns: possession, rental; or management?

5 define major attributes
5. Define Major Attributes

All the attributes of an entity must have meaning for each

and every one of the occurrence of the entity. Only

elementary data are included in the model. Attributes

resulting from process algorithms should not be included

in the model.

For ex. Entity ‘individual’ has attributes such as name,

address; sex (male or female); no. of dependents, etc. But

Derived attributes (such as percentage) are not

attributes.

6 define unique identifier for the entity
6. Define Unique Identifier for the Entity

This is an attribute that unambiguously identifies each

occurrence of each entity. Some entities do no have their

own identifier. These entities are qualified as dependent

or week entities. The identifier of the entity to which the

dependent entities are associated with must be used to

uniquely identify their occurrences.

For ex., a ‘child’ entity must have his or her parent

identifier to be uniquely identified.

7 assign cardinalities set up relationships between the entities
7. Assign Cardinalities & Set Up Relationships Between the Entities

Cardinality is the minimum and maximum number of

times an occurrence of an entity occurs in relationship

to another entity.

The minimum number: 0 or 1

The maximum number: 1 or M

There are three types of associations:

One-to-one; one-to-many; many to many.

types of relationships by degree of attributes the relation contains
Types of Relationships By Degree (# of Attributes the Relation Contains)

The relationships (representing business rules)

could be either as binary, recursive, or ternary.

Doctor

Patient

Binary

Part

Order

Ternary

Part/Order

Recursive

Organization

recursive association
Recursive Association

A recursive association is one in which there is a relationship between

An entity and itself.

Information Dev

Accounting

Engineering

Payable

Product

Receivable

Facility

Nuclear

Coal

8 validate the model with the users
8. Validate the Model with the Users
  • The approach requires that the users be involved at the outset of the data modeling activity until the end of its implementation. They work side-by-side with system developers, providing input and validating the accuracy of the data requirements. This will ensure that the delivered end-product meets the users’ need.
validate the model with the users
Validate the Model with the Users
  • To identify and document the integrity constraints given in the user’s view of the enterprise. This includes identifying:
    • Required data
    • Referential integrity
    • Attribute domain constraints
    • Enterprise constraints
    • Entity integrity
validate the model with these check points
Validate the Model With These Check Points
  • Attribute allocation
  • Rules followed
  • Cardinality necessity
  • Relationship necessity
  • Achievement of organization goals
  • Contributions to expected benefits
topic 5 logical data modeling what why when who and activity description
Topic 5: Logical Data Modeling: What, Why, When, Who? And Activity Description

What is it?

  • It is a representation of data required to support the complete business needs for a particular business area, system or project.
  • It is a set of data models that provides a framework for the physical database construction activities.
  • It is a graphical representation of data objects that shows the relationship between the tables, views and functional core services used by modules in the application system.
logical data modeling what why when who
Logical Data Modeling: What, Why, When, Who?

Why do it?

  • Document the type of data which must be represented in a system with regard to specific system applications, organizations, or technologies.
  • Assist in the orderly creation of a physical database design.
  • Specifically describe the unique business enterprise.
  • Accelerate and clarify communications between the functional analysis and DBA’s.
logical data modeling what why when who1
Logical Data Modeling: What, Why, When, Who?

When should it be done?

  • Part of the system application lifecycle.
  • In parallel with process modeling activities.
  • Upon the completion of the conceptual data model to produce a first-cut database design that includes definitions of tables, columns, and constraints.
logical data modeling what why when who2
Logical Data Modeling: What, Why, When, Who?

Who should do it?

  • The group responsible for ensuring that data structure reflects business data requirements.
  • It should be a joint effort between the functional analysts and data administrators.
logical data modeling what why when who3
Logical Data Modeling: What, Why, When, Who?

Benefits

  • Provide a definition of the data architecture of how the target system will be implemented.
  • Model parts of the database schema that show how data structures are related to the processes.
  • Provide program designers with the detail for the part of the database design that their modules use.
logical data modeling activity description
Logical Data Modeling: Activity Description

1. Define Data Architectural Standards

2. Position the conceptual data modeling and

revise the definitions of the entities.

3. Define integrity rules for entities and

relationships and Apply normalization rules to

each entity.

4. Complete and standardize the data elements.

5.Package the model for physical data modeling and system construction.

6. Evaluate quality of data for conversion.

7. Validate and Verify the Data Model

1 define data architectural standards
1. Define Data Architectural Standards
  • Data Access/retrieval guidelines

Naming convention; SQL coding standards;

  • Data integrity (package; trigger;commit; rollback)

Error handling; record locking rule; update collision.

  • Data Security

Data access rule; data separation rule;

Data recovery/backup

  • Data base refresh/performance tuning
2 position the conceptual data modeling and revise the definitions of the entities
2. Position the Conceptual Data Modeling and Revise the Definitions of the Entities
  • Position the conceptual data modeling and revise the definitions of the entities. By taking architectural standards into consideration, the complete CDM is reexamined. As a result, new entities and relationships my be discovered.
  • Partition data into entities at the table level.
continued
Continued
  • Map Local Conceptual Data Model to Local Logical Data Model
    • To refine the local conceptual data model to remove undesirable features and to map this model to a local logical data model. This involves:
      • (1) Remove M:N relationships.
      • (2) Remove complex relationships.
      • (3) Remove recursive relationships.
      • (4) Remove relationships with attributes.
      • (5) Remove multi-valued attributes.
      • (6) Re-examine 1:1 relationships.
      • (7) Remove redundant relationships.
3 define integrity rules and apply normalization rules
3. Define Integrity Rules and Apply Normalization Rules
  • Integrity rules for entities indicate the context in which an

entity occurrence may be created, modified, or deleted.

They also ensure that the entity is consistent with other

entities. This is accomplished by placing referential attributes in each appropriate entity on the model.

For example, a Client (entity) holds an Account (entity). A client cannot be deleted if at least one of his accounts has a balance greater than 0.

  • Apply normalization rules to each entity.
formalizing a one to one relationship with referential attribute
Formalizing a One-to-one Relationship with Referential Attribute

Husband

Wife

*Husband name

Other attributes

*Wife name

Other attributes

Husband name

Married

to

Referential Attribute

formalizing a one to many relationship with referential attribute
Formalizing a One-to-Many Relationship with Referential Attribute

Dog

Dog Owner

(1:M)

(1:1)

* Dog Id

Other attributes

Dog Owner Id

* Dog Owner

Dog Owner Id

Other attributes

Referential Attribute

formalizing a many to many relationship with referential attribute
Formalizing a Many-to-Many Relationship with Referential Attribute

Part

Order

*Part Id

Other attributes

*Order No

Other attributes

Order/Part

*Order No

*Part Id

Other attributes

An associative entity may

Participate in relationship

With other entity.

Referential Attributes

a many to many relationship
A Many-to-Many Relationship

A many-to-many relationships will result in the

creation of a new entity.

Order

Order #

Part

Part #

1:M

1:M

Part/Order

Part #/Order #

referential integrity
Referential Integrity

Three options:

  • Restrict: A primary key can not be deleted if there are any dependent foreign key rows.
  • Cascade: Deleting a primary key row causes the deletion of all dependent foreign key rows.
  • Set Null: Deleting a primary key row causes all dependent foreign keys values to be set null.
apply normalization rules
Apply Normalization Rules

A technique to make sure the data in a logical data

models is defined once and only once. Normalization

helps minimum data redundancy, and minimize

update abnormalities. Three forms:

  • First Normal Form
  • Second Normal Form
  • Third Normal Form
normalization
Normalization
  • First Normal Form: Relationships between primary key and each attribute must be one-to-one; ie., remove repeating group.
  • Second Normal Form: All non-key elements are dependent upon the entire primary key rather than any part thereof.
  • Third Normal Form: Elimination of the dependence of non-key field upon any other field excepts the primary keys.
pk primary key fk foreign key nn no null nd no duplicate
OrderNo (PK)

part-no

Qty

part-name

PK +

PK

NN

ND

FK

FK

1

1

123

Nut

3

5

123

Bolt

PK: Primary KeyFK: Foreign KeyNN: No NullND: No duplicate

Order

Part

Relationship

Order/Part

first normal form
First Normal Form

Item Table

Qty-Store-2

Qty-Store-3

Qty-Store-1

Item No

PK

3000

4000

5000

101

The above is an violation of first normal form

because there exists a repeated group.

rule number 1
Rule Number 1
  • For each occurrence of an entity, there is only one and only one value for each its attributes. Attributes with repeating values form at least one new entity.
  • N other words, relationship between primary key and each attribute must be one-to-one.
possible solution
Possible Solution

Store

Store/Item

Store

ID

Store

ID

Item-

No

Qty

Sold

+

PK

PK

FK

FK

S1

S1

3000

101

S2

S2

102

4000

second normal form
Second Normal Form

Student/Course

Course

Name

Course

No

Student

No

Teacher

code

Grade

PK

+

FK

FK

FK

3.0

Math

ST01

100

T2

Lee

ST02

4.0

T1

CS

Doe

200

Both course name and student name should be removed because

They are not related to the entire student/course primary key.

possible solution1
Possible Solution

Student No

Course

Name

Student

Name

Student

Course No

Student/Course

rule number 2
Rule Number 2
  • Each attribute must be related to the entire primary key.
second normal process
Second Normal Process

Order

Part

Part

Name

Order

No

Pt-price

PartNo

Order-Dt

PK

PK

1/2/01

Nut

1

1

1.5

1/3/01

5

Bolts

2.0

3

Order/Part

Order

No

Partno

QTY

How about

Putting PartName

In Order/part

Table?

PK

+

1

123

1

1

5

3

123

third normal form
Third Normal Form

COURSE

Course

Id

Teacher

Code

Course

Name

Dept

Name

Teacher

Name

Dept -Id

PK

T1

DOE

MH400

Math

Math

A1

CS

DB

CS401

T2

CS

Lee

The relationship between any two non-primary key components

must not be one-to-one. What’s wrong with the above?

rule number 3
Rule Number 3
  • The relationship between any two non-primary key components must not be one-t-one; ie., remove tables within tables.
the normal process
The Normal Process

Order

Customer

Cust-Name

Order

ID

Order

DT

Cust-Id

Cust-Id

PK

PK

FK

1

Lee

1

1/2/ 01

1

Sato

1/5/21

3

3

5

It would be a violation of third normal form to place cust-name in the order table.

slide84
Why

Reasons:

  • One-to-one relationship between two non-primary key columns (Cus-Id and Cust-name).
  • Redundancy
  • An update anomaly (when a customer name was changed)
  • Worse yet when a new name was added (the name could not be stored until the customer placed at least one order)
identify integrity constraints
Identify Integrity Constraints
  • To identify and document the integrity constraints given in the user’s view of the enterprise. This includes identifying:
    • Required data
    • Referential integrity
    • Attribute domain constraints
    • Enterprise constraints
    • Entity integrity
4 complete and standardize the data elements
4. Complete and Standardize the Data Elements

As a result of the modeling process via the preceding

procedure, an information model will emerge.

The information model can be used as input (for ex., via

Erwin Tool) to generate a data definition language (DDL)

which in turn is used as input for physical data model.

The information model (also called logical data model)

fulfills the data requirements of the system.

complete and standardize the data elements continued
Complete and Standardize the Data Elements (Continued)

For each entity, identify the associated list of attributes. For each element, specify the following:

  • Permitted value
  • Coding and editing rules
  • Dimensions
  • Length
  • Value
  • Frequency
5 package the model for physical data modeling and system construction
5. Package the model for physical data modeling and system construction.

To do this, one must have:

  • A normalized entity relationship diagram.
  • A description of table and column definitions.
  • A description of data architecture standards.
6 evaluate quality of data for conversion
6. Evaluate Quality of Data For Conversion

If the data modeling is part of re-engineering efforts,

we must also document:

  • Condition of the data of the existing system
  • Impacts on the new and enhanced system.
  • Conversion rules
7 validate and verify the data model
7. Validate and Verify the Data Model
  • Validation (dynamic):

Prototyping is used to validate and refine the

model.

  • Verification (static): Inspection or walk-through

Inspections for entity necessity, relationship necessity, and attribute allocation.

7 validate and verify the data model continued
7 Validate and Verify the Data Model (continued)
  • Validate Model against User Transactions

To ensure that the logical data model supports the transactions that are required by the user view. (Prototyping is a good tool)

  • Draw Entity-Relationship Diagram

To draw an Entity-Relationship (ER) diagram that is a logical representation of the data given in the user’s view of the enterprise.

validate model against user transaction
Validate Model Against User Transaction
  • Example transactions

(a) Insert details for new members of staff.

(b) Delete details of a member of staff, given the staff number.

topic 6 physical data modeling an overview
Topic 6: Physical Data Modeling-An Overview
  • Step 1

Translate global logical data model for target DBMS

  • Step 2

Design physical representation

  • Step 3

Design security mechanisms

step 1 translate global logical data model for target dbms
Step 1: Translate global logical data model for target DBMS
  • To produce a basic working relational database schema from the global logical data model
  • Design base relations for target DBMS
    • To decide how to represent the base relations we have identified in the global logical data model in the target DBMS.
  • Design enterprise constraints for target DBMS
    • To design the enterprise constraints for the target DBMS.
step 2 design physical representation
Step 2 : Design physical representation

To determine the file organizations and access methods that will be used to store the base relations; that is, the way in which relations and tuples will be held on secondary storage.

  • 2.1 Analyze transactions
  • 2.2 Choose file organizations
  • 2.3 Choose secondary indexes
  • 2.4 Consider the introduction of controlled redundancy
  • 2.4 Estimate disk space requirements
step 2 design physical representation continued
Step 2 : Design physical representation (Continued)
  • 2.1 Analyze transactions
    • To understand the functionality of the transactions that will run on the database and to analyze the important transactions.
  • 2.2 Choose file organizations
    • To determine an efficient file organization for each base relation.
analyze transactions
Analyze Transactions
  • For each Transaction associated with the components of the data model (usually predefined queries including view, trigger, procedure, function and package), it needs to be broken down into further smaller units of work:
transactions analysis continued
Transactions Analysis (continued)
  • A.    Transformation Rules: Describe the rules (R,U,I, D) or algorithms used to transform data received into data generated.
  • B.    Edit and Error Rules: Define the rules validating data received and the method of processing erroneous data.
  • C.    Sequence Analysis: Describe under what conditions this transaction is performed and what rules determine which transaction will be performed next.
transactions analysis continued1
Transactions Analysis (continued)
  • D.Audit Rules: Describe the rules required to audit the activity performed within this transaction.
  • E.Security Rules: Define the security required to invoke the transaction or various facets of the transaction.
transactions analysis continued2
Transactions Analysis (continued)
  • F.Frequency of execution: Define the number of times this transaction is performed in a fixed period of time.
  • G.Type of transaction mode: Describe whether the transaction is batch, on demand, or interactive.
example sample transactions
Example - Sample Transactions

(A) Insert details for a new member of staff, given the branch address.

(B) List rental properties handled by each staff member at a given branch address.

(C) Assign a rental property to a member of staff, checking that a staff member does not manage more than 10 properties already.

(D) List rental properties handled by each branch office.

step 2 design physical representation continued1
Step 2 Design Physical Representation (continued)
  • 2.3 Choose secondary indexes
    • To determine whether adding secondary indexes will improve the performance of the system.
  • 2.4 Consider the introduction of controlled redundancy
    • To determine whether introducing redundancy in a controlled manner by relaxing the normalization rules will improve the performance of the system.
step 2 3 choose secondary indexes
Step 2.3 Choose secondary indexes

Data File: The file contains the logical record.

Index File: The file contains the index file.

  • The values in the index file are ordered per the indexing field which is usually based on a single attribute.
indexes
Indexes
  • Primary index: The indexing field is guaranteed to have a unique value.
  • Secondary Index: An index that is defined on a non-ordering field of of the data.
  • Clustering index: If the index field is not a key field of the file, so that there can be more than one record corresponding to a value of the indexing field.
step 2 4 consider the introduction of controlled redundancy
Step 2.4 Consider the introduction of controlled redundancy
  • Simplified Relation with Derived Attribute
  • Duplicating Attribute
  • Setting up Lookup Table
  • Duplicating Foreign Key
step 2 design physical representation continued2
Step 2 Design Physical Representation (Continued)
  • 2.5 Estimate disk space requirements
    • To estimate the amount of disk space that will be required by the database.
step 3 design security mechanisms
Step 3 Design Security Mechanisms
  • 3.1 Design user views
    • To design the user views that were identified in Step 1 of the conceptual database design methodology.
  • 3.2 Design access rules
    • To design the access rules to the base relations and user views.
use hotel case for illustration
Use Hotel Case for illustration

Guest

Hotel_no

Guest_no

Date_from

Date_to

Room_no

Registration

Guest_no

Guest_name

Guest_address

Hotel

Room

Hotel_No

Hotel_name

City

Room_no

Hotel_no

Type

Price

1:1

1:M

Identifying Relationship

Dependent Entity (Attribute Entity)

data partitioning using hotel as a case study
Data Partitioning –using Hotel as a case study

(How)

(What, Why, Who, Where)

(Hotel)

Subjects

Conceptual Level

Technical considerations

Entities

Relationships

Logical Level

Data Elements

Frequencies

Data Definition Language -DDL(create, Alter, drop tables)

Data Manipulation Language (select, insert, delete, update)

conceptual data modeling breaking the subject hotel into several entities
Conceptual Data Modeling(Breaking the Subject Hotel into several entities.

Guest

Guest_no

1:M

Books

1:M

Hotel

Room

Hotel_No

Room_no

Has

1:1

1:M

Identifying Relationship

Dependent Entity (Attribute Entity)

logical data modeling
Logical Data Modeling

Booking

Guest

Hotel_no

Guest_no

Date_from

Date_to

Room_no

Guest_no

Guest_name

Guest_address

Hotel

Room

Hotel_No

Hotel_name

City

Room_no

Hotel_no

Type

Price

1:1

1:M

Identifying Relationship

Dependent Entity (Attribute Entity)

resolving referential attributes normalization
Resolving Referential Attributes & Normalization

(such as TV, Bed)

Item

Qty_ Hotel _no -2

Qty_ Hotel _no -3

Qty_Hotel

_no-1

Item No

PK

6

9

14

101

The above is an violation of first normal form because there

exists a repeated group.

Relationships between primary key and each attribute must be one-to-one.

possible solution2
Possible Solution

Hotel

Hotel/Item

Hotel

name

Hotel

ID

Hotel

ID

Item-

No

Qty

+

PK

PK

Min-nan

FK

FK

H1

H1

6

101(TV)

H2

Xiamen

H2

102

5

second normal form1
Second Normal Form

Room/Hotel

Hotel

No

Price

Room

No

Hotel name

Type

100

101

Xiamen

4

double

Hotel Name should be removed because

it is not related to the entire room/hotel primary key.

What happens if one of the hotel names is being changed?

third normal form1
Third Normal Form
  • The relationship between any two non-primary key components must not be one-t-one; ie., remove tables within tables.
third normal form2
Third Normal Form

City

Hotel/City

City

ID

City

name

Hotel

ID

Hotel -

name

City

Id

PK

PK

Las

Vegas

H0

C1

Circus

C1

H1

C1

Flemingo

C2

Seattle

H2

Holiday

C2

What happens if a new City name is added to the Hotel/City Table?

physical data modeling
Physical Data Modeling
  • Data Definition Language -DDL(create, Alter, drop tables)
  • Data Manipulation Language (select, insert, delete, update)
data manipulation language
Data Manipulation Language
  • SELECT DISTINCT COUNT(Guest_No)
  • FROM Booking
  • WHERE Date_From > 08/01/2000 AND < 08/31/2000);
data definition language ddl
Data Definition Language -DDL
  • CREATE TABLE hotel (
  • hotel_no char(18) NOT NULL,
  • hotel_name char(18) NULL
  • )

Note: Primary key is not null.

topic 7 prototyping
Topic 7: Prototyping

Purposes: To validate and refine the model of a system.

Characteristics: Prototyping is a discipline of

interactive experimentation to stimulate user

feedback.

Approach: A prototype is a materialization of

modeling process by building rapidly and simulating

the essential aspects of the system.

software modeling prototyping with built in testing validation project evaluation control
Software Modeling/Prototyping WithBuilt-in Testing Validation/Project Evaluation & Control

Target System

Existing System

Physical Model

Physical Model

Built-In Testing

Validation

Prototyping

Project Evaluation/Control

Logical Model

Logical Model

Approach: Structured or Spiral or Iterative Approach

Rapid Application Development

CASE tools; Deliverable Templates

types of prototyping
Types of Prototyping
  • Nonworking Scale Prototyping (Mock-up)
  • Straw man (exploratory) prototyping (scale model approach)
  • First Full-Scale Prototyping (Real world model)
types of prototyping1
Types of Prototyping
  • Evaluative (mock-up) prototyping (Blue print approach)

Use: Visualization; demonstration; inspection

Advantages. Reduce risk; low cost; resolve uncertainty

Early; fast; flexible

How: Focuses on a relatively small number of questions to avoid becoming too complex.

Generally thought of a throw-away.

types of prototyping2
Types of Prototyping
  • Straw man (exploratory) prototyping (scale model approach)

Use: Experiment; validating and refining the conceptual as well as logical data model.

Advantages. Used to improve design; discover things about a proposed system that would otherwise not be revealed.

How: Evaluate the impact on work flows; validate ease of use.

types of prototyping3
Types of Prototyping
  • Evolution prototyping (real world model)

Use: Production use.

Advantages. Part of production exercise.

How: Explore functionality of subsystem and system; probe technical feasibility the performance and the integrity of the system.

stages of prototyping
Stages of Prototyping
  • Planning
  • Initial analysis and design
  • Construction
  • Tryout
  • Evaluation
  • Disposal
dimensions of prototyping
Dimensions of Prototyping

Relationships of any prototyping to its eventual

system are characterized along four dimensions:

  • Focus – for example, a prototype may focus on the functionality, user interface, system integration, reliability, and performance.
  • Scope– Is a measure of how much of the eventual system the prototype represents.
dimensions of prototyping1
Dimensions of Prototyping
  • Depth – is a measure of how deeply it represents the behavior of the eventual system. For ex., a shallow prototype of a message system might display ‘canned’ messages, where a deeper prototype might actually perform communications to provide a more realistic surrogate for the eventual system.
  • Scale– Is a measure of how its size or performance compares with that of the eventual system.
use hotel case for illustration1
Use Hotel case for illustration

Guest

Hotel_no

Guest_no

Date_from

Date_to

Room_no

Guest_no

Guest_name

Guest_address

Hotel

Room

Hotel_No

Hotel_name

City

Room_no

Hotel_no

Type

Price

1:1

1:M

Identifying Relationship

Dependent Entity (Attribute Entity)

guest registration
Guest Registration

Guest Registration

prototyping for hotel case
Prototyping for Hotel Case
  • Straw man (exploratory) prototyping (scale model approach)-use Microsoft Access to build a form to simulate the registration of guests.
  • Straw man (exploratory) prototyping (scale model approach)- build a small set of tables to accept the data.
  • Evolution prototyping (real world model) –actually building a mini-system with real tables to simulate the system.
rapid application development rad
Rapid Application Development (RAD)
  • RAD, or rapid application development, is an object-oriented approach to systems development that includes a method of development as well as software tools
rad phases
RAD Phases
  • There are three broad phases to RAD:
    • Requirements planning
    • RAD design workshop
    • Implementation
requirements planning phase
Requirements Planning Phase
  • Users and analysts meet to identify objectives of the application or system
  • Oriented toward solving business problems
rad design workshop
RAD Design Workshop
  • Design and refine phase
  • Use group decision support systems to help users agree on designs
  • Programmers and analysts can build and show visual representations of the designs and workflow to users
  • Users respond to actual working prototypes
  • Analysts refine designed modules based on user responses
implementation phase
Implementation Phase
  • As the systems are built and refined, the new systems or partial systems are tested and introduced to the organization
  • When creating new systems, there is no need to run old systems in parallel
rad and the sdlc
RAD and the SDLC
  • RAD tools are used to generate screens and exhibit the overall flow of the application
  • Users approve the design and sign off on the visual model
  • Implementation is less stressful since users helped to design the business aspects of the system
when to use rad
When to Use RAD
  • RAD is used when
    • The team includes programmers and analysts who are experienced with it
    • There are pressing reasons for speeding up application development
    • The project involves a novel ecommerce application and needs quick results
    • Users are sophisticated and highly engaged with the goals of the company
using rad within the sdlc
Using RAD Within the SDLC
  • RAD is very powerful when used within the SDLC
  • It can be used as a tool to update, improve, or innovate selected portions of the system
disadvantages of rad
Disadvantages of RAD
  • May try and hurry the project too much
  • Loosely documented
  • May not address pressing business problems
  • Potentially steep learning curve for programmers inexperienced with RAD tools
final words
Final Words
  • Transform data into information by understanding the process
  • Transform information into decisions with knowledge
  • Transform decisions into results with actions
ad