1 / 95

Data Modeling and Database Design

This article explores the importance of data modeling in database design, highlighting the need for managing data as a corporate-wide resource. It discusses the history of data modeling and its relevance in enterprise-wide data management. The article also provides an overview of entity-relationship modeling and its role in data modeling, emphasizing its support in Information Engineering and its use in transaction processing and data warehouse systems.

diannaj
Download Presentation

Data Modeling and Database Design

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Data Modeling and Database Design

  2. Rationales for Data Modeling • Data is the foundation of modern information systems enabled by data base technologies. • Data in an organization exist and can be described independently of how these data are used. • Data should be managed as a corporate-wide resource. • The types of data used in an organization do not change very much. • Data have certain inherent properties which lead to correct structuring. • If we structure data according to their inherent properties, the structure (i.e., data models) will be stable.

  3. History of Data Modeling • Importance of Entity-Relationship Modeling Technique • Database • Data modeling and enterprise-wide data • Data quality • Data updating and accessing tools and procedure • Data sharing culture • ER modeling technique was first developed by Peter Chen in 1976 • A conceptual/logical data modeling tool • A user-oriented approach • A graphic-based method • ER modeling technique is the major data modeling method in Information Engineering and is widely supported by most of CASE tools. • Data modeling is the foundation of most database-centered transaction processing systems and data warehouse systems

  4. Multiple Perspectives DATA ACTIVITY We do these things We use this data ONE BUSINESS HIRE EMPLOYEE PAY EMPLOYEE ...... .... ...... .... EMPLOYEE FIRE EMPLOYEE PROMOTE EMPLOYEE

  5. Data Model (Entity Relationship Diagram) placed by; is enrolled under; Member Member Agreement places applies to Order established by; sells; generates; established is sold on generated by sponsors; is featured in; Promotion Club Product is sponsored by features

  6. Entity Types • Definition: • An entity is an object or event, real or abstract, about which we would like to store data. Entity is the abbreviation of entity type. It represent a set of entity instances which can be described by the same set of attribute types. The value of the same attribute for each entity instance may be different. • Identifying Entity Types • What information is required by the business? • Things that are of interest to the business that need to be remembered in order to manage and track them. • Things belong to the same entity type have common characteristics.

  7. Naming Entity Types • The name of each entity is in singular form • a noun • an adjective + a noun • a noun + a noun => (noun string) • an adjective + a noun + a noun • Examples • Customer, Customer Order, Product, Hourly Employee, Project, Department, Unfilled Customer Order • Be clear and concise • Avoid abbreviation • Be consist with user’s terminology • Identify synonyms • Customer Client • Product Merchandise • Supplier Vendor • Teacher Faculty • Use one name as the official name and document others as aliases

  8. Exercise: Entity Type Naming • Courses • Department • Customer Order • PO

  9. Properties of Entity Types • Name • Description • Identifier • Properties: Estimated number (Max., Min., Average) of entity instances • Expected growth rate of entity instances • Subject Area in which the Entity Type resides • Attributes that describe the Entity Types • Examples of entity type instances

  10. Definition of an Entity Type • A poor definition of Customer: Anyone that buys something from the company. • Can employees be a customer? • Can a leaser be a customer? • If the company sold a subsidiary to another company, does the new owner consider a customer? • Good definition should be: • Compatible • Precise • Concise • Clear • Complete

  11. Good Definition • Compatible • Customer: An ORGANIZATION that purchase PRODUCTs for personal use. • Distributor: An ORGANIZATION that purchase PRODUCTs for resale. • Precision: • With appropriate qualifiers • Example: An ORGANIZATION is considered to have purchase a PRODUCT when we receive a valid PURCHASE ORDER from it. • Complete • ORGANIZATION, PRODUCT, PURCHASE ORDER need to be defined. • Concise and Clear • Use modular definition

  12. Example of Entity Type Descriptions Entity Type Description Customer Information about all persons or organizations who purchases Product All goods manufactured and sold Raw-material Components used to manufacture Products. Supplier Vendors of Raw Materials. Buyer Company personnel responsible for purchasing Raw-Materials from Suppliers

  13. Entity Type and Entity Instance (Occurrence) Entity Types Entity Instance Vendor ABC Co. Employee John Smith Course Intro. to IM Department Marketing Department

  14. Exercise: Entity Types or Entity Instances? • Maryland • Organization Unit • Customer • President • Bill Clinton • Department of Commerce • Address

  15. Finding Entity Types • Interviews with users • JAD workshops • Business forms • Reports • Computer files using reverse engineering • Operation manuals

  16. Where to Look for an Entity Type? • Tangible or Intangible Things • The nouns that are used to describe the problem domain will often correspond to the major Entity Types of the system, at least at a high level. • Examples: Product, Sensor, and Employee, Department, and Sale Office. • Resources • Any resources that an organization needs to manage should be represented as an Entity Type. Information assists the efficient and effective use of other resources through improved decision. • Examples: Inventory, Machine, Bank Account, and Customer. • Roles Played • Roles can be played by persons or organizational units. • Examples: Customers, Managers, and Account representatives. • Events • Events are incidents that occur at points in time. An event often involved an interaction between two Entity Types or an action that changes the status of an Entity Type. • Examples: Sale, Delivery, and Registration of a motor vehicle.

  17. BIAIT:Business Information Analysis and Integration Technique • Analysis of Orders • Ordered entities can be a thing, a space, or a skill. • View the order from supplier side. • If an organization receives no orders, it has no reason for existing. • An organization unit can receive multiple types of orders. • 4 questions about the Supplier: • Billing (Cash)? • Deliver Late (Immediate)? • Profile customer? • Negotiate price (Fixed)? • 3 questions about the Ordered Entity: • Rented (Sold)? • Tracked? • Made to order (Stock)? Source: Carlson, W. M., "BIAIT: Business Information Analysis and Integration Technique - The New Horizon," Data Base, Vol. 10, No. 4, 1979, pp. 3-9.

  18. Criteria for Evaluating an Entity Type • Need to be remembered by the information system in order to be functional. • Can be operated on: CREATE, READ, UPDATE, DELETE. • Has a set of operations/services that always apply to change the status of each occurrence of an Entity Type. • Carry a set of attributes that always apply to describe each occurrence of an Entity Type. • Have at least one relationship with other entity type. • Exist more than one entity occurrence (instance) in an Entity Type. • Have at least a unique identifier. • Domain-based requirements: Something that the system must have in order to operate. These may be clearly specified in the problem description or known from subject matter experts.

  19. Entity Relationship Modeling and Diagramming • Relationships • Entity Relationship Diagramming Notation • Attributes • Identifiers • Partitioning and Entity Subtypes

  20. Relationship (Type) • Definition • A Relationship Type is an association among Entity Types. It indicates that there is a business relationship between these Entity Types. • Relationship Membership is the participation of an Entity Type in a Relationship. • In IE, a Relationship Type can involve only two Entity Types (binary relationship). Some other modeling techniques allow n-ary relationships. • Examples • CUSTOMER places ORDER • ORDER is placed by CUSTOMER • EMPLOYEE works on PROJECT • PROJECT has project member EMPLOYEE

  21. Paring (Relationship Instance) • Relationship paring is a pair of Entity Instances of two Entity Types associated by a Relationship Type between these two Entity Types. Entity Types Entity Instance Student Student#1 Student#2 Course Course#A Course#B Course#C Course#D Relationship Relationship Paring Student takes Course Student#1 takes Course#A Student#1 takes Course#B Student#1 takes Course#D Student#2 takes Course#A Student#2 takes Course#C Student#2 takes Course#D

  22. Relationship Instances Grouping • Definition: A collection of pairings of a Relationship Membership in which an Entity Instance is involved. • Examples: • Student#1 takes Course#A, #B, and #D • Student#2 takes Course#A, #C, and #D • Course#A is taken by Student#1 and Student#2

  23. Relationship Cardinality E2 E1 One-to-One 1:1 E2 One-to-Many E1 1:M Many-to-Many E2 E1 M:N

  24. Relationship Cardinality • The number of Entity Instances involved in the Relationship Instances Grouping in a Relationship Type. • Three Forms of Cardinality 1. One-to-one (1:1) DEPARTMENT has MANAGER Each DEPARTMENT has one and only one MANAGER Each MANAGER manages one and only one DEPARTMENT 2. One-to-many (1:m) CUSTOMER places ORDER Each CUSTOMER sometimes (95%) place one or more ORDERs Each ORDER always is placed by exactly one CUSTOMER 3. Many-to-many (m:n) INSTRUCTOR teaches COURSE Each INSTRUCTION teaches zero, one, or more COURSEs Each COURSE is taught by one or more INSTRUCTORs

  25. Entity Relationship Diagram (ERD): Notations Graphical Notations Cardinality indicator zero one many relationship-description Entity-Y Entity-X reversed-relation-description min max Translate into two structured statements Each Entity-X relationship-description cardinality-indicator (one-or-many) Entity-Y Each Entity-Y reversed-relationship-description (zero-or-one) Entity-Y Example is-managed-by Manager Department manages

  26. Optionality of Relationship Memberships • Whether all entity instances of both entity types need to participate in relationship pairing. • Optionality: • Mandatory • Optional • Example: • CUSTOMER membership is optional • ORDER membership is mandatory places ORDER CUSTOMER is placed by

  27. Relationship Statements one Cardinality indicator one or more Graphical Notations places ORDER CUSTOMER is placed by zero (sometimes) Optionality indicator one (always) Each Entity XoptionalityrelationshipcardinalityEntity Y Each CUSTOMER sometimes places one or more ORDER. Each ORDER always is placed by one CUSTOMER.

  28. Defining Relationships • Name • Description • Property • Cardinality volumes • Optionality percentage: % of Entity Type X's instances pairing with Entity Type's Y's instances • Transferability: A relationship is transferable if an entity instance can change its pairing within the same relationship. • TRANSFERABLE: An EMPLOYEE can change to a different DEPARTMENT. • NON-TRANSFERABLE: An ORDER cannot be transferred to another CUSTOMER.

  29. ERD: More Examples places (a) Customer Order belongs-to is-contained-in Product contains Parallel Relationship (b) manages Project Employee is-managed-by works-for has-project-members is-consists-of (c) Involuted or Looped Relationship Part contained-in

  30. Identifying Relationships • Association between entity types • Entity types that are used on the same forms or documents. • A description in a business document that has a verb that relates two entity types • has • consists of • uses

  31. Attributes • Definition • Characteristics that could be used to describe Entity Types and Relationship Types. However, in IE, relationship types are not allowed to have attributes. • Naming Conventions: • Names that have business meaning • Don't use abbreviation or possessive case, e.g., PN and Customer's name • Don't include entity type name because IEF will prefix the attribute name with entity type name automatically • Use standard format: Entity Type Name (Qualifiers) Domain Name Customer Name Employee Starting Date • Examples • Customer has customer name, address, and telephone number • Product has quantity-on-hand, weight, volume, color, and name. • Employee has SSN, salary, and birthday. • Employee-works-for-project has percentage-of-time, starting-date.

  32. Attributes: Notations Student ID Student Student Name Student studentID Birth date name phone Student ID Course no. enrollment Birth date Student(Student ID, Student Name, Birth Date) • Finding Attributes: • Attributes are identified progressively during BAA phase. • Data Analysis • Activity Analysis • Interaction Analysis • Current Systems Analysis

  33. Attribute Value • Definition • Attribute Values are instances of Attributes used to describe specific Entity Instances • Examples • Customer Number: 011334 • Customer Name: Minder Chen • State: VA • Order Total: $23,000 • Sale tax: $250 • An attribute of an entity type should have only one value at any given time. (No repeating group) • Avoid using complex coding scheme for an attribute. For example: PART Number: X-XXX-XXX Part Type Material Sequence Number

  34. Type & Instance OBJECT TYPEOCCURRENCE Entity Type Entity Instance Entity Entity Instance Entity Type Entity Relationship (Type) Pairing (Relationship Instance) Attribute (Type) (Attribute) Value

  35. Attribute Source Categories • Basic • Definition: An Attribute Value that cannot be deduced or calculated. • Examples: Student name and Birthday • Derived • Definition: The Attribute Value can be calculated or deduced from relationship Groupings or from the values of other Attributes. The value of a Derived Attribute changes constantly. • Examples: Student Age, Account Balance, Number of courses taken. • Designed • Definition: The Attribute is created to overcome the system constraints. The value of a Designed Attribute does not change. • Examples: Student ID, Course number.

  36. Properties of Attributes • Name • Description • Attribute Source Category: Basic, Derived, Designed • Domain or data type: Text, Number, Date, Time, Timestamp • Optionality: Mandatory or optional • Length and/or precision • Permitted Values (Legal Values) • Ranges • A set of values (Code Table) • Default value or algorithm Tools such as PowerBuilder has additional properties for table’s columns called extended attributes • Validation Rule • Editing Format • Reporting Format • Column Heading • Form Label • Code Table

  37. Data Modeling Case Study The following is description by a pharmacy owner: "Jack Smith catches a cold and what he suspects is a flu virus. He makes an appointment with his family doctor who confirm his diagnosis. The doctor prescribes an antibiotic and nasal decongestant tablets. Jack leaves the doctor's office and drives to his local drug store. The pharmacist packages the medication and types the labels for pill bottles. The label includes information about customer, the doctor who prescribe the drug, the drug (e.g., Penicillin), when to take it, and how often, the content of the pill (250 mg), the number of refills, expiration date, and the date of purchase." Please develop a data model for the entities and relationships within the context of pharmacy. Also develop a definition for "prescription". List all your underlying assumptions used in your data models.

  38. Data Modeling Process • List entity types • Create relationships • Pick a central entity type • Work around the neighborhood • Add entity types to the diagram • Build relationships among them • Determine cardinalities of relationships • Find/Create identifiers for each entity type • Add attributes to the entity type in the data model • Analyze and revise the data model

  39. Classifying Attribute and Partitioning • An Entity Subtype A collection of Entities of the same type to which a narrower definition and additional Attributes and Relationships apply. An Entity Subtype inherits (retains) all the Attributes and Relationships of its parent Entity Type. • Classifying Attribute: An attribute of the Base Entity Type whose values partition the Entity Instances into Subtypes. • Partitioning: A basis for subdividing one entity type into subtypes. The process of dividing an Entity Type into several Subtypes based on a Classifying Attribute is called Partitioning. • The Classifying Attribute is recorded as a property of the Partitioning and it appears on the diagram.

  40. Normalization • A data base is a model or an image of the reality. • Logical Data Base Design is a process of modeling and capturing the end-user views of an application domain and synthesis them into a data base structure. • Normalization is a logical data base design method. • The basis for normalization is the functional dependencies among attributes in a table.

  41. SQL Terminology Column Product Table p_no product_name quantity price 101 Color TV 24 500 201 B&W TV 10 250 202 PC 5 2000 Row Create a table in SQL CREATE TABLES (p_no CHAR(5) NOT NULL, product_name CHAR(20), quantity SMALLINT, price DECIMAL(10, 2));

  42. SQL Terminology Set Theory Relational DB File Example Relation Table File Product_table Attribute Column Data item Product_name Tuple Row Record Product_101's info. Domain Pool of legal values Data type DATE

  43. SQL Principles • The result of a SQL query is always a table (View or Dynamic Table) • Rows in a table are considered to be unordered • Dominate the markets since late 1980s • Can be used in interactive programming environments • Provide both data definition language (DDL) and data manipulation language (DML) • A non-procedural language • Can be embedded in 3GL: • Embedded SQL • Dynamic SQL

  44. SQL: Data Definition Language (DDL) TABLE VIEW INDEX DATABASE CREATE DROP ALTER TABLE

  45. SQL: Introduction • A relational data base is perceived by its users as a collection of tables • E. F. Codd 1969 • Dominate the markets since late 1980s • Strengths: • Simplicity • End-user orientation • Standardization • Value-based instead of pointer-based • Endorsed by major computer companies • Most CASE products support the development of relational data base centered applications

  46. SQL: Data Manipulation Language (DML) p_no product_name quantity price 101 Color TV 24 500 201 B&W TV 10 250 202 PC 5 2000 SELECT UPDATE INSERT DELETE The Generic Form of the SELECT Statement SELECT [DISTINCT] column(s) FROM table(s) [WHERE conditions] [GROUP BY column(s) [HAVING condition]] [ORDER BY column(s)]

  47. Database Table • The following code retrieves only the Last Name and the Employee ID where the Employee ID is greater than 5. The records are retrieved in descending order. SELECT LastName, EmployeeID FROM Employees WHERE EmployeeID > 5 ORDER BY EmployeeID DESC

  48. WHERE Clause • WHERE: Use the Where clause to limit the selection. The # symbol indicates literal date values. SELECT * FROM Employees WHERE LastName = "Smith" SELECT Employees.LastName FROM Employees WHERE Employees.State in ('NY','WA') SELECT OrderID FROM Orders WHERE OrderDate BETWEEN #01/01/93# AND #01/31/93#

  49. Keys • A key, also called identifier, is an Attribute or a Composite Attribute that can be used to uniquely identify an instance of an entity type. • Examples: Entity TypeKey Warehouse Warehouse Number Product Product Number Student Student ID or SSN Ship Name and Port of Registration Stock of Product Product Number and Warehouse No.

  50. Types of Key • Primary Key: A unique key is an attribute or a set of attributes that has been used by the DBMS as the identifier of a table. • Candidate (Alternative) Key: An attribute or a set of attributes that could have been used as the primary key of a table. • Secondary (Index) Key: An attribute or a set of attributes that has been used to construct the data retrieval index. • Concatenated (Combined or Composite) Key: A set of attributes that has been used as the key. • Foreign Key: An attribute or a set of attributes that is used as the primary key in another table.

More Related