1 / 62

Data Modelling Contentious Issues

7 Mar 01Jan 2005.

coen
Download Presentation

Data Modelling Contentious Issues

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


    1. Data Modelling Contentious Issues The Best Debates from the Data Modelling List January 2005 Karen Lopez InfoAdvisors.com

    2. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 2 Karen López, I.S.P. Karen is the moderator of the Data Modelling List. She has 18 years of data modelling experience on large, multi-project programs She has a B.Sc. in Computer Technology / Information Systems from Purdue University She is a former President of the Information Resource Management Association of Canada (IRMAC)

    3. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 3 Upcoming Speaking Engagements DAMA / Meta Data Symposium May 2005 Toronto IRMAC Spring 2005

    4. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 4 About this Presentation We will be using an interactive format - you will be participating in informal polls about data modelling issues and best practices. This is not an introductory presentation - a good knowledge of data modelling issues will be assumed.

    5. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 5 InfoAdvisors’ Discussion Groups E-mail, web, and newsgroup based discussion group Data Modeling, Frameworks, Tools Groups Over 8000 subscribers Moderated No Charge www.infoadvisors.com

    6. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 6 E-mail

    7. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 7 Web

    8. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 8 Newsgroups

    9. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 9 Agenda Contentious Issues Background Discussion Quotes Poll Results & Analysis Resources

    10. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 10 Contentious Issues Near-religious discussions and debates People rarely change their minds based on this discussion The most successful discussions are ones where both sides learn something new about the other viewpoint.

    11. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 11 Conceptual Data Models Do you do them? Are they used? How? Just what is one? How do they differ from other Data Models?

    12. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 12 Conceptual Data Models What is a CDM? Entity only, no cardinality, fewer than 15 entities Entity only, cardinality, more than 15 entities Entity only, one Entity for every entity in the LDM Entity & Attributes, no physical, no surrogate keys Entity & Attributes, no DBMS-specific issues

    13. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 13 Conceptual Data Models – View 1 The conceptual model is concerned with the real world view and understanding of data; the logical model is a generalized formal structure in the rules of information science; the physical model specifies how this will be executed in a particular DBMS instance. Duncan Dwelle, AIS Intl. http://www.aisintl.com/case/CDM-PDM.html

    14. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 14 Conceptual Data Models – View 2 “A [CDM], typically called an Entity Relationship Diagram (ERD), contains business information and structure without regard to physical storage concerns. Business items become entities and you describe them with attributes. You formalize relationships…. Typically you have no index, foreign key or tablespace information. … A Logical Data Model (LDM) is sometimes skipped over. It is a hybrid between the CDM and PDM and contains some DBMS elements like denormalization and indexes, but isn't as detailed as the PDM.” - Michael N.

    15. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 15 Conceptual Data Models – View 3 “[CDM is] a tricky phrase to define in that our industry uses this same phrase for two very different (in my opinion) ways. If you come from the Information Engineering background, a CDM is an entity and relationship-only method (no attributes and often no cardinalities to the relationships) that has a purpose of defining the scope of smaller, more detailed logical data modelling projects. Those logical data models are not constrained by organizational, platform, or other technical issues. If you come from an ORM background, CDM is much like Mike has described [previously]. As he mentions, in this case Logical Data Modelling is similar to a first cut physical data model. - Karen Lopez

    16. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 16 Conceptual Data Models – Warm up Vote Do Conceptual Data Models have Attributes? Do they have cardinality? Are there fewer Entities in a CDM than in an LDM? If they have attributes, can they have surrogate keys?

    17. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 17 Vote! Ok, Here’s How we vote I post a question The question has a Range. You put the sticky in the area for your vote (1-5), sometimes once for PDM and once for LDM. We debate the answer and the results

    18. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 18 Conceptual Data Models - Vote Do you create Conceptual Data Models …that are actively used? Yes, it’s up-to-date, easily available, and used at least a couple of times a year What’s a Conceptual Data Model? We don’t need no stinkin’ CDMs!

    19. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 19 Conceptual Data Models TRAP! You may find yourself debating a topic because there is no common definition of a model or object, not because there is a real disagreement.

    20. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 20 Do we need Classwords? A traditional naming convention Usually means there’s a standard classword list Some tools can check for standards compliance Examples: Date, Amount, Count, Quantity

    21. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 21 Attribute Names Customer First Name Customer Backordered Specialty Order Item ID Item ID Item

    22. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 22 Classwords - Vote Does a good LDM / PDM data model use classwords? Always Classwords are so obsolete….

    23. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 23 Keys - Natural or Surrogate Just what you call them may show your true colours.... Surrogate, Non-intelligent, Unnatural, Dataless or Meaningless Natural, Intelligent, Normal, Cluttered

    24. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 24 Keys – Natural or Surrogate

    25. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 25 Keys - Natural or Surrogate – View 1 I have inherited many [tables] that have used business elements for keys. The complexity of the keys for some tables is horrendous…Another thing I hate about this method is putting business information into the PK. This leads to the business changing either the contents of, or the definition of, a key column. …You should never change PK values…. With single column PK, you have single column FKs and easier maintenance of the system over time. You may have to perform more joins to satisfy complex queries, but the joins are simpler. … I have implemented surrogate keys as a "strongly recommended“ guideline for all future database development. - Michael N.

    26. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 26 Keys - Natural or Surrogate – View 2 I agree wholeheartedly with Mike, but with one exception. I think reference tables would be better retaining the business elements in the primary key since they're usually standalone tables not used in joins. On a rather large application we set the ref tables up with sequence numbers as the primary key. Quickly some developers began hard coding them to refer back to the row. This caused us much difficulty - Thomas Z.

    27. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 27 Keys - Natural or Surrogate - Vote What best describes your approach in an LDM / PDM? Surrogate Keys? We don’t need surrogate keys - we have natural identifiers. Every entity deserves its own surrogate key.

    28. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 28 Derived and Redundant Data Dilemma Not all derived data is calculated Snapshot versus History History: Just maintain a foreign key (or relationship) back to the timestamped data Snapshot: Copy the data to another location to preserve the data at that point in time.

    29. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 29 Historical (no derived data) Derived and Snapshot Derived Data Dilemma

    30. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 30 Derived Data Dilemma - Vote Do Derived Attributes belong in an LDM/PDM? Never All the time - pure logical data modelling doesn’t work in the real world.

    31. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 31 Derived Data Dilemma TIP! Every decision is ultimately made based on cost, benefit, and risk. Be prepared to analyze all three.

    32. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 32 Abstraction / Generalization A modelling design decision Can be very flexible Can be very difficult to understand

    33. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 33 Abstraction

    34. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 34 Abstract or Specific - View “Although your model represents a higher level of abstraction than most people would create it does not violate the rules of normalization. So, it is not technically speaking incorrect. Now come the issues. What is the purpose of the model? Are you documenting the business requirements and rules? If so your approach does not provide as much information about the business as a less abstract approach would…If you are trying to model a flexible environment where the required data (represented by the repeating groups) changes frequently then your model would be more appropriate.” - Rick B.

    35. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 35 Abstract or Specific Which style are you most likely to use in an LDM / PDM? The more flexible (Abstract) the better The more precise (Specific) the better

    36. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 36 To Party or Not to Party Party, Party Role, Party Type, Party Category Migration of Party ID Use of Subtype “Owned Keys” Universality of Party

    37. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 37 To Party or Not to Party Party Non-Party

    38. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 38 To Party or Not to Party It is so unfortunate that there is still a mentality that we need to model the simple case only. I personally think that once you understand the party, party role model it is simpler. That simplicity reinforces itself as you are able to apply the same pattern to lots of different situations and environments. - George P.

    39. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 39 To Party or Not to Party The problem I find with PARTY ROLE is that it is derivable information. This is purely a classification of an individual body based on other data that must be available. Otherwise if I identify you as my “customer” what business meaning does it have? There must be one or more “orders”, or “contracts”, or “correspondences”, or “contacts”, or whatever I deem necessary for it to be sensible for you to be known as my customer. It’s the ORDER that makes you my customer, not being my customer that makes me able to take your ORDER. - Mike V.

    40. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 40 To Party or Not to Party - Vote Which is the best approach to PARTY in an LDM /PDM? It’s the only way to be fully logical It’s too academic, too theoretical and it just won’t work

    41. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 41 DA – Lawyer or Architect? Several new legislative actions focus on data quality, integrity, reliability, availability… Sounds like good DA work, right?

    42. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 42 Examples Sarbanes-Oxley (Sarbox): Financial Accountability & Reliability Anti-Spam Legislation: Data integrity, Reliability Privacy: Quality, Reliability, Accountability…

    43. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 43 DA Role in Compliance Finally, a reason for the Execs to support us… Hey! I don’t even have time write good definitions…

    44. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 44 Did UML kill the Data Modelling Star? Combines data and process into one diagramming technique Some tools support ERDs as well as UML models Many DA’s are being pressured to choose between UML and ERDs.

    45. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 45 UML Class Diagram

    46. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 46 UML - Vote? Is UML an acceptable replacement for LDM? Never – 2 different things UML is the future, we might as well switch now….

    47. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 47 Generic Data Models Generic Data Models are models or subsets of models that an organization can purchase. They are sometimes prepared for a specific industry May be very generic, may be both.

    48. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 48 Generic Data Models - Vote What do you think about Generic Data Models or Patterns? I love them and won’t ever start a project from a blank page again. They are so high level that I don’t see the value in purchasing them.

    49. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 49 Who Gets to Update the Model? Many marketing pitches portray a team of developers, modelers, DBAs, etc., happily working on the model Is it “More hands make for less work” or “Too many cooks spoil the soup”?

    50. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 50 Who Gets to Update the Model Only seasoned Data Modeling Professionals Everybody – we’re all professionals here

    51. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 51 Who calls the shots in a Logical Data Model? Many stakeholders in a LDM Not everyone shares the same understanding about the purpose of the LDM DM tools can greatly influence the decision DA’s can report through a variety of departments

    52. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 52 Who calls the shots in a Logical Data Model? The FINAL say on LDM / PDM decisions belongs to: The Data Architect/Modeller The Project Manager The User…Customer The DBA The CIO/CEO

    53. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 53 XML – the next Data Specification XML is a type of markup language for data Similar to HTML, but carries meaning as well as format metadata Many who use EDI are looking toward XML Many vendors are jumping onto this ‘standard’ Still requires establishing meaning.

    54. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 54 XML <?xml version="1.0" encoding="iso-8859-1" ?> - <clientapp> <global>  <string phone_number="*8369464" />   <string user id=“TestUser”/> <string dir_html="html" />   <string dir_language="english" />   <string filename_html_connected="connected.htm" />   <string filename_html_targetfile="target.htm" />   </global>-

    55. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 55 XML – the next Data Specification I have heard from a wide variety of sources that XML means that data modelling is dead -- who needs a data model when you can create a DTD and call the piece of data anything you want? All you have to do is convince everyone else to call it the same thing... - Karen Lopez

    56. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 56 XML – the next Data Specification XML is very simple. It is a facility for defining your own tags in transmitting documents. (With HTML you have to use theirs.) The implication of this is that the browser/person receiving the document has to know how to interpret your tags. This is the fun part. Some organizations have begun to define a set of tags to serve their purposes -- mathematicians, chemists, metadata gurus, etc. Alternatively, if you send the "Document Tag Definition (DTD)" with your data, it can be used to decode it, at least syntactically. - David H.

    57. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 57 XML – the next Data Specification I see XML as a new medium for presenting data. To effectively present the data it needs to be understood and this is the role of the data model. Do we have the fundamental meaning of our data understood and represented in a quality data model? If yes then the transformation of the requirements into the presentation by XML will be much more effective. …. Enough said for now on how I feel that the data model and XML need to correlate data requirements. - Paul E.

    58. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 58 XML - Vote? Will XML replace data models? Never – 2 different things XML is the future, we might as well switch now….

    59. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 59 Is that with one “L” or Two? How is it spelled? Modeling/Modeler Modelling/Modeller It depends on my mood…

    60. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 60 Online Data Modelling Resources InfoAdvisors (http:///www.infoadvisors.com) The Data Model List Product User Discussion Groups (ERwin, Advantage Repository, Visible Analyst/Advantage, ER/Studio, DBArtisan, CaseWise) The Data Administration Newsletter (http://www.tdan.com)

    61. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 61 Recommended Books Data Modeling Essentials, Graeme Simsion, 2004 Building Quality Databases with IDEF1X, Thomas Bruce, 1992, Dorset House, ISBN 0-932633-18-8 The Data Modeling Handbook, Michael Reingruber & William Gregory, 1994, Wiley QED, ISBN0-471-05290-6 A Practical Guide to Logical Data Modeling, George Tillmann, 1993, McGraw-Hill, $45, ISBN 0-07-064615-5

    62. 7 Mar 01Jan 2005 © 2001 InfoAdvisors 62 Recommended Books Data Model Patterns: Conventions of Thought, David C. Hay, 1996, Dorset House, ISBN 0-9326333-29-3 The Data Model Resource Book, Silverston, Volumes 1 and 2

    63. Jan 2005Jan 2005 © 2005 InfoAdvisors 63 InfoAdvisors 11066 Sheppard Ave East Toronto, ON CANADA karenlopez@infoadvisors.com http://www.infoadvisors.com 8000+ IRM participants subscribed to several discussion groups

More Related