
A Z Approach in Validating ORA-SS Data Models




  1. A Z Approach in Validating ORA-SS Data Models • Scott Uk-Jin Lee, Jing Sun, Gillian Dobbie, Yuan Fang Li

  2. Introduction • Semistructured data • Rapid growth in its usage • Through the World Wide Web, Web Services and other Web-based applications. • Due to the introduction of XML and its related technologies. • Requires a well-designed semistructured data structure • Especially if the data is stored in a database. • Requires a good schema definition • ORA-SS can be used. • ORA-SS • Provides schema definition for semistructured data. • Restricted to a diagrammatic notation with semantics written in English. • Requires formal mathematical semantics for wider utilization.

  3. Motivation • Benefits of having formal semantics for ORA-SS • Remove ambiguity that may arise from a diagrammatic representation. • Enable the use of ORA-SS in other applications and tools. • Reveal inconsistencies in a design at the schema and instance levels. • Improve the quality of the software system by providing deep semantic checking of the semistructured data it uses.

  4. ORA-SS Object • Object class • similar to an entity type in an ER diagram, a class in an object-oriented diagram, or an element in an XML document. • Relationship type • represents a nesting relationship among object classes. • is represented, optionally, with a labelled diamond and can be described by name, n, p and c. • name : name of the relationship type • n : integer indicating the degree of the relationship type • p : participation constraint of the parent object class in the relationship type • c : participation constraint of the child object class in the relationship type • Attribute • represents a property of an object or of a relationship. • can be a key attribute, which has a unique value. • Reference • models recursive and symmetric relationships. • reduces redundancy, especially for many-to-many relationships. • represents disjunction of objects and attributes. • [Figure: object classes joined by a relationship type labelled name, n, p, c]
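
The four descriptors of a relationship type (name, n, p, c) can be pictured as a small record. The following is a minimal Python sketch, not part of the original slides: the class names `ObjectClass` and `RelationshipType`, the `(min, max)` tuple encoding, and the 1:n child constraint are illustrative assumptions, and only the binary case with a single parent object class is modelled.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# A participation constraint is a (min, max) pair; max=None means "unbounded".
Multiplicity = Tuple[int, Optional[int]]


@dataclass(frozen=True)
class ObjectClass:
    name: str
    key: str                          # key attribute with a unique value
    attributes: Tuple[str, ...] = ()  # remaining attribute names


@dataclass(frozen=True)
class RelationshipType:
    name: str        # name of the relationship type
    n: int           # degree of the relationship type
    p: Multiplicity  # participation constraint of the parent object class
    c: Multiplicity  # participation constraint of the child object class
    parent: ObjectClass
    child: ObjectClass
    attributes: Tuple[str, ...] = ()  # attributes of the relationship itself


course = ObjectClass("course", key="code", attributes=("title",))
student = ObjectClass("student", key="ID number", attributes=("name", "email"))

# The 4:n parent constraint follows the validation slide later in the deck;
# the 1:n child constraint is an assumed value for illustration only.
cs = RelationshipType("cs", n=2, p=(4, None), c=(1, None),
                      parent=course, child=student, attributes=("mark",))
```

Making the records immutable (`frozen=True`) keeps a schema from being altered after it has been checked.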

  5. ORA-SS Example • The diagram presents an ORA-SS schema that represents the structure of a particular set of semistructured data. • The schema should consist of the following: • a relationship between the ‘course’ object class and the ‘student’ object class with a single-valued attribute ‘mark’. • object class ‘course’ with an identifier ‘code’, a single-valued attribute ‘title’ and a multi-valued attribute ‘ANY’. • object class ‘student’ with an identifier ‘ID number’ and single-valued attributes ‘name’ and ‘email’. • The schema diagram is syntactically correct but contains three semantic errors • The degree of relationship ‘cs’ is 3, which represents a ternary relationship, whereas it is actually a binary relationship since object class ‘course’ is not related to any object class other than ‘student’. • The object class ‘student’ has two primary keys: two attributes are selected as primary key, where there should be only one primary key for each object class. • The primary key ‘ID number’ is represented as an attribute of the relationship ‘cs’, where it really is an attribute of the object class ‘student’. • A validation process is required to pick up this kind of error during the design process.

  6. Z & Z/EVES • Z • formal specification language • developed at the Programming Research Group at Oxford University. • based on set theory and first-order predicate logic. • declarative language with a number of language constructs including given type, abbreviation type, axiomatic definition, state and operation schema definitions. • widely used for providing formal semantics and verification in various application domains. • Z/EVES • an interactive system for composing, checking, and analyzing Z specifications. • supports general theorem proving of Z specifications.

  7. Formal Semantics of ORA-SS (Basic Type & Relationship Type) • Basic Types • The basic types used in the ORA-SS data modeling language were identified and defined before constructing the formal representation. • The object types and attribute types defined above represent the sets of object classes, object instances, attributes and attribute values, respectively, in the ORA-SS language. • Relationship Type • A relationship type in the ORA-SS data modeling language is defined as a function with a set of object classes as its domain and a sequence of sets of object classes as its range. • The predicate of the function uses a recursive definition and states that object classes can be related to other object classes as well as to other relationships.

  8. Formal Semantics of ORA-SS (Relationship Type) • The definition includes two types of relationship in an ORA-SS schema diagram. • a normal relationship, where the child participant is a single object class. • a disjunctive relationship, where the child participant is a set of disjunctive object classes. • The first predicate of the function prevents cyclic definitions in the relationship structure. • The second predicate allows the representation of a binary relationship as well as a relationship of degree 3 or more.

  9. Formal Semantics of ORA-SS (Degree of a Relationship Type) • Every relationship in an ORA-SS schema diagram has an associated degree represented as a natural number. • The above definition represents degree as a function where the first argument represents the relationship and the second argument represents the natural number that is the degree of the relationship. • The predicate of the function defines that the degree of any relationship is the number of object classes involved in the relationship.
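
The recursive shape of a relationship type and its degree can be illustrated outside Z. The sketch below is written in Python rather than Z and is not the authors' definition: the `Relationship` class, the `degree` function and the ‘tutor’ object class are hypothetical, and it assumes that a disjunctive child set counts as one participant.

```python
from dataclasses import dataclass
from typing import FrozenSet, Union


@dataclass(frozen=True)
class Relationship:
    name: str
    # The parent is either an object class name (binary relationship)
    # or a sub-relationship (relationship of degree 3 or more).
    parent: Union[str, "Relationship"]
    # One class name for a normal relationship, several for a disjunctive one.
    child: FrozenSet[str]


def degree(rel: Relationship) -> int:
    """Number of participants: 2 for a binary relationship,
    otherwise one more than the degree of the sub-relationship."""
    if isinstance(rel.parent, Relationship):
        return degree(rel.parent) + 1
    return 2


cs = Relationship("cs", "course", frozenset({"student"}))
# 'tutor' is a hypothetical object class added to show the recursive case.
cst = Relationship("cst", cs, frozenset({"tutor"}))
```

Because the records are immutable and a parent must exist before the relationship that nests it, a cyclic structure cannot be built in this encoding, which loosely mirrors the acyclicity predicate.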

  10. Formal Semantics of ORA-SS (Instances of Object Classes & Attribute) • In the ORA-SS data model, an object class has instances, which are objects. • The above definition models the instances of object classes as a function where the first argument represents an object class and the second argument represents a set of objects, namely all the instances of the object class. • The predicate of the function specifies that an object cannot be an instance of multiple object classes. • The instances of attributes are defined similarly: just as an object class has instances, attributes have values.
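
The predicate that an object cannot be an instance of multiple object classes amounts to pairwise disjointness of the instance sets. A minimal Python sketch of that check follows; the function name `shared_instances` and the sample object names are assumptions made for illustration.

```python
from itertools import combinations


def shared_instances(instances):
    """Map each object that appears under more than one object class
    to the set of classes that claim it.

    `instances` maps an object class name to its set of objects.
    An empty result means the instance sets are pairwise disjoint.
    """
    clashes = {}
    for (class1, objs1), (class2, objs2) in combinations(instances.items(), 2):
        for obj in objs1 & objs2:
            clashes.setdefault(obj, set()).update({class1, class2})
    return clashes


valid = {"course": {"Course1"}, "student": {"Stu1", "Stu2"}}
invalid = {"course": {"Course1"}, "student": {"Stu1", "Course1"}}
```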

  11. Formal Semantics of ORA-SS (Instances of a Relationship Type) • A relationship type also has instances, which represent the participating instances of the corresponding object classes in the relationship.

  12. Formal Semantics of ORA-SS (Instances of a Relationship Type) • The relationship instance is defined as a function where the first argument represents a relationship and the second argument represents the instance of the relationship. • An instance of the relationship is represented as an object related to a sequence of objects that conforms to the relationship definition. • The first predicate of the function defines that the degree of a relationship instance should be the same as the degree of the relationship type. • The second predicate defines that the child object instance should be an instance of the selected child object class in the relationship. This predicate also defines that, in the case of a disjunctive relationship, only objects of a single object class are related to a parent object or sub-relationship instance. • The third predicate consists of two cases • If the relationship is binary, the parent object instance should be an instance of the parent object class. • If the relationship is ternary or of higher degree, the second part of the predicate recursively defines that the sub-relationship instance sequence is an instance of the sub-relationship type. • The last predicate defines that any two relationship types should have disjoint sets of instances. This specifies that a relationship instance cannot be an instance of multiple relationship types.
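
The first three predicates can be approximated by a membership check over a flattened relationship. The Python sketch below is a simplification rather than the recursive Z definition: it assumes a relationship type is given as a parent-first list of sets of object class names, and the name `conforms` and the sample data are hypothetical.

```python
def conforms(instance, participants, instances_of):
    """Check a relationship instance against a relationship type.

    instance      -- tuple of objects, parent first
    participants  -- list of sets of object class names, parent first;
                     a set with several names is a disjunctive participant
    instances_of  -- maps an object class name to its set of objects
    """
    # The degree of the instance must equal the degree of the type.
    if len(instance) != len(participants):
        return False
    # Each object must be an instance of one class allowed at its position.
    for obj, classes in zip(instance, participants):
        if not any(obj in instances_of.get(cls, set()) for cls in classes):
            return False
    return True


instances_of = {"course": {"Course1"}, "student": {"Stu1", "Stu2"}}
cs_type = [{"course"}, {"student"}]
```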

  13. Formal Semantics of ORA-SS (Participation Constraints on Object in a Relationship Type) • Every relationship type in an ORA-SS schema diagram has associated constraints on its participating objects, which are represented by the ‘min:max’ notation. They constrain the number of child objects that a parent object can relate to, and vice versa.

  14. Formal Semantics of ORA-SS (Participation Constraints on Object in a Relationship Type) • The participation constraint on objects in a relationship type is defined as a function where the first argument represents a relationship and the second argument represents a Cartesian product of multiplicities, which refers to a ‘min:max’ pair. • The predicate of the function defines that the number of relationship instances in which each object of the parent object class, or each relationship instance of the sub-relationship type, participates should be within the multiplicities defined in the relationship. • It specifies that the parent constraint sets the bounds on the number of child objects that a single parent object or sub-relationship instance can have. • The child constraints of the relationship are defined in a similar way.
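
Checking the parent constraint comes down to counting, for each parent, the relationship instances it takes part in and comparing the count with the ‘min:max’ pair. A minimal Python sketch for the binary case follows; the function name `parent_violations`, the use of `None` for an unbounded maximum, and the sample data are assumptions.

```python
from collections import Counter


def parent_violations(rel_instances, parents, minimum, maximum=None):
    """Return {parent: count} for parents whose number of children
    falls outside minimum:maximum (maximum=None means unbounded).

    rel_instances -- iterable of (parent, child) pairs
    parents       -- all objects of the parent object class, so that
                     parents with no children at all are also checked
    """
    counts = Counter(parent for parent, _child in rel_instances)
    violations = {}
    for parent in parents:
        n = counts[parent]  # Counter yields 0 for parents without children
        if n < minimum or (maximum is not None and n > maximum):
            violations[parent] = n
    return violations


pairs = [("Course1", "Stu1"), ("Course1", "Stu2"), ("Course2", "Stu1")]
```

The child constraint can be checked with the same function by swapping the two positions of each pair.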

  15. Formal Semantics of ORA-SS (Candidate Key of an Object Class) • An object class can have an attribute or set of attributes, called a candidate key, that has a unique value for each instance of the object class. • A candidate key is a single attribute with a unique value. • A composite candidate key is a set of attributes with a unique combined value.

  16. Formal Semantics of ORA-SS (Candidate Key and Primary Key of an Object Class) • The candidate key is defined as a relation where object classes are related to the sets of attributes that make up all the candidate keys belonging to the object class. • The first predicate of the function defines that candidate keys belong to the set of attributes that the object class has. • The second predicate of the function defines two facts. • two objects are different when the values of the candidate key for each object are different. • two objects are the same when the values of the candidate key for each object are the same. • The predicate also specifies that the value of the candidate key for each object of an object class should uniquely identify an object instance. • The primary key is defined as a total function with the same arguments as the candidate key definition, and its predicate specifies that the primary key is selected from the set of candidate keys.
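
The uniqueness requirement on candidate keys can be sketched as a duplicate check over key values. The Python below is an illustration rather than the Z definition: objects are assumed to be dictionaries from attribute names to values, and `is_candidate_key`, `valid_primary_key` and the sample students are hypothetical.

```python
def is_candidate_key(objects, key_attrs):
    """True if the combined value of key_attrs is unique across objects.

    A single-attribute key is passed as a one-element tuple; a composite
    candidate key is passed as a tuple of several attribute names.
    """
    seen = set()
    for obj in objects:
        value = tuple(obj[attr] for attr in key_attrs)
        if value in seen:
            return False  # two distinct objects share the same key value
        seen.add(value)
    return True


def valid_primary_key(primary, candidate_keys):
    """The primary key must be selected from the set of candidate keys."""
    return primary in candidate_keys


students = [
    {"ID number": "s1", "name": "Ann"},
    {"ID number": "s2", "name": "Ann"},
]
```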

  17. Formal Semantics of ORA-SS (Other definitions) • Object class, attribute pair and their instances • The object class and attribute pair is defined as a simple total function, similar to the primary key definition but with no predicates. • The instance of an object class and attribute pair is defined similarly to the instance of a relationship. • Cardinality of attribute values associated with an object • The cardinality of attribute values associated with an object is defined similarly to the participation constraints in a relationship type.

  18. Validation (Schema Diagram) • Guidelines for validating an ORA-SS schema diagram • In a relationship type, the child object class must be related either to a parent object class to form a binary relationship or to a sub-relationship type to form a relationship type of degree 3 or more. • The degree of a binary relationship is 2, of a ternary relationship 3, and of an n-ary relationship n. • In a disjunctive relationship type, the child participant is a set of disjunctive object classes. • A composite attribute or disjunctive attribute has an attribute that is related to two or more sub-attributes. • A candidate key of an object class is selected from the set of attributes of the object class. • A composite key is selected from 2 or more attributes of an object class. • There can only be one primary key per object class, and it can be either a candidate key or a composite candidate key. • Relationship attributes have to relate to an existing relationship. • An object class can reference one object class only, but an object class can be referenced by multiple object classes.

  19. Validation (Schema Diagram) • Most of the guidelines have been encoded into the Z semantics of ORA-SS • When a schema diagram is represented in Z, its correctness can be validated against the ORA-SS Z semantics • The previous ORA-SS schema diagram example represented in Z • Validating the degree of the ‘cs’ relationship • The validation proves that the definition degree cs = 3 is invalid.
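
The slides carry out this validation as a proof in Z/EVES. As a rough analogue only, the Python sketch below flags two of the three errors from the earlier example: the declared degree of ‘cs’ and the identifier ‘ID number’ attached to the relationship. The dictionary layout of `SCHEMA` and the function `schema_errors` are assumptions made for illustration, not the authors' encoding.

```python
# The erroneous schema diagram from the example, in an assumed layout.
SCHEMA = {
    "identifiers": {"course": "code", "student": "ID number"},
    "relationships": {
        "cs": {
            "participants": ["course", "student"],
            "declared_degree": 3,                # error: should be 2
            "attributes": ["mark", "ID number"], # error: identifier on 'cs'
        }
    },
}


def schema_errors(schema):
    """Collect guideline violations as human-readable messages."""
    errors = []
    identifiers = set(schema["identifiers"].values())
    for name, rel in schema["relationships"].items():
        actual = len(rel["participants"])
        if rel["declared_degree"] != actual:
            errors.append(
                f"{name}: declared degree {rel['declared_degree']}, "
                f"but {actual} object classes participate"
            )
        for attr in rel["attributes"]:
            if attr in identifiers:
                errors.append(
                    f"{name}: identifier '{attr}' is attached to the relationship"
                )
    return errors


errors = schema_errors(SCHEMA)
```

A check like this only reports mismatches; unlike the Z/EVES proof, it gives no guarantee that the rules themselves are consistent.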

  20. Validation (XML) • Guidelines for validating an XML instance • Relationship instances must conform to the participation constraints. • In a disjunctive relationship, only one object class can be selected from the disjunctive object class set and associated with a particular parent instance. • For a candidate key (single or composite), its value should uniquely identify the object that the key attribute belongs to. • Each object can have one and only one primary key. • All attributes have their own cardinality, and the number of values of an attribute that belong to an object should be within the minimum and maximum cardinality of the attribute. • For a set of disjunctive attributes, only one of the attribute choices can be selected and associated with an object instance. • A given XML instance can be validated against these guidelines by checking the consistency of the document's content with its ORA-SS schema definitions.
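
The disjunctive-relationship guideline can be checked by recording, for each parent instance, which object classes its children come from. A minimal Python sketch follows; the function `disjunction_violations` and the class names ‘A’ and ‘B’ are hypothetical.

```python
def disjunction_violations(rel_instances, class_of):
    """Return {parent: classes} for parents whose children are drawn
    from more than one object class of the disjunctive set.

    rel_instances -- iterable of (parent, child) pairs
    class_of      -- maps each child object to its object class name
    """
    chosen = {}
    for parent, child in rel_instances:
        chosen.setdefault(parent, set()).add(class_of[child])
    return {p: classes for p, classes in chosen.items() if len(classes) > 1}


class_of = {"a1": "A", "a2": "A", "b1": "B"}
mixed = [("p1", "a1"), ("p1", "b1"), ("p2", "a2")]
```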

  21. Validation (XML) • XML document conforming to the corrected ORA-SS schema example • Validating the parent participation constraints of the relationship type cs • The validation shows that ‘Course8’ and ‘Course9’ do not satisfy the parent participation constraint of a minimum of 4 students per course. • Similarly, we can check the child participation constraint of the relationship ‘cs’.
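
The XML document used on the slide is not reproduced in this transcript, so the Python sketch below runs on a made-up document: the element and attribute names (`enrolment`, `course`, `student`, `code`, `id`) and the student lists are assumptions, and only the course names ‘Course8’ and ‘Course9’ and the minimum of 4 come from the slide. It counts the `student` children of each `course` element with the standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML instance; the layout is assumed, not taken from the slides.
XML_TEXT = """
<enrolment>
  <course code="Course7">
    <student id="s1"/><student id="s2"/><student id="s3"/><student id="s4"/>
  </course>
  <course code="Course8">
    <student id="s1"/><student id="s2"/>
  </course>
  <course code="Course9">
    <student id="s3"/>
  </course>
</enrolment>
"""


def courses_below_minimum(xml_text, minimum=4):
    """Return the codes of courses with fewer than `minimum` students,
    in document order."""
    root = ET.fromstring(xml_text.strip())
    return [
        course.get("code")
        for course in root.iter("course")
        if len(course.findall("student")) < minimum
    ]


failing = courses_below_minimum(XML_TEXT)
```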

  22. Conclusion • Contribution of this work • Definition of a formal mathematical semantics for the ORA-SS diagrammatic data modeling notation. • It provides a rigorous formal foundation for the ORA-SS language. • It can be adopted by many semistructured data applications which use the ORA-SS data model. • Definition of some guidelines for validating ORA-SS data models at both the schema diagram level and the XML instance level. • These can be used as a template for applications that implement a validation algorithm for ORA-SS semistructured data. • Demonstration of some reasoning steps using the Z ORA-SS semantics in validating customized ORA-SS schema diagrams and XML instances. • Proof steps are presented through a simple ORA-SS data model. • More complicated proofs can also be constructed for validating large semistructured documents.

  23. Conclusion • Future Work • Extending the work and concentrating on the automatic validation of semistructured data in Z. • Developing a translation program that automatically transforms an XML instance into its corresponding Z ORA-SS instance representation for machine validation. • Extending the current Z semantics of the ORA-SS language to model the normalization problems in semistructured data design.
