The Enterprise Data Management Council Semantics Repository Case Study

23 June 2010 - PowerPoint PPT Presentation



The Enterprise Data Management Council Semantics Repository Case Study. Mike Bennett EDM Council Inc. Overview. EDM Council Case Study Review format requirements Ontology framework Metamodel Adaptations Extensions Relation to other standards Common terms and standards

Presentation Transcript

The Enterprise Data Management Council Semantics Repository Case Study

Mike Bennett

EDM Council Inc.


Overview

EDM Council Case Study

Review format requirements

Ontology framework

Metamodel

Adaptations

Extensions

Relation to other standards

Common terms and standards

Ontology modeling standards

Proof of Concept activities


EDM Council Requirements

The EDM Council

“A non-profit trade association focussed on managing and leveraging enterprise data as a strategic asset to enable financial institutions to increase efficiency, minimize risk, and create competitive advantage”

Industry requirement

Consistent Terms, Definitions and Relationships

Growing realization that this needs a semantic approach


History: Financial Standards

MDDL: Market data XML

Technical (XML) message standard

Good design means weak semantics

Tried a semantic spreadsheet, but technical folks did not use it

ISO TC68/SC4/WG11 FIBIM

Logical data model (UML)

Intended to be semantic but using unextended UML

Not a widely recognized approach to semantics

Business experts were unable to comment on a “design” model they did not understand

Industry Conclusion: “We need a semantics standard”


Semantic Model Requirements

Position as “Conceptual Model”

Same rules apply as for Requirements Specifications

Must be owned and validated by business

Manage the “Language interface” between tech and business subject matter experts

Everything should be in English

No techie terms and casing like “objectProperty”

Everything should be reviewable

Spreadsheets

dialect-free diagrams

Align with other financial industry standards

ISO 20022, EFAMA, FpML, XBRL, MISMO


Early Experiment

Looked at what a truly “semantic” layer would be

Exercise for the ISO WG responsible for ISO 20022 FIBIM model

Example semantic model of industry terms

Used TopBraid Composer and Protégé

Did it meet “Usability” requirement?

Did the semantics stack up?



Example “Thing”: Equity

Real world definition of Equity:

"An equity is a financial instrument setting out a number of terms which define rights and benefits to the holder in relation to their holding a portion of the equity within the issuing company".


What is an Equity?

Financial Instrument

Equity

Is a kind of

Equity security

In relation to

Instrument Terms

Has rights defined in

Or to put it another way…


What is an Equity?

Using OWL to define the classes of real things in the world, and the facts about those things

Modeled in TopBraid Composer
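
The Equity example can be read as a small set of subject, predicate, object facts. The Python below is only an illustrative sketch: `FACTS`, `objects_of` and `ancestors` are names invented here, not repository content or TopBraid Composer output. The "in relation to" link to Equity security is left out because the flattened diagram does not show which Thing it starts from.

```python
# Hypothetical illustration of the Equity slide as subject-predicate-object facts.
# The data structure and function names are assumptions, not repository content.
FACTS = [
    ("Equity", "is a kind of", "Financial Instrument"),
    ("Equity", "has rights defined in", "Instrument Terms"),
]


def objects_of(subject, predicate, facts=FACTS):
    """Return every object asserted for the given subject and predicate."""
    return [o for s, p, o in facts if s == subject and p == predicate]


def ancestors(thing, facts=FACTS):
    """Follow 'is a kind of' upwards and return all parents found."""
    found, seen, frontier = [], {thing}, [thing]
    while frontier:
        current = frontier.pop()
        for parent in objects_of(current, "is a kind of", facts):
            if parent not in seen:  # guard against accidental cycles
                seen.add(parent)
                found.append(parent)
                frontier.append(parent)
    return found


print(ancestors("Equity"))
print(objects_of("Equity", "has rights defined in"))
```

Keeping the predicates as plain English phrases mirrors the requirement that business reviewers never see constructs such as “objectProperty”.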


Financial Semantics in OWL

Pizza approach

“Everything is a Thing”

What about common terms?

accounting terms for equity, debt, cashflow

Places, time concepts

Legal terms (securities are contracts)

Better partitioning needed


Conclusions

Does not provide views that business SMEs could validate

Requires them to interpret OWL terms and diagrams

They want spreadsheets and simple diagrams of “Things” and relations

Does not allow for common reusable terms outside of the financial services industry


EDM Council Members View

What can be seen and understood immediately?

Spreadsheets

Agreed on a spreadsheet format which could

Represent most OWL features

Simple as possible but no simpler

Simple block diagrams

Boxes and Lines like Visio

Process Flowcharts

Not required for static terms semantics but keep in mind when we need to model business processes


Semantics Repository Strategy

OWL was the way to go for semantics models

May need to extend for things OWL does not do well

OWL tools were almost but not quite ready for review by business domain experts

Any appearance of dialect or “techie” constructs would limit the review audience and therefore the quality of the model

Use a UML Modeling tool for flexibility

Generate spreadsheets from this

Create “UML-free” diagrams by turning off all UML features

Display the results on a dedicated web structure

Review this with industry SMEs.


Ontology Definition Metamodel

Metamodel and Profile for OWL in UML

Early draft available when we started this

OWL version 1

UML Tool

Enterprise Architect from Sparx Systems

Implemented metamodel of RDF/RDFS

Implemented metamodel of OWL

Used recommended stereotypes in Profile

Results in OWL and RDF/S toolbars in EA

Added stereotypes for non-stereotyped items in ODM so all on one toolbar e.g. Generalization, RDFS Sub Property, Unions.

Tweaks

Various tweaks to maintain user diagram commitments

Recast all terms in English to maintain user language commitments

Exposed predicate logic statements as separate “Logic” classes in XML Spy-like format


English versus OWL concepts

Each spreadsheet and diagram feature corresponds to some OWL concept:

Thing = OWL class

Simple fact = Datatype Property

Relationship fact = Object Property

with UML multiplicity, or

with predicate logic statement applied to the range

Special “Logic” classes to visually render logic combinations

Mutually Exclusive = Disjoint

Logical Union = OWL Union
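
As a rough sketch of how this correspondence could drive a review spreadsheet, the Python below maps OWL constructs back to the English labels and writes one CSV row per term. The column names, the function names and the sample definition text are assumptions made for illustration; the slides do not specify the actual spreadsheet layout.

```python
import csv
import io

# English review vocabulary mapped to the OWL construct it stands for.
# The pairs follow the slide; the dictionary itself is a hypothetical helper.
ENGLISH_TO_OWL = {
    "Thing": "owl:Class",
    "Simple fact": "owl:DatatypeProperty",
    "Relationship fact": "owl:ObjectProperty",
    "Mutually Exclusive": "owl:disjointWith",
    "Logical Union": "owl:unionOf",
}
OWL_TO_ENGLISH = {owl: english for english, owl in ENGLISH_TO_OWL.items()}


def review_row(name, owl_construct, definition):
    """Build one spreadsheet row that hides the OWL term from reviewers."""
    return {
        "Term": name,
        "Kind": OWL_TO_ENGLISH[owl_construct],
        "Definition": definition,
    }


def to_csv(rows):
    """Serialise review rows as CSV text (column names are assumed)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["Term", "Kind", "Definition"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = [review_row("Equity", "owl:Class", "A financial instrument")]
print(to_csv(rows))
```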


OWL v UML in ODM Implementation

Class = class

Object Property = Association Class

Datatype Property = Attribute

Inverse of a Property = “inverse” Association (red)

Predicate Logic

Simple: Multiplicity

Complex: Exposed as “Logic” class icons

Disjoint With = “mutually exclusive” Association (red)

RDFS Datatype = datatype (same XML set)

Enumerated data range = enumeration

RDFS Sub-property of = “from” Generalization (green)

OWL Union Class: ODM recommends a UML Covering Generalization Set with no stereotype; here it is stereotyped as “union” (purple)


Resulting Model Framework

Modelling tool generates diagrams and spreadsheets content

Diagrams and models show:

Things

Facts about those Things

- Simple facts - names, dates etc

- Relationship Facts - relating one Thing to another

Framed within a technology neutral theory of meaning


Theory of Meaning

Set theory constructs

Each Thing or class is a set

“Is A” relationship defines taxonomy

“What kind of thing is it?”

Facts about those things

Relationship Facts; Simple Facts

“What facts distinguish this thing from other things?“

Include necessary and contingent facts

Identify mutually exclusive sets

OWL1 does not support a “Completely exhaustive” set of sub-classes

Additional written definitions against each term

Reviewed and agreed by business domain SMEs

NOTE: This is implemented as OWL Full

No limitations are imposed on how a modeler can use the framework to set down facts as they see them.
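
The set-theory reading can be made concrete with ordinary Python sets. This is a simplified sketch: the individuals and function names are invented for illustration, and finite sets of known members are a closed-world simplification that the repository itself does not make.

```python
# Each Thing (class) is modelled as the set of its members.
# All individuals below are hypothetical examples.
financial_instrument = {"ACME ordinary share", "ACME 5% bond"}
equity = {"ACME ordinary share"}
debt = {"ACME 5% bond"}


def is_a(sub, sup):
    """'Is A' holds when every member of sub is also a member of sup."""
    return sub <= sup


def mutually_exclusive(a, b):
    """Two Things are mutually exclusive when they share no members."""
    return a.isdisjoint(b)


def completely_exhaustive(parent, *children):
    """The children are exhaustive when together they cover the parent."""
    return set().union(*children) == parent


print(is_a(equity, financial_instrument))
print(mutually_exclusive(equity, debt))
print(completely_exhaustive(financial_instrument, equity, debt))
```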



Diagrams

Block diagrams are derived from the modelling tool but with all the UML features turned off

No + signs, no < > brackets, no camelCase

Domain experts “know they don’t know” what those symbols would mean

The diagrams show:

The hierarchy of Things

Relationships between those Things

Simple facts about Things (optional)

Additional diagram types show relationships among relationships, for review by ontology experts and those business domain experts who are able to “Keep up” with what many see as the more philosophical aspects


Sample Screenshot

Thing

“Is A” relations

Object Property

(Relationship Fact in English)


Sample screenshot 2: Different types of Thing


Sample Screenshot 3: Simple Facts



Comparison with Data Models

Set theory classes not OO classes

Relationships are unidirectional

Pair of relationship + inverse = one OO relationship

Open World Assumption

“Absence of evidence is not evidence of absence”

Every fact which defines a thing is included even if data would never be available or is not needed

Taxonomy

Multiple inheritance

Supports real world multiple classifications

Data model enumerations

Mixed semantics in reference data models

Usually points to further semantic modelling requirement
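
Two of these contrasts, unidirectional relationships with inverses and the Open World Assumption, can be sketched in a few lines of Python. The facts, the inverse table and the function names are hypothetical and chosen only for illustration.

```python
from enum import Enum


class Answer(Enum):
    TRUE = "true"
    UNKNOWN = "unknown"


# Relationship facts are stored one direction at a time (hypothetical data).
FACTS = {("ACME ordinary share", "is issued by", "ACME Corp")}
INVERSE = {"is issued by": "issues"}


def with_inverses(facts, inverse=INVERSE):
    """Add the inverse fact for every relationship that declares one."""
    result = set(facts)
    for subject, predicate, obj in facts:
        if predicate in inverse:
            result.add((obj, inverse[predicate], subject))
    return result


def ask_open_world(fact, facts):
    """Open world: a missing fact is unknown, not false."""
    return Answer.TRUE if fact in facts else Answer.UNKNOWN


def ask_closed_world(fact, facts):
    """Closed world, as in a typical data model: missing means false."""
    return fact in facts


all_facts = with_inverses(FACTS)
query = ("ACME ordinary share", "is listed on", "Some Exchange")
print(ask_open_world(query, all_facts), ask_closed_world(query, all_facts))
```

The pair of stored facts (the relationship plus its inverse) corresponds to what an object-oriented model would treat as a single bidirectional relationship.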


Common Concepts Treatment

Goal is interoperability with standards terms

E.g. XBRL accounting concepts

Define what kind of “Thing” everything is

Securities are contracts

Tradable Securities v OTC Contracts

Need to define high level primitive concepts

And the necessary relationships among those concepts


Contexts and Events

Digital rights DOI standard

“Context” is something with Time and Place

English: Something with Time and Place is an Event

Event with Actor is an Activity

Processes are made up of activities

Extended to Activity and Process Model concepts

Could we replicate Visio-style process flow models with semantics?
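
The chain from Event to Activity to Process can be sketched with Python dataclasses. The field names and the sample values are assumptions for illustration only; the slides do not define the actual properties.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Event:
    """Something with a Time and a Place."""
    time: str
    place: str


@dataclass
class Activity(Event):
    """An Event with an Actor."""
    actor: str


@dataclass
class Process:
    """A Process is made up of Activities."""
    name: str
    activities: List[Activity] = field(default_factory=list)


# Hypothetical example values.
issue = Activity(time="2010-06-23", place="New York", actor="Issuer")
issuance = Process(name="Issuance", activities=[issue])
print(isinstance(issue, Event), len(issuance.activities))
```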



The “Grammar” approach

Extended the thinking with activity and process model, to all concepts

Legal top level model (contracts, terms, laws etc.)

Geopolitical concepts

Time, Information (identifiers etc.)

Defined every common concept we could think of

thing and relationship fact

Necessary relationships among these defines a “Grammar” which is both inherited and specialized

Unlike stereotypes, these are also part of the model content

Therefore we call them Archetypes

Implemented as UML Stereotypes in UML profiles

Importing these profiles results in editing toolbar for each set of common concepts

Now we can model everything using semantic toolbars




Top level taxonomy

Partitioned the top of the model into different classes of “Thing”

None of our archetypes is directly a “Thing”

Based on Knowledge representation (KR) Lattice

(John F Sowa, 2000)

independent, relative and mediating

physical and abstract

continuant and occurrent

Define all common terms in line with these 3 partitions

Added parts, time concepts etc.

Allows for cleaner management of data model terms

Relative concepts like Issuer v repetitive data structures

Introduces “Occurrent” partition in contrast to Continuant
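
One way to picture the three partitions is as a check that every term picks exactly one value from each. In the Python sketch below the partition keys (`independence`, `substance`, `temporality`) and the function name are labels invented here. Only the "relative" classification of Issuer comes from the slide; its other two values are placeholders.

```python
# The three partitions of the KR Lattice as named on the slide.
# The dictionary keys are hypothetical labels, not repository terms.
PARTITIONS = {
    "independence": {"independent", "relative", "mediating"},
    "substance": {"physical", "abstract"},
    "temporality": {"continuant", "occurrent"},
}


def partition_problems(choice):
    """List what is wrong with a term's choice of one value per partition."""
    problems = []
    for name, allowed in PARTITIONS.items():
        value = choice.get(name)
        if value is None:
            problems.append(f"missing {name}")
        elif value not in allowed:
            problems.append(f"{name}: {value!r} is not one of {sorted(allowed)}")
    return problems


# "relative" is from the slide; the other two values are placeholders.
issuer = {
    "independence": "relative",
    "substance": "abstract",
    "temporality": "continuant",
}
print(partition_problems(issuer))
```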



Semantics Repository Content

  • Top level: KR Lattice hierarchy

    • Independent v relative v mediating

    • continuant v occurrent

    • concrete v abstract

  • Mid level: Global terms

    • Accounting

    • Legal

    • Math etc.

  • Financial Instruments Ontology

    • Instruments reference terms

    • Dated and Time-dependent terms

    • Processes

  • Common types and selection lists

Semantics Repository Content

  • Top level: KR Lattice hierarchy

    • Independent v relative v mediating

    • continuant v occurrent

    • concrete v abstract

  • Mid level: Global terms

    • Accounting

    • Legal

    • Math etc.

    • These need to be aligned with the best of the rest

  • Financial Instruments Ontology

    • Instruments reference terms

    • Dated and Time-dependent terms

    • Processes

  • Common types and selection lists


Summary:

Presentation to Business SMEs

spreadsheets and tables

simple block diagrams

Ontology framework

Partitioning: Terms are descended from one term in each of the three partition layers

High level grammars define syntax of meaningful connections among Archetypes

e.g. a Transaction always has certain Parties

Most of these common terms will be found in industry standards for the relevant industries.
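
The idea of a grammar that is both inherited and specialized can be sketched as required relationship facts that accumulate down the hierarchy. In the Python below, only "a Transaction always has certain Parties" comes from the slide. The "Securities Transaction" term, the "has security" fact and all function names are hypothetical.

```python
# Parent of each archetype (hypothetical specialization for illustration).
PARENT = {"Securities Transaction": "Transaction"}

# Relationship facts each archetype must have. Only the Transaction rule
# is taken from the slide; the second entry is an invented specialization.
REQUIRED = {
    "Transaction": {"has party"},
    "Securities Transaction": {"has security"},
}


def required_facts(term):
    """Collect requirements from the term and everything it inherits from."""
    needed = set()
    while term is not None:
        needed |= REQUIRED.get(term, set())
        term = PARENT.get(term)
    return needed


def missing_facts(term, asserted):
    """Return the required relationship facts that have not been asserted."""
    return required_facts(term) - set(asserted)


print(sorted(required_facts("Securities Transaction")))
print(missing_facts("Securities Transaction", ["has security"]))
```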


Standards Bodies

Financial Securities Industry

MDDL: Market Data Definition Language (SIIA/FISD)

ISO 20022: Securities messaging (TC68)

Registration Authority = SWIFT

ISO 20022 FIBIM

WG11 draft from ISO TC68/SC4/WG11

FpML: Financial Products Markup Language (ISDA)

EFAMA Data dictionary (European Funds and Asset Management Association)

FIX: Financial Information eXchange (FPL)

Global Terms

XBRL: eXtensible Business Reporting Language (XBRL Inc.)

MISMO: Standard for loans etc.

REA (Resources, Events, Agents) Ontology (William E McCarthy, Michigan State University)

DOI Indecs (Digital rights standard)


Financial Industry Standards

Reverse engineered these standards as initial repository content:

Reference Data Terms

ISO 20022 “FIBIM”

EFAMA (Funds)

Timed and Dated (Market Data) terms

MDDL

Over the Counter Derivatives:

FpML

Future / Proof of Concept

MISMO (Loans standard)

Terms currently imported from project participants, to be realigned with MISMO


Global Concepts Standards

Financial Terms

XBRL: accounting standard reporting format

Used in creating the Financial (Accounting) high level model

Disregarded reporting-specific terms

Relationships as per basic accounting literature

XBRL terms have corresponding archetypes in the SR

Other terms to be aligned as material becomes available

REA – partly incorporated

UN-FAO – partly incorporated


Securities Trading Terms

FIX: Financial Information eXchange (FIX) format from FPL

Would cover pre-trade terms when we model securities trading lifecycle.

Data Model Working Group (DMWG)

FIX liaising with MDDL in DMWG initiative

EDM Council actively participating in this initiative

DMWG will align with EDM Council Semantics Repository


Ontology Format Extension

Things that are not in scope of OWL itself

Synonym (not owl:sameAs)

Archetype

Classification facets

Provenance of meaning

to identify standards bodies / originators

Hope to collaborate on standardized use of OWL Annotation Properties

N-ary relationships would also be useful

Shown diagrammatically at present

SME view is that this is needed

Next iteration will include OWL2 relationship transitivity
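
A rough sketch of how these extensions might sit alongside a term as annotation-style metadata follows. The `Term` record, its field names, the `lookup` helper and the sample values are all assumptions made for illustration; the slides do not define the actual annotation properties. A synonym here is just an alternative label for the same term, which is why it is not modelled as owl:sameAs between two separate resources.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Term:
    """A term with extension metadata (all field names are hypothetical)."""
    name: str
    definition: str
    synonyms: List[str] = field(default_factory=list)
    archetype: Optional[str] = None
    provenance: Optional[str] = None  # standards body or originator


def lookup(terms, label):
    """Find a term by its name or any synonym, ignoring case."""
    wanted = label.casefold()
    for term in terms:
        labels = [term.name] + term.synonyms
        if any(wanted == candidate.casefold() for candidate in labels):
            return term
    return None


# Placeholder values for illustration only.
terms = [
    Term(
        name="Equity",
        definition="A financial instrument",
        synonyms=["Share"],
        archetype="Contract",
        provenance="example standards body",
    )
]
print(lookup(terms, "share").name)
```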


Standards Liaison Strategy

Meta-level terms

Standardize within Annotation Properties

Identify the ones of relevance to ontology practitioners

Common Upper Ontology Terms

Recognize provenance of industry standards bodies is more valuable than isolated ontologist assertions

Identify bodies who are developing semantic versions of well attested standard terms and business definitions

Ontolog Forum SIO Initiative – take this as canonical form of industry standards semantics recognition and provenance

Financial Services Industry

ISO 20022: Mapping SR to latest FIBIM

Liaise with ISO TC68 on next generation semantic layer for ISO 20022 v2

EFAMA Data Dictionary mapping

FpML review of current OTC Derivatives draft

XBRL

Identify a canonical XBRL “Taxonomy” (ontology) and align formally


History

Initial design: May ’08 – Sept ’08

Format review panel – finalized formats of spreadsheets, diagrams and web layout

Roadshow presentations

Initial draft: Jan ’09

Weekly SME Reviews to July ’09

Draft released July ’09

Weekly SME Reviews: Pricing, OTC

Proof of Concept and Validation

Baseline for changes Feb 2010

May 2010 Beta release


Content status

Beta Status

Reference terms for tradable Securities

Draft

Pricing, Analytics etc. (market data)

OTC Derivatives

New in draft

Detailed loan and mortgage terms for PoC

To Do:

Corporate Events and Actions (CAE)

Securities Transactions Processing


Proof of Concept Project

Securitization (MBS Issuance)

ECB, NY Fed, IBM Research, banks, agencies

Demonstrate ability to tag new instruments semantically at issue

Plans to make this mandatory

Basis for systemic risk regulation

Transformation to Semantic Data Model

New material: Loans and Mortgages model

Findings

The domain experts get it

Many terms not in ISO standards

Will feed these into ISO 20022

Refining these with domain experts

Complete view of poorly understood securities and missing data linkages (sub prime etc.)

It is realistic to tag securities terms at issue, when the semantics are still clearly defined within formal prospectus and other docs, as meanings are grounded legally.


The Future

Further work on OTC Derivatives

Corporate Events and Actions

Track semantics standards evolution (OWL, ODM)

Align upper ontology with semantic industry standards as these evolve

Align with Ontolog Forum “Sharing and Integration Ontologies” (SIO) initiative

ISO Alignment

Alignment of content with ISO 20022 Logical Data Model

ISO 20022 version 2 semantics layer

work with TC68 on model standard

update the core modeling concepts in line with this

Objective: Move from a working prototype model framework to something more standard while contributing our model concepts to industry


