
Discovering Mappings Between Schemas: The Different Approaches (Schema Matching: Different Approaches)

Khalid Saleem

LIRMM


Schema and Ontology

[Figure labels: RDF Schema, XML Schema, XML, RDF, OWL]

  • Schemas represent the Database community

    • Schemas often do not provide explicit semantics for their data (ER, XML document schema).

  • Ontologies represent the AI community

    • Ontologies are logical systems that themselves obey some formal semantics. They are designed to be interpreted by computers for reasoning (OWL).

  • Schemas and Ontologies are similar in the sense that

    • Both provide a vocabulary of terms that describes a domain

    • Both constrain the meaning of the terms used in the vocabulary (hierarchy/relations)


Schema vs Ontology: examples

XML:

    <class-def>
      <name>branch</name>
      <slot-constraint>
        <name>is-part-of</name>
        <has-value>tree</has-value>
      </slot-constraint>
    </class-def>

DAML+OIL:

    class-def animal
    % plants are a class that is disjoint from animals
    class-def plant subclass-of NOT animal
    % it is necessary but not sufficient for a tree to be a plant:
    class-def tree subclass-of plant
    % branches are PART OF trees
    class-def branch
      slot-constraint is-part-of has-value tree
    % it is necessary and sufficient for a carnivore to be an animal:
    class-def defined carnivore subclass-of animal
      slot-constraint eats value-type animal
    % herbivores eat only plants OR part of plants
    class-def defined herbivore subclass-of animal
      slot-constraint eats value-type plant OR
        (slot-constraint is-part-of has-value plant)


Match

  • Takes two schemas/ontologies as input and produces a mapping between the elements of the two schemas that correspond semantically to each other

Example: Books in two sources

  Source A, Books: price, book-title, author-name
  Source B, Books: listed-price, title, a-fname, a-lname

  Sample rows:
    16,50   Nous Les Dieux        Bernard Werber
    24      Pompei                Robert Harris
    26,60   Harry Potter          J. K. Rowling
    11,50   Marie Des Intrigues   Juliette Benzoni

Match types shown in the figure: 1-1 match and complex match (presumably author-name against a-fname and a-lname).
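A minimal sketch of how the result of a match over the Books example might be represented, assuming each correspondence maps one or more Source B attributes to one Source A attribute. The `Match` class, the `translate` helper and the specific pairing of attributes are illustrative assumptions, not taken from the presentation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class Match:
    """One correspondence from Source B attributes to a Source A attribute."""
    target: str                  # attribute name in Source A
    sources: Tuple[str, ...]     # one or more attribute names in Source B
    combine: Callable[..., str]  # builds the target value from the source values

    @property
    def kind(self) -> str:
        # A single source attribute gives a 1-1 match, several give a complex match.
        return "1-1" if len(self.sources) == 1 else "complex"


def identity(value: str) -> str:
    return value


def join_names(first: str, last: str) -> str:
    return f"{first} {last}"


# Hypothetical mapping for the Books example (the pairing is an assumption).
MAPPING = [
    Match("price", ("listed-price",), identity),
    Match("book-title", ("title",), identity),
    Match("author-name", ("a-fname", "a-lname"), join_names),
]


def translate(row_b: Dict[str, str]) -> Dict[str, str]:
    """Rewrite a Source B row into Source A's vocabulary using MAPPING."""
    return {m.target: m.combine(*(row_b[s] for s in m.sources)) for m in MAPPING}


row = {"listed-price": "24", "title": "Pompei", "a-fname": "Robert", "a-lname": "Harris"}
translated = translate(row)
kinds = [m.kind for m in MAPPING]
```

Keeping the combining function inside each correspondence is one way to make a complex match executable; a real matcher would typically only propose the correspondence and leave the expression to a later mapping step.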


Schema Matching vs Ontology Matching

  • Schema matching is usually performed with techniques that try to guess the meaning encoded in the schemas

  • Ontology matching tries to exploit the knowledge explicitly encoded in the ontologies

In real-world applications:

Solutions from both domains are mutually beneficial


Application Domains

  • Traditional (Static)

    • Schema Integration

    • Data warehousing

    • E-commerce

    • Catalogue Integration

  • New Frontiers (Dynamic)

    • Semantic Query Processing

    • Agent Communication

    • Web Services Integration

    • P2P Databases


Basic Classification of Matchers [RB01]

  • Schema vs Data Instance

  • Element vs Structure

  • Language vs Constraint

    • String based: prefix, suffix, e.g. auth : author

    • Tokenization, Lemmatization, Elimination [GSY04]

      Tool_Kit : (Tool, Kit), Kits : Kit, IsRelatedTo : Related

    • Data types, value domain, e.g. 1..12 : month

  • Match Cardinalities: 1:1, 1:n, n:m

    (Tel Res, Other) : (Tel Day, Evening, Night)

  • Auxiliary Information

    • Global Schema, Dictionaries, Thesauri, Previous Match Decisions, User Input
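The language-based and constraint-based techniques listed above can be sketched roughly as follows. This is an illustration only: the stop-word list, the plural-stripping rule standing in for a real lemmatizer, the minimum affix length of three characters and all function names are assumptions made for the example.

```python
import re
from typing import Iterable, List

# Hypothetical stop-word list used for the "elimination" step.
STOP_WORDS = {"is", "to", "of", "the", "a", "an"}


def tokenize(label: str) -> List[str]:
    """Split a label on underscores, hyphens, spaces and camelCase boundaries."""
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", label)
    return [t.lower() for t in re.split(r"[\s_\-]+", spaced) if t]


def lemmatize(token: str) -> str:
    """Crude stand-in for a lemmatizer: strips a single plural 's'."""
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def normalize(label: str) -> List[str]:
    """Tokenization, then lemmatization, then elimination of stop words."""
    lemmas = [lemmatize(t) for t in tokenize(label)]
    return [t for t in lemmas if t not in STOP_WORDS]


def affix_match(a: str, b: str) -> bool:
    """String-based test: the shorter label is a prefix or suffix of the longer."""
    short, long_ = sorted((a.lower(), b.lower()), key=len)
    return len(short) >= 3 and (long_.startswith(short) or long_.endswith(short))


def in_value_domain(values: Iterable[int], low: int, high: int) -> bool:
    """Constraint-based test: every instance value lies in [low, high]."""
    return all(low <= v <= high for v in values)


tool_kit = normalize("Tool_Kit")
kits = normalize("Kits")
related = normalize("IsRelatedTo")
```

With these helpers, `affix_match("auth", "author")` holds, and a column whose values all pass `in_value_domain(values, 1, 12)` becomes a candidate for a month attribute.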


Basic Classification of Matchers [SE05]

  • Structure Level Techniques

    • Graph Matching

    • Children

    • Leaves

    • Relations

  • Taxonomy based Techniques

    e.g. if the super-concepts are the same, then the sub-concepts are taken to be the same, or vice versa

  • Model Based

    • ER, XML or XML schema, OWL, OO etc.

Combinational Matchers[RB01]

  • Hybrid Matcher

  • Multiple/Composite Matcher
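As a rough illustration of a composite matcher, the sketch below runs an element-level matcher and a structure-level (children) matcher independently and combines their scores with a weighted average. The element representation, the equal weights and the choice of `difflib.SequenceMatcher` and Jaccard overlap as similarity measures are assumptions for the example, not the method of any tool named in this presentation.

```python
from difflib import SequenceMatcher
from typing import Callable, List, Set, Tuple

Element = Tuple[str, Set[str]]  # (label, labels of the child elements)
Matcher = Callable[[Element, Element], float]


def name_similarity(a: Element, b: Element) -> float:
    """Element-level matcher: string similarity of the two labels."""
    return SequenceMatcher(None, a[0].lower(), b[0].lower()).ratio()


def children_similarity(a: Element, b: Element) -> float:
    """Structure-level matcher: Jaccard overlap of the child label sets."""
    union = a[1] | b[1]
    # Assumption: two leaves count as structurally identical.
    return len(a[1] & b[1]) / len(union) if union else 1.0


def composite_similarity(a: Element, b: Element,
                         matchers: List[Tuple[Matcher, float]]) -> float:
    """Weighted average of the scores returned by independent matchers."""
    total_weight = sum(weight for _, weight in matchers)
    return sum(weight * m(a, b) for m, weight in matchers) / total_weight


# Hypothetical weights; in practice they would be tuned or supplied by the user.
MATCHERS = [(name_similarity, 0.5), (children_similarity, 0.5)]

author = ("author", {"name"})
writer = ("writer", {"name"})
title = ("title", set())

same = composite_similarity(author, author, MATCHERS)
close = composite_similarity(author, writer, MATCHERS)
far = composite_similarity(author, title, MATCHERS)
```

A hybrid matcher would instead fold both criteria into a single matching procedure; the composite form keeps the matchers separate so they can be added, removed or reweighted.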


Match Dimensions [SE05]

Designing a match algorithm requires knowing how it will be used, i.e. its dimensions:

  • Input of the Algorithm

    • Data or Schema, Element Level or Structure Level

  • Characteristics of the Matching Process

    • Whether exact or approximate matching is required

    • Whether performance takes priority over quality

  • Output of the Algorithm

    • The output is either a graded result, or one of a set of match results that are combined into the final mapping


Existing Matching Tools

  • Cupid[MBR01]

  • COMA (COMA++)[ADMR05]

  • Similarity Flooding

  • SemInt

  • Artemis

  • DIKE

  • TransScm

  • AutoMed

  • Charlie[TBBT04]

Ontology-Specific

  • NOM/QOM

  • OLA

  • Anchor-PROMPT

  • S-Match [GSY04]

  • HICAL

  • SKAT


Matching Tools continued

Machine Learning

  • GLUE (LSD, CGLUE)[DMDH02]

  • Automatch

  • These tools do not completely fulfil the requirements of large scale schema matching because

    • They are not fully automated

    • They place little emphasis on search space optimisation


Our Approach

  • Motivation:

    • Large Scale Scenario

      Peer-to-peer Information Systems over the XML Web

  • Our Schema Matching and Integration Approach

    • Tree Mining Techniques

      • Name Matcher

      • Element Level Matching

      • Structure Level Matching

[Figure: schema trees whose nodes carry single-letter labels, annotated "Search sub-trees"]

Label legend: a: author, b: book, d: detail, f: information, g: general, h: birth, i: isbn, n: name, o: own-books, p: publisher, r: price, t: title, w: writer

Matched labels: a=w, b=o, f=d
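The label equivalences on this slide (a=w, b=o, f=d) amount to grouping synonymous labels into clusters. A minimal sketch of that grouping, assuming the synonym pairs are already known (for example from a thesaurus) and using a simple union-find structure; the function name and the representation are illustrative choices, not the author's implementation.

```python
from typing import Dict, Iterable, List, Tuple


def cluster_labels(labels: Iterable[str],
                   synonym_pairs: Iterable[Tuple[str, str]]) -> List[List[str]]:
    """Group labels into clusters of synonyms using union-find."""
    labels = list(labels)
    parent: Dict[str, str] = {label: label for label in labels}

    def find(x: str) -> str:
        # Follow parent links to the cluster representative, halving the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in synonym_pairs:
        parent[find(a)] = find(b)

    clusters: Dict[str, List[str]] = {}
    for label in labels:
        clusters.setdefault(find(label), []).append(label)
    return sorted(sorted(members) for members in clusters.values())


LABELS = ["author", "book", "detail", "information", "general", "birth", "isbn",
          "name", "own-books", "publisher", "price", "title", "writer"]
SYNONYMS = [("author", "writer"), ("book", "own-books"), ("information", "detail")]

clusters = cluster_labels(LABELS, SYNONYMS)
```

Each cluster can then be treated as a single label when sub-trees are compared across schemas.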


Tree Mining Approach

[Figure: example schema tree with node numbers and scope values]

    book (b)            n0 [0,5]
      author (a)        n1 [1,2]
        name (n)        n2 [2,2]
      publisher (p)     n3 [3,4]
        name (n)        n4 [4,4]
      title (t)         n5 [5,5]

Inspired by tree mining algorithms and data structures based on node scope values, which are computed top-down by a depth-first pre-order traversal [Z02].

  • Our work extends these data structures to the schema matching and integration process, so that large sets of XML schema trees can be handled.

  • Employs

    • Element level Name Matcher (same node label or synonym)

      • Cluster similar/synonym labels

    • Utilizes the properties of node scope values to extract semantics from the structure

      • E.g. the node labelled name, n2 [2,2], is a descendant of the node labelled author, n1 [1,2], and not of the node labelled publisher, n3 [3,4]; this is verified using the descendant test

Descendant Node Check:

If the scope of node x is [X,Y] and the scope of node xd is [Xd,Yd], then xd is a descendant of x when Xd > X and Yd <= Y
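The scope values and the descendant check can be reproduced with a short sketch. It assumes that a node's number is its position in a depth-first pre-order traversal and that its scope is [own number, number of its last descendant], which matches the values shown for the book tree; the tree encoding and function names are illustrative.

```python
from typing import List, Tuple

Tree = Tuple[str, List["Tree"]]  # (label, children)
Scope = Tuple[int, int]


def compute_scopes(root: Tree) -> List[Tuple[str, Scope]]:
    """Return (label, scope) pairs indexed by pre-order node number."""
    scopes: List[Tuple[str, Scope]] = []

    def visit(node: Tree) -> None:
        label, children = node
        number = len(scopes)                      # pre-order number of this node
        scopes.append((label, (number, number)))  # end is fixed after the children
        for child in children:
            visit(child)
        scopes[number] = (label, (number, len(scopes) - 1))

    visit(root)
    return scopes


def is_descendant(scope_d: Scope, scope_x: Scope) -> bool:
    """Descendant check from the slide: Xd > X and Yd <= Y."""
    (xd, yd), (x, y) = scope_d, scope_x
    return xd > x and yd <= y


book: Tree = ("book", [
    ("author", [("name", [])]),
    ("publisher", [("name", [])]),
    ("title", []),
])

scopes = compute_scopes(book)
author_scope, name_scope, publisher_scope = scopes[1][1], scopes[2][1], scopes[3][1]
```

Here `is_descendant(name_scope, author_scope)` holds while `is_descendant(name_scope, publisher_scope)` does not, so the structural relationship is decided by two integer comparisons without walking the tree.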


Tree Mining Approach … continued

  • Data Structure used

    • Label List: sorted list of all node labels in the forest of XML schema trees

    • xGrid: matrix in which each row represents one participating XML tree and each column corresponds to a node label. Each cell contains the scope values, the parent node number and the mapping information.

  • Output

    • Creation of a Mediated Schema Tree , from the given forest of participating XML schema trees.

    • Generation of Mapping Information between participating schema trees and the mediated schema tree
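One way to picture the Label List and the xGrid is sketched below. It assumes each schema tree is given as a pre-order list of (label, scope, parent number) entries, and that a cell holds a list of entries because a label such as name can occur more than once in a tree. These representation choices and the second example tree are assumptions; the mapping field is left empty because the matching step itself is not shown.

```python
from typing import Dict, List, Optional, Tuple

Node = Tuple[str, Tuple[int, int], Optional[int]]  # (label, scope, parent number)

# Two small example trees in pre-order; s2 is a made-up second schema.
s1: List[Node] = [
    ("book", (0, 5), None), ("author", (1, 2), 0), ("name", (2, 2), 1),
    ("publisher", (3, 4), 0), ("name", (4, 4), 3), ("title", (5, 5), 0),
]
s2: List[Node] = [("book", (0, 2), None), ("title", (1, 1), 0), ("price", (2, 2), 0)]


def build_label_list(forest: List[List[Node]]) -> List[str]:
    """Sorted list of all distinct node labels in the forest."""
    return sorted({label for tree in forest for label, _, _ in tree})


def build_xgrid(forest: List[List[Node]], labels: List[str]) -> List[List[List[Dict]]]:
    """One row per tree, one column per label; each cell lists matching nodes."""
    column = {label: i for i, label in enumerate(labels)}
    grid: List[List[List[Dict]]] = [[[] for _ in labels] for _ in forest]
    for row, tree in enumerate(forest):
        for number, (label, scope, parent) in enumerate(tree):
            grid[row][column[label]].append(
                {"node": number, "scope": scope, "parent": parent, "mapping": None}
            )
    return grid


forest = [s1, s2]
labels = build_label_list(forest)
xgrid = build_xgrid(forest, labels)
name_col = labels.index("name")
price_col = labels.index("price")
```

Because the columns follow the sorted Label List, all nodes sharing a label line up in one column across every tree, which is what makes candidate matches cheap to find.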


Tree Mining Approach … continued

[Figure with labels Sm, S1, S2, S3, S4]

Mapping Information is the column number of the node


Conclusion

  • Element level Name and Linguistic Matching, supported by a thesaurus, is an integral part of every match system.

  • As systems move towards schema/ontology based manipulation, and global schemas or previous matching results are often lacking, Structure Level matching is equally important for recovering the semantics.

  • Peer-to-peer environments require new methods that deliver both performance and mapping quality, i.e. the integration of Tree Mining techniques for matching and search space optimisation.

  • Machine Learning algorithms can be beneficial in the P2P environment at later stages, once training examples have been created from instance data, provided the target domain remains the same.


References

  • [AH04] Antoniou G. and van Harmelen F. (2004) A Semantic Web Primer. The MIT Press, 2004

  • [ADMR05] Aumueller D., Do H. H., Massmann S. and Rahm E. (2005) Schema and Ontology Matching with COMA++. In Proceedings of the International Conference on Management of Data (SIGMOD), 2005

  • [BR04] Bellahsène Z. and Roantree M. (2004) Querying Distributed Data in a Super-peer based Architecture. DEXA 2004.

  • [BMP04] Bernstein PA., Melnik S., Petropoulos M. and Quix C. (2004) Industrial-Strength Schema Mapping. SIGMOD Record, Vol. 33, No. 4, December 2004

  • [DMDH02] Doan AH., Madhavan J., Domingos P. and Halevy A. (2002) Learning to Map Ontologies on the Semantic Web. WWW 2002

  • [MBR01] Madhavan J., Bernstein PA. and Rahm E. (2001) Generic Schema Matching with Cupid. VLDB 2001.

  • [RB01] Rahm E. and Bernstein PA. (2001) A Survey of Approaches to Automatic Schema Matching. VLDB Journal 2001, 10(4):334-350

  • [SE05] Shvaiko P. and Euzenat J. (2005) A Survey of Schema-based Matching Approaches. Journal on Data Semantics, 2005.

  • [TBBT04] Tranier J., Baraer R., Bellahsene Z. and Teisseire M (2004) Where’s Charlie: Family Based Heuristics for Peer-to-Peer Schema Integration. IDEAS 2004, 227-235

  • [Z02] Zaki MJ (2002) Efficiently Mining Frequent Trees in a Forest. 8th ACM SIGKDD Int’l Conf. Knowledge Discovery and Data Mining. July 2002

  • http://www.w3.org/TR/daml+oil-reference

  • http://www.doc.ic.ac.uk/automed/


Thank you