Chapter 12: Ontologies and Knowledge Representation. PRINCIPLES OF DATA INTEGRATION. ANHAI DOAN ALON HALEVY ZACHARY IVES. Outline. Introduction to Knowledge Representation and its relevance to data integration Description Logics: a family of KR languages
ANHAI DOAN ALON HALEVY ZACHARY IVES
Mediated schema: ontology with classes and relationships
Data sources: S1 has comedies and S2 documentaries
S3: movies with at least two awards
S4: comedies with at least one Oscar
S1 is relevant to Q1 because Comedy is a subclass of Movie (by subsumption)
S2 is irrelevant to Q2 because Comedy and Documentary are declared disjoint.
S3 is relevant to Q3 because movies with two awards will definitely satisfy the second subgoal.
S4 is relevant to Q3 because oscar is a sub-property of award.
(it should really be a square inclusion)
C,D are complex concepts. A is a base concept.
Many other constructors possible: union, existential quantification, equality on role paths,…
a1: Italians are people (really. Don’t laugh!)
a2: Comedies are movies
a3: Comedies are disjoint from documentaries
a4: Movies have at most one director
a5: Award movies are those that have at least one award
a6: Italian hits are award movies whose director is Italian
The extensions of concept and role descriptions are given by the following equations. (#S denotes the cardinality of the set S).
Assume an interpretation with the identity mapping on individuals in the knowledge base and a few extra elements (Director1, Award1, Actor2, …).
The following interpretation is a model:
is subsumed by
is subsumed by
Uniform Resource Identifiers: available beyond a single data set.
You can assign IDs to blank nodes, but they are internal to a document.