Building Local Consensus Ontologies Via Autonomous Merging Using a Lexical Database

SUNGSHIN WOMEN’S UNIVERSITY

Overview of Ontology
• Problem definition
  • Identical views of the world: merging ontologies is trivial in this case.
  • Different views of the world: the ontologies are written in natural language, and merging them is a very challenging problem.
• In this thesis
  • The goal is to allow all computer users to develop their own ontologies and concepts, and to have a system that develops a global ontology, or a global point of view, from that user's perspective.
SUNGSHIN WOMEN’S UNIVERSITY, AI&M lab

Related Work
• Two types of integration
  • Concept-level integration
    • Requires inference about a domain ontology to decide whether to integrate a pair of classes.
    • Known to require some form of expert human intervention.
  • Syntactical-level integration
    • Defines rules in terms of class and attribute names.
    • Usually conceptually blind, and comparatively easier to implement.

<Related Work>Ontology Merging Tools
• Chimaera tool [Fikes, et al, 1999]
  • Based on the Ontolingua editor [Farquhar, et al, 1996]
  • Works with ontologies developed by different authors using different concepts.
  • Chimaera's algorithm generates a list of possible suggestions based on the actions performed by the user.
• Chimaera tool process
  • It first matches class names; if the names do not match, it looks for matches in prefixes, suffixes, and substrings to find the merging point.
  • The user can select one of these merging points to perform the merging operation, or proceed in their own way.
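The name-matching step described above can be illustrated with a minimal sketch. This is not Chimaera's actual algorithm; the function names, the `min_overlap` cutoff, and the ranking order are assumptions made only for illustration.

```python
def shared_affix_len(a: str, b: str) -> int:
    """Length of the longer of the common prefix and the common suffix."""
    def common_prefix(x: str, y: str) -> int:
        n = 0
        for cx, cy in zip(x, y):
            if cx != cy:
                break
            n += 1
        return n
    # A common suffix is a common prefix of the reversed strings.
    return max(common_prefix(a, b), common_prefix(a[::-1], b[::-1]))


def suggest_merge_points(classes_a, classes_b, min_overlap=4):
    """Return (name_a, name_b, reason) candidates, exact matches first.

    min_overlap is a hypothetical cutoff for prefix/suffix matches.
    """
    suggestions = []
    for a in classes_a:
        for b in classes_b:
            la, lb = a.lower(), b.lower()
            if la == lb:
                suggestions.append((a, b, "exact"))
            elif la in lb or lb in la:
                suggestions.append((a, b, "substring"))
            elif shared_affix_len(la, lb) >= min_overlap:
                suggestions.append((a, b, "affix"))
    order = {"exact": 0, "substring": 1, "affix": 2}
    # sorted() is stable, so candidates keep their discovery order within a group.
    return sorted(suggestions, key=lambda s: order[s[2]])
```

In a semi-automatic tool the resulting list would only be presented to the user, who decides which suggestion (if any) to accept.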

SMART system [Noy, et al, 1999]
• Deals with concepts similar to the Chimaera tool, but suggests some improvements over it.
• SMART's improvements are locating conflicts and suggesting actions a user should take to resolve them, i.e. conflict resolution strategies.
PROMPT [Noy, et al, 2000]
• Semi-automatic approach to ontology merging and alignment
• When the system runs into a conflict that it is unable to resolve (e.g. a naming conflict), interaction from the user is required in order to proceed.

Information Sciences Institute (ISI) [Chalupsky, et al, 1997]
• Attempted to build extremely large top-level ontologies
• The creation of the initial list of suggestions relies on more than just the class names.
• They score concepts whose names share a long common substring, and concepts whose documentation shares many uncommon words and which have a sufficiently high number of name similarities with nearby siblings, children, and parents.
• This mechanism does help remove uninteresting suggestions from the list.

All the above systems assume
• The user of the system is an ontology expert.
• The user wants to merge this ontology with one developed by another expert.
In this thesis
• Our goal is to allow all computer users to develop their own ontologies and concepts, and to have a system that develops a global ontology, or a global point of view, from that user's perspective.
• These users are not experts in the field and have no formal training in building or merging ontologies.

<Related Work>WordNet
• WordNet is a lexical database developed at Princeton University.
• English nouns, verbs, adjectives, and adverbs are organized into synonym sets, each representing one underlying lexical concept.
• Different relations link the synonym sets.
• It is a machine-readable lexical database that is efficient and useful to the user.

APPROACH
• Our approach involves the use of ontologies that are independently developed by different users.
• The system environment
  • Java
  • XML
    • Makes the system platform independent
  • WordNet
    • Used for semantic comparison

Ontology Creation and Learning
• The system allows the user to add new classes and relations to the existing ontology.

The five types of relations between any two classes in the ontology
• Is-Super-Class
• Is-Sub-Class
• Is-Part-of
• Contains
• Equivalent to
When the user saves an ontology, the system stores it as an XML file.

Representing Ontology
• An XML format is used to represent the concepts and the relations between them.
• The format basically consists of a list of nodes.
  • Each node has a name, which represents the concept of that node.
  • Each node contains information about all of its relations with other nodes in the ontology.
• This structure can be conceptually visualized as a tree.
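A node list of this kind might be serialized as in the sketch below. The slides do not give the actual schema, so the element and attribute names (`ontology`, `node`, `relation`, `type`, `target`) are hypothetical; only the five relation type names come from the slides. Python is used here for brevity, although the system itself is described as Java-based.

```python
import xml.etree.ElementTree as ET

# The five relation types named on the slides.
RELATION_TYPES = ("Is-Super-Class", "Is-Sub-Class", "Is-Part-of",
                  "Contains", "Equivalent to")


def ontology_to_xml(nodes):
    """Serialize {concept_name: [(relation_type, target_name), ...]} to XML text.

    Element and attribute names are illustrative assumptions, not the thesis schema.
    """
    root = ET.Element("ontology")
    for name, relations in nodes.items():
        node = ET.SubElement(root, "node", name=name)
        for rel_type, target in relations:
            if rel_type not in RELATION_TYPES:
                raise ValueError(f"unknown relation type: {rel_type}")
            ET.SubElement(node, "relation", type=rel_type, target=target)
    return ET.tostring(root, encoding="unicode")


def xml_to_ontology(xml_text):
    """Parse XML text produced by ontology_to_xml back into the dict form."""
    root = ET.fromstring(xml_text)
    return {
        node.get("name"): [(rel.get("type"), rel.get("target"))
                           for rel in node.findall("relation")]
        for node in root.findall("node")
    }
```

Keeping each node's relations inside the node element mirrors the description above: a flat list of named nodes, each carrying its own links to other nodes.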

Similarity Identification
• Similarity identification, as the name suggests, involves:
  • Identifying classes with the same name
  • Identifying classes that are syntactically the same (spelled similarly)
  • Identifying classes that are semantically the same (have the same meaning)

<Similarity Identification>Syntactic similarity
• Measured using the edit distance formulated by Levenshtein [Levenshtein, 1966], a well-established method for weighting the difference between two strings.
• It measures the minimum number of token insertions, deletions, and substitutions required to transform one string into another, and is computed with a dynamic programming algorithm.
• e.g. ed(“The_Dog”, “TheDog”) = 1, because one deletion operation changes the string “The_Dog” into “TheDog”.
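The dynamic programming computation can be sketched as follows. This is the standard character-level Levenshtein algorithm, shown in Python for brevity; the function name `edit_distance` is chosen here for illustration.

```python
def edit_distance(s: str, t: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    # prev[j] holds the distance between the first i-1 characters of s
    # and the first j characters of t.
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]  # transforming s[:i] into the empty string takes i deletions
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # delete cs
                            curr[j - 1] + 1,      # insert ct
                            prev[j - 1] + cost))  # substitute (or keep)
        prev = curr
    return prev[-1]
```

Only two rows of the table are kept at a time, so memory use is proportional to the length of the second string.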

<Similarity Identification>Syntactic similarity
• Based on Levenshtein's edit distance, we calculate the syntactic similarity measure [Maedche, et al, 2001] for two strings.
• Let Si and Sj be the two strings (class/concept names) that are being compared for syntactic similarity.
• The syntactic similarity measure (SSM) returns a degree of similarity between 0 and 1.
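The formula itself is not reproduced on this slide. In the cited work by Maedche et al. the measure is usually written as below; treat this as a reconstruction from that source rather than the slide's exact notation, with ed denoting the Levenshtein edit distance and |S| the length of string S.

```latex
\mathrm{SSM}(S_i, S_j) = \max\left(0,\ \frac{\min(|S_i|, |S_j|) - ed(S_i, S_j)}{\min(|S_i|, |S_j|)}\right) \in [0, 1]
```

For example, with Si = “The_Dog” and Sj = “TheDog”, the shorter string has length 6 and the edit distance is 1, so SSM = (6 - 1) / 6 ≈ 0.83.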

<Similarity Identification>Syntactic similarity
• Selecting an appropriate SSM threshold
  • The question is whether people, in practice, actually use names this different for the same concept.
  • People often commit spelling errors, most of which are at an edit distance of one from the correct version.
• In this thesis
  • We use an SSM threshold of 0.75.
  • Increasing this value makes it more difficult to find a syntactic match, while decreasing it increases the possibility of finding a match.
  • A value of 1 implies that both strings have to be exactly the same for a match to succeed, while a value of 0 implies that any two strings are always equivalent.
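A minimal sketch of applying the 0.75 threshold follows. It assumes the SSM definition from Maedche et al. (edit distance normalized by the length of the shorter string, clipped at 0); the function names and the handling of empty strings are illustrative choices, not taken from the thesis.

```python
def edit_distance(s: str, t: str) -> int:
    """Standard Levenshtein distance via dynamic programming."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def ssm(a: str, b: str) -> float:
    """Syntactic similarity in [0, 1]; formula assumed from Maedche et al."""
    shorter = min(len(a), len(b))
    if shorter == 0:
        # Assumption: two empty names are identical, otherwise no similarity.
        return 1.0 if a == b else 0.0
    return max(0.0, (shorter - edit_distance(a, b)) / shorter)


SSM_THRESHOLD = 0.75  # value used in the thesis


def is_syntactic_match(a: str, b: str, threshold: float = SSM_THRESHOLD) -> bool:
    """True when the two names are similar enough to be treated as the same class."""
    return ssm(a, b) >= threshold
```

Under this definition a single-character slip in a name of four or more characters still matches at 0.75, while unrelated short names such as "Dog" and "Cat" score 0.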

<Similarity Identification>Semantic Equivalence
• Determined by the agents using the WordNet dictionary
• We use the synsets provided by the dictionary to get the meaning of a word in various contexts.
• A synset is a set of synonyms identified by WordNet.
• When common users (non-experts) develop ontologies, they tend to use different class names to express the same concept.
• These differences are resolved by using WordNet.
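The synset lookup can be illustrated with a small self-contained sketch. The `TOY_SYNSETS` table below is a hand-written stand-in for WordNet, not real WordNet data or its API, and the equivalence rule (two names are equivalent if one appears in any synset of the other) is an assumption about how such a check might work.

```python
# Hypothetical miniature synonym table standing in for WordNet.
# Each word maps to a list of synsets, one per sense of the word.
TOY_SYNSETS = {
    "car": [
        {"car", "auto", "automobile", "motorcar"},
        {"car", "railcar", "railway car"},
    ],
    "automobile": [{"car", "auto", "automobile", "motorcar"}],
    "dog": [{"dog", "domestic dog"}],
}


def synsets(word: str):
    """Return all synsets (one per sense) for a word, or an empty list."""
    return TOY_SYNSETS.get(word.lower(), [])


def semantically_equivalent(a: str, b: str) -> bool:
    """True if the names are identical or share a synset in any sense."""
    a, b = a.lower(), b.lower()
    if a == b:
        return True
    return (any(b in sense for sense in synsets(a))
            or any(a in sense for sense in synsets(b)))
```

Checking every sense is deliberately permissive: a class name is accepted as a synonym if it matches the other name in any context, which suits users who do not state which sense they mean.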

<Similarity Identification>Semantic Equivalence

Flowchart of the process of Ontology Merging (figure spanning three slides, joined by connectors A and B)

Discovering New Concepts and Relationships
• Based on the knowledge gained while identifying similar concepts, the agent learns the new concepts and adds them to its own ontology base.
• It also looks for relationships that it can discover from the other ontology, and later adds those relations to its own knowledge.

Learning and Adding New Relations
• WordNet provides us with relationships
  • Hypernym/Hyponym (superordinate/subordinate term) = IS-A relationships
  • Holonym/Meronym (whole/part term) = HAS-A relationships
• The depth to which we search for relations can be adjusted to get better results.
• In this thesis we use a depth of three levels.
• If we increase the depth we will find more relationships, but we might lose out on the quality of these relationships.
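The effect of the depth limit can be shown with a small sketch. The `TOY_HYPERNYMS` table is a hand-written stand-in for WordNet's hypernym links, and the function names are hypothetical; only the default depth of three comes from the slides.

```python
# Hypothetical miniature IS-A table standing in for WordNet hypernym links.
TOY_HYPERNYMS = {
    "poodle": ["dog"],
    "dog": ["canine"],
    "canine": ["carnivore"],
    "carnivore": ["mammal"],
    "cat": ["feline"],
    "feline": ["carnivore"],
}


def hypernym_depth(word, ancestor, max_depth=3):
    """Number of IS-A steps from word up to ancestor, or None if beyond max_depth."""
    frontier = {word}
    for depth in range(1, max_depth + 1):
        # Move every word in the frontier one level up the hierarchy.
        frontier = {h for w in frontier for h in TOY_HYPERNYMS.get(w, [])}
        if ancestor in frontier:
            return depth
        if not frontier:
            return None
    return None


def discover_is_a(classes, max_depth=3):
    """Return sorted (sub_class, super_class) pairs found within max_depth levels."""
    return sorted(
        (c, a)
        for c in classes
        for a in classes
        if c != a and hypernym_depth(c, a, max_depth) is not None
    )
```

With the default depth, "poodle" is linked to "dog" but not to "mammal" (four levels up); raising `max_depth` finds that link too, at the cost of admitting more distant and therefore weaker relations.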

Architectural representation of our system

Class Diagram

Collaboration diagram of Ontology merging process

Evaluation
• We plan to evaluate whether the algorithm does approach a consensus as we merge more ontologies.
• We plan to evaluate the use of WordNet and demonstrate the improvement it provides in learning new concepts and relationships.
• We further plan to conduct tests to verify that the values suggested for the depth of the WordNet search and for the syntactic similarity measure (SSM) are indeed the most appropriate.
• We plan to evaluate the effect that the order in which the merges take place has on the final consensus ontology.
