
Logics for Data and Knowledge Representation


Presentation Transcript


  1. Logics for Data and Knowledge Representation: Application of (Ground) ClassL

  2. Outline • Ontologies • Lightweight Ontologies • Classifications • Optimization of Classifications • Document Classification in LOs • Query-answering in LOs • Semantic Matching

  3. Ontology • Ontologies are explicit specifications of conceptualizations. • They are often thought of as directed graphs whose nodes represent concepts and whose edges represent relations between concepts. [Figure: ontology graph rooted at Animal, with the nodes Bird, Mammal, Head, Body, Chicken, Predator, Herbivore, Cat, Tiger and Goat, and edges labeled Is-a, Part-of and Eats.]

  4. Concept • The notion of concept is understood as defined in Knowledge Representation, i.e., as a set of objects or individuals. • This set is called the concept extension or the concept interpretation. • Concepts are often lexically defined, i.e. they have natural language names which are used to describe the concept extensions.

  5. Relation • The notion of relation is understood as a set of ordered pairs, with the two items of the pair from the source concept and the target concept respectively. • The backbone structure of the ontology graph is a taxonomy in which the relations are ‘is-a’, ‘part-of’ and ‘instance-of’ whereas the remaining structure of the graph supplies auxiliary information about the modeled domain and may include relations like ‘located-in’, ‘sibling-of’, ‘ant’, etc.

  6. Ontology as a graph • Borrowing the mathematical definition of a graph, an ontology is an ordered pair O = <V, E>, in which V is the set of vertices representing the concepts and E is the set of edges representing the relations.

  7. Tree-like Ontologies • Take the Animal ontology shown earlier and remove the auxiliary relations… • … we get a tree-like ontology consisting of the backbone structure with ‘is-a’, ‘part-of’ and even ‘instance-of’ relations. • Such tree-like ontologies are informal Lightweight Ontologies. [Figure: the Animal ontology graph from slide 3.]
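
As a minimal sketch of slides 6 and 7, an ontology O = <V, E> can be held as a set of vertices plus a set of labeled edges, and the tree-like backbone can be obtained by filtering the edges. The class names `Ontology` and `Edge` are hypothetical, and the edge list is only an illustrative reading of the Animal figure; in particular, the endpoints of the ‘eats’ edge are assumed.

```python
from dataclasses import dataclass, field

BACKBONE = {"is-a", "part-of", "instance-of"}


@dataclass(frozen=True)
class Edge:
    source: str    # the more specific concept (or the part)
    relation: str
    target: str    # the more general concept (or the whole)


@dataclass
class Ontology:
    vertices: set = field(default_factory=set)   # V: concepts
    edges: set = field(default_factory=set)      # E: labeled relations

    def add(self, source, relation, target):
        self.vertices.update((source, target))
        self.edges.add(Edge(source, relation, target))

    def backbone(self):
        """Return a copy that keeps only the backbone (taxonomy) edges."""
        kept = {e for e in self.edges if e.relation in BACKBONE}
        return Ontology(set(self.vertices), kept)


animal = Ontology()
for child, parent in [("Bird", "Animal"), ("Mammal", "Animal"),
                      ("Chicken", "Bird"), ("Predator", "Mammal"),
                      ("Herbivore", "Mammal"), ("Cat", "Predator"),
                      ("Tiger", "Predator"), ("Goat", "Herbivore")]:
    animal.add(child, "is-a", parent)
animal.add("Head", "part-of", "Animal")
animal.add("Body", "part-of", "Animal")
animal.add("Tiger", "eats", "Goat")   # assumed endpoints, for illustration

tree_like = animal.backbone()         # auxiliary 'eats' edge is dropped
```

Filtering keeps every vertex and only removes auxiliary edges, which mirrors the slide: the concepts stay, the non-taxonomic relations go.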

  8. Descriptive vs. Classification Ontologies • Some ontologies are used to describe a piece of the world, such as the Gene ontology, Industry ontology, etc. The purpose is to give a clear description of the world. This is usually the first idea that comes to mind when people talk about ontologies. • Other ontologies are used to classify things, such as books, documents, web pages, etc. The aim is to provide domain-specific categories under which individuals are organized. Such ontologies usually take the form of classifications with or without explicit meaningful links. • We will see the difference further in the transformation into formal Lightweight Ontologies.

  9. Why ‘Lightweight’ Ontologies? Two observations: • The majority of existing ontologies are ‘simple’ taxonomies or classifications, i.e., categories to classify resources. • Ontologies with arbitrary relations do exist, but no intuitive reasoning techniques support such ontologies in general. … so we need ‘lightweight’ ontologies.

  10. Outline • Ontologies • Lightweight Ontologies • Classifications • Optimization of Classifications • Document Classification in LOs • Query-answering in LOs • Semantic Matching

  11. Lightweight Ontologies • A (formal) lightweight ontology is a triple O = <N, E, C>, where • N is a finite set of nodes, • E is a set of edges on N, such that <N, E> is a rooted tree, • and C is a finite set of concepts expressed in a formal language F, such that for any node ni ∈ N there is one and only one concept ci ∈ C, and, if ni is the parent node of nj, then cj ⊑ ci.
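
The three conditions of this definition can be checked mechanically. The sketch below is a simplification, not the formal language F itself: it assumes every concept is a conjunction of atomic propositions, stored as a `frozenset` of atoms, so that cj ⊑ ci holds exactly when cj contains every atom of ci. All function names are hypothetical.

```python
def is_rooted_tree(nodes, edges):
    """edges is a set of (parent, child) pairs over nodes."""
    parent_of = {}
    for parent, child in edges:
        if parent not in nodes or child not in nodes or child in parent_of:
            return False              # unknown node, or a second parent
        parent_of[child] = parent
    roots = [n for n in nodes if n not in parent_of]
    if len(roots) != 1:
        return False
    for node in nodes:                # every node must reach the root
        seen = set()
        while node in parent_of:
            if node in seen:
                return False          # cycle
            seen.add(node)
            node = parent_of[node]
    return True


def subsumed(c, d):
    """c ⊑ d for conjunctions of atoms: c must contain every atom of d."""
    return d <= c


def is_lightweight_ontology(nodes, edges, concept):
    """Check the conditions on the triple <N, E, C>."""
    if not is_rooted_tree(nodes, edges):
        return False
    if set(concept) != set(nodes):    # exactly one concept per node
        return False
    return all(subsumed(concept[child], concept[parent])
               for parent, child in edges)


nodes = {"Animal", "Bird", "Chicken"}
edges = {("Animal", "Bird"), ("Bird", "Chicken")}
good = {"Animal": frozenset({"animal"}),
        "Bird": frozenset({"animal", "bird"}),
        "Chicken": frozenset({"animal", "bird", "chicken"})}
bad = dict(good, Chicken=frozenset({"chicken"}))   # child not below parent
```

Here `is_lightweight_ontology(nodes, edges, good)` holds, while the `bad` assignment fails the subsumption condition on the edge from Bird to Chicken.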

  12. From Tree-like Ontologies to LOs [Figure: the tree-like Animal ontology (left) and the corresponding lightweight ontology (right), in which each Is-a edge is replaced by ⊑ and the Part-of edges are kept.]

  13. In Classification Semantics… [Figure: the tree-like Animal ontology (left) and the corresponding lightweight ontology under classification semantics (right), in which both the Is-a and the Part-of edges are replaced by ⊑.]

  14. From Tree-like Ontologies to LOs cont. • For a descriptive tree-like ontology, the backbone taxonomy of ‘is-a’ intuitively coincides with the ‘subsumption’ relation in LOs. But ‘part-of’ relations have to be modeled as a new kind of binary relation in order to preserve the semantics. • For a classification ontology, the semantics behind the labels of the nodes is the extensional interpretation, i.e. the documents (books, websites, etc.) that should be classified under the nodes. Therefore, the ‘part-of’ relation also follows the intuition of ‘subsumption’ and can be transformed directly into ‘⊑’ in the target LOs.

  15. Populated (Lightweight) Ontologies • In Information Retrieval, the term classification is seen as the process of arranging a set of objects (e.g., documents) into categories or classes. • A classification ontology is said to be populated if a set of objects has been classified under the ‘proper’ nodes. • Thus a populated (Lightweight) Ontology contains a new type of link: instance-of.

  16. Example of a Populated Ontology [Figure: the Animal lightweight ontology with ⊑ edges, populated through Instance-of links with the documents ‘Chicken Soup’, ‘How to Raise Chicken’, ‘Tom and Jerry’, ‘www.protectTiger.org’, …]

  17. Lightweight Ontologies in ClassL: TBox • Subsumption terminologies: ‘… C is a finite set of concepts expressed in a formal language F, such that for any node ni ∈ N there is one and only one concept ci ∈ C, and, if ni is the parent node of nj, then cj ⊑ ci.’ • Bird ⊑ Animal • Mammal ⊑ Animal • Chicken ⊑ Bird • Cat ⊑ Predator • … Observation: a tree-like ontology can be transformed into a lightweight ontology, but not vice versa.

  18. Populated LOs in ClassL: TBox+ABox • Subsumption terminologies: ‘… cj ⊑ ci.’ • ‘Instance of’ links: ‘concept assertion!’ • … • … • … • … • Chicken(ChickenSoup) • Cat(TomAndJerry) • …
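
A small sketch of how a populated tree could be written out as TBox and ABox statements, and how an instance check follows the ⊑ chain upwards. The dictionaries and function names are hypothetical; the axioms Bird ⊑ Animal, Mammal ⊑ Animal, Chicken ⊑ Bird and Cat ⊑ Predator and the two assertions come from the slides, while Predator ⊑ Mammal is read off the Animal figure.

```python
# child concept -> parent concept (one entry per edge of the tree)
parent = {
    "Bird": "Animal",
    "Mammal": "Animal",
    "Chicken": "Bird",
    "Predator": "Mammal",   # read off the figure
    "Cat": "Predator",
}

# individual -> concept it is classified under (the instance-of links)
instances = {
    "ChickenSoup": "Chicken",
    "TomAndJerry": "Cat",
}


def tbox(parent):
    """One subsumption axiom per tree edge."""
    return [f"{child} ⊑ {par}" for child, par in parent.items()]


def abox(instances):
    """One concept assertion per instance-of link."""
    return [f"{concept}({individual})"
            for individual, concept in instances.items()]


def is_instance_of(individual, concept, parent, instances):
    """Follow the ⊑ chain upwards from the asserted concept."""
    current = instances.get(individual)
    while current is not None:
        if current == concept:
            return True
        current = parent.get(current)
    return False


print(tbox(parent))
print(abox(instances))
```

With these inputs, `is_instance_of("ChickenSoup", "Animal", parent, instances)` holds because Chicken ⊑ Bird ⊑ Animal, whereas ‘Tom and Jerry’ is not an instance of Bird.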

  19. Outline • Ontologies • Lightweight Ontologies • Classifications • Optimization of Classifications • Document Classification in LOs • Query-answering in LOs • Semantic Matching

  20. Classifications… • Classification hierarchies are easy to use... ... for humans. • Classification hierarchies are pervasive (Google, Yahoo, Amazon, our PC directories, email folders, address book, etc.). • Classification hierarchies are widely used in industry (Google, Yahoo, eBay, Amazon, BBC, CNN, libraries, etc.). • Classification hierarchies have been studied for a very long time (e.g., the Dewey Decimal Classification system, DDC; the Library of Congress Classification system, LCC; etc.).

  21. Classification Example: Yahoo! Directory

  22. Classification Example: Email Folders

  23. Classification Example: E-Commerce Category

  24. Classifications… more • Classification hierarchies are lightweight (no roles, trees or simple DAGs, …). • Classification hierarchies are a kind of concept hierarchy. • Labels are natural language sentences; useful but hard to deal with in an automated way. • Links are of the kind “child-of” (e.g. “economy child-of Europe”), where in an ontology you would have {instance-of} links, roles, or {is-a} links. • There are no clear semantics for either the labels at nodes or the links. How to use such informal information?

  25. Recall: Lightweight Ontologies • A (formal) lightweight ontology is a triple O = <N, E, C>, where • N is a finite set of nodes, • E is a set of edges on N, such that <N, E> is a rooted tree, • and C is a finite set of concepts expressed in a formal language F, such that for any node ni ∈ N there is one and only one concept ci ∈ C, and, if ni is the parent node of nj, then cj ⊑ ci. [Slide annotations: ‘A classification already has’; ‘To be fixed’.]

  26. What do LOs Bring? • We know that a lightweight ontology is a formal conceptualization of a domain in terms of concepts and {is-a, instance-of} relationships. • Lightweight ontologies (LOs) add a formal semantics and {instance-of} relationships to classification hierarchies. • In short: LOs make classifications formal!

  27. LOs and Ground Class Logic • Ground ClassL provides a formal language (syntax + semantics) to model lightweight ontologies, where: • concepts are modeled by propositions and formulas; • ‘is-a’ relationship is modeled by subsumption (⊑) • and ‘is-instance-of’ relationship is modeled by individual assertion (i.e., wffs like P(a)).

  28. Label Semantics • Natural language words are often ambiguous. • E.g. Java (an island, a beverage, an OO programming language) • When used with other words in a label, improper senses can be pruned. • E.g., “Java Language”: only the 3rd sense of Java is preserved. [Figure: classification tree by level: level 0 Subjects (1); level 1 Computers and Internet (3); level 2 Programming (5); level 3 Java Language (7); level 4 Java Beans (8).]

  29. From NL Labels to Labels in Class Logic • There are several approaches to rewriting a natural language label into a ClassL proposition. • Following (Giunchiglia et al., 2007), we may distinguish four steps: • Tokenization (get distinct words): Italian Pictures → ‘Italian’, ‘Pictures’ • Word stemming (get the basic form): Pictures → picture • Rewrite each word into its proposition: picture → picture-noun-1 ⊔ picture-noun-2 ⊔ … ⊔ picture-verb-2 • Prune inconsistent senses: picture-noun-1 ⊔ picture-noun-2 ⊔ … ⊔ picture-verb-2 → picture-noun-1

  30. Class Logic Label Examples • E.g. 1: “Java” becomes the proposition Java#1 ⊔ Java#2 ⊔ Java#3, where Java#i is a propositional variable representing the i-th sense of the word “Java” according to a dictionary (e.g., WordNet). • E.g. 2: “Java Beans” becomes: (Java#1 ⊔ Java#2 ⊔ Java#3) ⊓ (Bean#1 ⊔ Bean#2)
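
The four steps and the two label examples can be sketched as a toy pipeline. Everything dictionary-related here is an assumption: `SENSES` is a hypothetical hand-written sense inventory standing in for a resource such as WordNet, its domain tags and the senses of “language” are invented, the stemmer only strips a plural “s”, and the pruning heuristic (keep the senses whose domain is shared with another word of the label) is an illustrative stand-in, not the method of the cited paper.

```python
# Hypothetical sense inventory: word -> list of (sense number, domain tag).
SENSES = {
    "java": [(1, "geography"), (2, "food"), (3, "computing")],
    "language": [(1, "linguistics"), (2, "computing")],
    "bean": [(1, "food"), (2, "computing")],
}


def tokenize(label):
    return label.lower().split()


def stem(word):
    # Crude stand-in for a real stemmer: drop a plural "s".
    if word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def prune(candidates):
    """Keep the senses of a word whose domain also occurs for another word."""
    pruned = []
    for i, (word, senses) in enumerate(candidates):
        other_domains = {domain
                         for j, (_, other) in enumerate(candidates) if j != i
                         for _, domain in other}
        kept = [(n, d) for n, d in senses if d in other_domains]
        pruned.append((word, kept or senses))   # never prune to nothing
    return pruned


def label_to_formula(label, do_prune=True):
    words = [stem(token) for token in tokenize(label)]
    candidates = [(w, SENSES.get(w, [(1, None)])) for w in words]
    if do_prune and len(candidates) > 1:
        candidates = prune(candidates)
    parts = []
    for word, senses in candidates:
        atoms = [f"{word.capitalize()}#{n}" for n, _ in senses]
        parts.append(" ⊔ ".join(atoms))         # a word is the OR of its senses
    if len(parts) == 1:
        return parts[0]
    # a label is the AND of its words; parenthesize multi-sense words
    return " ⊓ ".join(f"({p})" if " ⊔ " in p else p for p in parts)


print(label_to_formula("Java"))
print(label_to_formula("Java Beans", do_prune=False))
print(label_to_formula("Java Language"))
```

Under these assumptions the single-word label keeps all three senses, the unpruned “Java Beans” reproduces the formula of E.g. 2, and “Java Language” is pruned down to the third sense of Java.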

  31. Advantages of Propositions • NL labels are ambiguous, propositions are NOT! • Extensional semantics of propositions naturally maps nodes to real world objects. • Labels as propositions allow us to deal with the standard problems in classification (e.g., document classification, query-answering, and matching) by means of ClassL’s reasoning, mainly the SAT problem.

  32. Formalizing the Meaning of Links (1) • Child nodes in a classification are always considered in the context of their parent nodes. • Child nodes therefore specialize the meaning of the parent nodes. • Contextuality property of classifications.

  33. Formalizing the Meaning of Links (2) • General intersection relationship (a): can be used to represent facets. The meaning of node 2 is C = A ⊓ B. • Subsumption relationship (b): child nodes are a specific case of the parent nodes. The meaning of node 2 is B. [Figure: parent node 1 labeled A and child node 2 labeled B, with the two readings of the link between them: (a) A and B overlap in C; (b) B is contained in A.]

  34. General Intersection Example [Figure: labels l1 = “Subjects”, l3 = “Computers and Internet”, l5 = “Programming”; items shown: computer programming; scheduling, planning; hardware; software; networking; …]

  35. Concept at a Node • Parental contextuality is formalized in ClassL by the notion of “concept at a node.” • A concept Cr at the root node r is the class proposition (label) used to denote the node. • A concept Ci at a node ni is the conjunction of a proposition Pi (label of ni) and the concept Cj at node nj parent to ni (if it has any parents). In ClassL: Pi⊓ Cj.

  36. Concept at a Node • A concept at a node ni can be computed as the conjunction of all the labels from the root of the classification hierarchy to ni. • Concepts at nodes capture the classification semantics by using the meaning of labels (propositions defined by using WordNet and a linguistic analysis) and the nodes' position.

  37. Concept at a Node: Example [Figure: classification tree with nodes Europe (1), Pictures (2), Wine and Cheese (3), Italy (4) and Austria (5).] In ClassL: C4 = Ceurope ⊓ Cpictures ⊓ Citaly
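
A minimal sketch of the computation described on slides 35 to 37: the concept at a node is the conjunction of the label propositions on the path from the root to that node. The tree shape is an assumption read off the example (in particular, placing Austria under Pictures is a guess), and the label name `Cwine_and_cheese` is hypothetical.

```python
# node id -> parent node id (None for the root)
parent = {1: None, 2: 1, 3: 1, 4: 2, 5: 2}

# node id -> proposition for the node's label
label = {
    1: "Ceurope",
    2: "Cpictures",
    3: "Cwine_and_cheese",   # hypothetical name
    4: "Citaly",
    5: "Caustria",
}


def concept_at_node(node):
    """Conjunction of all labels from the root down to the given node."""
    path = []
    while node is not None:
        path.append(label[node])
        node = parent[node]
    return " ⊓ ".join(reversed(path))


print(concept_at_node(4))   # Ceurope ⊓ Cpictures ⊓ Citaly
```

Walking up the parent links and reversing the path is equivalent to the recursive definition Pi ⊓ Cj, since conjunction is associative.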

  38. What have we done? • Calculated the concepts at labels and the concepts at nodes. • In which format? ClassL, e.g. Java#1 ⊔ Java#2 ⊔ Java#3; Ceurope ⊓ Cpictures ⊓ Citaly; … • We have built the ClassL formulas for each node!

  39. Distinctions Among Ontology, LO and CLS [Figure: four related structures. An ontology with a backbone taxonomy (Is-a, Instance-of, Part-of) plus auxiliary relations such as Likes and Locate-in; the tree-like ontology that keeps only the backbone (descriptive and classification ontologies); a classification with Child-of links over nodes A to E, marked as the most common format; and the formal lightweight ontology obtained by formalization under classification semantics, with ⊑ edges and concepts at nodes A, A⊓B, A⊓C, A⊓B⊓D, A⊓B⊓E.]

  40. Outline • Ontologies • Lightweight Ontologies • Classifications • Optimization of Classifications • Document Classification in LOs • Query-answering in LOs • Semantic Matching

  41. Rational LOs • LOs may not be perfect… • Reconstruct an LO based on the “most specific subsumer” relation. • Nodes get the parents that describe them most specifically while still being more general. • The new structure is called a Rational LO (RLO). • NOTE: the classification semantics do not change. [Figure: a classification over EU, Schengen States, Italy, Germany, France and Pictures, before and after the restructuring.]

  42. Optimization of Classifications • Problem: to find ‘the most specific subsumer’ of a given node. • Suppose we have, for all nodes in the LO, the concepts at labels in ClassL, i.e. the wffs obtained after NLP. • Then we can use the ‘subsumption’ reasoning service to find the minimal subsumer with respect to the ordering ‘⊑’. • E.g.: Italy ⊑ EU, SchengenState ⊑ EU, Italy ⊑ SchengenState…
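
One way to picture the search for the most specific subsumer, assuming the subsumption facts between node concepts are already available as pairs (in practice they would come from the ClassL subsumption service). The function names are hypothetical, and the quadratic transitive closure is chosen for readability, not efficiency.

```python
def transitive_closure(axioms):
    """All pairs (a, b) with a ⊑ b derivable by chaining the given axioms."""
    closure = set(axioms)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def most_specific_subsumers(node, axioms):
    """Subsumers of node that have no other subsumer strictly below them."""
    closure = transitive_closure(axioms)
    subsumers = {b for a, b in closure if a == node and b != node}

    def strictly_below(t, s):
        return (t, s) in closure and (s, t) not in closure

    return {s for s in subsumers
            if not any(strictly_below(t, s) for t in subsumers)}


axioms = {("Italy", "EU"), ("SchengenState", "EU"), ("Italy", "SchengenState")}
print(most_specific_subsumers("Italy", axioms))          # {'SchengenState'}
print(most_specific_subsumers("SchengenState", axioms))  # {'EU'}
```

In the example, EU also subsumes Italy but is discarded because SchengenState sits strictly between the two, so Italy would be re-parented under SchengenState in the rational LO.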

  43. Outline • Ontologies • Lightweight Ontologies • Classifications • Optimization of Classifications • Document Classification in LOs • Query-answering in LOs • Semantic Matching

  44. Document Classification • Each document d in a classification is assigned a proposition Cd in ClassL. • Cd is called the document concept. • Cd is built from d in two steps: • keywords are retrieved from d by using standard text mining techniques; • keywords are converted into propositions by using the methodology discussed above.

  45. “Get specific” Rule • For any given document d and its concept Cd, we classify d in each node ni such that: • ⊨ Cd ⊑ Ci (i.e. the concept at node ni is more general than Cd); • and there is no node nj (j ≠ i) whose concept at node Cj is more specific than Ci and more general than Cd, i.e. such that ⊨ Cj ⊑ Ci and ⊨ Cd ⊑ Cj. [Slide annotation: subsumption reasoning of ClassL.]
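
Each test of the form ⊨ C ⊑ D in the rule is a propositional validity check: no truth assignment may make C true and D false. A brute-force sketch, assuming formulas are encoded as nested tuples (an encoding invented for this example) and that the number of propositional variables is small enough for a truth table:

```python
from itertools import product

# A formula is either an atom (a string) or a tuple:
# ("and", f, g), ("or", f, g) or ("not", f).


def atoms(formula):
    if isinstance(formula, str):
        return {formula}
    return set().union(*(atoms(sub) for sub in formula[1:]))


def evaluate(formula, valuation):
    if isinstance(formula, str):
        return valuation[formula]
    op = formula[0]
    if op == "not":
        return not evaluate(formula[1], valuation)
    if op == "and":
        return evaluate(formula[1], valuation) and evaluate(formula[2], valuation)
    if op == "or":
        return evaluate(formula[1], valuation) or evaluate(formula[2], valuation)
    raise ValueError(f"unknown operator: {op}")


def subsumed(c, d):
    """True if ⊨ c ⊑ d, i.e. no valuation makes c true and d false."""
    names = sorted(atoms(c) | atoms(d))
    for values in product([False, True], repeat=len(names)):
        valuation = dict(zip(names, values))
        if evaluate(c, valuation) and not evaluate(d, valuation):
            return False
    return True


java = ("or", "Java#1", ("or", "Java#2", "Java#3"))
cd = ("and", "Java#3", "Programming#2")
print(subsumed(cd, java))   # True: Cd is more specific than the label "Java"
print(subsumed(java, cd))   # False
```

The truth table grows exponentially with the number of atoms, so a real implementation would hand the same question (is C ⊓ ¬D unsatisfiable?) to a SAT solver.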

  46. Example • Suppose we need to classify “Professional Java, JDK-5th Edition” by W. Clay Richardson et al. • The document concept of this document d is: Cd = Java#3 ⊓ Programming#2. • Node 7 is the only node that conforms to the “get specific” rule. [Figure: classification tree by level: level 0 Subjects (1); level 1 Business and Investing (2), Computers and Internet (3); level 2 Small Business and Entrepreneurship (4), Programming (5); level 3 New Business Enterprises (6), Java Language (7); level 4 Java Beans (8).]

  47. Example (cont.) • Suppose we need to classify “Visual Basic.Net Programming for Business” by Philip A. Koneman. • The document concept of this document d is: Cd = VisualBasicNet#1 ⊓ Programming#2 ⊓ Business#1. • Nodes 2 and 5 conform to the “get specific” rule. [Figure: the same classification tree as on slide 46.]
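
The two examples can be replayed with a small sketch of the “get specific” rule. Several things here are assumptions made only so that the example runs: concepts are restricted to conjunctions of atoms (sets), the root label “Subjects” is treated as the always-true concept, each multi-word label is collapsed into one invented atom such as `ComputersAndInternet`, and the two background axioms linking keywords to those atoms are hypothetical. “More specific” is read as strictly more specific.

```python
def closure(atoms, axioms):
    """Saturate a set of atoms under atomic axioms (a, b) meaning a ⊑ b."""
    result = set(atoms)
    changed = True
    while changed:
        changed = False
        for a, b in axioms:
            if a in result and b not in result:
                result.add(b)
                changed = True
    return result


def subsumed(c, d, axioms):
    """axioms ⊨ c ⊑ d for conjunctions of atoms."""
    return set(d) <= closure(c, axioms)


def get_specific(cd, node_concepts, axioms):
    """Nodes that subsume cd and have no strictly more specific such node."""
    candidates = [n for n, c in node_concepts.items()
                  if subsumed(cd, c, axioms)]
    chosen = []
    for i in candidates:
        ci = node_concepts[i]
        blocked = any(j != i
                      and subsumed(node_concepts[j], ci, axioms)
                      and not subsumed(ci, node_concepts[j], axioms)
                      for j in candidates)
        if not blocked:
            chosen.append(i)
    return sorted(chosen)


# Hypothetical background axioms.
AXIOMS = [("Programming#2", "ComputersAndInternet"),
          ("Business#1", "BusinessAndInvesting")]

# Concepts at nodes 1..8 (atom names are illustrative).
NODES = {
    1: set(),                                              # Subjects
    2: {"BusinessAndInvesting"},
    3: {"ComputersAndInternet"},
    4: {"BusinessAndInvesting", "SmallBusiness"},
    5: {"ComputersAndInternet", "Programming#2"},
    6: {"BusinessAndInvesting", "SmallBusiness", "NewEnterprise"},
    7: {"ComputersAndInternet", "Programming#2", "Java#3"},
    8: {"ComputersAndInternet", "Programming#2", "Java#3", "Bean#2"},
}

java_book = {"Java#3", "Programming#2"}
vb_book = {"VisualBasicNet#1", "Programming#2", "Business#1"}
print(get_specific(java_book, NODES, AXIOMS))   # [7]
print(get_specific(vb_book, NODES, AXIOMS))     # [2, 5]
```

Under these assumptions the first book lands in node 7 only, and the second in nodes 2 and 5, matching the two slides; without the background axioms neither document concept would entail the upper-level labels.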

  48. What have we done so far? • Classified documents. • How? • With the “get specific” algorithm! • But how to implement the algorithm? With ClassL! We are reasoning with the ‘Concept Realization’ service of ClassL (with an empty ABox): ⊨ Cj ⊑ Ci and ⊨ Cd ⊑ Cj

  49. Outline • Ontologies • Lightweight Ontologies • Classifications • Optimization of Classifications • Document Classification in LOs • Query-answering in LOs • Semantic Matching

  50. Intuitive Query-answering • Query-answering on a hierarchy of documents, for a query q given as a set of keywords, is defined in two steps: • The ClassL proposition Cq is built from q by converting q’s keywords as described above. • The set of answers (retrieval set) to q is defined through a set of subsumption checking problems in Ground ClassL: Aq = {d ∈ document | T ⊨ Cd ⊑ Cq}.
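
The retrieval set Aq can be sketched directly from its definition. As a simplification, document and query concepts are assumed to be conjunctions of atoms (sets), and T is a list of atomic axioms (a, b) read as a ⊑ b. The two document concepts come from the earlier classification examples; the atom `ProgrammingLanguage#1` and the axioms in `T` are hypothetical.

```python
def closure(atoms, axioms):
    """Saturate a set of atoms under atomic axioms (a, b) meaning a ⊑ b."""
    result = set(atoms)
    changed = True
    while changed:
        changed = False
        for a, b in axioms:
            if a in result and b not in result:
                result.add(b)
                changed = True
    return result


def answers(cq, documents, axioms=()):
    """Aq = {d | T ⊨ Cd ⊑ Cq}: documents whose concept entails the query."""
    return sorted(title for title, cd in documents.items()
                  if set(cq) <= closure(cd, axioms))


DOCUMENTS = {
    "Professional Java, JDK-5th Edition":
        {"Java#3", "Programming#2"},
    "Visual Basic.Net Programming for Business":
        {"VisualBasicNet#1", "Programming#2", "Business#1"},
}

# Hypothetical background knowledge T.
T = [("Java#3", "ProgrammingLanguage#1"),
     ("VisualBasicNet#1", "ProgrammingLanguage#1")]

print(answers({"Java#3", "Programming#2"}, DOCUMENTS))
print(answers({"ProgrammingLanguage#1"}, DOCUMENTS, T))
```

The first query returns only the Java book. The second returns both books, but only because T is supplied; with an empty T neither document concept entails the query.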
