
Knowledge Representation and Semantic Capturing





  1. Knowledge Representation and Semantic Capturing Albena Strupchanska Linguistic Modelling Department, Institute for Parallel Processing, Bulgarian Academy of Sciences albena@lml.bas.bg

  2. A few words about me • Programmer at LMD, 2001–2003 • Research Associate at LMD since 2003 • Research interests: knowledge representation (CGs and LFs in NLU; ontologies, the Semantic Web), information extraction, e-learning, question answering

  3. Knowledge Representation: Conceptual Graphs • Realization of CG operations (generalization, specialization, projection and join) • Integration of CG operations in CGWorld • Use of those operations in several system prototypes (simple question answering, eLearning)
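The projection operation mentioned above can be sketched in Python. This is a minimal illustration, not CGWorld's implementation: the type hierarchy, relation names and graphs below are invented for the example.

```python
# Minimal sketch of CG projection: a query graph projects onto a KB
# graph if every query triple maps to a KB triple whose concepts are
# equal to or specializations of the query concepts.
# (Toy hierarchy and graphs -- illustrative names only.)

HIERARCHY = {            # child -> parent
    "Bond": "FinancialInstrument",
    "Buy": "Transaction",
}

def is_a(child, ancestor):
    """True if `child` equals `ancestor` or specializes it."""
    while child is not None:
        if child == ancestor:
            return True
        child = HIERARCHY.get(child)
    return False

def projects(query, graph):
    """Naive projection check: every (subj, rel, obj) triple of the
    query must match some KB triple with equal or more specific
    concepts and the same relation."""
    return all(
        any(rel == r and is_a(s, subj) and is_a(o, obj)
            for s, r, o in graph)
        for subj, rel, obj in query
    )

kb = [("Bond", "obj", "Buy"), ("Buy", "agnt", "Investor")]
query = [("FinancialInstrument", "obj", "Transaction")]
print(projects(query, kb))  # True: Bond/Buy specialize the query concepts
```

A real projection also maps coreferent nodes consistently; the sketch only checks triple-by-triple matchability.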

  4. Knowledge Acquisition from Text General approach used in a few prototypes that process text in controlled English (restricted domains) • Lexical analysis, named entity recognition and part-of-speech tagging - GATE • Syntactic analysis - parser developed by Milena Yankova • Result: translation of the text into Logical Forms (LFs) and other similar formalisms, e.g. Conceptual Graphs

  5. Knowledge-based approaches • Resources used: • type hierarchy • domain knowledge • Attempts to • treat negation (prototype developed) • recognize scenarios (FRET system)

  6. “Naive” Negation Processing • Sentence/Query -> LF -> CG • The question “Who does not buy bonds?” is translated to: ¬(all(X, bond(X) & buy(Y) & (Y,agnt,Univ) & (Y,obj,X))) • the negation scope is set to the whole sentence

  7. “Naive” Negation Processing • construct all possible LFs with localization of the negated phrases: • (2.1) exists(X, ¬bond(X) & buy(Y) & (Y,agnt,Univ) & (Y,obj,X)) • (2.2) exists(X, bond(X) & ¬buy(Y) & (Y,agnt,Univ) & (Y,obj,X)) • (2.3) exists(X, ¬bond(X) & ¬buy(Y) & (Y,agnt,Univ) & (Y,obj,X)) • (2.1) Who does buy financial instruments different from bonds? • (2.2) Who is doing other actions with bonds except buying them? • (2.3) Who is doing other actions except buying, with something different from bonds?
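The enumeration of readings (2.1)–(2.3) can be sketched as a small Python generator; the LF string format below is a simplification of the slide's notation (theta relations are left out for brevity):

```python
from itertools import combinations

# Sketch: generate all LFs with negation localized to every
# non-empty subset of the content predicates.
# The textual LF format here is illustrative, not the system's own.

def localized_negations(predicates):
    """Yield an LF string for each non-empty subset of negated predicates."""
    for r in range(1, len(predicates) + 1):
        for negated in combinations(predicates, r):
            body = " & ".join(
                ("~" if p in negated else "") + p for p in predicates)
            yield f"exists(X, {body})"

for lf in localized_negations(["bond(X)", "buy(Y)"]):
    print(lf)
# exists(X, ~bond(X) & buy(Y))
# exists(X, bond(X) & ~buy(Y))
# exists(X, ~bond(X) & ~buy(Y))
```

With the two content predicates of the example this yields exactly the three readings (2.1), (2.2) and (2.3).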

  8. “Naive” Negation Processing • Every negated concept is replaced by its hierarchical environment: • every concept corresponding to a verb is replaced by its “antonym or complementary events”; • every object is replaced by the so-called restricted universally quantified concepts: S(nc) = (Sib(nc) ∪ Son(Sib(nc))) \ Son(nc), where nc is the negated concept • Projection of the query onto the KB of CGs => retrieval of answers
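Assuming the environment formula reads S(nc) = (Sib(nc) ∪ Son(Sib(nc))) \ Son(nc), i.e. the siblings of the negated concept plus their sons, minus the sons of the concept itself, a toy Python sketch (with an invented hierarchy) might look like this:

```python
# Sketch of the restricted "hierarchical environment" S(nc).
# The hierarchy and concept names are invented for illustration.

HIERARCHY = {            # parent -> children
    "Action": ["buy", "sell", "hold"],
    "buy": ["purchase"],
    "sell": ["auction"],
}
PARENT = {c: p for p, cs in HIERARCHY.items() for c in cs}

def sons(c):
    return set(HIERARCHY.get(c, []))

def siblings(nc):
    return {c for c in HIERARCHY.get(PARENT.get(nc), []) if c != nc}

def environment(nc):
    """S(nc) = (Sib(nc) | Son(Sib(nc))) - Son(nc)"""
    sib = siblings(nc)
    son_of_sib = set().union(*map(sons, sib)) if sib else set()
    return (sib | son_of_sib) - sons(nc)

print(sorted(environment("buy")))  # ['auction', 'hold', 'sell']
```

For the negated concept "buy" the environment contains its siblings "sell" and "hold" and the sibling's son "auction", but not buy's own son "purchase".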

  9. FRET - Football Reports Extraction of Templates • Semantically driven approach for scenario recognition and template filling • deep understanding only in “certain scenario-relevant points”, by elaborating inference mechanisms • LF representation for effective inference • Text: football reports with a specific paragraph structure (tickers for each minute)

  10. FRET’s Architecture • Text -> Text Preprocessor -> Logical Form Translator -> Direct Matching • Direct Matching: yes -> Filling Templates; no -> Inference Matching • Inference Matching: yes -> Filling Templates; no -> STOP • The Templates Filler stores its results in the KB of filled template forms • The Resource Bank supports the processing modules

  11. FRET - Resource Bank • Lexicon • Grammar rules • Rules for translation into logical form • Graphs of events • description of the domain events (nodes) and the relations (arcs) between them • Template descriptions (uninstantiated LFs)

  12. FRET - Graph of Events • Three types of events (nodes in a directed graph): • Main event - LF description of the obligatory and optional fields of the template and the relations between them • Base events - LFs of the most important self-dependent events in the chosen domain • Sub-events - kinds of base events that are immediately connected to the main event (i.e. there exists an arc between the nodes of the main event and the sub-event)

  13. FRET - Graph of Events • Four types of relations (an arc with an associated weight in the graph): • Event E2 invalidates event E1, i.e. E2 happens after E1 and annuls it • Event E1 entails event E2, i.e. when E1 happens, E2 always happens at the same time • Event E1 enables event E2, i.e. E1 happens before the beginning of E2 and is a precondition for E2 • Event E2 is a part of event E1
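A minimal sketch of such a graph of events in Python, with the four relation types as labelled, weighted arcs. The event names and weights are invented for illustration; FRET's actual node contents are LF descriptions:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the graph of events: nodes are event names
# (standing in for LF patterns), arcs carry one of the four relation
# types and an associated weight.

@dataclass
class EventGraph:
    arcs: list = field(default_factory=list)  # (src, relation, dst, weight)

    RELATIONS = {"invalidates", "entails", "enables", "part_of"}

    def add(self, src, relation, dst, weight=1.0):
        assert relation in self.RELATIONS, f"unknown relation: {relation}"
        self.arcs.append((src, relation, dst, weight))

    def related(self, src, relation):
        """Events reached from `src` via arcs of the given relation."""
        return [d for s, r, d, _ in self.arcs if s == src and r == relation]

g = EventGraph()
g.add("shot_hits_net", "entails", "player_scores")
g.add("player_shoots", "enables", "shot_hits_net")
g.add("shot_saved", "invalidates", "player_scores")
print(g.related("shot_hits_net", "entails"))  # ['player_scores']
```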

  14. FRET - Graph of Events (example) • Main Event: Player scores. • LF (obligatory): time(Minute) & Score(A) & theta(A,agnt,Player) • LF (optional): Action1(C) & ball(D) & theta(C,agnt,Player) & theta(C,obj,D) & Location(E) & theta(C,Loc,E) & Action2(F) & theta(F,agnt,Assistant) & theta(F,obj,D) & theta(F,to,Player) • Sub Event (entails the main event): Player’s shot hits the net. • LF: time(Minute) & Action1(A1) & theta(A1,agnt,B) & shot(B) & theta(B,poss,Player) & theta(A1,obj,G) & Net(G) • Base Events (connected to the sub-event by enables and is-a-part-of arcs): • Player shoots the ball. LF: time(Minute) & Action2(A2) & theta(A2,agnt,Player) & theta(A2,obj,D) & ball(D) • The ball is in the net. LF: time(Minute) & ball(D) & theta(D,into,G) & Net(G)

  15. FRET - Identification of Negation • Explicit negation • short sentences containing “No” • a complete sentence containing “Not/Non/No” • In both cases a marker NEG is attached to the LF of the previous sentence or of the succeeding part of the sentence • Implicit negation • sentences with “but”, “however”, “although” • Markers: ‘BAHpos’ and ‘BAHneg’ • Markers are inserted during the parsing process

  16. FRET - Negation • Sentence: 79 mins: Henry fires at goal, but misses from a tight angle. • Logical forms: time(79) & fire(A) & (A,agnt,‘Henry’) & (A,at,B) & goal(B) & marker(‘BAHpos’,7). time(79) & miss(A) & (A,agnt,‘Henry’) & (A,from,B) & angle(B) & (B,char,C) & tight(C) & marker(‘BAHneg’,7).

  17. FRET - Treatment of Negation Interpretation of marked LFs • NEG • the matching result is ignored • BAHpos or BAHneg • there are two possible interpretations: negation, or a conjunction of independent statements • the algorithm checks whether the dual LFs marked with these markers can be matched to events connected by an invalidate relation in the graph • if this succeeds, the previous matching is ignored
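The invalidate check for BAH-marked LF pairs can be sketched as follows; the event names and the invalidates table are hypothetical, standing in for matches against the graph of events:

```python
# Sketch: deciding whether a BAHpos/BAHneg pair expresses negation.
# If the two marked LFs match events linked by an "invalidates" arc,
# the earlier match is discarded; otherwise the two clauses are read
# as independent statements. (Illustrative event names only.)

INVALIDATES = {("miss", "score")}   # (E2, E1): event E2 invalidates E1

def interpret_bah(pos_event, neg_event):
    """Return 'negation' if neg_event invalidates pos_event,
    else 'conjunction' (independent statements)."""
    if (neg_event, pos_event) in INVALIDATES:
        return "negation"           # ignore the previous matching
    return "conjunction"

print(interpret_bah("score", "miss"))   # negation
print(interpret_bah("score", "pass"))   # conjunction
```

In the slide's example ("fires at goal, but misses"), the miss invalidates the scoring reading, so the pair is interpreted as a negation.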

  18. FRET - Templates Filling • The templates filler performs two main steps: • Matching LFs • based on a modification of the unification algorithm • Filling templates • The templates filler processes those LFs which are produced from the so-called extended paragraph. Thus each paragraph is treated separately.

  19. FRET - Matching Algorithm • Direct matching • each LF from the extended paragraph is matched to the main event • Inference matching • uses inference rules and the knowledge base • the FRET inference-matching algorithm derives an inference: base event LFs => sub-event LFs => main event LF • if necessary information about some sub-event is missing => consider the type of relation between this sub-event and the main event => either recognize the main event or not
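The base-event => sub-event => main-event inference chain can be sketched in Python; the event names and derivation tables below are illustrative, not FRET's actual rules:

```python
# Sketch of inference matching: sets of matched base events derive
# sub-events, and sub-events that entail a main event recognize it.
# (Hypothetical tables standing in for the LF-level inference rules.)

SUB_FROM_BASE = {               # frozenset of base events -> sub-event
    frozenset({"player_shoots", "ball_in_net"}): "shot_hits_net",
}
MAIN_FROM_SUB = {               # sub-event -> main event (entails arcs)
    "shot_hits_net": "player_scores",
}

def infer_main(matched_base_events):
    """Chain base -> sub -> main; return the recognized main events."""
    mains = set()
    for bases, sub in SUB_FROM_BASE.items():
        if bases <= matched_base_events and sub in MAIN_FROM_SUB:
            mains.add(MAIN_FROM_SUB[sub])
    return mains

print(infer_main({"player_shoots", "ball_in_net"}))  # {'player_scores'}
print(infer_main({"player_shoots"}))                 # set()
```

With only one of the two base events matched, no sub-event and hence no main event is derived, mirroring the slide's point that missing sub-event information forces a decision based on the relation type.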

  20. Advantages and disadvantages • + Logical forms: a convenient formalism for making inferences • + Knowledge representation as a graph of events • + Partial parsing (better to understand less than nothing) • - Creation of the graph of events (nodes represented as LFs) and of the templates (represented as LFs) is laborious • - Narrow and restricted domains (not scalable)

  21. Conclusion • Knowledge-based approaches are successful when they are applied to specific domains • The choice of a domain representation formalism is crucial for semantic capturing • Domain modelling is difficult and time-consuming • Much effort goes into semantic capturing even of simple cases; when those cases are the right ones, the goal probably justifies the means

  22. Thank you! Any questions?
