
Michel Généreux - Austrian Research Institute for Artificial Intelligence



Presentation Transcript


  1. Un analyseur sémantique pour les langues naturelles basé sur des exemples (An Example-Based Semantic Parser for Natural Language) Michel Généreux - Austrian Research Institute for Artificial Intelligence TALN 2002-Michel Généreux

  2. Introduction - Motivations • Build a robust, portable (in practice, only a set of new annotated examples is needed), wide-coverage parser that deals well with real data (hesitations, recognition errors, idiomatic expressions, ...) • In the current context of the large quantity of information accessible on the Internet and of developments in speech recognition, Natural Language Interfaces (NLI) have become very attractive for easily accessing information and/or speaking to a computer (e.g. multimodality) • Improve upon other corpus-based semantic parsers: • Provides an open and flexible model: the statistical model can be adapted to different types of training corpora • Gives context a crucial role in the parsing process • Abstracts over topics for changing domains • Not rule-based (no burden of creating hand-crafted rules; better portability)

  3. System architecture

  4. Introduction - System characteristics • An empirical method for semantic parsing of natural language: learning to parse a new sentence by looking at previous examples • A shift-reduce parsing paradigm in which the operators are based on domain-specific semantic concepts, obtained from a lexicon • A statistically trained model "specializes" the parser by guiding the runtime beam-like search through possible parses • At the parsing stage, decisions are made on the basis of three criteria: the similarities between the contexts in which the action took place, the similarities between the final meaning representations and, finally, the sheer number of occurrences of those actions and final representations • Finally, a module is provided so that training and parsing can also be done on a changing domain such as newspaper browsing

  5. Introduction - What can a semantic parser learn from a training corpus of examples? • Training([Maria,suchst,einen,großen,Hund],suchen(Maria,groß(Hund))). • It can learn that the predicate groß/1 combines best as an argument of suchen/2, and not the other way round. • Training([Einen,großen,Hund,suchst,Maria],suchen(Maria,groß(Hund))). • It can learn that word order may not really matter. • Training([Na,ja,einen,großen,Hund,um,suchst,Maria],suchen(Maria,groß(Hund))). • It can learn that some words or hesitations don't matter. • Training([Paul,kicked,the,bucket],die(Paul)); • Training([Paul,kicked,the,ball],kick(Paul,ball)). • Using contextual information, it can learn how to disambiguate meanings. • Training([Big,deal,that,Maria,is,looking,for,a,big,dog],search(Maria,big(dog))). • It can learn that the same word may or may not participate in the meaning. • The corpus may give the parser direct examples of how to conduct its actions, and the contextual information found in those examples may give the parser additional clues to interpret word order and disambiguate meanings.
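
The training pairs above can be pictured as plain data. A minimal Python sketch (the representation, the case-folding, and the set-based matching are assumptions for this illustration, not the system's actual mechanism):

```python
# Hypothetical sketch: training examples as (utterance, meaning) pairs,
# with nested tuples standing in for the Prolog-style logical terms.
TRAIN = [
    (["Maria", "suchst", "einen", "großen", "Hund"],
     ("suchen", "Maria", ("groß", "Hund"))),
    (["Einen", "großen", "Hund", "suchst", "Maria"],
     ("suchen", "Maria", ("groß", "Hund"))),
    (["Paul", "kicked", "the", "bucket"], ("die", "Paul")),
    (["Paul", "kicked", "the", "ball"], ("kick", "Paul", "ball")),
]

def meanings_for(utterance):
    """Meanings of all training utterances with the same word set:
    treating utterances as (case-folded) sets makes the first two
    examples, which differ only in word order, equivalent."""
    bag = {w.lower() for w in utterance}
    return [m for words, m in TRAIN
            if {w.lower() for w in words} == bag]
```

Comparing `kicked the bucket` with `kicked the ball` still yields two different meanings, since the surrounding words differ, which mirrors the contextual disambiguation point above.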

  6. Introduction - How will the parser use that information to parse new sentences? • When parsing a new sentence, each step (action) is compared (including its context) to those generated during the training phase; • The higher the similarity and the frequency, the higher the action ranks; • A full parse is ranked according to the individual ranking of its actions and the ranking of its final state; • A final choice is made among a set of full parses (limited by the search beam)
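
One way to picture this ranking in Python (the Jaccard similarity and the product combination are illustrative assumptions; the slides do not fix a specific measure):

```python
# Illustrative ranking sketch: a candidate action is scored against the
# contexts recorded for it during training, and a full parse combines
# its action scores with the final-state score.

def context_similarity(ctx_a, ctx_b):
    """Jaccard overlap between two word contexts (one plausible
    similarity measure; an assumption for this sketch)."""
    a, b = set(ctx_a), set(ctx_b)
    return len(a & b) / len(a | b) if a | b else 1.0

def rank_action(new_ctx, recorded):
    """`recorded`: (context, frequency) pairs for one action from the
    training phase. Higher similarity and frequency rank higher."""
    return max(context_similarity(new_ctx, ctx) * freq
               for ctx, freq in recorded)

def rank_parse(action_ranks, final_state_rank):
    """A full parse is ranked from its individual action ranks and
    the ranking of its final state (here: a simple product)."""
    score = final_state_rank
    for r in action_ranks:
        score *= r
    return score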

  7. "Training" phase - Data

  8. "Training" - Data • Origin of the training and test data: mainly from Wizard of Oz experiments, complemented by some artificial examples, in the domain of newspaper searching and browsing • Examples of annotation: • Command: • training([Zurück,bitte],zurück). • Search: • training([Ich,suche,jetzt,etwas,über,topic(1)],suchen([topic(1)],zeitung(_),zeit(_))). • Command + search: • training([Zurück,zu,den,Begriffen,topic(1),und,Politik],section_zurück(Politik,[topic(1)])).

  9. "Training" - Data examples • Written corpus: • "Artikel über das Wiener Neujahrskonzert suche ich." • "Artikel über das Neujahrskonzert als festen Bestandteil des kulturellen Lebens suche ich." • Spoken corpus: • "ich möchte jetzt eine neue Suche beginnen" • "die Kosovo-Krise un und Bill Clinton" • "Bill Clinton hält eine Ansprache vor der Öffentlichkeit" • Artificial data: • Bitte geben Sie mir einen Text zum Thema Kosovo-Krise. • Aber bitte nur in der Zeitung Salzburger Nachrichten von vor einer Woche. • Salzburger Nachrichten und Kosovo-Krise suche ich jetzt.

  10. "Training" The training phase uses an overly general parser, which produces all possible paths of actions (limited only by the training beam) to be taken in order to get from each training example utterance to its semantic representation. In the process, it records successful actions (called op), as well as the different final states. For each of them, uniquely defined, it assigns a frequency* measure, defined as follows: Frequency** = Occurrences_of_an_action_in_a_specific_context / Total_number_of_occurrences_of_this_action * = the measure is similar for final states ** = a particular action is an action WITH arguments
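
The frequency measure above can be sketched directly; the `(action, context)` record shape is an assumption for this illustration:

```python
# Sketch of the frequency measure: occurrences of an action (WITH its
# arguments) in a specific context, over all occurrences of that action.
from collections import Counter

def frequencies(recorded):
    """`recorded`: (action, context) pairs collected by the overly
    general parser; contexts are tuples so they can be counted.
    Returns {(action, context): frequency}."""
    per_pair = Counter(recorded)
    per_action = Counter(action for action, _ in recorded)
    return {(act, ctx): n / per_action[act]
            for (act, ctx), n in per_pair.items()}
```

For example, an action seen twice in one context and once in another gets frequencies 2/3 and 1/3 respectively.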

  11. "Training" - The statistics file • The overlyGeneralParser parses the training file to generate the statistics file. Every step needed to go from the topicalized phrase to the meaning is recorded, as well as the final states themselves. Final states are simply the states of the parse stack at the end of the parse. Each of them (actions and final states) is assigned a frequency measure as described previously. Each line has one of the following formats (recall that op is a container for any action): op(ACTION#PARSE_STACK#INPUT_STRING#FREQUENCY). final(FINAL_STATE#FREQUENCY). with • The input string • [Ich,suche,einen,Artikel,über,Bush] • The parse stack • [concept1:[context1],concept2:[context2], ...] • [suchen([],zeitung(_),zeit(_)):[suche,einen,Artikel],start:[ich]] • Here are two examples: op(sHIFT(ich)#[start:[]]#[ich,suche,einen,Text,for,topic(1),topic(2),bitte,bearbeiten,Sie,meinen,Suchauftrag]#0.3333). final([bestätigen(neue_suche):[bearbeiten,Sie,meinen,Suchauftrag],start:[ich,möchte,jetzt,eine]]#0.2). • These lines are used by the specializedParser to compute the best parse.
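
A minimal reader for this '#'-separated record format could look as follows (an illustrative sketch: it only splits a line into its kind and raw fields, leaving the Prolog terms inside each field as uninterpreted strings):

```python
# Illustrative reader for the statistics-file line formats
# op(ACTION#PARSE_STACK#INPUT_STRING#FREQUENCY). and
# final(FINAL_STATE#FREQUENCY).

def parse_stat_line(line):
    """Split 'op(A#B#C#D).' or 'final(A#B).' into (kind, fields)."""
    line = line.strip()
    kind, body = line.split("(", 1)
    if kind not in ("op", "final") or not body.endswith(")."):
        raise ValueError("unrecognized statistics line: " + line)
    return kind, body[:-2].split("#")
```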

  12. The parsing phase

  13. Parsing - Topic extraction

  14. Statistical semantic parsing • Syntactic parsing (PCFG): P = max_t P(t,s|G): we try to find the parse P with the highest probability, given a grammar G, where t is a parse tree, s a sentence, and where each grammar rule is assigned a probability according to its frequency in a corpus. P(S) = 0.6*0.3*0.4*0.7*0.2*0.3*0.2*0.1 (= 0.0006048) • Semantic parsing: P = max_f P(f,s|L): we try to find the parse P with the highest probability, given a first-order logical language L, where f is a formula, s a sentence, and where each formula is assigned a probability according to its frequency in a corpus. • Probability(kick(Paul,ball)) = P(iNTRODUCE(Paul)) * P(iNTRODUCE(kick(_,_))) * P(dROP(Paul,kick(_,_))) * P(sHIFT(the)) * P(iNTRODUCE(ball)) * P(dROP(ball,kick(Paul,_))) * P(kick(Paul,ball)) • The actual parsing of the input phrase is done by a specializedParser. It is specialized in the sense that it uses a statistical model to process all the information available from the training phase in order to get the best possible parse (the one with the highest probability). • The search space: the search beam parameter
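
Both probability computations above are plain products of factors, and the parser's choice is an argmax over candidate parses. A minimal sketch (the candidate names and factor values below are made up for illustration):

```python
# P(parse) = product of the probability of each action taken, times the
# probability of the final state; the best parse maximizes that product.
from math import prod

def parse_probability(factors):
    """Multiply the action and final-state probabilities of one parse."""
    return prod(factors)

def best_parse(candidates):
    """`candidates`: {parse: factor list}; returns the parse with the
    highest probability (max over f of P(f,s|L))."""
    return max(candidates, key=lambda k: prod(candidates[k]))
```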

  15. Statistical parsing: an overview

  16. Parsing - Adapting the model to different types of corpora • Small training set • threshold • training beam • Small lexicon (narrow domain) • Pop • Pfinal

  17. Elements of the parser • Actions: • sHIFT(word_to_be_shifted): puts the first word from the input string at the end of the context of the concept on top of the parse stack • iNTRODUCE(concept_to_be_introduced): takes a concept from the semantic lexicon and puts it on top of the parse stack • dROP(source_term,target_term): attempts to place a term from the parse stack as an argument of another term of the parse stack • The semantic lexicon: • lexicon(CONCEPT,[TRIGGERING_PHRASE]). • lexicon(topic(1),[topic(1)]). • lexicon(suchen([],zeitung(_),zeit(_)),[suche]).
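
The three actions can be sketched on simple data structures (the parse stack as a list of (concept, context) pairs with the top at index 0, and concepts as nested Python lists standing in for Prolog terms; dROP is simplified to "fill a given argument slot" — all of this is illustrative, not the system's actual implementation):

```python
LEXICON = {  # lexicon(CONCEPT, [TRIGGERING_PHRASE]).
    "suche": ["suchen", [], "zeitung(_)", "zeit(_)"],
    "topic(1)": ["topic(1)"],
}

def shift(stack, words):
    """sHIFT: move the first input word to the end of the context of
    the concept on top of the parse stack."""
    concept, ctx = stack[0]
    stack[0] = (concept, ctx + [words.pop(0)])

def introduce(stack, words):
    """iNTRODUCE: push the concept triggered by the first input word
    onto the parse stack (copied so the lexicon entry is not mutated)."""
    word = words.pop(0)
    stack.insert(0, (list(LEXICON[word]), [word]))

def drop(stack, slot):
    """dROP: place the top stack term as argument `slot` of the term
    below it."""
    src_concept, _ = stack.pop(0)
    stack[0][0][slot] = src_concept
```

Running shift, introduce, introduce, drop on the input `[ich, suche, topic(1)]` yields a top-of-stack concept `["suchen", ["topic(1)"], "zeitung(_)", "zeit(_)"]`, mirroring the worked example on the next slides.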

  18. Parsing - A variant of the SR parser: the Shift-Introduce-Drop parser • We are now ready to present the variant of the shift-reduce parser we are using. The algorithm introduced previously must be modified as follows: 1. Try to introduce a new concept or shift a word.* 2. If possible, make one dROP action.* 3. If there are more words in the input string, go back to Step 1. Otherwise stop. • * The backtracking mechanism ensures that ALL possible actions will be executed.
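
The three steps above, with backtracking, can be sketched as a generator that enumerates every possible parse (concepts are opaque labels and a dROP just pairs two concepts here; a simplified illustration, not the system's parser):

```python
# At each step: try introducing (when the word is in the lexicon) or
# shifting (step 1), optionally follow each choice with one dROP
# (step 2), and recurse until the input string is empty (step 3).

def parses(stack, words, lexicon):
    if not words:                      # step 3: no more words -> stop
        yield stack
        return
    word, rest = words[0], words[1:]
    choices = []
    if word in lexicon:                # step 1: introduce a concept...
        choices.append([(lexicon[word], [word])] + stack)
    top, ctx = stack[0]                # ...or shift the word
    choices.append([(top, ctx + [word])] + stack[1:])
    for chosen in choices:
        yield from parses(chosen, rest, lexicon)
        if len(chosen) >= 2:           # step 2: optionally one dROP
            (src, _), (dst, dst_ctx) = chosen[0], chosen[1]
            dropped = [((dst, src), dst_ctx)] + chosen[2:]
            yield from parses(dropped, rest, lexicon)
```

During training, this exhaustive enumeration is what the overly general parser performs (limited by the training beam); at runtime, the statistical model prunes the same space to the most promising branches.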

  19. Parsing - An example • We now show a parse for Ich suche einen Artikel über Bush. We assume for simplicity that the parser always takes the best available action. The following trace presents the successive actions taken by the parser. The initial parse stack and input string states are: • [start:[]] and [ich,suche,einen,Artikel,topic(1)] Note: PP topicalized • Here is the complete parse, using the two previous lexical entries (lexicon(topic(1),[topic(1)]) and lexicon(suchen([],zeitung(_),zeit(_)),[suche])). Each line represents a parse state: • sHIFT(ich)#[start:[]]#[ich,suche,einen,Artikel,topic(1)] • The word ich is not in the semantic lexicon, so the only possible action is to shift it onto the parse stack. • iNTRODUCE(suchen([],zeitung(_),zeit(_)))#[start:[ich]]#[suche,einen,Artikel,topic(1)] • The word suche is in the lexicon, so it can be introduced as a new predicate on the parse stack. Another possibility would be to shift it. • sHIFT(einen)#[suchen([],zeitung(_),zeit(_)):[suche],start:[ich]]#[einen,Artikel,topic(1)] • The word einen is not in the lexicon; it must be shifted. • sHIFT(Artikel)#[suchen([],zeitung(_),zeit(_)):[suche,einen],start:[ich]]#[Artikel,topic(1)]

  20. Parsing - An example (continued) • The word Artikel is not in the lexicon (it is actually a relevant_word for the newspaper domain) and is therefore shifted. • iNTRODUCE(topic(1))#[suchen([],zeitung(_),zeit(_)):[suche,einen,Artikel],start:[ich]]#[topic(1)] • topic(1) is in the lexicon, so it can be introduced. • dROP(topic(1),suchen([],zeitung(_),zeit(_)))#[topic(1):[topic(1)],suchen([],zeitung(_),zeit(_)):[suche,einen,Artikel],start:[ich]]#[] • We can drop the predicate topic(1) into the first argument of the suchen predicate. The final parse stack, or final state, is: • [suchen([topic(1)],zeitung(_),zeit(_)):[suche,einen,Artikel],start:[ich]] • In the final stage, the parser simply puts back the meaning for topic(1) collected during the topic-extraction phase, and the final parse, without contextual information, is: • suchen([(Bush)],zeitung(_),zeit(_)) • which signals a search for the topic Bush with no specific newspaper or time frame. • Note: a parse WITH statistics is a parse in which each of the previous actions, plus a final state, is given a probability according to similarity and frequency.

  21. Results • Testing: from a pool of 90 examples, 9 different subsets of 10 test examples were used as test data (training beam of 1), the remaining 80 being the training examples. The parser averages 62% correctness when parsing a new sentence. Although it is difficult to compare the approach quantitatively with others, accuracy is therefore slightly lower than that of other approaches. I believe there are mainly three reasons for that: • The very low number (80) of training examples, compared to the 560, 225 and 4000 sentences of other approaches. • The lack of extensive testing on the best settings for the default values of the weighting parameters; only a set of rather intuitive values was used. • The assimilation of natural language utterances to sets instead of lists.

  22. Results - Learning curve: training with a beam of 1, parsing with a beam of 10

  23. Results - Parsing demonstration

  24. Ongoing research • Testing with larger domains and bigger corpora to see if the approach scales up well • See if the model can be simplified (in terms of number of switches) • Establish a clear relation between the statistical parameters (switches) and the type of corpus, in terms of efficiency and accuracy • Include a treatment of questions • Include automated acquisition of a lexicon, such as the one proposed in C. Thompson and R. Mooney, "Semantic lexicon acquisition for learning natural language interfaces", 6th Workshop on Very Large Corpora, August 1998 • Another useful tool for enlarging our semantic dictionary would be WordNet.

  25. Next step: Discourse • Extend the approach to discourse using Discourse Representation Theory (DRT) • DRT structures can also be mapped to logical structures such as the ones used to represent isolated sentences • By mapping discourse to DRT structures, the parser could be trained to resolve typical discourse problems such as pronoun resolution.

  26. Conclusion • Corpus-based methods offer a more effective way to deal with real data, and statistics offer an efficient and robust way to model and implement such methods on a computer. There is no need to rely on hand-crafted rules; only a set of training examples is needed. • The parser learns efficient ways of parsing new sentences by collecting statistics on the context in which each parsing action takes place. Compared to similar systems using machine-learning techniques, ours offers an approach in which linguistics, through context, can play a decisive role. • The parser configuration can be changed in many ways to fit different types of corpora or domains. • Some testing remains to be done to see how well the model scales up, but so far, promising results have paved the way for extending the system to even more sophisticated analysis, such as discourse information.
