
Learning Adjective-Noun Selectional Preference Using Probabilistic Graphical Model


Presentation Transcript


  1. Learning Adjective-Noun Selectional Preference Using Probabilistic Graphical Model Debaleena Chattopadhyay, Mandeep Singh Grang CSE 507, Spring 2011

  2. Outline • The Problem Statement • The Prior Collection • The Max-Flow Model • Results • Conclusion CSE 507, Spring 2011

  3. The Problem Statement To learn the selectional preferences of adjectives and use that knowledge for word sense disambiguation. • I want a red pen to write. • Stay away from the red-hot burners. • He likes to eat red meat. • Jones is looking at the fat guy. • He has a fat salary. CSE 507, Spring 2011

  4. Related Work • Selectional Preference and Sense Disambiguation, Resnik (1997) • Word Sense Disambiguation of Adjectives Using Probabilistic Networks, Chao et al. (2000) • Determinants of Adjective-Noun Plausibility, Lapata et al. (1999) • Evaluating and Combining Approaches to Selectional Preference Acquisition, Brockmann et al. (2007) • Web-based WSD Using Adjective-Noun Pairs, Buscaldi et al. • Explaining Away Ambiguity: Learning Verb Selectional Preference with Bayesian Networks, Ciaramita et al. (2000) CSE 507, Spring 2011

  5. Dataset • Training dataset (for prior collection): • Adjective-Noun and Verb-Adjective-Noun tuples • Hand-labeled descriptions from ImageCLEF (http://www.imageclef.org/) • Project Gutenberg eBooks (http://www.gutenberg.org/wiki/Main_Page) • Wiki text (collected with a web crawler) • Google N-gram counts • Test dataset (for testing the final model): • Adjective-Noun and Verb-Adjective-Noun tuples • SemCor 3.0 • Senseval-3 CSE 507, Spring 2011

  6. The Prior Collection Training data: sentences are parsed with a dependency parser, and co-occurrence counts are stored as Adjective-Noun and Adjective-Noun-Verb tuples (a sketch of this step follows below). Learn selectional preference from these counts to obtain posterior marginal probabilities. Use the WordNet hierarchy to create a subnetwork for each adjective-noun pair. CSE 507, Spring 2011
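
The slides do not name the dependency parser used. As a minimal sketch of the tuple-extraction step, the snippet below uses spaCy (an assumption, not the authors' parser) to count adjective-noun pairs via the amod relation, and verb-adjective-noun triples via the noun's governing verb.

```python
# Minimal sketch of the tuple-extraction step, assuming spaCy as the
# dependency parser (the slides do not name one).
from collections import Counter
import spacy

nlp = spacy.load("en_core_web_sm")
adj_noun = Counter()          # (adjective, noun) co-occurrence counts
verb_adj_noun = Counter()     # (verb, adjective, noun) co-occurrence counts

def count_tuples(sentence):
    for tok in nlp(sentence):
        # an adjective modifying a noun, e.g. "red" -> "meat"
        if tok.dep_ == "amod" and tok.head.pos_ == "NOUN":
            adj, noun = tok.lemma_, tok.head.lemma_
            adj_noun[(adj, noun)] += 1
            verb = tok.head.head  # the noun's governor, e.g. "eat"
            if verb.pos_ == "VERB":
                verb_adj_noun[(verb.lemma_, adj, noun)] += 1

count_tuples("He likes to eat red meat.")
print(adj_noun)        # Counter({('red', 'meat'): 1})
print(verb_adj_noun)   # Counter({('eat', 'red', 'meat'): 1})
```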

  7. The Prior Collection • For a given adjective, different nouns occur in the training data. • Each noun in WordNet belongs to certain classes, both general and specific; e.g. pen might belong to the classes 'Object', 'Writing Instrument', 'Small Things'. • For each noun in the data, query WordNet for its K most general classes (hypernyms), as in the sketch below. • For prior collection we do not consider the hypernym paths of the nouns, only the classes themselves. • Compute priors for adjective-noun tuples and adjective-class tuples from the training data. • Also compute priors for verb-adjective-noun tuples and verb-adjective-class tuples. • Use a probabilistic model to calculate these priors.
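
A minimal sketch of the WordNet query, using NLTK's interface; the value of K and the root-first ordering of hypernym paths are illustrative choices the slides leave open.

```python
# Minimal sketch: the K most general classes (hypernyms) of a noun,
# using NLTK's WordNet interface. K = 3 is an illustrative choice.
from nltk.corpus import wordnet as wn

def general_classes(noun, k=3):
    classes = set()
    for synset in wn.synsets(noun, pos=wn.NOUN):
        for path in synset.hypernym_paths():
            # hypernym_paths() runs root-first, so the first k entries
            # are the most general classes on that path
            classes.update(s.name() for s in path[:k])
    return classes

print(general_classes("pen"))
# e.g. {'entity.n.01', 'physical_entity.n.01', 'object.n.01', ...}
```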

  8. The Prior Collection A naïve Bayes model gives P(noun|adj):
• P(noun|adj) = P(adj|noun)P(noun)/P(adj)
• P(adj|noun) = P(adj|c1,c2,…,cK) = P(adj)P(c1|adj)P(c2|adj)…P(cK|adj) / P(c1,c2,…,cK) ∝ P(adj)P(c1|adj)P(c2|adj)…P(cK|adj), where the noun belongs to the set of classes C = {c1, c2, …, cK} according to the WordNet hypernym structure, and the denominator is dropped because it is constant when ranking nouns for a fixed adjective.
• The class conditionals are estimated from counts: P(class|adj) = #(class, adj)/#(adj).
Similarly, compute
• P(noun|adj,verb) = P(adj,verb|noun)P(noun)/P(adj,verb) = P(adj|noun)P(verb|noun)P(noun)/(P(adj)P(verb)), assuming the adjective and the verb are independent random variables; treating the noun prior as uniform leaves the score P(a|n)P(v|n).
• P(adj|noun) is calculated as above, and P(verb|noun) = P(verb|c1,c2,…,cK) ∝ P(verb)P(c1|verb)P(c2|verb)…P(cK|verb).
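
A minimal sketch of this scoring in log space; `pair_counts` and `adj_counts` are hypothetical count tables built from the parsed tuples, and the add-alpha smoothing is an assumption (the slides do not say how zero counts are handled). It reuses `general_classes` from the sketch above.

```python
import math

def class_given_adj(cls, adj, pair_counts, adj_counts,
                    alpha=1.0, n_classes=1000):
    # smoothed estimate of P(class | adj) = #(class, adj) / #(adj);
    # alpha and n_classes are assumptions to avoid zero probabilities
    return (pair_counts.get((adj, cls), 0) + alpha) / \
           (adj_counts.get(adj, 0) + alpha * n_classes)

def score_noun(noun, adj, pair_counts, adj_counts):
    # log P(noun | adj) up to additive constants:
    # sum over the noun's WordNet classes of log P(class | adj)
    return sum(math.log(class_given_adj(c, adj, pair_counts, adj_counts))
               for c in general_classes(noun))   # sketch above

# Usage: rank candidate nouns for an adjective.
# candidates = ["pen", "meat", "idea"]
# best = max(candidates,
#            key=lambda n: score_noun(n, "red", pair_counts, adj_counts))
```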

  9. A Max-Flow Problem • Model selectional preference as a maximum-flow problem on a graphical network. • The adjective synset is the 'source'. • The different noun synsets are the 'sinks'. • The hypernyms at different granularities are the vertices of the graph. • The edges between vertices follow the hypernym paths. • The weight on each vertex is its capacity, computed as P(class|adjective). • Find the path that maximizes the flow from source to sink, i.e. from the adjective to a noun-synset sink (see the sketch below). CSE 507, Spring 2011
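
A minimal sketch of the flow computation with networkx. Since networkx caps edges rather than vertices, each class vertex is split into an in/out pair carrying P(class | adjective); the tiny hierarchy and the numbers below are illustrative, not the authors' actual network.

```python
# Minimal sketch: selectional preference as max flow, using networkx.
# Vertex capacities P(class | adj) become edge capacities via node splitting;
# class names and probabilities below are placeholders for learned priors.
import networkx as nx

def preference_flow(adj, noun_synset, class_caps, hypernym_edges):
    G = nx.DiGraph()
    for cls, cap in class_caps.items():
        # split each class vertex so its capacity caps the flow through it
        G.add_edge((cls, "in"), (cls, "out"), capacity=cap)
    for parent, child in hypernym_edges:
        # hypernym-path edges; a missing 'capacity' attribute means unbounded
        G.add_edge((parent, "out"), (child, "in"))
    # the adjective (source) feeds the most general classes
    roots = {p for p, _ in hypernym_edges} - {c for _, c in hypernym_edges}
    for r in roots:
        G.add_edge(adj, (r, "in"))
    G.add_edge((noun_synset, "out"), "sink")
    value, _ = nx.maximum_flow(G, adj, "sink")
    return value

caps = {"entity": 1.0, "object": 0.6, "instrument": 0.4, "pen": 0.3}
edges = [("entity", "object"), ("object", "instrument"), ("instrument", "pen")]
print(preference_flow("red", "pen", caps, edges))  # 0.3: the bottleneck class
```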

  10. An Example CSE 507, Spring 2011

  11. Results Frequency Counts Enter an adjective: red • jersey, bricks, flowers, car, tee Enter an adjective: sullen • reproach, looks, gloomy Enter an adjective: dark • blue, brown, night, grey, boy, girl, eyes, room Naïve Bayes Probabilities Enter an adjective: red • ink, ornaments Enter an adjective: sullen • girl, life, sea, tone Enter an adjective: dark • weeks, damage, crimes CSE 507, Spring 2011

  12. Results Naïve Bayes Probabilities with verb-adjective-noun tuples Enter an adjective: red Enter a verb: drank • wine Enter an adjective: dark Enter a verb: see • time, book, path Enter an adjective: red Enter a verb: see • lion, tongues Enter an adjective: dark Enter a verb: was • order, traditions, people CSE 507, Spring 2011

  13. Results • Disambiguating word sense: [results table shown as an image in the original slides] CSE 507, Spring 2011

  14. Evaluation • We extracted adjective-noun and verb-adjective-noun pairs from the test set and created priors on them using the Google N-gram dataset. • We then built a graphical model for each adjective to disambiguate the sense of a noun using the adjective's selectional preference (a sketch follows below). CSE 507, Spring 2011
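
As a minimal sketch of that disambiguation step, the snippet below scores each WordNet sense of the noun by how well its hypernym classes fit the adjective's preference. It reuses `class_given_adj` from the naïve Bayes sketch; the exact sense-scoring rule is an assumption consistent with the slides, not the authors' verified procedure.

```python
# Minimal sketch: choose the WordNet sense of a noun whose hypernym classes
# best fit the adjective's selectional preference. Reuses class_given_adj
# from the naive Bayes sketch above; k = 3 is again illustrative.
from nltk.corpus import wordnet as wn
import math

def disambiguate(noun, adj, pair_counts, adj_counts, k=3):
    best_sense, best_score = None, float("-inf")
    for sense in wn.synsets(noun, pos=wn.NOUN):
        classes = {s.name() for path in sense.hypernym_paths()
                   for s in path[:k]}
        score = sum(math.log(class_given_adj(c, adj, pair_counts, adj_counts))
                    for c in classes)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense

# e.g. disambiguate("pen", "red", pair_counts, adj_counts) should prefer
# the writing-instrument sense over the animal-enclosure sense.
```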
