Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering

Kewei Tu and Vasant Honavar, Artificial Intelligence Research Laboratory, Department of Computer Science, Iowa State University. www.cs.iastate.edu/~honavar/aigroup.html www.cild.iastate.edu

Presentation Transcript


  1. Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering. Kewei Tu and Vasant Honavar, Artificial Intelligence Research Laboratory, Department of Computer Science, Iowa State University. www.cs.iastate.edu/~honavar/aigroup.html www.cild.iastate.edu. Talk presented at ICGI 2008, St Malo, France, September 2008.

  2. Unsupervised Learning of Probabilistic Context-Free Grammar • Greedy search to maximize the posterior of the grammar given the corpus • Iterative (distributional) biclustering • Competitive experimental results

  3. Outline • Introduction • Probabilistic Context-Free Grammars (PCFG) • The algorithm based on iterative biclustering (PCFG-BCL) • Experimental results

  4. Motivation • Probabilistic Context-Free Grammars (PCFGs) find applications in many areas, including: • Natural Language Processing • Bioinformatics • It is important to learn PCFGs from data (a training corpus) • A labeled corpus is not always available • Hence the need for unsupervised learning

  5. Task • Unsupervised learning of a PCFG from a positive corpus • Corpus: a square is above the triangle / the square rolls / a triangle rolls / the square rolls / a triangle is above the square / a circle touches a square / the triangle covers the circle / …… • Grammar: S → NP VP; NP → Det N; VP → Vt NP (0.3) | Vi PP (0.2) | rolls (0.2) | bounces (0.1); ……

  6. PCFG • Context-free grammar (CFG): G = (N, Σ, R, S) • N: non-terminals • Σ: terminals • R: rules • S ∈ N: the start symbol • Probabilistic CFG: probabilities on grammar rules
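The definition above can be sketched as a small data structure. This is only an illustration: the names `Rule`, `PCFG` and `is_normalized`, and the toy grammar with its probabilities, are assumptions and do not come from the talk.

```python
from collections import defaultdict
from dataclasses import dataclass
from math import isclose


@dataclass(frozen=True)
class Rule:
    head: str    # a non-terminal in N
    body: tuple  # sequence of symbols drawn from N or Sigma
    prob: float  # probability attached to the rule


@dataclass
class PCFG:
    nonterminals: set
    terminals: set
    rules: list
    start: str

    def is_normalized(self) -> bool:
        """Probabilities of the rules sharing a head must sum to 1."""
        totals = defaultdict(float)
        for r in self.rules:
            totals[r.head] += r.prob
        return all(isclose(t, 1.0) for t in totals.values())


# Toy grammar (illustrative only; the probabilities are made up).
toy = PCFG(
    nonterminals={"S", "NP", "VP", "Det", "N"},
    terminals={"a", "the", "square", "circle", "rolls", "bounces"},
    rules=[
        Rule("S", ("NP", "VP"), 1.0),
        Rule("NP", ("Det", "N"), 1.0),
        Rule("Det", ("a",), 0.5),
        Rule("Det", ("the",), 0.5),
        Rule("N", ("square",), 0.6),
        Rule("N", ("circle",), 0.4),
        Rule("VP", ("rolls",), 0.7),
        Rule("VP", ("bounces",), 0.3),
    ],
    start="S",
)
```

Calling `toy.is_normalized()` checks the one constraint that separates a PCFG from a plain CFG: the rule probabilities for each head form a distribution.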

  7. P-CNF • Probabilistic Chomsky normal form (P-CNF) • Two types of rules: A → BC and A → a

  8. The AND-OR form • P-CNF in the AND-OR form • Two types of non-terminals: AND, OR • AND → OR1 OR2 • OR → A1 | A2 | a1 | a2 | …… • with probabilities
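A minimal sketch of what one AND-OR group generates, assuming each OR symbol is a probability distribution over the symbols it rewrites to. The two OR distributions and the function name `pair_distribution` are hypothetical.

```python
from itertools import product

# Hypothetical AND-OR group: AND -> OR1 OR2, each OR a distribution over symbols.
or1 = {"a": 0.5, "the": 0.5}
or2 = {"square": 0.6, "circle": 0.4}


def pair_distribution(left: dict, right: dict) -> dict:
    """Probability of each symbol pair (x, y) generated by AND -> OR1 OR2."""
    return {
        (x, y): px * py
        for (x, px), (y, py) in product(left.items(), right.items())
    }


pairs = pair_distribution(or1, or2)
```

Because the two ORs are expanded independently, every pair probability factorizes into a row term times a column term.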

  9. The AND-OR form • P-CNF in the AND-OR form

  10. The AND-OR form • P-CNF in the AND-OR form can be divided into two parts • Start rules • S → … • A set of AND-OR groups • Each group: AND → OR1 OR2 • Bijection between ANDs and groups • An OR may appear in multiple groups

  11. The AND-OR form • P-CNF in the AND-OR form can be divided into two parts

  12. Outline • Introduction • Probabilistic Context-Free Grammars (PCFG) • The algorithm based on iterative biclustering (PCFG-BCL) • Experimental results

  13. PCFG-BCL: Outline • Start with only the terminals • Repeat the two steps • Learn a new AND-OR group by biclustering • Attach the new AND to existing ORs • Post-processing: add start rules • In principle, these steps are sufficient for learning any CNF grammar

  14. PCFG-BCL: Outline • Find new rules that yield the greatest increase in the posterior of the grammar given the corpus • Local search, with the posterior as the objective function • Use a prior that favors simpler grammars to avoid overfitting

  15. PCFG-BCL • Repeat the two steps • Learn a new AND-OR group by biclustering • Attach the new AND to existing ORs • Post-processing: add start rules

  16. Intuition • Construct a table T • Index the rows and columns by symbols appearing in the corpus • The cell at row x and column y records the number of times the pair xy appears in the corpus
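The table construction can be sketched as follows, assuming that "the pair xy" means x immediately followed by y in a sentence. The function name `build_pair_table` and the sparse `Counter` representation are illustrative choices, not taken from the talk.

```python
from collections import Counter

corpus = [
    "a square is above the triangle".split(),
    "the square rolls".split(),
    "a triangle rolls".split(),
]


def build_pair_table(sentences):
    """Count how often each adjacent symbol pair (x, y) occurs in the corpus."""
    table = Counter()
    for sent in sentences:
        for x, y in zip(sent, sent[1:]):
            table[(x, y)] += 1
    return table


T = build_pair_table(corpus)
```

`T[(x, y)]` is the cell at row x and column y; pairs that never occur read as 0 without being stored.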

  17. An AND-OR group corresponds to a bicluster

  18. The bicluster is multiplicatively coherent: a_ik / a_jk = a_il / a_jl for any two rows i, j and two columns k, l
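A small check of this property, assuming the usual definition of multiplicative coherence (the ratio a_ik / a_jk is the same for every column k). The cross-multiplied test a_ik · a_jl = a_il · a_jk is used so that zero cells do not cause a division error; the function name and the example matrices are hypothetical.

```python
from math import isclose


def is_multiplicatively_coherent(m, rel_tol=1e-9):
    """True if a_ik * a_jl == a_il * a_jk for all rows i, j and columns k, l."""
    rows, cols = len(m), len(m[0])
    return all(
        isclose(m[i][k] * m[j][l], m[i][l] * m[j][k], rel_tol=rel_tol)
        for i in range(rows) for j in range(i + 1, rows)
        for k in range(cols) for l in range(k + 1, cols)
    )


coherent = [[2, 4, 6], [1, 2, 3]]    # second row is half the first
incoherent = [[2, 4, 6], [1, 2, 4]]  # last cell breaks the ratio
```

Real counts are noisy, so an exact test like this is mainly useful for building intuition; a learner would need a graded score rather than a yes/no answer.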

  19. Expression-context matrix of a bicluster • Each row: a symbol pair contained in the bicluster • Each column: a context in which the symbol pairs appear in the corpus • It’s also multiplicatively coherent.
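A sketch of how such a matrix might be counted. The slide does not define "context"; here it is assumed to be the symbol just before and the symbol just after the pair, with `<s>` and `</s>` as made-up sentence-boundary markers. The corpus, bicluster and function name are hypothetical.

```python
from collections import Counter

corpus = [
    "a circle rolls".split(),
    "the circle rolls".split(),
    "the square bounces".split(),
]
bicluster_rows, bicluster_cols = {"a", "the"}, {"circle", "square"}


def expression_context_counts(sentences, rows, cols):
    """Count (pair, context) co-occurrences for pairs inside the bicluster.

    Assumed context: (symbol before the pair, symbol after the pair),
    with "<s>" and "</s>" standing in for sentence boundaries.
    """
    counts = Counter()
    for sent in sentences:
        padded = ["<s>"] + sent + ["</s>"]
        # i ranges over positions where both padded[i] and padded[i + 1] are real symbols
        for i in range(1, len(padded) - 2):
            x, y = padded[i], padded[i + 1]
            if x in rows and y in cols:
                counts[((x, y), (padded[i - 1], padded[i + 2]))] += 1
    return counts


ec = expression_context_counts(corpus, bicluster_rows, bicluster_cols)
```

Each key is a (row, column) index of the expression-context matrix: the row is a symbol pair, the column is a context.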

  20. Intuition • If there’s a bicluster that is multiplicatively coherent and has a multiplicatively coherent expression-context matrix • Then an AND-OR group can be learned from the bicluster

  21. Probabilistic Justification • Change in likelihood as a result of adding an AND-OR group to a PCFG • Terms: bicluster multiplicative coherence; expression-context matrix multiplicative coherence

  22. Prior • To prevent overfitting, use a prior that favors simpler grammars • P(G) ∝ 2^(−DL(G)) • DL(G) is the description length of the grammar
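A sketch of a description-length prior in log space, assuming the prior has the form P(G) ∝ 2^(−DL(G)). The talk does not say how DL(G) is encoded, so the encoding below (a fixed number of bits per symbol occurrence) is purely an assumption, as are the function names and the two rule sets.

```python
def description_length(rules, bits_per_symbol):
    """Toy description length: bits needed to spell out every symbol in every rule."""
    return sum((1 + len(body)) * bits_per_symbol for _, body in rules)


def log2_prior(rules, bits_per_symbol=4):
    """Unnormalized log2 prior: log2 P(G) = -DL(G) + const."""
    return -description_length(rules, bits_per_symbol)


small = [("S", ("NP", "VP")), ("NP", ("Det", "N"))]
large = small + [("VP", ("Vt", "NP")), ("VP", ("Vi", "PP"))]
```

Working with log2 P(G) avoids underflow, and the unknown normalizing constant cancels whenever two grammars are compared.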

  23. Learning a new AND-OR group by biclustering • Find in the table T a bicluster that leads to the maximal posterior gain • Create a new AND-OR group from the bicluster • Reduce the corpus using the new rules • E.g., “the circle” is rewritten to the new AND symbol • Update T • A new row and column are added for the new AND symbol
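The corpus-reduction step can be sketched as a left-to-right rewrite, assuming the new group is given as the set of symbol pairs it covers. The symbol name `AND1`, the pair set and the function name are hypothetical.

```python
def reduce_sentence(sentence, pairs, new_symbol):
    """Rewrite every adjacent pair found in `pairs` to `new_symbol`, left to right."""
    out, i = [], 0
    while i < len(sentence):
        if i + 1 < len(sentence) and (sentence[i], sentence[i + 1]) in pairs:
            out.append(new_symbol)
            i += 2  # both symbols of the pair are consumed
        else:
            out.append(sentence[i])
            i += 1
    return out


# Pairs covered by a hypothetical new AND-OR group named "AND1".
group_pairs = {("the", "circle"), ("a", "circle"), ("the", "square"), ("a", "square")}
reduced = reduce_sentence("the triangle covers the circle".split(), group_pairs, "AND1")
```

After every sentence is rewritten this way, recounting adjacent pairs on the reduced corpus yields the updated table, including the row and column for the new symbol.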

  24. PCFG-BCL • Repeat the two steps • Learn a new AND-OR group by biclustering • Attach the new AND to existing ORs • Post-processing: add start rules

  25. Attaching the new AND under existing ORs • For the new AND symbol N … • There may exist OR symbols in the learned grammar, s.t. O → N is in the target grammar • Such rules can't be learned in the biclustering step • When learning O, N doesn’t exist • When learning N, only learn N → AB • We need an additional step to find such rules • Recursion is learned in this step

  26. Intuition • Adding rule O → N = adding a new row/column to the bicluster • If O → N is true, then • the expanded bicluster is multiplicatively coherent • the expanded expression-context matrix is multiplicatively coherent • If we find an OR symbol s.t. the expanded bicluster has this property • Then a new rule O → N can be added to the grammar

  27. Probabilistic Justification • Likelihood gain is an approximation of the expanded bicluster • To prevent overfitting, the prior is also considered

  28. Attaching the new AND under existing ORs • Try to find OR symbols that lead to large posterior gain • When found • add the new rule O → N to the grammar • do a maximal reduction of the corpus • update the table T

  29. PCFG-BCL • Repeat the two steps • Learn a new AND-OR group by biclustering • Attach the new AND to existing ORs • Post-processing: add start rules

  30. Postprocessing • For each sentence in the corpus: • If it’s fully reduced to a single symbol x, then add S → x • If not, a few options… • Return the grammar
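A sketch of the first branch of this step. The probability estimate (relative frequency among fully reduced sentences) is an assumption, and sentences that are not fully reduced are simply returned, since the slide leaves their handling open. The symbol names and the function name `start_rules` are hypothetical.

```python
from collections import Counter


def start_rules(reduced_corpus):
    """Add S -> x for each sentence fully reduced to one symbol x.

    Probabilities are relative frequencies among the fully reduced sentences
    (an assumed estimate); unreduced sentences are returned untouched.
    """
    heads = Counter(s[0] for s in reduced_corpus if len(s) == 1)
    total = sum(heads.values())
    rules = {("S", x): n / total for x, n in heads.items()}
    leftover = [s for s in reduced_corpus if len(s) != 1]
    return rules, leftover


rules, leftover = start_rules([["AND7"], ["AND7"], ["AND9"], ["AND3", "rolls"]])
```

Here `rules` maps each start rule S → x to its estimated probability, and `leftover` holds the sentences that still need one of the unspecified options.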

  31. Outline • Introduction • Probabilistic Context-Free Grammars (PCFG) • The algorithm based on iterative biclustering (PCFG-BCL) • Experimental results

  32. Experiments • Measurements: weak generative capacity (precision, recall, F-score) • Test data: artificial, English-like CFGs
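A sketch of how these measures are commonly computed for weak generative capacity, assuming precision is the share of sentences sampled from the learned grammar that the target grammar accepts, and recall is the converse. The two finite "languages" below are hypothetical stand-ins for real parsers and samplers.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (0 if both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def evaluate(learned_sample, target_accepts, target_sample, learned_accepts):
    """Precision: learned-grammar sentences the target grammar accepts.
    Recall: target-grammar sentences the learned grammar accepts."""
    precision = sum(map(target_accepts, learned_sample)) / len(learned_sample)
    recall = sum(map(learned_accepts, target_sample)) / len(target_sample)
    return precision, recall, f_score(precision, recall)


# Hypothetical stand-ins for parsers: membership in small finite languages.
target_lang = {"a b", "a c", "b c", "c c"}
learned_lang = {"a b", "a c", "a a"}
p, r, f = evaluate(sorted(learned_lang), target_lang.__contains__,
                   sorted(target_lang), learned_lang.__contains__)
```

With real grammars, the two samples would be drawn by generating sentences from each grammar and the two acceptance tests would be parsers.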

  33. Experiment results • Baselines: EMILE [Adriaans, et al., 2000], ADIOS [Solan, et al., 2005] • P=Precision, R=Recall, F=F-score • Number in the parentheses: standard deviation • PCFG-BCL outperforms EMILE and ADIOS • with lower standard deviations

  34. Summary • An unsupervised PCFG-learning algorithm • It acquires new grammar rules by iterative biclustering on a table of symbol pairs • In each step it tries to maximize the increase of the posterior of the grammar • Competitive experimental results

  35. Work in progress • Alternative strategies for optimizing the objective function • Evaluation on and adaptation to real-world applications (e.g., natural language), with respect to both weak and strong generative capacity

  36. Thank you~

  37. Backup…

  38. Step 1 • Posterior gain: • Likelihood gain • Bicluster multiplicative coherence • E-C matrix multiplicative coherence • Prior gain (bias towards large BC)

  39. Step 2 Intuition • Remember O is learned by extracting a bicluster • Adding rule O → N = adding a new row/column to the bicluster

  40. Expanding the bicluster • The expanded bicluster should still be multiplicatively coherent

  41. Step 2 Intuition • Expression-context matrix • Adding rule O → N = adding a set of new rows to the E-C matrix

  42. Expanding the expression-context matrix • The expanded expression-context matrix should still be multiplicatively coherent.

  43. Step 2 • Likelihood gain: • : the expected numbers of appearance of the symbol pairs when applying the current grammar to expand the current partially reduced corpus.

  44. Grammar selection/averaging • Run the algorithm multiple times to get multiple grammars • Use the posterior of the grammars to do model selection/averaging • Experimental results: • Improved the performance • Decreased the standard deviations
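A sketch of both options, assuming each run reports an unnormalized log posterior for its grammar. The run names and scores are made up, and the function names are hypothetical.

```python
from math import exp


def select_grammar(runs):
    """Model selection: pick the run whose grammar has the highest log posterior."""
    return max(runs, key=lambda run: run["log_posterior"])


def posterior_weights(runs):
    """Model averaging: normalized posterior weights, computed stably in log space."""
    top = max(run["log_posterior"] for run in runs)
    raw = [exp(run["log_posterior"] - top) for run in runs]
    total = sum(raw)
    return [w / total for w in raw]


# Hypothetical results of three runs; the scores are made up.
runs = [
    {"name": "run1", "log_posterior": -1520.4},
    {"name": "run2", "log_posterior": -1498.7},
    {"name": "run3", "log_posterior": -1510.1},
]
best = select_grammar(runs)
weights = posterior_weights(runs)
```

Subtracting the largest log posterior before exponentiating keeps the weights from underflowing to zero, which would otherwise happen with scores of this magnitude.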

  45. Time Complexity • N: # of ANDs • k: average # of rules headed by an OR • c: average column# of Expr-Cont Matrix • h: average # of ORs that produce an AND or terminal • d: a recursion depth limit • ω: sentence# in the corpus • m: average sentence length

  46. biclustering vs. distributional clustering • Figure from [Adriaans, et al., 2000] • V1 → makes | likes • V2 → likes | is

  47. biclustering vs. substitutability heuristic • Figure from [Adriaans, et al., 2000] • N1 → tea | coffee • N2 → eating

  49. A set of multiplicatively coherent biclusters, which represent a set of AND-OR groups in the grammar.

  50. Related work • Unsupervised CFG learning • EMILE [Adriaans et al., 2000] • ABL [Zaanen, 2000] • [Clark, 2001; 2007] • ADIOS [Solan et al., 2005] • Main difference • Distributional biclustering • A unified method for different types of rules
