Incrementally Learning the Parameters of Stochastic CFGs using Summary Statistics

Written by: Brent Heeringa and Tim Oates

- To learn the syntax of utterances
Approach:

- SCFG (Stochastic Context-Free Grammar)
M = <V, E, R, S>

V: finite set of non-terminals

E: finite set of terminals

R: finite set of rules; each rule r has a probability p(r).

The probabilities p(r) of all rules with the same left-hand side sum to 1.

S: start symbol
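
As a minimal sketch of the definition above, a SCFG can be held as a mapping from rules to probabilities, with a check that the probabilities of rules sharing a left-hand side sum to 1. The toy grammar, the dictionary layout, and the name `is_normalized` are illustrative assumptions, not taken from the original work.

```python
from collections import defaultdict
import math

# Hypothetical toy grammar: each key is (left-hand side, right-hand side),
# each value is the rule probability p(r).
rules = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("she",)): 0.6,
    ("NP", ("fish",)): 0.4,
    ("VP", ("V", "NP")): 1.0,
    ("V", ("eats",)): 1.0,
}

def is_normalized(rules, tol=1e-9):
    """Return True if p(r) sums to 1 for every left-hand side."""
    totals = defaultdict(float)
    for (lhs, _rhs), p in rules.items():
        totals[lhs] += p
    return all(math.isclose(t, 1.0, abs_tol=tol) for t in totals.values())
```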

1) Expensive storage: a corpus of complete sentences must be stored

2) Time-consuming: the algorithm needs to make repeated passes over all the data

- Inducing context-free structure from a corpus of sentences
- Learning the production (rule) probabilities

General method: Inside/Outside algorithm

Expectation-Maximization (EM)

Find the expected number of times each rule is used

Maximize the likelihood given both the expectations and the corpus
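
The expectation step relies on inside probabilities. The sketch below computes only the inside pass, and it assumes a grammar in Chomsky normal form supplied as two dictionaries (`unary` and `binary`); these names and the toy grammar are assumptions made for illustration. The nested loops over span length, start position, and split point are where the cubic cost in sentence length comes from.

```python
from collections import defaultdict

def inside_probability(words, unary, binary, start="S"):
    """Return P(words | grammar) for a grammar in Chomsky normal form.

    unary:  dict mapping (A, word) -> probability of A -> word
    binary: dict mapping (A, B, C) -> probability of A -> B C
    """
    n = len(words)
    # beta[(i, j)][A] = probability that A derives words[i:j]
    beta = defaultdict(lambda: defaultdict(float))
    for i, w in enumerate(words):
        for (a, word), p in unary.items():
            if word == w:
                beta[(i, i + 1)][a] += p
    for span in range(2, n + 1):              # span length
        for i in range(n - span + 1):         # span start
            j = i + span
            for k in range(i + 1, j):         # split point
                for (a, b, c), p in binary.items():
                    left = beta[(i, k)].get(b, 0.0)
                    right = beta[(k, j)].get(c, 0.0)
                    if left and right:
                        beta[(i, j)][a] += p * left * right
    return beta[(0, n)].get(start, 0.0)

# Hypothetical toy grammar in Chomsky normal form.
unary = {("NP", "she"): 0.6, ("NP", "fish"): 0.4, ("V", "eats"): 1.0}
binary = {("S", "NP", "VP"): 1.0, ("VP", "V", "NP"): 1.0}
prob = inside_probability(["she", "eats", "fish"], unary, binary)
```

The full Inside/Outside procedure would also need the outside pass and the re-estimation step, which this sketch leaves out.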

Disadvantage of the Inside/Outside algorithm

The entire sentence corpus must be stored in some representation (e.g., a chart parse)

Expensive storage (unrealistic for a human agent!)

- Use Unique Normal Form (UNF)
- Replace every terminal rule A -> z with 2 new rules
- A -> D, with p[A -> D] = p[A -> z]
- D -> z, with p[D -> z] = 1

- No two productions have the same right-hand side
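
The terminal-splitting step can be sketched as follows. This is an illustration under stated assumptions: terminals are lowercase strings, non-terminals are uppercase, and the fresh non-terminal is named `D_<terminal>`. The names `to_unf_terminals` and `has_unique_rhs` are hypothetical. The sketch performs only the split and then checks the unique right-hand-side property; it does not enforce it.

```python
def to_unf_terminals(rules):
    """Split each terminal rule A -> z into A -> D_z and D_z -> z.

    rules maps (lhs, rhs_tuple) -> probability. Terminals are assumed to be
    lowercase strings and non-terminals uppercase (assumption of this sketch).
    """
    out = {}
    for (lhs, rhs), p in rules.items():
        if len(rhs) == 1 and rhs[0].islower():
            d = "D_" + rhs[0]        # hypothetical fresh non-terminal name
            out[(lhs, (d,))] = p     # p[A -> D] = p[A -> z]
            out[(d, rhs)] = 1.0      # p[D -> z] = 1
        else:
            out[(lhs, rhs)] = p
    return out

def has_unique_rhs(rules):
    """Return True if no two productions share a right-hand side."""
    seen = [rhs for (_lhs, rhs) in rules]
    return len(seen) == len(set(seen))

# Hypothetical toy grammar.
toy = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("she",)): 0.6,
    ("NP", ("fish",)): 0.4,
    ("VP", ("V", "NP")): 1.0,
    ("V", ("eats",)): 1.0,
}
converted = to_unf_terminals(toy)
```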

- Use histograms
- Each rule r has 2 histograms (H^O_r, H^L_r)

- H^O_r: constructed when parsing the sentences in O
- H^L_r: continually updated throughout the learning process
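
One way to picture the per-rule histograms is sketched below. It assumes that bin k of a rule's histogram counts the sentences whose parse used that rule k times; the notes above do not spell out the bin definition, so treat this as an assumption. The function name `update_histograms` and the rule labels are likewise hypothetical.

```python
from collections import Counter

def update_histograms(histograms, rules_used):
    """Add one parsed sentence to the per-rule histograms.

    histograms: dict mapping rule label -> Counter of {times used: sentences}
    rules_used: list of rule labels used in the sentence's parse, with repeats
    """
    uses = Counter(rules_used)
    for rule, hist in histograms.items():
        # Assumed bin meaning: bin k counts sentences using the rule k times.
        hist[uses.get(rule, 0)] += 1

# Hypothetical rule labels and parses.
histograms = {"NP -> NP PP": Counter(), "S -> NP VP": Counter()}
update_histograms(histograms, ["S -> NP VP", "NP -> NP PP", "NP -> NP PP"])
update_histograms(histograms, ["S -> NP VP"])
```

Only counts are kept in this sketch, so a sentence can be discarded once its parse has been tallied.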

- Why?!
- Recently used rules have more impact on the histogram

- Relative entropy
- T decreases: increase the probability of the rules used
- (if s is large, increase the probability of the rules used when parsing the last sentence)

- T increases: decrease the probability of the rules used
(e.g., p_{t+1}(r) = 0.01 * p_{t+1}(r))
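
Relative entropy between two histograms can be computed as in the sketch below. It assumes the histograms are dictionaries of bin -> count, normalizes them to probability distributions, and adds a small smoothing constant `eps` so that empty bins do not cause a division by zero. The direction of the comparison, the smoothing, and the name `relative_entropy` are assumptions of this sketch, not details from the notes above.

```python
import math

def relative_entropy(hist_p, hist_q, eps=1e-9):
    """KL divergence D(P || Q) between two count histograms (bin -> count)."""
    bins = set(hist_p) | set(hist_q)
    # Smoothed totals, so the smoothed probabilities still sum to 1.
    total_p = sum(hist_p.values()) + eps * len(bins)
    total_q = sum(hist_q.values()) + eps * len(bins)
    kl = 0.0
    for b in bins:
        p = (hist_p.get(b, 0) + eps) / total_p
        q = (hist_q.get(b, 0) + eps) / total_q
        kl += p * math.log(p / q)
    return kl

# Hypothetical histograms for one rule.
same = relative_entropy({0: 3, 1: 1}, {0: 3, 1: 1})
different = relative_entropy({0: 3, 1: 1}, {0: 1, 1: 3})
```

A value near zero means the two histograms have nearly the same shape; larger values mean they differ more.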

Inside/Outside
- Complexity: O(n^3)
- Good: 3-5 iterations
- Bad: needs to store the complete sentence corpus

Proposed algorithm
- Complexity: O(n^3)
- Bad: 500-1000 iterations
- Good: memory requirement is constant!