Unsupervised Learning of Natural Languages
Eitan Volsky, Yasmine Meroz

Introduction
Grammar learning methods can be grouped into two kinds: supervised and unsupervised.
“David tea” is a “Context”
“makes” is an “Expression”
K(s) < log(|G|)
Kolmogorov complexity = description length of s
Suppose that:
Total_support% = Context_support% = 75%
Expression_support% = 80%
All constituents of the same kind can be replaced by each other.
ABL uses a reversed version of this principle:
If parts of sentences can be substituted by each other, they are constituents of the same type.
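The reversed principle can be illustrated with a minimal sketch. It assumes whitespace tokenization and uses Python's standard difflib to align two sentences; the function name hypothesize_constituents and the second example sentence are hypothetical, chosen only for illustration, and this is not presented as the actual ABL implementation.

```python
from difflib import SequenceMatcher


def hypothesize_constituents(sentence_a, sentence_b):
    """Return the pairs of unequal parts found when aligning two sentences.

    Each pair is a candidate for "constituents of the same type",
    because the two parts occur in the same surrounding words.
    """
    a, b = sentence_a.split(), sentence_b.split()
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # keep only the parts that differ
            pairs.append((a[i1:i2], b[j1:j2]))
    return pairs


# Illustrative sentences (the second one is an assumption, not from the slides).
print(hypothesize_constituents("David makes tea", "David likes tea"))
# → [(['makes'], ['likes'])]
```

Here "makes" and "likes" are substitutable in the same surroundings, so the sketch groups them as candidates for the same constituent type.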
Also finds the longest common subsequence, and gives an estimate of how distant the link between the two parts is.
From (San Francisco to)1 Dallas ()2
From ()1 Dallas (to San Francisco)2
From (San Francisco)1 to (Dallas)2
From (Dallas)1 to (San Francisco)2
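The alternative alignments above differ only in which shared word is linked. The following minimal sketch reproduces the bracketings for a chosen link word; the helper name bracket and its prefix_len parameter are hypothetical, and it assumes whitespace tokenization, a single-token link word, and a shared one-word prefix ("From").

```python
def bracket(sentence, link, prefix_len=1):
    """Bracket the unequal parts on either side of one linked word.

    prefix_len is the number of leading words shared by both sentences
    (assumed to be 1 here, for "From").
    """
    tokens = sentence.split()
    i = tokens.index(link, prefix_len)  # position of the linked word
    head = " ".join(tokens[:prefix_len])
    before = " ".join(tokens[prefix_len:i])
    after = " ".join(tokens[i + 1:])
    return f"{head} ({before})1 {link} ({after})2"


s1 = "From San Francisco to Dallas"
s2 = "From Dallas to San Francisco"

for link in ("Dallas", "to"):
    print(bracket(s1, link))
    print(bracket(s2, link))
```

Linking on "Dallas" leaves one empty slot in each sentence, while linking on "to" pairs the two city names with each other.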