Syntactic category acquisition
Early words (Clark 2003)
people: daddy, mommy, baby
animals: dog, kitty, bird, duck
body parts: eye, nose, ear
food: banana, juice, apple, cheese
toys: ball, balloon, book
clothes: shoe, sock, hat
vehicles: car, truck, boat
Syntactic categories are commonly defined in terms of their distribution; thus, it cannot be a surprise that distributional information is informative about syntactic category status. The argument is trivial or even circular.
The vast number of possible relationships that might be included in a distributional analysis is likely to overwhelm any distributional learning mechanism in a combinatorial explosion. (Pinker 1984)
The interesting properties of linguistic categories are abstract and such abstract properties cannot be detected in the input. (Pinker 1984)
Even if the child is able to determine certain correlations between distributional regularities and syntactic categories, this information is of little use because there are so many different cross-linguistic correlations that the child wouldn’t know which ones are relevant in his/her language. (Pinker 1984)
Spurious correlations will occur in the input and will be misleading. For instance, if the child hears
John eats meat.
John eats slowly.
The meat is good.
the child may erroneously infer that "The slowly is good" is a possible English sentence. (Pinker 1984)
Corpus: the speech of all adult speakers in the CHILDES database (2.5 million words).
Target words: 1000 most frequent words in the corpus
Context words: 150 most frequent words in the corpus
2 words preceding + 2 words following the target word:
x the __ of x
in the __ x x
will have __ the x
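The windowing scheme above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the study: the function name `context_vectors`, the toy corpus, and the two context words are invented for the example, and a real analysis would use the 1000 most frequent target words and 150 most frequent context words from the corpus.

```python
def context_vectors(tokens, targets, context_words, offsets=(-2, -1, 1, 2)):
    """For each target word, count how often each context word
    occurs at each window position (2 before, 2 after)."""
    context_index = {w: i for i, w in enumerate(context_words)}
    n = len(context_words)
    # One block of n counts per window position, concatenated.
    vectors = {t: [0] * (len(offsets) * n) for t in targets}
    for pos, word in enumerate(tokens):
        if word not in vectors:
            continue
        for slot, off in enumerate(offsets):
            j = pos + off
            # Ignore positions outside the corpus and non-context words.
            if 0 <= j < len(tokens) and tokens[j] in context_index:
                vectors[word][slot * n + context_index[tokens[j]]] += 1
    return vectors

# Toy corpus, invented for illustration only.
tokens = "the dog saw the cat and the cat saw the dog".split()
vectors = context_vectors(tokens, targets=["dog", "cat"],
                          context_words=["the", "saw"])
for word, vec in vectors.items():
    print(word, vec)
```

Each vector has one count per (window position, context word) pair, so with 4 positions and 150 context words a target word would be described by 600 counts.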
Target word 1 210-321-2-0
Target word 2 376-917-1-5
Target word 3 0-1-1078-1298
Target word 4 1-4-987-1398
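To see why such vectors support categorization, the four example vectors can be compared pairwise. The sketch below reads each hyphen-separated string as four counts and uses cosine similarity purely for illustration; the slides do not state which similarity measure was used, so the measure is an assumption.

```python
import math

def cosine(u, v):
    """Cosine similarity between two count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# The four example vectors from the slide, read as four counts each.
w1 = [210, 321, 2, 0]
w2 = [376, 917, 1, 5]
w3 = [0, 1, 1078, 1298]
w4 = [1, 4, 987, 1398]

print(round(cosine(w1, w2), 3))  # words 1 and 2: similar contexts
print(round(cosine(w3, w4), 3))  # words 3 and 4: similar contexts
print(round(cosine(w1, w3), 3))  # words 1 and 3: dissimilar contexts
```

Target words 1 and 2 pattern together, as do target words 3 and 4, while the two pairs have almost nothing in common. A cluster analysis over such similarities would group the words accordingly.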
Local contexts have the strongest effect; the word immediately preceding the target word is especially important.
"Learners might be innately biased towards considering only these local contexts, whether as a result of limited processing abilities (e.g. Elman 1993) or as a result of language specific representational bias." (Redington et al. 1998)
[Figure: level of accuracy as a function of the number of target words]
Distributional learning is most efficient for high-frequency open-class words.
nouns < verbs < function words
"Although content words are typically much less frequent, their context is relatively predictable … Because there are many more content words, the context of function words will be relatively amorphous." (Redington et al. 1998)
[Figure: level of accuracy as a function of the number of words]
Including information about utterance boundaries did not improve the level of accuracy.
‘Frequency vectors’ were replaced by ‘occurrence vectors’:
Frequency vector Occurrence vector
The cluster analysis still revealed significant clusters, but performance was much better when frequency information was included.
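The difference between the two vector types can be made concrete with a short sketch. It assumes that an occurrence vector records only whether a context word ever appeared in a given position (1) or not (0); the slides do not spell out the definition, and the function name `to_occurrence` is hypothetical. The input is the vector shown earlier for Target word 1.

```python
def to_occurrence(frequency_vector):
    """Replace every non-zero count with 1, discarding frequency information."""
    return [1 if count > 0 else 0 for count in frequency_vector]

frequency_vector = [210, 321, 2, 0]   # Target word 1 from the earlier slide
occurrence_vector = to_occurrence(frequency_vector)
print(occurrence_vector)  # [1, 1, 1, 0]
```

After the conversion, a context seen 2 times and a context seen 321 times look identical, which suggests why performance drops when frequency information is removed.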
Early child language includes very few function words. Thus, Redington et al. removed all function words from the context and repeated the cluster analysis without function words.
Performance decreased, but the results were still significant.
The cluster analyses were performed over the distribution of individual items. It is conceivable that the child recognizes at some point discrete syntactic categories (cf. semantic bootstrapping), which may facilitate the categorization task.
Representing particular word classes through discrete category labels (e.g. N) does not improve the categorization of other categories (e.g. V).
(1) The man [in the yellow car] …
(2) She [has not yet been] to NY.
(1) Nouns vs. verbs
(2) Open class vs. closed class.
1. Distributional information
2. Phonological information
Phonological features do not just reinforce distributional information, but seem to be especially powerful in domains in which distributional information is not so easily available.