Computational Cognitive Modelling

  1. Computational Cognitive Modelling COGS 511-Lecture 6 Computational Cognitive Modelling in Studying Inflectional Morphology COGS 511

  2. Related Readings Readings: Nakisa et al. Single and Dual-Route Models of Inflectional Morphology. In Broeder, P. and J. Murre (2002). Models of Language Acquisition: Inductive and Deductive Approaches, OUP. Optional and Further Readings • Rumelhart and McClelland (1986). On Learning the Past Tenses of English Verbs. In McClelland et al. (eds) Parallel Distributed Processing, vol. 2, MIT Press. • The Past Tense Debate (articles and replies by Pinker and Ullman vs. McClelland and Patterson). Trends in Cognitive Sciences, 6(11), 2002. • Taatgen and Anderson (2002). Why Do Children Learn to Say “Broke”? A Model of Learning the Past Tense without Feedback. Cognition 86, pp. 123-155. • Almor, A. (2003). Past Tense Learning. In Arbib, M. (ed). Handbook of Brain Theory and Neural Networks, MIT Press. • Pinker, S. (1999). Words and Rules: The Ingredients of Language. Phoenix. • Marcus, G. (2000). The Algebraic Mind: Integrating Connectionism and Cognitive Science, MIT Press. • Marcus. Children’s Overregularization and Cognition. In Broeder and Murre (2002). All adopted figures are referenced in the notes section of the relevant slide.

  3. Units of Speech • Phones: unitary segments of the stream of speech. • Phones are made up of phonological features according to place of articulation, voicing, etc.: [+glottal], [+voiced], [-voiced]; see also consonants and vowels. • Phonemes: abstract units characterizing a phone and its allophones (variants of the same sound): [p] in spin and [pʰ] in pin are allophones of the same phoneme /p/. • Syllables: combinations of phones.

  4. Morphology • The study of word structure: of words and how they are formed. • Morphemes: the smallest meaningful linguistic units. A morpheme may have more than one phonemic form, each of which is an allomorph of the morpheme; a meaningful form is a morph. • Examples: a/an in English; -ler/-lar in Turkish (-lAR); will/’ll (contractions) in English.

  5. Derivational vs Inflectional Affixes • Derivational: change of function; may change the part of speech (energy, noun + -ize → energize, verb) or derive a new word of the same class (happy → unhappy, pig → piglet). • Inflectional: bound forms of grammatical morphemes; no change of function or part of speech, rather markings for tense, gender, case, number, e.g. plural morphemes, past tense formation.

  6. Other Terminology • Lexicon: our mental dictionary; the average adult knows 45,000 to 60,000 words. • Root: a lexical morpheme that is the base for morphological processes. • Stem: used either as a synonym for root or for the base of inflectional morphology. • Word class, category, part of speech: a linguistically relevant group of words that share particular linguistic properties: nouns, verbs, adjectives, adverbs, prepositions, pronouns, determiners, etc. • Suppletive forms: irregular related forms, e.g. be and were. • Partial suppletion: sub-regularity, e.g. sing-sang, ring-rang.

  7. Morphological Rules • Morphological rules express: • The choice among allomorphs when a morpheme has more than one, e.g. kitaplar. • Necessary and possible combinations and orders of the morphemes that make up words (morphotactics), e.g. *kitabımlar. • Morphosyntactic constraints, e.g. subject-verb agreement: I eat but she eats.
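
The allomorph choice behind kitaplar can be sketched in a few lines. This is a minimal illustration, assuming lowercase input and plain two-way vowel harmony on the last vowel of the stem; the function name `turkish_plural` is hypothetical, and exceptional loanwords that do not follow harmony are ignored.

```python
# Toy allomorph selection for the Turkish plural morpheme -lAR.
# Assumption: the suffix vowel agrees in backness with the last stem vowel.
BACK_VOWELS = set("aıou")
FRONT_VOWELS = set("eiöü")


def turkish_plural(stem: str) -> str:
    """Attach -lar after a back vowel and -ler after a front vowel."""
    for ch in reversed(stem):
        if ch in BACK_VOWELS:
            return stem + "lar"
        if ch in FRONT_VOWELS:
            return stem + "ler"
    raise ValueError(f"no vowel found in {stem!r}")


if __name__ == "__main__":
    for word in ["kitap", "ev", "okul", "göz"]:
        print(word, "->", turkish_plural(word))
```

The rule only decides between the two allomorphs; the morphotactic constraint (possessive before or after plural) would need a separate ordering check.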

  8. Language Impairments • Aphasias (impairments in language and speech); developmental disorders (autism, Williams syndrome). • Broca’s aphasia (aka cortical motor aphasia): slow, halting, telegraphic speech; comprehension shows finer distinctions (basic word order vs. movement). • Wernicke’s aphasia (aka cortical sensory aphasia): difficulties in understanding language; grammatical but meaningless utterances. • Common symptoms: paraphasias (production errors like chair for table; tame for lame); anomia (difficulties in finding the right word); echolalia (compulsive repetition). • Agrammatism: impairment of comprehension often associated with agrammatic production (absence of grammatical morphemes) in nonfluent aphasics.

  9. The Past Tense Debate and Inflectional Processes • Is regular inflection (e.g. the English past tense suffix -ed) evidence for rules in mental computation? • Wug test (Berko, 1958): one wug, two ___? English speakers (age 3 upwards) apply the regular rule to new words they have not heard before. • Overregularization errors: at around age 3, children who may previously have used irregular forms correctly suddenly start to regularize many irregular forms inappropriately: went → goed/wented. Plotting children’s performance against age gives what is known as the “U-shaped learning curve”. • Acquisition of a rule? • A qualitative change in the learning mechanism?

  10. Dual vs Single Route Mechanisms • Dual route (Pinker and others; Pinker’s post-1999 version is aka the Words and Rules theory). Proposal: inflectional morphology in all human languages is computed by a dual route mechanism consisting of a pattern-associator type of memory module (for irregulars and for frequently encountered, possibly irregular-sounding regulars) and a rule (for defaults), which is unblocked only when the pattern associator fails. • Single route (McClelland and others). Proposal: a single mechanism handles both regular and exceptional forms; mainly put forward by connectionist modelling.
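
The control flow of the dual route proposal (memory first, rule only when memory fails) can be sketched as below. This is a deliberately crude illustration: a Python dictionary stands in for the pattern-associator memory, the spelling rule is simplified, and the names `IRREGULAR_MEMORY`, `regular_rule` and `dual_route_past` are hypothetical.

```python
# Toy dual route: stored irregulars block the default rule.
# Assumption: memory is an exact-match lookup, not a graded pattern associator.
IRREGULAR_MEMORY = {"go": "went", "sing": "sang", "break": "broke"}


def regular_rule(stem: str) -> str:
    """Default route: add the regular suffix (simplified spelling)."""
    return stem + "d" if stem.endswith("e") else stem + "ed"


def dual_route_past(stem: str, memory=None) -> str:
    """Return the stored form if retrieval succeeds, else apply the rule."""
    if memory is None:
        memory = IRREGULAR_MEMORY
    stored = memory.get(stem)
    if stored is not None:
        return stored  # successful retrieval blocks the rule
    return regular_rule(stem)  # memory failed: the default rule is unblocked


if __name__ == "__main__":
    print(dual_route_past("go"))             # retrieval succeeds
    print(dual_route_past("walk"))           # no stored form, rule applies
    print(dual_route_past("go", memory={}))  # simulated retrieval failure
```

Calling the function with an empty memory mimics a retrieval failure, which is how this account produces overregularizations such as goed.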

  11. Rumelhart and McClelland (1986) • Landmark connectionist model in the past tense debate. • Input: phonological representations of stem forms; output: phonological representations of past tense forms. • Fixed encoding and decoding networks: word forms are represented by units designating each phoneme together with its predecessor and successor. The encoding maps these into so-called “Wickelfeatures”, which represent features (voiced, stop, etc.) of phonemes. • Learning by perceptron convergence (PDP version) and then backpropagation (Nature version). • Pattern associator with modifiable connections. • No explicit rules, but able to produce regular past tense forms for novel verbs and, during training, the U-shaped learning curve characteristic of children.

  13. Criticisms • Divergence from human behaviour: e.g. the model did not generalize well to novel forms with an unusual sound (it mapped the stem tour, which was not in the training set, to toureder). • U-shaped learning occurs as a result of an implausible and carefully engineered training regime, e.g. a sudden jump in vocabulary from 10 to 420 verbs (Pinker and Prince, 1988).

  14. Later Developments • Better connectionist models: MacWhinney and Leinbach (1991), Plunkett and Marchman (1991, 1993, 1996). These obtain U-shaped learning with a gradual increase in vocabulary, but performance on regular verbs also decreases when irregular verb performance decreases, which contradicts Marcus’ data. • More criticisms by dual route theorists of specific assumptions of specific models. • But there are very few comparable computational models of dual route theory, so is the theory underspecified? • Should the simplifying assumptions of connectionist models be critical to the points they make? • And what about the assumptions that dual route theorists make, e.g. about the innate nature of the blocking mechanism? • The debate led to new empirical studies of the frequency distribution of inputs and outputs in morphological acquisition, as well as models of inflectional processes in other languages (German, Arabic, Hebrew), which have different morphological properties and frequencies from English.

  15. A Theoretical Assessment of Words and Rules (Dual Route) Theory, according to Pinker (2002) • Contrasts with generative phonology: applying rules to irregular forms by categorizing them into phonological patterns leads to too many exceptions. • More similar to lexicalist theories (e.g. Jackendoff), which posit that morphological phenomena are neither arbitrary lists nor fully productive. • It is not a connectionist system glued onto a rule system (cf. Nakisa et al.), as lexical entries have structured morphological, semantic, etc. properties that current connectionist models lack.

  17. Dual Route Theory Does Not Say (Pinker, 2002) • That there is literally a rule “to form the past tense add -ed to the verb” (thus it is compatible with constraint-based or construction-based theories of language). • That regular forms are never stored; only that they do not have to be. Such storage depends on word-, task- and speaker-specific factors. • Regular forms that constitute doublets with irregulars (dived/dove; dreamed/dreamt) must be stored to escape blocking by the irregular. • Regular forms that resemble irregulars (blinked, glided) must be stored to escape a partial blocking effect by similar irregulars.

  18. Support for Dual Route Theory • Marcus et al. collected English past tense forms from the CHILDES database for 83 children 1-6 years of age. Finding: children overregularize rarely (4%). Conclusion: the errors stem from performance errors rather than a qualitative grammatical reorganization. • Low frequency verbs tend to be overregularized more often than high frequency verbs. Conclusion: overregularization is a result of memory failure. • Verbs with a greater number of similar-sounding irregular neighbours were less likely to be overregularized. • Overregularization disappears gradually over time. • The onset of overregularization coincides with the development of reliable regular past tense marking. • The presence of similar-sounding regular verbs does not make overregularization of irregulars more likely. • Cross-linguistic study: for German plurals, both children and adults use -s for unusual-sounding novel words and for names. Conclusion: regular inflection can be generalized independently of frequency.

  21. Empirical Evidence from the Dual Route Theorists’ Point of View • Generalization to unusual novel words: people tend to apply regular inflections to unusual novel words. • Even the connectionist models that can do so either implement or presuppose a rule, e.g. by not generating the full form but only activating local output units for the past tense inflection, or by having extra mechanisms that correspond to an innate mechanism. • The onset and rate of overregularization errors in children do not correlate with changes in the number and proportion of regular verbs used by parents. • In other languages, regular inflections may form a minority class and still be generalized like English regulars. • Connectionist claims that the distribution of regulars over phonological space is crucial (esp. that they are not confined to specific clusters) do not hold in languages like Hebrew, where speakers apply regular inflections to unusual-sounding and exocentric nouns.

  22. Systematic Regularization • Some irregular-sounding words systematically take regular inflection. • Words and Rules theory says this is because they lack a root in head position that can be marked for the inflectional feature (tense or number); memory access is thus disabled and the regular suffix applies. • Examples: dinged; “I found three man’s on page 1”; a couple of wolfs (wolfing down the food). • If an irregular-sounding word changes in meaning but retains a root in head position, it stays irregular no matter how radical the change is: straw men, beewolves, superwomen, etc.

  23. Dual Route Reply to Single Route Key issue is not gradedness in behavioural data but whether human language mechanisms are combinatorial and sensitive to grammatical structure and categories. Rules can be acquired gradually and apply probabilistically, and thus can deal with gradedness.

  24. Connectionist View of Two Approaches

  25. Connectionist Reply to “Words or Rules” • Connectionist models exploit quasi-regularity (the tendency for an exception to exhibit aspects of the regular pattern), because regulars and exceptions are processed by the same mechanism; dual route theory does not. • Cut, hit, etc.: past tense identical to the stem. • Bleed, breed: past tense bled, bred. • 59% of 181 irregulars fall into one of the eight classes defined in McClelland and Patterson (2002). The rest also exhibit quasi-regularity, except be and go. • Quasi-regularity occurs in other domains, such as spelling-sound mapping and derivational morphology.

  26. Sudden Acquisition of Past Tense • Marcus’ (dual route) claim: the first overregularization in each child’s corpus indicates the moment of acquisition of the past tense rule, and it is followed shortly by a rapid increase to high levels of regular inflection. • Connectionists’ reply: Hoeffner’s reevaluation of the same data gives a more gradual and graded picture.

  27. Uniformity with respect to Phonology • Dual route theorists’ claim: rules apply on categorical conditions. • Connectionists’ reply: Prasada and Pinker’s conclusion that the similarity of novel words to known regulars had no effect was ill-founded, as their stems were not of high phonological acceptability. The regular past tense is sensitive to phonological attributes of the stem.

  28. Uniformity with respect to Semantics • Dual route theorists’ claim: word meaning does not affect inflection tendencies for novel (aka nonce) words. • Connectionists’ claim: it does. When Ramscar placed novel words like frink in semantic contexts that primed either drink or blink, they elicited different past tense formations, namely frank or frinked.

  29. Frequency Effects • The use of irregularly inflected forms is strongly affected by their frequency; to the extent that regularly inflected forms show frequency effects, these effects are quite small. • Both dual and single route theories can explain this. • Type frequency must be distinguished from token frequency: irregular verbs are few in type but common as tokens. • Irregularization errors (incorrectly producing an irregular form for a regular verb) are more likely for low frequency than for high frequency regular verbs; the latency of correct responses for low frequency regulars is also longer if there is interference from similar-sounding irregulars. • Almor’s claim: this is not compatible with dual route theory, which predicts that the only regulars stored in the memory system are high frequency regulars.
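
The type/token distinction can be made concrete with a small count. The token list below is invented purely for illustration (it is not corpus data), and the helper name `type_and_token_share` is hypothetical.

```python
from collections import Counter

# Made-up sample of past tense tokens, for illustration only.
IRREGULAR_FORMS = {"went", "said", "came"}
SAMPLE_TOKENS = [
    "went", "said", "went", "walked", "came", "said",
    "went", "jumped", "played", "looked", "asked", "wanted",
]


def type_and_token_share(tokens, irregular_forms):
    """Return the irregular share of distinct types and of running tokens."""
    counts = Counter(tokens)
    type_share = sum(1 for form in counts if form in irregular_forms) / len(counts)
    token_share = sum(n for form, n in counts.items() if form in irregular_forms) / len(tokens)
    return type_share, token_share


if __name__ == "__main__":
    type_share, token_share = type_and_token_share(SAMPLE_TOKENS, IRREGULAR_FORMS)
    print(f"irregular share of types:  {type_share:.2f}")
    print(f"irregular share of tokens: {token_share:.2f}")
```

In this toy sample the irregulars are a third of the types but half of the tokens, the same direction of asymmetry the slide describes.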

  30. The Case for Minority Defaults • The regular past tense in English applies to 86% of the 1000 most common verbs. • The regular German past participle -t, the Arabic sound plural, and the German -s plural have been claimed by dual route theorists to be minority defaults, thus strengthening the case for a dual mechanism. • Connectionist claim: empirical data show otherwise for all three cases. The -s plural, although it is a minority, does not apply uniformly across contexts, hence it is not the default.

  31. Neurological Impairments and Imaging • Double dissociations between trouble with regulars and trouble with irregulars. • Temporal and functional differences between the processing of regulars and irregulars. • Dual route interpretation: grammar areas handle regular processing; lexical semantics areas handle irregular processing (agrammatism vs anomia). • Alternative dual route interpretation (Ullman, 2001): regular processing relies on procedural memory, irregular processing on declarative memory. • Connectionist models can also show selective impairments to regulars and irregulars: irregulars depend more on semantics than on phonology, whereas regulars depend more on phonology, so damage to the phonological representation affects regulars more (Joanisse and Seidenberg, 1999). • Pinker’s reply: the semantic representation in that model is effectively a lexicon, with one unit dedicated to each word. • More evidence against connectionist modelling: anomic patients with no difficulty in accessing word meanings still have difficulty with irregulars; the models predict that patient groups should show parallel tendencies to generalize regular and irregular inflection to novel words, but there is a dissociation.

  33. Double Dissociations • Connectionist claim: the data reported by dual route theorists on selective impairment are either misinterpreted or experimentally biased, e.g. in Ullman’s study the word-final consonants were twice as long in regulars as in exceptions. This increases phonological complexity; thus impairment to the phonological representation will entail impairment to regular inflection (a similar prediction holds for developmental language disorders). When phonological complexity is matched, no advantage for irregulars remains (Bird et al.).

  34. Against the Predictions of Connectionist Models • It is neither necessary nor empirically correct to assume that overregularization is triggered by a sudden increase in regular forms in the input. • Polysemous irregular roots do not take regular forms for specific meanings, e.g. *throwed up. Ramscar’s experiment is ill-founded. • Experimental evidence about -t participles and -s plurals in German: e.g. there are controversies over how to count when determining the majority. • SLI (Specific Language Impairment) patients show no difference in impairment between regulars and irregulars. Language impaired people are impaired with rules (hence unable to inflect nonsense words) but can memorize common regular forms (hence no deficit compared with irregulars). SLI is found to have no relation to auditory perception. • Replications of aphasia studies showing that non-fluent aphasics have more trouble with regular than with irregular forms gave mixed results; nor did they show that the effect is a side effect of phonological complexity.

  35. Comparative Evaluation of Dual and Single Route Strategies (Nakisa et al.) • Three different paradigms: German plurals, Arabic plurals, English past tense. • Three different pattern associators: (1) a nearest neighbour classifier: for a novel word, find the most similar neighbour and adopt its inflection type; (2) a simplified Nosofsky Generalized Context Model, based on probabilistic reasoning about classification; (3) a three-layer feedforward network trained with backpropagation, with local output units corresponding to the different inflections. • Dual route models are implemented by defining “memory failure” in each model and adding a rule mechanism: e.g. in the neural network, memory fails if the greatest output unit activity is less than a threshold value. • A phonology-based representation was used in all simulations.
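
The ingredients listed above (nearest neighbour classification, a similarity-based probabilistic classifier, and a memory-failure threshold that hands over to a default rule) might be sketched as follows. This is only a toy illustration, not the simulations of Nakisa et al.: edit distance over spelling stands in for the phonological representation, and the lexicons, class labels, function names, sensitivity parameter `c` and distance threshold are all hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical toy lexicons mapping stems to inflection classes.
LEXICON = {"sing": "i->a", "ring": "i->a", "walk": "regular",
           "talk": "regular", "hit": "no change"}
IRREGULARS = {s: c for s, c in LEXICON.items() if c != "regular"}


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, used here as a stand-in for phonological distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def nearest_neighbour(stem: str, lexicon: dict) -> str:
    """Single route: adopt the inflection class of the most similar known stem."""
    best = min(lexicon, key=lambda known: edit_distance(stem, known))
    return lexicon[best]


def gcm_probabilities(stem: str, lexicon: dict, c: float = 1.0) -> dict:
    """Simplified context-model idea: class probability from summed similarity."""
    support = defaultdict(float)
    for known, label in lexicon.items():
        support[label] += math.exp(-c * edit_distance(stem, known))
    total = sum(support.values())
    return {label: s / total for label, s in support.items()}


def dual_route(stem: str, irregulars: dict, max_distance: int = 1,
               default: str = "regular") -> str:
    """Dual route: memory holds only irregulars; the rule fires when memory fails."""
    best = min(irregulars, key=lambda known: edit_distance(stem, known))
    if edit_distance(stem, best) > max_distance:
        return default  # memory failure: fall back on the default rule
    return irregulars[best]


if __name__ == "__main__":
    print(nearest_neighbour("ting", LEXICON))
    print(gcm_probabilities("ting", LEXICON))
    print(dual_route("ploamph", IRREGULARS))
```

In this sketch the single route classifiers see regular and irregular stems alike, whereas the dual route memory is given irregulars only, so everything far from a stored irregular is treated as regular.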

  37. Some Constraints • The associative memory of the dual route classifier is trained with irregular forms only. • Nearest neighbour algorithms cannot deal with token frequencies, so token frequency is not accounted for in any of the pattern associators.

  41. Major Findings • In nearly all simulations, single route classifiers generalized more accurately than dual route classifiers. • The sound of a word stem is a good predictor of the inflection type the stem undergoes. • The dual route failure on Arabic depended on the distribution of irregulars with respect to regulars: broken plurals (73% of the type frequencies in the data) were distant from other irregulars and were thus mistakenly regularized by the dual route system.

  42. An ACT-R Model of Past Tense Learning • Taatgen and Anderson (2002): shows U-shaped learning without direct feedback (internal feedback is provided by the execution times of the different strategies) and with a realistic training regime, i.e. gradual changes in vocabulary and no unrealistically high rates of regular verbs; it can also deal with minority default rules. It uses rules for both regular and irregular cases. • Interpreted as characterizing an underlying connectionist system at a higher level of analysis, with rules providing descriptive summaries of the regularities captured in the network’s connections.

  43. Various Strategies Used • Retrieval strategy: produce a past tense by recalling from memory an example of the inflected word. • Analogy: recall an arbitrary example of a past tense from memory and use it as a basis for analogy. This leads to learning the regular rule (which takes some time to learn; overregularization occurs whenever retrieval fails for low frequency verbs). • Zero strategy: do no inflection at all. • The strategy with the highest expected utility is applied with the highest probability. • Perception and generation alternate over the period of simulation; 478 words, based on the Marcus (1992) study.
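
The statement that the strategy with the highest expected utility is applied with the highest probability can be illustrated with a softmax choice rule. This is an assumption made for illustration, not the ACT-R conflict-resolution equations themselves; the utility values, the `temperature` parameter and the function names are hypothetical.

```python
import math
import random


def strategy_probabilities(utilities: dict, temperature: float = 1.0) -> dict:
    """Map expected utilities to choice probabilities with a softmax."""
    top = max(utilities.values())  # subtract the maximum for numerical stability
    weights = {s: math.exp((u - top) / temperature) for s, u in utilities.items()}
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}


def choose_strategy(utilities: dict, temperature: float = 1.0, rng=random) -> str:
    """Sample one strategy; higher utility means higher chance, not certainty."""
    probs = strategy_probabilities(utilities, temperature)
    names = list(probs)
    return rng.choices(names, weights=[probs[n] for n in names], k=1)[0]


if __name__ == "__main__":
    # Hypothetical utilities for the three strategies named on the slide.
    utilities = {"retrieval": 2.0, "analogy": 1.0, "zero": 0.0}
    print(strategy_probabilities(utilities))
    print(choose_strategy(utilities))
```

Because the choice is probabilistic, a lower-utility strategy such as analogy is still used occasionally, which leaves room for overregularizations even after retrieval has become the dominant strategy.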

  44. Comparison with the Dual Route Account • It is not that the cognitive system discovers that the regular rule is an overgeneralization, but just that it has not yet properly memorized the exceptions. The dominance of the irregular form is a result of its greater efficiency, not of an assumed blocking system.

  48. Conclusion • A hot debate, with major implications for cognitive architecture. • Close scrutiny of the methodologies and interpretations of both experiments and corpus-based studies. • Computational and noncomputational models are hard to compare; dual route theorists have an unfair advantage there. • Which level of description is one offering? • Generally a good example of what computational cognitive models can lead to.

  49. Lecture 7 • Next Week: Sample Models in Cognitive Neuropsychology • Readings: Cohen and Servan-Schreiber, Context, Cortex and Dopamine; Farah, Locality