Introduction to Language Acquisition Theory

Introduction to Language Acquisition Theory Janet Dean Fodor St. Petersburg July 2013 Class 3. Poverty of the stimulus: No negative evidence

The logical problem of language acquisition • Learnability theory (for language) is the theoretical wing of language acquisition research. • Its focus has been called the logical problem of language acquisition: how, in principle, a natural language can be learned, given the resources (input, memory, etc.) available to some learning device. • Why a problem? Because, according to Chomsky and many (psycho)(computational) linguists, a child learner's resources seem inadequate to the task: (a) children have too little processing capacity to compose and test a mental grammar for the target language; (b) too little information about the language is supplied in the sample of sentences a child hears.

Why a logical problem? (Surely, empirical) • The logical problem is to find any model of learning that works. What could possibly do the job? Under what circumstances of input, computational resources, etc.? • What's empirical is establishing the bounds on a solution for human acquisition of natural languages. What resources are available to a typical child learner? E.g., Possibly: Adult-like speech perception and parsing mechanisms are in place, though they need tuning up. Surely not: Child checks compatibility of every possible grammar with each input sentence. • Think of a child as a mechanism for learning language: a language acquisition device (a LAD). What is this mechanism? The closer our models of it fit with linguistic and psychological reality, the better.

The logical problem can be trivially solved! • Ignoring realistic resource limitations, there are trivial solutions to the logical problem, in a linguistic framework that posits only a finite number of possible natural languages, such as Chomsky’s Principles & Parameters theory (LGB, 1981). • Just keep guessing different grammars, obeying the Subset Principle. Keep a grammar if it works. Mentally list those that fail and don’t revisit them. Then the average number of guesses before success < the number of possible grammars. • Not a useful solution! Why not? 40 binary parameters → 2^40, approx. a trillion grammars. • Our 21st century aim is not just learnability but feasible learning. Children don't make a trillion random guesses!
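The guess-and-discard procedure on this slide can be sketched in a few lines. This is only an illustration under loud assumptions: grammars are modelled as tuples of binary parameter values, `is_compatible` is a hypothetical oracle standing in for "the grammar works on the input", and the Subset Principle ordering is left out entirely (guesses are made in random order).

```python
import random
from itertools import product

def count_grammars(n_params):
    """Number of grammars definable by n_params binary parameters."""
    return 2 ** n_params

def enumeration_learner(n_params, is_compatible, rng):
    """Guess grammars in random order, never revisiting a failed guess.

    is_compatible(grammar) is a hypothetical stand-in for testing a
    grammar against the input. Returns (grammar, number_of_guesses).
    """
    candidates = list(product((0, 1), repeat=n_params))
    rng.shuffle(candidates)  # each grammar appears once, so failures are never retried
    for guesses, grammar in enumerate(candidates, start=1):
        if is_compatible(grammar):
            return grammar, guesses
    raise ValueError("no grammar in the space fits the input")

print(count_grammars(40))  # 1099511627776, i.e. about a trillion

# Toy run with 10 parameters (1024 grammars) and an arbitrary target grammar.
target = (1, 0, 1, 1, 0, 0, 1, 0, 1, 1)
grammar, guesses = enumeration_learner(10, lambda g: g == target, random.Random(0))
print(grammar == target, guesses)
```

The learner always succeeds in a finite space, which is why the logical problem is "trivially" solved; the cost is that the candidate list itself grows as 2^n, so the same code is hopeless at 40 parameters.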

6 conditions on a plausible model (Pinker 1979) • Learnability Condition: “account for the fact that languages can be learned” • Equipotentiality Condition: “not account for the child’s success by positing mechanisms narrowly adapted to the acquisition of a particular language” • Time Condition: “allow the child to learn his language within the time span normally taken by children, which is in the order of three years for the basic components of language skill” • Input Condition: “the mechanisms must not require as input types of information or amounts of information that are unavailable to the child” • Developmental Condition: “make predictions about the intermediate stages of acquisition that agree with empirical findings in the study of child language” • Cognitive Condition: “the mechanisms described by the theory should not be wildly inconsistent with what is known about the cognitive faculties of the child, such as the perceptual discriminations he can make, his conceptual abilities, his memory, attention, and so forth”

Information unavailable to the child = the poverty of the stimulus (POS) • POS is the main link between learnability & linguistics. • Chomsky’s claim (from 1980 and since): There is less information in a learner’s language input than there is in the grammar s/he acquires. • More precise/relevant: There is less information in a learner’s language intake (the information the learner can extract from his/her language input) than there is in the grammar acquired. • Chomsky’s solution: Whatever language facts can't be extracted from the language environment are innate. • This is the argument for innate linguistic knowledge, from the poverty of the stimulus (APS).

Poverty of the stimulus: estimate UG by subtraction • Given: [ input + innate linguistic knowledge ] → grammar • It follows that: innate linguistic knowledge = [ grammar – input ] • This is the sense in which learnability research is another way of doing linguistics. A tool for mapping out the contents of UG. • The more impoverished the input, the better for us! We learn more about what is in UG.

Footnotes to the subtraction formula • This subtraction equation may underestimate UG, since some info could be both in the input and innate. • What is innate may be a mixture of UG (knowledge) and the learning mechanism. Not easy to distinguish these. • Innate knowledge may not all be available to the learner at once. Some may become accessible only with maturation (Wexler 1990). • We expect (hope!) that the estimate of UG based on stimulus poverty (once we can factor out learning mechanism strategies) concurs with that from linguistic methodology. • The working assumption of much current linguistics: UG entails every aspect of human language that is universal and that has no historical or functional (practical) or general cognitive explanation. UG must be the source of all otherwise-unexplained universals. • But also: the Minimalist Program (Chomsky 1995) aims for the leanest possible characterization of UG. Strong but simple theory.

Many kinds of stimulus poverty

POS under recent attack • Until a few years ago, the poverty of the stimulus was the central pillar of most models of language acquisition. • But recent corpus searches have cast some doubt on the poverty claim. (Chomsky’s evidence was very informal!) • Strong critiques by Pullum & Scholz (2002) & others. • Also, newer powerful computational systems raise the possibility of extracting information from the input that was previously overlooked, or was infeasible to extract. • So now: POS is not a central pillar, but a central issue. • At CUNY, we have engaged with this debate: Fodor & Crowther (2002), response to Pullum & Scholz. Kam et al. (2008), response on statistical info extraction.

Today: Missing negative information (1) What a big smile you have! What big teeth you have! How big a smile did she have? *How big teeth did she have? (2) I let go of the rope. *I let fall of the rope. *I made go of the bug. (3) Out popped the cuckoo. *Out popped it. (4) Don’t you eat my cake! *Please do you eat my cake. Please do eat my cake. (5) Tim’s taller than Jim is. *Tim’s taller than Jim’s. • Could UG ensure that learners don’t overgeneralize these patterns? How? Do they follow from universal principles?

Famous examples, with UG explanations (6) Is_i the boy [ who is running ] t_i happy? *Is_i the boy [ who running t_i ] is happy? • Chomsky’s explanation: UG entails that all transformational operations are structure-dependent. Front the highest Aux in the tree, not the first Aux in the string. (7) This is the boy who Mary kissed. This is the boy Mary kissed. This is the boy who kissed Mary. *This is the boy kissed Mary. • Den Dikken (2007) proposes: UG entails that a null operator (null relative pronoun) may not move vacuously (here, from subject position to Spec,CP).
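The contrast in (6) between "first Aux in the string" and "highest Aux in the tree" can be made concrete with a toy sketch. The representation is an assumption made for illustration only: a sentence is a nested list, an embedded clause is a sub-list, "highest" is approximated as "not inside any sub-list", and the `AUX` set is a hypothetical mini-lexicon.

```python
AUX = {"is", "can", "will"}  # hypothetical mini-lexicon of auxiliaries

def flatten(tree):
    """Return the words of a nested-list tree in left-to-right order."""
    words = []
    for node in tree:
        if isinstance(node, list):
            words.extend(flatten(node))
        else:
            words.append(node)
    return words

def front_first_aux_in_string(tree):
    """Structure-independent rule: front the linearly first auxiliary."""
    words = flatten(tree)
    i = next(k for k, w in enumerate(words) if w in AUX)
    return [words[i]] + words[:i] + words[i + 1:]

def front_highest_aux(tree):
    """Structure-dependent rule: front the main-clause auxiliary, i.e. the
    first auxiliary that is not inside an embedded bracket."""
    i = next(k for k, node in enumerate(tree)
             if isinstance(node, str) and node in AUX)
    return [tree[i]] + flatten(tree[:i] + tree[i + 1:])

declarative = ["the", "boy", ["who", "is", "running"], "is", "happy"]
print(" ".join(front_highest_aux(declarative)))
# is the boy who is running happy
print(" ".join(front_first_aux_in_string(declarative)))
# is the boy who running is happy
```

On a simple sentence with one auxiliary the two rules give the same output, which is why input of that kind cannot tell a learner which rule is correct.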

But is this needed negative info really absent? • These examples would be an argument for UG only if children do not in fact receive negative information. • Despite some dissenters, there is little doubt that children aren’t reliably and systematically informed about what is not grammatical in the target language. (See below.) • That’s no problem if learners never generalize beyond their input. But they do. Since they don’t overgeneralize, something holds them back, in just the right ways. • Familiar argument: If it’s not negative evidence, it must be something innate: UG. (The subtraction formula.) • Every sentence you know to be ungrammatical is so specified by UG! • BUT: There may be indirect forms of negative data, such as failed expectations or statistical probabilities. (Slide 19ff.)

Empirical evidence for ‘PONS’ • Adults don’t generally produce ungrammatical sentences & then label them as ungrammatical. (Linguists aren’t normal!) • So negative evidence would more likely consist of feedback on the child’s own utterances. • Classic study: Brown & Hanlon (1970). Parents of three children showed no greater approval, and no greater comprehension, for the child's grammatical utterances than for ungrammatical ones. • "the bases for approval and disapproval...are almost always semantic or phonological". Did the child say something true? • Eve: “Mama isn't boy, he a girl.” Mother: “That's right.” Eve: “There's the animal farmhouse.” Mother: “No, that's a lighthouse.”

The key questions (Grimshaw & Pinker 1989) • Do language learners receive negative evidence? In a child-usable form? • Do learners need it? Do they use it when it is available? • Important points to bear in mind: Unless all children who acquire language receive negative feedback, it cannot play an essential role. Most ungrammatical sentences in a language are not uttered by children, therefore no negative feedback could be given. A learner has no way to distinguish a correction from any other (approximate) repetition (examples below). Negative info other than feedback on child's own utterances seems to be rare before school age, in all (?) cultures.

Answers that many linguists accept • Negative evidence is not provided for every construction, nor for every child. • Even when it does occur, it’s not very child-usable – too undiscriminating. (See examples, next slides.) • Anecdotes suggest learners can’t always benefit from explicit correction. E.g., McNeill 1966: Nobody don’t like me. → Nobody don't likes me. • If indeed it's mostly not available, we conclude that acquiring the target grammar does not require it. • But still: there are periodic revivals of enthusiasm for negative evidence in acquisition. So we should give serious consideration to the contrary data that is adduced.

Empirical data seemingly against PONS • Hirsh-Pasek, Treiman & Schneiderman (1984): Confirmed Brown & Hanlon re parental approval, but found more adult repetition of ungrammatical child utterances than of grammatical ones. For ten 2-year-olds only (not 3, 4, 5 yrs). • Moreover: "Virtually all repetitions of the ill-formed sentences included a correction of the child's error." • 20.8% of ill-formed sentences were repeated. 12% of well-formed sentences were repeated. • Child: People lives in Florida. Adult: People live in Florida. Child: I in school. Adult: You're in school. (JDF: Note change of pronoun!) • Another problem: Some parental repetitions of child’s good grammatical utterances also included changes. Child: What do you do with a wooden block? Adult: What do you do with a little wooden block?

Inefficacy of ‘noisy’ feedback (Marcus 1993) • The only type of negative feedback documented is ‘noisy’. It is a negative response that follows more ungrammatical than grammatical sentences. (It’s not if-and-only-if *). • Empirical studies (as of 1993) indicate that only 2-yr-olds receive a significant amount. • A child utterance would have to be repeated 85 (or 446) times to decide reliably whether the parental response marks it as ungrammatical. But (aside from formulas and routines) children rarely repeat a sentence even 10 times. • Marcus’ conclusion: "Children who changed their grammars every time the parent said something different would radically damage their languages." • This more or less ended the debate on parental feedback.
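Why noisy feedback is so weak can be illustrated with a small Bayesian sketch. This is not Marcus's own calculation and does not reproduce his figures of 85 or 446; it simply assumes, for illustration, that the 20.8% and 12% repetition rates quoted earlier from Hirsh-Pasek et al. apply to every sentence, that each parental response is independent, and that the child starts from even odds. The function names are hypothetical.

```python
import math

P_REPEAT_IF_BAD = 0.208   # assumed repetition rate after ill-formed utterances
P_REPEAT_IF_GOOD = 0.12   # assumed repetition rate after well-formed utterances

def posterior_ungrammatical(repeats, trials, prior=0.5):
    """Posterior probability that a sentence is ungrammatical, given that
    `repeats` of the child's `trials` utterances of it drew a repetition."""
    misses = trials - repeats
    like_bad = P_REPEAT_IF_BAD ** repeats * (1 - P_REPEAT_IF_BAD) ** misses
    like_good = P_REPEAT_IF_GOOD ** repeats * (1 - P_REPEAT_IF_GOOD) ** misses
    return prior * like_bad / (prior * like_bad + (1 - prior) * like_good)

def expected_trials_needed(confidence=0.99):
    """Rough number of utterances of an ungrammatical sentence needed before
    the expected log-odds reach `confidence`, starting from even odds."""
    # Expected log-likelihood-ratio gained per utterance (a KL divergence).
    gain = (P_REPEAT_IF_BAD * math.log(P_REPEAT_IF_BAD / P_REPEAT_IF_GOOD)
            + (1 - P_REPEAT_IF_BAD)
            * math.log((1 - P_REPEAT_IF_BAD) / (1 - P_REPEAT_IF_GOOD)))
    return math.ceil(math.log(confidence / (1 - confidence)) / gain)

print(posterior_ungrammatical(1, 1))   # one repetition barely shifts the odds
print(expected_trials_needed(0.99))    # far more than 10 utterances
```

Under these assumptions a single parental repetition moves the child only a little away from even odds, and reaching high confidence takes many more utterances of the same sentence than the ten or fewer that children typically produce.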

Possible alternatives to adult feedback • How else might learnability theory avoid invoking UG to tell children just how far to generalize? • Indirect negative evidence – 1: Child keeps track of what is absent from the input sample. • Implausible! Highly labor-intensive. Would have to track occurrence frequency of all potential sentences! • Perhaps instead track every possible rule that is never needed for parsing input sentences; discard those rules. • But that’s not sufficiently discriminating if rules can be general. (And surely they must be.) • E.g., Wh-movement from adjunct clauses occurs rarely or never. But the learner can’t discard it if it’s an instance of the general rule of WH-movement (or of Move Alpha).

Indirect negative data - 2 • Child compares adult’s utterance with how she herself would have said it, and assumes the adult’s way is better. Why didn’t mommy say it my way? I guess my way is wrong! • Unlike adult feedback, this does not require the child to utter every sentence that needs correction.☺ • Unlike noticing absence of a form, this does not require the child to monitor everything that adults don’t say.☺ • Not implausible, psycholinguistically. Worth trying to develop this idea. Or related: Child is seeking best way to express a particular thought. • Note: It requires knowledge of which sentences express the same or different messages. E.g., ‘the little wooden block’ isn’t more grammatical than ‘the wooden block’. The child shouldn’t deduce that ‘little’ is obligatory before ‘wooden’.

The uniqueness principle (UP) • Indirect negative evidence-2 (comparison with adult utterances) presupposes the Uniqueness Principle: There is only one right way to express any proposition. • It creates negative evidence out of positive examples. ☺ • It can cure superset errors. ☺ • In morphology: feet in the input drives out *foots. • UP is not absolute; only a default. Doublets do occur (e.g., cannot / can’t), so a preempted form may have to be reinstated later.
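Preemption under UP, treated as a default that later positive evidence can override, might be sketched as follows. Everything here is an illustrative assumption: meanings are plain string labels such as "FOOT+plural", the learner is assumed to know which utterances share a meaning, and the class and method names are hypothetical.

```python
class UniquenessLearner:
    """Toy learner: an attested form preempts unattested competitors."""

    def __init__(self):
        self.forms = {}     # meaning -> set of forms the learner currently accepts
        self.attested = {}  # meaning -> set of forms actually heard in the input

    def generalize(self, meaning, form):
        """Add the learner's own rule-generated guess, but only while no
        form for this meaning has been heard."""
        if not self.attested.get(meaning):
            self.forms.setdefault(meaning, set()).add(form)

    def hear(self, meaning, form):
        """Positive evidence: keep exactly the attested forms, dropping any
        guess that was never heard. Doublets survive once both are heard."""
        heard = self.attested.setdefault(meaning, set())
        heard.add(form)
        self.forms[meaning] = set(heard)

learner = UniquenessLearner()
learner.generalize("FOOT+plural", "foots")   # overgeneralized regular plural
learner.hear("FOOT+plural", "feet")          # 'feet' drives out 'foots'
print(learner.forms["FOOT+plural"])

learner.generalize("CAN+not", "can't")
learner.hear("CAN+not", "cannot")            # 'can't' is preempted for now
learner.hear("CAN+not", "can't")             # and reinstated when it is heard
print(sorted(learner.forms["CAN+not"]))
```

The second half of the run shows the cost noted on the slide: a legitimate doublet is first driven out and then has to be relearned from positive evidence.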

UP could be very useful for lexical learning • Lexical generalizations are typically only partial. (Exceptions may be arbitrary, or perhaps fall under some other partial generalization. Many examples in Pinker, 1989.) • Therefore, tempting overgeneralizations need to be held back. UP can help. I gave $100 to Jim. I gave Jim $100. I donated $100 to the library. *I donated the library $100. Let the rope go / fall. Let go / *fall of the rope. • And even for lexical constraints on general transformations: It’s nice / It seems that he’s here. Extraposition. That he’s here is nice / *seems.

Is UP at work in ‘pure’ syntax acquisition? • Unclear how helpful UP can be for ‘pure’ syntax. • Positive evidence would drive out competitors (attested form → preempted form): Do eat the cake! → *Do you eat the cake! Out it popped. → *Out popped it. • Sometimes that’s good: Did Sue see a yak? → Saw Sue a yak? • Sometimes it’s too severe: Did Sue see a yak? → *Sue saw a yak? [pitch rise] • But there’s positive evidence for Sue saw a yak? so it would be reinstated. OK.

But too much re-learning needed? • In ‘pure’ syntax, there is more generality; and there are more doublets. E.g., I saw no-one. I didn’t see anyone. • This means very many exceptions to preemption by UP → much work for learners, to reinstate acceptable forms. • E.g., do actives preempt passives? (Or vice versa?) • Also, as noted earlier, a learner would need innate knowledge of what counts as the "same" proposition. E.g., do these compete? Is there a subtle meaning difference? I will if I can. If I can, I will. • Importantly, UP cannot eliminate ungrammatical forms if a sentence form has no well-formed competitor: Subjacency violation: *How did he wonder whether to sing the song? Vacuous WH-movement inside DP: *How big teeth does she have?

Indirect negative evidence: Summary • Of all potential sources of negative evidence, the most likely is UP, in the form of child comparing his/her expected sentence form with what an adult actually said. • Not well-established, but may be worth exploring further. Some concerns need to be addressed in future research: • Is workload feasible? Child must anticipate the content of adult’s remark, and form own version. Or reflect on it afterwards (while keeping up with next sentence & next!). • Is the work wasteful? Many sentence forms are eliminated by UP preemption and must later be reinstated. (To & fro.) • Learners may still need UG, to provide guidance on what is a different syntactic form of the same proposition, versus a different remark!

Please write 2 or 3 pages on the relation between stimulus poverty and UG. • How strong is the current empirical evidence in favor of poverty of the negative stimulus (PONS)? • What is the conceptual (logical/theoretical) relation between scarcity of negative data for learners, and the hypothesis that humans have innate pre-knowledge of the properties of natural languages? • What might future research contribute that could help to settle these issues? • Hand in on Monday in class. • Note: You may rely on class slides and discussion; no need to read further literature. • Don’t expect to cover every aspect of the topic in 2 - 3 pages! Just set out major points.
