
Corpus Studies of Constituent Ordering

Tom Wasow


Presentation Transcript


  1. Corpus Studies of Constituent Ordering Tom Wasow

  2. An example, from Steven Pinker’s The Language Instinct, p. 131: In my laboratory we use it as an easily studied instance of mental grammar, allowing us to document • in great detail • the psychology of linguistic rules • from infancy to old age • in both normal and neurologically impaired people, • in much the same way that biologists focus on the fruit fly Drosophila to study the machinery of the genes.

  3. One of the other 119 possible orders: In my laboratory we use it as an easily studied instance of mental grammar, allowing us to document • the psychology of linguistic rules • in great detail • in both normal and neurologically impaired people, • from infancy to old age • in much the same way that biologists focus on the fruit fly Drosophila to study the machinery of the genes.

  4. And another order ?? In my laboratory we use it as an easily studied instance of mental grammar, allowing us to document • in much the same way that biologists focus on the fruit fly Drosophila to study the machinery of the genes • in both normal and neurologically impaired people, • in great detail • the psychology of linguistic rules • from infancy to old age
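The figure of 119 other orders follows from treating the five constituents after "document" as freely permutable: 5! = 120 orderings, one of which is Pinker's original. A minimal sketch of that count, assuming the constituent boundaries marked by the bullets on the slides (the variable names are illustrative only):

```python
from itertools import permutations
from math import factorial

# The five constituents that follow "document" in Pinker's sentence,
# split at the bullet marks shown on the slides.
constituents = [
    "in great detail",
    "the psychology of linguistic rules",
    "from infancy to old age",
    "in both normal and neurologically impaired people,",
    "in much the same way that biologists focus on the fruit fly "
    "Drosophila to study the machinery of the genes",
]

orders = list(permutations(constituents))
total_orders = len(orders)        # every ordering, including the attested one
other_orders = total_orders - 1   # orderings other than Pinker's own

print(total_orders, other_orders)
```

Enumerating the orders says nothing about which ones sound natural; it only shows how large the space of alternatives is.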

  5. What makes some orders sound more natural than others? • The answer might shed light on the psychological processes underlying language use. • It might also have practical applications: • for on-line style checkers • for machine translation • for other applications requiring robust generation

  6. The Alternations I Studied • Heavy Noun Phrase Shift: • We take too many dubious idealizations for granted. • We take for granted too many dubious idealizations. • The Verb-Particle Construction: • We figured out the problem. • We figured the problem out. • Dative Alternation: • Kim handed a toy to the baby. • Kim handed the baby a toy.

  7. Factors I Looked At • Structural complexity (or “weight”) • Discourse status (or “newness”) • Semantic connectedness of verb and following constituents • Lexical biases of verbs • Ambiguity avoidance

  8. Grammatical Weight • Behaghel’s “Gesetz der Wachsenden Glieder”: “Von zwei Gliedern von verschiedenem Umfang steht das umfangreichere nach.” Translation: “Law of Growing Constituents: Of two constituents of different size, the larger one follows the smaller one.” • In other words: Simple phrases precede complex ones.

  9. Many Proposals to Make Behaghel’s Generalization Precise • Some absolute, others relative • Some categorical, others graded • Corpus data support relative, graded definition • Various proposed measures are so highly correlated that they can’t be distinguished
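To make the absolute/relative and categorical/graded distinctions concrete, here is a minimal sketch that scores weight relatively (the difference between the two postverbal constituents) and on a graded scale (a word count rather than a heavy/light label). The function names and the deterministic ordering rule are assumptions for illustration; the corpus data describe a tendency, not a rule.

```python
def word_weight(phrase: str) -> int:
    """Graded weight: the number of words in the phrase."""
    return len(phrase.split())


def relative_weight(direct_object: str, other: str) -> int:
    """Relative weight: positive when the direct object is the heavier one."""
    return word_weight(direct_object) - word_weight(other)


def short_before_long(direct_object: str, other: str) -> list[str]:
    """Put the lighter constituent first; ties keep the canonical DO-first order.

    This hard rule is a simplification: a graded account would instead
    expect shifting to become more likely as the difference grows.
    """
    if relative_weight(direct_object, other) > 0:
        return [other, direct_object]
    return [direct_object, other]


difference = relative_weight("too many dubious idealizations", "for granted")
order = short_before_long("too many dubious idealizations", "for granted")
print(difference, order)
```

With the Heavy NP Shift example from slide 6, the object is two words longer than "for granted", so the sketch places it last, as in "We take for granted too many dubious idealizations."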

  10. Categorical Weight Definitions • An NP is heavy if it "dominates S" [Ross (1967, rule 3.26)] • "the condition on complex NP shift is that the NP dominate an S or a PP" [Emonds (1976; 112)] • "Counting a nominal group as heavy means either that two or more nominal groups...are coordinated...., or that the head noun of a nominal group is postmodified by a phrase or clause" [Erdmann (1988; 328), emphasis in original] • "the dislocated NP [in HNPS] is licensed when it contains at least two phonological phrases" [Zec and Inkelas (1990; 377)] • "it is possible to formalize the intuition of 'heaviness' in terms of an aspect of the meaning of the constituents involved, namely their givenness in the discourse" [Niv (1992; 3)]

  11. Graded Weight Definitions • Number of words dominated [Hawkins (1990)] • Number of nodes dominated [Hawkins (1994)] • Number of phrasal nodes (i.e. maximal projections) dominated [Rickford et al. (1995; 111)]
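The three graded measures can be sketched on a small bracketed tree. Everything specific here is an assumption for illustration: the nested-tuple tree format, the bracketing of the example NP, the choice not to count words as nodes, and the shortcut of treating any label ending in "P" (or equal to "S") as a maximal projection. The cited works may count differently.

```python
# A tree is a tuple (label, child, child, ...); a word is a plain string.

def words(tree) -> int:
    """Number of words dominated."""
    if isinstance(tree, str):
        return 1
    return sum(words(child) for child in tree[1:])


def nodes(tree) -> int:
    """Number of labelled nodes, root included (words are not counted)."""
    if isinstance(tree, str):
        return 0
    return 1 + sum(nodes(child) for child in tree[1:])


def phrasal_nodes(tree) -> int:
    """Number of nodes treated as maximal projections (assumed: 'XP' or 'S')."""
    if isinstance(tree, str):
        return 0
    label = tree[0]
    own = 1 if label.endswith("P") or label == "S" else 0
    return own + sum(phrasal_nodes(child) for child in tree[1:])


# Hypothetical bracketing of "too many dubious idealizations".
heavy_np = (
    "NP",
    ("QP", ("Deg", "too"), ("Q", "many")),
    ("AP", ("A", "dubious")),
    ("N", "idealizations"),
)

print(words(heavy_np), nodes(heavy_np), phrasal_nodes(heavy_np))
```

On any tree of this shape the three counts grow together, which is consistent with the observation that the proposed measures are hard to tell apart in practice.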

  12. Numbers of Examples

                  HNPS       DA    V-Prt
      V DO X    10,592      426      496
      V X DO       694      615    1,205
      TOTAL     11,286    1,041    1,701
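The counts on slide 12 can be turned into proportions to show how differently the three constructions behave. This is a small sketch using only the numbers on the slide; the dictionary layout and function name are illustrative.

```python
# Counts copied from slide 12 (rows: word order, columns: construction).
counts = {
    "HNPS":  {"V DO X": 10592, "V X DO": 694},
    "DA":    {"V DO X": 426,   "V X DO": 615},
    "V-Prt": {"V DO X": 496,   "V X DO": 1205},
}


def share_object_last(construction: str) -> float:
    """Proportion of tokens in which the direct object comes last (V X DO)."""
    row = counts[construction]
    return row["V X DO"] / (row["V DO X"] + row["V X DO"])


totals = {name: sum(row.values()) for name, row in counts.items()}
shares = {name: round(share_object_last(name), 3) for name in counts}

print(totals)
print(shares)
```

On these figures the object-last order is a small minority for Heavy NP Shift (about 6%) but the majority for the dative alternation (about 59%) and the verb-particle construction (about 71%).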

  13. Testing Adequacy of Categorical Definitions Using HNPS

  14. Testing Categorical Definitions as Relative Criteria using HNPS

  15. Weight Effects Increase Smoothly

  16. Weights of Both Constituents Matter in HNPS

  17. Weights of Both Constituents Matter in DA

  18. The Overlap of the Weight Measures

  19. More on Overlap of Weight Measures

  20. More on Overlap of Weight Measures

  21. Still More on Overlap of Weight Measures

  22. Correlation Coefficients for 3 Weight Measures

                                 HNPS     DA    V-Prt
      Words & Nodes               .94    .96      .99
      Words & Phrasal Nodes       .96    .97      .95
      Nodes & Phrasal Nodes       .94    .96      .98
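For readers who want to see how a coefficient like those on slide 22 is obtained, here is a minimal sketch of a Pearson correlation between two weight measures. The slides do not say which coefficient was used, so Pearson's r is an assumption, and the two lists are invented toy values, not data from the talk.

```python
from math import sqrt


def pearson(xs, ys) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical weights for six invented NPs (NOT corpus data).
word_counts = [1, 2, 4, 5, 8, 12]
node_counts = [2, 3, 7, 9, 15, 22]

r_words_nodes = pearson(word_counts, node_counts)
print(round(r_words_nodes, 2))
```

When two measures rise together this closely, any effect attributed to one can equally be attributed to the other, which is why the corpus data cannot choose among them.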

  23. HNPS and Collocations

  24. Two Verb Classes and HNPS • Vt (for "transitive verbs") require NP objects in all their subcategorizations: bring, carry, make, place, put, set, take. • Vp (for "prepositional verbs") can occur with NP objects but also have uses with an immediately following PP and no NP object: add, build, call, draw, give, hold, leave, see, show, write.

  25. Predictions

              SPEAKER'S PERSPECTIVE      LISTENER'S PERSPECTIVE
      Vt      HNPS rare                  HNPS relatively common
      Vp      HNPS relatively common     HNPS very rare

  26. Results from Brown Corpus

  27. Results from Switchboard Corpus

  28. Two Verb Classes and DA • Vs (for "sentential verbs") may be followed by an NP and that-clause or infinitival VP: offer, show, teach, tell, write • Vn (for "non-sentential verbs") may not be followed by an NP and that-clause or infinitival VP: assign, bring, give, hand, pay, send, take

  29. Predictions

              SPEAKER'S PERSPECTIVE               LISTENER'S PERSPECTIVE
      Vs      double object relatively common     double object relatively rare
      Vn      double object relatively rare       double object relatively common

  30. Corpus Results for DA Verb Classes

  31. End of material on weight

  32. Newness • The “Given-Before-New Principle”, as formulated by Clark & Clark: “Given information should appear before new information.” • Many variants in the literature.

  33. Are weight and newness distinct effects? • New information requires more words to convey than old information (e.g., descriptions vs. pronouns) • Is one of these factors just a side-effect of the other? • Surprisingly, nobody asked this question until a few years ago.

  34. Weight and newness are distinct. • With my students, I conducted corpus analyses and a production experiment to tease weight and newness apart. • Both methods showed the two factors were not reducible to one.

  35. Weight vs. Newness in Heavy NP Shift Corpus Study

  36. Weight & Newness Aren’t the Whole Story “On this side of the Atlantic, the Lancaster-Oslo/Bergen corpus was designed to replicate as closely as possible the Brown corpus, the only difference being that this corpus contains British rather than American English texts.” Judith Klavans, “Computational Linguistics,” in W. O’Grady, M. Dobrovolsky, & F. Katamba, Contemporary Linguistics: An Introduction

  37. Another Factor: Semantic Connectedness • Behaghel again: “das geistig eng Zusammengehörige auch eng zusammengestellt wird” Translation: “What belongs together mentally is also placed close together.”

  38. Collocations and Idioms • Idioms (semantically opaque collocations): • …bring pressure to bear • Semantically transparent collocations: • …bring the meeting to an end • Non-collocations: • ...bring a pencil to the meeting

  39. Heavy NP Shift and Semantic Connectedness

  40. Dependent vs. Independent Particles • Dependent: They ate the cookies up. • The meaning of “up” is dependent on the meaning of “ate”, since the cookies don’t go up. • Independent: They picked the cookies up. • The meaning of “up” is independent of the meaning of “picked”, since the cookies go up.

  41. Particle Position and Semantic Connectedness

  42. Another Factor: Verb Bias

  43. Possible Explanations for Factors Influencing Order Variation • Short before long is easier to process, because hard tasks are postponed. • Given before new facilitates efficient communication by establishing common ground.

  44. Possible Explanations for Factors Influencing Order Variation (continued) • Long phrases and new information are hard to produce and thus get postponed. • Choices in word order allow speakers flexibility in production. • Our memory for words includes information about what constructions they occur in and how frequently.

  45. Another Possible Factor: Ambiguity Avoidance • Global ambiguity: • I saw a man wearing an odd hat with a telescope. • I saw with a telescope a man wearing an odd hat. • Local ambiguity: • They gave Grant’s letters to Lincoln to a museum. • They gave a museum Grant’s letters to Lincoln.

  46. Corpus Study of Global Ambiguity and Heavy NP Shifting

  47. Corpus Search for Local Ambiguity • Few ambiguities of the relevant form (3) • The company gave the U.S. rights to the drug to the Population Council… • More unambiguous word orders (56) • Giuliani gave the commissioner the ceremonial key to the city… • But all unambiguous cases are also cases of short-before-long.

  48. Experimental Method, step 1: Speaker silently reads a sentence: "A museum received Grant's letters to Lincoln from the foundation."

  49. Experimental Method, step 2: Sentence disappears from screen. Listener reads question from list: "What did the foundation do?"

  50. Experimental Method, step 3: Speaker answers the listener’s question: "The foundation gave .... the museum, um, Grant's letters to Lincoln." Listener chooses the correct response on list (from two choices).
