
Corpus Pattern Analysis: new light on words and meanings

Patrick Hanks Institute of Formal and Applied Linguistics, Charles University in Prague. Corpus Pattern Analysis: new light on words and meanings. Talk Outline. Collocations: the work of J. M. Sinclair Phraseology and terminology in dictionaries:


Presentation Transcript


  1. Patrick Hanks Institute of Formal and Applied Linguistics, Charles University in Prague Corpus Pattern Analysis: new light on words and meanings

  2. Talk Outline • Collocations: the work of J. M. Sinclair • Phraseology and terminology in dictionaries: • Terminological words need to be defined in relation to the world and other related terms (and etymology) • Everyday words need to be explained in relation to the different patterns of use in which they are found • Creativity in language: exploiting norms • exploitations are different in kind from normal uses • Collocations and phraseology: neglected in dictionaries • A Pattern Dictionary

  3. Collocations: “Many, if not most meanings, require the presence of more than one word for their normal realization. ... “Patterns of co-selection among words, which are much stronger than any description has yet allowed for, have a direct connection with meaning.” —J. M. Sinclair 1998, ‘The Lexical Item’ in E. Weigand (ed.) Contrastive Lexical Semantics. Benjamins. John Sinclair (1933-2007)

  4. “The principle of idiom is that a language user has available to him or her a large number of semi-preconstructed phrases that constitute single choices, even though they might appear to be analysable into segments.” —Sinclair 1991. Corpus, Concordance, Collocation, p. 110 “Tending towards open choice is what we can dub the terminological tendency, which is the tendency for a word to have a fixed meaning in reference to the world. ... tending towards idiomaticity is the phraseological tendency, where words tend to go together and make meanings by their combinations.” —Sinclair 2004. Trust the Text, p. 29 Idiomaticity vs. Open Choice

  5. The terminological extreme: strobilation “Jellies live their lives in two body forms or stages. There is a polyp or attached stage which resembles tiny sea anemones ... and a medusa or free-swimming stage. The trick to getting jellies to go through their life cycle (both stages) is to get the attached polyps to begin to strobilate, or bud off new juvenile jellies. ... “Head Curator Lisa Scott (class of 2007) designed and built a Moon Jelly Strobilation display for her Senior Project. She was able to successfully strobilate jellies which will ensure that we have a steady supply of adult jellies for our popular kreisel, or jelly display. So next time you stop by the Aquarium make sure to check out Lisa Scott’s juvenile jellies in the new Moon Jelly Strobilation Display.” —from the Cabrillo High School Aquarium project, California, http://www.cabrilloaquarium.org/aqurium-exhibits/moon-jelly-strobilation.html

  6. How is strobilation treated in English dictionaries? (1) • (N)ODE (1998): It’s not in.

  7. How is strobilation treated in English dictionaries? (2) • Merriam Webster's Online Dictionary: Main Entry: stro·bi·la·tion ... Etymology: New Latin strobila : asexual reproduction (as in various coelenterates and tapeworms) by transverse division of the body into segments which develop into separate individuals, zooids, or proglottids. Does the etymology tell the reader anything?

  8. How is strobilation treated in English dictionaries? (3) Collins English Dictionary (1979): • strobilation ... asexual reproduction by division into segments, as in tapeworms and jellyfishes. • strobilaceous Botany. relating to a cone or cones. • strobila ... the body of a tapeworm, consisting of a string of similar segments (proglottides) ... [C19: from Greek strobilē plug of lint twisted into a cone shape, from strobilos a fir cone] • strobilus ... Botany. the technical name for a cone (sense 3). [C18: via Late Latin from Greek strobilos a fir cone] Collins English Dictionary On Line offers no etymologies, but at strobilation adds “see also strobile, strobilus, stroboscope”. What is the reader to make of this?

  9. strobilation in Wikipedia “Strobilation or transverse fission is a form of asexual reproduction consisting of the spontaneous transverse segmentation of the body. It is observed in certain cnidarians and helminths. This mode of reproduction is characterized by high offspring output, which, in the case of the parasitic tapeworms, is of great significance.”

  10. Why etymology matters to terminology • (N)ODE does not have an entry at all. • Merriam Webster mentions NL strobila without gloss or explanation. • Collins (printed book) has a good cluster of related strob- entries, with informative etymologies. • None of the sources cited mention that the New Latin term strobila is based on Greek strobilē ‘an act or state of twisting or whirling’ [not just a piece of twisted lint]. This is related to strobilos ‘pine cone’ (ultimately a derivative of strephein ‘to twist or whirl’). • This information is essential to understand the connections – mentioned by Collins On Line – to strobilus (a pine cone) and stroboscope (which gave us strobe lighting).

  11. The phraseological extreme: blow • collocational preferences: wind, gale, sand, snow, ship; fire, nose, bubble; building, house, window, fuse. • 3 core meanings: • what the wind does • what a human does when exhaling (e.g. blow a whistle, blow smoke over other people, blow bubbles, blow one’s nose) • what an explosion does (e.g. blow a hole in a wall). • Alternations: active/passive, causative/inchoative, conative, resultative • 6 phrasal verbs (blow apart, blow away, blow down, blow off, blow over, blow up) with more or less independent meanings, some with more than one pattern (e.g. blow up a balloon vs. blow up a building).

  12. The phraseology of blow (ctd.) • 17 or more idiomatic or figurative expressions, including: • blow [a project] off course • blow the cobwebs away [= introduce fresh thinking] • blow one's own trumpet [= boast] • blow the whistle on someone or something [= expose wrongdoing] • blow one's brains out [= kill] • blow hot and cold [= vacillate] • blow a fuse [= lose one's temper] • blow the gaff [= expose a secret] • blow a raspberry [= make a rude, derisive noise]

  13. Exploiting phraseological norms • Blow up a balloon is a norm for the phrasal verb blow up. • ‘balloon’ is not a necessary condition of this norm. • Norms don’t have necessary conditions! • You can exploit a norm with an anomalous argument, e.g.: • “When a visiting paramedic distributed free condoms, the children blew them up and played with them like balloons.” —from a text on birth control education in Nepal.

  14. Terminological meaning and phraseological meaning: discord second, terminological definition stipulated as an SI unit (in the Système international d'unités) by a committee of scientists: • “the duration of 9,192,631,770 periods of the radiation corresponding to the transition between the two hyperfine levels of the ground state of the caesium 133 atom.” Some ordinary phraseology: • After pausing for a second he must have relented. • My darling John, our thoughts are with you every second. • A bird the size of a sparrow beats its wings 14 times a second.

  15. Terminological and phraseological meaning: polyphony [[Human]] organize [[Eventuality]] • [He] had organized a mass rally in the city’s Cathedral Square. • Attempts by the United States to organize a Middle East peace conference remained deadlocked. • Notation used in organizing books on shelves or files in a filing cabinet • All the evidence has led us to organize this book with reference to different kinds of places. [[Entity]] {be organized} [Adv[Manner]] • The brain is organized with a considerable amount of parallel wiring. [[Human]] organize [[Human Group]] • In about 1813-14 he organized in Bermondsey a society of workmen. • Ticha ... helped organize women coffee harvesters. • Inchoative alternation: Agricultural workers were still denied the right to organize.
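
The bracket notation on this slide can be read as a small data structure: a verb whose argument slots are filled by semantic types written in double square brackets. The sketch below is one hypothetical way to encode two of the patterns for organize in Python. The class name `Pattern`, the hand-made `SEMANTIC_TYPES` lexicon, and the function `match_pattern` are assumptions made for illustration; they are not taken from the talk or from any existing pattern dictionary.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical, hand-made lexicon mapping nouns to semantic types.
# A real system would rely on an ontology and corpus evidence instead.
SEMANTIC_TYPES = {
    "he": "Human",
    "ticha": "Human",
    "rally": "Eventuality",
    "conference": "Eventuality",
    "society": "Human Group",
    "harvesters": "Human Group",
}


@dataclass(frozen=True)
class Pattern:
    """A verb pattern with one semantic type per argument slot."""
    subject_type: str
    verb: str
    object_type: str

    def __str__(self) -> str:
        # Render in the slide's notation, e.g. [[Human]] organize [[Eventuality]]
        return f"[[{self.subject_type}]] {self.verb} [[{self.object_type}]]"


# Two of the transitive patterns for 'organize' listed on the slide.
ORGANIZE_PATTERNS = [
    Pattern("Human", "organize", "Eventuality"),
    Pattern("Human", "organize", "Human Group"),
]


def match_pattern(subject: str, verb: str, obj: str) -> Optional[Pattern]:
    """Return the first pattern whose semantic types fit the arguments, else None."""
    subj_type = SEMANTIC_TYPES.get(subject.lower())
    obj_type = SEMANTIC_TYPES.get(obj.lower())
    for pattern in ORGANIZE_PATTERNS:
        if (pattern.verb == verb
                and pattern.subject_type == subj_type
                and pattern.object_type == obj_type):
            return pattern
    return None


print(match_pattern("He", "organize", "rally"))
print(match_pattern("Ticha", "organize", "harvesters"))
```

Arguments whose nouns are missing from the toy lexicon match no pattern and return `None`; in practice such cases would need either a richer type lexicon or treatment as possible exploitations of a norm.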

  16. Terminological meaning and phraseological meaning: harmony admit: terminological sense A) ‘say reluctantly’ • Richmond admitted driving a motor vehicle with excess alcohol in his blood. • Mary Bryce admits a certain sympathy for Norman Lamont. admit: terminological sense B) ‘allow to enter’: • A sluice on the lock is opened to admit water. • The skylights in the galleries admit light through angled screens. • Each old person admitted to residential care should sign a contract relating to the rights of residents. • They [the Baltic nations] were admitted into the United Nations and other international organizations. • Joanna had dislocated her hip and was admitted to hospital.

  17. The lamentable condition of European lexicography • Great historical dictionaries (OED, Grimm) • On 18th-century reductionist principles (Leibniz, Johnson) • Useful practical handbooks (ODE/COD, Duden, Larousse) • supplying users with “hints and associations” (Bolinger) • Pedagogical dictionaries (OALD, LDOCE, Macmillan) • On 18th-century principles with a bit of added phraseology • No English dictionary offers a serious account of language use based on the idiom principle • Cobuild was at best a first attempt in this direction • Some European dictionaries (e.g. WDG) attempted to represent phraseology • Without corpus evidence they were merely guessing.

  18. The even more lamentable condition of American lexicography • America’s favourite dictionary (Merriam Webster Collegiate) • theoretical foundations?? .... historical principles, Latin grammar • takes no account of corpus evidence, constructions, phraseology, ... • MW Learners’ Dictionary – a conservative copy of British models • American Heritage Dictionary • places the modern meaning first • clear definitions • takes no account of corpus evidence, constructions, phraseology, ...

  19. What about Chinese lexicography? • Yihua Zhang and Hexian Xue (Historical Lexicography Conference, Oxford 2010) report: “Modern dictionaries for native Chinese speakers do not present sufficient information about [idiomatic] expressions and their cultural information. ... Without those pieces of information, foreign learners of Chinese encounter great obstacles to idiomaticity and fluency.” • Zhang and Xue give examples of idiomatic uses of hóng ‘red’ implying 1) beauty, 2) vitality and success, 3) revolution and power, ... etc.

  20. Why has phraseology been neglected by dictionaries? • Phraseology is hard to capture accurately • Each element in a phrase offers many possible variations • Meaning was assumed to be a property of words, not phrases • Meaning in text was assumed to be built up compositionally • The ‘Lego set’ theory of language • Lack of a) evidence and b) analytic techniques • Construction grammarians and systemic linguists agree meaning is a property of phrases (or “constructions”) as well as words • But without corpus evidence, systematic analysis was not possible • Grammarians could give a few (more or less bizarre) examples • Using evidence of introspection, the number of phraseological variables associated with each word seemed to be vast and unmanageable

  21. So what’s changed? • In a word (or rather, in a multi-word expression): Corpus Evidence • Corpus evidence shows that systematic analysis of normal phraseology, though difficult, is possible • The variables can be tamed • Normal phraseology can be distinguished from abnormal variations • Linguistic behaviour is highly patterned • The patterns of linguistic behaviour associated with each word – and their meanings – can be captured, guided by prototype theory and statistical analysis
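
As a rough illustration of what "statistical analysis" of collocations can look like, the sketch below counts the words that occur near a node word and scores one pairing with pointwise mutual information (PMI). The talk does not name a specific statistic, so PMI is an assumption here, chosen because it is a common association measure. The token list is an invented toy example, not corpus data, and the function names `collocates` and `pmi` are hypothetical.

```python
import math
from collections import Counter


def collocates(tokens, node, window=2):
    """Count words found within `window` tokens of each occurrence of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[tokens[j]] += 1
    return counts


def pmi(tokens, node, collocate, window=2):
    """Pointwise mutual information: log2(joint * N / (f(node) * f(collocate)))."""
    n = len(tokens)
    freq = Counter(tokens)
    joint = collocates(tokens, node, window)[collocate]
    if joint == 0:
        # The pair never co-occurs in the window.
        return float("-inf")
    return math.log2(joint * n / (freq[node] * freq[collocate]))


# Invented toy text, for illustration only.
toy_tokens = (
    "the wind blew hard and the wind blew cold "
    "while the boy blew bubbles and the wind dropped"
).split()

print(collocates(toy_tokens, "blew").most_common(3))
print(pmi(toy_tokens, "blew", "wind"))
```

On a text this small the scores mean little; the point is only that co-occurrence counts, once available from a corpus, can be ranked and compared, which is what makes the variables tractable.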

  22. Conclusion • Dictionaries of the future will devote serious attention to analysis of words in constructions and collocations. • For this, it will be necessary to distinguish patterns of normal usage from creative exploitations of such patterns. • Fillmore proposes a Constructicon in parallel to the Lexicon • http://framenet.icsi.berkeley.edu/ • Hanks and Pustejovsky (2005) propose a Pattern Dictionary • http://nlp.fi.muni.cz/projects/cpa/ • Are these proposals compatible?
