Corpora
• What they are
• How they are used
• How our Student Grammar uses them
• Examples of lexical concordancing
A corpus is
• “a large, systematic collection of texts stored on computer” (SGSWE, 3)
People creating dictionaries, grammars, and other text materials use corpora for
• Real, authentic examples of usage in different contexts
• Variation, preferences, and frequency of use in different genres
• Coverage of usage across registers and dialects, across learners and languages
SGSWE uses conversation, fiction, news and academic registers.
Registers
Our grammar text, SGSWE, is keyed to registers, which are situation-cued varieties of language. A register is characterized by
• Mode: whether the language is spoken or written
• Interactiveness: produced in real time, real or simulated
• Shared situation: co-produced
• Main purpose: inform, explain, evaluate, personal communication, argument, pleasure (reading)
• Audience: individual, group, specialist, wide public
Our grammar text is corpus-based
• It is based on usage, derived from a very large collection of words (currently a little over 40 million) occurring in specific registers.
• This particular grammar focuses on words, phrases and clauses.
• It is descriptive rather than prescriptive or proscriptive. That is, the grammar does not present norms of social or political correctness.
• It does, however, present reasons for certain choices, and it identifies issues of disputed usage, which is pedagogically useful.
Website with examples of corpora http://bowland-files.lancs.ac.uk/monkey/ihe/linguistics/corpus4/4fra1.htm
Teachers can use corpora to collect and access real data
Stvan (2005) suggests that language teachers
• create small-scale corpora
• sample relevant specialized texts
• “craft hands-on inferential vocabulary tasks”
• to increase interest and retain word meanings
Examples of small corpora might include
• Menus and cookbooks
• Individual novels
• Speeches
• Appliance manuals
• Business letters
• Conferences
Each of these could become a small corpus introducing specialized vocabulary.
• New learners ‘retained vocabulary knowledge at higher rates’ if they looked at meanings across multiple contexts, using concordancing (Cobb 2005).
• 80% of words in most English texts belong to the 2000 most frequent word families, according to Paul Nation. Teachers can see how the text uses these words first and then use lexical concordance tools, like LexTutor, to identify harder ones (we’ll work with these).
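The coverage idea above can be tried on any small corpus. The sketch below is only an illustration: the `frequent` set is a tiny hypothetical stand-in for a real high-frequency list, and it matches individual word forms rather than word families, so it will undercount coverage compared with a family-based list.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def coverage(tokens, known_words):
    """Return the share of tokens found in known_words (0.0 to 1.0)."""
    if not tokens:
        return 0.0
    hits = sum(1 for tok in tokens if tok in known_words)
    return hits / len(tokens)

def harder_words(tokens, known_words):
    """Return the sorted word types that fall outside known_words."""
    return sorted(set(tokens) - set(known_words))

# Hypothetical stand-in for a high-frequency word list (assumption).
frequent = {"the", "a", "is", "in", "with", "and", "he", "was", "dog"}

sample = "The dog was in a ferocious temper"
tokens = tokenize(sample)

print(f"coverage: {coverage(tokens, frequent):.0%}")
print("harder words:", harder_words(tokens, frequent))
```

The words left over by `harder_words` are the candidates a teacher might then look up in a concordancer.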
Concordance? Two examples
• A word (or phrase) in context, arranged in appropriate order
From the BNC: ‘ferocious’
• Back in Chania, the Calypso bar is filling with elderly Cretan men who click a lot: worry-beads clatter through their fiddling fingers, and they smack down chips on gameboards with ferocious violence.
• The Repton boy's horror at Churchill's anti-Bolshevist campaign had turned into a more adult and less ferocious criticism.
• Consequently he was in a ferocious temper when he got into his room and saw a mound of mail on his desk, half of which was addressed to Steinmark.
• A couple got off the train, were met by the police and put up a ferocious but unsuccessful fight: suspicion of a bank robbery the previous day, in Thunder Bay.
• first impressions, keeping ferocious dogs is not exactly novel in South Africa.
• Similarities are clear: the ferocious teeth of a predatory dinosaur are a sure indication of hunting habits, with hardly a glance at the fangs of living mammalian carnivores.
Although we may first think of ferocious as applying only to animals (and the acts committed by their teeth), we see an extension of that meaning to the teeth, to criticism, to temper, and to small violent actions in a game.
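A concordance like the one above can be produced with a few lines of code. The following is a minimal key-word-in-context (KWIC) sketch, not the method used by the BNC interface: the function name `kwic`, the 30-character window, and the choice to sort by the text to the right of the keyword are all assumptions made for illustration. The sample sentences are fragments of the BNC lines quoted above.

```python
import re

def kwic(sentences, keyword, width=30):
    """Return (left, keyword, right) context triples for each hit."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for sentence in sentences:
        for match in pattern.finditer(sentence):
            left = sentence[max(0, match.start() - width):match.start()]
            right = sentence[match.end():match.end() + width]
            lines.append((left, match.group(0), right))
    # Arrange the hits alphabetically by what follows the keyword.
    lines.sort(key=lambda parts: parts[2].strip().lower())
    return lines

sentences = [
    "he was in a ferocious temper when he got into his room",
    "keeping ferocious dogs is not exactly novel in South Africa.",
    "a more adult and less ferocious criticism.",
]

lines = kwic(sentences, "ferocious")
for left, key, right in lines:
    # Right-align the left context so the keyword forms a column.
    print(f"{left:>30}{key}{right}")
```

Sorting on the right-hand context groups the nouns that ferocious modifies (criticism, dogs, temper), which is what makes the extension of meaning easy to see.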
Example: business letters
Nelson (2006) examined a corpus of business letters to identify patterns of collocates (words that co-locate). He shows how positive and negative impressions are created and how thematic images and metaphors are invoked, by using collocates in a particular genre. Word choice is patterned: student writers need to investigate patterns of words as well as the words themselves.
keyword    collocates with
cause      accident, cancer, commotion, crisis, delay
provide    care, food, jobs, relief, support
Categories: employee benefits, restaurants, holiday, bonus
Checking the BNC for ‘provide’
Notice that in a large corpus incorporating a number of genres, not just business letters, provide collocates differently depending on the measure.
In terms of salience (the degree to which the words are mutual in their ‘call for each other’), provide collocates with
• services
• information
Think of fashion: a yellow striped tie is salient with a blue shirt. A person with a blue shirt is more likely to pick a yellow tie, and a person with a yellow tie is more likely to pick a blue shirt.
In terms of frequency, provide collocates with
• a (which is typically followed by a noun or nominal phrase)
• to (like TELL, provide can be ditransitive, which means a person provides something to somebody)
Check the concordance to see what happens if an inanimate thing or an organization provides.
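The difference between frequency and salience can be seen in a toy calculation. The sketch below counts words within two positions of a node word and scores each with pointwise mutual information (PMI). PMI is used here only as one common stand-in for salience; it is an assumption, and the BNC interface may compute salience differently. The corpus string is invented for illustration.

```python
import math
from collections import Counter

def collocate_stats(tokens, node, window=2):
    """Return {collocate: (co-occurrence count, PMI)} for words near node."""
    word_freq = Counter(tokens)
    total = len(tokens)
    co_freq = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo, hi = max(0, i - window), min(total, i + window + 1)
        for j in range(lo, hi):
            if j != i:
                co_freq[tokens[j]] += 1
    stats = {}
    for word, joint in co_freq.items():
        # PMI compares the observed count with what chance would predict.
        expected = word_freq[node] * word_freq[word] / total
        stats[word] = (joint, math.log2(joint / expected))
    return stats

# Invented toy corpus (assumption): 'a' is common everywhere,
# 'services' appears only next to 'provide'.
corpus = (
    "they provide a service we provide a home banks provide services "
    "a dog saw a cat a man had a hat a boy met a girl"
)
tokens = corpus.split()
stats = collocate_stats(tokens, "provide")

for word in ("a", "services"):
    count, pmi = stats[word]
    print(f"{word:<10} count={count}  PMI={pmi:.2f}")
```

In this toy corpus a co-occurs with provide more often, but services scores higher on PMI because it rarely appears anywhere else, which mirrors the frequency and salience contrast above.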
The irony of ditransitivity in Frost’s “Provide, provide”
• The witch that came (the withered hag) / To wash the steps with pail and rag, / Was once the beauty Abishag,
• The picture pride of Hollywood. / Too many fall from great and good / For you to doubt the likelihood.
• Die early and avoid the fate. / Or if predestined to die late, / Make up your mind to die in state.
• Make the whole stock exchange your own! / If need be occupy a throne, / Where nobody can call you crone.
• Some have relied on what they knew, / Others on being simply true. / What worked for them might work for you.
• No memory of having starred / Atones for later disregard / Or keeps the end from being hard.
• Better to go down dignified / With boughten friendship at your side / Than none at all. Provide, provide!