Using English Language Corpora in the ESL Classroom
I-TESOL Conference, October 12th, 2012
Brent A. Green, Salt Lake Community College

Introduction
  • Personal and professional interests in using corpora in language teaching
  • Goals of Workshop: Participants will learn how to access and use on-line written and spoken English corpora to help them prepare course materials and assessments, increase understanding of English language structures, and engage students in data-driven learning tasks.
  • What is a corpus?
    • A large database of language
  • What is a concordancer?
    • A software program that allows you to search the database for particular words or phrases
  • What is classroom concordancing?
    • A teaching approach in which concordance data are used in the language classroom to help learners notice and practice language patterns and use. This approach is sometimes referred to as Data-driven Learning (DDL). Learners are driven by authentic language data, presented in the form of concordance lines, to act as “linguistic detectives” and find answers to their own linguistic queries (Johns 1988; 1991a, b).
  • What are concordance lines?
    • Examples of words or phrases presented so that the item under investigation is aligned in the middle of the page, with its left and right contexts on either side (often referred to as KWIC, or key word in context, format).
  • Example of KWIC from the Corpus of Contemporary American English (COCA)
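
As a rough illustration of the KWIC layout described above, here is a minimal Python sketch. It is not how COCA or any particular concordancer works internally; the function name `kwic`, the `width` parameter, and the sample sentences are assumptions made for this example.

```python
import re

def kwic(text, keyword, width=30):
    """Return concordance lines with `keyword` aligned in a centre column."""
    lines = []
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Pad the left context on the left and the right context on the right
        # so that every keyword starts in the same column.
        lines.append(f"{left:>{width}} {match.group()} {right:<{width}}")
    return lines

# Invented sample text, not taken from any corpus.
sample = (
    "My father used to exercise every morning. "
    "He would run before work, and he used to say it cleared his head."
)

for line in kwic(sample, "used to", width=25):
    print(line)
```

Each printed line shows up to 25 characters of context on either side, with the search phrase starting in the same column on every line.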
Three-Dimensional Framework

[Figure: the three dimensions of form, meaning, and use]

(Larsen-Freeman 1991)

What do we look for?
  • Lexicography
    • What are the meanings associated with a particular word?
    • What is the frequency of a word relative to other related words?
    • What non-linguistic association patterns does a particular word have (e.g., to registers, historical periods, dialects)?
    • What words commonly co-occur with a particular word, and what is the distribution of these “collocational” sequences across registers?
    • How are the senses and uses of a word distributed?
    • How are seemingly synonymous words used and distributed in different ways?

(Biber et al., 1998)

What do we look for?
  • Grammatical structures (if or that clauses, causatives, etc.)
  • Discourse functions (making suggestions, introducing a speaker, etc.)
How does one begin examining corpus data?
  • You need the following
    • A language-related question which arises out of the text, your own observations or curiosity, or the observations and curiosity of your students.
    • A corpus of language that contains contexts which are similar to your learners’ target language learning domains.
    • Pedagogically sound principles in accessing and applying corpus data.
Corpus-based Research

1. Research question

2. Extensive review of the literature

3. Summary of experts across form, meaning, and use categories

4. Comparison of experts against spoken and written corpora

5. Reformulation and expansion of existing frameworks

Corpus-based Teaching
  • Syllabus design and evaluation
    • Student-based corpora
    • Student texts
  • Material preparation
  • Teacher-student collaboration
  • Student research
  • Assessments
The Corpus of Contemporary American English (COCA)
  • What is it?
    • The Corpus of Contemporary American English (COCA) is the largest freely-available on-line corpus of English
  • Who created it?
    • It was created by Mark Davies of Brigham Young University in 2008
  • How many words does it contain?
    • The corpus contains more than 450 million words of text and is equally divided among spoken, fiction, popular magazines, newspapers, and academic texts.
          • Information adapted from
The Corpus of Contemporary American English (COCA)
  • What type of searches can I do with COCA?
    • The interface allows you to search for exact words or phrases, wildcards, lemmas, part of speech, or any combinations of these.  You can search for surrounding words (collocates) within a ten-word window. 
    • The corpus also allows you to easily limit searches by frequency and compare the frequency of words, phrases, and grammatical constructions
    • Information adapted from
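
To make the idea of collocates in a word window concrete, here is a small Python sketch. It only illustrates the counting idea and does not reproduce the COCA interface; the function name `collocates`, the toy sentence, and the whitespace tokenization are assumptions made for this example.

```python
from collections import Counter

def collocates(tokens, node, window=10):
    """Count words occurring within `window` tokens on either side of `node`."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == node:
            start = max(0, i - window)
            # Words to the left of the node, then words to the right of it.
            neighbours = tokens[start:i] + tokens[i + 1:i + 1 + window]
            counts.update(neighbours)
    return counts

# Invented toy text, not corpus data; real input would need proper tokenization.
text = "take a deep breath and then a deep sleep follows a deep breath"
tokens = text.lower().split()

top = collocates(tokens, "deep", window=1)
print(top.most_common(3))
```

With a window of one word on each side, "a" and "breath" come out as the most frequent neighbours of "deep" in this toy text. A wider window picks up looser associations.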
The Corpus of Contemporary American English (COCA)
  • What else can you do?
    • You can also easily carry out semantically-based queries of the corpus. For example, you can contrast and compare the collocates of two related words to determine the difference in meaning or use between these words. 
    • You can find the frequency and distribution of synonyms for nearly 60,000 words, compare their frequency in different genres, and use these word lists as part of other queries.
    • Finally, you can easily create your own lists of semantically-related words, and then use them directly as part of the query.
      • Information adapted from
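
Contrasting the collocates of two related words can be sketched in a few lines of Python. This is a simplified stand-in for the kind of comparison described above, not the method COCA uses; the helper names `window_collocates` and `distinctive` and the toy sentences are assumptions made for this example.

```python
from collections import Counter

def window_collocates(tokens, node, window=4):
    """Count the words found within `window` tokens of each `node` occurrence."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            counts.update(tokens[max(0, i - window):i])
            counts.update(tokens[i + 1:i + 1 + window])
    return counts

def distinctive(counts_a, counts_b):
    """Collocates seen with the first word but never with the second."""
    return sorted(w for w in counts_a if w not in counts_b)

# Invented toy sentences, not corpus data.
toy = "a small house . a small child . a little house . a little while".split()
small = window_collocates(toy, "small", window=1)
little = window_collocates(toy, "little", window=1)

print(distinctive(small, little))   # collocates only of "small"
print(distinctive(little, small))   # collocates only of "little"
```

Shared collocates ("a", "house") drop out, leaving the words that set the two near-synonyms apart in this toy sample.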
Corpus-based Practice

Before you look up the collocates of the words deep, run, smile, and fairly, guess what the best collocates might be, that is, the surrounding words that really help to "define" these words.

Does anything you see in the corpus surprise you?

Corpus-based Practice

Compare the collocates of the two words democrats and republicans. According to these texts (from newspapers, magazines, TV talk shows, etc.), is there any possible media bias here?

Corpus-based Practice

Compare the frequency of second vs. secondly in academic texts. Which one would you guess is more frequent?

What issues do we have when we make this comparison?
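
One issue that may come up is that a raw count of "second" also includes uses such as the unit of time, so the two counts are not strictly comparable. A second is that counts from sections of different sizes need to be normalized. The Python sketch below illustrates both points under stated assumptions: the numbers and the (word, tag) pairs are invented, and the tag names are placeholders rather than COCA's tagset.

```python
def per_million(count, corpus_size):
    """Normalise a raw count to occurrences per million words."""
    return count * 1_000_000 / corpus_size

# Invented (word, tag) pairs; the tag names are placeholders.
tagged = [
    ("second", "ORD"), ("secondly", "ADV"), ("second", "NOUN"),
    ("second", "ORD"), ("second", "NOUN"),
]

# Counting by form alone mixes the ordinal with the noun (unit of time).
raw_second = sum(1 for word, _ in tagged if word == "second")
# Restricting by tag keeps only the uses that compete with "secondly".
ordinal_second = sum(1 for word, tag in tagged if word == "second" and tag == "ORD")

print(raw_second, ordinal_second)
# Made-up figures: 50 hits in a 2-million-word section.
print(per_million(50, 2_000_000))
```

Restricting the search by part of speech and reporting frequencies per million words makes the comparison between the two forms more meaningful.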

Corpus-based Practice

Compare the adjectives used to describe women and men.

Does this reflect biases in contemporary American culture?

Corpus-based Practice
  • Using the web interface, you can search by
    • Words: malignant
    • Phrases: nooks and crannies, or faint + noun (faint [n*])
    • Lemmas (all forms of a word), like sing ([sing]) or tall ([tall])
    • Wildcards (un*ly or r?n*)
    • More complex searches: un-X-ed adjectives (un*ed.[j*]) or verb + any word + a form of ground ([vv*] * [ground])
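
The wildcard patterns above can be tried offline on a word list. The Python sketch below uses the standard library's `fnmatch` module, in which `*` stands for any run of characters and `?` for exactly one. That mirrors the two wildcard examples, but it is only an approximation: the bracketed lemma and part-of-speech syntax ([sing], [n*]) is specific to the web interface and is not handled here. The helper name `wildcard_search` and the word list are assumptions made for this example.

```python
from fnmatch import fnmatchcase

def wildcard_search(words, pattern):
    """Return the words that match a simple * / ? wildcard pattern."""
    return [w for w in words if fnmatchcase(w, pattern)]

# Invented word list standing in for a corpus vocabulary.
words = ["unlikely", "unfairly", "running", "ran", "rain", "runs", "ring", "unhappy"]

print(wildcard_search(words, "un*ly"))  # words beginning with un- and ending in -ly
print(wildcard_search(words, "r?n*"))   # r + any one letter + n + anything
```

Note that "rain" does not match r?n* because its third letter is not n.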
Types of Concordance-based Tasks

Adapted from Sripicharn 2003

Teacher-Centered Tasks
  • Example #1
    • Used to and would in the habitual past
  • On Your Own
    • Hedges (kind of, sort of, like)
    • Say, talk, tell
Erades (1943)
  • It may be safely said that in language a difference of form always corresponds to a difference in meaning and whenever more than one construction is—theoretically—possible, they never wholly and under all circumstances denote the same thing. The first axiom of all valid linguistic thinking is that in language nothing can serve as a substitute for something else.
Would vs. Used to Example
  • Briefly discuss the differences between the two sentences with a partner

(a) My father used to exercise every morning.

(b) My father would exercise every morning.

  • One difference is that (a) can signal only habitual past action, whereas (b) can also be conditional given an appropriate context (e.g., “if he had time”).
Would vs. Used to Example
  • Steps
    • think about the context when the structure occurs
      • personal narrative
    • find corpus data that matches that context
      • American Dreams (Studs Terkel)
      • Switchboard
    • search for target structures using a concordancing program
      • MonoConc
Would vs. Used to Example
  • Steps cont.
    • Look for patterns in form, meaning, and use
      • In what ways, if any, are the forms the same or different?
      • In what ways, if any, are the meanings different or similar? (look carefully at surrounding context)
      • In what ways, if any, are the structures used differently? (look carefully at surrounding context)
    • Create sample worksheets or tests for students
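
If no concordancing program is at hand, the search step above can be approximated with regular expressions. The Python sketch below is a rough stand-in under stated assumptions: the patterns, the function name `find_structures`, and the short narrative are invented for this example, and the hits still need manual inspection (for instance, "is used to doing" would also match the first pattern).

```python
import re

# Each pattern captures the target structure plus the word that follows it.
PATTERNS = {
    "used to": re.compile(r"\bused to \w+", re.IGNORECASE),
    "would": re.compile(r"\bwould \w+", re.IGNORECASE),
}

def find_structures(text):
    """Return every match for each target structure, keyed by its label."""
    return {label: pattern.findall(text) for label, pattern in PATTERNS.items()}

# Invented personal narrative, not taken from any corpus.
narrative = (
    "When I was a kid we used to live on a farm. "
    "My father would get up at five, and he would milk the cows before breakfast."
)

hits = find_structures(narrative)
for label, matches in hits.items():
    print(label, len(matches), matches)
```

Looking at which verbs follow each structure is one way to start on the form, meaning, and use questions listed above.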
MICASE Corpus
  • How many words?
    • approximately 1.8 million words (190 hours)
  • What is the focus?
    • Contemporary university speech within the University of Michigan, in Ann Arbor, Michigan.
  • Who are the speakers?
    • Speakers represented in the corpus include faculty, staff, and all levels of students, and both native and non-native speakers.
MICASE Corpus
  • What are the speech events?
    • The corpus includes the following speech events: small and large lectures (62), public interdisciplinary or departmental colloquia (13), discussion sections (9), student presentations (11), seminars (8), undergraduate lab sessions (8), lab group and other meetings (6), one-on-one tutorials (3), office hours (8), advising consultations (5), dissertation defenses (4), study groups (8), interviews (3), campus/museum tours (2), and service encounters (2).
On Your Own: Teacher-centered Task
  • Say, Talk, or Tell
    • Characteristics
      • Transitive vs. intransitive vs. ditransitive
      • Used in spoken language
      • Idiomatic expressions
    • Tasks
      • Search MICASE for tokens of these forms
        • Cut and paste example sentences from MICASE into MS Word.
          • Ask learners to examine the forms
          • Assess learners' ability to get the forms correct
      • Search for idiomatic expressions
        • Cut and paste examples of idiomatic forms
        • Ask learners key questions about the examples
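
Once example sentences have been collected, a gap-fill exercise can be generated automatically. The Python sketch below shows one possible approach; the function name `make_gap_fill`, the list of verb forms, and the example sentences are assumptions made for this example, and the sentences are invented rather than copied from MICASE.

```python
import re

def make_gap_fill(sentences, targets):
    """Blank out each target form and keep the removed words as an answer key."""
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, targets)) + r")\b", re.IGNORECASE
    )
    items = []
    for sentence in sentences:
        answers = pattern.findall(sentence)
        if answers:  # skip sentences that contain none of the target forms
            items.append((pattern.sub("_____", sentence), answers))
    return items

# Invented example sentences, not copied from MICASE.
examples = [
    "Can you tell me what he said?",
    "We need to talk about the results.",
]
forms = ["say", "says", "said", "talk", "talks", "talked", "tell", "tells", "told"]

for gapped, answers in make_gap_fill(examples, forms):
    print(gapped, answers)
```

The gapped sentences can go on the worksheet, and the answer lists can be kept as the key for assessment.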
Example of Teacher-centered Tasks

Sample sentences and Idioms

MICASE Search
  • Click on the link below to begin your search


  • Using the form, meaning, and use handout, take notes on our discussion of softening phrases such as I think, in my opinion, and it seems to me. Can you find others?
Learner-Centered
  • The learners form their own questions
  • The learners browse the corpus independently; there is no structured or controlled task
  • There is very little interference from the teacher in the generalization process
  • Now it is your turn to answer those structure-related questions that have been bothering you for years!
  • Corpus of Contemporary American English (COCA)
Other tasks
  • Utilizing the audio features
  • Browsing the corpus to find specific speech events
  • MICASE activities for learners
Two examples of student with TA during office hours

S1: okay

S2: you feel th- as though you're in a lab or, [S1: yeah almost ] <LAUGH> it's a little a little bit a little bit odd. okay. uh, the reason i asked you to come in is that, i- i'm looking at the grades and i'm looking at at this paper and, you're at the point where i don't want you to, fall off the edge. uh and and get a grade that's not gonna be, supportive. it seems to me that you know that you've been in touch with things in the class and that i, i liked what you did with your poem to change it which wasn't_ which must have involved a fair amount of work. [S1: (i don't know) ] to, you know to get that in a different order and to get the system ba- was it a lot of work?

S1: mm, it wasn't too much it didn't take me too long to just, use the same word i just, i'd say the hardest part yeah was changing the sentences. trying to make 'em all fit again. [S2: okay ] but it wasn't too bad.

S2: okay. but the rhythm seemed to work right and, [S1: mhm ] it it really did, come out to be a sus- sestina and one of the effects of the sestina is that, since you're using those words over and over again they they tend to acquire different meanings they tend to to just, they sound different in different combinations [S1: mhm ] and and they mean something. but let's look at this [S1: kay ] um, because i think that that part of what's happening here, is that is that you're using a lot of words where few words would work. where you don't really need that that many words to say what you want to to say. and there are some cases where you're where you're looking, or where you seem to be saying something um, and i think i know what i know what you want to say, but because you've sort of, you've given me more than than i need you're really disguising the meaning [S1: mkay ] rather than bringing the meaning out. so that, if y- if you look at this sentence and if you just r- read that sentence aloud.

Victor: Do you have a few minutes?

Pam: Sure, I’m Pam.

Victor: I’m Victor

Pam: Hi Victor. Have a seat. How can I help you?

Victor: Well I’m in Dr. Sears’ American Lit class…and I’m having a lotta trouble with that poetry unit. I’m thinking of dropping the class

Pam: Oh. I hate to tell you, but Friday was the last day to drop.

Victor: Oh no. I knew I should have dropped last week.

Pam: Well, it’s all right. Let’s see what we can do to get you through the class. Guess literature isn’t your thing, huh?

Victor: It’s just this unit on poetry. I did okay with short stories.

Pam: What’s giving you problems?

Victor: I just don’t get a lot of this modern stuff. It just doesn’t seem like poetry to me.

Pam: What exactly bothers you?

Victor: I understood the poems by Robert Frost and Maya Angelou, But the poems in last night’s homework don’t rhyme or have rhythm or anything.

First example: MICASE transcript (Simpson, Briggs, Ovens, & Swales, 2002)

Second example: textbook dialogue (Hartmann & Blass, 2000)

Favorite Corpus Web Site
  • Michael Barlow’s Corpus Linguistics Site

Other Links
  • Spoken Corpora
    • MICASE: R. C. Simpson, S. L. Briggs, J. Ovens, and J. M. Swales. (2002) The Michigan Corpus of Academic Spoken English. Ann Arbor, MI: The Regents of the University of Michigan.
    • Linguistic Data Consortium University of Pennsylvania
    • The Corpus of Contemporary American English Mark Davies, Brigham Young University
    • American National Corpus
    • British National Corpus (also available through Mark Davies's corpus website)
  • Spoken Language Resources
    • Bygate, M. (1998) Theoretical perspectives on speaking. Annual Review of Applied Linguistics 18, p. 20-42
    • Burns, A. (1998) Teaching speaking. Annual Review of Applied Linguistics 18, p. 102-123
    • Burns, A. & Joyce, H. (2002). Focus on speaking. Sydney: National Center for English Language Teaching and Research.
    • McCarthy, M. (1998). Spoken language & applied linguistics. Cambridge: Cambridge University Press.
    • Celce-Murcia, M., & Larsen-Freeman, D. (1999). The grammar book: An ESL/EFL teacher's course (2nd ed.). Boston, MA: Heinle & Heinle.
  • Corpus Linguistics Texts
    • Biber, D., Conrad, S., & Reppen, R. (1998). Corpus linguistics: Investigating language structure and use. Cambridge: Cambridge University Press.
    • Partington, A. (1998) Patterns and Meanings: Using corpora for English language research and teaching. John Benjamins.
    • Tribble, C. & Jones, G. (1997). Concordances in the classroom: using corpora. A resource guide for teachers [new edition]. Houston, TX: Athelstan
  • MICASE Tips and Tutorials
  • Other References
    • Erades, P. A. (1943). The case against provisional It. English Studies, 25, 169-176.
    • Hartmann, P. & Blass, L. (2000). Quest: Listening and speaking in the academic world, Book 3. New York: McGraw Hill.
    • Johns, T. F. (1988). Whence and whither classroom concordancing? In T. Bongaerts, P. de Haan, S. Lobbe, & H. Wekker (eds.) Computer applications in language learning, p. 9-27. USA: Foris Publications.
    • Johns, T. F. (1991). Should you be persuaded: Two examples of Data-driven learning. In T. F. Johns & P. King (eds.) ELR Journal Vol. 4 Classroom concordancing (p. 27-46). Birmingham CESL: The University of Birmingham Press.
    • Johns, T. F. (1997). Contexts: The background, development, and trialling of a concordance-based CALL program. In A. Wichmann, S. Fligelstone, T. McEnery, & G. Knowles (eds.) Teaching and language corpora. London: Longman.
    • Riggenbach, H. (1999). Discourse analysis in the language classroom: Vol. 1. The spoken language. Ann Arbor, MI: University of Michigan Press.
    • Sripicharn, P. (2003). Implementing collaborative concordancing between teacher and learners in the writing class. Paper presented at the 5th CULI International Conference, Bangkok, Thailand.