Chinese learner corpora and second language research

Presentation Transcript

  1. The 2006 International Symposium of Computer-Assisted Language Learning, June 2-4, 2006, Beijing • Chinese learner corpora and second language research • Qiufang Wen, The National Research Center for Foreign Language Education, BFSU

  2. Topics to be addressed • English corpora of Chinese learners • Corpus-based studies on English learners in mainland China • Several corpus-based studies on English learners’ interlanguage by myself or together with my colleagues • Advantages and disadvantages of corpus-based studies on interlanguage

  3. Topic One English corpora of Chinese learners

  4. Chinese learner English Corpus (CLEC) • College Learners’ Spoken English Corpus (COLSEC) • Spoken and Written Corpus of Chinese Learners (SWECCL) • Version 1 • Version 2 (under construction) • Bilingual Corpus of Chinese English Learners (BICCEL): under construction

  5. 1. Chinese Learner English Corpus (CLEC) by Gui & Yang in 2003 • Written corpus: 1 million • Timed and untimed compositions • Levels of proficiency • Middle school students • Non-English majors (Band 4) • Non-English majors (Band 6) • English majors (Band 4) • English majors (Band 8) • Error-tagged

  6. Two types of English learners in university • English majors: Year 4, Year 3, Year 2, Year 1 • Non-English majors: Year 4, Year 3, Year 2, Year 1 • Band labels shown in the diagram: Band 8, Band 6, Band 4, Band 4, Band 2

  7. 2. College Learners’ Spoken English Corpus (COLSEC) by Yang & Wei in 2005 • Tokens: 0.7 million • Source: National spoken English test for non-English majors • Test items • Teacher-student conversation • Student-student discussion • Teacher-student discussion • Data format: written transcripts

  8. 3. Spoken and Written Corpus of Chinese Learners (SWECCL) by Wen, Wang & Liang in 2005 (Version 1) • Components of SWECCL: WECCL, SECCL • Sizes shown (in slide order): 1.46 million, 1.18 million

  9. Spoken (SECCL) • Source of data • National spoken English test: 1996-2002 • Second-year English majors • Data format • Digital sounds as well as transcripts of the speeches

  10. National spoken English test for English majors — Band 4 • Test format • Test in a lab • The number of testees annually • 2006: more than 16,000 • Expect to have 50,000 in the future • Scoring procedures • A random sample (30-35 tapes) • Two raters scoring one tape independently

  11. Number of subjects • 6 groups from each year (1996-2002) • 42 groups (30-35 tapes each) = about 1400 students • About 230 hours of speech • Testing items
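
The figures on this slide can be checked with a little arithmetic. The sketch below assumes one tape per student and a uniform 30 to 35 tapes in every group; the variable and function names are illustrative and do not come from the presentation.

```python
# Rough check of the SECCL sample size quoted on the slide.
# Assumption: one tape corresponds to one student.
YEARS = range(1996, 2003)      # 1996-2002 inclusive, i.e. 7 test years
GROUPS_PER_YEAR = 6            # "6 groups from each year"
TAPES_PER_GROUP = (30, 35)     # each random sample holds 30-35 tapes


def sample_size_range():
    """Return (number of groups, minimum students, maximum students)."""
    groups = len(YEARS) * GROUPS_PER_YEAR
    low = groups * TAPES_PER_GROUP[0]
    high = groups * TAPES_PER_GROUP[1]
    return groups, low, high


groups, low, high = sample_size_range()
print(groups, low, high)  # 42 1260 1470
```

The quoted "about 1400 students" falls inside the 1260 to 1470 range, and 230 hours spread over roughly 1400 speakers works out to about ten minutes of speech per student.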

  12. Testing items

  13. The structure of SECCL [diagram; labels shown: Text, Sound files (1996-2002), Tagged, Raw, Special, Article, Past Tense, Whole, Task (Task A, Task B, Task C), Year]

  14. The written component [diagram; Written: Year 1, Year 2, Year 3, Year 4]

  15. The written component • Source of data • Timed compositions in class (40 minutes, no less than 300 words) • Take-home compositions (no word limit) • Types of compositions • Argumentative (a list of topics provided) • Narrative

  16. SWECCL in 2007 (Version 2) • WECCL: two million • SECCL: two million

  17. SECCL (Version 2) • 2003-2006 National Spoken English Test for second-year English majors (Band 4) • 2000-2006 National Spoken English Test for 4th-year English majors, Band 8 (Task 3) • Longitudinal data (2001-2004)

  18. Spoken (Band 8) • Testing item (Task C) • Make a comment on a given topic • Data format • Digital sounds as well as transcripts of the speeches

  19. Spoken (Longitudinal) • 72 students 56 students • 40 hours’ speech

  20. Tasks • Reading aloud • Retelling a story • Talking on a given topic (Narrative) • Talking on a given topic (argumentative) • Conversation (Role play) • Discussion on a given topic

  21. 4. Bilingual Corpus of Chinese English Learners (BICCEL) • Spoken: E-C 0.5 million, C-E 0.5 million • Written: E-C 0.5 million, C-E 0.5 million

  22. Spoken component of BICCEL • National Oral English test — Band 8 • The 4th year English majors • Interpreting from English to Chinese (Task A) • Interpreting from Chinese to English (Task B) • 2001-2005: 1100 testees

  23. Written component of BICCEL • Source of data: in-class assignment • E-C and C-E translation • Across the 3rd and 4th years • 30 universities across the country

  24. Topic Two A brief review of corpus-based studies on Chinese learner English

  25. Sources • China National Knowledge Infrastructure (CNKI) (online journals) • Digital dissertation database

  26. Corpus-based studies in mainland China

  27. Research areas

  28. Conferences & workshop • The International conference on “Corpus Linguistics” 25-27 October, 2003 • The First National Symposium on corpus linguistics and ELT Education 11-13 October, 2004 • Workshop on the use of corpus in teaching and research 17-19 March, 2006

  29. Topic Three Several corpus-based studies on English learners’ interlanguage by myself or together with my colleagues

  30. Study One Features of oral style in English compositions of advanced Chinese EFL learners. Wen, Q. F., Ding, Y. R. & Wang, W. Y. 2003. Foreign Language Teaching & Research (4): 268-274.

  31. Study Two A Study on Frequency Adverbs Used by Advanced English Learners in China. Wen, Q. F. & Ding, Y. R. 2004. Modern Foreign Languages (2): 141-147.

  32. Study Three An analysis of English Majors’ Abstracting abilities through their English compositions Wen, Q.F. & Liu, R.Q. 2006. Foreign Languages (2)

  33. Study Four • A longitudinal study on the developmental features of speaking vocabulary by English majors in mainland China Wen, Q. F. 2006. Foreign Language Teaching and Research (3).

  34. Study Five • A comparison of developmental features of Speaking and Writing vocabulary by English majors • Wen, Q. F. 2006. Foreign languages and Foreign Language Teaching (4)

  35. Study Six Patterns of change in speaking vocabulary development by English majors

  36. Study Two A Study on Frequency Adverbs Used by Advanced English Learners in China. Wen, Q. F. & Ding, Y. R. 2004. Modern Foreign Languages (2): 141-147.

  37. Frequency Adverbs • Adverbs used for describing “how often” something happens • never, sometimes, usually, always
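
To make the idea of counting frequency adverbs concrete, here is a minimal sketch that tallies the four example adverbs from this slide and normalises the counts per million tokens so that corpora of different sizes can be compared. The tokeniser (a simple regular expression), the function name and the sample sentence are assumptions for illustration only; the presentation does not describe its counting procedure.

```python
import re
from collections import Counter

# The four example adverbs named on the slide (not the full top-twenty list).
FREQUENCY_ADVERBS = ("never", "sometimes", "usually", "always")


def per_million(text, targets=FREQUENCY_ADVERBS):
    """Return the frequency per million tokens of each target word in text."""
    # Crude tokeniser: lower-case runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    counts = Counter(t for t in tokens if t in targets)
    if total == 0:
        return {w: 0.0 for w in targets}
    return {w: counts[w] * 1_000_000 / total for w in targets}


# Hypothetical sample text, nine tokens long.
sample = "I always walk. She never runs. We always talk."
rates = per_million(sample)
```

Normalising by corpus size matters here because the learner and native-speaker corpora compared later differ greatly in size, so raw counts alone would be misleading.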

  38. Top Twenty Frequency Adverbs • Most frequently used by native speakers according to the analyses of the British National Corpus (BNC) by Leech, Rayson and Wilson (2001)

  39. Top Twenty Frequency Adverbs (TTFAs)

  40. Common features • All high-frequency words • Different frequencies in speech and writing except sometimes and twice (Leech et al. 2001)

  41. A comparison of TTFAs in speech and writing • The overall difference • TTFAs are more likely to occur in writing than in speech. • The specific differences • Speech: never, always, ever, normally • Neutral: sometimes, twice • Writing: 14 words

  42. Previous corpus-based studies • e.g. Altenberg & Granger, 2001; Cobb, 2002; Ringbom, 1998; Wen, Ding, & Wang, 2003 • Conflicting finding one: overuse vs. underuse

  43. Examples • Overuse high-frequency words in writing (Cobb, 2001) • Overuse modal verbs (Aijmer, 2002) • Underuse adverbial connectors (Altenberg & Tapper, 1998) • No study on frequency adverbs

  44. Conflicting finding two • Tend to use written style features in their speech • Tend to use a mixed register in either speech or in writing • Tend to use oral style features in their writing • Did not compare the use of high-frequency words in speech with writing

  45. General purposes of this study • Whether Chinese EFL learners simply overuse the TTFAs, or overuse some while underusing others • Whether they use the TTFAs similarly or differently in speech and in writing

  46. Research questions • Do they overuse or underuse the TTFAs differently between speech and writing? • Do they differ more from native speakers in writing or in speaking with regard to the use of the TTFAs? • Do they demonstrate a similar pattern of writing-speaking difference as native speakers in the use of the TTFAs?

  47. Data for analysis

  48. Data analysis: four comparisons • Learners’ speech vs. native speakers’ speech (SECCL vs. BNCS) • Learners’ writing vs. native speakers’ writing (CLEC vs. BNCW) • The learner-native difference in speech vs. the learner-native difference in writing (SECCL vs. BNCS and CLEC vs. BNCW) • The speech-writing difference among learners vs. the speech-writing difference among native speakers (SECCL vs. CLEC and BNCS vs. BNCW)
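
Each comparison above asks whether a word is overused or underused in one corpus relative to another. The slides do not say which statistic was used, so the sketch below swaps in the log-likelihood ratio, a common choice for comparing word frequencies across corpora of unequal size. The function names and all counts are hypothetical placeholders, not figures from the study.

```python
import math


def log_likelihood(freq_a, size_a, freq_b, size_b):
    """Log-likelihood ratio for one word observed in corpus A and corpus B."""
    total = freq_a + freq_b
    # Expected counts if the word were equally frequent in both corpora.
    expected_a = size_a * total / (size_a + size_b)
    expected_b = size_b * total / (size_a + size_b)
    ll = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            ll += observed * math.log(observed / expected)
    return 2 * ll


def direction(freq_a, size_a, freq_b, size_b):
    """Label corpus A as overusing or underusing the word relative to B."""
    rel_a, rel_b = freq_a / size_a, freq_b / size_b
    if rel_a > rel_b:
        return "overuse"
    if rel_a < rel_b:
        return "underuse"
    return "same"


# Hypothetical counts: 30 hits in a 1,000-token learner sample,
# 20 hits in a 2,000-token native-speaker sample.
score = log_likelihood(30, 1000, 20, 2000)
label = direction(30, 1000, 20, 2000)
```

A larger score means the difference is less likely to be due to chance; the direction label is needed separately because the statistic itself is always non-negative.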

  49. Results (1) TTFA use in learners’ spoken corpus (SECCL)

  50. Results (2) TTFA use in learners’ written corpus (CLEC)