Chinese learner corpora and second language research

The 2006 International Symposium of Computer-Assisted Language Learning June 2-4, 2006, Beijing. Chinese learner corpora and second language research. Qiufang Wen The national research center for foreign language education, BFSU. Topics to be addressed. English corpora of Chinese learners

Presentation Transcript

The 2006 International Symposium of Computer-Assisted Language Learning

June 2-4, 2006, Beijing

Chinese learner corpora and second language research

Qiufang Wen

The National Research Center for Foreign Language Education, BFSU

Topics to be addressed
  • English corpora of Chinese learners
  • Corpus-based studies on English learners in mainland China
  • Several corpus-based studies on English learners’ interlanguage, by myself or together with my colleagues
  • Advantages and disadvantages of corpus-based studies on the interlanguage
Topic One

English corpora of Chinese learners

  • Chinese Learner English Corpus (CLEC)
  • College Learners’ Spoken English Corpus (COLSEC)
  • Spoken and Written English Corpus of Chinese Learners (SWECCL)
    • Version 1
    • Version 2 (under construction)
  • Bilingual Corpus of Chinese English Learners (BICCEL): under construction
1. Chinese Learner English Corpus (CLEC) by Gui & Yang in 2003
  • Written corpus: 1 million words
  • Timed and untimed compositions
  • Levels of proficiency
    • Middle school students
    • Non-English major (Band 4)
    • Non-English major (Band 6)
    • English majors (Band 4)
    • English majors (Band 8)
  • Error-tagged
Two Types of English Learners in University

[Diagram: English majors and non-English majors, each shown across Years 1 to 4, with the test levels Band 8, Band 6, Band 4, Band 4 and Band 2 marked against the years.]

2. College Learners’ Spoken English Corpus (COLSEC) by Yang & Wei in 2005
  • Tokens: 0.7 million
  • Source: National spoken English test for non-English majors
  • Test items
    • Teacher-student conversation
    • Student-student discussion
    • Teacher-student discussion
  • Data format: written transcripts
3. Spoken and Written English Corpus of Chinese Learners (SWECCL) by Wen, Wang & Liang in 2005 (Version 1)

[Diagram: SWECCL divides into two sub-corpora, WECCL and SECCL; the sizes shown are 1.46 million and 1.18 million.]

Spoken (SECCL)
  • Source of data
    • National spoken English test: 1996-2002
    • Second-year English majors
  • Data format
    • Digital sounds as well as transcripts of the speeches
National spoken English test for English majors — Band 4
  • Test format
    • Test in a lab
  • The number of testees annually
    • 2006: more than 16,000
    • Expect to have 50,000 in the future
  • Scoring procedures
    • A random sample (30-35 tapes)
    • Two raters scoring one tape independently
  • Number of subjects
    • 6 groups from each year (1996-2002)
    • 42 groups (30-35 students each) = about 1,400 students
    • About 230 hours’ speech
  • Testing items
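
The subject count above can be checked with a few lines of arithmetic. This is only a sketch: it assumes each sampled group holds between 30 and 35 students (the sample size given for scoring), and the midpoint is shown purely for orientation.

```python
# Figures taken from the slides; treating 30 and 35 as group-size bounds is an assumption.
years = range(1996, 2003)      # 1996-2002 inclusive, i.e. 7 test years
groups_per_year = 6
group_sizes = (30, 35)         # smallest and largest sampled group

total_groups = groups_per_year * len(years)
low, high = (total_groups * size for size in group_sizes)

print(total_groups)            # 42
print(low, high)               # 1260 1470
print((low + high) / 2)        # 1365.0
```

The range of 1,260 to 1,470 is consistent with the "about 1,400 students" reported for the spoken corpus.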
The structure of SECCL

[Diagram: structure of SECCL. Labels shown: Text, Sound files, Tagged, Article, Past Tense, Special, Whole, Raw, Task, Task A, Task B, Task C, Year, (1996-2002).]

The written component

[Diagram: the written component divides into Year 1, Year 2, Year 3 and Year 4.]
The written component
  • Source of data
    • Timed compositions in class (40 minutes, no less than 300 words)
    • Take-home compositions (no word limit)
  • Types of compositions
    • Argumentative (a list of topics provided)
    • Narrative
SWECCL in 2007 (Version 2)

[Diagram: SWECCL divides into WECCL (two million) and SECCL (two million).]

SECCL (Version 2)
  • 2003-2006 National Spoken English Test for second-year English majors (Band 4)
  • 2000-2006 National Spoken English Test for 4th-year English majors (Band 8, Task 3)
  • Longitudinal data (2001-2004)
Spoken (Band 8)
  • Testing item (Task C)
    • Make a comment on a given topic
  • Data format
    • Digital sounds as well as transcripts of the speeches
Spoken (Longitudinal)
  • 72 students 56 students
  • 40 hours’ speech
Tasks
  • Reading aloud
  • Retelling a story
  • Talking on a given topic (Narrative)
  • Talking on a given topic (argumentative)
  • Conversation (Role play)
  • Discussion on a given topic
4. Bilingual Corpus of Chinese English Learners (BICCEL)
  • Spoken
    • E-C: 0.5 million
    • C-E: 0.5 million
  • Written
    • E-C: 0.5 million
    • C-E: 0.5 million

Spoken component of BICCEL
  • National Oral English test — Band 8
    • The 4th year English majors
    • Interpreting from English to Chinese (Task A)
    • Interpreting from Chinese to English (Task B)
    • 2001-2005: 1100 testees
Written component of BICCEL
  • Source of data: in-class assignment
    • E-C and C-E translation
    • Across the 3rd and 4th years
    • 30 universities across the country
Topic Two

A brief review of corpus-based studies on Chinese learner English

Sources
  • China National Knowledge Infrastructure (CNKI) (online journals)
  • Digital dissertation database
Conferences & workshops
  • The International Conference on “Corpus Linguistics”, 25-27 October, 2003
  • The First National Symposium on Corpus Linguistics and ELT Education, 11-13 October, 2004
  • Workshop on the use of corpus in teaching and research, 17-19 March, 2006
Topic Three

Several corpus-based studies on English learners’ interlanguage by myself or together with my colleagues

Study One

Features of oral style in English compositions of advanced Chinese EFL learners

Wen, Q. F., Ding, Y. R. & Wang, W. Y. 2003. Foreign Language Teaching and Research (4): 268-274.

Study Two

A Study on Frequency Adverbs Used by Advanced English Learners in China

Wen, Q. F. & Ding, Y. R. 2004. Modern Foreign Languages (2): 141-147.

Study Three

An analysis of English Majors’ Abstracting abilities through their English compositions

Wen, Q.F. & Liu, R.Q. 2006. Foreign Languages (2)

Study Four
  • A longitudinal study on the developmental features of speaking vocabulary by English majors in mainland China

Wen, Q. F. 2006. Foreign Language Teaching and Research (3).

Study Five
  • A comparison of developmental features of Speaking and Writing vocabulary by English majors
  • Wen, Q. F. 2006. Foreign Languages and Foreign Language Teaching (4).
Study Six

Patterns of change in speaking vocabulary development by English majors

Study Two

A Study on Frequency Adverbs Used by Advanced English Learners in China

Wen, Q. F. & Ding, Y. R. 2004. Modern Foreign Languages (2): 141-147.

Frequency Adverbs
  • Adverbs used for describing “how often” something happens
  • never, sometimes, usually, always
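
As a minimal sketch of how such adverbs might be counted in a text, the snippet below tallies only the four examples listed above. The function name, the regular-expression tokenizer and the sample sentence are all assumptions made for illustration; they are not taken from the study.

```python
import re
from collections import Counter

# Hypothetical word list: only the four example adverbs named on the slide.
FREQUENCY_ADVERBS = {"never", "sometimes", "usually", "always"}

def count_frequency_adverbs(text: str) -> Counter:
    """Count case-insensitive occurrences of the listed adverbs in text."""
    # Very rough tokenizer: runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tok for tok in tokens if tok in FREQUENCY_ADVERBS)

# Invented sample sentence, not corpus data.
sample = "I always get up early. Sometimes I run, but I never skip breakfast. Always!"
counts = count_frequency_adverbs(sample)
print(counts["always"], counts["sometimes"], counts["never"], counts["usually"])
# prints: 2 1 1 0
```

A real study would work from a tokenized corpus and a full word list rather than a regular expression over raw text.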
Top Twenty Frequency Adverbs (TTFAs)
  • Most frequently used by native speakers, according to the analyses of the British National Corpus (BNC) by Leech, Rayson and Wilson (2001)

Common features
  • All high-frequency words
  • Different frequencies in speech and writing except sometimes and twice (Leech et al. 2001)
A comparison of TTFAs in speech and writing
  • The overall difference
    • TTFAs are more likely to occur in writing than in speech.
  • The specific differences
    • Speech: never, always, ever, normally
    • Neutral: sometimes, twice
    • Writing: 14 words
Previous corpus-based studies
  • e.g. Altenberg & Granger, 2001; Cobb, 2002; Ringbom, 1998; Wen, Ting, & Wang, 2003
  • Conflicting finding one: overuse vs. underuse
Examples
  • Overuse high-frequency words in writing (Cobb, 2001)
  • Overuse modal verbs (Aijmer, 2002)
  • Underuse adverbial connectors (Altenberg & Tapper, 1998)
  • No study on frequency adverbs
Conflicting finding two
  • Tend to use written style features in their speech
  • Tend to use a mixed register in either speech or in writing
  • Tend to use oral style features in their writing
  • Did not compare the use of high-frequency words in speech with writing
General purposes of this study
  • Whether Chinese EFL learners simply overuse the TTFAs, or overuse some while underusing others
  • Whether they use the TTFAs similarly or differently when their speech is compared with their writing
Research questions
  • Do they overuse or underuse the TTFAs differently between speech and writing?
  • Do they differ more from native speakers in writing or in speaking with regard to the use of the TTFAs?
  • Do they demonstrate a similar pattern of writing-speaking difference as native speakers in the use of the TTFAs?
Data analysis

Four comparisons

  • Learners’ speech and native speakers’ speech

SECCL vs. BNCS

  • Learners’ writing and native speakers’ writing

CLEC vs. BNCW

  • The learner-native difference in speech compared with the learner-native difference in writing

SECCL vs. BNCS and CLEC vs. BNCW

  • The speech-writing difference among learners compared with the speech-writing difference among native speakers

SECCL vs. CLEC and BNCS vs. BNCW
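
The slides do not say which statistic was used to decide whether a word is overused or underused. The sketch below assumes a two-corpus log-likelihood test, a common choice in corpus comparison, with 3.84 as the critical value (p < 0.05, one degree of freedom). The function names, counts and corpus sizes are invented for illustration only.

```python
import math

def per_million(count: int, corpus_size: int) -> float:
    """Normalize a raw count to occurrences per million tokens."""
    return count / corpus_size * 1_000_000

def log_likelihood(count_a: int, size_a: int, count_b: int, size_b: int) -> float:
    """Two-corpus log-likelihood (G2) for a single word."""
    total = count_a + count_b
    expected_a = size_a * total / (size_a + size_b)
    expected_b = size_b * total / (size_a + size_b)
    g2 = 0.0
    for observed, expected in ((count_a, expected_a), (count_b, expected_b)):
        if observed > 0:          # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2

def compare(count_learner: int, size_learner: int,
            count_native: int, size_native: int,
            critical: float = 3.84) -> str:
    """Label learner usage relative to the reference corpus."""
    if log_likelihood(count_learner, size_learner, count_native, size_native) < critical:
        return "similar"
    learner = per_million(count_learner, size_learner)
    native = per_million(count_native, size_native)
    return "overuse" if learner > native else "underuse"

# Invented counts for one adverb in two corpora of one million tokens each.
print(compare(500, 1_000_000, 300, 1_000_000))   # overuse
print(compare(300, 1_000_000, 500, 1_000_000))   # underuse
print(compare(400, 1_000_000, 400, 1_000_000))   # similar
```

Under these assumptions, the first two comparisons correspond to calling `compare` on a learner corpus and its native counterpart. The last two could be approached by comparing the resulting G2 values across speech and writing, or by passing a spoken and a written corpus to `log_likelihood` directly.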

Results (1)

TTFA use in learners’ spoken corpus (SECCL)

Results (2)

TTFA use in learners’ written corpus (CLEC)

Results (3)

Comparison of learners’ speech with their writing in TTFA use (Overuse)

Results (3)

Comparison (Underuse)

Results (3)

Comparison (identical or similar)

Results (4)

Speaking-writing differences in TTFA use in the CEMIC and the BNC

Results (4)

Speaking-writing differences in TTFA use in the CEMIC and the BNC

Summary (1)
  • English majors in China tend to overuse and underuse certain TTFAs in their speech and writing. The overuse tendency is stronger than the underuse tendency in both speech and writing.
Summary (2)
  • The overuse tendency is more marked in their speech than in their writing while the underuse tendency is also slightly stronger in speech than in writing. Some of the overused or underused TTFAs in speech are the same as those in writing but others are different.
Summary (3)
  • Chinese English majors demonstrate a pattern of speaking-writing difference that is opposite to that shown in the native speakers’ corpus: they tend to use more TTFAs in their speech than in their writing while native speakers tend to use more TTFAs in their writing than in their speech. This shows that Chinese EFL learners use TTFAs without awareness of their register differences.
Possible reasons
  • Limited vocabulary (Table 1b)
  • Use them as “time buyers”
  • Without equivalents readily available in Chinese
Topic Four

Advantages and disadvantages of corpus-based studies on SLA

Advantage One
  • A large sample stored electronically and open to the public
    • Validity and reliability (replicable)
    • Possible for a diachronic study
Advantage Two
  • Using computer software such as WordSmith
    • Effectiveness and efficiency
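
To give a sense of the kind of task such software automates, here is a minimal keyword-in-context (concordance) sketch in plain Python. It is not WordSmith's implementation or interface; the function name, tokenizer and sample sentence are assumptions made for illustration.

```python
import re

def kwic(text: str, keyword: str, width: int = 3) -> list[str]:
    """Return each hit of keyword with up to `width` words of context per side."""
    tokens = re.findall(r"[A-Za-z']+", text)
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines

# Invented sample sentence, not corpus data.
sample = "I always get up early and I always read the news"
for line in kwic(sample, "always", width=2):
    print(line)
# I [always] get up
# and I [always] read the
```

Running the same search over a corpus of a million or more words by hand would be impractical, which is the efficiency gain the slide points to.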
Advantage Three
  • Understand the learner language from a different perspective
    • Correct vs. incorrect
    • More acceptable vs. less acceptable
    • Frequency
      • Overuse
      • Underuse
      • Non-use
Closing Remark
  • The number of researchers increasing
  • Constructing different types of corpora
  • Carrying out corpus-based studies
  • Findings useful for textbook writers as well as for practitioners