Published by Magnus Walker. Modified over 9 years ago.
1
Chinese learner corpora and second language research
Qiufang Wen
The National Research Center for Foreign Language Education, BFSU
The 2006 International Symposium of Computer-Assisted Language Learning, June 2-4, 2006, Beijing
2
Topics to be addressed
– English corpora of Chinese learners
– Corpus-based studies on English learners in mainland China
– Several corpus-based studies on English learners’ interlanguage by myself or together with my colleagues
– Advantages and disadvantages of corpus-based studies on the interlanguage
3
Topic One English corpora of Chinese learners
4
Chinese Learner English Corpus (CLEC)
College Learners’ Spoken English Corpus (COLSEC)
Spoken and Written English Corpus of Chinese Learners (SWECCL)
– Version 1
– Version 2 (under construction)
Bilingual Corpus of Chinese English Learners (BICCEL): under construction
5
1. Chinese Learner English Corpus (CLEC) by Gui & Yang in 2003
Written corpus: 1 million words
Timed and untimed compositions
Levels of proficiency
– Middle school students
– Non-English majors (Band 4)
– Non-English majors (Band 6)
– English majors (Band 4)
– English majors (Band 8)
Error-tagged
6
Two types of English learners in university
English majors (Year 1 to Year 4): Band 4, Band 8
Non-English majors (Year 1 to Year 4): Band 2, Band 4, Band 6
7
2. College Learners’ Spoken English Corpus (COLSEC) by Yang & Wei in 2005
Tokens: 0.7 million
Source: National spoken English test for non-English majors
Test items
– Teacher-student conversation
– Student-student discussion
– Teacher-student discussion
Data format: written transcripts
8
3. Spoken and Written English Corpus of Chinese Learners (SWECCL) by Wen, Wang & Liang in 2005 (Version 1)
SWECCL: WECCL, SECCL
1.18 million; 1.46 million
9
Spoken (SECCL) Source of data – National spoken English test: 1996-2002 – Second-year English majors Data format – Digital sounds as well as transcripts of the speeches
10
National spoken English test for English majors — Band 4 Test format – Test in a lab The number of testees annually – 2006: more than 16,000 – Expect to have 50,000 in the future Scoring procedures – A random sample (30-35 tapes) – Two raters scoring one tape independently
11
Number of subjects
– 6 groups from each year (1996-2002)
– 42 groups (30/35) = about 1400 students
– About 230 hours of speech
Testing items
12
Task: content; preparation; time
– Retelling: a story; listen twice but no preparation; 3 min.
– Monologue: personal experience; 3 min.
– Role play: about an issue in daily life; 3 min.; 4 min.
13
The structure of SECCL SECCL Text Tagged Raw Special Article Past Tense Whole Task Year Task A Task B Task C Sound files (1996-2002)
14
The written component
Written: Year 1, Year 2, Year 3, Year 4
15
The written component Source of data – Timed compositions in class (40 minutes, no less than 300 words) – Take-home compositions (no word limit) Types of compositions – Argumentative (a list of topics provided) – Narrative
16
SWECCL in 2007 (Version 2)
SWECCL: WECCL, SECCL
Two million
17
SECCL (Version 2)
– 2003-2006 National Spoken English Test for second-year English majors (Band 4)
– 2000-2006 National Spoken English Test for 4th-year English majors, Band 8 (Task 3)
– Longitudinal data (2001-2004)
18
Spoken (Band 8) Testing item (Task C) – Make a comment on a given topic Data format – Digital sounds as well as transcripts of the speeches
19
Spoken (Longitudinal)
72 students; 56 students; 40 hours’ speech
Data collection time: Year 1, 2001; Year 2, 2002; Year 3, 2003; Year 4, 2004
20
Tasks Reading aloud Retelling a story Talking on a given topic (Narrative) Talking on a given topic (argumentative) Conversation (Role play) Discussion on a given topic
21
4. Bilingual Corpus of Chinese English Learners (BICCEL)
BICCEL
– Spoken: E-C, C-E
– Written: E-C, C-E
0.5 million
22
Spoken component of BICCEL
National Oral English test (Band 8)
– The 4th-year English majors
– Interpreting from English to Chinese (Task A)
– Interpreting from Chinese to English (Task B)
– 2001-2005: 1100 testees
23
Written component of BICCEL Source of data: in-class assignment – E-C and C-E translation – Across the 3rd and 4th years – 30 universities across the country
24
Topic Two A brief review of corpus-based studies on Chinese learner English
25
Sources China National Knowledge Infrastructure (CNKI)(On-line journals) Digital dissertation database
26
Corpus-based studies in mainland China
Year     Articles   Dissertations
2006     9          7
2005     40         28
2004     29         17
2003     8          5
2002     6          5
2001     6          1
2000     1          0
Total    99         63
27
Research areas
Area           Articles   Dissertations   Total
Phonological   5          1               6
Lexical        43         48              91
Grammatical    27         8               35
Discourse      8          2               10
Others         16         4               20
Total          99         63              162
28
Conferences & workshop
– The International Conference on “Corpus Linguistics”, 25-27 October, 2003
– The First National Symposium on Corpus Linguistics and ELT Education, 11-13 October, 2004
– Workshop on the use of corpus in teaching and research, 17-19 March, 2006
29
Topic Three Several corpus-based studies on English learners’ interlanguage by myself or together with my colleagues
30
Study One Features of oral style in English compositions of advanced Chinese EFL learners (Wen, Q. F., Ding, Y. R. & Wang, W. Y. 2003. Foreign Language Teaching & Research (4): 268-274)
31
Study Two A Study on Frequency Adverbs Used by Advanced English Learners in China Wen, Q. F. & Ding, Y. R. 2004. Modern Foreign Languages (2): 141-147.
32
Study Three An analysis of English Majors’ Abstracting abilities through their English compositions Wen, Q.F. & Liu, R.Q. 2006. Foreign Languages (2)
33
Study Four A longitudinal study on the developmental features of speaking vocabulary by English majors in mainland China Wen, Q. F. 2006. Foreign Language Teaching and Research (3).
34
Study Five A comparison of developmental features of Speaking and Writing vocabulary by English majors Wen, Q. F. 2006. Foreign languages and Foreign Language Teaching (4)
35
Study Six Patterns of change in speaking vocabulary development by English majors
36
Study Two A Study on Frequency Adverbs Used by Advanced English Learners in China Wen, Q. F. & Ding, Y. R. 2004. Modern Foreign Languages (2): 141-147.
37
Frequency Adverbs Adverbs used for describing “how often” something happens never, sometimes, usually, always
38
Top Twenty Frequency Adverbs Most frequently used by native speakers according to the analyses of the British National Corpus (BNC) by Leech, Rayson and Wilson (2001)
39
Top Twenty Frequency Adverbs (TTFAs), by level of vocabulary
– 1000-word level (15): never, always, often, ever, *sometimes, usually, once, generally, hardly, no longer, increasingly, *twice, in general, occasionally, mostly
– 2000-word level (3): frequently, rarely, regularly
– Academic word list (2): normally, constantly
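To make the list above concrete, here is a minimal sketch of how occurrences of the twenty adverbs might be counted in a text. The function name `count_ttfas` and the sample sentence are illustrative assumptions, not part of the study; plain surface matching like this cannot tell frequency uses from other uses (for example "once" as a conjunction), so real counts would need manual checking or tagging.

```python
import re
from collections import Counter

# The twenty frequency adverbs listed on the slide (two are multiword units).
TTFAS = [
    "never", "always", "often", "ever", "sometimes", "usually", "once",
    "generally", "hardly", "no longer", "increasingly", "twice",
    "in general", "occasionally", "mostly", "frequently", "rarely",
    "regularly", "normally", "constantly",
]

# Longest items first, with word boundaries, so "never" is not counted as "ever".
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(w) for w in sorted(TTFAS, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def count_ttfas(text: str) -> Counter:
    """Count case-insensitive surface matches of each TTFA in `text` (hypothetical helper)."""
    return Counter(m.group(1).lower() for m in _PATTERN.finditer(text))


# Invented sample sentence, for illustration only.
sample = "I always get up early. In general, I never skip breakfast, and I no longer eat late."
counts = count_ttfas(sample)
```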
40
Common features All high-frequency words (Leech et al. 2001) Different frequencies in speech and writing except sometimes and twice (Leech et al. 2001)
41
A comparison of TTFAs in speech and writing
The overall difference: TTFAs are more likely to occur in writing than in speech.
The specific differences
– Speech: never, always, ever, normally
– Neutral: sometimes, twice
– Writing: 14 words
42
Previous corpus-based studies
e.g. Altenberg & Granger, 2001; Cobb, 2002; Ringbom, 1998; Wen, Ting, & Wang, 2003
Conflicting finding one: overuse vs. underuse
43
Examples Overuse high-frequency words in writing (Cobb, 2001) Overuse modal verbs (Aijmer, 2002) Underuse adverbial connectors (Altenberg & Tapper, 1998) No study on frequency adverbs
44
Conflicting finding two
– Tend to use written style features in their speech
– Tend to use a mixed register in either speech or in writing
– Tend to use oral style features in their writing
Did not compare the use of high-frequency words in speech with writing
45
General purposes of this study
– Whether Chinese EFL learners simply overuse the TTFAs, or overuse some while underusing others
– Whether they use the TTFAs similarly or differently when their speech is compared with their writing
46
Research questions Do they overuse or underuse the TTFAs differently between speech and writing? Do they differ more from native speakers in writing or in speaking with regard to the use of the TTFAs? Do they demonstrate a similar pattern of writing-speaking difference as native speakers in the use of the TTFAs?
47
Data for analysis
The learner corpus: the corpus of English majors in China (CEMIC), 955,043 words
– Spoken (SECCL): 473,408 words
– Written (CLEC): 481,635 words
The native-speaker corpus: the British National Corpus (BNC), 100 million words
– Spoken (BNCS): 10 million words
– Written (BNCW): 90 million words
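Because these corpora differ greatly in size, raw counts cannot be compared directly. A common step is to normalize to a rate per million words; the sketch below assumes that approach (the slides do not spell out the normalization used). The corpus sizes are the ones given above, with the BNC figures rounded as on the slide; the raw counts are invented purely for illustration.

```python
# Corpus sizes in words, as given on the slide (BNC figures are rounded).
CORPUS_SIZES = {
    "SECCL": 473_408,
    "CLEC": 481_635,
    "BNCS": 10_000_000,
    "BNCW": 90_000_000,
}


def per_million(raw_count: int, corpus: str) -> float:
    """Convert a raw count into occurrences per million words (hypothetical helper)."""
    return raw_count / CORPUS_SIZES[corpus] * 1_000_000


# Hypothetical raw counts, NOT figures from the study.
learner_rate = per_million(500, "SECCL")   # a little over 1,000 per million
native_rate = per_million(5_000, "BNCS")   # 500 per million
```

With these invented numbers the learner count is ten times smaller in absolute terms, yet the normalized rate is roughly twice the native-speaker rate, which is why normalization has to come before any overuse or underuse judgment.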
48
Data analysis: four comparisons
1. Learners’ speech and native speakers’ speech: SECCL vs. BNCS
2. Learners’ writing and native speakers’ writing: CLEC vs. BNCW
3. The difference between learners’ and native speakers’ speech and the difference between learners’ and native speakers’ writing: (SECCL vs. BNCS) and (CLEC vs. BNCW)
4. The difference between learners’ speech and writing and the difference between native speakers’ speech and writing: (SECCL vs. CLEC) and (BNCS vs. BNCW)
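The slides do not say which statistic was used to decide whether a word counts as overused or underused. One widely used option for comparing a word's frequency across two corpora is the log-likelihood statistic, sketched below as an assumption rather than as the study's method. The function names, the 3.84 cutoff (the chi-square critical value for p < 0.05 with one degree of freedom), and the example counts are all illustrative.

```python
import math


def log_likelihood(count_a: int, size_a: int, count_b: int, size_b: int) -> float:
    """Two-corpus log-likelihood for one word: 2 * sum(O * ln(O / E))."""
    total = count_a + count_b
    expected_a = size_a * total / (size_a + size_b)
    expected_b = size_b * total / (size_a + size_b)
    ll = 0.0
    for observed, expected in ((count_a, expected_a), (count_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            ll += observed * math.log(observed / expected)
    return 2 * ll


def tendency(count_learner: int, size_learner: int,
             count_native: int, size_native: int,
             critical: float = 3.84) -> str:
    """Label a word 'overuse', 'underuse', or 'similar' (assumed decision rule)."""
    ll = log_likelihood(count_learner, size_learner, count_native, size_native)
    if ll < critical:
        return "similar"
    learner_rate = count_learner / size_learner
    native_rate = count_native / size_native
    return "overuse" if learner_rate > native_rate else "underuse"


# Hypothetical counts against the SECCL and BNCS sizes from the slides.
label_high = tendency(500, 473_408, 5_000, 10_000_000)
label_low = tendency(50, 473_408, 5_000, 10_000_000)
```

This covers comparisons 1 and 2 directly. Comparisons 3 and 4 set two such results side by side, so they can reuse the same function on each pair of corpora.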
49
Results (1): TTFA use in learners’ spoken corpus (SECCL)
– Overuse (6 words/407 occurrences): always, once, often, sometimes, usually, hardly
– Underuse (10 words/48 occurrences): normally, never, ever, twice, generally, in general, occasionally, no longer, constantly, increasingly
50
Results (2): TTFA use in learners’ written corpus (CLEC)
– Overuse (9 words/125 occurrences): always, sometimes, usually, no longer, never, once, often, generally, mostly
– Underuse (9 words/37 occurrences): constantly, occasionally, ever, regularly, rarely, frequently, twice, increasingly, normally
51
Results (3): Comparison of learners’ speech with their writing in TTFA use (overuse)
– Spoken, SECCL vs. BNCS (6 words; frequency difference 407): always, once, often, sometimes, usually, hardly
– Written, CLEC vs. BNCW (9 words; frequency difference 125): always, sometimes, usually, no longer, never, once, often, generally, mostly
52
Results (3): Comparison (underuse)
– Spoken, SECCL vs. BNCS (10 words; frequency difference -48): normally, never, ever, twice, generally, in general, occasionally, no longer, constantly, increasingly
– Written, CLEC vs. BNCW (9 words; frequency difference -37): normally, increasingly, twice, frequently, rarely, regularly, ever, occasionally, constantly
53
Results (3): Comparison (identical or similar)
– Spoken, SECCL vs. BNCS (4 words; frequency difference -4): frequently, regularly, rarely, mostly
– Written, CLEC vs. BNCW (2 words; frequency difference 3): in general, hardly
54
Results (4): Speaking-writing differences in TTFA use in the CEMIC and the BNC
Register-neutral
– BNC (2): twice, sometimes
– CEMIC (6): constantly, never, regularly, rarely, increasingly, normally
Spoken-register sensitive
– BNC (4): never, always, normally, ever
– CEMIC (5): always, once, often, sometimes, hardly
55
Results (4): Speaking-writing differences in TTFA use in the CEMIC and the BNC
Written-register sensitive
– BNC (14): often, once, no longer, generally, increasingly, usually, frequently, hardly, rarely, regularly, constantly, in general, occasionally, mostly
– CEMIC (9): no longer, generally, usually, in general, ever, mostly, occasionally, frequently, twice
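The two Results (4) slides can be cross-checked by asking how many adverbs fall into the same register category in both corpora. The sketch below simply transcribes the word lists from the slides into sets; the variable and function names are illustrative assumptions.

```python
# Register categories as listed on the two Results (4) slides.
BNC = {
    "neutral": {"twice", "sometimes"},
    "spoken": {"never", "always", "normally", "ever"},
    "written": {
        "often", "once", "no longer", "generally", "increasingly", "usually",
        "frequently", "hardly", "rarely", "regularly", "constantly",
        "in general", "occasionally", "mostly",
    },
}
CEMIC = {
    "neutral": {"constantly", "never", "regularly", "rarely", "increasingly", "normally"},
    "spoken": {"always", "once", "often", "sometimes", "hardly"},
    "written": {
        "no longer", "generally", "usually", "in general", "ever", "mostly",
        "occasionally", "frequently", "twice",
    },
}


def same_category(a: dict, b: dict) -> dict:
    """For each register category, return the words both corpora place in it."""
    return {category: a[category] & b[category] for category in a}


agreement = same_category(BNC, CEMIC)
shared = sorted(set().union(*agreement.values()))
```

On these lists only 8 of the 20 adverbs (always, plus seven written-register items) land in the same category in both corpora, and no adverb is register-neutral in both.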
56
Summary (1) English majors in China tend to overuse and underuse certain TTFAs in their speech and writing. The overuse tendency is stronger than the underuse tendency in both speech and writing.
57
Summary (2) The overuse tendency is more marked in their speech than in their writing while the underuse tendency is also slightly stronger in speech than in writing. Some of the overused or underused TTFAs in speech are the same as those in writing but others are different.
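The second point in Summary (2) can be checked against the word lists in Results (1) and Results (2) with simple set operations. This is a sketch using those lists as transcribed from the slides; the names `OVERUSE`, `UNDERUSE` and `compare` are illustrative assumptions.

```python
# Word lists from Results (1) (speech, SECCL) and Results (2) (writing, CLEC).
OVERUSE = {
    "speech": {"always", "once", "often", "sometimes", "usually", "hardly"},
    "writing": {"always", "sometimes", "usually", "no longer", "never", "once",
                "often", "generally", "mostly"},
}
UNDERUSE = {
    "speech": {"normally", "never", "ever", "twice", "generally", "in general",
               "occasionally", "no longer", "constantly", "increasingly"},
    "writing": {"constantly", "occasionally", "ever", "regularly", "rarely",
                "frequently", "twice", "increasingly", "normally"},
}


def compare(sets: dict) -> dict:
    """Split a tendency's words into shared, speech-only and writing-only groups."""
    speech, writing = sets["speech"], sets["writing"]
    return {
        "both": speech & writing,
        "speech_only": speech - writing,
        "writing_only": writing - speech,
    }


overuse = compare(OVERUSE)
underuse = compare(UNDERUSE)
# Words overused in writing but underused in speech.
opposite = OVERUSE["writing"] & UNDERUSE["speech"]
```

On these lists five adverbs are overused in both modes and six are underused in both, while never, no longer and generally go in opposite directions: overused in writing but underused in speech.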
58
Summary (3) Chinese English majors demonstrate a pattern of speaking-writing difference that is opposite to that shown in the native speakers’ corpus: they tend to use more TTFAs in their speech than in their writing while native speakers tend to use more TTFAs in their writing than in their speech. This shows that Chinese EFL learners use TTFAs without awareness of their register differences.
59
Possible reasons Limited vocabulary (Table 1b) Use them as “time buyers” Without equivalents readily available in Chinese
60
Topic Four Advantages and disadvantages of corpus-based studies on SLA
61
Advantage One A large sample stored electronically and open to the public – Validity and reliability (replicable) – Possible for a diachronic study
62
Advantage Two Using computer software such as WordSmith – Effectiveness and efficiency
63
Advantage Three
Understand the learner language from a different perspective
– Correct vs. incorrect
– More acceptable vs. less acceptable
– Frequency: overuse, underuse, non-use
64
Disadvantages
Can               Cannot
Product           Process
Productive        Receptive
Group patterns    Individual differences
Language use      Language knowledge
65
Closing Remark
– The number of researchers is increasing
– Constructing different types of corpora
– Carrying out corpus-based studies
– Findings useful for textbook writers as well as for practitioners
66
Thank you!!!