
1 Qiufang Wen, The National Research Center for Foreign Language Education, BFSU
Chinese learner corpora and second language research
The 2006 International Symposium of Computer-Assisted Language Learning, June 2-4, 2006, Beijing

2 Topics to be addressed
– English corpora of Chinese learners
– Corpus-based studies on English learners in mainland China
– Several corpus-based studies on English learners’ interlanguage by myself or together with my colleagues
– Advantages and disadvantages of corpus-based studies on interlanguage

3 Topic One English corpora of Chinese learners

4 Chinese Learner English Corpus (CLEC); College Learners’ Spoken English Corpus (COLSEC); Spoken and Written English Corpus of Chinese Learners (SWECCL), Version 1 and Version 2 (under construction); Bilingual Corpus of Chinese English Learners (BICCEL), under construction

5 1. Chinese Learner English Corpus (CLEC) by Gui & Yang in 2003
Written corpus: 1 million words
Timed and untimed compositions
Levels of proficiency
– Middle school students
– Non-English majors (Band 4)
– Non-English majors (Band 6)
– English majors (Band 4)
– English majors (Band 8)
Error-tagged

6 Two Types of English Learners in University
English majors (Years 1-4): Band 4, Band 8
Non-English majors (Years 1-4): Band 2, Band 4, Band 6

7 2. College Learners’ Spoken English Corpus (COLSEC) by Yang & Wei in 2005
Tokens: 0.7 million
Source: National spoken English test for non-English majors
Test items
– Teacher-student conversation
– Student-student discussion
– Teacher-student discussion
Data format: written transcripts

8 3. Spoken and Written English Corpus of Chinese Learners (SWECCL) by Wen, Wang & Liang in 2005 (Version 1)
SWECCL = WECCL (written) + SECCL (spoken); 1.18 million, 1.46 million

9 Spoken (SECCL)
Source of data
– National spoken English test: 1996-2002
– Second-year English majors
Data format
– Digital sound files as well as transcripts of the speeches

10 National spoken English test for English majors: Band 4
Test format
– Test in a lab
Number of testees annually
– 2006: more than 16,000
– Expected to reach 50,000 in the future
Scoring procedures
– A random sample (30-35 tapes)
– Two raters score each tape independently

11 Number of subjects
– 6 groups from each year (1996-2002)
– 42 groups (30-35 students each) = about 1,400 students
– About 230 hours of speech
Testing items

12 Task: content; preparation; time
Retelling: a story; listen twice but no preparation; 3 min.
Monologue: personal experience; 3 min.
Role play: about an issue in daily life; 3 min.; 4 min.

13 The structure of SECCL (diagram; node labels: SECCL, Text, Tagged, Raw, Special, Article, Past Tense, Whole, Task, Year, Task A, Task B, Task C, Sound files (1996-2002))

14 The written component: Year 1, Year 2, Year 3, Year 4

15 The written component
Source of data
– Timed compositions in class (40 minutes, no fewer than 300 words)
– Take-home compositions (no word limit)
Types of compositions
– Argumentative (a list of topics provided)
– Narrative

16 SWECCL in 2007 (Version 2)
SWECCL = WECCL + SECCL; two million

17 SECCL (Version 2)
– 2003-2006 National Spoken English Test for second-year English majors (Band 4)
– 2000-2006 National Spoken English Test for fourth-year English majors, Band 8 (Task 3)
– Longitudinal data (2001-2004)

18 Spoken (Band 8)
Testing item (Task C)
– Make a comment on a given topic
Data format
– Digital sound files as well as transcripts of the speeches

19 Spoken (Longitudinal)
72 students; 56 students; 40 hours of speech
Year of study | Year 1 | Year 2 | Year 3 | Year 4
Data collection time | 2001 | 2002 | 2003 | 2004

20 Tasks
– Reading aloud
– Retelling a story
– Talking on a given topic (narrative)
– Talking on a given topic (argumentative)
– Conversation (role play)
– Discussion on a given topic

21 4. Bilingual Corpus of Chinese English Learners (BICCEL)
BICCEL: Spoken (E-C, C-E) and Written (E-C, C-E); 0.5 million

22 Spoken component of BICCEL
National Oral English test, Band 8
– Fourth-year English majors
– Interpreting from English to Chinese (Task A)
– Interpreting from Chinese to English (Task B)
– 2001-2005: 1,100 testees

23 Written component of BICCEL
Source of data: in-class assignments
– E-C and C-E translation
– Across the 3rd and 4th years
– 30 universities across the country

24 Topic Two A brief review of corpus-based studies on Chinese learner English

25 Sources
– China National Knowledge Infrastructure (CNKI) (online journals)
– Digital dissertation database

26 Corpus-based studies in mainland China
Year | Articles | Dissertations
2006 | 9 | 7
2005 | 40 | 28
2004 | 29 | 17
2003 | 8 | 5
2002 | 6 | 5
2001 | 6 | 1
2000 | 1 | 0
Total | 99 | 63

27 Research areas
Area | Articles | Dissertations | Total
Phonological | 5 | 1 | 6
Lexical | 43 | 48 | 91
Grammatical | 27 | 8 | 35
Discourse | 8 | 2 | 10
Others | 16 | 4 | 20
Total | 99 | 63 | 162

28 Conferences & workshops
– The International Conference on “Corpus Linguistics”, 25-27 October 2003
– The First National Symposium on Corpus Linguistics and ELT Education, 11-13 October 2004
– Workshop on the use of corpora in teaching and research, 17-19 March 2006

29 Topic Three Several corpus-based studies on English learners’ interlanguage by myself or together with my colleagues

30 Study One Features of oral style in English compositions of advanced Chinese EFL learners. Wen, Q. F., Ding, Y. R. & Wang, W. Y. 2003. Foreign Language Teaching & Research (4): 268-274.

31 Study Two A Study on Frequency Adverbs Used by Advanced English Learners in China. Wen, Q. F. & Ding, Y. R. 2004. Modern Foreign Languages (2): 141-147.

32 Study Three An analysis of English Majors’ Abstracting abilities through their English compositions Wen, Q.F. & Liu, R.Q. 2006. Foreign Languages (2)

33 Study Four A longitudinal study on the developmental features of speaking vocabulary by English majors in mainland China Wen, Q. F. 2006. Foreign Language Teaching and Research (3).

34 Study Five A comparison of developmental features of Speaking and Writing vocabulary by English majors Wen, Q. F. 2006. Foreign languages and Foreign Language Teaching (4)

35 Study Six Patterns of change in speaking vocabulary development by English majors

36 Study Two A Study on Frequency Adverbs Used by Advanced English Learners in China. Wen, Q. F. & Ding, Y. R. 2004. Modern Foreign Languages (2): 141-147.

37 Frequency Adverbs
Adverbs used for describing “how often” something happens, e.g. never, sometimes, usually, always

38 Top Twenty Frequency Adverbs Most frequently used by native speakers according to the analyses of the British National Corpus (BNC) by Leech, Rayson and Wilson (2001)

39 Top Twenty Frequency Adverbs (TTFAs)
Level of vocabulary | Frequency adverbs | No.
1000-word level | never, always, often, ever, *sometimes, usually, once, generally, hardly, no longer, increasingly, *twice, in general, occasionally, mostly | 15
2000-word level | frequently, rarely, regularly | 3
Academic Word List | normally, constantly | 2
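As a rough illustration of how the twenty adverbs in this table could be counted in a piece of learner text, here is a minimal Python sketch. It is not the procedure used in the study (the slides do not describe one); the function name `count_ttfas` and the sample sentence are hypothetical. It matches surface strings only, so it cannot separate frequency uses of words such as "once" or "ever" from their other senses.

```python
import re
from collections import Counter

# The twenty adverbs listed on the slide (after Leech et al. 2001).
TTFAS = [
    "never", "always", "often", "ever", "sometimes", "usually", "once",
    "generally", "hardly", "no longer", "increasingly", "twice",
    "in general", "occasionally", "mostly",   # 1000-word level
    "frequently", "rarely", "regularly",      # 2000-word level
    "normally", "constantly",                 # Academic Word List
]

# Longest items first, so multi-word adverbs are tried before shorter ones.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(w) for w in sorted(TTFAS, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def count_ttfas(text: str) -> Counter:
    """Count surface occurrences of each TTFA in `text`, ignoring case."""
    return Counter(m.group(1).lower() for m in _PATTERN.finditer(text))


# Hypothetical sample sentence, for illustration only.
sample = "I always walk. Sometimes I am late, but in general I am never late."
print(count_ttfas(sample))
```

Whole-word matching keeps "in general" and "generally" apart, which matters because both are on the list.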

40 Common features
– All are high-frequency words (Leech et al. 2001)
– All have different frequencies in speech and writing, except sometimes and twice (Leech et al. 2001)

41 A comparison of TTFAs in speech and writing
The overall difference
– TTFAs are more likely to occur in writing than in speech.
The specific differences
– Speech: never, always, ever, normally
– Neutral: sometimes, twice
– Writing: 14 words

42 Previous corpus-based studies
e.g. Altenberg & Granger, 2001; Cobb, 2002; Ringbom, 1998; Wen, Ting, & Wang, 2003
Conflicting finding one: overuse vs. underuse

43 Examples
– Overuse of high-frequency words in writing (Cobb, 2001)
– Overuse of modal verbs (Aijmer, 2002)
– Underuse of adverbial connectors (Altenberg & Tapper, 1998)
– No study on frequency adverbs

44 Conflicting finding two
– Learners tend to use written style features in their speech
– Learners tend to use a mixed register in either speech or writing
– Learners tend to use oral style features in their writing
– Previous studies did not compare the use of high-frequency words in speech with that in writing

45 General purposes of this study
– Whether Chinese EFL learners simply overuse the TTFAs, or overuse some while underusing others
– Whether they use the TTFAs similarly or differently when their speech is compared with their writing

46 Research questions
– Do they overuse or underuse the TTFAs differently in speech and in writing?
– Do they differ more from native speakers in writing or in speaking with regard to the use of the TTFAs?
– Do they demonstrate a pattern of writing-speaking difference similar to that of native speakers in the use of the TTFAs?

47 Data for analysis
Corpus | Spoken | Written | Total
Learner corpus: the Corpus of English Majors in China (CEMIC) | SECCL: 473,408 words | CLEC: 481,635 words | 955,043 words
Native-speaker corpus: the British National Corpus (BNC) | BNCS: 10 million words | BNCW: 90 million words | 100 million words

48 Data analysis
Four comparisons
– Learners’ speech and native speakers’ speech: SECCL vs. BNCS
– Learners’ writing and native speakers’ writing: CLEC vs. BNCW
– The learner/native-speaker difference in speech and the learner/native-speaker difference in writing: (SECCL vs. BNCS) and (CLEC vs. BNCW)
– The speech/writing difference among learners and the speech/writing difference among native speakers: (SECCL vs. CLEC) and (BNCS vs. BNCW)
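Because the learner and native-speaker corpora differ greatly in size, comparisons of this kind are normally made on normalised frequencies rather than raw counts. The slides do not say which statistic the study used, so the sketch below is only one common option from corpus linguistics: frequencies per million words plus a log-likelihood test, with 3.84 as the critical value for p < 0.05 at one degree of freedom. The function names are hypothetical, the word counts in the usage lines are invented for illustration, and only the two corpus sizes come from the slides.

```python
import math


def per_million(count: int, corpus_size: int) -> float:
    """Frequency normalised to occurrences per million words."""
    return count / corpus_size * 1_000_000


def log_likelihood(count_a: int, size_a: int, count_b: int, size_b: int) -> float:
    """Log-likelihood (G2) for one word's counts in two corpora."""
    total = count_a + count_b
    expected_a = size_a * total / (size_a + size_b)
    expected_b = size_b * total / (size_a + size_b)
    g2 = 0.0
    if count_a > 0:
        g2 += count_a * math.log(count_a / expected_a)
    if count_b > 0:
        g2 += count_b * math.log(count_b / expected_b)
    return 2 * g2


def tendency(learner_count: int, learner_size: int,
             native_count: int, native_size: int,
             critical: float = 3.84) -> str:
    """Label a word as 'overuse', 'underuse' or 'similar' for the learners."""
    g2 = log_likelihood(learner_count, learner_size, native_count, native_size)
    if g2 < critical:
        return "similar"
    learner_rate = per_million(learner_count, learner_size)
    native_rate = per_million(native_count, native_size)
    return "overuse" if learner_rate > native_rate else "underuse"


SECCL_SIZE = 473_408     # from the slide
BNCS_SIZE = 10_000_000   # from the slide (rounded figure)

# Invented counts, for illustration only; not data from the study.
print(tendency(600, SECCL_SIZE, 5_000, BNCS_SIZE))
print(tendency(10, SECCL_SIZE, 2_000, BNCS_SIZE))
```

The same function could be applied to CLEC vs. BNCW, or to SECCL vs. CLEC, by swapping in the relevant counts and corpus sizes.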

49 Results (1): TTFA use in learners’ spoken corpus (SECCL)
Tendency | Words
Overuse | always, once, often, sometimes, usually, hardly (6 words / 407 occurrences)
Underuse | normally, never, ever, twice, generally, in general, occasionally, no longer, constantly, increasingly (10 words / 48 occurrences)

50 Results (2): TTFA use in learners’ written corpus (CLEC)
Tendency | Words
Overuse | always, sometimes, usually, no longer, never, once, often, generally, mostly (9 words / 125 occurrences)
Underuse | constantly, occasionally, ever, regularly, rarely, frequently, twice, increasingly, normally (9 words / 37 occurrences)

51 Results (3): Comparison of learners’ speech with their writing in TTFA use (Overuse)
Tendency | Words | Frequency difference
SECCL vs. BNCS (Spoken) | always, once, often, sometimes, usually, hardly (6) | 407
CLEC vs. BNCW (Written) | always, sometimes, usually, no longer, never, once, often, generally, mostly (9) | 125

52 Results (3): Comparison (Underuse)
Tendency | Words | Frequency difference
SECCL vs. BNCS (Spoken) | normally, never, ever, twice, generally, in general, occasionally, no longer, constantly, increasingly (10) | -48
CLEC vs. BNCW (Written) | normally, increasingly, twice, frequently, rarely, regularly, ever, occasionally, constantly (9) | -37

53 Results (3): Comparison (identical or similar)
Tendency | Words | Frequency difference
SECCL vs. BNCS (Spoken) | frequently, regularly, rarely, mostly (4) | -4
CLEC vs. BNCW (Written) | in general, hardly (2) | 3

54 Results (4): Speaking-writing differences in TTFA use in the CEMIC and the BNC
Corpus | Register-neutral | Spoken-register sensitive
BNC | twice, sometimes (2) | never, always, normally, ever (4)
CEMIC | constantly, never, regularly, rarely, increasingly, normally (6) | always, once, often, sometimes, hardly (5)

55 Results (4): Speaking-writing differences in TTFA use in the CEMIC and the BNC
Corpus | Written-register sensitive
BNC | often, once, no longer, generally, increasingly, usually, frequently, hardly, rarely, regularly, constantly, in general, occasionally, mostly (14)
CEMIC | no longer, generally, usually, in general, ever, mostly, occasionally, frequently, twice (9)
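The two Results (4) slides can be read side by side by asking, for each adverb, whether it falls in the same register category in both corpora. The short Python sketch below does that using only the word lists shown on the slides; the variable and function names are hypothetical, and the categories are taken as given rather than recomputed.

```python
# Register categories copied from the two Results (4) slides.
BNC = {
    "neutral": ["twice", "sometimes"],
    "spoken": ["never", "always", "normally", "ever"],
    "written": ["often", "once", "no longer", "generally", "increasingly",
                "usually", "frequently", "hardly", "rarely", "regularly",
                "constantly", "in general", "occasionally", "mostly"],
}
CEMIC = {
    "neutral": ["constantly", "never", "regularly", "rarely",
                "increasingly", "normally"],
    "spoken": ["always", "once", "often", "sometimes", "hardly"],
    "written": ["no longer", "generally", "usually", "in general", "ever",
                "mostly", "occasionally", "frequently", "twice"],
}


def labels(table: dict) -> dict:
    """Invert {register: [words]} into {word: register}."""
    return {word: register for register, words in table.items() for word in words}


bnc, cemic = labels(BNC), labels(CEMIC)

# Adverbs placed in the same category in both corpora, and those that differ.
same = sorted(w for w in bnc if cemic[w] == bnc[w])
different = sorted(w for w in bnc if cemic[w] != bnc[w])

print(len(same), same)
print(len(different), different)
```

On these lists, 8 of the 20 adverbs share a category across the two corpora ("always" plus seven written-register items), and the other 12 differ.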

56 Summary (1)
– English majors in China tend to overuse some TTFAs and underuse others in both their speech and their writing.
– The overuse tendency is stronger than the underuse tendency in both speech and writing.

57 Summary (2) The overuse tendency is more marked in their speech than in their writing while the underuse tendency is also slightly stronger in speech than in writing. Some of the overused or underused TTFAs in speech are the same as those in writing but others are different.

58 Summary (3) Chinese English majors demonstrate a pattern of speaking-writing difference that is opposite to that shown in the native speakers’ corpus: they tend to use more TTFAs in their speech than in their writing while native speakers tend to use more TTFAs in their writing than in their speech. This shows that Chinese EFL learners use TTFAs without awareness of their register differences.

59 Possible reasons
– Limited vocabulary (Table 1b)
– TTFAs used as “time buyers”
– No equivalents readily available in Chinese

60 Topic Four Advantages and disadvantages of corpus-based studies on SLA

61 Advantage One
A large sample stored electronically and open to the public
– Validity and reliability (replicable)
– Possible for a diachronic study

62 Advantage Two
Using computer software such as WordSmith
– Effectiveness and efficiency

63 Advantage Three
Understand the learner language from a different perspective
– Correct vs. incorrect
– More acceptable vs. less acceptable
– Frequency: overuse, underuse, non-use

64 Disadvantages
Can | Cannot
Product | Process
Productive | Receptive
Group patterns | Individual differences
Language use | Language knowledge

65 Closing Remark
– The number of researchers is increasing
– Different types of corpora are being constructed
– Corpus-based studies are being carried out
– Findings are useful for textbook writers as well as for practitioners

66 Thank you!!!

