What kind of vocabulary is in course books and graded readers? Rob Waring Notre Dame Seishin University JALT Vocab SiG Symposium June 29, 2013.

1 What kind of vocabulary is in course books and graded readers? Rob Waring Notre Dame Seishin University JALT Vocab SiG Symposium June 29, 2013

2 Steps
- Decide a scale to use (ERF Scale)
- Make a base wordlist based on the scale
- Scan in the texts and remove proper nouns
- Run the analysis in AntWord
- Count the running words in each text at each of the wordlist levels
- Identify a typical average frequency profile (by baseword level) at each reading level for the GRs and course books
- Decide the number of average texts to be ‘read’ (30)
- Decide how many times a word has to be met before it’s learnt (20)
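The counting steps listed above could be sketched in Python roughly as follows. This is a minimal illustration, not the AntWord workflow used for the talk: the tiny wordlist, the family mapping, the proper-noun set, and the function names `profile` and `families_met` are all made up for the example, and the learning threshold is read as "at least N meetings".

```python
from collections import Counter

# Hypothetical mini wordlist: family headword -> wordlist level.
LEVEL_OF = {"the": "ERF1", "be": "ERF1", "go": "ERF1", "dog": "ERF2", "river": "ERF3"}
# Hypothetical mapping from inflected forms to their family headword.
FAMILY_OF = {"is": "be", "was": "be", "went": "go", "goes": "go", "dogs": "dog"}
# Hypothetical proper nouns to strip before counting.
PROPER_NOUNS = {"tom"}


def family(token):
    """Return the family headword for a token (the token itself if unmapped)."""
    return FAMILY_OF.get(token, token)


def profile(tokens):
    """Share of running words at each wordlist level, proper nouns removed."""
    kept = [t for t in tokens if t not in PROPER_NOUNS]
    counts = Counter(LEVEL_OF.get(family(t), "Not in lists") for t in kept)
    total = len(kept)
    return {level: n / total for level, n in counts.items()}


def families_met(tokens, min_meetings=20):
    """Families met at least `min_meetings` times (assumed reading of the threshold)."""
    freq = Counter(family(t) for t in tokens if t not in PROPER_NOUNS)
    return {fam for fam, n in freq.items() if n >= min_meetings}


text = "tom went to the river the dog is the dog".split()
print(profile(text))
print(families_met(text, min_meetings=2))
```

With a real wordlist the same two functions would be run once per text, and the per-text profiles averaged within each reading level.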


4 Graded Reader Corpus

Wordlist Level   #Titles   Total running words   Average length
ERF1 (50)        64        4,138                 65
ERF2 (100)       39        9,343                 240
ERF3 (200)       68        42,846                630
ERF4 (300)       57        82,962                1,455
ERF5 (400)       59        189,697               3,215
ERF6 (600)       106       223,740               2,111
ERF7 (800)       121       403,641               3,336
ERF8 (1000)      61        514,319               8,431
ERF9 (1250)      51        645,368               12,654
ERF10 (1500)     54        544,035               10,075
ERF11 (1800)     65        1,113,465             17,130
ERF12 (2100)     14        90,870                6,491
ERF13 (2400)     33        596,979               18,090
ERF14 (3000)     34        445,038               13,089
ERF15 (3600)     -         -                     -
ERF16 (4500)     2         78,455                39,228
Total            828       4,984,896

5 Percentage of words at each ERF Reading level by Wordlist level

Rows: Wordlist level. Values across: ERF Reading level, ERF1 to ERF16.

ERF1 (50)      34.4% 37.6% 33.8% 30.8% 30.0% 27.6% [...] 27.4% 27.7% 27.3% 27.8% 26.7% 26.4% 26.3% 27.6%
ERF2 (100)     8.2% 13.3% 12.1% 11.6% 11.2% 10.0% 9.3% 9.5% 9.8% 9.4% 9.2% 8.9% 9.0% 8.6% 8.9%
ERF3 (200)     17.6% 13.1% 18.9% 20.3% 22.0% 21.8% 21.4% 21.2% [...] 20.9% 19.6% 19.5% 19.4% 18.8% 17.4%
ERF4 (300)     5.3% 2.4% 4.7% 6.4% 6.9% 7.9% 7.8% 8.5% 8.2% 8.7% 8.3% 7.9% 8.2% 7.8% 6.8%
ERF5 (400)     1.4% 1.7% 1.8% [...] 2.3% 2.5% 2.9% 3.2% 3.1% [...] 3.0% 3.2% 3.1% 2.7%
ERF6 (600)     1.9% 1.1% 2.1% 2.2% [...] 2.7% 3.0% 3.5% 3.4% 4.0% 3.8% 3.9% 4.0% [...] 3.4%
ERF7 (800)     1.5% 0.7% 0.8% 0.7% [...] 1.1% 1.4% 1.6% 1.5% 1.9% 2.1% [...] 2.4% 2.7% 2.2%
ERF8 (1000)    1.3% 0.5% 1.0% 0.9% 1.0% 1.5% 1.3% 1.6% [...] 1.9% 2.0% 2.5% [...] 2.9% 2.4%
ERF9 (1250)    1.6% 0.5% 0.6% 0.4% 0.6% 0.7% 0.8% 0.7% [...] 0.8% 1.0% 1.1% 1.3% 1.4% 1.3%
ERF10 (1500)   0.8% 0.5% [...] 0.6% 0.7% 0.8% [...] 1.0% 1.3% 1.2% 1.5% 1.7%
ERF11 (1800)   0.9% 0.3% [...] 0.6% 0.5% 0.7% 0.8% 0.7% 0.6% 0.7% 0.9% 0.8% 1.0% 1.1% 1.3%
ERF12 (2100)   1.4% 0.2% 0.5% [...] 0.4% 0.7% [...] 0.5% [...] 0.6% 0.8% 0.9% 1.0% 1.5%
ERF13 (2400)   0.8% 0.6% 0.5% 0.2% 0.4% [...] 0.3% [...] 0.4% 0.2% 0.5% 0.6% 0.7%
ERF14 (3000)   0.6% 0.1% 0.8% 0.7% 0.5% 0.8% 0.6% 0.5% [...] 0.4% 0.5% 0.3% 0.4% 0.6% 0.7%
ERF15 (3600)   1.3% 0.2% 1.1% 0.9% 0.7% 1.1% 1.0% 0.9% 0.8% [...] 0.9% [...] 1.3% 1.4%
ERF16 (4500)   0.8% 0.5% 0.3% 0.2% 0.1% 0.3% 0.2% [...] 0.3% [...] 0.7%
ERF17 (6000)   0.5% 0.4% 0.3% 0.2% 0.3% 0.4% [...] 0.3% [...] 0.4% [...] 0.3% 0.4% [...] 1.0%
ERF18 (8000)   0.9% 0.2% 0.4% 0.2% 0.6% 0.9% 0.7% 0.3% 0.4% 0.3% 0.4% 0.3% [...] 0.4% 1.1%
ERF19 (12000)  0.7% 0.6% 0.8% 0.3% [...] 0.4% 0.5% 0.3% 0.2% 0.3% [...] 0.2%
ERF20 (18000)  2.2% 1.1% 1.5% 1.4% 1.7% 1.5% 1.6% 1.5% 1.6% [...] 1.7% 1.5% 1.8% 1.9% 1.7%
Out of level   0.2% 2.2% 0.2% 2.5% 0.9% 1.8% 1.9% 2.0% 1.0% 0.7% 0.5% 3.7% 0.8% 0.9% 0.2%
Proper nouns   15.7% 22.3% 16.9% 16.5% 16.0% 14.7% [...] 14.3% 15.4% 14.7% 15.0% 14.0% 14.7% 13.6% 14.5%
Not in lists   0.1% 0.0% 0.1% [...] 0.2% [...] 0.3% 0.2% [...] 0.4% 0.5%

6 % of families at each level which occur more than 20 times (minus proper nouns)

Rows: Wordlist level. Values across: ERF Reading level, ERF1 to ERF16.

ERF1 (50)      41% 48% 41% 37% 36% 32% [...] 33% 32% 33% 31% [...] 30% [...] 32%
ERF2 (100)     10% 17% 15% 14% 13% 12% 11% [...] 12% 11% [...] 10% 11% 10%
ERF3 (200)     21% 17% 23% 24% 26% [...] 25% [...] 24% 23% [...] 22% 20%
ERF4 (300)     6% 3% 6% 8% [...] 9% [...] 10% [...] 9% 10% 9% 8%
ERF5 (400)     2% [...] 3% [...] 4% [...] 3% 4% [...] 3%
ERF6 (600)     2% 1% 3% [...] 4% [...] 5% [...] 4% 5% [...] 4%
ERF7 (800)     2% 1% [...] 2% [...] 3%
ERF8 (1000)    2% 1% [...] 2% [...] 3%
ERF9 (1250)    2% 1% [...] 2%
ERF10 (1500)   1% [...] 2%
ERF11 (1800)   1% 0% [...] 1% [...] 2%
ERF12 (2100)   2% 0% 1% [...] 2%
ERF13 (2400)   1% [...] 0% [...] 1%
ERF14 (3000)   1% 0% 1% [...] 0% 1%
ERF15 (3600)   2% 0% 1% [...] 2%
ERF16 (4500)   1% [...] 0% [...] 1%
ERF17 (6000)   1% 0% [...] 1% [...] 0% [...] 1%
ERF18 (8000)   1% 0% [...] 1% [...] 0% [...] 1%
ERF19 (12000)  1% [...] 0% [...] 1% 0%
ERF20 (18000)  3% 1% 2%
Out of level   0% 3% 0% 3% 1% 2% [...] 1% [...] 4% 1% [...] 0%
Proper nouns   19% 29% 20% [...] 19% 17% [...] 18% 17% 18% 16% 17% 16% 17%

7 Average book length without proper nouns

Reading level   Average book length without proper nouns   % of each book in level
ERF1            54                                          40.8%
ERF2            186                                         65%
ERF3            524                                         78%
ERF4            1,215                                       83%
ERF5            2,700                                       86%
ERF6            1,801                                       85%
ERF7            2,846                                       86%
ERF8            7,225                                       89%
ERF9            10,699                                      91%
ERF10           8,593                                       92%
ERF11           14,556                                      93%
ERF12           5,583                                       91%
ERF13           15,428                                      94%
ERF14           11,306                                      93%
ERF15           -                                           -
ERF16           33,531                                      94%

8 How many words do you meet if you read 30 books at each level?

Number of books read / level: 30

Levels read   Accumulated books read   Running words met / level   Running words for 30 books / level   Accumulated running words
ERF1 only     30                       65                          1,940                                1,940
ERF 1-2       60                       240                         7,187                                9,127
ERF 1-3       90                       630                         18,903                               28,029
ERF 1-4       120                      1,455                       43,664                               71,693
ERF 1-5       150                      3,215                       96,456                               168,150
ERF 1-6       180                      2,111                       63,323                               231,472
ERF 1-7       210                      3,336                       100,076                              331,548
ERF 1-8       240                      8,431                       252,944                              584,492
ERF 1-9       270                      12,654                      379,628                              964,120
ERF 1-10      300                      10,075                      302,242                              1,266,362
ERF 1-11      330                      17,130                      513,907                              1,780,269
ERF 1-12      360                      6,491                       194,721                              1,974,991
ERF 1-13      390                      18,090                      542,708                              2,517,699
ERF 1-14      420                      13,089                      392,681                              2,910,379
ERF 1-15      -                        -                           -                                    -
ERF 1-16      450                      39,228                      1,176,825                            4,087,204

Accumulated having read all 450 books: 4,087,204 running words
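The arithmetic behind this slide can be reproduced with a short Python sketch. It assumes that "running words for 30 books" is 30 times the unrounded average book length (total running words divided by number of titles, both taken from the Graded Reader Corpus slide); only the first four levels are included, and the variable and function names are illustrative.

```python
from itertools import accumulate

# (level, number of titles, total running words) from the Graded Reader Corpus slide.
CORPUS = [
    ("ERF1", 64, 4138),
    ("ERF2", 39, 9343),
    ("ERF3", 68, 42846),
    ("ERF4", 57, 82962),
]
BOOKS_PER_LEVEL = 30  # number of average texts 'read' at each level


def words_for_books(titles, running_words, books=BOOKS_PER_LEVEL):
    """Running words met when reading `books` average-length texts at one level."""
    return books * running_words / titles


per_level = [words_for_books(titles, words) for _, titles, words in CORPUS]
accumulated = list(accumulate(per_level))  # running total across levels

for (level, _, _), words, total in zip(CORPUS, per_level, accumulated):
    print(f"{level}: {round(words):>7,} for 30 books, {round(total):>7,} accumulated")
```

Using the unrounded average matters: 30 times the rounded ERF1 average of 65 would give 1,950, whereas 30 x 4,138 / 64 rounds to the 1,940 shown on the slide.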

9 Accumulated coverage for 30 books per level to 95% coverage of the families at 20 meetings for each type at each level

Rows: Wordlist level. Values across: accumulated reading amount, ERF1 only to ERF1-16.

# books read   30 60 90 120 150 180 210 240 270 300 330 360 420 450 480

ERF1 (50)      16.7% 28.2% 35.5% 53.2% 71.5% 70.3% 70.1% 73.0% 73.3% 73.6% 77.8% 76.1% 78.2% 78.1% 76.7%
ERF2 (100)     1.0% 14.3% 41.9% 60.0% 77.1% 81.9% 82.9% 87.6% 95.2% 96.2% 98.1% 99.0%
ERF3 (200)     0.8% 4.6% 23.2% 52.9% 76.8% 79.8% 83.7% 86.3% 89.0% 90.5% 92.8% 93.2% 95.1% 95.8% 96.2%
ERF4 (300)     0.8% 2.3% 9.8% 34.6% 67.7% 77.4% 86.5% 89.5% 91.0% 91.7% 94.0% [...] 94.7% 95.5% 96.2%
ERF5 (400)     0.0% 1.9% 3.7% 15.7% 49.1% 64.8% 74.1% 88.9% 91.7% [...] 92.6% [...] 95.4% [...] 96.3%
ERF6 (600)     0.0% 0.6% [...] 9.4% 28.1% 38.1% 58.1% 79.4% 88.1% 91.3% 93.8% 94.4% 95.6% 96.3% 96.9%
ERF7 (800)     0.0% 0.7% 2.6% [...] 7.9% 15.8% 26.3% 48.7% 59.2% 68.4% 82.2% 87.5% 94.7% 96.1% 97.4%
ERF8 (1000)    0.0% [...] 0.9% 3.7% 11.6% 15.3% 25.5% 39.4% 50.9% 61.1% 78.7% 86.6% 91.7% 94.9% [...] 97.2%
ERF9 (1250)    0.0% [...] 0.5% [...] 6.3% 8.2% 14.4% 26.0% 39.4% 52.4% 63.0% 72.1% 80.3% 86.5% 89.9%
ERF10 (1500)   0.0% 0.4% 0.8% 1.2% 5.8% 9.1% 12.8% 23.5% 36.2% 43.2% 58.0% 65.0% 73.3% 83.1% 89.7%
ERF11 (1800)   0.0% [...] 1.5% 3.3% 4.4% 9.5% 18.2% 32.1% 39.8% 54.4% 58.4% 70.4% 79.9% 88.3%
ERF12 (2100)   0.0% [...] 0.3% 1.2% 3.4% 4.3% 7.4% 13.3% 20.4% 27.6% 40.6% 43.0% 53.6% 63.8% 76.8%
ERF13 (2400)   0.0% [...] 0.4% [...] 2.7% 4.2% 6.2% 9.7% 16.2% 22.0% 30.1% 32.8% 43.2% 51.4% 68.7%
ERF14 (3000)   0.0% [...] 0.3% 0.9% 2.3% 4.1% 5.2% 10.5% 15.2% 19.0% 25.9% 27.1% 32.4% 39.4% 54.2%
ERF15 (3600)   0.0%
ERF16 (4500)   0.0% 0.1% 0.5% 0.9% 1.6% 2.7% 3.2% 6.5% 9.4% 12.4% 18.5% 19.4% 25.4% 30.6% 45.4%

10 How many words are you likely to ‘know’ (20 meetings) after reading all that?

Rows: Wordlist level. Values across: accumulated reading amount, ERF1 only to ERF 1-16.

ERF1 (50)      8 14 18 27 36 35 [...] 36 37 [...] 39 38 39 [...] 0 38
ERF2 (100)     1 14 42 60 77 82 83 88 95 96 98 99 [...] 0
ERF3 (200)     1 5 23 53 77 80 84 86 89 90 93 [...] 95 96 0
ERF4 (300)     1 2 10 35 68 77 86 89 91 92 94 [...] 95 [...] 0 96
ERF5 (400)     0 2 4 16 49 65 74 89 92 [...] 93 [...] 95 [...] 0 96
ERF6 (600)     0 1 1 19 56 76 116 159 176 183 188 189 191 193 0 194
ERF7 (800)     0 1 5 5 16 32 53 97 118 137 164 175 189 192 0 195
ERF8 (1000)    0 0 2 7 23 31 51 79 102 122 157 173 183 190 0 194
ERF9 (1250)    0 0 1 1 16 20 36 65 99 131 157 180 201 216 0 225
ERF10 (1500)   0 1 2 3 14 23 32 59 91 108 145 163 183 208 0 224
ERF11 (1800)   0 0 0 4 10 13 28 55 96 119 163 175 211 240 0 265
ERF12 (2100)   0 0 1 4 10 13 22 40 61 83 122 129 161 191 0 230
ERF13 (2400)   0 0 1 1 8 13 19 29 49 66 90 98 130 154 0 206
ERF14 (3000)   0 0 2 5 14 24 31 63 91 114 156 163 194 236 0 325
ERF15 (3600)   0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
ERF16 (4500)   0 1 5 8 15 24 29 58 85 111 167 174 229 276 0 409
Total          11 42 117 248 488 608 780 1092 1371 1580 1926 2036 2296 2520 0 2894

11 Summary
- 450 books = 2894 ‘known’ words (20 meetings)
- Many words at each level won’t be met enough times to ‘learn’ them even after having read 30 titles at each level

12 Course books
- 6 Japanese Junior High texts
- 21 Japanese High school texts
- 18 Korean Middle School texts
- 15 Korean High School texts
- 5 Mexican Middle and Senior High texts

13 How many words will a learner meet on average in these texts in a middle or high school?

                     Middle School   High School   Total
Mexico (Sequences)   126,043         106,493       232,536
Korea (averaged)     23,483          37,950        61,433
Japan (averaged)     14,066          20,977        35,043

14 How many words are in each book by ERF level?

                Japanese           Korea              Mexico
                JH       SH        Middle   HS        Middle    HS
ERF1 (50)       2,906    4,278     4,770    6,756     41,038    30,059
ERF2 (100)      1,012    1,638     1,757    2,369     10,755    8,911
ERF3 (200)      1,906    3,450     3,479    5,358     20,963    18,664
ERF4 (300)      616      1,339     1,233    2,056     7,550     7,472
ERF5 (400)      379      827       753      1,107     3,949     3,113
ERF6 (600)      407      932       878      1,312     4,059     3,787
ERF7 (800)      304      551       621      1,059     2,457     2,941
ERF8 (1000)     274      650       635      1,353     2,686     3,238
ERF9 (1250)     251      419       535      996       1,653     1,815
ERF10 (1500)    171      385       455      1,002     1,091     1,868
ERF11 (1800)    196      218       458      885       1,569     1,320
ERF12 (2100)    222      368       407      993       878       1,413
ERF13 (2400)    120      227       264      520       390       532
ERF14 (3000)    114      212       326      726       871       928
ERF15 (3600)    154      334       388      981       1,949     2,045
ERF16 (4500)    105      119       127      447       247       299
ERF17 (6000)    102      181       194      686       433       614
ERF18 (8000)    223      259       182      672       380       417
ERF19 (12000)   125      139       161      374       298       394
ERF20 (18000)   286      479       453      995       2,657     2,040
Out of level    247      206       266      404       686       830
Proper nouns    2,637    2,920     3,812    4,474     16,992    11,964
Not in lists    1,310    851       1,329    2,426     2,492     1,829
Total           14,066   20,977    23,483   37,950    126,043   106,493

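15 Words met vs number of words probably learnt (>20 meetings) in various course books

Number of words in each band:

             Japanese              Korean                Mexico
# meetings   JH     SH     Both    JH     SH     Both    JH     SH     Both
50+          24     39     68      37     64     124     310    271    492
31-49        20     39     72      39     55     93      127    131    191
20-30        23     57     75      53     71     172     138    139    199
10-19        81     159    261     207    282    536     279    348    394
5-9          182    266    380     422    686    685     291    393    409
1-4          792    713    792     802    1225   996     497    625    567
Total        1122   1273   1648    1560   2383   2606    1642   1907   2252

Share of words in each band:

             Japanese                 Korean                   Mexico
# meetings   JH      SH      Both     JH      SH      Both     JH      SH      Both
50+          2.1%    3.1%    4.1%     2.4%    2.7%    4.8%     18.9%   14.2%   21.8%
31-49        1.8%    3.1%    4.4%     2.5%    2.3%    3.6%     7.7%    6.9%    8.5%
20-30        2.0%    4.5%    4.6%     3.4%    3.0%    6.6%     8.4%    7.3%    8.8%
10-19        7.2%    12.5%   15.8%    13.3%   11.8%   20.6%    17.0%   18.2%   17.5%
5-9          16.2%   20.9%   23.1%    27.1%   28.8%   26.3%    17.7%   20.6%   18.2%
1-4          70.6%   56.0%   48.1%    51.4%   51.4%   38.2%    30.3%   32.8%   25.2%
Total        100.0%  100.0%  100.0%   100.0%  100.0%  100.0%   100.0%  100.0%  100.0%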
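The banding used on this slide can be sketched in Python as below. The band edges follow the slide's row labels (50+, 31-49, 20-30, 10-19, 5-9, 1-4); the meeting counts in `meetings` are invented for illustration, and the names `band_of` and `band_table` are not from the presentation.

```python
from collections import Counter

# Band edges taken from the slide's row labels (inclusive on both ends).
BANDS = [
    ("50+", 50, None),
    ("31-49", 31, 49),
    ("20-30", 20, 30),
    ("10-19", 10, 19),
    ("5-9", 5, 9),
    ("1-4", 1, 4),
]


def band_of(n_meetings):
    """Return the label of the band that a meeting count falls into."""
    for label, low, high in BANDS:
        if n_meetings >= low and (high is None or n_meetings <= high):
            return label
    raise ValueError("a word that is met must occur at least once")


def band_table(meetings):
    """Map each band to (number of families, share of all families met)."""
    counts = Counter(band_of(n) for n in meetings.values())
    total = sum(counts.values())
    return {label: (counts[label], counts[label] / total) for label, _, _ in BANDS}


# Invented meeting counts for seven families (illustration only).
meetings = {"the": 60, "go": 35, "dog": 20, "river": 12, "boat": 7, "oar": 1, "mast": 3}
table = band_table(meetings)
for label, (n, share) in table.items():
    print(f"{label:>6} {n:3d} {share:6.1%}")
```

The percentage rows on the slide are the same calculation: for the Japanese JH books, 24 of the 1,122 families met fall in the 50+ band, which is 2.1%.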

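16 Course book plus a book a week = ?

Columns:
Japan (a): JH course book plus ERF1-3 (90 books)
Japan (b): JH & SH course books plus ERF 1-6 (180 books)
Korea (a): Middle course books plus ERF1-3 (90 books)
Korea (b): Middle & SH course books plus ERF 1-6 (180 books)
Mexico (a): Middle course books plus ERF1-3 (90 books)
Mexico (b): Middle and SH course books plus ERF 1-6 (180 books)

Number of words in each band:

# meetings   Japan (a)   Japan (b)   Korea (a)   Korea (b)   Mexico (a)   Mexico (b)
50+          101         523         121         568         354          780
31-49        63          182         90          229         165          258
20-30        77          182         103         204         149          228
10-19        162         300         288         467         291          411
5-9          267         372         368         593         323          404
1-4          564         771         767         824         562          605
Total        1234        2330        1737        2885        1844         2686

Share of words in each band:

# meetings   Japan (a)   Japan (b)   Korea (a)   Korea (b)   Mexico (a)   Mexico (b)
50+          8.2%        22.4%       7.0%        19.7%       19.2%        29.0%
31-49        5.1%        7.8%        5.2%        7.9%        8.9%         9.6%
20-30        6.2%        7.8%        5.9%        7.1%        8.1%         8.5%
10-19        13.1%       12.9%       16.6%       16.2%       15.8%        15.3%
5-9          21.6%       16.0%       21.2%       20.6%       17.5%        15.0%
1-4          45.7%       33.1%       44.2%       28.6%       30.5%        22.5%
Total        100.0%      100.0%      100.0%      100.0%      100.0%       100.0%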

17 Number of words met

                            Japan     Korea     Mexico
Course books only
  JH                        14,066    23,483    126,043
  JH & SH                   35,043    61,433    232,536
Course books plus reading
  JH                        35,989    45,405    147,966
  JH & SH                   219,242   245,632   416,735

18 Likely uptake (words met more than 20 times from reading 30 texts at each level)

                            Japan   Korea   Mexico
Course books only
  JH                        147     184     854
  JH & SH                   476     925     1,276
Course books plus reading
  JH                        403     602     959
  JH & SH                   1,187   1,468   1,677
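As a cross-check, the following Python sketch sums the meeting bands from the "Course book plus a book a week" slide. The band counts are copied from that slide; the dictionary keys and the helper name `met_at_least` are illustrative only. On these numbers, the "course books plus reading" uptake figures equal the sum of the bands from 10-19 upward, which suggests (though the slides do not say so) that a cut-off of 10 or more meetings was applied to this table.

```python
# Band counts copied from the "Course book plus a book a week" slide.
SLIDE_16 = {
    "Japan JH + ERF1-3": {"50+": 101, "31-49": 63, "20-30": 77, "10-19": 162, "5-9": 267, "1-4": 564},
    "Japan JH & SH + ERF1-6": {"50+": 523, "31-49": 182, "20-30": 182, "10-19": 300, "5-9": 372, "1-4": 771},
    "Korea Middle + ERF1-3": {"50+": 121, "31-49": 90, "20-30": 103, "10-19": 288, "5-9": 368, "1-4": 767},
    "Korea Middle & SH + ERF1-6": {"50+": 568, "31-49": 229, "20-30": 204, "10-19": 467, "5-9": 593, "1-4": 824},
    "Mexico Middle + ERF1-3": {"50+": 354, "31-49": 165, "20-30": 149, "10-19": 291, "5-9": 323, "1-4": 562},
    "Mexico Middle & SH + ERF1-6": {"50+": 780, "31-49": 258, "20-30": 228, "10-19": 411, "5-9": 404, "1-4": 605},
}

TWENTY_PLUS = ("50+", "31-49", "20-30")
TEN_PLUS = TWENTY_PLUS + ("10-19",)


def met_at_least(bands, labels):
    """Number of words whose meeting count falls in any of the given bands."""
    return sum(bands[label] for label in labels)


for name, bands in SLIDE_16.items():
    twenty = met_at_least(bands, TWENTY_PLUS)
    ten = met_at_least(bands, TEN_PLUS)
    print(f"{name}: 20+ meetings = {twenty}, 10+ meetings = {ten}")
```

Counting only the 20+ bands gives lower totals (for example 241 rather than 403 for the Japanese JH books plus ERF1-3), so the choice of threshold changes the uptake estimate considerably.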

19 Summary
- Course books only leads to low gains; most words are forgotten
- Course books plus reading doubles vocabulary
BUT these data
- underestimate learning because the data do not include partially known words (probably double that), collocations, colligations, multi-word phrases etc.
- are unfair to the Mexico group who were restricted to low level reading (so we could compare)

20 It’s a work in progress …
Some levels in my wordlist need redoing
- level 3 has lots of past forms and irregular verbs -> bump in data
- levels 6, 8, 15 & 16 are short of families
Some levels are short of texts
- level 12 and level 15
Next I’ll …
- add higher level texts 17-20 when they become available
- replicate Paul’s study on how many words you need to meet to learn X,000 words with this corpus of SL texts
- analyze which GR series best represents their stated levels
- find out how many texts are needed before learners have covered say 95% of the words at a set level
- re-do the stats for 12, 30 meetings

21 Phew! Yes Paul, I’ll publish it!
