Published by Regina Jennings. Modified over 9 years ago.
1
GSL & NGSL
2
Comparison:
GSL
  1953 (Michael West); 1995 revision (John Bauman & Brent Culligan), which is today's version
  2284 word families
  (Famous early 20th century researchers; several vocabulary conferences)
  Based on a 2.5-million-word corpus
NGSL
  2013 (Charles Browne, Brent Culligan and Joseph Phillips)
  2818 headwords and lemmas; 2801 after the 2014 update
  (Words selected from a subsection of the Cambridge English Corpus, CEC)
  Based on a 273-million-word subsection of the 2-billion-word CEC
3
Comparison: Coverage improvement with NGSL version 1.01
The combined NGSL/NAWL gives about 5% more text coverage than the combined GSL/AWL.
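To make the idea of text coverage concrete, here is a minimal sketch. It assumes coverage means the share of running words (tokens) in a text that appear on a word list, and it matches surface forms only, so it does not group inflected forms under a headword the way the NGSL does. The function name `text_coverage`, the toy list, and the sample sentence are all hypothetical.

```python
import re


def text_coverage(text, word_list):
    """Return the share of running words in `text` that appear in `word_list`."""
    # Lowercase the text and split it into simple word tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if token in word_list)
    return known / len(tokens)


# Hypothetical toy list and sentence, for illustration only.
toy_list = {"the", "cat", "sat", "on", "mat"}
sample = "The cat sat on the purple mat"

print(f"{text_coverage(sample, toy_list):.1%}")
```

In the sample, six of the seven running words are on the toy list, so the coverage is 6/7.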
4
Comparison:

Vocabulary List              GSL           NGSL
Corpus Size                  2.5 million   273 million
Number of “Word Families”    1964          2368
Number of “Lemmas”           3623          2801
Coverage in CEC Corpus       84.24%        90.34%
Classic Literature           86.17%        85.35%
Scientific American          65.87%        71.34%
The Economist                76.55%        81.75%

(Reference: first-year in-service master's class (碩專一) presentation, "AWL+GSL vs. NAWL+NGSL")
5
Comparison:
Coverage figures of the GSL are actually higher, as the NGSL purposely excludes items such as days of the week, months of the year and certain contractions that were not grouped together in the CEC (proper nouns, abbreviations, slang).
The GSL offers slightly better coverage for texts of classic literature (about 0.8 percentage points better than the NGSL).
The NGSL offers about 5 to 6 percentage points more coverage than the GSL for more modern corpora such as Scientific American or The Economist.
(Reference: first-year in-service master's class (碩專一) presentation, "AWL+GSL vs. NAWL+NGSL")
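The coverage gaps can be checked directly against the figures in the comparison table. This is only an arithmetic sketch: the numbers are copied from the table, and the variable names are hypothetical.

```python
# Coverage percentages from the comparison table: (GSL, NGSL).
coverage = {
    "CEC Corpus": (84.24, 90.34),
    "Classic Literature": (86.17, 85.35),
    "Scientific American": (65.87, 71.34),
    "The Economist": (76.55, 81.75),
}

# Difference in percentage points, NGSL minus GSL, rounded to two decimals.
differences = {
    name: round(ngsl - gsl, 2) for name, (gsl, ngsl) in coverage.items()
}

for name, diff in differences.items():
    print(f"{name}: {diff:+.2f} percentage points (NGSL minus GSL)")
```

The differences come to -0.82 points for classic literature, +5.47 for Scientific American and +5.20 for The Economist, which matches the "about 0.8" and "5 to 6" point gaps described on the slide.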
6
Reasons the GSL has been criticized:
7
In use for more than 60 years, and its corpus is dated (most of the texts were published between the 1800s and the 1930s)
Small by modern standards (the original analysis used a corpus of only 2.5 million words)
In need of a clearer definition of what constitutes a “word” within the list
Source: Browne, C. (2014). The New General Service List Version 1.01: Getting Better All the Time. Korea TESOL Journal, 11(1).
8
The NGSL aims to meet the following goals:
9
1. Update and expand the size of the corpus used (273 million words), increasing the validity and generalizability of the list
2. Create an NGSL of the most important high-frequency words useful for L2 learners
3. Make an NGSL that is based on a clearer definition of what constitutes a word
4. Be a starting point for scholars and teachers with the goal of updating and revising the list
10
CEC corpora for the NGSL
Two CEC subcorpora were excluded:
N (newspaper): showed a marked bias towards financial terms
A (academic): a specific genre not directly related to general English
Result: 1,282,909,322 - 748,391,436 - 260,904,352 = 273,613,534 words
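The subtraction on this slide can be verified with a few lines of Python. The figures are taken from the slide; treating the two subtracted numbers as the sizes of the excluded N and A subcorpora, in that order, is an assumption based on the slide's layout.

```python
# Starting size, in words, as given on the slide.
total = 1_282_909_322

# Sizes of the two excluded subcorpora (assumed to be N, then A).
excluded = [748_391_436, 260_904_352]

remaining = total - sum(excluded)
print(f"{remaining:,}")  # 273,613,534
```

The result matches the 273-million-word figure used for the NGSL throughout the presentation.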
11
Why does the NGSL not use word families like the original GSL?
12
Update (Dec 2, 2013): The list now provides frequency and coverage figures not only for the 273-million-word corpus of general English (which comprises 90% written and 10% spoken data) but also for the 27-million-word spoken subsection, which consists of three main parts: spoken conversational English, TV and radio.
13
Update (Feb 17, 2014): The NAWL was published, based on a 288-million-word academic corpus.
Update (April 4, 2014): The number of NGSL headwords decreased by 17, from 2818 to 2801.
Two words added: TOURNAMENT, YEAH (YES)
Nineteen words deleted as separate headwords:
ZERO, BILLION, FIFTEEN, FIFTY
HER was listed under SHE.
HIM and HIS were listed under HE.
ITS was listed under IT.
ME and MY were listed under I.
OUR and US were listed under WE.
THEIR and THEM were listed under THEY.
THESE was listed under THIS.
THOSE was listed under THAT.
WHOM and WHOSE were listed under WHO.
YOUR was listed under YOU.
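As a consistency check on the April 4, 2014 update, the sketch below tallies the added and deleted words from this slide and recomputes the headword count. The word lists are copied from the slide; the variable names and the split between `deleted_words` and `relisted` are hypothetical groupings chosen for illustration.

```python
# Words added in the April 4, 2014 update.
added = ["TOURNAMENT", "YEAH"]

# Words deleted without being relisted under another headword.
deleted_words = ["ZERO", "BILLION", "FIFTEEN", "FIFTY"]

# Forms removed as separate headwords and listed under another headword.
relisted = {
    "SHE": ["HER"],
    "HE": ["HIM", "HIS"],
    "IT": ["ITS"],
    "I": ["ME", "MY"],
    "WE": ["OUR", "US"],
    "THEY": ["THEIR", "THEM"],
    "THIS": ["THESE"],
    "THAT": ["THOSE"],
    "WHO": ["WHOM", "WHOSE"],
    "YOU": ["YOUR"],
}

deleted_count = len(deleted_words) + sum(len(forms) for forms in relisted.values())
new_total = 2818 + len(added) - deleted_count

print(deleted_count)  # 19
print(new_total)      # 2801
```

The tally gives 19 deletions and 2 additions, a net change of 17, which agrees with the drop from 2818 to 2801 headwords.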