1
ENG 626
CORPUS APPROACHES TO LANGUAGE STUDIES
FL, AWL
Bambang Kaswanti Purwo
2
Word Levels
• A 10-year-old native speaker of English has a vocabulary of around 10,000 word families.
• A word family consists of the base form of a word plus its closely related inflected and derived forms.
• For example, here is the word family for absent: absent, absented, absenting, absents, absentee, absentees, absenteeism, absently
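As a rough illustration of how word families affect counting, the sketch below maps every member of the absent family to its headword, so that all members count toward a single family. The names `WORD_FAMILIES`, `FORM_TO_HEADWORD`, and `family_counts` are hypothetical, and only the one family listed on the slide is included.

```python
from collections import Counter

# Hypothetical lookup table: headword -> members of its word family.
# Only the "absent" family from the slide is included here.
WORD_FAMILIES = {
    "absent": [
        "absent", "absented", "absenting", "absents",
        "absentee", "absentees", "absenteeism", "absently",
    ],
}

# Invert the table so that each word form points to its headword.
FORM_TO_HEADWORD = {
    form: headword
    for headword, forms in WORD_FAMILIES.items()
    for form in forms
}


def family_counts(words):
    """Count occurrences per word family; unknown forms count as themselves."""
    return Counter(FORM_TO_HEADWORD.get(w.lower(), w.lower()) for w in words)


counts = family_counts(["Absentees", "were", "absent", "absently", "were"])
print(counts["absent"])  # 3
print(counts["were"])    # 2
```

Counting by family rather than by individual form is what makes a figure such as "10,000 word families" smaller than the number of distinct word forms a speaker knows.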
3
• Rough estimate of the vocabulary size of native speakers: add 1,000 word families for each year of their life, up to the age of about 20.
• A native speaker of English who is a university graduate probably knows at least 20,000 words.
• Goals for a learner of English as a second language: 20,000 words is very ambitious, so split the vocabulary they need to learn into four levels: high-frequency, low-frequency, academic, and technical.
4
Word Frequency
• Frequency of a word: how often it occurs in a text.
• The word most frequently used in written English is “the”, with a frequency of around seven in every 100 words of text, so “the” occurs in almost every line of a written text.
• When Paul Nation started studying vocabulary teaching, he counted a 1,000-word text word by word to see how often each word occurred:
  • manually: a whole weekend
  • now with computers: less than a second
5
Word Frequency
[original text: 1,906 words long, 532 different word types]

the          100      wide        1
of            74      will
to            58      without
and           56      work
words         46      working
a             41      write
in            39      yet
vocabulary    38      you
is            30      young
are           25      yourself
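A frequency list like the one above can be produced in well under a second with a few lines of code. The following is a minimal sketch, not the procedure used for the slide: the function name `word_frequencies`, the sample sentence, and the tokenisation rule (a word is a run of letters, optionally joined by apostrophes or hyphens) are all assumptions made for illustration.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Return a Counter of lowercased word tokens in the text."""
    # Simplifying assumption: a word is a run of letters, optionally with
    # internal apostrophes or hyphens (e.g. "it's", "high-frequency").
    tokens = re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower())
    return Counter(tokens)


sample = "The cat saw the dog, and the dog saw the cat."
freq = word_frequencies(sample)

print(sum(freq.values()))  # running words (tokens) in the sample: 11
print(len(freq))           # different word types in the sample: 5

# List the most frequent word types first, as in a frequency list.
for word, count in freq.most_common(3):
    print(word, count)
```

A different tokenisation rule (for example, one that splits hyphenated words) would give slightly different totals, which is one reason published word counts for the same text can differ.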
6
A small number of words cover a lot of the text.
• “running words” or “tokens”: all the words in a text, including repeated words
• the sentence above has 11 running words; a and of each occur twice

High- and Low-Frequency Words
• a relatively small group of words (around 2,000) is used much more frequently than other words in the language
• the 2,000 high-frequency words include both function words and content words
• Function words: articles (a, the), conjunctions (because, but, although, and), prepositions (in, below, above), determiners (each, every, this, those), numbers
• Content words: nouns, verbs, adjectives, and adverbs
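To make the running-word count for the sentence “A small number of words cover a lot of the text” concrete, the sketch below counts its tokens and types and then measures how much of the sentence its two most frequent words cover. The helper name `coverage` is hypothetical, and splitting on whitespace is a simplification that only works because the sentence has no punctuation.

```python
from collections import Counter


def coverage(tokens, n):
    """Share of running words accounted for by the n most frequent types."""
    freq = Counter(tokens)
    covered = sum(count for _, count in freq.most_common(n))
    return covered / len(tokens)


sentence = "A small number of words cover a lot of the text"
tokens = sentence.lower().split()

print(len(tokens))       # 11 running words (tokens)
print(len(set(tokens)))  # 9 different word types
print(round(coverage(tokens, 2), 2))  # 0.36: "a" and "of" cover 4 of 11 tokens
```

The same calculation applied to a long text with the 2,000 high-frequency words is what lies behind text-coverage percentages.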
7
General Service List of English Words (GSL) by Michael West
• 2,000 word families
• lots of useful information about frequency and meanings
• it has been proven to work in graded readers
  graded readers: books specially written in a limited vocabulary, easy to read for learners of English (e.g. some books may have 300 words or fewer)
• the rest of the vocabulary is made up of low-frequency words
• most conservative estimate: 120,000 low-frequency English words (not including proper names)
• low-frequency words are always a problem for language learners and teachers (it is unpredictable when they will occur in a text)
• teachers need to deal with low- and high-frequency words differently
8
Academic and Technical Words
• academic vocabulary: an additional high-frequency word list known as the Academic Word List (AWL)
• to be learned after students acquire the 2,000 high-frequency words
• AWL (developed by Averil Coxhead): 570 word families (not in the most frequent 2,000 words), for anyone doing academic study in almost any subject area
• technical vocabulary of particular subject areas, e.g. in computing: mouse, pixel, ROM, and retrieve
9
Vocabulary Level    Number of Words    Text Coverage
high-frequency      2,000              70%
academic            570                5%
technical           1,000              20%
low-frequency       6,000
10
Academic Word List (AWL)
Averil Coxhead (1998). An Academic Word List. English Language Institute Occasional Publication No. 18.
• developed at the School of Linguistics and Applied Language Studies at Victoria University of Wellington, NZ
• a list of 570 word families, excluding words in the most frequent 2,000 words of English
• to be used by learners studying at tertiary level
• the headwords = the stem forms of the words
• the headwords of the AWL are listed on pp. 7–11
• the word families of the AWL are listed in sublists 1–10
• the word family analyse, for example, includes:
  ▪ the regular inflections of the verb: analysed, analysing, analyses
  ▪ the derivations of the word: analysis, analyst, analysts, analytical, analytically, etc.
  ▪ the American spelling: analyze, analyzed, analyzes, analyzing
• the most frequently used member of the family is in italics, e.g. analysis is the most common form of the word family analyse
11
• the word families of the AWL were selected from the words in the Academic Corpus (AC), approx. 3,500,000 words
• the AC is a written corpus of academic English: journal articles, book chapters, course workbooks, laboratory manuals, and course notes
• four faculty sections:
  ▪ Arts
  ▪ Commerce
  ▪ Law
  ▪ Science
• each faculty section: approx. 875,000 running words
• each faculty section is divided into seven subject areas of approx. 125,000 running words each