Homework review
Corpus Linguistics What is a corpus? What are corpora? A corpus is a large and principled collection of texts stored in electronic format; 'corpora' is the plural form.
How can we define corpus linguistics? The area of study known as ‘corpus linguistics’ has become increasingly popular, both as a means to explore actual patterns of language use and as a tool for developing materials for classroom language instruction. One of the major contributions of corpus linguistics is in the area of exploring patterns of language use. Corpus analysis can provide valuable insights into how language use varies in different situations, such as spoken versus written language, or formal interactions versus casual conversation.
As a result of these advances, there are typically four features that are seen as characteristic of corpus-based analyses of language: It is empirical, analysing the actual patterns of use in natural texts. It utilizes a large and principled collection of natural texts, known as a ‘corpus’, as the basis for analysis. It makes extensive use of computers for analysis, using both automatic and interactive techniques. It depends on both quantitative and qualitative analytical techniques. (From Biber, Conrad and Reppen, 1998: 4.)
Corpora Existing corpora include the British National Corpus (BNC), the Corpus of Contemporary American English (COCA), the Brown Corpus, the Lancaster/Oslo–Bergen (LOB) Corpus and the Helsinki Corpus of English Texts. However, researchers interested in exploring aspects of language use that are not represented by readily available corpora (for example, research issues relating to a particular register or time period) will need to compile a new corpus.
Corpus Design and Compilation Design and compilation need to be principled to ensure representativeness and balance; choose a reliable tool for this purpose (MonoConc, WordSmith, AntConc…); e.g. business letters example: collect a representative sample; combine quantitative and qualitative techniques. What is the size of a representative corpus? The early standard size was one million words, but there is no minimum size for a text collection to be considered a corpus.
What can a corpus tell us? 1. WORD LIST, where we can see frequency-of-occurrence information (in alphabetic or frequency order);
2. CONCORDANCING PACKAGES can provide additional information about lexical co-occurrence patterns;
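A word list of the kind described in item 1 can be sketched in a few lines of Python. This is only a minimal illustration, not how MonoConc, WordSmith or AntConc work internally: the tokenizer rule, the function names `tokenize` and `word_list`, and the sample sentence are all assumptions made for the example.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and pull out word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def word_list(text, order="frequency"):
    """Return (word, count) pairs in frequency or alphabetic order."""
    counts = Counter(tokenize(text))
    if order == "alphabetic":
        return sorted(counts.items())
    # Frequency order: most frequent first, ties broken alphabetically.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical mini-corpus, used only to show the output shape.
sample = "The cat sat on the mat. The dog sat too."

for word, freq in word_list(sample)[:2]:
    print(word, freq)
# the 3
# sat 2
```

Calling `word_list(sample, order="alphabetic")` gives the same counts sorted from "cat" to "too". A real corpus tool would also need decisions about hyphenated words, numbers and case, which this sketch ignores.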
Concordance packages use: a target word or phrase needs to be selected; a list of each occurrence of the target word in context is provided; the display (on the previous slide) shows the key word in context (KWIC); the size of the window and the amount of context can be adjusted; a concordance program can also provide information about the words that tend to occur together in the corpus. Words that commonly occur with or in the vicinity of a target word (that is, with greater probability than random chance) are called ‘collocates’, and the resulting sequences or sets of words are called ‘collocations’.
Corpus analysis and language teaching benefits Through the use of corpus analyses we can discover patterns of use that were previously unnoticed (e.g. start/begin); we can discover lexical phrases or lexical bundles that occur with greater than random frequency (e.g. the Longman Grammar of Spoken and Written English lists collocations and frequent three-, four- and five-word lexical bundle patterns by register); we can use this information in teaching and in materials creation. Task: Where do you see this kind of knowledge as valuable in ELT practice?
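The KWIC display and the idea of collocates can be sketched as follows. The functions `kwic` and `collocate_counts`, the `window` parameter and the sample text are hypothetical names chosen for this example. The collocate function only returns raw co-occurrence counts; deciding whether a word occurs near the target more often than chance would need an association measure (for example mutual information), which is left out here.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and pull out word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def kwic(tokens, target, window=3):
    """Return (left context, key word, right context) for every hit of target."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == target:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, tok, right))
    return lines

def collocate_counts(tokens, target, window=3):
    """Count the words found within `window` tokens of each hit of target."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == target:
            counts.update(tokens[max(0, i - window):i])
            counts.update(tokens[i + 1:i + 1 + window])
    return counts

# Hypothetical mini-corpus, used only to show the output shape.
sample = ("We start the meeting at nine. They begin to work at ten. "
          "We start to work early.")
tokens = tokenize(sample)

# Print the concordance with the key word aligned in the middle.
for left, key, right in kwic(tokens, "start", window=2):
    print(f"{left:>15}  [{key}]  {right}")

print(collocate_counts(tokens, "start", window=2).most_common(1))
# [('we', 2)]
```

Changing `window` corresponds to adjusting the amount of context shown around the key word. Because the sketch ignores sentence boundaries, the second hit picks up "ten" from the previous sentence; a concordancer would usually let the user control this.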
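Lexical bundles can be approximated by counting recurring word sequences (n-grams). The sketch below assumes a hypothetical function `lexical_bundles` with a raw `min_freq` cut-off and a made-up sample text; it is not the procedure used in the Longman Grammar, which is not described on these slides.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and pull out word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def lexical_bundles(tokens, n=3, min_freq=2):
    """Return n-word sequences that occur at least min_freq times."""
    grams = Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return [(" ".join(gram), count)
            for gram, count in grams.most_common()
            if count >= min_freq]

# Hypothetical mini-corpus, used only to show the output shape.
sample = ("I do not know what to say. I do not know why. "
          "You do not know what to do.")
bundles = lexical_bundles(tokenize(sample), n=3, min_freq=2)

print(bundles[0])
# ('do not know', 3)
```

Setting `n` to 4 or 5 gives four- and five-word sequences. The sketch counts sequences across sentence boundaries and uses a raw count as the threshold, both of which are simplifications made to keep the example short.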
Homework Enter the corpus, type a word in the search box and analyze it in context. Choose the KWIC option before pressing the search button.
Homework the kind of texts you chose for the analysis (e.g. spoken/written, magazines/interview…); the number of hits; the grammar words preceding and following the search word; the most frequent context; anything else that you have noticed.
Corpus linguistics – practice: General Service List; Academic Word List; New Academic Word List
PRACTICE SESSIONS ON MONDAYS Needs Analysis (The contemporary concept of needs analysis)