Use of Concordancers
A corpus (plural corpora): a large collection of texts, written or spoken, stored on a computer.
A concordancer: a computer programme used to search this database.
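
To make the definition concrete, here is a minimal sketch of what a concordancer does: it finds every occurrence of a search word and prints it with some context on either side (a keyword-in-context, or KWIC, display). The function name `kwic`, the `width` parameter and the bracketed layout are assumptions made for illustration; real concordancers work on far larger corpora and offer many more options.

```python
import re

def kwic(text, keyword, width=30):
    """Return keyword-in-context lines for every whole-word match of `keyword`."""
    lines = []
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keyword lines up in one column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines

# Two example sentences standing in for a corpus.
mini_corpus = (
    "The use of chemicals in food has caused great concern among the public. "
    "The public have expressed deep concern about the use of chemicals in food."
)

for line in kwic(mini_corpus, "concern"):
    print(line)
```

Reading down the keyword column shows the words that tend to appear next to the search word, which is the basis of the collocation tasks later in the talk.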

Considerations
General English / Academic English / Specialised English (e.g. medical, law, 1K and 2K graded, UWW corpora on Compleat Lexical Tutor)?
Written / Spoken?
Size?
Currency?
Free of charge?

Corpus Size
“I don’t think there can be any corpora, however large, that contain information about all of the areas of English ... that I want to explore [but] every corpus that I’ve had a chance to examine, however small, has taught me facts that I couldn’t imagine finding out about in any other way.” (Fillmore, 1992, p. 35)

Use of Corpora
Word lists and dictionary entries (different senses of a word / typical examples of usage / frequency information) are compiled by computational linguists using a corpus of the language.
E.g. the COBUILD project was the first project to use a computerised corpus for dictionary making. In the 1980s, Collins started to use a computerised corpus (then called the COBUILD corpus) with John Sinclair of the University of Birmingham. The Collins Cobuild Corpus now has 2.5 billion words (part of which is the Bank of English Corpus).

Major Corpora
Matching exercise

Major corpus: BNC
100 million words
Written (90%) and spoken (10%) samples
British English from the 1980s to 1993
General English

Major corpus: Brown corpus
1 million words
American English
One of the earliest corpora / compiled in the 1960s
500 text samples from 15 text categories
Searchable through Compleat Lexical Tutor at d_e.html

Major corpus: Bank of English
Part of Collins Cobuild Corpus
450 million words as of 2005 (650 million words as of 2012)
75% written and 25% spoken
70% British, 20% American and 10% others
Contemporary English

Major corpus: The Corpus of Contemporary American English (COCA)
Contemporary American English containing about 450 million words (from 1990 to 2011)
Five genres: spoken, fiction, popular magazines, newspapers, and academic journals

Major corpus: MICASE
Michigan Corpus of Academic Spoken English
Started in 1997
Contains transcripts and audio files of academic speech

Some user-friendly concordancers
Word Neighbors (developed by the Hong Kong University of Science and Technology)
COCA (needs registration)
Create your own concordance using tools provided by CAES, HKU

The use of chemicals in food has started concern in the public ...
Do we say “start concern”?

The use of chemicals in food has started concern in the public ...
Do we say “start concern”?
The use of chemicals in food has caused great concern among the public.
The public have expressed deep concern about the use of chemicals in food.

Tasks: answers
The public have expressed concern about ... / ... are of great concern to the public
Improve / increase / promote efficiency
Substitute for

COCA Corpus
What are the differences between “ardent” and “fervent”? Can they be used interchangeably?
What are the differences between “sheer” and “complete”? Can they be used interchangeably?

Create your own concordance using tools provided by CAES, HKU

How can corpora be used in the classroom?
Student A: part 2 of the talk
Student B: part 3 of the talk

Getting students to use a corpus in the classroom
Which 3 nouns come most frequently after “underlying”? Then, compare your results with examples from a dictionary.
How to use the phrase “not only ... but (also) ...”

Answers
Word Neighbors:
  Underlying cause/s
  Underlying assumptions
  Underlying principle
Cambridge Dictionary Online:
  Underlying significance
Word Neighbors:
  Not only (verb) but also (verb)
  Not only (noun) but also (noun)
  Not only (adjective) but also (adjective)
  Not only (prep + noun) but also (prep + noun)
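
The “underlying” task amounts to counting which words most often appear immediately to the right of a search word. The sketch below shows that idea on a handful of invented sentences; the function name `right_collocates` and the toy text are assumptions for illustration, and the counts say nothing about real English. It counts any following word rather than nouns only, since telling nouns apart would need a part-of-speech tagger.

```python
import re
from collections import Counter

def right_collocates(text, node, top=3):
    """Count the words that immediately follow `node` and return the `top` most common."""
    tokens = re.findall(r"[a-z']+", text.lower())
    following = Counter(
        nxt for word, nxt in zip(tokens, tokens[1:]) if word == node.lower()
    )
    return following.most_common(top)

# Invented sentences for illustration only; not real corpus data.
toy_corpus = (
    "We must treat the underlying cause. The underlying cause is unclear. "
    "Her underlying assumptions were wrong. The underlying principle is simple. "
    "Check the underlying assumptions. Find the underlying cause."
)

print(right_collocates(toy_corpus, "underlying"))
# prints [('cause', 3), ('assumptions', 2), ('principle', 1)]
```

With a real corpus in place of the toy text, students could compare such a ranked list with the examples given in a dictionary entry.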

How can concordancers be used to facilitate vocabulary learning/teaching?
Identify low-frequency words (off-list words in Vocab Profiler) that are likely to cause difficulty and can be pre-taught, and judge whether a text as a whole is likely to be difficult for students
Study words in context and increase depth of processing
Check the grammatical behaviour of words, e.g. which prepositions to use after a verb
Check collocations and lexical patterns
Find out about the frequencies of words / word combinations
Find out about the usage of a word in different text types (e.g. fiction vs academic / spoken vs written), e.g. by using “Range” on Compleat Lexical Tutor
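
The first use above, spotting off-list words before a lesson, can be sketched as a comparison between the words of a text and a list of known high-frequency words. This is only an illustration of the idea, not how Vocab Profiler itself is implemented: the function name `off_list_words` and the tiny word list are hypothetical, and the sketch ignores word families, so "retired" and "retire" would count as different words.

```python
import re

def off_list_words(text, known_words):
    """Return the distinct words in `text` that are not in `known_words`, sorted."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sorted(set(tokens) - set(known_words))

# Hypothetical mini word list standing in for a real high-frequency list.
HIGH_FREQUENCY = {
    "the", "use", "of", "in", "food", "has", "caused", "great", "among", "public",
}

lesson_text = "The use of chemicals in food has caused great concern among the public."
print(off_list_words(lesson_text, HIGH_FREQUENCY))
# prints ['chemicals', 'concern']
```

The words returned are candidates for pre-teaching; the share of off-list words also gives a rough signal of how hard the text may be.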

VOCABULARY ASSESSMENT

Vocabulary Assessment Tools
What kind of vocabulary knowledge is being tested in each of the tests?
Do you see any problems with some of the tests?

Various vocabulary assessment tools (available online)
Vocabulary Levels Tests (VLTs)
  To check vocabulary size at different word frequency levels, both receptive and productive
  2000, 3000, 5000 word levels; AWL
  Aim at a score of at least 80%
Word Association Test
  Meaning (different senses of a word), collocations
Vocabulary Knowledge Scale (VKS)
  To check “quality” or “depth” of vocab knowledge
Vocab Profiler
  Lexical richness (type/token ratio): more different words
  More frequent words or more low-frequency words being used
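
The type/token ratio mentioned under Vocab Profiler is the number of different words (types) divided by the total number of running words (tokens). A minimal sketch, assuming a simple tokeniser that lowercases the text and keeps only letters and apostrophes; the function name `type_token_ratio` is chosen for illustration:

```python
import re

def type_token_ratio(text):
    """Return distinct words (types) divided by running words (tokens)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0  # avoid dividing by zero on empty input
    return len(set(tokens)) / len(tokens)

learner_text = "He wants to retire. He spent more time with his family after he retired."
print(round(type_token_ratio(learner_text), 2))
# 12 types / 14 tokens, prints 0.86
```

A higher ratio means more different words are being used. The ratio tends to fall as texts get longer, so it is fairest to compare texts of similar length.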

Vocabulary Knowledge Scale (VKS)
“retire”
iii. I have seen this word before and I think it means “stop working because of old age” (3 pts)
iv. I know this word. It means “stop working because of old age” (3 pts)
v. I can use this word in a sentence:
  He spent more time with his family after retire. (4 pts)
  He spent more time with his family after he retired. (5 pts)
  He wants to retire. (? pts)

VKS Problems:
Self-reported in nature
Level V: is the ability to produce a sentence with the target vocab the same as the ability to use the word appropriately?

Preparation for next class
Make a plan for your assignment
For discussion next week