Daniel Nkemleke, Humboldt Kolleg Kamerun, 30/07/2008 Corpus Linguistics and Language Education: Development and Utility of the Corpus of Cameroon English.

Presentation transcript:

Corpus Linguistics and Language Education: Development and Utility of the Corpus of Cameroon English
Daniel A. Nkemleke, Department of English, Ecole Normale Supérieure, University of Yaoundé I

Outline
• Introduction: Corpus Linguistics, history
• Some (main) existing corpora
• Development of the Corpus of Cameroon English (CCE)
• Corpus utility with reference to the CCE
• Prospect

Introduction: what is Corpus Linguistics?
• The study of language based on examples of “real life” language use, collected, stored and processed via computer
• Facilitated by the advent of computer technology (1960s)
• Latin: corpus (body): body of text
  → any collection of more than one text, written or spoken

Introduction (cont'd): brief history
• Before 1940s/1950s: “early corpus linguistics”
  → corpus-based methodology (“primitive corpora”?)
• Between 1960s and 1980s: a minority of linguists continued with corpus-based work (Quirk: SEU; Francis & Kučera: Brown Corpus; Svartvik: London-Lund Corpus)
• Computer technology: major support for CL
• First African corpus: 1989 (ICE-East Africa) (Schmied 1989)
• Second African corpus: 1992 CCE (Tiamajou 1993) / Nigeria??

Introduction (cont'd): brief history
“Thirty years ago when this research started it was considered impossible to process texts of several million words in length. Twenty years ago it was considered marginally possible but lunatic. Ten years ago it was considered quite possible but still lunatic. Today it is very popular” (Thomas/Short 1996: 4)

Some (main) existing corpora

L1 Corpora
• Brown Corpus of American English
• Lancaster-Oslo/Bergen Corpus (LOB)
• London-Lund Corpus
• British National Corpus (BNC)
• Birmingham Corpus of British English

L2 Corpora
• ICE-East Africa (Kenya & Tanzania)
• Corpus of Cameroon English
• Corpus of Nigerian English ??
• Kolhapur Corpus of Indian English

Multinational Corpus Project
• International Corpus of English (ICE)

Four main characteristics of a corpus
1. Sampling & representativeness
• Interest in a whole variety of English
• Attempts to construct a “representative” sample corpus
• Which maximally represents the variety
• Aim: a picture as accurate and reasonable as possible of a language population

Four main characteristics of a corpus (cont'd)
2. Finite size
• Body of a finite number of words, e.g. 1,000,000
• Figure determined at the beginning of the project
  → Monitor corpus: constant addition of texts

Four main characteristics of a corpus (cont'd)
3. Machine-readable form
• Past: reference to printed text
• Nowadays: the term implies machine-readable form
• Few in book form (e.g. original London-Lund)
• Occasionally other forms of media (microfiche, recordings)

Four main characteristics of a corpus (cont'd)
4. Standard reference
• Tacitly, a corpus constitutes a standard reference
• Presupposition: wide availability to other researchers
• Direct comparison of results with other varieties

Development of the Corpus of Cameroon English (CCE)
• Began in 1992 with the collaboration of two British universities (Birmingham/Liverpool)
• Assistance of the British Council in Yaoundé
• Target of a million words reached in 1994
• Data used for classroom activities/research since then
• 2005: project benefited from a grant of the AvH
  → Goal: further development (tagging) of the database (TU-Chemnitz)

Objective
• Provide authentic data for the description of the main features and problems inherent in the variety of English which is written in Cameroon
• Provide a source of authentic material for English language teaching/learning in Cameroon
• Serve as a database for comparative studies on CamE in relation to other varieties of English

Text categories: written component

Text category                 No. of texts   No. of words
A: Official Press                      257        126,539
B: Private Press                        42         49,098
C: Novels & Short Stories               21         77,096
D: Religion                             19         96,380
E: Tourism                               5         26,881
F: Official letters                     77         12,285
G: Private letters                     250         79,386
H: Students’ Essays                     83        137,399
I: Government Memos                     16         71,368
J: Advertisement                        10          4,875
K: Miscellaneous                        22        139,247
TOTAL                                  802        820,554
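The balance of the written component can be checked with a little arithmetic on the figures in the table. The following is only an illustrative sketch: the dictionary name `written` and the variables `total_texts`, `total_words` and `shares` are assumptions made here, not part of the CCE project; the numbers are copied from the table.

```python
# Figures copied from the table: category -> (number of texts, number of words).
# The variable names are illustrative assumptions, not part of the CCE itself.
written = {
    "A: Official Press": (257, 126_539),
    "B: Private Press": (42, 49_098),
    "C: Novels & Short Stories": (21, 77_096),
    "D: Religion": (19, 96_380),
    "E: Tourism": (5, 26_881),
    "F: Official letters": (77, 12_285),
    "G: Private letters": (250, 79_386),
    "H: Students' Essays": (83, 137_399),
    "I: Government Memos": (16, 71_368),
    "J: Advertisement": (10, 4_875),
    "K: Miscellaneous": (22, 139_247),
}

# Recompute the totals so they can be compared with the TOTAL row.
total_texts = sum(texts for texts, _ in written.values())
total_words = sum(words for _, words in written.values())

# Share of each category in the written component, in percent of words.
shares = {cat: 100 * words / total_words for cat, (_, words) in written.items()}

# Print categories from largest to smallest share.
for cat, pct in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{cat:<28}{pct:5.1f}%")
print(f"{'TOTAL':<28}{total_texts} texts, {total_words:,} words")
```

Recomputing the totals this way is also a quick consistency check on the table: the category rows should add up to the TOTAL row.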

Text categories: spoken component

Dialogues
1. Conversations
2. Phone calls
3. Broadcast discussions
4. Classroom lessons
5. Interviews
6. Parliamentary debates
7. Legal cross-examination
8. Business transactions

Monologues
1. Commentaries
2. Demonstrations
3. Legal presentations
4. Broadcast news
5. Broadcast talks
6. Non-broadcast talks

Corpus utility with reference to CCE
• 13 possible ways in which a corpus may be useful
1. Corpora as a source of empirical data
2. Corpora in language teaching and learning
3. Corpora in lexical studies
4. Corpora in grammar studies
5. Corpora in speech research
6. Corpora and semantic studies
7. Corpora in pragmatic and discourse studies
8. Corpora in sociolinguistic studies
9. Corpora and stylistic studies
10. Corpora in historical linguistics
11. Corpora in dialectology and variational studies
12. Corpora in psycholinguistics
13. Corpora in cultural studies

1. Corpus as a source of empirical data
• Linguists can make more objective statements on language use in the variety and compare it with other varieties
  Nkemleke/Mbangwana (2001)
  Nkemleke (2003)
  Nkemleke (2004a, 2004b)
  Nkemleke (2005)
  Nkemleke (2006)
  Nkemleke (2007a, 2007b)
  Nkemleke (fc: 2008a, 2008b, 2008c)
  Schmied/Nkemleke (fc: 2008a, 2008b)
  A number of post-graduate projects in ENS/Faculty

2. Corpora in language teaching/learning
• CCE data used for classroom activities over the years

Concordances: arrive _ NP (simplification)

Value of concordances
• Support teachers’ classroom explanation
• Learners as researchers
• Data-driven learning
• Critical look at existing language teaching material
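A concordance of the kind mentioned above can be produced with very little code. The following is a minimal keyword-in-context (KWIC) sketch, not the tool used for the CCE: the function name `kwic`, the `width` parameter and the three sample sentences are assumptions invented here for illustration (the sentences are not CCE data).

```python
import re

def kwic(text, stem, width=30):
    """Return one keyword-in-context line per word that starts with `stem`."""
    lines = []
    # Match the stem plus any following word characters (arrive, arrived, ...).
    pattern = re.compile(r"\b" + re.escape(stem) + r"\w*", re.IGNORECASE)
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right:<{width}}")
    return lines

# Invented sample sentences (NOT taken from the CCE), for illustration only.
sample = ("We will arrive the village before noon. "
          "They arrived at the station late. "
          "She arrives Yaounde tomorrow.")

for line in kwic(sample, "arriv"):
    print(line)
```

With the keywords aligned in one column, learners can scan the right-hand context and see for themselves whether a preposition follows the verb, which is the kind of learner-as-researcher activity listed above.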

Natural data for textbook
• CCE data used for studies on aspects of Cameroon English usage, e.g. Hans-Georg Wolf used data from the corpus in his book English in Cameroon, published in 2001 by Mouton de Gruyter (Berlin/New York).

3. Corpora in lexical studies
• Keep informed about new words, changing meanings
• Call up word combinations, co-occurring words
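Calling up co-occurring words amounts to counting the words found within a few positions of a chosen node word. The sketch below shows one simple way to do this under stated assumptions: the function name `cooccurrences`, the window size `span` and the sample sentence are invented here for illustration and are not part of the CCE or its tools.

```python
import re
from collections import Counter

def cooccurrences(text, node, span=2):
    """Count words occurring within `span` tokens to either side of `node`."""
    # Very rough tokenisation: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == node:
            # Window of neighbours, excluding the node word itself.
            window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
            counts.update(window)
    return counts

# Invented sample text (NOT taken from the CCE), for illustration only.
sample = "strong tea and strong coffee; she likes strong tea with milk"

for word, freq in cooccurrences(sample, "strong").most_common(3):
    print(word, freq)
```

Raw counts like these favour frequent function words; a real lexical study would normally add an association measure, which is left out here to keep the sketch short.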

Prospect
• ICE-Cameroon is ongoing
• Future possibility of more specialized corpora, e.g. academic texts, fiction

END
Thank you!