Corpus Linguistics in Research. Doctorate in Education, University of Warwick, 6th November 2008.

Aims
- Understanding key terms
- Understanding the relation to research paradigms
- Consideration of usefulness to own context
- Practice in hands-on use of a corpus tool

Key Terms
- Corpus Linguistics
- Corpus and corpora
- Lemma
- Concordance

Discussion
1. What do you know about corpus linguistics or corpus approaches?
2. Have you seen online concordancers?
3. Have you ever used a corpus tool (such as WordSmith Tools)? What did you do with it?
4. Have you considered using a corpus approach as part of your project? Why / why not?

Research Issues
- Competence or Performance? The influence of Chomsky on corpus research
- Quantitative or Qualitative? Perceived as quantitative and so harder to fit to newer approaches
- Positivist or Social Constructionist? ‘Number crunching’ and statistics suggest it is a ‘scientific’ approach, which is less popular
- Technology-driven? Is research restricted to what is possible on a computer?

Advantages
- Can mitigate the inevitable bias of Discourse Analysis
- Uncovering semantic prosody: the implicit negative or positive ‘flavour’ of a collocation
- Looking at language change over time
- Useful for triangulation of data (‘confirming suspicions’)

Types of Corpus
- Specialised Corpora: usually small and tightly focused
- Sampling / Representative Corpora: an early approach, taking equal amounts of a variety of texts
- Diachronic Corpora: used to track changes over time
- Reference Corpus: can be huge, but no corpus can represent ‘the language’ (e.g. the British National Corpus, BNC)
- Monitor Corpus: kept up to date (e.g. the Bank of English, BoE)

How to Manipulate a Corpus?
- BNC and BoE provide their own search engines, which are not easily customised
- Stand-alone corpora can be explored with stand-alone concordancers, e.g. WordSmith Tools, AntConc, MicroConcord
- Possible actions: wordlist, concordance lines for a word, collocates, clusters, key words (of a smaller corpus compared to a larger one)
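The last action, key words, compares a smaller corpus with a larger one. As a rough sketch of that idea, the Python below scores each word with a log-likelihood measure, one common choice for corpus comparison. The function names and toy token lists are made up for illustration, and this is not a description of how any particular tool computes its key words.

```python
import math
from collections import Counter

def log_likelihood(a, b, c, d):
    """Two-term log-likelihood score for one word.

    a: frequency in the study (smaller) corpus
    b: frequency in the reference (larger) corpus
    c: total tokens in the study corpus
    d: total tokens in the reference corpus
    """
    expected_study = c * (a + b) / (c + d)
    expected_ref = d * (a + b) / (c + d)
    score = 0.0
    if a > 0:
        score += a * math.log(a / expected_study)
    if b > 0:
        score += b * math.log(b / expected_ref)
    return 2 * score

def keywords(study_tokens, reference_tokens):
    """Words relatively more frequent in the study corpus, highest score first."""
    study, ref = Counter(study_tokens), Counter(reference_tokens)
    c, d = len(study_tokens), len(reference_tokens)
    scored = []
    for word, a in study.items():
        b = ref[word]
        if a / c > b / d:  # keep only words over-represented in the study corpus
            scored.append((word, log_likelihood(a, b, c, d)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy data, invented for this example only.
study = "corpus corpus corpus the the tool".split()
reference = "the the the the a of and corpus to in".split()
for word, score in keywords(study, reference):
    print(f"{word}\t{score:.2f}")
```

A word that is equally frequent (relative to corpus size) in both corpora scores zero, which is why very common function words tend to drop out of a key word list.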

Hands-on
1. Open a browser and search for AntConc in Google
2. Go to Laurence Anthony’s website and download AntConc 3.1
3. Click on ‘Open Files’ and choose a ‘corpus’
4. Make a Wordlist

Wordlist Discussion
Look at the Wordlist.
- Which are the most frequent words and why?
- What are the top 5 ‘lexical’ words in your list?
- Do the most frequent words help to reveal any special features of the corpus?
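For anyone curious about what a wordlist involves, here is a minimal Python sketch: split the text into words and count them. The tokenising rule, the tiny function-word list and the sample sentence are all simplifying assumptions for illustration; a real concordancer will make different choices.

```python
import re
from collections import Counter

def tokenize(text):
    """Lower-case the text and return its words (a deliberately simple rule)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def wordlist(text, exclude=frozenset()):
    """Return (word, frequency) pairs, most frequent first."""
    counts = Counter(t for t in tokenize(text) if t not in exclude)
    return counts.most_common()

# Illustrative data only: a toy text and a very short function-word list.
sample = "The cat sat on the mat. The dog sat on the log."
FUNCTION_WORDS = frozenset({"the", "on", "a", "of", "and", "to", "in"})

print(wordlist(sample)[:3])                  # all words
print(wordlist(sample, FUNCTION_WORDS)[:3])  # 'lexical' words only
```

Filtering out function words is one quick way to get at the top ‘lexical’ words, since grammatical words such as "the" usually dominate the raw list.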

Concordance Discussion
- Scroll down the word list and choose a word that interests you. You will see a concordance is automatically generated.
- What does this reveal about the word? (position in the sentence, punctuation, immediate collocates on the left and right)
- Try sorting the words on the left or right.
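A concordance line is the search word shown with a few words of context on either side. The sketch below builds such lines in Python and sorts them by left or right context; the function names, the context width and the sample text are assumptions made for this example, and punctuation is simply discarded, which a real tool would not necessarily do.

```python
import re

def kwic(text, node, width=3):
    """Return (left context, node word, right context) for each hit."""
    tokens = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() == node.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines

def sort_lines(lines, side="right"):
    """Sort concordance lines alphabetically by left or right context."""
    index = 0 if side == "left" else 2
    return sorted(lines, key=lambda line: line[index].lower())

# Toy text, invented for this example only.
sample = "She made a decision. He made a mistake. They made an effort."
lines = kwic(sample, "made", width=2)
for left, node, right in sort_lines(lines, side="right"):
    print(f"{left:>15}  {node}  {right}")
```

Sorting on the right groups together whatever typically follows the word, which is what makes immediate collocates easy to spot by eye.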

Cluster Discussion
- Click on clusters. This reveals the ‘lexical chunks’ of which this word is a part.
- Are there any interesting patterns there?
- How useful is it to research lexical chunks?
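One simple way to think about clusters is as fixed-length word sequences (n-grams) that contain the search word, counted across the corpus. The Python sketch below takes that view; the cluster length, function name and sample text are illustrative assumptions, and a given tool may define clusters differently (for example, only with the search word at the left or right edge).

```python
import re
from collections import Counter

def clusters(text, node, n=3):
    """Count every n-word sequence that contains the node word."""
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    node = node.lower()
    counts = Counter()
    for i in range(len(tokens) - n + 1):
        gram = tokens[i:i + n]
        if node in gram:
            counts[" ".join(gram)] += 1
    return counts.most_common()

# Toy text, invented for this example only.
sample = "At the end of the day we met at the end of the road."
for chunk, freq in clusters(sample, "end", n=3):
    print(freq, chunk)
```

Chunks that recur, such as "at the end" here, are the candidates worth checking back against the concordance lines.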

Caveat
“Acknowledging what a corpus-based approach can do and what it cannot do is necessary, but should not mean that we discard the methodology altogether - we should just be more clear about when it is appropriate to use it or employ some other method.” (Baker, 2006:7)