Corpus Linguistics: session 2 Corpus Linguistics (2): The Tools of the Trade


Today’s session
An introduction to some features of tools
Demo of different (kinds of) tools
Hands-on practice with one tool
AIM: Help you know what to look for in a tool for your work (and what options there are)

TYPES OF TOOLS There are different kinds of tools.

Different kinds of tools
Online / offline
For one particular corpus / for any corpus or text
Use straight away / need to prepare corpus
'Free' / licence conditions and costs

Tools may
have different functions: concordance, wordlist, statistics, collocation, keywords…
handle annotation: interpret tags, ignore tags, treat tags as text
take different text formats: .txt, .xml, .html
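
The three ways of handling annotation can be illustrated with a small Python sketch. The tagged sentence and its `<s>`/`<w pos="...">` markup are invented for illustration and do not follow any particular corpus's scheme; real tools use their own parsers rather than regular expressions.

```python
import re

# Hypothetical annotated sentence (markup scheme invented for this example).
tagged = '<s><w pos="VERB">borrow</w> <w pos="NOUN">money</w></s>'

# Ignore tags: strip the markup and work with the plain text only.
plain = re.sub(r"<[^>]+>", "", tagged)

# Interpret tags: read each word together with its part-of-speech label.
pairs = re.findall(r'<w pos="([A-Z]+)">([^<]+)</w>', tagged)

# Treat tags as text: split the raw string on whitespace, markup included.
raw_tokens = tagged.split()

print(plain)
print(pairs)
print(raw_tokens)
```

A tool that treats tags as text will count fragments of markup as if they were words, which is why it matters to know how a tool handles an annotated corpus before trusting its counts.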

TYPICAL FUNCTIONS Different tools have different functions.

Concordance
Search word + context
Can be displayed as KWIC
Can usually be sorted
Used to see patterns of use
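
A KWIC (Key Word In Context) display can be sketched in a few lines of Python. This is a minimal illustration, not how any particular concordancer works: the function names (`kwic`, `sort_by_right`, `show`), the character-based context width and the sample sentences are all assumptions made for the example.

```python
import re

def kwic(text, node, width=30):
    """Return (left context, node, right context) for each hit of `node`."""
    pattern = re.compile(r"\b" + re.escape(node) + r"\b", re.IGNORECASE)
    lines = []
    for m in pattern.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        lines.append((left, m.group(0), right))
    return lines

def sort_by_right(lines):
    """Sort concordance lines by the first word to the right of the node."""
    def first_right_word(line):
        words = line[2].split()
        return words[0].lower() if words else ""
    return sorted(lines, key=first_right_word)

def show(lines, width=30):
    """Print the lines with the node word aligned in a central column."""
    for left, node, right in lines:
        print(f"{left:>{width}} [{node}] {right}")

# Invented sample text, for illustration only.
sample = ("May I borrow a pen? You can borrow money from a bank. "
          "Students borrow books from the library.")
hits = sort_by_right(kwic(sample, "borrow"))
show(hits)
```

Sorting on the word to the right groups similar continuations together, which is what makes patterns of use visible in a long concordance.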

KWIC Concordance

Wordlist
List all words in the corpus, alphabetically or by frequency
Used as starting point for further functions:
keywords
lexical density/readability calculations
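
As a rough sketch of what a wordlist function does, the Python below counts word forms and orders them both ways. The tokenizer is a deliberate simplification (lower-cased alphabetic strings with an optional apostrophe), and the sample text is invented. The type/token ratio at the end is only one simple example of a calculation that starts from a wordlist; it is not the same measure as lexical density.

```python
import re
from collections import Counter

def tokenize(text):
    """Very simple tokenizer: lower-cased alphabetic word forms."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def wordlist(text):
    """Count how often each word form occurs."""
    return Counter(tokenize(text))

# Invented sample text, for illustration only.
sample = "The cat sat on the mat. The dog sat too."
freq = wordlist(sample)

# Most frequent first; ties are broken alphabetically.
by_frequency = sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))
# Plain alphabetical order.
alphabetical = sorted(freq.items())

# Type/token ratio: distinct word forms divided by running words.
ttr = len(freq) / sum(freq.values())

print(by_frequency)
print(alphabetical)
print(ttr)
```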

Sampler AntConc wordlist

Collocations
Co-occurrence patterns:
borrow money
borrow books
borrow a car
May I borrow
(more in Session 3)
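
Counting co-occurrences within a window around a node word can be sketched as follows. This is only raw frequency counting under assumed settings: the function name `collocates`, the window sizes and the sample text are invented for illustration, and real tools usually add a statistical association measure on top of the counts.

```python
import re
from collections import Counter

def collocates(text, node, left=0, right=2):
    """Count words within `left`/`right` tokens of each occurrence of `node`."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        window = tokens[max(0, i - left):i] + tokens[i + 1:i + 1 + right]
        counts.update(window)
    return counts

# Invented sample text, for illustration only.
sample = ("You can borrow money from a bank. May I borrow a car? "
          "They borrow books and borrow money.")

# Words immediately following "borrow".
right_collocates = collocates(sample, "borrow", left=0, right=1)
print(right_collocates.most_common())
```

Changing `left` and `right` corresponds to the span setting found in most collocation tools, for example one word to the left only, or several words on either side.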

Collocates: adjectives immediately preceding BUSINESS (Corpus of Contemporary American English)

Visualization
Graphs
Word clouds
Distribution displays
Etc.

Example: BNCweb

borrow

Example: Voyant Tools

‘borrow’
Compare your intuition to what you find in the corpus.
What is borrowed and by whom?
What words do you expect to find together with borrow?
Can these words be grouped in some way, for example based on their word class, function, or meaning?
Where would you expect these words, e.g. before or after borrow? Immediately adjacent or not?
Who do you think uses the word borrow?
In what context or type of language would you find borrow?
Are there any words that are NOT used with borrow?

AntConc
Download AntConc for free from: (or just search for AntConc)
Use your own texts and corpora.
Find some examples at:

Tip of the week Register to use the BYU corpora for free.

Next week (Session 3)
Collocation
Corpus linguists claim to have identified an important principle that is responsible for the creation of much of the meaning of texts: collocation (co-occurrences). What is it, and are the claims true?
Optional reading:
* Xiao, Richard, and Tony McEnery (2006). "Collocation, Semantic Prosody, and Near Synonymy: A Cross-Linguistic Perspective." Applied Linguistics 27(1):
