BYU COCA: CORPUS OF CONTEMPORARY AMERICAN ENGLISH


BYU COCA: CORPUS OF CONTEMPORARY AMERICAN ENGLISH Workshop Purdue University November 2015

Agenda
- Essential background: COCA, other BYU corpora, basics of the interface
- Search functions: information & practice
- Search syntax: information & practice
- Results analysis
- Activities
- (Possibly: pedagogical uses)

COCA: Overview (1 & 2)
“The Corpus of Contemporary American English (COCA) is the largest freely-available corpus of English, and the only large and balanced corpus of American English.” (COCA website)
- Corpus: a database of texts that you can query
- Text types (registers) in COCA: spoken, fiction, popular magazines, newspapers, and academic (page 2)
- Timeframe of the COCA collection: 1990-2012
Let’s look at the text types here.

COCA and other corpora (3)
- Wikipedia Corpus
- Global Web-based English (GloWbE): the power to compare across dialects, e.g. US/UK
- Corpus of Historical American English (COHA): texts from 1810-2000
- Time Magazine Corpus
- British National Corpus (BNC)
It is important to underline: these corpora share the same interface and functions. Once you become fluent in using one corpus, you will be able to use the others.
Question: What exactly might a researcher be looking for when looking up the same words and phrases in:
- Wikipedia and GloWbE
- COCA and BNC
- COHA and COCA

COCA Interface: Welcome Screen
The interface consists of 3 active, independent frames.

COCA Interface: Results Display

COCA Interface: How to search?
- Display: List, Chart, KWIC, Compare
- Search String (clicking the word “collocates” turns the function on and off; the same goes for POS)
- Sections: Registers (Spoken, Fiction, Magazine, Newspaper, Academic); Time of publication; Subregisters, e.g. MAG: Sci/Tech, FIC: Juvenile
Click-and-scroll time (click on Collocates, POS List, Section scroll)

Corpus: What to search for? Cheat Sheet
- words: mysterious
- phrases: nooks and crannies, or faint + noun
- lemmas: all forms of a word, like sing or tall
- wildcards: un*ly or r?n*
- complex searches: such as un-X-ed adjectives, or verb + any word + a form of ground
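To make the wildcard entries on the cheat sheet concrete, here is a minimal Python sketch of how such patterns pick out words. It assumes that * stands for any number of characters and ? for exactly one character, and it uses Python's standard fnmatch module as a stand-in; it is not the COCA search engine, and the word list is made up for illustration.

```python
from fnmatch import fnmatchcase

def match_wildcard(pattern, words):
    """Return the words that match a wildcard pattern.

    Assumption: '*' = any number of characters, '?' = exactly one character.
    """
    return [w for w in words if fnmatchcase(w.lower(), pattern)]

# Hypothetical word list, for illustration only.
words = ["unlikely", "unruly", "ugly", "run", "rn", "rang", "ring", "running"]

print(match_wildcard("un*ly", words))  # words starting with "un" and ending in "ly"
print(match_wildcard("r?n*", words))   # "r", any one character, "n", then anything
```

Note that "rn" does not match r?n* because ? requires one character between r and n.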

COCA Interface: What are tags?
- Phrases: faint + noun is entered as faint [nn*]
- Tags can easily be checked in the POS list
- Add a space between the word and the tag
- Tag system: CLAWS7
- Tags are assigned by an automated tagger (some are wrong, but the margin of error is small)
Let’s check the tags for:
- singular nouns
- wh- adverbs (when, where, why, how)

Activities time!

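Activity 3
- FREQ: the number of tokens (raw frequency)
- Per million: the number of tokens per million words, so frequencies can be compared across sections of different sizes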
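The per-million figure is a simple normalization of the raw frequency. A minimal sketch of the arithmetic follows; the function name and the counts are hypothetical and are not taken from COCA.

```python
def per_million(freq, corpus_size):
    """Normalize a raw token count to occurrences per million words."""
    return freq * 1_000_000 / corpus_size

# Hypothetical counts, for illustration only:
# the same raw frequency in two sections of different sizes.
print(per_million(100, 20_000_000))   # smaller section
print(per_million(100, 100_000_000))  # larger section
```

The same raw count yields a higher per-million value in the smaller section, which is why the normalized column is the one to use when comparing registers.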

Activity 4.

Activity 5. Collocates: the delimiting function
Search for any (*) noun collocates of the word laugh (used as a noun) within 5 words before or after it.
“Crystal threw back her head and laughed, a throaty little laugh of sheer exuberance with a sort of purr in it. In a moment he”
LEFT (positions 5 4 3 2 1): and laughed a throaty little
Node: laugh
RIGHT: of sheer exuberance with
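The idea of a collocate window can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it counts every word within the span on either side of the node, uses a simple regular-expression tokenizer, and does no part-of-speech filtering, so unlike the COCA search it does not restrict results to nouns. The function name is hypothetical.

```python
import re
from collections import Counter

def collocates(text, node, span=5):
    """Count words occurring within `span` words left or right of `node`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            left = tokens[max(0, i - span):i]    # up to `span` words before the node
            right = tokens[i + 1:i + 1 + span]   # up to `span` words after the node
            counts.update(left + right)
    return counts

sentence = ("Crystal threw back her head and laughed, a throaty little laugh "
            "of sheer exuberance with a sort of purr in it.")
print(collocates(sentence, "laugh"))
```

Words outside the window, such as Crystal or sort, are not counted.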

Activity 6. KWIC: looking at the prepositions used with research.
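A KWIC (keyword in context) display lines up every hit of the search word with a few words of context on each side, which makes patterns such as the following preposition easy to spot. Below is a minimal sketch of that layout; the function name, the column width and the sample sentence are assumptions for illustration, not COCA output.

```python
def kwic(text, node, width=4):
    """Return one aligned line per occurrence of `node`, with `width` words of context."""
    tokens = text.split()
    lines = []
    for i, tok in enumerate(tokens):
        # Compare case-insensitively and ignore surrounding punctuation.
        if tok.lower().strip(".,;:!?\"'") == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left:>30}  {tok}  {right}")
    return lines

# Made-up sample text, for illustration only.
sample = "Recent research on corpora builds on earlier research into collocation."
for line in kwic(sample, "research"):
    print(line)
```

Right-aligning the left context keeps the node in a single column, so the words immediately after it (here on and into) can be scanned at a glance.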

Pedagogical applications of corpora: Word and Phrase analysis (http://www.wordandphrase.info)

THANK YOU!