Introduction to Corpus Linguistics: Key Word Analysis


Introduction to Corpus Linguistics: Key Word Analysis John Corbett & Wendy Anderson

Review
So far we have looked at:
Word frequencies
Manual interpretation of concordance lines
Using statistics to measure collocation in terms of frequency and probability
Colligation (= collocation of grammatical categories)
Concordance/dispersion plots
But what if we want to ask whether a feature in a particular corpus/text is unusually frequent or infrequent? To answer this, we can compare corpora using key word analysis.

This session
What is key word analysis and why is it useful?
Comparison of corpora: specialised & reference
Cultural key words
Significance of key words
Using AntConc (freeware)

Choosing corpora to compare
For key word analysis you need:
A specialised corpus you want to explore (Corpus A)
A (usually larger) reference corpus (Corpus B)
Let’s say Corpus A is a corpus of broadcast media reports, or works of fiction by a particular writer. What should Corpus B be?

Specialised and reference corpora…
The nature of the specialised and reference corpora will be determined by your research questions (and also by practical considerations like access).
Comparing a specialised corpus of news broadcasts (Corpus A) to a balanced general reference corpus like the BNC or COCA would show you the significant lexis in Corpus A compared to the language in general (ie BrEng or AmEng).
Comparing a specialised corpus of news broadcasts (Corpus A) to a reference corpus of the same genre (Corpus B) will show you the significant lexis in Corpus A against the language in texts of the same type.
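Since the reference corpus is usually larger than the specialised one, raw counts from the two cannot be compared directly. The sketch below shows one common way of putting them on the same scale, by normalising to occurrences per million words. The function name and all counts are invented for illustration; they are not taken from any of the corpora mentioned here.

```python
def per_million(count, corpus_size):
    """Return how often an item occurs per million tokens."""
    return count * 1_000_000 / corpus_size


# Hypothetical counts, invented purely for illustration.
freq_a = per_million(50, 10_000)         # Corpus A: 50 hits in 10,000 tokens
freq_b = per_million(2_000, 1_000_000)   # Corpus B: 2,000 hits in 1,000,000 tokens

# A ratio above 1 means the item is relatively more frequent in Corpus A.
ratio = freq_a / freq_b
print(freq_a, freq_b, ratio)
```

A ratio on its own says nothing about whether the difference is statistically significant, which is what the keyness measures in the later slides are for.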

Doing it yourself…
Step 1: Build a specialised corpus (or even just look at a single text).
For example, collect one or more political leaders’ speeches from the web: http://www.bbc.co.uk/news/uk-scotland-11560698
Copy and save as a plain text file, eg ‘Alex Salmond Speech 1’
Add more texts if you wish.

Doing it yourself…
Step 2: Find a reference corpus, eg political texts from the SCOTS corpus.
Go to www.scottishcorpus.ac.uk
Click on Advanced Search
Click on Written > TextType and choose ‘Written record of speech’

Doing it yourself…
Select some or all of the Scottish Parliament Body texts
Click on Download and save as a zip file.
Unzip the plain text files into a folder (you might call it ‘Parliamentary Reference Corpus’)

Choose a text analysis program, eg AntConc
Choose your specialised corpus by clicking File > Open File and browsing for your target text(s)

Click the Tool Preferences menu and choose Keyword Preferences

In Keyword Preferences…
Choose a statistical measure of ‘keyness’ (the most common is ‘log likelihood’ but you can choose ‘chi square’)
Choose a threshold for the number of keywords to be displayed (eg top 100)
Choose whether or not to display ‘negative keywords’ (ie those words in the specialised corpus that have an unusually low frequency compared to the reference corpus)
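To make the two keyness measures concrete, here is a minimal sketch of how each can be calculated for a single word from four numbers: its count in each corpus and the size of each corpus. It uses the two-term log-likelihood formula widely used in corpus linguistics and an uncorrected Pearson chi-square. Whether AntConc computes exactly these variants is an assumption, and the function names and counts are hypothetical.

```python
import math


def log_likelihood(a, b, c, d):
    """Log-likelihood keyness for one word.

    a: count in the specialised corpus   b: count in the reference corpus
    c: tokens in the specialised corpus  d: tokens in the reference corpus
    """
    e1 = c * (a + b) / (c + d)  # expected count in the specialised corpus
    e2 = d * (a + b) / (c + d)  # expected count in the reference corpus
    total = 0.0
    if a > 0:  # a zero count contributes nothing to the sum
        total += a * math.log(a / e1)
    if b > 0:
        total += b * math.log(b / e2)
    return 2 * total


def chi_square(a, b, c, d):
    """Pearson chi-square (no continuity correction) on the 2x2 table
    'this word / all other words' by 'specialised / reference'.
    Assumes the word occurs at least once overall."""
    n = c + d
    observed = [(a, c - a), (b, d - b)]
    row_totals = (c, d)
    col_totals = (a + b, n - a - b)
    chi2 = 0.0
    for r in range(2):
        for k in range(2):
            expected = row_totals[r] * col_totals[k] / n
            chi2 += (observed[r][k] - expected) ** 2 / expected
    return chi2


def is_negative_keyword(a, b, c, d):
    """True if the word is relatively less frequent in the specialised corpus."""
    return a / c < b / d


# Hypothetical counts: 50 hits in 1,000 tokens vs 100 hits in 10,000 tokens.
print(log_likelihood(50, 100, 1_000, 10_000))
print(chi_square(50, 100, 1_000, 10_000))
```

Both statistics are zero when the word has the same relative frequency in both corpora and grow as the frequencies diverge. With one degree of freedom, a value above 3.84 corresponds to p < 0.05. Neither statistic carries a sign, so a separate check such as `is_negative_keyword` is needed to tell over-use from under-use.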

In Keyword Preferences…
Choose a reference corpus at the bottom of the Keyword Preferences menu.
You can choose a ‘Directory’, ie a folder with a group of files.
Choose the Parliamentary Reference Corpus folder as your reference corpus.

In Keyword Preferences… Click ‘Apply’

Back at the main screen
Click ‘Words’ in the ‘Search Term’ option
Click ‘Treat all data as lowercase’
Keep ‘Sort by Keyness’
Click Start
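The settings above (lowercase everything, score each word, sort by keyness) can be sketched in a few lines of code. This is only an approximation of what a tool like AntConc does, not its actual implementation. The tokenising rule, the function names and the two toy texts are all assumptions made for illustration. Negative keywords are given a negative score so that they sort to the bottom of the list.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately simple rule)."""
    return re.findall(r"\w+(?:'\w+)?", text.lower())


def signed_keyness(a, b, c, d):
    """Log-likelihood, made negative when the word is under-used in the target."""
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    total = 0.0
    if a > 0:
        total += a * math.log(a / e1)
    if b > 0:
        total += b * math.log(b / e2)
    score = 2 * total
    return score if a / c >= b / d else -score


def keywords(target_text, reference_text, top=100):
    """Return up to `top` (word, score) pairs, highest keyness first.
    Assumes both texts contain at least one token."""
    target = Counter(tokenize(target_text))
    reference = Counter(tokenize(reference_text))
    c, d = sum(target.values()), sum(reference.values())
    scores = {w: signed_keyness(target[w], reference[w], c, d)
              for w in set(target) | set(reference)}
    # Sort by descending score; break ties alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top]


# Toy texts, invented for illustration only.
speech = "We will protect Scotland. We will protect the NHS."
reference_sample = ("The committee will consider the cost. "
                    "The committee will report the cost to the parliament.")

ranked = keywords(speech, reference_sample)
for word, score in ranked:
    print(f"{word:12} {score:8.2f}")
```

With real data the two strings would be replaced by the contents of the saved speech file and of the files in the reference folder. Texts this short produce scores far too small to be meaningful, so the toy output only shows the ordering.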

RESULTS!!!

Provisional interpretation of results
We are comparing one political speech with a reference corpus of Scottish political discourse.
Notice the unusual frequency of ‘I/we’ – is this a charismatic leader and a rhetoric of inclusion? Notice the intensification of ‘I’ towards the end.
‘Scotland/Scottish/nation’ – is this a nationalist speech?
‘Labour’ – the main political rival in Scotland
‘protect/NHS’ – government as carer
Personal names (‘Jimmy’ = ally; ‘Cameron’ = enemy)

Provisional interpretation of results
We are comparing one political speech with a reference corpus of Scottish political discourse.
Notice the unusual infrequency of ‘finance’/‘cost’ – is this a speech that avoids economics?
‘problem’ – does the speech focus on upbeat topics?
The keyword analysis, however, should only act as a point of departure for broader analysis.

Keywords and concordance plots
It is sometimes interesting to look at how keywords are dispersed in a text. Simply load your text as for keywords, choose ‘concordance plot’ from the tab, and run a word search. Here, I have chosen ‘games’ for the Alex Salmond speech. It occurs four times in the speech…but where? And ‘I’ occurs 77 times, but where? Run concordance plots to find out.
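The idea behind a concordance plot can be sketched as follows: find where each hit falls in the running text, then mark those positions along a bar that represents the whole text from start to finish. The function names and the sample text are hypothetical, and the output is a rough text-only stand-in for the graphical plot a concordancer draws.

```python
import re


def hit_indices(text, word):
    """Return (token positions of each hit, total number of tokens)."""
    tokens = re.findall(r"\w+(?:'\w+)?", text.lower())
    hits = [i for i, tok in enumerate(tokens) if tok == word.lower()]
    return hits, len(tokens)


def text_plot(hits, total, width=50):
    """Draw the text as a bar of `width` cells: '|' marks a cell that
    contains at least one hit, '.' a cell with none."""
    cells = ["."] * width
    for i in hits:
        cells[i * width // total] = "|"
    return "".join(cells)


# Sample text invented for illustration only.
sample = ("the games begin "
          + "and we talk of other things " * 3
          + "until the games end")

hits, total = hit_indices(sample, "games")
plot = text_plot(hits, total, width=25)
print(plot)
```

A word that clusters at the start and end of a text, as in this sample, produces marks at both edges of the bar and none in the middle.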

‘Concordance plot’ for ‘games’ in Salmond speech

‘Concordance plot’ for ‘I’ in Salmond speech

Take-home messages
Key word analysis is used when comparing corpora.
Statistical programs are used to calculate words that appear unusually frequently or infrequently in one corpus, as opposed to the other one.
This kind of analysis can tell us something interesting about the content, style and/or ideology of the corpus.
We can combine key word analysis with other types of corpus search (eg concordance plots) to increase our understanding of the text.