Introduction to Corpus Linguistics: Dispersion/concordance plots


Introduction to Corpus Linguistics: Dispersion/concordance plots John Corbett and Wendy Anderson

This session: Understanding concordance (or ‘dispersion’) plots. What are they and why are they useful? Some case studies using free and commercial text analysis software, and developing our own questions.

How does a topic pattern within texts? Sometimes it is useful to know where in a text a word or group of related words occurs. Some text analysis programs (like the freeware AntConc or the commercially available WordSmith) can show concordance/dispersion plots that indicate the positions in the text where the word or words occur.

Download AntConc from www.laurenceanthony.net/software.html

Download the text(s) you want to analyse: copy the text you want, clean it up and save it as a plain text file.
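The clean-up step can also be scripted. The following is a minimal sketch, assuming that ‘cleaning up’ means normalising line endings and collapsing stray whitespace; the function names `clean_text` and `save_plain_text` are hypothetical, and what actually needs removing (headers, licence text, page numbers) depends on your source.

```python
import re
from pathlib import Path


def clean_text(raw: str) -> str:
    """Normalise line endings and collapse stray whitespace (assumed clean-up rules)."""
    text = raw.replace("\r\n", "\n").replace("\r", "\n")
    text = re.sub(r"[ \t]+", " ", text)     # collapse runs of spaces and tabs
    text = re.sub(r" ?\n ?", "\n", text)    # trim spaces around line breaks
    text = re.sub(r"\n{3,}", "\n\n", text)  # keep at most one blank line in a row
    return text.strip() + "\n"


def save_plain_text(raw: str, path: str) -> None:
    """Write the cleaned text as a UTF-8 plain text file."""
    Path(path).write_text(clean_text(raw), encoding="utf-8")
```

Saving as UTF-8 is a choice made here for the sketch; check which encoding your concordancer is set to read.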

Open your plain text file using AntConc. Open File View to check the file is OK. Click on Word List to generate frequencies. Click on Concordance and search for your node (eg ‘independence’).
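To see what the Word List and Concordance steps compute, here is a rough Python sketch. It is not how AntConc is implemented: the tokeniser (lower-cased alphabetic strings), the function names and the sample sentence are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and keep alphabetic words (a deliberately simple tokeniser)."""
    return re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower())


def word_list(tokens):
    """Frequency list: (word, count) pairs, most frequent first."""
    return Counter(tokens).most_common()


def kwic(tokens, node, width=4):
    """Key word in context: (left context, node, right context) for each hit."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


# Invented sample text, used only to exercise the functions.
sample = ("We hold that independence matters. The path to independence "
          "is long, and independence has a cost.")
tokens = tokenize(sample)
print(word_list(tokens)[:3])
for left, node, right in kwic(tokens, "independence", width=3):
    print(f"{left:>25}  [{node}]  {right}")
```

Real concordancers offer more tokenisation options (case sensitivity, wildcards, multi-word nodes), so counts from this sketch may differ slightly from a tool's output.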

Then click on ‘Concordance Plot’ to find out where in the text the node occurs…
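The idea behind the plot can be sketched in a few lines of Python: record the token index of every hit, then scale those indices onto a fixed-width strip so that each hit becomes a vertical bar. This is an illustrative approximation, not AntConc's own code; the names `hit_indices` and `plot_strip`, the tokeniser and the strip width are assumptions.

```python
import re


def tokenize(text):
    """Lower-case the text and keep alphabetic words (a deliberately simple tokeniser)."""
    return re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower())


def hit_indices(tokens, node):
    """Token positions (0-based) at which the node word occurs."""
    return [i for i, tok in enumerate(tokens) if tok == node]


def plot_strip(indices, n_tokens, width=40):
    """One-line text 'barcode': '|' marks a cell containing at least one hit."""
    cells = ["."] * width
    for i in indices:
        # Integer scaling maps token index i (0 <= i < n_tokens) to a cell 0..width-1.
        cells[i * width // n_tokens] = "|"
    return "".join(cells)


# Invented sample text: the node is bunched at the start and the end.
sample = ("independence independence " + "filler " * 20 + "independence")
tokens = tokenize(sample)
print(plot_strip(hit_indices(tokens, "independence"), len(tokens), width=20))
```

Because several hits can fall into the same cell, a narrow strip hides how dense a bunch is; a wider strip or a per-cell count would show more detail.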

Summing up… concordance plots: Certain concordancing programs, like AntConc, allow us to look at the way words are patterned or dispersed through a text. We can see quickly where there is ‘bunching’, ie where in a text a topic is covered and where it is neglected. Comparing texts, we can see similarity or difference in the expression of particular concepts as the texts develop.

Developing your research tools… So far, we have been considering a range of corpus analysis tools to analyse certain features of large bodies of texts. We have been looking at: frequency; manual interpretation of concordances; statistical measures of collocation (MI, t-score, z-score…); colligation; concordance/dispersion plots. How would you develop and research your own question?
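As a reminder of how two of the collocation measures listed above are calculated, here is a small sketch. It assumes the simplest textbook formulation, with expected frequency E = f(node) × f(collocate) / N and no adjustment for the size of the collocation window; individual tools may compute E differently, so treat the numbers as indicative only. The function names are hypothetical.

```python
import math


def expected(f_node, f_coll, n_tokens):
    """Expected co-occurrence frequency under independence (no span adjustment)."""
    return f_node * f_coll / n_tokens


def mi_score(observed, f_node, f_coll, n_tokens):
    """Mutual information: log2 of observed over expected co-occurrences."""
    return math.log2(observed / expected(f_node, f_coll, n_tokens))


def t_score(observed, f_node, f_coll, n_tokens):
    """t-score: (observed - expected) divided by the square root of observed."""
    return (observed - expected(f_node, f_coll, n_tokens)) / math.sqrt(observed)


# Invented counts for illustration: node and collocate each occur 10 times
# in a 100-token text and co-occur 4 times.
print(mi_score(4, 10, 10, 100), t_score(4, 10, 10, 100))
```

MI tends to favour rare but exclusive pairings, while the t-score favours frequent ones, which is why the two are often consulted together.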

Some research questions (based on Davies 2010): Lexical change and innovation: How have words such as ‘globalisation’, ‘teenage’, ‘adolescent’, ‘same-sex’, ‘mentor’, ‘downsize’, etc entered the language? Have other words disappeared (eg ‘scullery’) or reappeared with a different sense (eg ‘wireless’)? Morphological change and innovation: How have suffixes like ‘-nik’, ‘-gate’ and ‘-ista’ entered and diffused through the language? How about particular words like ‘flammable’, ‘inflammable’ and ‘uninflammable’?

TIME magazine: frequency of ‘scullery’

TIME magazine: frequency of ‘wireless’
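Frequencies from different periods can only be compared fairly if they are normalised, since each period contributes a different amount of text. The sketch below assumes the common ‘per million words’ normalisation; the counts and corpus sizes are invented purely for illustration and are not figures from the TIME corpus.

```python
def per_million(count, corpus_size):
    """Normalised frequency: occurrences per million words of text."""
    return count * 1_000_000 / corpus_size


# Invented (raw count, words in period) pairs; NOT real TIME magazine data.
toy_counts = {
    "period A": (30, 7_500_000),
    "period B": (8, 16_000_000),
}

for period, (count, size) in toy_counts.items():
    print(f"{period}: {per_million(count, size):.2f} per million words")
```

Here the raw count falls from 30 to 8, but the normalised figures show the real scale of the drop once the larger size of the second period is taken into account.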

More research questions: Syntactic questions: What are the most frequent phrasal verbs in British English, American English or different genres? How is the ‘get-passive’ used in English? How are constructions like ‘end up V-ing’ used in English? Semantic questions: Are the central and typical meanings of lexical items the same or different? Is ‘gather’ used more frequently to mean ‘collect’ or ‘understand’? How have the meanings of ‘gay’, ‘lame’, ‘green’, etc changed in recent decades?

Even more research questions: Discourse analysis: What are we saying about issues like gender, the environment, politics, terrorism, revolution, immigration, etc that is different from what was said 10, 30 or 100 years ago? In any given text about gender, the environment, politics, terrorism, etc, how are particular words or expressions (eg ‘woman’ or ‘battle’ or ‘I’) distributed? Think of a question that particularly interests you. How would you begin to explore it? Which corpus tools would you select first?