
Using Corpora in Linguistics Introduction to WordSmith Tools for Beginners Íde O’Sullivan Regional Writing Centre www.ul.ie/rwc.


1 Using Corpora in Linguistics Introduction to WordSmith Tools for Beginners Íde O’Sullivan Regional Writing Centre www.ul.ie/rwc

2 Corpus Linguistics
• McEnery and Wilson (2001:1) describe corpus linguistics as “the study of language based on examples of ‘real life’ language use”.
McEnery, T. and Wilson, A. (2001) Corpus Linguistics (2nd edition). Edinburgh: Edinburgh University Press.

3 Corpus: Definition
• “A corpus is [the name given to] a set of texts which has been put together for some purpose, usually (though not necessarily), in computer-readable form” (Wray, Trott & Bloomer, 1998:213).
Wray, A., Trott, K. and Bloomer, A. (1998) Projects in Linguistics: A Practical Guide to Researching Language. London, New York: Arnold.

4 Corpus: Definition
• “a corpus typically implies a finite body of text, sampled to be maximally representative of a particular variety of a language, and which can be stored and manipulated using a computer” (McEnery and Wilson, 2001:73).
• Corpus ≠ Archive

5 Concordancing: Definition
• “A concordance, in its simplest form, is an alphabetical listing of the words in a text, given together with the contexts in which they appear”.
Catherine Ball, Concordances & Corpora: Tutorial: http://www.georgetown.edu/faculty/ballc/corpora/tutorial.html

6 Concordancing: Definition
• “A concordance is a list of examples of a particular word, part of a word or combination of words, in its contexts drawn from a text corpus. The search word is sometimes also referred to as a keyword. The most common way of displaying a concordance is by a series of lines with the keyword in context (KWIC)”.
Kettemann, B. (1995) “Concordancing in stylistics teaching”, in Grosser, W., Hogg, J. and Hubmeyer, K. (eds), Style: Literary and Non-Literary. Contemporary Trends in Cultural Stylistics. New York: The Edwin Mellen Press: 307-318.
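The KWIC display described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how WordSmith Tools works internally; the function name `kwic`, the `width` parameter and the sample sentence are all made up for this example.

```python
import re

def kwic(text, keyword, width=30):
    """Return one keyword-in-context line per occurrence of `keyword`."""
    lines = []
    # Whole-word, case-insensitive match of the search word.
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keywords line up in one column.
        lines.append(f"{left:>{width}} {match.group(0)} {right:<{width}}")
    return lines

# Hypothetical sample text, used only to show the layout.
sample = ("A corpus is a set of texts. A small corpus can still be useful. "
          "Every corpus has a purpose.")
for line in kwic(sample, "corpus", width=20):
    print(line)
```

Because the left context is padded to a fixed width, the search word always starts in the same column, which is what makes patterns to its left and right easy to see.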


8 Software to Analyse Corpora
• “Concordancing software enables you to discover patterns that exist in natural language by grouping text in such a way that they are clearly visible […] The real value of the concordancer lies in this question of visibility” (Tribble & Jones, 1997:3).
Tribble, C. and Jones, G. (1997) Concordances in the Classroom: Using Corpora in Language Education. Houston TX: Athelstan.


10 Using Corpora in Language Learning and Teaching: Organisation of the CD
• This CD contains a collection of small genre-specific academic and journalistic corpora in English, French, Gaeilge, German and Spanish.
• For each language there are two small genre-specific corpora: a journalistic corpus (100,000 words) and an academic corpus (50,000 words). The journalistic corpora are divided into four subcorpora: current affairs, editorials, reviews and sport. The academic corpora are divided into two subcorpora: theses and articles.

11 Using Corpora in Language Learning and Teaching: Organisation of the CD

12 Sources of Journalistic Corpora
English: Irish Examiner, Irish Independent, Irish Times
French: Le Monde, L’Humanité
Gaeilge: Beo, Foinse, Lá
German: Die Süddeutsche Zeitung, Die Frankfurter Allgemeine Zeitung
Spanish: La Vanguardia, El Periódico

13 Sources of Academic Corpora
• Articles and theses written by native speakers
• Subject areas: Literature, Cultural Studies, Translation Studies, Education, Applied Linguistics, Sociolinguistics, Corpus Linguistics, Media Studies, Language Pedagogy, Teacher Training, Discourse Analysis, Politics, Research Methodology, Second Language Acquisition, History of Language

14 WordSmith Tools
• Wordlists
  - Frequency
  - Alphabetical order
  - Statistical information
• Keywords
• Concord
  - Collocations
  - Clusters
  - Patterns
  - Plot
  - Source text
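To make the three wordlist views concrete, here is a minimal Python sketch of a frequency-ordered list, an alphabetical list and some basic statistics. It only approximates what a wordlist tool reports; the tokenisation rule, the function names `tokenize` and `build_wordlist`, and the sample text are assumptions made for this example.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and return word tokens (letters, with inner hyphens or apostrophes)."""
    return re.findall(r"[^\W\d_]+(?:['’-][^\W\d_]+)*", text.lower())

def build_wordlist(text):
    """Count how often each word form occurs."""
    return Counter(tokenize(text))

# Hypothetical sample text standing in for a corpus file.
sample = "The corpus is small. The corpus is genre-specific."
counts = build_wordlist(sample)

# Frequency view: most frequent first, ties broken alphabetically.
by_frequency = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
# Alphabetical view.
alphabetical = sorted(counts.items())
# Simple statistics: running words (tokens), distinct words (types) and their ratio.
tokens = sum(counts.values())
types = len(counts)

print(by_frequency)
print(alphabetical)
print(f"tokens={tokens} types={types} type/token ratio={types / tokens:.2f}")
```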

15 WordSmith Tools
• Concord
  - Sorting data
  - Concord expansion option
  - Concordance with multiple views
  - Settings
  - Wildcards
  - Advanced searching
  - Close texts
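As a rough illustration of wildcard searching, the sketch below turns a search word containing wildcards into a regular expression. It assumes a simple convention in which `*` stands for any run of word characters and `?` for exactly one; the wildcard syntax of a particular concordancer may differ, so check its documentation. The function name `wildcard_to_regex` and the sample sentence are made up for this example.

```python
import re

def wildcard_to_regex(pattern):
    """Translate a wildcard search word into a compiled whole-word regex.

    Assumed convention: '*' = any run of word characters, '?' = one word character.
    """
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(r"\w*")
        elif ch == "?":
            parts.append(r"\w")
        else:
            parts.append(re.escape(ch))
    return re.compile(r"\b" + "".join(parts) + r"\b", re.IGNORECASE)

# Hypothetical sample sentence.
text = "She analyses corpora; he analysed a corpus and is analysing another."
print(wildcard_to_regex("analys*").findall(text))
print(wildcard_to_regex("corp*").findall(text))
```

A search such as `analys*` gathers all inflected forms of a verb into one concordance, which is useful when the pattern of interest belongs to the lemma rather than to a single word form.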

16 Worksheet
• Run individual wordlists for the Academic Corpus and the Journalistic Corpus. Compare and contrast your findings to reach relative conclusions about each genre.
• Run a concordance list for a chosen aspect of the language:
  - Do any collocational patterns emerge from this evidence?
  - What are the most common clusters including the search word(s)?
  - Identify the most common uses of the word. Are there exceptions to these uses?
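When comparing wordlists from the two corpora, remember that they differ in size (100,000 and 50,000 words), so raw counts are not directly comparable. The sketch below shows one common way to handle this: normalise to frequency per thousand words and, as an optional further step, compute a log-likelihood score for the difference. The counts are invented for illustration, the function names are hypothetical, and this is not necessarily the statistic that a given keyword tool applies by default.

```python
import math

def per_thousand(count, corpus_size):
    """Normalised frequency: occurrences per 1,000 running words."""
    return 1000 * count / corpus_size

def log_likelihood(count_a, size_a, count_b, size_b):
    """Log-likelihood score for a word's frequency difference between two corpora."""
    total = count_a + count_b
    # Expected counts if the word were equally frequent in both corpora.
    expected_a = size_a * total / (size_a + size_b)
    expected_b = size_b * total / (size_a + size_b)
    score = 0.0
    for observed, expected in ((count_a, expected_a), (count_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            score += observed * math.log(observed / expected)
    return 2 * score

# Invented counts for one word: 40 hits in the journalistic corpus, 60 in the academic corpus.
journalistic_hits, journalistic_size = 40, 100_000
academic_hits, academic_size = 60, 50_000

print(per_thousand(journalistic_hits, journalistic_size))
print(per_thousand(academic_hits, academic_size))
print(round(log_likelihood(journalistic_hits, journalistic_size,
                           academic_hits, academic_size), 2))
```

With these invented figures the word has fewer raw hits in the journalistic corpus and is also three times less frequent there per thousand words; the higher the log-likelihood score, the less likely it is that the difference arose by chance.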

17 Resources
• WordSmith Tools: http://www.lexically.net/wordsmith/
• MonoConc and ParaConc: http://www.athel.com/mono.html

18 Online Resources
• Tim Johns Data-driven Learning Page: http://www.eisu.bham.ac.uk/johnstf/timconc.htm
• Mike Barlow: http://www.athel.com/corpus.html
• Other resources: http://www.ul.ie/~appliedlanguages/LI4113_C&C_websites.doc

19 Online Concordancing
• Hong Kong Virtual Language Centre: http://www.edict.com.hk/concordance/default.htm
• The Compleat Lexical Tutor (Lextutor): http://www.lextutor.ca/
• French Learner Language Oral Corpus (flloc): http://www.flloc.soton.ac.uk/

20 Resources: Freeware Concordancers
• ConcApp: http://www.edict.com.hk/pub/concapp/
• Create your own corpus - Disposable corpus
• Issues of copyright
• Issue of reliability

21 Resources
• British National Corpus (corpus demo): http://info.ox.ac.uk/bnc/
• Cobuild Bank of English (wordbanks online): http://www.cobuild.collins.co.uk/
• Corpus Concordance Sampler: http://www.collins.co.uk/Corpus/CorpusSearch.aspx
• Limerick Corpus of Irish-English (L-CIE): http://www.ul.ie/~lcie/

