 What is the BNC?  What is Xaira?  How to use the BNC for: › Language teaching and learning › Research.

Presentation transcript:


 A 100-million-word collection of samples of British English from a wide range of sources (10% spoken, 90% written texts).  Available under licence; the latest edition is the BNC XML Edition (2007).

 Reference Book Publishing  Natural language processing  Language Teaching and Learning › Materials design › Classroom reference › Independent learning  Linguistic Research › BNC as source of real language use › BNC as benchmark

 XML Aware Indexing and Retrieval Architecture  A text searching tool  Usable with any XML corpus  Provided free with the BNC XML Edition

Word – the different word forms in the corpus
Phrase – a multi-word phrase or a single word form
Addkey – words with additional keys such as POS codes
Pattern – word patterns
XML – specific XML start- or end-tags
Query builder – a complex query
CQL – commands in CQL, the language Xaira uses to represent its queries internally

 How Xaira looks  How the solutions are displayed

 Manage the windows on the screen

 Solutions  No solutions  Too many solutions dialogue box

 Page mode/ Line mode  Plain text / XML text  Scope of context  Reference (status bar)

 ‘It's an interesting idea. ‘ It ‘s an interesting idea.

Case studies  1: She’ll (turn/ go) mad!!  2: Men are handsome/ women are beautiful › Language teaching and learning  Materials design, classroom reference, independent learning  3: Words in my corpus vs. ‘standard’ use › Research

 Task: › comparing use of “Turn” and “Go” › Turn + adj. vs. Go + adj.  Language point: Semantic prosody  Xaira functions: › Open the BNC › New query – query builder (word query + Addkey) › Sort

Query = Turn AND Go + Adjective

Link type: - Next - Not next - one-way - two-way

 Go + adj. (a-z)  Turn + adj. (a-z) 2 keys: 1. Examples of ‘go’, then ‘turn’ 2. Adj. (a-z)
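The two-key sort described on this slide can be sketched outside Xaira. The following is a minimal Python illustration, not Xaira's own sorting routine: the list `hits`, the mapping `VERB_ORDER` and the function `sort_hits` are hypothetical names, and the verb and adjective pairs are invented sample data rather than BNC results.

```python
# Invented sample data: (verb, adjective) pairs of the kind a
# "go/turn + adjective" query might return. Not real BNC output.
hits = [
    ("turn", "red"),
    ("go", "mad"),
    ("turn", "cold"),
    ("go", "bald"),
    ("go", "wrong"),
    ("turn", "nasty"),
]

# Key 1: examples of 'go' first, then 'turn'.
VERB_ORDER = {"go": 0, "turn": 1}


def sort_hits(pairs):
    """Sort by verb (go before turn), then by adjective from a to z."""
    return sorted(pairs, key=lambda p: (VERB_ORDER[p[0]], p[1].lower()))


for verb, adjective in sort_hits(hits):
    print(verb, adjective)
```

Sorting on a tuple key keeps all 'go' examples together before the 'turn' examples, with the adjectives in alphabetical order inside each group, which makes the typical adjectives for each verb easy to compare.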


 Task: › Comparing the frequencies of collocates  Men vs. Handsome/ Beautiful  Women vs. Beautiful/ Handsome  Language point: collocations  Xaira functions: › Word query › Collocation Source: The BNC Handbook (1998)


 Men with Handsome 15  Men with Beautiful 8  Women with Beautiful 83  Women with Handsome 2
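Counts like these come from tallying how often a collocate appears near a node word. As a rough sketch of the idea, assuming a simple window of a fixed number of tokens on either side of the node (Xaira's own collocation settings may differ), the hypothetical function `collocate_counts` below does this in Python for an invented two-sentence sample, not for the BNC.

```python
import re
from collections import Counter


def collocate_counts(text, node, collocates, window=2):
    """Count how often each collocate occurs within `window` tokens of `node`."""
    tokens = re.findall(r"[a-z]+", text.lower())
    targets = set(collocates)
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != node:
            continue
        start = max(0, i - window)
        # Tokens to the left and right of the node, excluding the node itself.
        span = tokens[start:i] + tokens[i + 1:i + 1 + window]
        counts.update(t for t in span if t in targets)
    return counts


# Invented sample text, for illustration only.
sample = (
    "The handsome men met the beautiful women. "
    "Those women are beautiful and those men are handsome."
)

for node in ("men", "women"):
    print(node, dict(collocate_counts(sample, node, ["handsome", "beautiful"])))
```

Run on a real corpus, the same tally for each node word gives the four-way comparison shown on the slide.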

 Data: Bumrungrad and Vitallife websites
“Linguistic keywords reflect the content of a particular text (Scott, 1997; 2000) through their high frequency”
 Task/research purpose
› Identifying keywords to see which words are used particularly frequently on the websites
› Comparing words in a website against ‘standard’ use
Source: Watson Todd, R.

Process | Product
Collect texts of the websites | A corpus
Conduct word frequency | Absolute frequencies
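The second step, going from a corpus to absolute frequencies, can be sketched in a few lines of Python. This is only an illustration under simple assumptions: words are lowercased and split on anything that is not a letter or an internal apostrophe, the function name `word_frequencies` is hypothetical, and the two strings in `pages` are invented stand-ins for the collected website texts.

```python
import re
from collections import Counter


def word_frequencies(texts):
    """Return absolute word frequencies over a list of text strings."""
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower()))
    return counts


# Invented stand-ins for the website texts.
pages = [
    "About us: the center and the hospital.",
    "The wellness center and its programs.",
]

freqs = word_frequencies(pages)
corpus_size = sum(freqs.values())  # total number of running words

for word, count in freqs.most_common(3):
    print(word, count)
```

The corpus size is kept alongside the counts because absolute frequencies from corpora of different sizes cannot be compared directly.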

BIH and Vitallife | BIH only | Vitallife only
and | center | and
center | and | Vitallife
about | a | to
the | us | about
us | the | management

Table: frequencies and corpus size for each word (and, centre, about, the, …) in the BNC and on the websites, with entries marked Known or Unknown.

BIH and VL | BIH only | VL only
Vitallife | center | Vitallife
Bumrungrad | | wellness
center | overview | programs
wellness | hospital | nutraceuticals
us | international | medicine

 Words relating to the hospital itself or its location  Words associated with websites  Words relating to medical priorities  Words relating to promotional priorities  Words relating to non-traditional interpretations of health

Keyword | Freq. | LL | Source | Example
Words relating to the hospital itself or its location
Vitallife | | | Vitallife | At Vitallife we understand
Bumrungrad | | | Bumrungrad | Bumrungrad serves over a million patients
hospital | | | Bumrungrad | the largest private hospital
international | | | Bumrungrad | Bumrungrad International is a complete medical campus
Thailand | | | Bumrungrad | best quality service in Thailand
Bangkok | | | Bumrungrad | located in the heart of Bangkok
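The LL column is presumably a log-likelihood keyness score, the statistic commonly used to compare a word's frequency in a study corpus against a reference corpus such as the BNC; the slides do not say how it was calculated. The sketch below assumes the widely used two-corpus formulation, in which observed frequencies are compared with the frequencies expected if the word were equally common in both corpora. The function name `log_likelihood` and all figures in the example are illustrative, not values from the study.

```python
import math


def log_likelihood(freq_study, size_study, freq_ref, size_ref):
    """Log-likelihood keyness of one word, study corpus vs. reference corpus."""
    total_size = size_study + size_ref
    total_freq = freq_study + freq_ref
    # Expected frequencies if the word were equally frequent in both corpora.
    expected_study = size_study * total_freq / total_size
    expected_ref = size_ref * total_freq / total_size
    score = 0.0
    if freq_study > 0:
        score += freq_study * math.log(freq_study / expected_study)
    if freq_ref > 0:
        score += freq_ref * math.log(freq_ref / expected_ref)
    return 2 * score


# Invented figures: a word occurring 50 times in a 1,000-word study corpus
# and 100 times in a 10,000-word reference corpus.
print(round(log_likelihood(50, 1000, 100, 10000), 2))
```

A score of zero means the word has the same relative frequency in both corpora; the larger the score, the stronger the evidence that the word's frequency differs between them, so the direction of the difference has to be read from the relative frequencies themselves.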

 Collocation › Definition of the word › Phrase › Semantic prosody  Contrastive studies › Geographical varieties and languages › Categories of users  Language teaching and learning › Word meaning › Grammatical structures Source: The BNC Handbook (1998)

 Aston, G. and Burnard, L. (1998), The BNC Handbook: Exploring the British National Corpus with SARA. Edinburgh: Edinburgh University Press.
 Oxford University Computing Services (All About Xaira)
 Reference guide for the BNC (XML Edition)
 The British National Corpus, version 3 (BNC XML Edition). Distributed by Oxford University Computing Services on behalf of the BNC Consortium. URL: