What is the BNC? What is Xaira? How to use the BNC for: › Language teaching and learning › Research
A 100 million word collection of samples of British English from a wide range of sources (10% spoken, 90% written texts). Available under licence; latest edition is BNC XML edition (2007)
Reference Book Publishing Natural language processing Language Teaching and Learning › Materials design › Classroom reference › Independent learning Linguistic Research › BNC as source of real language use › BNC as benchmark
XML Aware Indexing and Retrieval Architecture
› A text searching tool
› Usable with any XML corpus
› Provided free with the BNC XML Edition
› Word – the different word forms in the corpus
› Phrase – the multi-word phrase or single word form
› Addkey – words with additional keys such as POS codes
› Pattern – word patterns
› XML – specific XML start- or end-tags
› Query builder – a complex query
› CQL – commands in CQL, the language Xaira uses to represent its queries internally
How Xaira looks How the solutions are displayed
Manage the windows on the screen
Solutions No solutions Too many solutions dialogue box
Page mode/ Line mode Plain text / XML text Scope of context Reference (status bar)
‘It's an interesting idea. ‘ It ‘s an interesting idea.
Case studies 1: She’ll (turn/ go) mad!! 2: Men are handsome/ women are beautiful › Language teaching and learning Materials design, classroom reference, independent learning 3: Words in my corpus vs. ‘standard’ use › Research
Task: › comparing use of “Turn” and “Go” › Turn + adj. vs. Go + adj. Language point: Semantic prosody Xaira functions: › Open the BNC › New query – query builder (word query + Addkey) › Sort
Query = Turn AND Go + Adjective
Link type: - Next - Not next - one-way - two-way
Go + adj. (a-z) Turn + adj. (a-z) 2 keys: 1. Examples of ‘go’, then ‘turn’ 2. Adj. (a-z)
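The query and two-key sort above can be sketched outside Xaira as follows. This is a minimal illustration, not Xaira's own implementation: it assumes the text is already available as (word, POS tag) pairs using the BNC's CLAWS5 adjective tags (AJ0, AJC, AJS). The names `verb_adj_pairs` and `sort_two_keys` and the hand-tagged sample are hypothetical.

```python
# CLAWS5 adjective tags used in the BNC: general, comparative, superlative.
ADJ_TAGS = {"AJ0", "AJC", "AJS"}

# Inflected forms mapped to the verb they belong to.
VERB_FORMS = {
    "go": "go", "goes": "go", "going": "go", "went": "go", "gone": "go",
    "turn": "turn", "turns": "turn", "turning": "turn", "turned": "turn",
}

def verb_adj_pairs(tagged_tokens):
    """Return (verb, adjective) pairs where the adjective is the next token."""
    pairs = []
    for (word, _), (next_word, next_tag) in zip(tagged_tokens, tagged_tokens[1:]):
        verb = VERB_FORMS.get(word.lower())
        if verb is not None and next_tag in ADJ_TAGS:
            pairs.append((verb, next_word.lower()))
    return pairs

def sort_two_keys(pairs):
    """Sort by verb first ('go' before 'turn'), then by adjective a-z."""
    return sorted(pairs, key=lambda pair: (pair[0], pair[1]))

# Hypothetical hand-tagged sample, not BNC data.
sample = [
    ("She", "PNP"), ("turned", "VVD"), ("pale", "AJ0"), (".", "PUN"),
    ("He", "PNP"), ("went", "VVD"), ("mad", "AJ0"), (".", "PUN"),
    ("It", "PNP"), ("goes", "VVZ"), ("bad", "AJ0"), (".", "PUN"),
    ("They", "PNP"), ("went", "VVD"), ("home", "AV0"), (".", "PUN"),
]

result = sort_two_keys(verb_adj_pairs(sample))
for verb, adjective in result:
    print(verb, adjective)
```

Grouping by verb first and adjective second mirrors the two sort keys on the slide, so all "go + adj." solutions are listed before the "turn + adj." ones.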
Task: › Comparing the frequencies of collocates Men vs. Handsome/ Beautiful Women vs. Beautiful/ Handsome Language point: collocations Xaira functions: › Word query › Collocation Source: The BNC Handbook (1998)
Men with Handsome: 15
Men with Beautiful: 8
Women with Beautiful: 83
Women with Handsome: 2
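Collocate counts of this kind can be approximated with a simple window count. The sketch below is an illustration only: the function name `collocate_counts`, the default span of 3 tokens, and the toy sentence are assumptions, and Xaira's own collocation settings may differ. The second part only reuses the four counts shown on the slide.

```python
from collections import Counter

def collocate_counts(tokens, node, collocates, span=3):
    """Count how often each collocate occurs within `span` tokens of `node`."""
    tokens = [t.lower() for t in tokens]
    counts = Counter({c: 0 for c in collocates})
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        # Tokens to the left and right of the node word, excluding the node itself.
        window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
        for c in collocates:
            counts[c] += window.count(c)
    return counts

# Hypothetical toy sentence, not BNC data.
toy = "the handsome men greeted the beautiful women at the door".split()
men_near = collocate_counts(toy, "men", ["handsome", "beautiful"], span=1)
women_near = collocate_counts(toy, "women", ["handsome", "beautiful"], span=1)

# Counts taken from the slide: share of 'handsome' among the two collocates.
slide_counts = {
    "men": {"handsome": 15, "beautiful": 8},
    "women": {"handsome": 2, "beautiful": 83},
}
handsome_share = {
    node: counts["handsome"] / sum(counts.values())
    for node, counts in slide_counts.items()
}
```

Expressing the counts as shares makes the contrast easier to compare: "handsome" accounts for most of the two collocates with "men" but only a small fraction with "women".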
Data: Bumrungrad and Vitallife websites “Linguistic keywords reflect the content of a particular text (Scott, 1997; 2000) through their high frequency” Task/ research purpose › Identifying keywords to see which words are used particularly frequently on the websites › Comparing words in a website against ‘standard’ use Source: Watson Todd, R.
Process | Product
Collect texts of the websites | A corpus
Conduct word frequency | Absolute frequencies
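The word-frequency step can be sketched as below. This is a minimal illustration under stated assumptions: the tokenisation rule (lower-cased alphabetic words, with an optional apostrophe part) and the sample text are hypothetical and are not the procedure used in the study.

```python
import re
from collections import Counter

def word_frequencies(text):
    """Lower-case the text, split it into word tokens, and count each token."""
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return Counter(tokens)

# Hypothetical snippet standing in for collected website text.
page_text = "About us. The center and the hospital. About the center."
freqs = word_frequencies(page_text)
corpus_size = sum(freqs.values())  # total number of tokens in the mini-corpus

for word, count in freqs.most_common(3):
    print(word, count)
```

The counts are absolute frequencies, so the corpus size is kept alongside them for any later comparison with a corpus of a different size.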
BIH and Vitallife | BIH only | Vitallife only
and | center | and
center | and | Vitallife
about | a | to
the | us | about
us | the | management
Words | Frequencies | Corpus size | Frequencies | Corpus size
and | Known | Unknown | Known
centre | .
about | .
the | .
… | .
BNC | Websites
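Comparing a word's frequency on the websites against the BNC needs both the frequencies and the two corpus sizes. One common way to do this is the log-likelihood keyness statistic. The sketch below assumes that statistic is the one intended here; the function name `log_likelihood` and all figures in the usage lines are hypothetical, not values from the study.

```python
import math

def log_likelihood(freq_study, size_study, freq_ref, size_ref):
    """Log-likelihood keyness of a word in a study corpus vs. a reference corpus."""
    total = freq_study + freq_ref
    # Expected frequencies if the word were equally common in both corpora.
    expected_study = size_study * total / (size_study + size_ref)
    expected_ref = size_ref * total / (size_study + size_ref)
    ll = 0.0
    # A zero observed frequency contributes nothing (0 * ln 0 is taken as 0).
    if freq_study > 0:
        ll += freq_study * math.log(freq_study / expected_study)
    if freq_ref > 0:
        ll += freq_ref * math.log(freq_ref / expected_ref)
    return 2 * ll

# Hypothetical figures: same relative frequency in both corpora.
same_rate = log_likelihood(10, 1_000, 100, 10_000)
# Hypothetical figures: the word is relatively more frequent in the study corpus.
overused = log_likelihood(50, 1_000, 100, 10_000)
```

A value near zero means the word is about as frequent as in the reference corpus; larger values indicate a bigger difference, and values above 3.84 are conventionally treated as significant at p < 0.05.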
BIH and VL | BIH only | VL only
Vitallife | center | Vitallife
Bumrungrad | | wellness
center | overview | programs
wellness | hospital | nutraceuticals
us | international | medicine
Words relating to the hospital itself or its location Words associated with websites Words relating to medical priorities Words relating to promotional priorities Words relating to non-traditional interpretations of health
Keyword | Freq. | LL | Source | Example
Words relating to the hospital itself or its location
Vitallife | | | Vitallife | At Vitallife we understand
Bumrungrad | | | Bumrungrad | Bumrungrad serves over a million patients
hospital | | | Bumrungrad | the largest private hospital
international | | | Bumrungrad | Bumrungrad International is a complete medical campus
Thailand | | | Bumrungrad | best quality service in Thailand
Bangkok | | | Bumrungrad | located in the heart of Bangkok
Collocation › Definition of the word › Phrase › Semantic prosody Contrastive studies › Geographical varieties and languages › Categories of users Language teaching and learning › Word meaning › Grammatical structures Source: The BNC Handbook (1998)
Aston, G. and Burnard, L. (1998). The BNC Handbook: Exploring the British National Corpus with SARA. Edinburgh: Edinburgh University Press.
Oxford University Computing Services (All About Xaira)
Reference Guide for the BNC (XML Edition)
The British National Corpus, version 3 (BNC XML Edition). Distributed by Oxford University Computing Services on behalf of the BNC Consortium. URL: