Botox, themself and slugs: Corpus Methods in Many Places
Adam Kilgarriff, Lexical Computing Ltd.


Corpus: a collection of texts
How people talk about X
Indirect evidence of
▫How people think about X

Case study 1: botox
Trade mark infringement
▫Allergan Inc own the mark botox
▫Competitor Klein Becker used it in advertising
▫Allergan sued
▫Klein Becker counter-case
  ▪Botox is a generic word so cannot be a mark

Well-known past cases
In English-speaking world
▫hoover
▫kleenex
▫aspirin
▫biro
Francophonie:
▫bic

Linguistic question
When is a word generic and when is it name-like?
Linguistic ‘expert witness’

Lots of evidence
This corpus: 876 hits
▫UKWaC: web-crawled, 1.5 billion words, 2006
  ▪Formal
  ▪Informal
  ▪News
  ▪Blogs
  ▪Jokes
  ▪Advertising

Linguistic features
Prototypical for brands
▫Pre-modifier
  ▪Adidas trainers
▫Limited integration into linguistic system

Capitalisation

Native speakers
▫If they think of it as a name, they capitalise
▫Otherwise not, unless
  ▪beginning of sentence
  ▪heading
Professional writers
▫Capitalise if it is a trademark
  ▪or risk being sued

Capitalisation
But what is the norm?
▫all nouns in UKWaC (freq>50)
▫how often capitalised?
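The question on this slide can be sketched as a simple count. The following is a minimal illustration, not the procedure actually used on UKWaC: the function name `capitalisation_rates` and the toy sentences are hypothetical, it counts all alphabetic tokens rather than nouns only (no part-of-speech tagging), and it assumes sentence-initial tokens should be skipped because their capital letter says nothing about name status.

```python
from collections import Counter


def capitalisation_rates(sentences, min_freq=50):
    """Return {word: share of non-initial occurrences written with a capital}.

    `sentences` is an iterable of token lists. Words seen fewer than
    `min_freq` times in non-initial position are left out.
    """
    total = Counter()
    capped = Counter()
    for tokens in sentences:
        # Skip the first token: a sentence-initial capital is uninformative.
        for token in tokens[1:]:
            if not token.isalpha():
                continue
            key = token.lower()
            total[key] += 1
            if token[0].isupper():
                capped[key] += 1
    return {word: capped[word] / n for word, n in total.items() if n >= min_freq}


# Toy input, invented for illustration only.
toy = [
    ["She", "had", "Botox", "yesterday"],
    ["Botox", "is", "popular"],
    ["He", "wants", "botox", "and", "a", "ferry"],
]
rates = capitalisation_rates(toy, min_freq=1)
print(rates["botox"], rates["ferry"])
```

On a real corpus the threshold would stay at the slide's freq>50, and a tagger would be needed to restrict the count to nouns.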

English nouns: % capitalized

Capitalisation
U-shaped distribution
▫Most types
  ▪Names
  ▪At or near 100% capitalised
▫Most tokens
  ▪Common nouns
  ▪Under 50%
▫50-95%: not many items
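The types-versus-tokens contrast above can be made concrete by sorting words into the three bands the slide names and counting both distinct words (types) and total occurrences (tokens) in each. This is only a sketch: `band_summary` is a hypothetical helper, the treatment of the exact 50% and 95% boundaries is an assumption, and the words and figures in the example are invented placeholders, not corpus results.

```python
def band_summary(rates, freqs):
    """Count types and tokens per capitalisation band.

    `rates` maps word -> share capitalised (0.0 to 1.0);
    `freqs` maps word -> corpus frequency.
    Returns {band: (number_of_types, number_of_tokens)}.
    """
    bands = {"under 50%": [0, 0], "50-95%": [0, 0], "over 95%": [0, 0]}
    for word, rate in rates.items():
        if rate < 0.5:
            band = "under 50%"
        elif rate <= 0.95:
            band = "50-95%"
        else:
            band = "over 95%"
        bands[band][0] += 1            # one more type in this band
        bands[band][1] += freqs[word]  # all of its tokens
    return {band: tuple(counts) for band, counts in bands.items()}


# Placeholder data, invented for illustration only.
toy_rates = {"common_a": 0.02, "common_b": 0.05, "name_a": 0.99,
             "name_b": 0.97, "middle_a": 0.70}
toy_freqs = {"common_a": 9000, "common_b": 4000, "name_a": 300,
             "name_b": 500, "middle_a": 800}
summary = band_summary(toy_rates, toy_freqs)
print(summary)
```

A U-shape shows up when the two outer bands hold most of the items and the middle band holds few.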

Columns: Word, Frequency, % caps
Trademark words
▫Colgate
▫Vauxhall
▫Asda
▫Budweiser
▫Boeing
▫Levis
▫Sanyo
▫Ikea
▫Nivea
▫Odeon
Contended
▫Sabatier
▫Botox
Generic words
▫Ribbon
▫Debacle
▫Itch
▫Macaque
▫Peephole
▫Reunion
▫Schoolboy
▫Fossil
▫Housewife
▫Ferry


Case study 2: Bible Translation
Themself
▫BNC
▫UKWaC: 389, 0.1/m
▫08: /m
▫12: 11, /m

Case study 3: Species
Which species deserve preserving?
▫Rhino vs. roach

cat + dog 0.76
▫feed, own, love, want, give, allow, adopt, keep, leave, bring, see, train, have, get, find, help
horse + dog, cat
▫feed, own, love, help, give, breed, keep, leave, bring, see, train, allow, put, get, name
donkey + pony
▫saddle, pet, ride, groom, hitch, domesticate, tether
lion + wolf
▫shoot, feed, hunt, resemble, breed, spot, tame, kill, encounter, chase
roach + snail
▫repel, squash
crocodile + alligator
▫wrestle, stuff, spot, embroider, tame
goose + pigeon
▫feed, breed, pluck, deter, stuff, slaughter, roast, chase
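Lists like these can be approximated by comparing the verbs that take each animal noun as their object. The sketch below is an assumption-laden illustration: `shared_verbs` is a hypothetical helper, the verb sets are partly invented, and the Jaccard overlap it returns is a simple stand-in, not the measure behind the 0.76 score shown for cat + dog.

```python
def shared_verbs(verbs_by_noun, a, b):
    """Return (sorted shared verbs, Jaccard overlap) for nouns a and b.

    `verbs_by_noun` maps a noun to the set of verbs observed with it
    as object.
    """
    verbs_a = verbs_by_noun[a]
    verbs_b = verbs_by_noun[b]
    common = verbs_a & verbs_b
    union = verbs_a | verbs_b
    score = len(common) / len(union) if union else 0.0
    return sorted(common), score


# Toy verb sets: "stroke" and "walk" are invented for illustration.
toy_verbs = {
    "cat": {"feed", "own", "love", "adopt", "stroke"},
    "dog": {"feed", "own", "love", "adopt", "walk"},
}
common, score = shared_verbs(toy_verbs, "cat", "dog")
print(common, round(score, 2))
```

The shared verbs hint at how people treat each animal, which is the indirect evidence the species case study relies on.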