Crowdsourcing Ling 240

What is crowdsourcing?

Crowdsourcing, defined: "the practice of obtaining information or services by soliciting input from a large number of [non-expert] people, typically via the internet" (OED)

Examples:
- Wikipedia
- Google Translate
- FamilySearch Indexing

COCA's registers based on publication type

Crowdsourcing
- What are the benefits of collecting data through crowdsourcing?
- What are the limitations/weaknesses?
- What can be done to ensure that crowdsourcing workers are giving quality data?

Crowdsourcing in linguistics
- Wilhelm Kaeding (1897): thousands of non-experts helped compile and analyze an 11-million-word corpus of German
- Oxford English Dictionary (1858–1928): hundreds of non-expert readers submitted 6 million quotation slips
- Perceptual dialectology: dialect perceptions elicited from non-experts

Mechanical Turk (Amazon)
Strengths:
- Inexpensive
- Fast
- Quality control
- Access to thousands of people
A growing body of research strongly supports the quality of MTurk data (e.g., Buhrmester et al., 2011; Kittur et al., 2008; Suri & Watts, 2011; Urbano et al., 2010)

Case study: register classification of web documents

Register classification
- Traditional 'user'-based approach: an 'expert' classifies texts into registers by simply sampling from the publication type of interest
- Limitations:
  - 'Publication type' is not a meaningful criterion for web documents
  - Experts can't agree on register categories for internet texts

Corpus
- Extracted from the Corpus of Global Web-based English (GloWbE), constructed by Mark Davies
- (Near-)random sampling methods used to build the corpus:
  - Google searches of highly frequent English 3-grams (e.g., is not the, and from the) used to identify URL links for each n-gram (i.e., Google results pages)
  - Davies randomly extracted c. 49,300 URLs from GloWbE
- Only web pages from the USA, UK, Canada, Australia, and New Zealand
- Documents < 75 words were excluded
- Non-textual material was removed from all web pages (HTML scrubbing and boilerplate removal) using JusText
- 1,445 URLs were excluded from subsequent analysis because they consisted mostly of photos or graphics
- Final corpus for the study: 48,555 web documents
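To make the cleanup step concrete, here is a minimal Python sketch of that pipeline, assuming the URL list is already in hand. JusText is the library named above; the fetching code, function name, and placeholder URL are illustrative, not the study's actual scripts.

```python
# Minimal sketch of the cleanup step described above: strip HTML markup
# and boilerplate with JusText, and keep a document only if it is at
# least 75 words long. URL handling and function names are illustrative.
import requests
import justext

MIN_WORDS = 75  # threshold from the study: documents < 75 words excluded

def clean_page(url):
    """Return boilerplate-free text for a web page, or None if too short."""
    html = requests.get(url, timeout=10).content
    paragraphs = justext.justext(html, justext.get_stoplist("English"))
    text = "\n".join(p.text for p in paragraphs if not p.is_boilerplate)
    return text if len(text.split()) >= MIN_WORDS else None

urls = ["https://example.com/some-page"]  # stand-in for the c. 49,300 URLs
corpus = {url: text for url in urls
          if (text := clean_page(url)) is not None}
```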

Raters were asked to determine the mode of each passage, then its participants, purpose, etc. This process led to 7 sub-registers.

Crowdsourcing end-user data: classification
- Developed a computer-adaptive survey for register classification
- Tested the tool through 10 rounds of piloting, resulting in numerous revisions
- Recruited 908 raters through Mechanical Turk
- 6 responses x 4 raters x 49,300 texts ≈ 1.2 million individual ratings

Agreement results for the general register classification of 48,147 web documents (Fleiss' kappa = .47, moderate agreement)
- 69% of documents achieved majority agreement
- An additional 11.8% are potential 2-way hybrids
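As a point of reference, Fleiss' kappa can be recomputed from a documents-by-categories table of rater counts. A toy sketch using statsmodels; the numbers here are invented, and only the .47 figure above comes from the study.

```python
# Toy illustration of the agreement statistic reported above. Rows are
# documents, columns are register categories; each cell counts how many
# of the four raters chose that category. The numbers are invented.
import numpy as np
from statsmodels.stats.inter_rater import fleiss_kappa

ratings = np.array([
    [4, 0, 0],  # all four raters agree
    [3, 1, 0],  # majority agreement
    [2, 2, 0],  # a 2-2 split: a potential two-way hybrid
])
print(fleiss_kappa(ratings))  # the study reports kappa = .47 on the real data
```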

Frequencies of general register categories (i.e., documents where 3 or 4 raters were in agreement)

Systematic patterns of disagreement
- 28 different 2-2 combinations are possible in theory (with 8 general categories, 8 choose 2 = 28 unordered pairs)
- But only 7 of those combinations occurred more than 100 times in our corpus of c. 48,000 documents
- Because these are widely attested user-based patterns, we can interpret this disagreement as a special pattern of agreement

Frequencies of 2-way hybrids that occur 100+ times

Multi-dimensional analysis
- Factor analysis to identify dimensions, based on co-occurrence among a large set of linguistic features
- Interpret the dimensions functionally
- Calculate scores for each text on each dimension
(A code sketch follows the feature list below.)

Features adopted from Biber:

Positive features:
- Verbs: present tense verbs, mental verbs, do as pro-verb, be as main verb, possibility modals
- Pronouns: 1st person pronouns, 2nd person pronouns, it, demonstrative pronouns, indefinite pronouns
- Adverbs: general emphatics, hedges, amplifiers
- Dependent clauses: that complement clauses (with that-deletion), causative adverbial clauses, WH clauses
- Other: contractions, analytic negation, discourse particles, sentence relatives, WH questions, clause coordination

Negative features:
- Nouns, long words, prepositional phrases, attributive adjectives, lexical diversity
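A minimal sketch of the multi-dimensional analysis steps, assuming a texts-by-features matrix of feature counts. sklearn's FactorAnalysis stands in for the full Biber-style procedure (rates normalized per 1,000 words, promax rotation, functional interpretation), and the matrix here is random placeholder data.

```python
# Minimal sketch of the multi-dimensional analysis above: factor-analyze
# a texts x features matrix, then score every text on each dimension.
# A stand-in for the full Biber procedure, with random placeholder data.
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
counts = rng.poisson(5.0, size=(1000, 25))  # placeholder: 1,000 texts x 25 features
X = StandardScaler().fit_transform(counts)  # standardize each feature

fa = FactorAnalysis(n_components=4, random_state=0)
scores = fa.fit_transform(X)       # one score per text on each dimension
loadings = fa.components_          # feature loadings: inspect these to
                                   # interpret each dimension functionally
```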

The results: linguistic (use-based) variation across the user-based register categories

Web registers along Dimension 1

What have we learned?
- Non-expert users can reliably classify web documents
- At least 1 in 10 internet texts belongs to a hybrid register category
- Publication type ≠ register (at least for the web); e.g., blogs showed up in several register categories
- Triangulating end-user classifications with linguistic analysis gives us a more complete understanding of register variation on the web

Web register research: next steps
- Comprehensive linguistic description of the patterns of register variation on the web
  - A new multi-dimensional analysis of web registers
  - Detailed linguistic descriptions of 'unique' web registers
- Automatic prediction of register (see the sketch below)
- Automatically coded large corpus of web documents
- Extend descriptions to include 'private' web registers
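A hypothetical sketch of what automatic register prediction could look like: a supervised classifier trained on the crowd-labeled documents, then applied to unseen pages. The slide names no model; the pipeline, toy documents, and label set below are invented for illustration (the labels echo register categories from the study).

```python
# Hypothetical sketch of automatic register prediction: train a text
# classifier on crowd-labeled documents, then tag unseen web pages.
# The toy documents and labels below are placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

docs = [
    "Once upon a time the storm hit the village...",    # placeholder texts
    "In my opinion the council is wrong about this.",
    "First, preheat the oven. Next, mix the flour.",
    "The committee voted last night after the debate.",
]
labels = ["narrative", "opinion", "how-to", "narrative"]  # crowd labels

model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(docs, labels)
print(model.predict(["Whisk the eggs, then fold in the sugar."]))
```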

Areas for future user-based research
- Register classification of printed texts
- Reader/listener perceptions
- Corpus annotation
- Word sense disambiguation

5. The future of crowdsourcing in user-based linguistics
- User-based analyses have always happened; now we can do them in a more valid way using crowdsourcing
- Triangulating use-based linguistic data offers a more complete understanding of discourse
- Linguists are often unable to fully analyze and interpret patterns in use-based datasets, particularly those that are very large
- Harnessing the power of user-based data via crowdsourcing could help us tackle big, difficult problems in linguistics

Mechanical Turk
The name comes from an 18th-century machine that appeared to play chess; a person actually hid inside and played.

Mechanical Turk
- Amazon's Mechanical Turk is a crowdsourcing tool
- Researchers who need human evaluation can get data
- People who want to make some money help with the project (often for less than minimum wage)
- Typical tasks:
  - Image recognition
  - Speech processing
  - Subjective evaluation
  - Giving opinions
  - Tagging corpora
  - Matching pictures with products

Mechanical Turk example: word sense disambiguation in corpora
- What should head be tagged as: noun or verb?
- What does head mean in a sentence?
  - "They charged the head of finances with the crime." (person with office)
  - "The beer was flat with no head." (froth)
  - "They were going head first." (manner of movement)
- Computers can't do this well, but people can
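When several workers tag the same token, their judgments are typically aggregated; a common baseline is majority vote. A small sketch with invented labels for the head examples above.

```python
# Sketch of aggregating redundant worker judgments by majority vote,
# a common baseline for MTurk sense-tagging tasks. Labels are invented.
from collections import Counter

judgments = {
    "charged the head of finances": ["person", "person", "person", "froth"],
    "beer was flat with no head":   ["froth", "froth", "person", "froth"],
}

for item, labels in judgments.items():
    sense, votes = Counter(labels).most_common(1)[0]
    print(f"{item} -> {sense} ({votes}/{len(labels)} raters)")
```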

How does it work?
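In practice, a requester defines a task (a "HIT"), sets a price and a number of assignments, and workers complete it through the MTurk site or API. A hedged boto3 sketch follows; every parameter value is invented for illustration, and the question form XML is assumed to live in a local file.

```python
# Hedged sketch of posting a task (a "HIT") through the MTurk API via
# boto3. All parameter values are invented for illustration.
import boto3

mturk = boto3.client("mturk", region_name="us-east-1")

response = mturk.create_hit(
    Title="Classify the register of a web document",
    Description="Answer a short series of questions about one web page.",
    Reward="0.05",                         # USD per assignment
    MaxAssignments=4,                      # e.g., four raters per document
    LifetimeInSeconds=7 * 24 * 3600,       # HIT stays listed for a week
    AssignmentDurationInSeconds=600,       # 10 minutes per assignment
    Question=open("question.xml").read(),  # QuestionForm/ExternalQuestion XML
    QualificationRequirements=[{
        # Documented system qualification for worker approval rate:
        # only workers approved on >= 95% of past assignments may accept.
        "QualificationTypeId": "000000000000000000L0",
        "Comparator": "GreaterThanOrEqualTo",
        "IntegerValues": [95],
    }],
)
print(response["HIT"]["HITId"])
```

The QualificationRequirements block is how requesters enforce the approval-rate screening discussed on the next slide.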

Couldn't people cheat?
- After reviewing results, the requester can reject a worker's submission
- Rejected workers don't get paid
- Workers have approval rates
- Requesters can choose to accept only workers with good approval rates (see the QualificationRequirements block in the sketch above)

Advantages
- Thousands of potential workers available
- You can get results fast
- Demographic variety (not just undergrads)
- Cheap (average $1.40 per hour)

Disadvantages
- Cheating (though some studies show it occurs at the same rates as in the lab)
- Ways to test for inattention exist, e.g., "While exercising, how often have you had a fatal heart attack?"
- It requires money
- Many types of experiments can't be run (e.g., reaction-time studies)
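Catch questions like the one above are usually enforced with a simple post-hoc filter. A tiny sketch with an invented response format:

```python
# Sketch of screening out inattentive workers via a catch question,
# such as the impossible "fatal heart attack" item quoted above.
# The field names and data are invented.
responses = [
    {"worker": "A1", "fatal_heart_attacks_while_exercising": "never"},
    {"worker": "B2", "fatal_heart_attacks_while_exercising": "often"},  # fails
]
valid = [r for r in responses
         if r["fatal_heart_attacks_while_exercising"] == "never"]
print(f"kept {len(valid)} of {len(responses)} responses")
```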

Go look at it: the Mechanical Turk website