Maxine Eskenazi Language Technologies Institute Carnegie Mellon University.

• What is the problem?
• How to ensure that crowdsourcing results are reliable
• The solutions:
  ◦ Testing the equipment
  ◦ Framing the task
  ◦ Testing the workers
  ◦ Training the workers
  ◦ Assessing the work

• Crowdsourcing is a great resource!
  ◦ You have large amounts of data to process
  ◦ It’s faster and cheaper while maintaining high quality
• But you can make it say what you want
  ◦ Example: looking for sentences that include a well-pronounced example of the word “table”:
    ▪ “Do you agree that the word ‘table’ was said in this sentence?” vs.
    ▪ “Please annotate this sentence”
• You can get results that are meaningless
• But you can get great results if you are careful!

• Testing the equipment - for those who will listen to something (to annotate, for example)
  ◦ Ask them to use a headset and then ask them to click yes if they can hear something
    ▪ Relying on worker self-assessment is nice, but not very reliable
  ◦ Play something to them and ask them to write down what they heard
    ▪ Compare what they wrote to what was actually played (you have already written this down)
    ▪ If they still can’t hear, give them feedback on how to connect the headset
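The listening check above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `normalize` and `hearing_check` and the 0.8 threshold are hypothetical, not taken from the slides, and word-level similarity from the standard library's `difflib` stands in for whatever comparison was actually used.

```python
import difflib
import string


def normalize(text: str) -> list[str]:
    """Lowercase, strip ASCII punctuation, and split into words."""
    table = str.maketrans("", "", string.punctuation)
    return text.lower().translate(table).split()


def hearing_check(typed: str, reference: str, threshold: float = 0.8) -> bool:
    """Return True if the worker's transcript is close enough to the
    transcript you prepared in advance (threshold is an assumed value)."""
    matcher = difflib.SequenceMatcher(None, normalize(typed), normalize(reference))
    return matcher.ratio() >= threshold


# A worker who heard the audio clearly should pass; an empty answer should not.
print(hearing_check("The table is red.", "the table is red"))
print(hearing_check("", "the table is red"))
```

A worker who fails could be shown the headset instructions and offered another try rather than being rejected outright.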

• Testing the equipment - for those who will record something
  ◦ Ask them to speak into the microphone, then play it back to them and ask them if they heard something
    ▪ Relying on worker self-assessment has sometimes worked in this case
  ◦ Ask them to read something from the screen and then use a speech recognizer to align what they said with what they read
    ▪ MIT has the WAMI toolkit for this, and there are others as well
  ◦ Have some other worker listen to what they said and annotate it, then compare that annotation to the text
    ▪ This may take too much time

• Framing the task - workers need to know what the task is and how to do it
  ◦ Write a description of the task and instructions on what to do
    ▪ Get others to read that description and follow your instructions - sandbox
    ▪ Revise and try out again
  ◦ Give examples and counterexamples
    ▪ Give at least two to three of each
  ◦ Become a worker and try others’ tasks yourself!
    ▪ You understand issues better when you put yourself in their shoes

 Framing the task  VERY IMPORTANT ◦ Keep the cognitive load as low as possible! Break one complex task into several tasks ◦ Example – instead of “label the words you hear as well as the non-words, parts of words and pauses”,  you would ask “label the words you hear”, then  in a separate task “label the non-words, like lipsmacks, you hear”  in a separate task “label the parts of words, like restarts, you hear”  In a separate task “label where the pauses are”

• Framing the task
  ◦ Another example
    ▪ Interspeech 2013: 25th anniversary
    ▪ Statistics on past 25 years: 18 categories
      ▪ Total number of papers
      ▪ Total number of different authors
      ▪ 2 harder-to-define categories - Total number of cohorts of authors
    ▪ 1500 attendees were quizzed
    ▪ Crowd had close to correct or right answer on the first 16, nothing close on the last 2

• Framing the task
  ◦ Workers will choose the task they want to work on for several reasons:
    ▪ How much they can make per hour
      ▪ Calculate how much you should pay them so they make at least minimum wage (based on how much time it takes to complete one task)
    ▪ How can you make the task go faster?
      ▪ Put all of one task on one page without scrolling
      ▪ No scrolling saves their time
      ▪ Example: ten sentences to annotate plus the instructions
      ▪ Let them minimize the instructions if they want
      ▪ Change font size and space between sentences to get it all on the screen at the same time
      ▪ Eliminate any other unnecessary keystrokes
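The minimum-wage calculation mentioned above is simple arithmetic: the per-task payment is the hourly wage times the fraction of an hour one task takes. A minimal sketch follows; the function name and the example figures (90 seconds per task, 725 cents per hour) are illustrative assumptions, and amounts are kept in cents to avoid rounding surprises.

```python
import math


def min_pay_per_task_cents(seconds_per_task: float, hourly_wage_cents: int) -> int:
    """Smallest per-task payment, in cents, that lets a worker who needs
    `seconds_per_task` per task earn at least `hourly_wage_cents` per hour."""
    if seconds_per_task <= 0:
        raise ValueError("seconds_per_task must be positive")
    # Round up so the worker never falls below the target wage.
    return math.ceil(hourly_wage_cents * seconds_per_task / 3600)


# Example with assumed numbers: 90 seconds per task, target of 725 cents/hour.
print(min_pay_per_task_cents(90, 725))
```

Timing the task yourself, or in a small pilot, gives a more realistic `seconds_per_task` than a guess.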

• Framing the task
  ◦ What it will be used for
    ▪ You make your task more appealing when you tell people why you want them to do this task
    ▪ Example from our work:
      ▪ “We are asking you to simplify some sentences. They are taken from everyday documents like driver license applications. This is so that we can automatically simplify everyday documents.”
  ◦ How nice it looks
    ▪ Subliminal detail that has been shown to be effective

• Testing the workers: why?
  ◦ Do not assume they are native speakers of X. Test them!
    ▪ Just because you have geolocation, that does not mean the person fluently speaks the language of that country
  ◦ Do not assume that all speakers of Y can write down what they hear. Test them!
  ◦ Not everyone is honest, and there are bots

• Testing the workers: how?
  ◦ To test for speakers of X, you could ask them to translate (type in) something from English into the target language
    ▪ Make sure that there is some word or expression that Google Translate or another machine translation tool would get wrong
    ▪ You have already translated this sentence by hand
    ▪ Compare the two texts

• Testing the workers: how?
  ◦ Give a new worker three items to do
    ▪ Say you want them to listen to a sentence and annotate it
    ▪ Give them three sentences to annotate
    ▪ Compare their annotation with the hand annotation you have already done for these sentences
  ◦ Getting good work often requires some human expert work to establish a “gold standard” ahead of time!
    ▪ So if you have lots of data, the investment is worth it, but it may not be for small datasets
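The three-item pretest can be expressed as a small comparison against the hand annotations. The sketch below is illustrative only: the function name `passes_pretest`, the item ids, and the pass mark of two out of three are assumptions, and exact matching after trimming and lowercasing is the simplest possible comparison, not necessarily the one used in practice.

```python
def passes_pretest(
    worker_answers: dict[str, str],
    gold: dict[str, str],
    min_correct: int = 2,
) -> bool:
    """Count how many gold items the worker annotated identically
    (ignoring case and surrounding whitespace) and apply a pass mark."""
    correct = sum(
        1
        for item_id, expected in gold.items()
        if worker_answers.get(item_id, "").strip().lower() == expected.strip().lower()
    )
    return correct >= min_correct


# Hypothetical pretest with three hand-annotated sentences.
gold = {"s1": "the table is red", "s2": "please sit down", "s3": "open the door"}
answers = {"s1": "The table is red", "s2": "please sit down ", "s3": "open a door"}
print(passes_pretest(answers, gold))
```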

• Training the workers - the pretesting you have done should serve as training for most tasks
  ◦ You could give more specific feedback if there is something they are doing that can be corrected
    ▪ Example: you asked for annotation that ends with a $ and one worker is not adding that $ but is annotating well. Just send that person a message to add the $. And keep the worker.

• Training the workers
  ◦ You can put up a small number of tasks to start
    ▪ Say 100 tasks (for example, 100 utterances to annotate)
  ◦ Check whether the tasks are being done correctly
    ▪ Check whether each worker is doing the work correctly
    ▪ Revise your task if all workers are not doing well
    ▪ Or notify a worker if they are not doing as well as the other workers
      ▪ They risk not being paid and may want to abandon your tasks

• Assessing the work
  ◦ There are three places where you can assess work:
    ▪ Before starting the task
      ▪ See training and testing
    ▪ While tasks are still live
      ▪ This is the best place to get rid of bots and cheaters
    ▪ After tasks are done (post-processing)

  ◦ During the task
    ▪ Compare work to the “gold standard”
      ▪ Create a dataset of, for example, human-expert-labelled items (about 10 percent of total items to be processed)
      ▪ For every ten items, put in 1 gold standard item
      ▪ Compare worker output to that item
    ▪ Compare one worker’s output to that of others (inter-worker)
      ▪ Majority wins, so have an odd number of workers for each task
    ▪ Compare one worker’s output to their own work (intra-worker)
      ▪ Give the worker the same item every 20 or 30 items and compare their performance on that item - consistency
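Two of the checks above, seeding gold standard items and taking a majority vote, can be sketched as follows. The function names are hypothetical, and placing the gold item at a fixed position (every tenth slot) is a simplification chosen for clarity; a real task would probably want to vary the position so workers cannot spot it.

```python
from collections import Counter
from itertools import cycle


def interleave_gold(items: list, gold_items: list, every: int = 10) -> list:
    """Return a task list in which one item out of every `every` is a gold
    standard item (so `every - 1` regular items, then one gold item)."""
    if not gold_items:
        raise ValueError("need at least one gold item")
    gold = cycle(gold_items)  # reuse gold items if there are too few
    batch = []
    for i, item in enumerate(items, start=1):
        batch.append(item)
        if i % (every - 1) == 0:
            batch.append(next(gold))
    return batch


def majority_label(labels: list[str]) -> str:
    """Label chosen by most workers; an odd number of workers avoids ties
    when there are only two possible labels."""
    return Counter(labels).most_common(1)[0][0]


tasks = interleave_gold([f"utt{i}" for i in range(18)], ["gold1", "gold2"])
print(len(tasks), tasks[9], tasks[19])
print(majority_label(["yes", "no", "yes"]))
```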

• Assessing the work during the task
  ◦ Another thing to watch out for is bots and cheaters
    ▪ Bots: their creators model the task
    ▪ Cheaters: get through the task as quickly as possible
    ▪ While you would pay a poor worker, you should refuse to pay a bot or someone you are sure is a cheater
  ◦ For cheaters, look at how much time it took to do each item
    ▪ Too fast? It’s a cheater
  ◦ Give a series of multiple-choice items
    ▪ If a worker answers B consistently, they are either a bot or a cheater
  ◦ Put up small groups of tasks with different names
    ▪ The tasks will be finished too quickly for a bot to be created (that is, for a model of your task to be made)
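The timing check and the "always answers B" check lend themselves to two small helpers. This is a sketch under assumptions: the function names, the use of the median, and any cutoffs you apply to the results are illustrative choices, not values given in the slides.

```python
from collections import Counter
from statistics import median


def too_fast(seconds_per_item: list[float], min_seconds: float) -> bool:
    """Flag a worker whose median time per item is below the time a careful
    person would plausibly need (`min_seconds` is something you estimate)."""
    return median(seconds_per_item) < min_seconds


def same_answer_rate(answers: list[str]) -> float:
    """Share of multiple-choice answers equal to the worker's most frequent
    choice; values near 1.0 suggest a bot or a cheater."""
    most_common_count = Counter(answers).most_common(1)[0][1]
    return most_common_count / len(answers)


print(too_fast([1.0, 1.2, 0.9], min_seconds=3.0))
print(same_answer_rate(["B"] * 9 + ["A"]))
```

Flags like these are best treated as a prompt for a manual look at the work before refusing payment, since a fast, accurate worker can trip a timing threshold.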

 Assessing the work - after the task, on all of the data at once  Gold standard  Pull out the gold standard you created and compare the work that you have collected to it  Intraworker comparison  Does a worker consistently agree with the crowd?  Ask the worker if they are confident in their answer – if they consistently say no, do not use their work  Note that consulting the workers often brings in good feedback!

 Assessing the work - after the task, on all of the data at once  Interworker comparison  In the same way that you would compare the work of one worker to the gold standard, you can compare the work of one worker to another.  Look for one worker who does not agree with all of the others (uneven numbers again)  No need for gold standard for this, so your expert might need to label less data  Assess the work of one crowd by another  Ask one crowd to do the task  Give the same task to another crowd, showing the first crowd’s work, for example:  “Please correct the following”  “Does this text match what was said?” (yes-no or change what was wrong)

• We have seen ways to ensure that what you get is high quality and makes sense
  ◦ Equipment can be tested reliably
  ◦ Instructions and all of the setup that ensures the task makes sense can be tested
  ◦ Workers can be pretested and trained
  ◦ Bots and cheaters can be eliminated
  ◦ The work can be assessed before, during or after the task is completed

• Too much information?
  ◦ These slides will be up on my website
  ◦ Google for Maxine Eskenazi Research

• Any questions from the crowd?