To Share a Task or Not: Some Ramblings from a Mad (i.e., crazy) INLGer. Kathy McCoy, CIS Department, University of Delaware

Presentation transcript:

1 To Share a Task or Not: Some Ramblings from a Mad (i.e., crazy) INLGer Kathy McCoy CIS Department University of Delaware

2 What is intended by Shared Task?
- A competition for money?
- A funded activity in itself?
- A competition just for the fun of it?
- A competition or a cooperation?
  - A cooperation would entail groups of researchers collaborating on a larger system (need agreed-upon architecture)
  - A competition would entail different groups working “against each other” on the same problem

3 What is the desired outcome?
- An advance in technology that may be applicable in lots of different places?
- An advance in NLG technology that will allow more commercialization? bigger web presence? more excitement?
- More funding for INLG research?
- More publications of INLG research?
- Bring more people into the field?
- Get some important task done (that needs INLG)?

4 What about Comparative Evaluations?
- Major problem here is that we must agree on what is to be evaluated and how.
- Must have a number of different groups working on precisely the same problem with same assumptions.
- What is the desired outcome of comparative evaluations?
- We get to name a system “winner”?
- Presumably we would learn something about the task, but it isn’t quite clear to me what that something is.

5 2 Ends of the Spectrum in Shared Task/Evaluations
1. The killer application
  - Text summarization should have been it!
  - Generates excitement in the field
  - Generates funding opportunities
2. Component pieces
  - Referring expression generation is an example
  - What will be accomplished?
  - Someone gets a gold star?

6 The kind you want depends on your ultimate goal. Both share some dangers revolving around choice of evaluation methods.

7 Dangers in Shared Task
- Exclusion
- Shared task metrics become the de facto standard for evaluating research in the field
- Doesn’t allow one to do research that doesn’t do well with the metrics (and the metrics are going to be prejudiced)
- May leave generation behind – Killer Apps may find such interesting problems that generation becomes secondary.
- Emphasis on shallow processing excluding theoretical benefits

8 Multiple or Human-Based Metrics Don’t Help

9 The Killer App Story
- The application itself must define the appropriate metric(s) – does the application work?
- Many of the things we hold near and dear have a significantly smaller influence than some other things
  - Discourse coherence
  - Complicated syntax/variation in syntax
  - Lexical choice
  - Referring expression generation

10 But…
- We KNOW these things are important!
- Problem becomes:
  - Other “more important” aspects are deemed to make more of a difference
  - By the time these issues come up, people have invested too much time into a particular kind of solution

11 Comparative Evaluations
- The nature of the shared/agreed-upon evaluation methods places a judgment on the importance of some aspects over others
- Evaluation is necessarily prejudiced with respect to which issues are stressed
- Referring expressions: distinguishing descriptions with concrete knowledge base
- What about referring expressions in news stories? Pronoun use? Conjunctions? Influence of surrounding text?