Robert's Drawers (and other variations on GRE shared tasks) Gatt, Belz, Reiter, Viethen.


1 Robert's Drawers (and other variations on GRE shared tasks) Gatt, Belz, Reiter, Viethen

2 Available resources
● TUNA Corpus (Gatt et al.; ca. 2500 refs)
  ○ one-shot references
  ○ balanced
  ○ 2500 refs to furniture or people
● Robert's drawers (Viethen and Dale; ca. 140 refs)
  ○ one-shot references
  ○ not yet balanced
● GREC (“GRE in Context”) (Belz and Varges)
  ○ 2000 introductory passages from Wikipedia
  ○ 1000 annotated, rest in progress
  ○ annotated for reference to the main subject (“topic”)
  ○ different NP types: subjects, objects, possessives
● COCONUT (Jordan)
  ○ goes beyond just identification
● (possibly another corpus of newspaper texts)

3 Short-term additions to resources
● Add comprehension data:
  ○ Carry out experiments to get people to identify referents and pair results with corpus descriptions. Data include:
    ▪ reaction time
    ▪ error rate
    ▪ self-paced reading for GREC-type corpora

4 Long-term additions to resources
● Eye-tracking data
● Situated reference in virtual environments (Koller et al., this Workshop)
● In progress: small multimodal corpus (Bangerter, van der Sluis, Gatt)

5 Task definition
● Task structure:
  ○ provide a data source
  ○ have a small set of clearly defined tasks but ALSO:
  ○ have an open category
● Evaluation:
  ○ default metric
  ○ call for proposals for evaluation metrics
  ○ correlate metrics with human judgments/performance
● Scope for variation:
  ○ Task: content determination, realisation, lexical choice
  ○ Type of reference: full definite, anaphoric, singular/plural
  ○ Goal: model production or enhance comprehension
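The "correlate metrics with human judgments/performance" item above can be illustrated with a short sketch. It assumes Pearson's r as the correlation measure, which the slides do not specify; the function name `pearson` and the idea of pairing one automatic score with one human score per system are illustrative assumptions, not part of the proposal.

```python
import math


def pearson(metric_scores, human_scores):
    """Pearson correlation between automatic metric scores and human scores.

    Assumption: both lists are aligned, i.e. position i in each list
    refers to the same system (or the same description).
    """
    n = len(metric_scores)
    if n != len(human_scores) or n < 2:
        raise ValueError("need two aligned lists with at least two values")

    mean_m = sum(metric_scores) / n
    mean_h = sum(human_scores) / n

    # Sums of products and squares of deviations from the means.
    cov = sum((m - mean_m) * (h - mean_h)
              for m, h in zip(metric_scores, human_scores))
    var_m = sum((m - mean_m) ** 2 for m in metric_scores)
    var_h = sum((h - mean_h) ** 2 for h in human_scores)

    if var_m == 0 or var_h == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_m * var_h)
```

A value near 1 would suggest that the automatic metric ranks systems much as the human measure does; a rank-based coefficient such as Spearman's could be substituted if only the ordering matters.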

6 (Sub-)communities
● GRE people (the usual suspects)
● CoNLL/EMNLP community
● Psycholinguists:
  ○ advice/expertise
  ○ computational psycholinguistic modelling

7 Aims
● “Community” aims:
  ○ Have fun!
  ○ Get people working together, consolidate the community
  ○ Broaden the community
● Broader aims:
  ○ Have a test-bed to see if NLG STECs actually work
  ○ GRE is probably the best initial candidate
● Scientific aims:
  ○ Hothouse effect
  ○ Evaluation:
    ▪ Use different methods
    ▪ Evaluate the methods

8 Execution: Logistics
● Dry run to pilot the idea
  ○ Possibly at UCNLG (September)
  ○ Shared competitive task: Content Determination
    ▪ singular definites, furniture
  ○ Production evaluation, using TUNA
  ○ Include a call for evaluation metrics
  ○ Also include open track
● Main event (larger scale & wider scope)
  ○ Co-located with INLG?
  ○ Several shared tasks + open category
  ○ Evaluation:
    ▪ Production: match between algorithm & human
    ▪ Comprehension: ease of identification, etc.
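One possible way to score the "match between algorithm & human" in a content determination task is set overlap between the attributes an algorithm selects and those a human author used. The sketch below assumes the Dice coefficient as that overlap measure; the slides leave the metric open (and call for proposals), and the function name `dice` and the attribute labels in the usage note are hypothetical.

```python
def dice(system_attrs, human_attrs):
    """Dice coefficient between two attribute sets.

    Assumption: each description is reduced to the set of attributes it
    mentions (e.g. colour, size, type); order and wording are ignored.
    Returns 1.0 for identical sets and 0.0 for disjoint sets.
    """
    system_set = set(system_attrs)
    human_set = set(human_attrs)
    if not system_set and not human_set:
        # Two empty descriptions are treated as a perfect match.
        return 1.0
    overlap = len(system_set & human_set)
    return 2 * overlap / (len(system_set) + len(human_set))
```

For instance, if an algorithm selects `{"colour", "type"}` and the human description uses `{"colour", "type", "size"}`, two attributes are shared out of five in total, so the score is 2 × 2 / 5 = 0.8.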

9 Evaluation: £££
● Sources of expense:
  ○ Human evaluations
  ○ Adding comprehension data to the corpora
  ○ Organisational costs (web site, etc.)
● Who's paying?
  ○ Community effort
  ○ Aberdeen platform grant
  ○ Brighton Prodigy project funds
  ○ No special funding (yet)

