1 Dialogue, Speech and Images: The Companions Project Data Set Yorick Wilks, David Benyon, Christopher Brewster, Pavel Ircing, and Oli Mival

Presentation transcript:

1 Dialogue, Speech and Images: The Companions Project Data Set Yorick Wilks, David Benyon, Christopher Brewster, Pavel Ircing, and Oli Mival

2 Companions Project
4-year, FP6, EU Project
14 partner sites (academic + commercial research)
Research in multimodal interfaces
 – Machine learning applied to dialogue systems
 – Emotions and embodied conversational agents (ECAs)
 – Dialogue and planning for mobile devices
 – Two prototypes/demonstrators: Senior Companion, Health and Fitness Companion

3 A Multiplicity of Companions
Two major prototypes:
 – A Health and Fitness Companion (H&FC): task driven, focussed, domain specific
 – A Senior Companion (SC): open domain, mixed initiative, building a life narrative via photos
Other Companions:
 – A mobile version of the H&FC
 – A home/cookery-focused version of the H&FC
 – An SC for the Czech language

4 The need for Dialogue Corpora
General paucity of dialogue corpora
The SC is open domain (because photos can be about anything) and aimed at the elderly, so we cannot assume that dialogue structures transfer
Key idea: use the initial prototype to generate more data
Initial data collection was more limited and based on the Wizard-of-Oz (WoZ) methodology; this is what this talk is about.

5 Specifications
Modified WoZ
 – Emphasis on naturally occurring dialogues relevant to the domain
 – People asked to reminisce about photos: initially random public-domain photos; in the proper scenario, photos of personal importance
 – Photos primarily of people and events: friends and relatives, weddings, holidays, etc.
 – We assumed the wizard knew how many people were in each photo (because we assumed image processing technology could tell the system)

6 Specifications (2)
WoZ instructed to use a standard set of questions such as:
 – What is the name of the person in the picture?
 – Where is this picture taken?
 – What is the relationship between the people?
But the interviewer is not limited to these
User is encouraged to express feelings, memories, and associations

7 Data Collection Set-up (English)
Data collected with two set-ups:
 – WoZ with avatar + TTS as the system
 – WoZ without TTS, i.e. with a human interviewer
Use of TTS (although theoretically more realistic) slowed down the experiments too much
Photos shown one at a time; participation tails off after about 20 min

8

9 Senior Companion Data Collection at Napier
September 2007:
 – 45 sessions / 30 hours
   Gender: 27 male / 13 female
   Age: sessions in homes, 38 at Napier
   With avatar: 16 sessions; without avatar: 29 sessions
 – Early sessions were not transcribed to ASR standards; later sessions used the ‘Transcriber’ tool (Barras et al., 2001)

10 Current status (English data)
TOTAL SESSIONS = 101 (approx. 70 hours)
 – In .TRS format = 42 (approx. 30 hours)
   27 sessions with full video and trs files (waiting on transcription for 16)
   15 sessions with trs files and no video
 – In simple text format = 59 (approx. 40 hours)
   55 sessions pre-Transcriber (.trs files)
   4 sessions pre-Transcriber with full video
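The subtotals on this slide can be checked with a few lines of Python. This is only an illustrative tally of the figures quoted above; the dictionary keys are made-up labels, not names used by the project.

```python
# Session counts as quoted on the slide; the keys are hypothetical labels.
counts = {
    "trs": {"video_and_trs": 27, "trs_only": 15},
    "text": {"pre_transcriber": 55, "pre_transcriber_with_video": 4},
}
# Approximate hours per format, also as quoted on the slide.
hours = {"trs": 30, "text": 40}


def totals(session_counts):
    """Return per-format subtotals and the grand total of sessions."""
    per_format = {fmt: sum(parts.values()) for fmt, parts in session_counts.items()}
    return per_format, sum(per_format.values())


per_format, total_sessions = totals(counts)
total_hours = sum(hours.values())
print(per_format, total_sessions, total_hours)
```

The subtotals (42 and 59) and the overall figures (101 sessions, roughly 70 hours) are consistent with each other.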

11 Moira Ross, 68, Aberdeen, Scotland

12 Data Collection Sample
M1: Okay, I think we’re ready to start looking at your pictures now. Please tell me about your first photo.
F1: Okay, that’s at a friend’s wedding and that’s Martin and my son, Stefan, that’s a few years old now, wearing their kilts.
M1: How old is Stefan?
F1: I think in that picture he must have been about five?
M1: Is that Stefan on the right?
F1: It is, yes.
M1: Great. Is there anything else you would like to say about them?
F1: Yeah, well I remember that day, about… it was a friend, Chris’s wedding, so… and I think it was a… yeah… Stefan had his kilt outfit on that day.
M1: That’s very interesting, how does this photo make you feel?
F1: It just… it reminds me that it was winter, it was right after Christmas, that wedding, it was very cold that day.
M1: Okay, let’s move on to the next.
F1: That’s me and Martin in Gibraltar. That was very, very many years ago. We were visiting a friend in Gibraltar.
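The speaker labels in this sample (M1, F1) make it easy to split a simple-text transcript into turns. The following is a minimal sketch, assuming labels always have the form `M` or `F` plus a number followed by a colon; the function name `split_turns` is hypothetical, and the project's actual transcripts may use other conventions.

```python
import re

# Assumed label pattern: "M1:", "F1:", "F12:", ... (not confirmed by the source).
SPEAKER = re.compile(r"\b([MF]\d+):\s*")


def split_turns(text):
    """Split a flat transcript string into (speaker, utterance) pairs."""
    # With one capture group, re.split returns
    # [text_before_first_label, speaker1, utterance1, speaker2, utterance2, ...].
    parts = SPEAKER.split(text)
    return [
        (speaker, utterance.strip())
        for speaker, utterance in zip(parts[1::2], parts[2::2])
    ]


sample = (
    "M1:How old is Stefan? "
    "F1:I think in that picture he must have been about five? "
    "M1:Is that Stefan on the right? "
    "F1:It is, yes."
)
turns = split_turns(sample)
for speaker, utterance in turns:
    print(f"{speaker}: {utterance}")
```

Keeping the turns as pairs leaves the speaker label available for later steps, such as separating interviewer prompts from participant responses.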

13 Czech SC data recording
The set-up with the avatar wizard was chosen
A dedicated room has been established for the recording
The subject sees on the screen:
 – the photo currently being discussed
 – the avatar (“talking head”)
Audio is captured by high-quality wireless microphones
The subject is also simultaneously recorded by 3 miniDV cameras; the video is intended for future use in emotion detection, gesture recognition, etc.

14 Recording room setup

15 Current state of the Czech Data
50 subjects recorded (mostly seniors)
Average length of an interview is 55 minutes; average number of photos discussed per session is 8.5
It turns out that old people really DO enjoy discussing their photos with an artificial “companion” and reminiscing about them
On the other hand, people of “productive age” often tend to provide just a technical description of the photo being discussed

16 Availability and Format
We plan to make all this data publicly available
The most appropriate format is still an open issue
At least four data streams:
 – Audio
 – Transcription
 – Video
 – Images discussed
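Since the release format is still open, the sketch below shows only one possible way to tie the four streams together per session. The class name, field names, and file names are all hypothetical, chosen for illustration, and do not describe the project's actual format.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import List, Optional


@dataclass
class SessionRecord:
    """One recorded session with the four data streams named on the slide."""

    session_id: str
    audio: Optional[Path] = None
    transcription: Optional[Path] = None
    video: Optional[Path] = None
    images: List[Path] = field(default_factory=list)

    def missing_streams(self) -> List[str]:
        """List the streams that are not available for this session."""
        missing = []
        if self.audio is None:
            missing.append("audio")
        if self.transcription is None:
            missing.append("transcription")
        if self.video is None:
            missing.append("video")
        if not self.images:
            missing.append("images")
        return missing


# Hypothetical session that has audio and a transcript but no video or images.
record = SessionRecord(
    session_id="session-001",
    audio=Path("session-001.wav"),
    transcription=Path("session-001.trs"),
)
print(record.missing_streams())
```

Making every stream optional reflects the current status of the English data, where some sessions have a transcript but no video.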

17 Thank You
Comments and advice are welcome!