Browser Evaluation Test
…A Trial Run
Pierre Wellner & Mike Flynn, IDIAP
Fribourg, Nov 26, 2004
Mike Flynn, Pierre Wellner (IDIAP)
Simon Tucker, Steve Whittaker (University of Sheffield)
Outline
Reminder of BET
Trial Run
Results
Analysis
Future work
Reminder
What is a Browser for?
“Browsing a meeting recording is an attempt to find a maximum number of observations of interest in a minimum amount of time.”
“Observations of Interest”
–Pairs of complementary statements about the meeting
–Of interest to… the participants, or to people who missed the meeting.
Observers
–Unlimited access
–No time limit (actually 4½ x meeting time, on average)
Subjects
–Answer as many Questions as possible
–Time limit: ½ meeting time
–Questions are observation pairs, without indication of which statement is true
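A minimal sketch of how a Question could be built from an observation pair and shown to a subject without any indication of which statement is true. The class and function names are hypothetical, and the choice of which example statement counts as true is arbitrary here (the slides do not say); the statements themselves are taken from the examples later in the talk.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ObservationPair:
    """Two complementary statements; only one matches the meeting."""
    true_statement: str
    false_statement: str


def present(pair: ObservationPair, rng: random.Random) -> tuple[list[str], int]:
    """Return both statements in random order plus the index of the true one.

    The index stays with the test harness and is never shown to the subject.
    """
    statements = [pair.true_statement, pair.false_statement]
    rng.shuffle(statements)
    return statements, statements.index(pair.true_statement)


def is_correct(chosen_index: int, true_index: int) -> bool:
    """An answer is correct when the subject picks the true statement."""
    return chosen_index == true_index


# Which of the two statements is true is an assumption made for illustration.
pair = ObservationPair(
    true_statement="Martin wants to put the coffee machine along the left wall.",
    false_statement="Martin wants to put the coffee machine along the right wall.",
)
shown, true_index = present(pair, random.Random(0))
```

Shuffling per presentation keeps the position of the true statement from leaking information, so a subject with no media can do no better than guess.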
The BET Process
Trial Run: Observers
Needed native English speakers
–University of Sheffield
–Students, researchers, lecturers
Meetings: 1 x 44 minutes
Observers: 6
Observations: 294 (only 255 used)
Observer’s Screen Shot
Observations… about the observations
Examples:
Agnes thinks having the sofa along the whiteboard is a good idea.
Agnes thinks the sofa will be in the way if under the whiteboard.
Martin wants to put the coffee machine along the left wall.
Martin wants to put the coffee machine along the right wall.
Mainly about what was said, not done
Participants' names all in top ten words
–Others: the, of, to, at, is, that
283/294 (83%) refer to a participant by name
Observation density…
Observation Density Graph
Trial Run: Subjects
11f + 13m = 24 total
University of Sheffield
Three conditions:
“Guess” - no media whatsoever
“Base” - same media as Observers
“F1” - Ferret with Brno ASR transcript + slides + speaker segmentations
Guess Condition Screen Shot
Base Condition Screen Shot
F1 Condition Screen Shot
Results: Guess Condition
Subject     Answers  Correct  Incorrect  Score
A                                        %
A                                        %
A                                        %
Total                                    %
Results: Base Condition
Subject     Answers  Correct  Incorrect  Score
B                                        %
B                                        %
B                                        %
B                                        %
B5          5        2        3          40%
B6          3        1        2          33%
B                                        %
B8          5        4        1          80%
B9          8        3        5          37%
B                                        %
B                                        %
Base Total                               %
Results: F1 Condition
Subject     Answers  Correct  Incorrect  Score
C                                        %
C2          6        3        3          50%
C                                        %
C                                        %
C                                        %
C                                        %
C                                        %
C                                        %
C                                        %
C                                        %
F1 Total                                 %
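The Score column appears to be the share of a subject's answers that were correct. The sketch below recomputes it for the rows that survive in the results tables. Truncating to a whole percent is an assumption, chosen only because it reproduces the 37% shown for B9 (3 of 8); the function name is hypothetical.

```python
# Rows recoverable from the results tables: (subject, answers, correct, incorrect).
ROWS = [
    ("B5", 5, 2, 3),
    ("B6", 3, 1, 2),
    ("B8", 5, 4, 1),
    ("B9", 8, 3, 5),
    ("C2", 6, 3, 3),
]


def score_percent(correct: int, answers: int) -> int:
    """Percentage of answers that were correct, truncated to a whole percent."""
    if answers <= 0:
        raise ValueError("a subject needs at least one answer to be scored")
    return 100 * correct // answers


scores = {}
for subject, answers, correct, incorrect in ROWS:
    # Every answer is either correct or incorrect.
    assert correct + incorrect == answers
    scores[subject] = score_percent(correct, answers)

for subject, score in scores.items():
    print(f"{subject}: {score}%")
```

Because each Question offers two statements, a score near 50% is what pure guessing would be expected to produce.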
Details
Scores by time
Media time-difference
Speed versus accuracy
Scores by Time
Results by time, overlaid
Proximity of Answers to Questions
Media time difference histogram
Speed versus Accuracy
Speed versus Accuracy graph
BET scores
Condition   Speed  Accuracy
Guess              %
Base               %
F1                 %
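The slides do not define Speed or Accuracy precisely. As a rough sketch, assume speed is answers per minute of the allowed time (half of the 44-minute meeting) and accuracy is the fraction of answers that were correct. Both definitions and all names below are assumptions for illustration.

```python
MEETING_MINUTES = 44                        # length of the trial-run meeting
TIME_LIMIT_MINUTES = MEETING_MINUTES / 2    # subjects get half the meeting time


def speed(answers: int, minutes: float = TIME_LIMIT_MINUTES) -> float:
    """Answers per minute (assumed definition of Speed)."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return answers / minutes


def accuracy(correct: int, answers: int) -> float:
    """Fraction of answers that were correct (assumed definition of Accuracy)."""
    if answers <= 0:
        raise ValueError("answers must be positive")
    return correct / answers


# Subject B9 from the Base condition: 8 answers, 3 of them correct.
b9_speed = speed(8)
b9_accuracy = accuracy(3, 8)
print(f"B9: {b9_speed:.2f} answers/min, accuracy {b9_accuracy:.1%}")
```

Reporting the two numbers separately keeps a fast but careless subject distinguishable from a slow but careful one.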
Future work
AMI recording 100-hour corpus
More observations
More subjects
–reduce confidence interval (~18% wide)
Design, test & compare browser improvements
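To illustrate why more observations and subjects narrow the confidence interval, here is a sketch using the normal-approximation (Wald) interval for a proportion. The slides do not say how the ~18% figure was obtained, so the method, the 95% level, and the illustrative accuracy of 0.5 are all assumptions.

```python
import math


def ci_width(p: float, n: int, z: float = 1.96) -> float:
    """Full width of a normal-approximation interval for a proportion p over n answers."""
    if n <= 0 or not 0.0 <= p <= 1.0:
        raise ValueError("need n > 0 and 0 <= p <= 1")
    return 2 * z * math.sqrt(p * (1 - p) / n)


def answers_needed(p: float, target_width: float, z: float = 1.96) -> int:
    """Smallest n whose interval is no wider than target_width."""
    if target_width <= 0:
        raise ValueError("target_width must be positive")
    return math.ceil(p * (1 - p) * (2 * z / target_width) ** 2)


# Illustrative accuracy of 0.5; 0.18 is the interval width quoted on the slide.
n_for_18 = answers_needed(0.5, 0.18)
n_for_9 = answers_needed(0.5, 0.09)
print(n_for_18, n_for_9)
```

Under this approximation the width shrinks with the square root of the number of answers, so halving the interval takes roughly four times as many answers.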