11/4/1999 ACM Multimedia 99: Auto-Summarization of Audio-Video Presentations. Li-wei He, Elizabeth Sanocki, Anoop Gupta, Jonathan Grudin. Collaboration and Multimedia Group, Microsoft Research
Motivation On-demand multimedia is becoming pervasive –Corporate training and communication At Microsoft, over 360 courses online in two years –Research seminars Microsoft Research archives about 2 talks daily
Motivation (Cont.) Effective summarization and browsing techniques can help viewers utilize time better –Audio-video different from text –Many approaches possible Time-compression, indexes, highlights, … This talk focuses on: –Informational presentations –Automatic summarization methods
What Is a Video Summary? Assembled from segments of the original
The 4 Cs of a Good Summary Conciseness: as short as possible Coverage: covers key points Context: defines terms before using them Coherence: flows naturally and fluidly
Talk Outline Introduction Automatic summarization –Sources of information in A/V presentations –Three algorithms Evaluation
Sources of Information Audio and video –Pitch and pause information Speaker actions –Slide-transition points End-user actions –Video segments watched by earlier viewers
Auto-summarization Methods
Information source | Method 1 (S) | Method 2 (P) | Method 3 (SPU)
Slide transition | X | - | X
Pitch analysis | - | X | X
User access log | - | - | X
Slide-based Method (S) Rationale –Beginning of a slide marks a new topic –Time devoted to slide indicates its importance Algorithm –First N% of video for each slide
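The slide-based rule can be illustrated with a minimal sketch. The function name, the use of seconds as the time unit, and the example values are assumptions for illustration; the slide only states that the first N% of each slide's video is kept.

```python
def slide_based_summary(slide_starts, video_end, fraction):
    """Return (start, end) segments covering the first `fraction` of each slide.

    slide_starts: slide-transition times in seconds, in ascending order.
    video_end:    total length of the video in seconds.
    fraction:     N% expressed as a value in (0, 1].
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    # Each slide ends where the next one begins; the last ends with the video.
    boundaries = list(slide_starts) + [video_end]
    segments = []
    for start, end in zip(boundaries, boundaries[1:]):
        segments.append((start, start + fraction * (end - start)))
    return segments


# Hypothetical talk: three slides, 10 minutes long, 25% summary.
segments = slide_based_summary([0.0, 120.0, 300.0], video_end=600.0, fraction=0.25)
```

Because every slide contributes the same fraction of its own duration, slides that the speaker spent more time on automatically receive more summary time.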
Pitch-based Method (P) Rationale –Pitch activity indicates the speaker's emphasis Algorithm (based on Arons, ICSLP 94) –Compute pitch for every 1 ms frame –Count the number of frames above a threshold in 15-second windows –Select the windows with the highest counts
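The three steps of the pitch-based method can be sketched roughly as follows. This is not the authors' implementation: pitch extraction itself is assumed to have been done already, unvoiced frames are assumed to carry a pitch of 0, the windows are assumed to be non-overlapping, and the threshold is passed in as a parameter because the slide does not say how it is chosen.

```python
def pitch_based_summary(pitch_hz, frame_ms, threshold_hz, window_s, num_windows):
    """Pick the windows with the most frames whose pitch exceeds a threshold.

    pitch_hz:     one pitch estimate per frame (0 for unvoiced frames, assumed).
    frame_ms:     frame length in milliseconds (1 on the slide).
    threshold_hz: pitch threshold (how it is set is not specified on the slide).
    window_s:     window length in seconds (15 on the slide).
    num_windows:  how many windows to keep in the summary.
    """
    frames_per_window = int(window_s * 1000 / frame_ms)
    total_s = len(pitch_hz) * frame_ms / 1000

    # Count the frames above the threshold in each non-overlapping window.
    counts = []
    for start in range(0, len(pitch_hz), frames_per_window):
        window = pitch_hz[start:start + frames_per_window]
        counts.append(sum(1 for p in window if p > threshold_hz))

    # Rank windows by count, keep the top ones, and restore time order.
    ranked = sorted(range(len(counts)), key=lambda i: counts[i], reverse=True)
    chosen = sorted(ranked[:num_windows])
    return [(i * window_s, min((i + 1) * window_s, total_s)) for i in chosen]


# Hypothetical 60-second recording with 1 ms frames.
pitch = [100.0] * 15000 + [250.0] * 15000 + [100.0] * 15000 + [100.0] * 7500 + [250.0] * 7500
selected = pitch_based_summary(pitch, frame_ms=1, threshold_hz=200.0, window_s=15, num_windows=2)
```

In this toy input the second window is entirely above the threshold and the fourth is half above it, so those two windows are selected and returned in chronological order.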
Combined Method (SPU) The amount of time that previous viewers spent on a slide indicates importance
Combined Method (SPU) Algorithm –Compute importance measure for each slide –Allocate summary time for each slide according to the importance measure –Use pitch-based algorithm to pick the segments in each slide Importance of Slide N = (Average Viewer Count of Slide N) / (Average Viewer Count of Slide N-1)
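The allocation and selection steps of the combined method might look like the sketch below. Several choices here are assumptions rather than details from the slide: time is allocated in direct proportion to the importance values, each slide's budget is rounded down to whole 15-second windows, and per-window pitch counts are taken as precomputed inputs.

```python
def allocate_summary_time(importance, summary_seconds):
    """Split the summary length across slides in proportion to importance."""
    total = sum(importance)
    return [summary_seconds * w / total for w in importance]


def pick_segments_in_slide(window_counts, slide_start, window_s, budget_s):
    """Within one slide, keep the windows with the highest pitch-activity counts.

    window_counts: above-threshold frame count for each window of the slide.
    budget_s:      summary time allocated to this slide, rounded down to
                   whole windows (an assumption made for this sketch).
    """
    n = int(budget_s // window_s)
    ranked = sorted(range(len(window_counts)),
                    key=lambda i: window_counts[i], reverse=True)
    chosen = sorted(ranked[:n])
    return [(slide_start + i * window_s, slide_start + (i + 1) * window_s)
            for i in chosen]


# Hypothetical importance values for three slides and a 90-second summary.
budgets = allocate_summary_time([1.0, 0.5, 1.5], summary_seconds=90)

# Third slide: starts at 300 s, four 15-second windows with hypothetical counts.
slide3 = pick_segments_in_slide([10, 400, 50, 300], slide_start=300,
                                window_s=15, budget_s=budgets[2])
```

The third slide receives half of the summary time because it holds half of the total importance, and its three most active windows fill that budget.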
Talk Outline Introduction Automatic summarization Evaluation –Experimental design –Results
Experimental Design To compare summarization techniques –Original presenters (authors) created summaries (A) as gold standard –Authors wrote quiz questions that covered the content of summaries –Objective measure: quiz score improvement after watching a summary –Subjective measures: user survey
Experimental Design (Cont.) 4 summary types (S, P, SPU, A) 4 talks chosen from Microsoft training site 24 Microsoft employees were subjects –Summary types and talks are counter-balanced within each subject
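One way to picture a counter-balanced design of this size is the sketch below. It is only an illustration, not the assignment actually used in the study: it relies on the observation that 4 summary types have 4! = 24 orderings, one per subject, and it ignores the order in which talks are shown. The talk names are those listed on the Summary Characteristics slide.

```python
from collections import Counter
from itertools import permutations

SUMMARY_TYPES = ["S", "P", "SPU", "A"]
TALKS = ["UI Design", "Internet Explorer", "Dynamic HTML",
         "Microsoft Transaction Server"]


def counterbalanced_assignments(talks, summary_types):
    """Give each subject a distinct talk-to-summary-type pairing.

    Every subject sees each talk once and each summary type once.
    """
    return [dict(zip(talks, perm)) for perm in permutations(summary_types)]


assignments = counterbalanced_assignments(TALKS, SUMMARY_TYPES)

# How often each (talk, summary type) pair occurs across all subjects.
pair_counts = Counter((talk, kind)
                      for subject in assignments
                      for talk, kind in subject.items())
```

With all 24 permutations in use, every (talk, summary type) pair is seen by the same number of subjects, so differences between summary types are not confounded with differences between talks.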
Demo Summary
Quiz Score Improvement As expected, author-created summaries did best No significant difference among the automatic methods
Survey Rating Results A >> SPU > P = S [Chart: ratings for Context (1-7), Concise (1-7), Coherent (1-7), and Coverage (%), by summary type (A, SPU, P, S)]
Percent of Value Derived From slide content: 46% From audio content: 36% From video content: 18%
Interesting Sequence Effect [Table: columns Order, Clear, Choppy, Overall; cell values not recovered]
Conclusions Ability to skim/browse will be key to wide use Automated methods can add significant value –Adding domain knowledge is important –Increasing acceptance over time Evaluation is key but very difficult
Conclusions (Cont.) Getting the human into the loop –Speakers –End-users as a group E.g. collaborative filtering –End-users as individuals E.g. interactive browsing Visit us at:
Interface of a Typical Talk Table of contents Video Slides VCR-like controls
Summary Characteristics Talks were from MS internal training site –UI Design, Internet Explorer, Dynamic HTML, Microsoft Transaction Server Average length –20% to 25% of the original –10 to 14 minutes Overlap with author-created summaries was no better than chance
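To make the "no better than chance" comparison concrete, the sketch below measures how many seconds two summaries share and compares that with a simple baseline. The slide does not say how chance was computed; the baseline used here assumes the two summaries are placed independently of each other, so the expected shared time is the product of their lengths divided by the talk length. All numbers are hypothetical.

```python
def total_length(segments):
    """Total seconds covered by a list of non-overlapping (start, end) segments."""
    return sum(end - start for start, end in segments)


def overlap_seconds(a, b):
    """Seconds covered by both summaries.

    Segments within each list are assumed not to overlap one another.
    """
    return sum(max(0.0, min(a_end, b_end) - max(a_start, b_start))
               for a_start, a_end in a
               for b_start, b_end in b)


def chance_overlap_seconds(a, b, talk_seconds):
    """Expected overlap if the two summaries were placed independently."""
    return total_length(a) * total_length(b) / talk_seconds


# Hypothetical 50-minute talk with two 10-minute summaries (20% each).
auto_summary = [(0, 300), (1000, 1300)]
author_summary = [(200, 500), (2000, 2300)]

observed = overlap_seconds(auto_summary, author_summary)
expected = chance_overlap_seconds(auto_summary, author_summary, talk_seconds=3000)
```

In this made-up case the observed overlap falls slightly below the independence baseline, which is the kind of outcome the slide describes.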
Survey on the Summary Just Watched Concise: It captured the essence of the talk without using too many sentences Coverage: My confidence that it covered the key points of the talk is … Context: It is clear and easy to understand Coherent: It provided reasonable context, transitions, and sentence flow
Survey Rating Results A >> SPU > P = S
Information Not Used Spoken text content Speaker gestures
Talk Outline Introduction –Motivation –Definition of a video summary –Attributes of a good summary Automatic summarization Evaluation
Viewers Over Time for One Talk Viewer number decreases overall and within each slide
Importance Measure Importance of Slide N = (Average Viewer Count of Slide N) / (Average Viewer Count of Slide N-1)
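The importance measure can be computed from an access log along the lines of the sketch below. The log format is an assumption: viewer counts are taken to be available as one value per second of the talk, and slide starts are given as second offsets. The first slide has no predecessor, so its importance is set to a default of 1.0, which is also an assumption and not something the slide specifies.

```python
def average_viewer_counts(viewers_per_second, slide_starts):
    """Average number of viewers over the duration of each slide."""
    boundaries = list(slide_starts) + [len(viewers_per_second)]
    return [sum(viewers_per_second[a:b]) / (b - a)
            for a, b in zip(boundaries, boundaries[1:])]


def slide_importance(avg_counts, first_slide_importance=1.0):
    """Importance of slide N = average count of slide N / average count of slide N-1."""
    importance = [first_slide_importance]
    for prev, cur in zip(avg_counts, avg_counts[1:]):
        # Guard against a slide that nobody watched (assumed to yield 0).
        importance.append(cur / prev if prev > 0 else 0.0)
    return importance


# Hypothetical 12-second log covering three slides that start at 0 s, 4 s, and 8 s.
viewers = [10, 10, 10, 10, 8, 8, 8, 8, 8, 8, 4, 4]
averages = average_viewer_counts(viewers, slide_starts=[0, 4, 8])
importance = slide_importance(averages)
```

Using a ratio to the previous slide, rather than the raw count, compensates for the overall decline in viewers over the course of a talk: a slide scores well if it holds on to the audience that reached it.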
Author-created Summary (A) Original presenters (authors) were asked to produce summaries of the talks –Author marked the text transcript –Video summaries were generated manually by aligning the video with the marked portions
Summary Automatic algorithms performed respectably –"That's pretty cool for a computer. I thought someone had sat down and made them." –SPU was preferred over S and P Will viewers get used to auto summary?
Future Work Compare audio/video and text summaries Interactive and intelligent video browser Visit us at