Automatic Video Editing Stanislav Sumec
Motivation: Multiple sources of video data are available (several cameras in a meeting room, several meeting rooms in a teleconference, an auditorium, …). Only a subset of the available data can be presented at each moment, because a viewer cannot watch everything. The camera showing “the most interesting” and “the most important” activity has to be selected. Shot composition should satisfy predefined rules (e.g. give results similar to TV programs edited by a human). Editing should be adjustable to the desired information (user requirements, e.g. the activity of a selected participant).
Possible Applications: Offline editing, i.e. production of small-size audio-video packages summarizing meetings, lectures, …; live editing, i.e. teleconferences (reduction of data flow, selection of the projected remote participant, …) and broadcasting of meetings, lectures, …
Main Idea: The assumptions are that activities on every camera can be evaluated and that the methodology of video editing can be described by a set of rules. The algorithm evaluates a measure of interest for all cameras, then assigns one camera, or several blended cameras, to each time point of the meeting with respect to the given aspects and requirements. The design is extensible: new rules can be integrated easily (e.g. for a new activity), new types of events can be processed, …
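The algorithm outlined on this slide can be sketched roughly as follows. This is a minimal illustration, not the engine described in the talk: the rule signature, the additive combination of rule scores, the camera names, and all numbers are assumptions made for the example, and blending of several cameras is left out.

```python
from typing import Callable, Dict, List

# Hypothetical rule signature: (camera name, time step) -> interest score.
Rule = Callable[[str, int], float]


def interest(camera: str, t: int, rules: List[Rule]) -> float:
    """Combine all rules into one measure of interest (assumed: plain sum)."""
    return sum(rule(camera, t) for rule in rules)


def select_cameras(cameras: List[str], n_steps: int, rules: List[Rule]) -> List[str]:
    """Assign the camera with the highest interest to each time step."""
    return [
        max(cameras, key=lambda cam: interest(cam, t, rules))
        for t in range(n_steps)
    ]


# Made-up speech activity per camera and time step.
SPEECH: Dict[str, List[float]] = {
    "overview": [0.2, 0.2, 0.6, 0.1],
    "speaker": [0.9, 0.8, 0.3, 0.7],
    "screen": [0.1, 0.5, 0.4, 0.9],
}


def speech_rule(camera: str, t: int) -> float:
    """Score a camera by the speech activity it shows."""
    return SPEECH[camera][t]


def prefer_screen(camera: str, t: int) -> float:
    """Hypothetical user requirement: favour the projection screen."""
    return 0.5 if camera == "screen" else 0.0


cameras = ["overview", "speaker", "screen"]
print(select_cameras(cameras, 4, [speech_rule]))
# ['speaker', 'speaker', 'overview', 'screen']
print(select_cameras(cameras, 4, [speech_rule, prefer_screen]))
# ['speaker', 'screen', 'screen', 'screen']
```

Adding a rule is just appending another function to the list, which is one simple way to read the extensibility claim; the actual engine may combine rules differently.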
Current State: A general engine that processes various types of rules has been designed (common to different events, e.g. meetings, lectures, …). Virtual cameras that simulate detail cameras are supported (for cases where physical cameras are not available). Elementary rules for activity evaluation are available (for AMI meetings, M4 meetings, lectures): speech activity; other high-level participant activities (taking notes, standing, sitting, moving, …); motion and localization of participants based on low-level features; projection screen localization and slide change detection. Rules covering some film-making methodology have been proposed: timing of shot switching (minimal and maximal length, periodicity, …); classification of views (distant view, detail, half-detail, …); definition of the view types that may follow a given view type; a general model of the resulting video (introduction, presentation, conclusion). Simple summarization rules based on measured activity have been tested.
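To make the film-making rules more concrete, the sketch below applies two of them to a per-step list of preferred cameras: a minimal shot length and a table of view types that may follow a given view type. The transition table, camera names, and function name are hypothetical; the slides do not give the actual rule values, and maximal shot length and periodicity are not handled here.

```python
from typing import Dict, List, Optional, Set

# Hypothetical table: which view types may follow a given view type.
ALLOWED_NEXT: Dict[str, Set[str]] = {
    "distant": {"half-detail"},
    "half-detail": {"distant", "detail"},
    "detail": {"half-detail", "detail"},
}


def apply_editing_rules(
    preferred: List[str], view_type: Dict[str, str], min_len: int
) -> List[str]:
    """Follow the preferred camera, but cut only when the rules allow it."""
    result: List[str] = []
    current: Optional[str] = None
    held = 0  # number of steps the current shot has lasted so far
    for cam in preferred:
        if current is None:
            current, held = cam, 0  # first shot: take the preference as is
        elif (
            cam != current
            and held >= min_len  # timing rule: shot is long enough
            and view_type[cam] in ALLOWED_NEXT[view_type[current]]  # view rule
        ):
            current, held = cam, 0  # cut to the preferred camera
        result.append(current)
        held += 1
    return result


view_type = {"overview": "distant", "table": "half-detail", "speaker": "detail"}
preferred = ["overview", "speaker", "speaker", "speaker", "table", "table", "speaker"]
print(apply_editing_rules(preferred, view_type, min_len=2))
# ['overview', 'overview', 'overview', 'overview', 'table', 'table', 'speaker']
```

In this simplified version a blocked cut just holds the current shot; it does not route through an intermediate view, and the last shot may end up shorter than `min_len` when the input runs out.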
Evaluation: A methodology for evaluating the results has been defined. It combines experiments with human viewers and synthetic experiments. Particular rules are tested with viewers using a comparative method (time-consuming, expensive). Overall quality is evaluated using synthetic criteria (repeatable, cheap). Technical criteria evaluate whether the desired information is included in the resulting videos. Pseudo-aesthetic criteria check shot composition (whether any rule was violated).
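The two kinds of synthetic criteria could look something like the sketch below. It assumes the edited video is represented as one camera name per time step and that the desired information is given as the set of cameras showing it at each step; both representations, the function names, and the example data are assumptions for illustration, not the criteria used in the experiments.

```python
from itertools import groupby
from typing import List, Set, Tuple


def shots(schedule: List[str]) -> List[Tuple[str, int]]:
    """Collapse a per-step schedule into (camera, shot length) pairs."""
    return [(cam, len(list(group))) for cam, group in groupby(schedule)]


def coverage(schedule: List[str], desired: List[Set[str]]) -> float:
    """Technical criterion: fraction of steps showing the desired information."""
    hits = sum(1 for cam, wanted in zip(schedule, desired) if cam in wanted)
    return hits / len(schedule)


def short_shot_violations(schedule: List[str], min_len: int) -> int:
    """Pseudo-aesthetic criterion: number of shots shorter than min_len."""
    return sum(1 for _, length in shots(schedule) if length < min_len)


schedule = ["A", "A", "A", "B", "A", "A"]
# Hypothetical ground truth: cameras that show the desired content per step.
desired = [{"A"}, {"A"}, {"B"}, {"B"}, {"A"}, {"A", "B"}]

print(shots(schedule))                      # [('A', 3), ('B', 1), ('A', 2)]
print(round(coverage(schedule, desired), 3))  # 0.833
print(short_shot_violations(schedule, 2))   # 1
```

Because such measures are computed directly from the schedule, they can be rerun after every rule change, which fits the "repeatable, cheap" property noted on the slide.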
Results: The algorithm has been adapted to all meeting rooms used in the AMI corpus and the M4 corpus, and to recordings of lectures (BUT). Offline generation of videos is possible using a set of command-line tools. A working real-time demo has been implemented. Some demonstration videos have been generated as examples. The designed rules have been tested using the proposed methodology. Detailed results of the experiments are presented at MLMI07 (Evaluation of Automatic Video Editing).
System Overview [diagram; labels: Setup (desired information, editing properties, …); Video editing algorithm (rules); Scenario; Feature extraction; Video editor; Input video streams; Output video stream; Resize, compression; Camera selection]
Remote Meeting Room [diagram; labels: AVE algorithm; Streaming server (JMF, Real, Microsoft); AV stream; User requirements; Remote participant; Java applet or ActiveX component; AV streams cutting; Scenario; AV stream]