Lessons Learned From Building a Terabyte Digital Video Library, presented by Jia Yao, Multimedia Communications and Visualization Laboratory


1. Lessons Learned From Building a Terabyte Digital Video Library
Presented by Jia Yao
Multimedia Communications and Visualization Laboratory
Department of Computer Engineering & Computer Science
University of Missouri-Columbia
Columbia, MO 65211

2. Lessons Learned From Building a Terabyte Digital Video Library
- Howard D. Wactlar, Michael G. Christel, Yihong Gong, and Alexander G. Hauptmann, “Lessons Learned From Building a Terabyte Digital Video Library,” IEEE Computer, vol. 32, no. 2, pp. 66-73, Feb. 1999.
- The Informedia Project at Carnegie Mellon University, begun in 1994, was one of six projects funded by the US National Science Foundation, the US Defense Advanced Research Projects Agency, and the National Aeronautics and Space Administration under the US Digital Library Initiative.

3. Lessons Learned From Building a Terabyte Digital Video Library
- Challenges of building such a video library:
  - how to embed information
  - how to handle the voluminous file size
  - how to deal with the temporal characteristic of video
- This paper talks about:
  - automatically extracting information from digitized video
  - creating interfaces that allow users to search for and retrieve videos based on the extracted information
  - validating the system through user testbeds

4. Video Processing
- Two types of video data: news video and documentary video
- Video retrieval integrates speech processing, image processing, and information retrieval techniques
- Speech processing:
  - the CMU Sphinx speech recognition system is used to generate a complete transcript of the speech in the video
  - Sphinx's word error rate is inversely proportional to the amount of processing time; running the algorithm on parallel machines gives excellent results in two to three times the real-time duration of the speech
  - the error rate can be lowered further by using general language models
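The word error rate mentioned on this slide is conventionally computed as the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the scoring code used by Sphinx or the Informedia project; the function name and the sample sentences are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] is the edit distance between the reference words processed so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: two substitutions and one deletion in ten words.
reference = "the senate passed the new budget bill late last night"
hypothesis = "the senate past the new budget build late night"
print(f"{word_error_rate(reference, hypothesis):.0%}")  # prints "30%"
```

Note that insertions can push this measure above 100%, so it is an error rate relative to the reference length rather than a true percentage of wrong words.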

5. Video Processing
- Information retrieval:
  - although the word error rate is as high as 30%, information retrieval precision and recall are degraded by less than 10%
  - redundancy in language helps the retrieval of video based on speech recognition
  - use of phonetic transcription will also help to reduce the error rate
  - providing more match candidates ensures that possible matching objects are not lost
  - problem: training based on a small amount of speech data might not be sufficient
  - problem: errors in the automatic partitioning of video streams into video segments may affect information retrieval effectiveness
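Precision and recall, the two retrieval measures named on this slide, can be computed from the set of retrieved segments and the set of relevant segments. The sketch below uses the textbook definitions; the segment identifiers are hypothetical and nothing here reproduces the paper's evaluation data.

```python
def precision_recall(retrieved: set, relevant: set) -> tuple:
    """Return (precision, recall) for one query.

    precision = fraction of retrieved segments that are relevant
    recall    = fraction of relevant segments that were retrieved
    """
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: four segments returned, three segments truly relevant.
retrieved = {"seg01", "seg02", "seg03", "seg04"}
relevant = {"seg01", "seg02", "seg05"}
precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Running the same queries once against error-free transcripts and once against recognizer output, then comparing the two pairs of numbers, is one way to measure the kind of degradation the slide reports.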

6. Video Processing
- Image processing
  - the task is to fully characterize the scene and all objects within it, and to provide efficient, effective access to this information through indices and similarity measures
  - currently, image processing techniques are used to
    - partition each video segment into shots and choose a representative frame (key frame) for each shot -- usually the middle frame of the shot, but it can also be the last frame if the shot contains camera motion
      - shot: a video clip recorded with one continuous camera operation
      - segment: several shots describing a topic
    - identify and index features to support image similarity matching -- detect face regions in news video; retrieve by region of interest and color
    - create metadata (data that describes the structure of the raw data) derived from imagery for use in text matching -- Video OCR first finds the caption in the video, then uses OCR software to extract the text in it
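The key-frame rule on this slide (middle frame by default, last frame when the shot contains camera motion) can be written down directly. This is a minimal sketch under stated assumptions: the `Shot` record and its `has_camera_motion` flag are hypothetical, and how shots and camera motion are detected is outside its scope.

```python
from dataclasses import dataclass


@dataclass
class Shot:
    start_frame: int                 # first frame of the shot (inclusive)
    end_frame: int                   # last frame of the shot (inclusive)
    has_camera_motion: bool = False  # assumed to come from a separate detector


def key_frame_index(shot: Shot) -> int:
    """Pick the representative frame for one shot."""
    if shot.end_frame < shot.start_frame:
        raise ValueError("end_frame must not precede start_frame")
    if shot.has_camera_motion:
        # With camera motion, use the frame where the motion has ended.
        return shot.end_frame
    # Otherwise use the middle frame of the shot.
    return (shot.start_frame + shot.end_frame) // 2


static_shot = Shot(start_frame=100, end_frame=200)
panning_shot = Shot(start_frame=201, end_frame=320, has_camera_motion=True)
print(key_frame_index(static_shot), key_frame_index(panning_shot))  # prints "150 320"
```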

7. Video Processing
- Future challenges:
  - content-based retrieval
  - effective segmentation of video into story segments, and of each story into shots -- transcript information (such as closed captions) and a language model are needed to help segmentation

8. Informedia Interface
- Because the underlying speech, image, and language processing (formerly referred to as information retrieval in the paper) are imperfect and produce ambiguous, incomplete metadata, powerful browsing capabilities are essential in a multimedia information retrieval system
- The use of headlines:
  - a search result contains several video segments, displayed as thumbnail images; the headline of a segment pops up when the mouse is moved over its thumbnail
  - phrases are evaluated using a statistical approach, then used as components of the headlines
  - significant information is given first in the headline, followed by explanation
  - segment size and recording date are provided

9. Informedia Interface
- The use of the thumbnail:
  - the thumbnail of each segment has to be chosen very carefully in order to get good performance
  - using the key frame of the first shot in the segment -- result not good
  - using the key frame of the most related shot in the segment -- good
- The use of the filmstrip:
  - key frames from a segment's shots can be presented in sequential order as filmstrips
  - quickly shows the content of one segment
  - a match bar (showing the matched query words) lets the user quickly determine the location of interest, jump directly to that point, and start playback
  - transcripts can help hearing-impaired people
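Choosing the thumbnail from the "most related" shot can be sketched as picking the shot whose transcript words match the query best. The scoring used here (a plain count of query-term occurrences) is an assumption made for illustration, not the measure used in Informedia, and the `ShotText` record is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ShotText:
    key_frame: int  # frame index of the shot's key frame
    words: list     # transcript words spoken during the shot


def query_hits(shot: ShotText, query_terms: list) -> int:
    """Count transcript words in the shot that match any query term."""
    terms = {term.lower() for term in query_terms}
    return sum(1 for word in shot.words if word.lower() in terms)


def pick_thumbnail(shots: list, query_terms: list) -> int:
    """Return the key frame of the shot that best matches the query.

    max() keeps the earliest shot on ties, so a segment with no matches
    falls back to the key frame of its first shot.
    """
    if not shots:
        raise ValueError("a segment needs at least one shot")
    best = max(shots, key=lambda shot: query_hits(shot, query_terms))
    return best.key_frame


segment = [
    ShotText(key_frame=150, words="good evening here is the news".split()),
    ShotText(key_frame=480, words="the shuttle launch was delayed".split()),
    ShotText(key_frame=910, words="nasa says the shuttle will launch friday".split()),
]
print(pick_thumbnail(segment, ["nasa", "shuttle", "launch"]))  # prints "910"
```

The per-shot hit counts computed by `query_hits` are also the kind of information a match bar could display along the filmstrip.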

10. Informedia Interface
- The use of skims:
  - a skim incorporates both video and audio information from a longer source, so that a two-minute skim can represent a 20-minute original video
  - originally, subsampled video was used as the skim -- performance not very good
  - improvement: select frames based on phrases rather than individual words in the transcript
- Conclusion:
  - interaction is an important part of a digital video library
  - integrated audio, image, and language search can help to reduce the limitations of the individual methods
  - phrases play an important role in speech understanding
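The phrase-based skim described on this slide can be illustrated as a budgeted selection: a 20-minute source and a two-minute skim imply a compaction factor of about 10, so the skim has 120 seconds to fill with whole phrases. The greedy selection and the `score` field below are assumptions made for this sketch, not the project's actual skim algorithm.

```python
from dataclasses import dataclass


@dataclass
class Phrase:
    text: str
    start: float  # start time in the source video, seconds
    end: float    # end time in the source video, seconds
    score: float  # hypothetical importance score, higher is better


def build_skim(phrases: list, source_duration: float, compaction: float = 10.0) -> list:
    """Greedily pick the highest-scoring whole phrases that fit the time budget."""
    budget = source_duration / compaction
    chosen = []
    used = 0.0
    for phrase in sorted(phrases, key=lambda p: p.score, reverse=True):
        length = phrase.end - phrase.start
        if used + length <= budget:
            chosen.append(phrase)
            used += length
    # Play the selected phrases in their original order.
    return sorted(chosen, key=lambda p: p.start)


phrases = [
    Phrase("A", 0.0, 50.0, 0.9),
    Phrase("B", 300.0, 380.0, 0.8),
    Phrase("C", 600.0, 660.0, 0.5),
    Phrase("D", 900.0, 920.0, 0.4),
]
skim = build_skim(phrases, source_duration=20 * 60)
print([p.text for p in skim])  # prints "['A', 'C']"
```

Keeping phrases whole, instead of sampling at fixed intervals, is what separates this from the subsampled skim that the slide reports as performing poorly.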

