Published by Maryann Owens. Modified over 9 years ago.
1
Understanding the Semantics of Media. Lecture Notes on Video Search & Mining, Spring 2012. Presented by Jun Hee Yoo, Biointelligence Laboratory, School of Computer Science and Engineering, Seoul National University. http://bi.snu.ac.kr
2
Semantic Understanding: Problem Statement. Some tools attempt to segment video at a higher level, but this level of analysis tells us little about the meaning represented in the media. © 2012, SNU CSE Biointelligence Lab., http://bi.snu.ac.kr
3
Approach. Segmentation literature: use latent semantic indexing (LSI) because it allows us to quantify the position of a portion of the document in a multi-dimensional semantic space. We propose to summarize the text with LSI and analyze the resulting signal with smooth Gaussians. Semantic retrieval literature: use mixtures of probability experts for semantic-audio retrieval (MPESAR), a more sophisticated model connecting words and media.
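To make the idea of a "position in a multi-dimensional semantic space" concrete, here is a minimal LSI sketch using a truncated SVD. The term-document counts, the vocabulary, and the helper names `lsi_project` and `cosine` are made up for illustration; they are not taken from the lecture or the underlying work.

```python
import numpy as np

def lsi_project(term_doc, k):
    """Map each document (column of a term-document matrix) to k LSI coordinates."""
    # Truncated SVD: term_doc is approximated by U_k @ diag(s_k) @ Vt_k.
    u, s, vt = np.linalg.svd(term_doc, full_matrices=False)
    # Row i of the result holds the coordinates of document i.
    return (np.diag(s[:k]) @ vt[:k]).T

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical toy counts: rows are terms, columns are documents.
counts = np.array([
    [2, 1, 0, 0],  # "jet"
    [1, 2, 0, 0],  # "engine"
    [0, 0, 3, 1],  # "news"
    [0, 0, 1, 2],  # "weather"
], dtype=float)

coords = lsi_project(counts, k=2)
print(cosine(coords[0], coords[1]))  # documents sharing vocabulary
print(cosine(coords[0], coords[2]))  # documents with disjoint vocabulary
```

In this toy case the two aviation documents land at the same point in the reduced space, while documents with disjoint vocabulary are orthogonal.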
4
Analysis Tools
5
Segmenting Video: Temporal Properties of Video. Color: it provides robust evidence for a shot change in a video signal. However, it cannot tell us the global structure of the video. Random words from a transcript: the words indicate a lot about the overall structure of the story.
6
Segmenting Video: Test Material. CNN Headline News (30-minute TV show). 21st Century Jet (documentary). Automatic speech recognition (ASR) provides a transcript of the audio.
7
Segmenting Video: Scale Space. Convert the original signal into scale space. In scale space, we analyze a signal with many different kernels. Figure labels: with low-pass filter; histogram.
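A scale-space representation can be sketched by smoothing the same signal with Gaussian kernels of increasing width. This is only an illustration under assumptions: the kernel truncation at three standard deviations, the edge padding, the chosen `sigmas`, and the function names are not from the slides.

```python
import numpy as np

def gaussian_kernel(sigma):
    """Normalized Gaussian kernel truncated at three standard deviations."""
    radius = int(np.ceil(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x**2 / (2 * sigma**2))
    return kernel / kernel.sum()

def scale_space(signal, sigmas):
    """Return one smoothed copy of the signal per scale, stacked as rows."""
    rows = []
    for sigma in sigmas:
        kernel = gaussian_kernel(sigma)
        # Repeat the edge values so the output keeps the input length.
        padded = np.pad(signal, len(kernel) // 2, mode="edge")
        rows.append(np.convolve(padded, kernel, mode="valid"))
    return np.stack(rows)

# Hypothetical 1-D feature track with a single abrupt change.
step = np.concatenate([np.zeros(50), np.ones(50)])
space = scale_space(step, sigmas=[1.0, 2.0, 4.0])
print(space.shape)  # one row per scale, one column per time step
```

Larger values of `sigma` blur the step over a wider interval, which is what lets coarse scales expose only the most prominent changes.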
8
Segmenting Video: Combined Image and Audio Data. Combined color, words, and scale-space analysis. The result is a 20-dimensional vector function of time and scale.
9
Segmenting Video: Hierarchical Segmentation Results. Color and word autocorrelations for the Boeing 777 video.
10
Segmenting Video: Hierarchical Segmentation Results. Grouping 4-8 sentences produces a larger semantic autocorrelation.
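One way to picture sentence grouping and semantic autocorrelation is the sketch below. It assumes each sentence is already a nonzero vector (for example, LSI coordinates), groups consecutive sentences by summing their vectors, and measures the mean cosine similarity between groups a fixed lag apart. Both function names and the definition of autocorrelation used here are illustrative assumptions, not the measure used in the lecture.

```python
import numpy as np

def group_vectors(vectors, size):
    """Sum consecutive runs of `size` sentence vectors; a leftover tail is dropped."""
    n = (len(vectors) // size) * size
    return vectors[:n].reshape(-1, size, vectors.shape[1]).sum(axis=1)

def semantic_autocorrelation(vectors, lag):
    """Mean cosine similarity between vectors `lag` positions apart (lag >= 1)."""
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    return float(np.mean(np.sum(unit[:-lag] * unit[lag:], axis=1)))

# Hypothetical sentence vectors: 12 sentences in a 2-D semantic space.
sentences = np.arange(1, 25, dtype=float).reshape(12, 2)
groups = group_vectors(sentences, size=4)
print(groups.shape)
print(semantic_autocorrelation(groups, lag=1))
```

Running `semantic_autocorrelation` on the raw sentence vectors and on the grouped vectors gives a direct way to compare group sizes on real transcripts.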
11
Segmenting Video: Intermediate Results. A scale-space segmentation algorithm produced a boundary map showing the edges in the signal.
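A boundary map of this kind can be approximated, for a 1-D signal, by marking at each scale the positions where the slope of the Gaussian-smoothed signal peaks. The sketch below is a simplified stand-in, not the segmentation algorithm from the lecture; the `threshold` parameter, the helper names, and the test signal are assumptions.

```python
import numpy as np

def smooth(signal, sigma):
    """Gaussian smoothing with edge padding; output has the input length."""
    radius = int(np.ceil(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x**2 / (2 * sigma**2))
    kernel /= kernel.sum()
    padded = np.pad(signal, radius, mode="edge")
    return np.convolve(padded, kernel, mode="valid")

def boundary_map(signal, sigmas, threshold=0.01):
    """Boolean map (scale x position) of local maxima of the absolute slope."""
    rows = []
    for sigma in sigmas:
        slope = np.abs(np.diff(smooth(signal, sigma)))
        peak = np.zeros(len(slope), dtype=bool)
        # A peak rises from its left neighbor, does not fall below its right
        # neighbor, and exceeds the (assumed) noise threshold.
        peak[1:-1] = (
            (slope[1:-1] > slope[:-2])
            & (slope[1:-1] >= slope[2:])
            & (slope[1:-1] > threshold)
        )
        rows.append(peak)
    return np.stack(rows)

# Hypothetical signal with one change between samples 49 and 50.
step = np.concatenate([np.zeros(50), np.ones(50)])
bmap = boundary_map(step, sigmas=[1.0, 2.0, 4.0])
print(np.flatnonzero(bmap[0]))  # boundary positions at the finest scale
```

Boundaries that persist across many rows of the map correspond to edges that survive heavy smoothing, which is the intuition behind reading a hierarchy from it.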
12
Segmenting Video: a comparison with ground truth. Left: estimated result. Right: ground truth.
13
Segmenting Video: Shot Boundary Segmentation. Uses a commercial product designed by YesVideo.
14
Segmenting Video: manual segmentation result.
15
Semantic Retrieval: MPESAR process.
16
Semantic Retrieval: acoustic signal processing chain; acoustic-to-semantic lookup.
17
Semantic Retrieval: Testing.
18
Retrieval Results. Histogram of true label ranks based on likelihoods from audio-to-semantic tests. Histogram of true label ranks based on likelihoods from semantic-to-acoustic tests.
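A histogram of true label ranks can be computed as sketched below: for each query, count how many candidate labels received a higher likelihood than the correct one. The likelihood values and labels are invented toy numbers, and `true_label_ranks` is a hypothetical helper, not part of MPESAR.

```python
import numpy as np

def true_label_ranks(likelihoods, true_labels):
    """Rank of the correct label per query; rank 1 means it scored highest."""
    true_scores = likelihoods[np.arange(len(true_labels)), true_labels]
    # Count the labels that scored strictly higher than the correct one.
    return 1 + np.sum(likelihoods > true_scores[:, None], axis=1)

# Hypothetical likelihoods: rows are queries, columns are candidate labels.
likelihoods = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.5, 0.4, 0.1],
])
true_labels = np.array([0, 1, 1])

ranks = true_label_ranks(likelihoods, true_labels)
# histogram[r - 1] is the number of queries whose true label had rank r.
histogram = np.bincount(ranks, minlength=likelihoods.shape[1] + 1)[1:]
print(ranks, histogram)
```

A histogram concentrated at rank 1 indicates that the correct label is usually the most likely one; a flat histogram indicates chance-level retrieval.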