[Diagram labels: Multimodal Analysis; Video Representation; Video Highlights Extraction; Video Browsing; Video Retrieval; Video Summarization.]
[Diagram labels: Video Content; Multimodal Analysis; Video Representation; Video Browsing; Video Retrieval; Video Summarization (Highlights-based Summarization; Table of Contents-based Summarization); Bottom-up; Top-down. Legend: content access; data flow.]
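[Diagram: highlights extraction pipeline. Representation levels, in order: Video with Audio Track; Play / Break; Audio-Visual Markers; Highlight Candidates; Highlight Groups. Processing steps, in order: Feature extraction & segmentation; Key audio-visual object detection; Audio-visual markers association; Grouping. Timeline labels: Play; Break; Visual Marker; Audio Marker; A highlight.]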
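The "Audio-visual markers association" step can be illustrated with a minimal sketch. The slide does not say how markers are associated, so the rule used here is an assumption: a visual marker is paired with the first audio marker that starts during it or within a short gap after it. The function name `associate_markers`, the `max_gap` parameter, and the sample intervals are all hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start_seconds, end_seconds)


def associate_markers(
    visual: List[Interval],
    audio: List[Interval],
    max_gap: float = 5.0,
) -> List[Interval]:
    """Pair each visual marker with the first audio marker that starts
    during it or at most `max_gap` seconds after it ends.

    ASSUMPTION: this temporal-proximity rule is illustrative only; the
    slides do not specify the actual association method.
    """
    candidates = []
    for v_start, v_end in visual:
        for a_start, a_end in audio:
            if v_start <= a_start <= v_end + max_gap:
                # The highlight candidate spans both markers.
                candidates.append((v_start, max(v_end, a_end)))
                break
    return candidates


# Hypothetical marker times: a catcher pose followed by cheering,
# and a second catcher pose with no nearby audio marker.
visual_markers = [(10.0, 12.0), (40.0, 42.0)]
audio_markers = [(13.0, 18.0), (70.0, 75.0)]
highlight_candidates = associate_markers(visual_markers, audio_markers)
print(highlight_candidates)
```

Only the first visual marker yields a candidate here, because the second has no audio marker starting within the allowed gap.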
[Diagram labels: ToC levels: Scenes; Groups; Shots; Key frames. Index: Visual; Semantic; Audio; Camera motion. Highlights levels: Highlight Groups; Highlight Candidates; Audio-Visual Markers; Play/Break. Applications: Retrieval; Browsing; Highlights-based Summarization; ToC-based Summarization. Structures: ToC; Index; Highlights.]
[Diagram: audio-visual marker detection. Inputs: Audio; Video. Audio Markers Detection: Applause; Cheers; Excited Speech. Visual Markers Detection: Baseball Catcher; Soccer Goal Post; Golfer Bending to Hit. Following steps: A-V Marker Negotiation; "Which sport is it?"; Highlight Candidates; Finer Resolution Highlights. Golf: Golf Swings; Golf Putts; Non-highlights. Baseball: Strikes, Balls; Ball-hits; Non-highlights. Soccer: Corner Kicks; Penalty Kicks; Non-highlights.]
[Example images: soccer video markers; baseball video marker; golf video markers.]
[Diagram: Audio Class Recognition (GMM). Training: Training Audio Clips; Feature Extraction (Time Domain, Frequency Domain); MFCC Coefficients; training the GMM classifiers using MFCC features and BIC. Classification: Input Audio; Feature Extraction; Comparing Likelihoods; Audio Class. The input audio is classified into one of 5 audio markers using GMMs: Applause; Cheering; Music; Speech; Excited Speech.]
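The "Comparing Likelihoods" step can be sketched as follows: each audio class has its own GMM, and the input is assigned to the class whose GMM gives the highest total log-likelihood over the feature frames. This is a minimal sketch under stated assumptions, not the deck's implementation: the GMMs are assumed to have diagonal covariances, the two toy models and their 2-D parameters are hypothetical stand-ins for models trained on MFCC vectors, and the function names are illustrative.

```python
import numpy as np


def gmm_log_likelihood(frames, weights, means, variances):
    """Total log-likelihood of feature frames under a diagonal-covariance GMM.

    frames: (T, D) feature vectors; weights: (K,); means, variances: (K, D).
    """
    diff = frames[:, None, :] - means[None, :, :]                        # (T, K, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)    # (K,)
    log_comp = log_norm - 0.5 * np.sum(diff ** 2 / variances, axis=2)    # (T, K)
    log_weighted = np.log(weights) + log_comp
    # Numerically stable log-sum-exp over the K mixture components.
    peak = log_weighted.max(axis=1, keepdims=True)
    per_frame = peak[:, 0] + np.log(np.exp(log_weighted - peak).sum(axis=1))
    return float(per_frame.sum())


def classify(frames, models):
    """Return the label whose GMM scores the frames highest, plus all scores."""
    scores = {
        label: gmm_log_likelihood(frames, *params)
        for label, params in models.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical single-component models in a 2-D feature space
# (real models would be trained on MFCC vectors for each class).
toy_models = {
    "applause": (np.array([1.0]), np.array([[0.0, 0.0]]), np.array([[1.0, 1.0]])),
    "cheering": (np.array([1.0]), np.array([[5.0, 5.0]]), np.array([[1.0, 1.0]])),
}

toy_frames = np.array([[4.9, 5.1], [5.2, 4.8]])
predicted_label, class_scores = classify(toy_frames, toy_models)
print(predicted_label)
```

Summing per-frame log-likelihoods treats frames as independent, which is the usual simplification when scoring a clip against several GMMs.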
[Diagram: Audio Class Recognition (GMM), Task = Sports Highlights. Training: Training Audio Clips; Feature Extraction (Time Domain, Frequency Domain); MFCC Coefficients; training the GMM classifiers using MFCC features and CV. Classification: Input Audio; Feature Extraction; Comparing Likelihoods; Audio Class. The input audio is classified into one of 2 audio markers using GMMs: Excited Speech; Other (Applause, Cheering, Music, Speech).]
[Diagram: audio processing chain. Blocks: Feature Extraction; Audio Classifier; Importance Level Calculation. Signals: Task; Input Audio; MDCTs; Class Label; Importance Level.]
[Diagram: Generic Audio Classification. Training: Training Audio Clips; Feature Extraction; Training Data for GMMs. Classes: Applause; Cheering; Music; Speech; Excited Speech. Classification: MDCTs; Compare Likelihoods; Class Label.]
[Diagram: Task-Specific Audio Classification, Task = Sports Highlights. Training: Training Audio Clips; Feature Extraction. Classes: Excited Speech; Other (Applause, Cheering, Music, Speech). Classification: MDCTs; Compare Likelihoods; Class Label.]
[Figures: Catcher Model Interpretation: Filter 1; Filter 2. Golf Models Interpretation: Golfer view 1 (Filter 1, Filter 2); Golfer view 2 (Filter 1, Filter 2). Soccer Models Interpretation: Goal post view 1 (Filter 1, Filter 2, Filter 3); Goal post view 2 (Filter 1, Filter 2, Filter 3).]
[Diagram: video clip timeline. Segments: Play; Break. A-V markers: catcher; bat swing; cheer; excited speech. Index.]