1
Semantic Video Classification
Jianping Fan, Department of Computer Science, UNC-Charlotte, USA
2
Outline of presentation
Traditional Approaches
Deep Learning Approach
3
1. Research Motivation
Challenges: Content Representation and Concept Interpretation
Tusk: Spear
Ear: Huge Fan
Body: Wall
Tail: Snake
Trunk: Tree
Leg: Pillar
Most existing CBVR systems describe multimedia like the six blind men describing an elephant!
4
1. Research Motivation
Semantic Medical Concepts
Semantic Gap
Low-Level Multimodal Signals
We focus on the medical education video domain!
5
2. Problem Definition
Quality of features
Video content representation
Semantic video concept modeling
Classifier training and feature selection
Semantic video classification
Knowledge visualization for video access
6
3. Video Content Representation
Video Shots: easy to implement, but features may not be representative
Semantic Video Objects: difficult to implement, but features are representative
How can we get the advantages of both?
7
3. Video Content Representation
Semantic Medical Concepts
Gap 2
Principal Video Shots
Gap 1
Low-Level Multimodal Signals
Semantic Gap (spans Gap 1 and Gap 2)
8
3. Video Content Representation
Definition of Principal Video Shot
Video Shots
Salient Objects of Interest
Principal Video Shot
Physical Video Shot
Concept-Sensitive Salient Objects
9
3. Video Content Representation
Salient Object Definition
Taxonomy of Medical Video Components
Medical Video Stream
  Video Stream: Surgery Objects, Medical Equipment, Video Text, Medical Environment
  Audio Stream: Human Voice, Noise, Music
10
3. Video Content Representation
Salient object definition: semantic-sensitive, feature-invariant objects in the domain of interest
General: Face, Videotext
News: Microphone, Crowd
Medical: Visceral, Blood
11
3. Video Content Representation
Automatic Salient Object Detection
12
3. Video Content Representation
Automatic Salient Object Detection Support Vector Machine is used for binary region classification!
13
3. Video Content Representation
Automatic Salient Object Detection
Salient Object     | Precision | Recall
Human Face         | 90.3%     | 87.5%
Lecture Slide      | 92.4%     | 89.6%
Blood Regions      | 90.2%     | 86.7%
Background Music   | 94.6%     | 81.2%
Human Speech       | 89.8%     | 82.6%
Human Skin         | 96.3%     | 95.4%
Lecture Sketch     | 93.3%     | 88.5%
Blue Cloth         | 96.7%     | 95.8%
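The precision and recall figures for salient object detection can be related to region-level decisions with a short sketch. This is only an illustration: the function name `precision_recall` and the toy labels are assumptions, not part of the original evaluation.

```python
def precision_recall(true_labels, predicted_labels, positive=1):
    """Return (precision, recall) for one salient-object class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy region labels (1 = region belongs to the salient object, 0 = it does not).
# These values are made up for illustration only.
truth = [1, 1, 1, 0, 0, 1, 0, 0]
guess = [1, 1, 0, 0, 1, 1, 0, 0]
print(precision_recall(truth, guess))
```

Precision penalizes regions wrongly accepted as the object, while recall penalizes object regions that were missed.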
14
3. Video Content Representation
15
3. Video Content Representation
16
3. Video Content Representation
17
3. Video Content Representation
18
3. Video Content Representation
19
3. Video Content Representation
Global Visual Features for Video Shot Representation; Local Visual Features for Salient Object Representation
20
3. Video Content Representation
Global Visual Features for Video Shot Representation
Video Sequence: Shot 1, …, Shot i, …, Shot n
Color: HSV color histogram, dominant color, …
Texture: edge histogram, wavelet coefficients, Tamura features, …
Motion: directional motion histogram, camera motion, …
Other features
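As one concrete instance of a global color feature, an HSV color histogram could be computed roughly as follows. This is a minimal sketch using only the Python standard library; the bin counts (8 x 2 x 2) and the function name `hsv_histogram` are assumptions chosen for illustration, not values taken from the slides.

```python
import colorsys


def hsv_histogram(pixels, h_bins=8, s_bins=2, v_bins=2):
    """Normalized HSV histogram of an iterable of (r, g, b) pixels in 0..255."""
    hist = [0.0] * (h_bins * s_bins * v_bins)
    count = 0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # Clamp so that a value of exactly 1.0 falls into the last bin.
        hi = min(int(h * h_bins), h_bins - 1)
        si = min(int(s * s_bins), s_bins - 1)
        vi = min(int(v * v_bins), v_bins - 1)
        hist[(hi * s_bins + si) * v_bins + vi] += 1.0
        count += 1
    return [x / count for x in hist] if count else hist


# A toy "key frame" with three red pixels and one blue pixel.
frame = [(255, 0, 0)] * 3 + [(0, 0, 255)]
hist = hsv_histogram(frame)
```

The same routine could be applied to the pixels inside a salient object's region to obtain a local color feature.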
21
3. Video Content Representation
Local Visual Features for Salient Object Representation
Video Sequence: Object 1, …, Object i, …, Object n
Color: HSV color histogram, dominant color, …
Texture: edge histogram, Tamura features, …
Shape: rectangular box, moments, …
Motion: trajectory, motion histogram, …
Confidence Map
22
4. Semantic Video Concept Modeling
Semantic Video Concept: Concept i, …, Concept Nc
Gap 2
PVS Class 1, PVS Class j, PVS Class S
Gap 1
Low-Level Auditory Signal, Low-Level Vision Signal, Low-Level Image-Textual Signal
Semantic Gap (spans Gap 1 and Gap 2)
23
4. Semantic Video Concept Modeling
24
4. Semantic Video Concept Modeling
25
4. Semantic Video Concept Modeling
How can we formulate this composition relationship (concept modeling)?
26
4. Semantic Video Concept Modeling
How to model the data distributions for concepts?
27
4. Semantic Video Concept Modeling
Atomic Video Concept Modeling
Atomic Video Concept
Type 1 PVS, Type i PVS, Type n PVS
28
4. Semantic Video Concept Modeling
Hierarchical Video Concept Modeling
Semantic Video Concept at the second level:
Semantic Video Concept at the nth level:
29
5. Video Concept Learning for Classification
Model Selection & Parameter Estimation We need techniques for estimating: Model Structure Model Parameters
30
5. Video Concept Learning for Classification
Adaptive EM Algorithm
(a) Start from a large K
(b) Perform automatic merging, splitting, and elimination
(c) Incorporate negative samples for concept learning
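The idea of starting from a large K and letting the mixture shrink can be sketched for a one-dimensional Gaussian mixture. This is only a simplified illustration: the elimination threshold `min_weight`, the merging test on mean distance `merge_dist`, and the variance floor are assumptions, and the splitting step and the use of negative samples are left out entirely.

```python
import math


def gauss(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_step(data, comps):
    """One EM iteration; comps is a list of (weight, mean, variance)."""
    resp = []
    for x in data:
        p = [w * gauss(x, mu, var) for w, mu, var in comps]
        total = sum(p) or 1e-300
        resp.append([pi / total for pi in p])
    new = []
    for k in range(len(comps)):
        nk = sum(r[k] for r in resp)
        if nk < 1e-12:  # component explains nothing; keep it with zero weight
            new.append((0.0, comps[k][1], comps[k][2]))
            continue
        mu = sum(r[k] * x for r, x in zip(resp, data)) / nk
        var = sum(r[k] * (x - mu) ** 2 for r, x in zip(resp, data)) / nk
        new.append((nk / len(data), mu, max(var, 1e-3)))
    return new


def prune_and_merge(comps, min_weight=0.05, merge_dist=1.0):
    """Eliminate light components, then merge neighbours with close means."""
    kept = sorted((c for c in comps if c[0] >= min_weight), key=lambda c: c[1])
    merged = []
    for w, mu, var in kept:
        if merged and abs(mu - merged[-1][1]) < merge_dist:
            w0, mu0, var0 = merged[-1]
            wt = w0 + w
            m = (w0 * mu0 + w * mu) / wt
            # Moment-matched variance of the two merged components.
            v = (w0 * (var0 + (mu0 - m) ** 2) + w * (var + (mu - m) ** 2)) / wt
            merged[-1] = (wt, m, v)
        else:
            merged.append((w, mu, var))
    total = sum(c[0] for c in merged)
    return [(w / total, mu, var) for w, mu, var in merged]


def adaptive_em(data, k_start=6, rounds=5, iters=20):
    lo, hi = min(data), max(data)
    comps = [(1.0 / k_start, lo + (hi - lo) * (i + 0.5) / k_start, 1.0)
             for i in range(k_start)]
    for _ in range(rounds):
        for _ in range(iters):
            comps = em_step(data, comps)
        comps = prune_and_merge(comps)
    return comps


# Toy data with two well-separated clusters (made up for illustration).
data = [0.0, 0.2, -0.1, 0.1, -0.2, 0.05, 10.0, 10.2, 9.9, 10.1, 9.8, 10.05]
comps = adaptive_em(data)
```

Here the model structure (number of components) and the parameters are adjusted in the same loop, which is the property the adaptive scheme is after.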
31
5. Video Concept Learning for Classification
Adaptive EM Algorithm
Overlapped mixture components from the same concept model: merging
32
5. Video Concept Learning for Classification
Adaptive EM Algorithm
Elongated mixture components from the same concept model: splitting
Figure labels: Sample Distributions, Mixture Component
33
5. Video Concept Learning for Classification
Adaptive EM Algorithm
Tailed mixture components from different concept models: splitting & elimination
Figure labels: Concept 1, Concept 2
34
5. Video Concept Learning for Classification
Merging: Splitting: Elimination:
35
5. Video Concept Learning for Classification
Normalization Acceptance Probability
36
5. Video Concept Learning for Classification
Advantages
(a) Model selection and parameter estimation are performed jointly in a single algorithm!
(b) Negative samples are incorporated for discriminative learning of finite mixture models with higher classification accuracy!
(c) A new approach to hierarchical classifier training that performs automatic merging, splitting, and elimination of components!
37
6. Learning with Unlabeled Samples
Unlabeled Samples for Concept Learning
Certain Unlabeled Samples
Unlabeled Samples Used for Concept Learning
Unlabeled Samples
Informative Unlabeled Samples
Uncertain Unlabeled Samples
Outliers & New Concepts
38
6. Learning with Unlabeled Samples
Unlabeled samples partitioning
Certain unlabeled samples
Outliers & New Concepts
Informative unlabeled samples
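One way such a partition could be computed is from the concept models' likelihoods for each unlabeled sample. The sketch below is an assumption-laden illustration: the thresholds `certain_at` and `outlier_below`, the use of the largest posterior as the confidence score, and the function name `partition_unlabeled` are all hypothetical and are not taken from the slides.

```python
def partition_unlabeled(likelihoods, priors, certain_at=0.9, outlier_below=1e-3):
    """likelihoods maps sample id -> [p(x | concept_j)]; returns three id lists."""
    certain, informative, outliers = [], [], []
    for sample_id, like in likelihoods.items():
        joint = [p * l for p, l in zip(priors, like)]
        evidence = sum(joint)
        if evidence < outlier_below:
            # Poorly explained by every concept model: outlier or new concept.
            outliers.append(sample_id)
            continue
        confidence = max(joint) / evidence  # largest posterior as the score
        if confidence >= certain_at:
            certain.append(sample_id)
        else:
            informative.append(sample_id)
    return certain, informative, outliers


# Toy likelihoods under two concept models (made up for illustration).
likelihoods = {"a": [0.8, 0.01], "b": [0.3, 0.25], "c": [1e-5, 2e-5]}
groups = partition_unlabeled(likelihoods, priors=[0.5, 0.5])
```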
39
6. Learning with Unlabeled Samples
Unlabeled Samples for Concept Learning Confidence Score:
40
6. Learning with Unlabeled Samples
Unlabeled Samples for Concept Learning Penalty Term: Certain Unlabeled Samples Uncertain Unlabeled Samples 1
41
6. Learning with Unlabeled Samples
Birth of New Mixture Components
A birth operation is added to capture unknown image context classes that are represented by informative unlabeled samples.
42
7. Hierarchical Video Concept Learning
Second-Level Semantic Video Concept (Parent Node)
Atomic Video Concept 1, Atomic Video Concept i, Atomic Video Concept (Children Nodes)
43
7. Hierarchical Video Concept Learning
Model Integration by combining mixture components from all its children nodes where the initial model structure is:
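Since the formula for the initial model structure did not survive in the transcript, the following is only a guess at the general idea: the parent concept's initial mixture pools every component of its children, with each child's component weights rescaled by that child's share. The function name `integrate_children` and the weighting scheme are assumptions.

```python
def integrate_children(children):
    """children: list of (child_weight, [(w, mean, var), ...]).

    Returns the parent's initial mixture as a flat list of (w, mean, var).
    """
    total = sum(child_weight for child_weight, _ in children)
    parent = []
    for child_weight, comps in children:
        for w, mean, var in comps:
            parent.append((child_weight / total * w, mean, var))
    return parent


# Two toy child concepts: one with two components, one with a single component.
children = [
    (2.0, [(0.5, 0.0, 1.0), (0.5, 1.0, 1.0)]),
    (1.0, [(1.0, 5.0, 2.0)]),
]
parent = integrate_children(children)
```

Under this assumption the parent starts with as many components as all of its children combined, and merging, splitting, and elimination would then reduce or refine that structure.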
44
7. Hierarchical Video Concept Learning
Elimination Merging
45
7. Hierarchical Video Concept Learning
Splitting
Conclusion: Using class distributions for high-level concept learning can achieve better performance than using their prediction outputs!
46
8. Semantic Video Classification
Bayesian Rule for Video Classification:
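A minimal sketch of Bayesian classification with mixture-based concept models is given below, assuming one-dimensional Gaussian mixtures and known priors. The concept names, parameters, and the helper names `mixture_density` and `classify` are illustrative assumptions; the rule applied is the standard one, posterior proportional to likelihood times prior.

```python
import math


def mixture_density(x, comps):
    """Density of a 1-D Gaussian mixture given as [(weight, mean, variance)]."""
    return sum(
        w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
        for w, m, v in comps
    )


def classify(x, models, priors):
    """Return (best_concept, posteriors) with P(C | x) proportional to P(x | C) P(C)."""
    joint = {c: priors[c] * mixture_density(x, comps) for c, comps in models.items()}
    evidence = sum(joint.values())
    post = {c: j / evidence for c, j in joint.items()}
    return max(post, key=post.get), post


# Toy concept models (made up for illustration).
models = {
    "lecture": [(1.0, 0.0, 1.0)],
    "surgery": [(0.5, 4.0, 1.0), (0.5, 6.0, 1.0)],
}
priors = {"lecture": 0.5, "surgery": 0.5}
label_a, post_a = classify(0.5, models, priors)
label_b, post_b = classify(5.0, models, priors)
```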
47
8. Semantic Video Classification
48
8. Semantic Video Classification
49
8. Semantic Video Classification
50
9. Benchmark and Algorithm Evaluation
Precision Missing-recall
51
9. Algorithm Evaluation Convergence of Adaptive EM Algorithm
52
9. Algorithm Evaluation Convergence of Adaptive EM Algorithm
53
9. Algorithm Evaluation Effectiveness of Adaptive EM Algorithm
54
9. Algorithm Evaluation Hierarchical Concept Learning
(a) More concept levels may reduce hypothesis variance!
(b) More concept levels may increase the risk of errors propagating among levels!
55
9. Algorithm Evaluation Training with Unlabeled Samples
56
9. Benchmark
Classification accuracy vs. representation
Concept                  | Salient Objects | Video Shots
Lecture                  | 81.6%, 82.1%    | 64.9%, 68.5%
Trauma Surgery           | 82.3%, 81.9%    | 64.3%, 65.7%
Diagnosis                | 81.7%, 89.3%    | 61.1%, 59.7%
Gastrointestinal Surgery | 81.4%, 85.2%    | 71.5%, 69.3%
Dialog                   | 85.1%, 89.3%    | 67.5%, 68.2%
Burn Surgery             | 79.2%, 78.6%    | 64.8%, 69.1%
57
9. Benchmark
58
10. Video Skimming
59
11. Video Privacy Protection
60
11. Video Privacy Protection
61
12. Semantic Video Concept Modeling via Support Vector Machines
62
12. Semantic Video Concept Modeling via Support Vector Machines
Figure labels: positive samples, negative samples, positive support vectors, margin, negative support vectors
63
12. Semantic Video Concept Modeling via Support Vector Machines
64
13. SVM Classifier Training
A Support Vector Machine may be one solution, but:
-- Training cost is …, where M is the sample size
-- Curse of dimensionality: large-scale training samples are needed for reliable SVM classifier training in a high-dimensional feature space!
-- Kernel functions should be able to characterize the underlying geometric properties of image data in a heterogeneous feature space!
65
13. SVM Classifier Training
If an RBF kernel is used, we need techniques for estimating the SVM parameters, including C
Coarse-to-Fine Search
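A coarse-to-fine parameter search can be sketched as a grid in log2 space that is repeatedly narrowed around the best point found so far. Everything concrete here is an assumption: the second parameter is taken to be the RBF kernel width `gamma`, the initial ranges are arbitrary, and `toy_score` is a stand-in for what would in practice be cross-validated classifier accuracy.

```python
import math


def coarse_to_fine(score, log_c=(-5.0, 15.0), log_g=(-15.0, 3.0), levels=3, points=5):
    """Maximize score(C, gamma) by shrinking a log2 grid around the best point."""
    best = None
    for _ in range(levels):
        c_step = (log_c[1] - log_c[0]) / (points - 1)
        g_step = (log_g[1] - log_g[0]) / (points - 1)
        for i in range(points):
            for j in range(points):
                lc = log_c[0] + i * c_step
                lg = log_g[0] + j * g_step
                s = score(2.0 ** lc, 2.0 ** lg)
                if best is None or s > best[0]:
                    best = (s, lc, lg)
        # Zoom in: the next grid spans one current step on each side of the best point.
        log_c = (best[1] - c_step, best[1] + c_step)
        log_g = (best[2] - g_step, best[2] + g_step)
    return 2.0 ** best[1], 2.0 ** best[2], best[0]


def toy_score(c, gamma):
    """Hypothetical objective peaking at C = 8, gamma = 1/16."""
    return -((math.log2(c) - 3.0) ** 2 + (math.log2(gamma) + 4.0) ** 2)


best_c, best_gamma, best_score = coarse_to_fine(toy_score)
```

The coarse pass keeps the number of expensive classifier trainings small, and the finer passes spend the remaining budget only near promising parameter values.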
66
13. SVM Classifier Training
Incremental SVM Classifier Training & Convergence
How can informative unlabeled samples be incorporated into classifier training?
Batch-Based Approach
Incremental Approach
67
13. SVM Classifier Training
Unlabeled Samples for Classifier Training
Weak SVM Classifier
Certain Unlabeled Samples
Incremental SVM Classifier Training
Unlabeled Samples
Informative Unlabeled Samples
Uncertain Unlabeled Samples
New Concepts & Outliers
68
14. Feature Partition and Selection
Feature heterogeneity: different feature types
An RBF kernel cannot effectively characterize the underlying geometric properties of image data in such a heterogeneous feature space
Most existing feature selection techniques (filtering, wrapping, boosting) only work on a homogeneous feature space!
Feature correlation is ignored!
69
14. Feature Partition and Selection
High-Dimensional Visual Features
Prior knowledge, PCA
Feature Subset 1, Feature Subset i, Feature Subset n
Semantic Video Concept
Feature selection by selecting the most relevant classifiers!
Boost on both training samples and features!
70
14. Feature Partition and Selection
Advantages
-- The dimensionality of each feature subset is relatively low, so fewer training samples are needed, which speeds up SVM classifier training
-- Feature selection in a heterogeneous space via classifier selection
-- The geometric properties of image data in each feature subset can be characterized effectively by using RBF kernels
-- Hierarchical feature selection
71
15. Benchmark More is Less: Optimal Feature Subset
72
16. Hierarchical Video Database Summarization & Visualization
73
16. Hierarchical Video Database Summarization & Visualization
Concept-Oriented Video Summarization
74
Deep Learning for Video Classification
75
Deep Learning for Video Classification
76
Deep Learning for Video Classification
77
Deep Learning for Video Classification
78
Deep Learning for Video Classification
79
Deep Learning for Video Classification
80
Deep Learning for Video Classification
81
Deep Learning for Video Classification
82
Deep Learning for Video Classification
83
Deep Learning for Video Classification
84
Deep Learning for Video Classification
85
Multi-Modal Deep Learning for Video Classification
86
Multi-Modal Deep Learning for Video Classification
87
Multi-Modal Deep Learning for Video Classification
88
Multi-Modal Deep Learning for Video Classification
89
Multi-Modal Deep Learning for Video Classification