Presented by Zeehasham Rasheed

Concept-Oriented Indexing of Video Databases: Towards Semantic Sensitive Retrieval and Browsing
J. Fan, H. Luo, K. Ahmed

Outline
Introduction; Proposed Framework; Semantic Video Classification; Semantic Video Database; Performance Analysis; Conclusion and Future Work

Introduction
Digital video now plays an important role in medical education. Several content-based video retrieval (CBVR) systems have been proposed. The challenging problems are the semantic gap, semantic video classification, and video database indexing.

Proposed Framework
Semantic video content framework using principal video shots; semantic video concept model; semantic video classifier training framework; concept-oriented video database.

Challenging Issues
The performance of semantic video classifiers depends largely on the quality of the features, and automatic extraction of semantic video objects is, in general, very hard.

Existing CBVR systems are unable to support video access at the semantic level because of the semantic gap. To bridge this gap, the rule-based approach uses domain knowledge to define rules for extracting semantic video concepts.

Semantic video classification techniques can be divided into rule-based and statistical approaches. The rule-based approach makes it easy to insert, delete, and modify existing rules. The statistical approach uses machine learning techniques.

Semantic Sensitive Video Content Analysis
It is necessary to understand which video patterns are suitable for interpreting particular domains in medical education. A good semantic-sensitive video content framework should enhance the quality of the features.

The authors developed a novel framework that uses principal video shots for video content representation and feature extraction. Based on the knowledge of medical consultants, a set of multimodal salient objects and semantic medical concepts has been defined.

Multimodal salient objects include visual, auditory, and image-textual salient objects.
Visual salient objects include human faces, blood-red regions, and skin regions. Auditory salient objects include single human speech and multiple human speech.

Semantic medical concepts include lecture presentation, gastrointestinal surgery, dialog, and traumatic surgery. All of these are needed to select the principal video shots.

Semantic Video Concept and Database Modeling
Which database model can support a concept-oriented video database? The authors proposed a novel framework that organizes a large-scale video collection into a domain-dependent concept hierarchy.

The deepest level of the concept hierarchy consists of the domain-dependent elementary semantic medical concepts. For example, five different types of principal video shots (among them human faces, slides, text lines, and human speech) are related to the elementary semantic medical concept “Lecture Presentation”.

Semantic Video Classification
The major step is to classify the principal video shots into the most relevant elementary semantic medical concept. The one-against-all rule is used to label the training samples, where X denotes the perceptual features and C the semantic label of a sample.
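As a rough illustration of one-against-all labeling, the sketch below relabels (X, C) pairs as positive or negative for a single target concept. The feature values and the function name `one_against_all` are made up for the example and are not taken from the paper.

```python
def one_against_all(samples, target_concept):
    """Relabel (features, concept) pairs as +1 for the target concept, -1 otherwise."""
    return [(x, 1 if concept == target_concept else -1) for x, concept in samples]


# Hypothetical training samples: (perceptual feature vector X, semantic label C).
samples = [
    ([0.9, 0.1], "lecture presentation"),
    ([0.2, 0.8], "traumatic surgery"),
    ([0.8, 0.3], "lecture presentation"),
    ([0.1, 0.7], "dialog"),
]

binary = one_against_all(samples, "lecture presentation")
labels = [y for _, y in binary]
```

One such binary training set would be built per elementary concept, so each classifier only has to separate its own concept from everything else.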

The posterior probability that a principal video shot with features X is assigned to the elementary semantic medical concept C is determined within a Bayesian framework.
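In its generic form, Bayes' rule gives this posterior as follows. The index notation C_j for the j-th concept is an assumption made here for readability; the specific class-conditional density used by the authors is not stated in this transcript.

```latex
P(C_j \mid X) = \frac{P(X \mid C_j)\, P(C_j)}{\sum_{k} P(X \mid C_k)\, P(C_k)}
```

Here P(C_j) is the prior of concept C_j and P(X | C_j) is the class-conditional density of the features.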

Finally, to achieve a better likelihood and hence higher classification accuracy, the authors use the maximum a posteriori (MAP) probability as the classifier.

The MAP estimation can be carried out with the expectation-maximization (EM) algorithm.
The authors call their version the adaptive EM algorithm.
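For orientation, here is a minimal sketch of the standard (maximum-likelihood) EM loop for a one-dimensional Gaussian mixture. It is not the authors' adaptive EM algorithm, whose details are not given here; the data, initial values, and function names are assumptions chosen for illustration.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, means, variances, weights, iterations=50):
    """Standard EM for a 1-D Gaussian mixture; updates the parameter lists in place."""
    k = len(means)
    for _ in range(iterations):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            joint = [weights[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(joint)
            resp.append([p / total for p in joint])
        # M-step: re-estimate weights, means, and variances from the responsibilities.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor to avoid a collapsed component
    return means, variances, weights


# Hypothetical 1-D feature values drawn from two well-separated groups.
data = [0.9, 1.0, 1.1, 1.2, 0.8, 4.9, 5.0, 5.1, 5.2, 4.8]
means, variances, weights = em_gmm_1d(data, [0.0, 6.0], [1.0, 1.0], [0.5, 0.5])
```

A MAP variant would add a prior term on the parameters in the M-step; the sketch leaves that out to keep the two steps easy to follow.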

Testing
The principal video shots and their features are extracted from the test video clips. Linear discriminant analysis is used to obtain more representative features.

Given an unlabeled principal video shot and its feature values, the shot is assigned to the elementary semantic medical concept that yields the maximum posterior probability.
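The assignment step can be sketched as an argmax over posteriors. The example below assumes, purely for illustration, a single scalar feature and one Gaussian per concept; the priors, means, variances, and the name `map_classify` are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def map_classify(x, models):
    """Return the concept with the largest posterior.

    models maps concept name -> (prior, mean, variance).
    """
    scores = {c: prior * gaussian_pdf(x, mean, var) for c, (prior, mean, var) in models.items()}
    evidence = sum(scores.values())
    posteriors = {c: s / evidence for c, s in scores.items()}
    best = max(posteriors, key=posteriors.get)
    return best, posteriors


# Hypothetical per-concept models: (prior, mean, variance) of one scalar feature.
models = {
    "lecture presentation": (0.6, 1.0, 0.25),
    "traumatic surgery": (0.4, 4.0, 0.25),
}

best, posteriors = map_classify(1.3, models)
```

Dividing by the evidence does not change which concept wins, but it makes the scores readable as probabilities.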

Concept Oriented Video Database Organization
The following technique supports statistical video database indexing: each database node (semantic medical concept node) is described by its semantic label (keyword), a visual summary, and the statistical properties of the class distribution.
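One way to picture such a node is as a small record in a tree. The sketch below is an assumption-laden illustration: it represents the statistical properties as a per-feature mean and variance, uses shot identifiers as the visual summary, and invents the root label "medical education"; none of these choices are confirmed by the source.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ConceptNode:
    label: str                                                 # semantic label (keyword)
    visual_summary: List[str] = field(default_factory=list)    # assumed: ids of representative shots
    mean: List[float] = field(default_factory=list)            # assumed: per-feature mean of the class
    variance: List[float] = field(default_factory=list)        # assumed: per-feature variance of the class
    children: List["ConceptNode"] = field(default_factory=list)

    def find(self, keyword: str) -> Optional["ConceptNode"]:
        """Depth-first lookup of a node by its semantic label."""
        if self.label == keyword:
            return self
        for child in self.children:
            hit = child.find(keyword)
            if hit is not None:
                return hit
        return None


# Hypothetical two-level hierarchy.
root = ConceptNode("medical education", children=[
    ConceptNode("lecture presentation", visual_summary=["shot_001", "shot_017"]),
    ConceptNode("traumatic surgery"),
])

node = root.find("lecture presentation")
```

A keyword query then reduces to a label lookup in the hierarchy, and browsing reduces to showing each node's visual summary.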

Each database node is represented by the following parameters

Hierarchical Video Retrieval
This is an intuitive approach for naive users to specify queries. Query concept specification via browsing: users can quickly get a good idea of the video content by browsing the visual summaries of the semantic medical concept nodes, and can then pick one or more video clips as their query.

Query concept specification via keywords: keywords are the most useful way for naive users to specify their queries at the semantic level. Query concept specification via pattern combination: users can express a query using general combinations of principal video shots.

Query Concept Evaluation for query-by-example
After the query concepts are interpreted from the selected video clips, the search is performed. The user can then label the retrieved videos as relevant or irrelevant. The following steps are used to improve the search results in the next iteration.

Information sample selection: irrelevant video data samples that were obtained in the previous query and lie within the nearest-neighbor sphere are used to shrink the sampling area. In this way, irrelevant samples are removed from the sampling area of the current query iteration.

Best search direction prediction: relevance feedback improves the query results and reduces the number of query iterations. The best search direction for the next query iteration can be predicted by combining such nearest-neighbor sphere reductions.

Query refinement: only the previous query vector and the positive samples are used to determine the query vector for the next iteration. The update is based on Rocchio's formula.
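A Rocchio-style update restricted to positive feedback can be sketched as below. The weights `alpha` and `beta` are conventional illustrative values, not figures from the paper, and the function name `refine_query` is hypothetical.

```python
def refine_query(query, positives, alpha=1.0, beta=0.75):
    """Rocchio-style update using only the previous query and the positive samples.

    new_query = alpha * query + beta * centroid(positives)
    """
    if not positives:
        return list(query)  # no feedback: keep the previous query unchanged
    dim = len(query)
    centroid = [sum(p[i] for p in positives) / len(positives) for i in range(dim)]
    return [alpha * q + beta * c for q, c in zip(query, centroid)]


# Hypothetical 2-D feature vectors.
query = [1.0, 0.0]
positives = [[0.0, 2.0], [0.0, 4.0]]
new_query = refine_query(query, positives)
```

The full Rocchio formula also subtracts a weighted centroid of the negative samples; dropping that term matches the statement that only the previous query and positive samples are used.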

Performance Analysis
Benchmark metrics:
1- Classification accuracy (misclassification ratio versus classification accuracy ratio)
2- Retrieval accuracy (precision versus recall)
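For the retrieval metric, precision and recall can be computed from the retrieved and relevant sets as in this small sketch. The video identifiers are made up for the example.

```python
def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved, recall = hits / relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query result: 4 clips retrieved, 3 clips actually relevant.
precision, recall = precision_recall(["v1", "v2", "v3", "v4"], ["v2", "v4", "v7"])
```

Two of the four retrieved clips are relevant, and two of the three relevant clips were found, which is the trade-off a precision-versus-recall curve traces as the number of returned clips varies.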

Experiments
Our experiments are conducted on two image/video databases: skin database (i.e., marked face database) from Purdue University and medical video database. The skin database consists of 1265 face images. 150 face images are selected as the labeled samples for classifier training. The medical video database includes more than principal video shots from 45 h of MPEG medical videos, where 1500 principal video shots are selected as the training samples and labeled by our medical consultant.

Conclusion and Future Work
The adaptive EM algorithm has improved the classification accuracy. A novel semantic-sensitive video content framework based on principal video shots has been proposed. Future work is to obtain more accurate estimates using unlabeled data.

Questions ???
