Context-based Visual Concept Detection Using Domain Adaptive Semantic Diffusion Yu-Gang Jiang, Jun Wang, Shih-Fu Chang, Chong-Wah Ngo VIREO Research Group (VIREO), City University of Hong Kong Digital Video and Multimedia Lab (DVMM), Columbia University 1 NIST TRECVID Workshop, Nov. 2009
Overview: framework Local Feature Global Feature SVM Classifiers VIREO-374: 374 LSCOM concept detectors VIREO-374: 374 LSCOM concept detectors Domain Adaptive Semantic Diffusion
3 Overview: performance DASD Local + global features Local feature alone Local feature is still the most powerful component (MAP=0.150) Global features help a little bit (MAP=0.156) DASD further contributes incrementally to the final detection
Overview: framework Local Feature Global Feature SVM Classifiers VIREO-374: 374 LSCOM concept detectors VIREO-374: 374 LSCOM concept detectors Domain Adaptive Semantic Diffusion
Local feature representation 5 Chang et al TRECVID 2008; Jiang, Yang, Ngo & Hauptmann, IEEE TMM, to appear
Context-based concept detection Local Feature Global Feature SVM Classifiers VIREO-374: 374 LSCOM concept detectors VIREO-374: 374 LSCOM concept detectors DASD: Domain Adaptive Semantic Diffusion
DASD - motivation Most existing methods aim at the assignment of concept labels individually –but concepts do not occur in isolation! military personnel smoke explosion_fire roadoutdoor vehicle building 7
8 Documentary Videos Broadcast News Videos Most existing methods aim at the assignment of concept labels individually –but concepts do not occur in isolation! Domain change between training and testing data was not considered DASD - motivation
9 Jiang, Wang, Chang & Ngo, ICCV 2009 DASD - overview road vehicle water sky
DASD - overview Domain adaptive semantic diffusion (DASD) –Semantic graph Nodes are concepts Edges represent concept correlation – Graph diffusion Smooth concept detection scores w.r.t the concept correlation vehicle road Watersky … … … …
DASD - formulation Energy function 11 Concept affinity Detection score of concept c i on test samples
DASD - formulation (cont.) Gradually smooth the function makes the detection scores in accordance with the concept relationships Detection score smoothing process 12
DASD - formulation (cont.) Graph adaptation Graph adaptation process 13
Iteration: 8Iteration: 12Iteration: 0Iteration: 4Iteration: 16Iteration: 20 Graph adaptation - example Broadcast news video domain Documentary video domain 14
Experiments on TV Baseline detectors –VIREO-374 Graph construction: –Ground-truth labels on TRECVID 2005 SPORTSWEATHER OFFICEBUILDING DESERT MOUNTAIN WALKING PEOPLE- MARCHING EXPLOSION-FIRE MAP TRUCK CORP. LEADER SPORTSWEATHEROFFICE DESERT MOUNTAIN WATER POLICE MILITARY ANIMAL TWO PEOPLE NIGHT TIME TELEPHONE STREET CLASSROOMBUS TRECVID 05/06 (Broadcast News Videos) TRECVID 07 (Documentary Videos) 15
Results on TV Performance gain on TRECVID Datasets TRECVID # of evaluated concepts3920 Baseline (MAP) SD11.8%15.6%12.1% DASD11.9%17.5%16.2% 16 SD: semantic diffusion (without graph adaptation) Consistent improvement over all 3 data sets DASD: domain adaptive semantic diffusion Graph adaptation further improves the performance
Results on TV (cont.) TRECVID 2006 Test Data 17 TRECVIDJiang et alAytar et alWeng et alDASD %4.0%N/A11.9% 2006N/A 16.7%17.5% Comparison with the state-of-the-arts
18 Results on TRECVID 09 30% 10% 5%
Results on TRECVID 09 (cont.) 19 Quality of contextual detectors (VIREO-374) Context VIREO-374 TV09 detectors TV06 detectors TV07 detectors 18% 16% 5% DASD performance gain
DASD - computational time Complexity is O(mn) –m: # concepts; n: # video shots Only 2 milliseconds per shot/keyframe! 20 TRECVID 05TRECVID 06TRECVID 07 SD59s84s12s DASD89s165s28s
Summary 21 A well-designed approach using local features achieves good results for concept detection. Context information is helpful ! –Domain adaptive semantic diffusion effective for enhancing concept detection accuracy can alleviate the effect of data domain changes highly efficient ! – Future directions include: detector reliability: diffusion over directed graph web data annotation: utilize contextual information to improve the quality of tags –Source code available for download from DVMM lab research page
22