Published by Frederica Parker. Modified over 9 years ago.
1
Detecting Cartoons: A Case Study in Automatic Video-Genre Classification. Tzvetanka Ianeva, Arjen de Vries, Hein Röhrig
2
Outline. Goal: remove cartoons from search results in the TREC-2002 video track. Our approach: extract image descriptors and apply SVM machine learning. Related work. Novel descriptors from granulometry. SVM learning. Experimental results.
3
TREC-2002 video track. TREC: workshops for large-scale evaluation of information retrieval technology. CWI participation: the Probabilistic Multimedia Retrieval Model does not sufficiently distinguish “cartoons”.
4
Example of an undesirable ‘cartoon’ result. [Figure: query image and the best matches returned]
5
Related work. M. Roach et al., Motion based classification of cartoons (2001). B. T. Truong et al., Automatic genre identification for content-based video categorization (2000). J. R. Smith et al., Searching for images and videos on the world wide web. N. C. Rowe et al., Automatic caption localization for photographs on www pages. V. Athitsos et al. [ASF], Distinguishing photographs and graphics on the www.
6
Cartoons. What is a cartoon? Cartoons do not contain any photographic material; photos come from a photographic camera. It appears easy to find cartoons: few, simple, strong colors; patches of uniform color; strong black edges; text.
7
Quiz: Cartoon or Photo?
9
Examples not so Typical
11
Photos like cartoons
13
“Cartoons” like photos
15
Artificial photos
17
Small cues
18
Overlapping Frames
19
Mixed
20
Shadow & Sparkle
21
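Image Descriptors: greater correlation, normalized. Example: average saturation, threshold brightness. [Figure: each input image (240x352x3) is mapped to a vector of image descriptors with components 1, 2, …, 148; the example vectors start 0.6231, 0.9266, … and 0.2880, 0.4125, …]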
22
Overview of all our image descriptors
Image descriptor              Dimension
average saturation            1
threshold brightness          1
color histogram               45
edge-direction histogram      40
compression ratio             1
multi-scale pattern spectrum  60
23
Brightness and Saturation. HSV color model. Cartoons are brighter => use the % of pixels with Value > 0.4. Cartoons have strong colors => use the average Saturation.
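The two scalar descriptors on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `brightness_saturation`, the pixel format (RGB tuples scaled to [0, 1]), and returning a fraction rather than a percentage are all assumptions.

```python
import colorsys


def brightness_saturation(pixels, value_threshold=0.4):
    """Return (average saturation, fraction of pixels with Value > threshold).

    pixels: iterable of (r, g, b) tuples with components in [0, 1]
    (an assumed input format; the slides do not specify one).
    """
    total_saturation = 0.0
    bright_pixels = 0
    count = 0
    for r, g, b in pixels:
        # colorsys returns hue, saturation and value, each in [0, 1]
        _hue, saturation, value = colorsys.rgb_to_hsv(r, g, b)
        total_saturation += saturation
        if value > value_threshold:
            bright_pixels += 1
        count += 1
    if count == 0:
        raise ValueError("image has no pixels")
    return total_saturation / count, bright_pixels / count
```

A flat list of pixels is enough here because neither descriptor depends on pixel position.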
24
Saturation in cartoon and photo images. [Figure: RGB images beside their S (HSV) channels; average saturation values 0.2880 and 0.6231]
25
Brightness in cartoon and photo images. [Figure: RGB images beside their V (HSV) channels; threshold brightness values 0.9266 and 0.4125]
26
Histograms. Image $I : X \times Y \to \mathbb{R}^c$. Filter $F : I \mapsto I'$. Bins $B_k$: a partition of $\mathbb{R}^c$. $h_k = \#\{(x,y) : I'(x,y) \in B_k\}$. E.g. brightness metric: $I$ grayscale, $c = 1$, $B_1 = [0, 0.4]$, $B_2 = (0.4, 1]$, return $h_2$.
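The general histogram scheme above can be sketched as one small function. As a simplifying assumption, the filter is applied per pixel here, whereas the slide defines it on the whole image; the names `histogram_descriptor`, `filt` and `bin_index` are hypothetical.

```python
def histogram_descriptor(image, filt, bin_index, n_bins):
    """Count filtered pixel values per bin.

    image: 2-D list of pixel values.
    filt: per-pixel filter (a simplification of the image filter F).
    bin_index: maps a filtered value to a bin number in range(n_bins).
    """
    counts = [0] * n_bins
    for row in image:
        for pixel in row:
            counts[bin_index(filt(pixel))] += 1
    return counts


# Brightness metric from the slide: grayscale image, two bins split at 0.4.
gray = [[0.1, 0.5], [0.9, 0.4]]
counts = histogram_descriptor(gray, lambda v: v, lambda v: 1 if v > 0.4 else 0, 2)
h2 = counts[1]  # number of pixels in the upper bin
```

Dividing `h2` by the number of pixels would give a normalized value comparable across image sizes.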
27
Color Histogram. More general than brightness and saturation. Again HSV color space. Partition HSV into 3x3x5 = 45 bins. Cartoons have fewer colors => color histogram descriptor.
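A 45-bin HSV histogram might be computed as below. The slide gives only the product 3x3x5, so assigning 3 bins to hue, 3 to saturation and 5 to value is an assumption, as are the uniform bin widths, the normalization to relative frequencies, and the name `hsv_histogram`.

```python
import colorsys


def hsv_histogram(pixels, bins=(3, 3, 5)):
    """Normalized HSV histogram with bins[0] * bins[1] * bins[2] entries.

    pixels: iterable of (r, g, b) tuples with components in [0, 1].
    The assignment of 3, 3 and 5 bins to H, S and V is assumed.
    """
    n_h, n_s, n_v = bins
    hist = [0.0] * (n_h * n_s * n_v)
    count = 0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r, g, b)
        # Clamp so that a component equal to 1.0 falls into the last bin.
        i_h = min(int(h * n_h), n_h - 1)
        i_s = min(int(s * n_s), n_s - 1)
        i_v = min(int(v * n_v), n_v - 1)
        hist[(i_h * n_s + i_s) * n_v + i_v] += 1.0
        count += 1
    if count == 0:
        raise ValueError("image has no pixels")
    return [c / count for c in hist]
```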
28
[Figure: color histograms over the 45 HSV bins]
30
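Edge detection. Cartoons have strong black edges => approximate the total derivative of the intensity $I(x,y)$: $\left(\frac{\partial}{\partial x} I(x,y),\ \frac{\partial}{\partial y} I(x,y)\right)$. Approximate the magnitude $|\nabla I|$ and the angle $\theta$, and take the histogram of $(\theta, |\nabla I|)$: 5 intervals for $|\nabla I|$ in $0 \ldots \sqrt{20}$, 8 intervals for $\theta$ in $0 \ldots 2\pi$.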
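The 8x5 = 40-bin edge histogram could be sketched as follows. The slides do not name the derivative operator; the Sobel operator is assumed here because, for intensities in [0, 1], its largest possible gradient magnitude is sqrt(20), which matches the range on the slide. The function name and the choice to skip border pixels are also assumptions.

```python
import math


def edge_histogram(gray, n_mag=5, n_ang=8, max_mag=math.sqrt(20)):
    """Joint histogram of gradient angle and magnitude (n_ang * n_mag counts).

    gray: 2-D list of intensities in [0, 1]. Border pixels are skipped.
    """
    hist = [0] * (n_ang * n_mag)
    rows, cols = len(gray), len(gray[0])
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            # Sobel approximations of dI/dx and dI/dy (assumed operator).
            gx = (gray[y - 1][x + 1] + 2 * gray[y][x + 1] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y][x - 1] - gray[y + 1][x - 1])
            gy = (gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y - 1][x] - gray[y - 1][x + 1])
            magnitude = math.hypot(gx, gy)
            angle = math.atan2(gy, gx) % (2 * math.pi)  # map to [0, 2*pi)
            # Clamp so that values on the upper boundary land in the last bin.
            i_mag = min(int(magnitude / max_mag * n_mag), n_mag - 1)
            i_ang = min(int(angle / (2 * math.pi) * n_ang), n_ang - 1)
            hist[i_ang * n_mag + i_mag] += 1
    return hist
```

Flat regions have zero gradient and fall into the first angle bin by convention of `atan2`, so an implementation may prefer to exclude them or treat the lowest magnitude interval separately.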
31
Edge angles & edge magnitudes
32
Edge histogram
33
Compressibility. Cartoons: simpler composition. Detect complexity by measuring the compression ratio. Theory: “Kolmogorov complexity”. Our application: use lossless PNG compression; lossy JPEG is not useful. [Figure: example images with compression ratios 0.13548 and 0.23365]
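A rough stand-in for the compression-ratio descriptor is shown below. It uses `zlib` (the DEFLATE compressor that PNG builds on) directly on raw pixel bytes and skips PNG's per-scanline filtering, so its numbers will differ from a real PNG encoder. Defining the ratio as compressed size divided by raw size is an assumption; the slides do not spell out the definition.

```python
import zlib


def compression_ratio(raw_bytes, level=9):
    """Compressed size divided by raw size (assumed definition of the ratio).

    raw_bytes: the uncompressed pixel data as a bytes object.
    """
    if not raw_bytes:
        raise ValueError("no pixel data")
    return len(zlib.compress(raw_bytes, level)) / len(raw_bytes)


# A uniform patch compresses far better than a busy texture.
flat_patch = bytes([200]) * 30000
textured = bytes((i * 7919 + (i * i) % 251) % 256 for i in range(30000))
flat_ratio = compression_ratio(flat_patch)
textured_ratio = compression_ratio(textured)
```

Lower values indicate simpler content, which is the direction the slide expects for cartoons.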
34
Granulometries. Idea: measure the size distribution of objects. How? Openings by a structuring element of growing scale. Normalized size distribution. Derivative = pattern spectrum.
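One common way to write these quantities down is given here as a sketch. The symbols $\gamma_r$ (opening by the structuring element at scale $r$), $\Phi$ (normalized size distribution) and $PS$ (pattern spectrum) are notation assumed for illustration and may differ from what the authors used.

```latex
\Phi(r) = 1 - \frac{\sum_{(x,y)} \gamma_r(I)(x,y)}{\sum_{(x,y)} I(x,y)},
\qquad \Phi(0) = 0,
\qquad
PS(r) = \Phi(r) - \Phi(r-1), \quad r = 1, 2, \ldots
```

$\Phi(r)$ is the fraction of image mass removed by the opening at scale $r$, and $PS(r)$ is its discrete derivative: the mass of structures that survive scale $r-1$ but not scale $r$.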
35
Openings. Opening = erosion, then dilation with the same structuring element (SE).
36
Structuring Elements. Non-flat parabola better(?) than flat disk. Parabola: efficient computation, symmetry.
37
Small-scale pattern spectrum descriptors. SE: disk with radius $r_i = i$, $i = 1, \ldots, 20$.
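A direct, unoptimized sketch of a pattern spectrum with flat disks of radius 1 to 20 follows. The digital disk definition, the border handling (pixels outside the image are ignored) and the normalization by the total image mass are assumptions; the helper names are hypothetical, and a practical implementation would use a much faster morphology routine.

```python
def disk_offsets(radius):
    """Pixel offsets (dy, dx) inside a digital disk of the given radius."""
    return [(dy, dx)
            for dy in range(-radius, radius + 1)
            for dx in range(-radius, radius + 1)
            if dy * dy + dx * dx <= radius * radius]


def _morph(image, offsets, pick):
    """Apply min (erosion) or max (dilation) over the structuring element."""
    rows, cols = len(image), len(image[0])
    out = []
    for y in range(rows):
        out_row = []
        for x in range(cols):
            # Offsets falling outside the image are ignored (assumed border rule).
            values = [image[y + dy][x + dx] for dy, dx in offsets
                      if 0 <= y + dy < rows and 0 <= x + dx < cols]
            out_row.append(pick(values))
        out.append(out_row)
    return out


def opening(image, radius):
    """Erosion followed by dilation with the same flat disk."""
    se = disk_offsets(radius)
    return _morph(_morph(image, se, min), se, max)


def pattern_spectrum(image, radii=range(1, 21)):
    """Fraction of image mass removed between consecutive opening scales."""
    total = sum(map(sum, image))
    if total == 0:
        raise ValueError("image has zero total intensity")
    spectrum = []
    previous = total
    for radius in radii:
        current = sum(map(sum, opening(image, radius)))
        spectrum.append((previous - current) / total)
        previous = current
    return spectrum
```

Because digital disks of increasing radius do not nest perfectly, individual spectrum values may come out slightly negative on some images.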
38
SVM Learning. Simplest case: linear separator. SVM finds the hyperplane with the largest margin. Closest points = support vectors.
39
SVM Learning: nonseparable. Noisy data: no separating hyperplane at all! Solution: penalty C for points inside the margin. C-SVM machines.
40
SVM = quadratic programming SVM task: Equivalent dual problem:
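The formulas on this slide did not survive extraction. For orientation, the standard textbook soft-margin (C-SVM) problem and its dual are given below; this is assumed to be what the slide showed, and the notation ($w$, $b$, $\xi_i$, $\alpha_i$, labels $y_i \in \{-1, +1\}$) is the conventional one rather than necessarily the authors'.

```latex
\text{SVM task:}\quad
\min_{w,\,b,\,\xi}\ \tfrac{1}{2}\,\|w\|^2 + C \sum_{i=1}^{n} \xi_i
\quad\text{s.t.}\quad
y_i\,(w \cdot x_i + b) \ge 1 - \xi_i,\quad \xi_i \ge 0

\text{Dual problem:}\quad
\max_{\alpha}\ \sum_{i=1}^{n} \alpha_i
 - \tfrac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j\, y_i y_j\, (x_i \cdot x_j)
\quad\text{s.t.}\quad
0 \le \alpha_i \le C,\quad \sum_{i=1}^{n} \alpha_i y_i = 0
```

Both are quadratic programs; the dual depends on the training points only through the inner products $x_i \cdot x_j$.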
41
SVM with kernels SVM task: Equivalent dual problem:
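The formulas here were also lost. In the standard kernelized formulation (assumed, with conventional notation), the data are mapped by a feature map $\phi$ and every inner product in the dual is replaced by a kernel $K$:

```latex
K(x_i, x_j) = \phi(x_i) \cdot \phi(x_j)

\max_{\alpha}\ \sum_{i=1}^{n} \alpha_i
 - \tfrac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j\, y_i y_j\, K(x_i, x_j)
\quad\text{s.t.}\quad
0 \le \alpha_i \le C,\quad \sum_{i=1}^{n} \alpha_i y_i = 0
```

The feature map never has to be evaluated explicitly; only $K$ is needed.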
42
SVM kernels RBF kernels Polynomial kernels
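The two kernel families named on this slide can be sketched as follows. The exact parametrization used in the experiments is not recoverable from the slides: the RBF form exp(-||x - y||^2 / (2 sigma^2)) and the polynomial form (x . y + coef0)^degree are common conventions assumed here, and the function names are hypothetical.

```python
import math


def rbf_kernel(x, y, sigma2):
    """RBF kernel exp(-||x - y||^2 / (2 * sigma2)); parametrization assumed."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma2))


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel (x . y + coef0) ** degree; parametrization assumed."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree
```

With the RBF form above, a smaller `sigma2` makes the kernel more local, so the classifier can fit finer structure in the descriptor space.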
43
SVM with kernels: decision function SVM task: Equivalent dual problem: Decision function:
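The decision function itself was lost in extraction. In the standard formulation (assumed here) it is f(x) = sum_i alpha_i y_i K(x_i, x) + b over the support vectors, with the predicted class given by the sign of f(x). A minimal sketch, with hypothetical names and a toy model whose coefficients were chosen by hand:

```python
def decision_value(x, support_vectors, labels, alphas, b, kernel):
    """Return sum_i alpha_i * y_i * K(x_i, x) + b."""
    return sum(alpha * y * kernel(sv, x)
               for sv, y, alpha in zip(support_vectors, labels, alphas)) + b


def classify(x, support_vectors, labels, alphas, b, kernel):
    """Return +1 or -1 according to the sign of the decision value."""
    value = decision_value(x, support_vectors, labels, alphas, b, kernel)
    return 1 if value >= 0 else -1


def linear_kernel(u, v):
    """Plain inner product, used for the toy example below."""
    return sum(p * q for p, q in zip(u, v))


# Toy 1-D model: support vectors at +1 and -1, so f(x) = x.
toy_svs = [[1.0], [-1.0]]
toy_labels = [1, -1]
toy_alphas = [0.5, 0.5]
prediction = classify([2.0], toy_svs, toy_labels, toy_alphas, 0.0, linear_kernel)
```

Only training points with nonzero alpha contribute, which is why the sum runs over the support vectors alone.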
44
Experimental Data. Key frames from the TREC 2002 Video Track: 13,026 photographic images and 1,620 cartoons, manually classified. Experiments 1-3: train on (random) 3,908 photos and 486 cartoons.
45
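Experiment 1: individual performance. [Figure: per-descriptor results with kernel parameter annotations $\sigma^2 = 0.1$, $0.05 < \sigma^2 < 0.5$, $\sigma^2 = 0.07$, $0.05 < \sigma^2 < 0.5$, $\sigma^2 = 0.07$] Total error: $E_t = E_p \frac{|p|}{|p|+|c|} + E_c \frac{|c|}{|p|+|c|}$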
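The total error on this slide reads as the per-class error rates weighted by class size: E_t = E_p |p| / (|p| + |c|) + E_c |c| / (|p| + |c|). A small sketch of that computation follows; the function name is hypothetical and the numbers in the example are made up for illustration, not results from the experiments.

```python
def total_error(err_photos, err_cartoons, n_photos, n_cartoons):
    """Class-size-weighted error: E_p * |p| / (|p| + |c|) + E_c * |c| / (|p| + |c|)."""
    n = n_photos + n_cartoons
    if n == 0:
        raise ValueError("empty test set")
    return err_photos * n_photos / n + err_cartoons * n_cartoons / n


# Hypothetical error rates and class sizes, for illustration only.
example = total_error(err_photos=0.1, err_cartoons=0.3, n_photos=3, n_cartoons=1)
```

Because photos far outnumber cartoons in this data set, the total error is dominated by the photo error rate, so the per-class rates are worth reporting alongside it.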
46
Experiment 2: “convergence” of SVM learning (Pattern spectrum)
47
Experiment 3: combined performance. $\sigma^2 = 0.06$
48
Experiment 4: web-image classifier on our data. Test set: random 1,000 photos and 1,000 cartoons.
49
Experiment 5: performance on web images, adding dimension and file-type features. Comparison with 14,039 photographic and 9,512 graphical images harvested from the WWW; train on (random) 4,239 photographic and 2,826 graphical images.
50
Conclusions. Hard task: good classifier. Use dynamics/spatio-temporal relations? Semantic gap? Combine classifiers? Granulometry not enough.