Detecting Cartoons: A Case Study in Automatic Video-Genre Classification. Tzvetanka Ianeva, Arjen de Vries, Hein Röhrig

Outline
–Goal: remove cartoons from search results in the TREC-2002 video track
–Our approach: extract image descriptors and apply SVM machine learning
–Related work
–Novel descriptors from granulometry
–SVM learning
–Experimental results

TREC-2002 video track
–TREC: workshops for large-scale evaluation of information retrieval technology
–CWI participation: Probabilistic Multimedia Retrieval Model
–The model does not sufficiently distinguish “cartoons”

Example of undesirable ‘cartoon’ results [Figure: a query and the best matches returned]

Related work
–M. Roach et al.: Motion-based classification of cartoons (2001)
–B. T. Truong et al.: Automatic genre identification for content-based video categorization (2000)
–J. R. Smith et al.: Searching for images and videos on the World Wide Web
–N. C. Rowe et al.: Automatic caption localization for photographs on WWW pages
–V. Athitsos et al. [ASF]: Distinguishing photographs and graphics on the WWW

Cartoons
What is a cartoon?
–Cartoons do not contain any photographic material
–Photos are taken with a photographic camera
Cartoons appear easy to find
–Few, simple, strong colors; patches of uniform color; strong black edges; text

Quiz: Cartoon or Photo?

Examples not so Typical

Photos like cartoons

“Cartoons” like photos

Artificial photos

Small cues

Overlapping Frames

Shadow & Sparkle

Image Descriptors [Figure: an input image (240x352x3) is mapped to a vector of normalized image descriptors. Example: average saturation and threshold brightness; values shown: 0.6231, 0.9266, …, 0.2880, 0.4125]

Overview of all our image descriptors
Image descriptor              Dimension
average saturation            1
threshold brightness          1
color histogram               45
edge-direction histogram      40
compression ratio             1
multi-scale pattern spectrum  60

Brightness and Saturation
–HSV color model
–Cartoons are brighter => use the percentage of pixels with Value > 0.4
–Cartoons have strong colors => use the average Saturation
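The two scalar descriptors on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes pixels are given as (r, g, b) tuples with components in [0, 1], and the function name `saturation_and_brightness` is hypothetical.

```python
import colorsys

def saturation_and_brightness(pixels, value_threshold=0.4):
    """Return (average saturation, fraction of pixels with Value > threshold).

    pixels: iterable of (r, g, b) tuples with components in [0, 1] (assumed format).
    """
    total_saturation = 0.0
    bright = 0
    count = 0
    for r, g, b in pixels:
        _, s, v = colorsys.rgb_to_hsv(r, g, b)
        total_saturation += s
        if v > value_threshold:
            bright += 1
        count += 1
    if count == 0:
        raise ValueError("image has no pixels")
    return total_saturation / count, bright / count

# Toy "image": two fully saturated bright pixels and one dark gray pixel.
toy_pixels = [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.2, 0.2, 0.2)]
avg_sat, bright_frac = saturation_and_brightness(toy_pixels)
```

The brightness value is returned as a fraction rather than a percentage so that both descriptors live in [0, 1].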

Saturation in cartoon and photo images [Figure: RGB image and S channel (HSV) for each image; average saturation values 0.2880 and 0.6231]

Brightness in cartoon and photo images [Figure: RGB image and V channel (HSV) for each image; threshold brightness values 0.9266 and 0.4125]

Histograms
–Image $I : X \times Y \to \mathbb{R}^c$
–Filter $F : I \mapsto I'$
–Bins $B_k$: a partition of $\mathbb{R}^c$
–$h_k = \#\{ (x,y) : I'(x,y) \in B_k \}$
–E.g. brightness metric: $I$ grayscale, $c = 1$, $B_1 = [0, 0.4]$, $B_2 = (0.4, 1]$, return $h_2$

Color Histogram
–More general than brightness & saturation
–Again HSV color space
–Partition HSV into 3x3x5 = 45 bins
–Cartoons have fewer colors => color histogram descriptor
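A 45-bin HSV histogram could look roughly like the sketch below. The slides do not say which channel receives 5 bins, so the default `bins=(3, 3, 5)` in H, S, V order is an assumption, as are the normalization to fractions and the function name `color_histogram`.

```python
import colorsys

def color_histogram(pixels, bins=(3, 3, 5)):
    """Normalized HSV histogram; bins = (hue, saturation, value) bin counts (assumed order)."""
    n_h, n_s, n_v = bins
    hist = [0] * (n_h * n_s * n_v)
    count = 0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r, g, b)
        # min() keeps the boundary value 1.0 inside the last bin.
        i_h = min(int(h * n_h), n_h - 1)
        i_s = min(int(s * n_s), n_s - 1)
        i_v = min(int(v * n_v), n_v - 1)
        hist[(i_h * n_s + i_s) * n_v + i_v] += 1
        count += 1
    if count == 0:
        raise ValueError("image has no pixels")
    return [c / count for c in hist]

# Toy image: three pure-red pixels and one black pixel.
toy_hist = color_histogram([(1.0, 0.0, 0.0)] * 3 + [(0.0, 0.0, 0.0)])
```

An image with few distinct colors concentrates its mass in a handful of bins, which is the property the slide relies on.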

Color histogram in the 45-bin HSV space [Figure]

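Edge detection
–Cartoons have strong black edges => approximate the total derivative (gradient) of the intensity, $\nabla I(x,y) = \left( \frac{\partial I}{\partial x}(x,y), \frac{\partial I}{\partial y}(x,y) \right)$
–Approximate the magnitude $|\nabla I|$ and the direction $\theta$; build a histogram of $(\theta, |\nabla I|)$
–5 intervals for $|\nabla I| \in [0, \sqrt{20}]$
–8 intervals for $\theta \in [0, 2\pi]$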
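The 8 x 5 = 40 bins above can be sketched as follows. The slides do not name the gradient operator; the upper bound sqrt(20) is consistent with a 3x3 Sobel operator applied to intensities in [0, 1], so this sketch assumes Sobel. The function name `edge_histogram`, the normalization, and the skipping of border pixels are also assumptions.

```python
import math

def edge_histogram(gray, n_angle=8, n_mag=5, max_mag=math.sqrt(20)):
    """Joint histogram of gradient direction and magnitude (Sobel operator assumed).

    gray: list of rows of intensities in [0, 1]. Border pixels are skipped.
    Returns n_angle * n_mag fractions, indexed as angle_bin * n_mag + magnitude_bin.
    """
    height, width = len(gray), len(gray[0])
    hist = [0] * (n_angle * n_mag)
    count = 0
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = (gray[y - 1][x + 1] + 2 * gray[y][x + 1] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y][x - 1] - gray[y + 1][x - 1])
            gy = (gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y - 1][x] - gray[y - 1][x + 1])
            magnitude = math.hypot(gx, gy)
            angle = math.atan2(gy, gx) % (2 * math.pi)
            # Flat regions (zero gradient) fall into angle bin 0, magnitude bin 0.
            i_a = min(int(angle / (2 * math.pi) * n_angle), n_angle - 1)
            i_m = min(int(magnitude / max_mag * n_mag), n_mag - 1)
            hist[i_a * n_mag + i_m] += 1
            count += 1
    if count == 0:
        raise ValueError("image too small for a 3x3 operator")
    return [c / count for c in hist]

# Toy image: a sharp vertical step edge from black (0) to white (1).
step_image = [[0, 0, 1, 1] for _ in range(4)]
step_hist = edge_histogram(step_image)
```

In the toy image every interior pixel sees a horizontal gradient of magnitude 4, so all mass lands in the strongest magnitude bin of the first direction bin.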

Edge angles & edge magnitudes

Edge histogram

Compressibility
–Cartoons: simpler composition
–Detect complexity by measuring the compression ratio
–Theory: “Kolmogorov complexity”
–Our application: use lossless PNG compression
–Lossy JPEG not useful
[Figure: two example images with compression ratios 0.13548 and 0.23365]
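The idea can be illustrated with the standard library alone. The slides use PNG; the sketch below substitutes `zlib` (the DEFLATE compressor that PNG builds on) applied to raw pixel bytes, so the numbers will not match a real PNG encoder. Defining the ratio as compressed size divided by raw size is also an assumption, chosen because the example values on the slide are below 1.

```python
import random
import zlib

def compression_ratio(raw_bytes, level=9):
    """Compressed size / raw size, using zlib as a stand-in for PNG (assumption)."""
    if not raw_bytes:
        raise ValueError("empty image data")
    return len(zlib.compress(raw_bytes, level)) / len(raw_bytes)

# Two toy "images" of 4096 bytes: a uniform patch and pseudo-random noise.
rng = random.Random(0)
flat_patch = bytes([200]) * 4096
noisy_patch = bytes(rng.randrange(256) for _ in range(4096))

flat_ratio = compression_ratio(flat_patch)
noisy_ratio = compression_ratio(noisy_patch)
```

Uniform regions compress far better than noise, which is why a low ratio hints at the simple composition typical of cartoons.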

Granulometries
–Idea: measure the size distribution of objects
–How? Openings by a structuring element of growing scale
–Normalized size distribution
–Derivative = pattern spectrum

Openings
–Opening = erosion followed by dilation with the same structuring element (SE)

Structuring Elements
–Non-flat parabola better(?) than flat disk
–Parabola: efficient computation, symmetry

Small-scale pattern spectrum descriptors
–SE: disk with radius $r_i = i$, $i = 1, \dots, 20$
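The granulometry slides can be tied together with a small pure-Python sketch using flat disk structuring elements. It is an illustration under stated assumptions, not the authors' implementation: out-of-image pixels are simply ignored at the border, the size distribution is taken as the fraction of total intensity removed by each opening, and the helper names are hypothetical. The non-flat parabolic elements mentioned above are not covered.

```python
def disk_offsets(radius):
    """Offsets (dy, dx) of a flat discrete disk of the given radius."""
    return [(dy, dx)
            for dy in range(-radius, radius + 1)
            for dx in range(-radius, radius + 1)
            if dy * dy + dx * dx <= radius * radius]

def _morph(gray, offsets, pick):
    """Apply min (erosion) or max (dilation) over the structuring element."""
    height, width = len(gray), len(gray[0])
    return [[pick([gray[y + dy][x + dx] for dy, dx in offsets
                   if 0 <= y + dy < height and 0 <= x + dx < width])
             for x in range(width)]
            for y in range(height)]

def opening(gray, radius):
    """Erosion followed by dilation with the same disk."""
    offsets = disk_offsets(radius)
    return _morph(_morph(gray, offsets, min), offsets, max)

def pattern_spectrum(gray, max_radius):
    """Discrete derivative of the normalized size distribution for radii 1..max_radius."""
    total = sum(map(sum, gray))
    if total == 0:
        raise ValueError("image has zero total intensity")
    removed = [0.0]  # fraction of intensity removed at radius 0
    for radius in range(1, max_radius + 1):
        removed.append(1.0 - sum(map(sum, opening(gray, radius))) / total)
    return [removed[i] - removed[i - 1] for i in range(1, len(removed))]

# Toy image: a 3x3 bright square on a 9x9 black background.
toy = [[1 if 3 <= y <= 5 and 3 <= x <= 5 else 0 for x in range(9)] for y in range(9)]
spectrum = pattern_spectrum(toy, 2)
```

The square survives the radius-1 opening only as a 5-pixel cross and vanishes at radius 2, so the spectrum shows at which scales the object's mass disappears.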

SVM Learning
–Simplest case: linear separator
–SVM finds the hyperplane with the largest margin
–Closest points = support vectors

SVM Learning: nonseparable
–Noisy data: no separating hyperplane at all!
–Solution: penalty C for points inside the margin
–C-SVMs

SVM = quadratic programming SVM task: Equivalent dual problem:
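The formulas for this slide are not preserved in the transcript. For orientation, the textbook soft-margin (C-SVM) problem and its dual are given below; the notation ($w$, $b$, $\xi_i$, $\alpha_i$, labels $y_i \in \{-1, +1\}$) is the standard one and is assumed here, so it may differ from what the slide showed.

```latex
% Primal (SVM task), training points (x_i, y_i), i = 1..n:
\min_{w,\,b,\,\xi} \; \tfrac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{n} \xi_i
\quad \text{s.t.} \quad y_i \,(w \cdot x_i + b) \ge 1 - \xi_i, \;\; \xi_i \ge 0

% Equivalent dual problem:
\max_{\alpha} \; \sum_{i=1}^{n} \alpha_i
  - \tfrac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j \, y_i y_j \,(x_i \cdot x_j)
\quad \text{s.t.} \quad 0 \le \alpha_i \le C, \;\; \sum_{i=1}^{n} \alpha_i y_i = 0
```

Both problems are quadratic programs. In the kernel version, the inner product $x_i \cdot x_j$ in the dual is replaced by a kernel value $K(x_i, x_j)$.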

SVM with kernels SVM task: Equivalent dual problem:

SVM kernels
–RBF kernels
–Polynomial kernels

SVM with kernels: decision function SVM task: Equivalent dual problem: Decision function:
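A kernel decision function has the general form sign(sum_i alpha_i y_i K(x_i, x) + b) over the support vectors. The sketch below shows that form with an RBF and a polynomial kernel. The RBF parametrization exp(-||x - z||^2 / (2 sigma^2)) is one common convention and an assumption here, since the slides do not show how sigma^2 enters; the support vectors, multipliers, and bias in the toy example are made up rather than trained.

```python
import math

def rbf_kernel(x, z, sigma2=0.1):
    """exp(-||x - z||^2 / (2 * sigma2)); parametrization assumed."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-squared_distance / (2.0 * sigma2))

def polynomial_kernel(x, z, degree=2, coef0=1.0):
    """(x . z + coef0) ** degree."""
    return (sum(a * b for a, b in zip(x, z)) + coef0) ** degree

def decide(x, support_vectors, labels, alphas, bias, kernel):
    """Return +1 or -1 from the kernel expansion over the support vectors."""
    score = sum(alpha * label * kernel(sv, x)
                for sv, label, alpha in zip(support_vectors, labels, alphas)) + bias
    return 1 if score >= 0 else -1

# Hypothetical, hand-picked model: one support vector per class.
toy_svs = [(0.0, 0.0), (1.0, 1.0)]
toy_labels = [-1, 1]
toy_alphas = [1.0, 1.0]
near_positive = decide((0.9, 0.9), toy_svs, toy_labels, toy_alphas, 0.0, rbf_kernel)
near_negative = decide((0.1, 0.1), toy_svs, toy_labels, toy_alphas, 0.0, rbf_kernel)
```

Only the support vectors enter the sum, so the cost of classifying a key frame scales with their number rather than with the size of the training set.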

Experimental Data
–Key frames from the TREC 2002 Video Track
–13,026 photographic images
–1,620 cartoons
–Manually classified
–Experiments 1-3: train on (random) 3,908 photos and 486 cartoons
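The training sizes correspond to roughly 30% of each class. A random per-class split of that kind could be drawn as sketched below; the seed, the function name `split_indices`, and the use of the remaining images as a test set are assumptions, since the slides only state the training sizes.

```python
import random

def split_indices(n_items, n_train, seed=0):
    """Randomly pick n_train of n_items indices for training; the rest form the test set."""
    rng = random.Random(seed)
    train = set(rng.sample(range(n_items), n_train))
    test = [i for i in range(n_items) if i not in train]
    return sorted(train), test

# Sizes taken from the slide; treating the remainder as test data is an assumption.
photo_train, photo_test = split_indices(13026, 3908)
cartoon_train, cartoon_test = split_indices(1620, 486)
```

Splitting each class separately keeps the photo-to-cartoon proportion the same in both parts.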

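Experiment 1: individual performance
[Figure: per-descriptor results; plot annotations: $\sigma^2 = 0.1$; $0.05 < \sigma^2 < 0.5$; $\sigma^2 = 0.07$; $0.05 < \sigma^2 < 0.5$; $\sigma^2 = 0.07$]
$E_t = \frac{|p|}{|p|+|c|} E_p + \frac{|c|}{|p|+|c|} E_c$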
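The total error $E_t$ is a weighted average of two per-class errors. Reading $|p|$ and $|c|$ as the numbers of photos and cartoons, and $E_p$, $E_c$ as the error rates on each class, is an interpretation of the slide; the function below is a sketch under that reading.

```python
def total_error(error_photos, error_cartoons, n_photos, n_cartoons):
    """E_t = |p|/(|p|+|c|) * E_p + |c|/(|p|+|c|) * E_c."""
    n_total = n_photos + n_cartoons
    if n_total == 0:
        raise ValueError("no samples")
    return (n_photos / n_total) * error_photos + (n_cartoons / n_total) * error_cartoons

# Hypothetical error rates; 3 photos for every cartoon.
example_error = total_error(0.1, 0.3, n_photos=3, n_cartoons=1)
```

Because photos heavily outnumber cartoons in this data set, $E_t$ is dominated by the photo error, so a classifier can score well overall while still missing many cartoons.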

Experiment 2: “convergence” of SVM learning (Pattern spectrum)

Experiment 3: combined performance ($\sigma^2 = 0.06$)

Experiment 4: web-image classifier on our data
–Test set: random 1,000 photos and 1,000 cartoons

Experiment 5: performance on web images
–Descriptors plus dimension and file-type features
–Comparison with 14,039 photographic and 9,512 graphical images harvested from the WWW
–Train on (random) 4,239 photographic and 2,826 graphical images

Conclusions
–Hard task: good classifier
–Use dynamics/spatio-temporal relations?
–Semantic gap?
–Combine classifiers?
–Granulometry not enough
