
1 IIIT Hyderabad Synthesizing Classifiers for Novel Settings. Viresh Ranjan, CVIT, IIIT-H. Adviser: Prof. C. V. Jawahar, IIIT-H. Co-Adviser: Dr. Gaurav Harit, IIT Jodhpur.

2 IIIT Hyderabad Overview 1. Visual Recognition & Retrieval Tasks 2. Challenges in Visual Recognition & Retrieval a) Dataset Shift b) Large number of categories 3. Handling Dataset Shift 4. Handling large number of categories

3 IIIT Hyderabad Overview 1. Visual Recognition & Retrieval Tasks 2. Challenges in Visual Recognition & Retrieval a) Dataset Shift b) Large number of categories 3. Handling Dataset Shift 4. Handling large number of categories

4 IIIT Hyderabad Introduction. Visual Recognition & Retrieval: Object Recognition. Pipeline: Image -> Feature Extraction -> Classifier -> Image labels (“Car” / “Not Car”).

5 IIIT Hyderabad Introduction. Visual Recognition & Retrieval: Word image retrieval. Pipeline: Image -> Feature Extraction -> Classifier -> Image labels (“room” / “Not room”).

6 IIIT Hyderabad Introduction. Visual Recognition & Retrieval: Handwritten digit classification. Pipeline: Image -> Feature Extraction -> Classifier -> Image labels (“2” / “Not 2”).
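As a rough illustration of this image -> feature extraction -> classifier pipeline (not the actual features or classifiers used in the thesis; the toy dataset, the scaler, and the linear SVM below are assumptions for the sketch):

```python
# Minimal sketch of the image -> features -> classifier pipeline on a toy digit set.
# Illustrative only; the thesis uses its own features (e.g. SURF/BoW) and classifiers.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

digits = load_digits()                          # 8x8 digit images, flattened to 64-d vectors
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.3, random_state=0)

# "Feature extraction" here is just pixel scaling; a real system would plug in
# the descriptors discussed in the later slides.
clf = make_pipeline(StandardScaler(), LinearSVC(dual=False))
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```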

7 IIIT Hyderabad Overview 1. Visual Recognition & Retrieval Tasks 2. Challenges in Visual Recognition & Retrieval a) Dataset Shift b) Large number of categories 3. Handling Dataset Shift 4. Handling large number of categories

8 IIIT Hyderabad Introduction. Challenges in Visual Recognition & Retrieval: Dataset Shift. Dataset Shift in Object Recognition: Source (training set) vs. Target (test set).

9 IIIT Hyderabad Introduction. Challenges in Visual Recognition & Retrieval: Dataset Shift. Dataset Shift in digit classification: Source (training set) is printed digits, Target (test set) is handwritten digits.

10 IIIT Hyderabad Introduction. Challenges in Visual Recognition & Retrieval: Dataset Shift. Dataset Shift in word image retrieval: Source (training set) vs. Target (test set).

11 IIIT Hyderabad Overview 1. Visual Recognition & Retrieval Tasks 2. Challenges in Visual Recognition & Retrieval a) Dataset Shift b) Large number of categories 3. Handling Dataset Shift 4. Handling large number of categories

12 IIIT Hyderabad Introduction. Challenges in Visual Recognition & Retrieval: Dataset Shift; too many categories (around 200K word categories in the English language).

13 IIIT Hyderabad Introduction. Challenges in Visual Recognition & Retrieval: Dataset Shift; too many categories. Tackling the challenges: Dataset Shift: i) Domain Adaptation, ii) Kernelized feature extraction. Too many categories: Transfer Learning.

14 IIIT Hyderabad Overview 1. Visual Recognition & Retrieval Tasks 2. Challenges in Visual Recognition & Retrieval a) Dataset Shift b) Large number of categories 3. Handling Dataset Shift a) Handling Dataset Shift in object recognition by Domain Adaptation b) Handling Dataset Shift in digit classification by Domain Adaptation c) Handling Dataset Shift in word image retrieval by Kernelized Feature Extraction 4. Handling large number of categories

15 IIIT Hyderabad 3.a. Handling Dataset Shift in object recognition by Domain Adaptation

16 IIIT Hyderabad Problem Statement (Source Domain, Target Domain). Given: Labeled Source Domain, Unlabeled Target Domain. Goal: Classify target domain images.

17 IIIT Hyderabad Overview of Domain Adaptation. Figure: target classification (a) using the Source classifier and (b) using DA; inputs are labeled Source domain images and unlabeled Target domain images.

18 IIIT Hyderabad Proposed Approach. Decompose the features of the Source Domain and Target Domain into: Domain Specific features and Domain Independent features.

19 IIIT Hyderabad Proposed Approach. Discard the domain specific features (Source Specific and Target Specific); keep the Domain Independent features.

20 IIIT Hyderabad Proposed Approach. Train classifiers on the Source Domain using the domain independent features; test on the Target Domain.

21 IIIT Hyderabad Learning Domain Specific & Domain Independent features. Sparse Representation: Image ≈ Dictionary × Sparse coefficients. However, the above sparse representation cannot separate domain specific & domain independent features. How do we separate them?

22 IIIT Hyderabad Learning Domain Specific & Domain Independent features. Key idea: domain specific & shared atoms in the dictionary. A source image is coded over Source Specific Atoms and Shared Atoms (coefficients for source specific atoms, coefficients for shared atoms); a target image is coded over Target Specific Atoms and Shared Atoms (coefficients for target specific atoms, coefficients for shared atoms).

23 IIIT Hyderabad Learning Domain Specific & Domain Independent features. Equations (1) and (2): the Source Domain dictionary consists of Source Specific Atoms plus Shared Atoms, and the Target Domain dictionary consists of Target Specific Atoms plus the same Shared Atoms.

24 IIIT Hyderabad Learning Cross Domain Classifiers. Sparse representation: source images yield source specific coeffs. plus coeffs. for shared atoms; target images yield target specific coeffs. plus coeffs. for shared atoms.

25 IIIT Hyderabad Learning Cross Domain Classifiers. For source images, discard the domain specific coeffs. and train classifiers using the coeffs. for the shared atoms.
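A minimal sketch of this step, assuming the sparse codes have already been computed over a partially shared dictionary; the column layout (shared atoms last) and the linear SVM are assumptions for illustration, not the exact setup of the thesis:

```python
# Sketch: keep only the shared-atom coefficients and train a cross-domain classifier.
# Assumes codes_* have shape (n_images, k_specific + k_shared), with the shared
# atoms occupying the LAST k_shared columns (this layout is an assumption).
import numpy as np
from sklearn.svm import LinearSVC

def train_on_shared(codes_src, labels_src, k_shared):
    shared_src = codes_src[:, -k_shared:]          # discard domain specific coeffs.
    return LinearSVC(dual=False).fit(shared_src, labels_src)

def predict_target(clf, codes_tgt, k_shared):
    return clf.predict(codes_tgt[:, -k_shared:])   # same shared block on the target side
```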

26 IIIT Hyderabad Learning Domain Specific & Domain Independent features. Objective (3): source reconstruction error + target reconstruction error, subject to the sparsity constraints (4) and (5), where Y_s contains the source images, Y_t contains the target images, and D_s and D_t are the source and target dictionaries.
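Equations (3)-(5) were images in the original slide and were not captured in the transcript. A plausible form consistent with the description above (shared atoms plus domain specific atoms, two reconstruction errors, sparsity constraints) is sketched below; the exact PSDL formulation in the thesis may differ:

```latex
% Sketch of a partially shared dictionary learning (PSDL) objective.
% D_{sh}: shared atoms; \hat{D}_s, \hat{D}_t: domain specific atoms; T: sparsity level.
\min_{\hat{D}_s,\,\hat{D}_t,\,D_{sh},\,X_s,\,X_t}\;
  \underbrace{\bigl\| Y_s - [\hat{D}_s \;\; D_{sh}]\, X_s \bigr\|_F^2}_{\text{source reconstruction error}}
  \;+\;
  \underbrace{\bigl\| Y_t - [\hat{D}_t \;\; D_{sh}]\, X_t \bigr\|_F^2}_{\text{target reconstruction error}}
\qquad \text{s.t.}\;\; \|x_i\|_0 \le T \;\;\text{for every column } x_i \text{ of } X_s, X_t .
```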

27 IIIT Hyderabad Experiments. Dataset: 10 object classes from Caltech-256 (C), Webcam (W), Dslr (D), Amazon (A). Feature representation: SURF features, BoW representation (800 visual words).

28 IIIT Hyderabad Results. Unsupervised Setting (no target labels):
Method          C->A  C->D  A->C  A->W  W->C  W->A  D->A  D->W
MOD src         39.8  42.1  37.0  36.2  19.8  26.8  30.1  55.3
MOD tgt         44.4  44.0  36.8  38.2  30.5  35.4  34.5  69.5
Gopalan et al.  36.8  32.6  35.3  31.0  21.7  27.5  32.0  66.0
Gong et al.     40.4  41.1  37.9  35.7  29.3  35.5  36.1  79.1
Ni et al.       45.4  42.3  40.4  37.9  36.3  38.3  39.1  86.2
PSDL (ours)     47.6  48.5  39.8  38.9  31.8  36.0  37.9  79.1

29 IIIT Hyderabad Results. Qualitative retrieval: for each query, images retrieved using PSDL vs. using the original features.

30 IIIT Hyderabad Overview 1. Visual Recognition & Retrieval Tasks 2. Challenges in Visual Recognition & Retrieval a) Dataset Shift b) Large number of categories 3. Handling Dataset Shift a) Handling Dataset Shift in object recognition by Domain Adaptation b) Handling Dataset Shift in digit classification by Domain Adaptation c) Handling Dataset Shift in word image retrieval by Kernelized Feature Extraction 4. Handling large number of categories

31 IIIT Hyderabad 3.b. Handling Dataset Shift in digit classification by Domain Adaptation

32 IIIT Hyderabad Problem Statement (Source Domain, Target Domain). Given: Labeled Source Domain, Unlabeled Target Domain. Goal: Classify target domain images.

33 IIIT Hyderabad Approach Overview. Project the Source data and Target data onto a Source Subspace and a Target Subspace, then align them into a Common Subspace.

34 IIIT Hyderabad Locality Preserving Subspace Alignment (LPSA). Desired properties for the subspace: preserve the local geometry of the data; utilize label information. Locality Preserving Projections (LPP) [1]: preserves the local neighborhood and can utilize label information. [1] X. He and P. Niyogi, “Locality preserving projections,” in NIPS, 2003, pp. 234-241.

35 IIIT Hyderabad Locality Preserving Subspace Alignment (LPSA). Locality Preserving Projection (LPP) objective, equation (6).
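Equation (6) was an image in the original slide; the standard LPP objective from He and Niyogi [1], which the slide presumably shows, is:

```latex
% Locality Preserving Projections (He & Niyogi, NIPS 2003).
% w: projection direction, x_i: data points, S_{ij}: affinity weights
% (e.g. heat-kernel weights on a k-nearest-neighbour graph).
\min_{w}\; \sum_{i,j} \bigl( w^\top x_i - w^\top x_j \bigr)^2 S_{ij}
  \;=\; \min_{w}\; w^\top X L X^\top w ,
\qquad \text{s.t. } w^\top X D X^\top w = 1,
```

where D_ii = sum_j S_ij and L = D - S is the graph Laplacian; the optimal w solves the generalized eigenproblem X L X^T w = lambda X D X^T w. In the supervised variant (next slide) the affinity S_ij additionally uses the class labels.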

36 IIIT Hyderabad Locality Preserving Subspace Alignment (LPSA). Supervised Locality Preserving Projection (sLPP), equation (6): the same objective, with an affinity matrix that also uses label information.

37 IIIT Hyderabad Locality Preserving Subspace Alignment (LPSA)

38 IIIT Hyderabad Locality Preserving Subspace Alignment (LPSA). Approach: aligning the source subspace to the target subspace, equation (7).

39 IIIT Hyderabad Approach: projection of the data onto the aligned subspaces.
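A rough sketch of the align-and-project step, in the spirit of subspace alignment [2]; the PCA bases and the 1-NN classifier below are simplifications for illustration, since LPSA itself builds the subspaces with (supervised) locality preserving projections:

```python
# Sketch of subspace alignment for domain adaptation (cf. Fernando et al., ICCV 2013).
# PCA is used here for the subspaces; LPSA replaces it with locality preserving projections.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier

def align_and_project(X_src, y_src, X_tgt, dim=20):
    P_s = PCA(n_components=dim).fit(X_src).components_.T    # D x d source basis
    P_t = PCA(n_components=dim).fit(X_tgt).components_.T    # D x d target basis
    M = P_s.T @ P_t                                         # alignment matrix (analogue of eq. (7))
    src_proj = X_src @ P_s @ M                              # source data in the aligned subspace
    tgt_proj = X_tgt @ P_t                                  # target data in its own subspace
    clf = KNeighborsClassifier(n_neighbors=1).fit(src_proj, y_src)
    return clf, tgt_proj

# usage (hypothetical arrays): clf, tgt_proj = align_and_project(X_source, y_source, X_target)
#                              y_pred = clf.predict(tgt_proj)
```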

40 IIIT Hyderabad Datasets
Dataset                  Source                                     No. Images
Printed digits           Rendering digits in 300 different fonts    3000
Handwritten digits (HW)  Sampling 300 images per digit from MNIST   3000

41 IIIT Hyderabad Experimental Results
Source       Target   Method              Accuracy
Handwritten  Printed  No Adaptation       48.8
Handwritten  Printed  PCA (source)        55.9
Handwritten  Printed  PCA (target)        56.5
Handwritten  Printed  PCA (combined)      56.5
Handwritten  Printed  Fernando et al [2]  57.0
Handwritten  Printed  LPSA (Ours)         64.8
[2] B. Fernando et al., "Unsupervised visual domain adaptation using subspace alignment," ICCV 2013.

42 IIIT Hyderabad Experimental Results
Source   Target       Method              Accuracy
Printed  Handwritten  No Adaptation       70.0
Printed  Handwritten  PCA (source)        68.1
Printed  Handwritten  PCA (target)        68.9
Printed  Handwritten  PCA (combined)      70.2
Printed  Handwritten  Fernando et al [2]  70.6
Printed  Handwritten  LPSA (Ours)         73.2
[2] B. Fernando et al., "Unsupervised visual domain adaptation using subspace alignment," ICCV 2013.

43 IIIT Hyderabad Experimental Results. Qualitative comparison on test images: predictions with No Adaptation vs. DA using LPSA.

44 IIIT Hyderabad Overview 1. Visual Recognition & Retrieval Tasks 2. Challenges in Visual Recognition & Retrieval a) Dataset Shift b) Large number of categories 3. Handling Dataset Shift a) Handling Dataset Shift in object recognition by Domain Adaptation b) Handling Dataset Shift in digit classification by Domain Adaptation c) Handling Dataset Shift in word image retrieval by Kernelized Feature Extraction 4. Handling large number of categories

45 IIIT Hyderabad 3.c. Handling Dataset Shift in word image retrieval by Kernelized Feature Extraction

46 IIIT Hyderabad Style-Content Factorization

47 IIIT Hyderabad Style-Content Factorization. Asymmetric Bilinear Model (Freeman et al. 2000): an image is modelled as the interaction of two factors.

48 IIIT Hyderabad Style-Content Factorization. Asymmetric Bilinear Model (Freeman et al. 2000): the two factors are the Style and the Content of the image.

49 IIIT Hyderabad Style-Content Factorization. Asymmetric Bilinear Model (Freeman et al. 2000): the two factors are the Style and the Content of the image.

50 IIIT Hyderabad Style-Content Factorization. Asymmetric Bilinear Model (Freeman et al. 2000), equation (8): an image is the product of style dependent basis vectors and a content vector. Notation: one index refers to the style (font), the other to the content.

51 IIIT Hyderabad Style-Content Factorization. Asymmetric Bilinear Model (Freeman et al. 2000), equation (8): an image is the product of style dependent basis vectors and a content vector. Notation: one index refers to the style (font), the other to the content.
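Equation (8) was an image in the original slides; the asymmetric bilinear model of the cited work (style dependent basis vectors applied to a content vector) is commonly written as follows, with the symbol names chosen here for illustration:

```latex
% Asymmetric bilinear model: an image y rendered in style s with content c
% is a style dependent linear map W^s applied to a content vector b^c.
y^{sc} \;\approx\; W^{s}\, b^{c},
\qquad W^{s} \in \mathbb{R}^{m \times k},\;\; b^{c} \in \mathbb{R}^{k},
```

so each font gets its own basis W^s while the content vector b^c is shared across fonts. Fitting a new W^s for every unseen font is exactly the limitation listed on the next slide.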

52 IIIT Hyderabad Style-Content Factorization. Problems with the Asymmetric Bilinear Model: it needs separate learning for each new style (font); the model is too simplistic and overlooks nonlinear interactions.

53 IIIT Hyderabad Style-Content Factorization. Problems with the Asymmetric Bilinear Model: it needs separate learning for each new style; the model is too simplistic and overlooks the nonlinear relationship. To tackle these problems, we propose a kernelized version of the Asymmetric Bilinear Model.

54 IIIT Hyderabad Non-linear Style-Content Factorization. Asymmetric Kernel Bilinear Model (AKBM), equations (10) and (11): the factorization is defined in a kernel-induced feature space.

55 IIIT Hyderabad Non-linear Style-Content Factorization. Asymmetric Kernel Bilinear Model (AKBM), equation (12): an image is represented through a style basis and a content vector.

56 IIIT Hyderabad Non-linear Style-Content Factorization. Learning the Asymmetric Kernel Bilinear Model (AKBM) parameters, equation (13): a data fitting term plus a regularizer.

57 IIIT Hyderabad Non-linear Style-Content Factorization. Learning the AKBM parameters, equation (13): the mapping function is not known explicitly; the kernel trick comes to the rescue.

58 IIIT Hyderabad Non-linear Style-Content Factorization. Learning the AKBM parameters: applying the kernel trick rewrites equation (13) as equation (14), which depends on the data only through the kernel matrix.

59 IIIT Hyderabad Non-linear Style-Content Factorization. Learning the AKBM parameters, equation (14): the objective is jointly non-convex in the style and content parameters, but convex with respect to either one when the other is held fixed. We solve it by alternating minimization: solve the convex problem for one set of parameters keeping the other constant, and vice-versa.

60 IIIT Hyderabad Non-linear Style-Content Factorization. Representing content using AKBM: for a novel query in any style, the content is found by minimizing the objective in equations (15) and (16).
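Equations (10)-(16) were images in the original deck, so the sketch below reconstructs only the general idea: a kernelized bilinear factorization trained by alternating least squares, plus content inference for a novel query. The single-style simplification, the RBF kernel, the ridge regularizer, and all variable names are assumptions for illustration, not the thesis formulation.

```python
# Simplified sketch of a kernel bilinear factorization (single style).
# Model: phi(y_i) ~ Phi(Y) @ A @ c_i, so everything is expressed through the
# kernel matrix K = Phi(Y).T @ Phi(Y).  Objective (in the spirit of eq. 13-14):
#   tr((I - A C).T K (I - A C)) + lam * tr(A.T K A)
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

def fit_akbm(Y, n_content=10, lam=1e-2, n_iter=50, gamma=None):
    """Y: (n_samples, n_features) word images. Returns K, A (n x d), C (d x n)."""
    K = rbf_kernel(Y, Y, gamma=gamma)                 # kernel matrix over training images
    n = K.shape[0]
    rng = np.random.default_rng(0)
    A = rng.standard_normal((n, n_content)) * 0.01
    for _ in range(n_iter):
        # Fix A, solve for the content matrix C:  (A.T K A) C = A.T K
        G = A.T @ K @ A
        C = np.linalg.solve(G + 1e-8 * np.eye(n_content), A.T @ K)
        # Fix C, solve for the style coefficients A:  A (C C.T + lam I) = C.T
        A = C.T @ np.linalg.inv(C @ C.T + lam * np.eye(n_content))
    return K, A, C

def infer_content(Y_train, A, query, gamma=None):
    """Content vector for a novel query image (any style), by kernel least squares."""
    K = rbf_kernel(Y_train, Y_train, gamma=gamma)
    k_q = rbf_kernel(Y_train, query.reshape(1, -1), gamma=gamma)   # (n, 1)
    G = A.T @ K @ A
    c = np.linalg.solve(G + 1e-8 * np.eye(G.shape[0]), A.T @ k_q)
    return c.ravel()
```

The alternating updates are the closed-form minimizers of the quadratic objective with one factor fixed; retrieval would then compare content vectors of queries and database word images, e.g. by cosine similarity.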

61 IIIT Hyderabad Datasets
Dataset  No. distinct words  No. word images
D1       200                 19472
D2       200                 4923
D3       200                 8463
D4       200                 13557
D5       200                 2868
Dlab     500                 5000
D1-D5 consist of word images from 5 different books, varying in font. Dlab is generated under laboratory settings and consists of 10 widely varying fonts.

62 IIIT Hyderabad Datasets: sample word images from Dlab.

63 IIIT Hyderabad Experimental Results
Method                D1->D2  D1->D3  D1->D4  D2->D1  D2->D3  D2->D4
No Transfer           0.63    0.55    0.68    0.69    0.68    0.76
ABM (Freeman et al.)  0.67    0.59    0.70    0.71    0.76    0.83
AKBM (ours)           0.88    0.72    0.84    0.85    0.83    0.91
Asymmetric Kernel Bilinear Model (AKBM) refers to our kernelized style-content factorization.

64 IIIT Hyderabad Qualitative cross-font retrieval: for each query, the images retrieved with No Transfer vs. with AKBM.

65 IIIT Hyderabad Experimental Results. Retrieval results on Dlab.

66 IIIT Hyderabad Overview 1. Visual Recognition & Retrieval Tasks 2. Challenges in Visual Recognition & Retrieval a) Dataset Shift b) Large number of categories 3. Handling Dataset Shift a) Handling Dataset Shift in object recognition by Domain Adaptation b) Handling Dataset Shift in digit classification by Domain Adaptation c) Handling Dataset Shift in word image retrieval by Kernelized Feature Extraction 4. Handling large number of categories via Transfer Learning

67 IIIT Hyderabad 4. Handling large number of categories via Transfer Learning

68 IIIT Hyderabad Problem Statement. To design a scalable classifier-based document image retrieval system. There are around 200K word categories in the English language.

69 IIIT Hyderabad Proposed Approach. The top few frequent words account for most of the coverage. A query word can be either a Frequent query, corresponding to the frequent words (higher coverage), or a Rare query, corresponding to the rare words (lower coverage).

70 IIIT Hyderabad Proposed Approach. Classifiers are trained for frequent queries & synthesized on-the-fly for rare queries. Rare queries consist of characters already present in one or more frequent queries. To synthesize a classifier for a novel rare query, cut and paste the relevant portions from the existing frequent classifiers (a sketch follows the synthesis slides below).

71 IIIT Hyderabad Proposed Approach. On-the-fly classifier synthesis.

72 IIIT Hyderabad Proposed Approach. On-the-fly classifier synthesis.
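A rough sketch of the cut-and-paste idea behind the on-the-fly synthesis (the DQC of the contributions slide). It assumes each frequent-word classifier is a linear model whose weight vector is a concatenation of fixed-width per-character blocks; that block layout, the averaging over donor words, and all names below are assumptions for illustration only.

```python
# Sketch: synthesize a linear classifier for a rare query word by cutting
# character-level blocks out of the weight vectors of frequent-word classifiers.
import numpy as np

def synthesize_classifier(rare_word, frequent_weights, block_dim):
    """
    rare_word:        e.g. "cart"
    frequent_weights: dict mapping frequent word -> 1-D weight vector, assumed to be
                      a concatenation of per-character blocks of size block_dim
    block_dim:        feature dimension allotted to one character position
    """
    blocks = []
    for ch in rare_word:
        donors = []
        for word, w in frequent_weights.items():
            for pos, donor_ch in enumerate(word):
                if donor_ch == ch:
                    donors.append(w[pos * block_dim:(pos + 1) * block_dim])
        if not donors:
            raise ValueError(f"no frequent classifier contains character {ch!r}")
        blocks.append(np.mean(donors, axis=0))   # average the matching character blocks
    return np.concatenate(blocks)                # weight vector for the rare word

# usage (hypothetical): w_cart = synthesize_classifier("cart", trained_weights, block_dim=64)
#                       scores = word_image_features @ w_cart   # rank images for the rare query
```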

73 IIIT Hyderabad Datasets
Dataset  Source   Type   No. of Images
D1       1 book   Clean  26,555
D2       2 books  Clean  35,730
D3       1 book   Noisy  4,373

74 IIIT Hyderabad Experimental Results
Dataset  Source   Type   # Images  # queries  OCR (mAP)  LDA (mAP)
D1       1 book   Clean  26,555    100        0.97       0.98
D2       2 books  Clean  35,730    100        0.95       0.92
D3       1 book   Noisy  4,373     100        0.89       0.98
Here mAP is the mean average precision over the 100 queries.

75 IIIT Hyderabad Experimental Results
Dataset  No. of queries  mAP (frequent queries)  mAP (rare queries)
D1       100             0.99                    0.90
D2       100             0.98                    0.87
D3       100             1.00                    0.82
Here mAP is the mean average precision over the 100 queries.

76 IIIT Hyderabad Conclusion. Domain Adaptation reduces the mismatch between the source & target domains. AKBM is more robust to font variations than the Asymmetric Bilinear Model. Transfer learning can be used to design scalable classifier-based word image retrieval systems.

77 IIIT Hyderabad Contributions. PSDL: a joint dictionary learning strategy, suitable for domain adaptation. LPSA: a subspace alignment strategy for domain adaptation. AKBM: a nonlinear style-content factorization model. DQC: a transfer learning strategy for on-the-fly learning of word image classifiers.

78 IIIT Hyderabad Thank You. Related Publications: 1. Viresh Ranjan, Gaurav Harit and C. V. Jawahar: Enhancing Word Image Retrieval in Presence of Font Variations, International Conference on Pattern Recognition (ICPR), 2014 (Oral). 2. Viresh Ranjan, Gaurav Harit and C. V. Jawahar: Document Retrieval with Unlimited Vocabulary, IEEE Winter Conference on Applications of Computer Vision (WACV), 2015. 3. Viresh Ranjan, Gaurav Harit and C. V. Jawahar: Learning Partially Shared Dictionaries for Domain Adaptation, 12th Asian Conference on Computer Vision (ACCV 2014), Workshop: FSLCV 2014. 4. Viresh Ranjan, Gaurav Harit and C. V. Jawahar: Domain Adaptation by Aligning Locality Preserving Subspaces, 8th International Conference on Advances in Pattern Recognition (ICAPR 2015).

