Learning visual representations for unfamiliar environments Kate Saenko, Brian Kulis, Trevor Darrell UC Berkeley EECS & ICSI

The challenge of large-scale visual interaction
The last decade has proven the superiority of models learned from data over hand-engineered structures!

Large-scale learning
Unsupervised: learn models from found data; often exploit multiple modalities (text + image), e.g., product photos paired with text such as "The Tote is the perfect example of two handbag design principles that... The lines of this tote are incredibly sleek, but... The semi buckles that form the handle attachments are..."

E.g., finding visual senses [Saenko and Darrell 09]
Artifact sense of "telephone" from the dictionary:
1: (n) telephone, phone, telephone set (electronic equipment that converts sound into electrical signals that can be transmitted over distances and then converts received signals back into sounds)
2: (n) telephone, telephony (transmitting speech at a distance)

Large-scale learning
Unsupervised: learn models from found data; often exploit multiple modalities (text + image)
Supervised: crowdsource labels (e.g., ImageNet)

Yet… even the best collection of images from the web combined with strong machine learning methods can often yield poor classifiers on in-situ data!
Supervised learning assumption: training distribution == test distribution
Unsupervised learning assumption: the joint distribution is stationary w.r.t. the online world and the real world
These assumptions are almost never true!

What You Saw Is Not What You Get
The models fail due to domain shift: e.g., accuracy drops from 54% (SVM) and 61% (NBNN) on the original domain to 20% (SVM) and 19% (NBNN) after the shift.

Examples of visual domain shifts
Close-up vs. far-away; amazon.com vs. consumer images; Flickr vs. CCTV; digital SLR vs. webcam

Examples of domain shift: change in camera, feature type, dimension
Digital SLR vs. webcam; SURF vector-quantized to 300 visual words vs. SIFT vector-quantized to 1000 visual words, i.e., features of different dimensions

Solutions?
– Do nothing (poor performance)
– Collect all types of data (impossible)
– Find out what changed (impractical)
– Learn what changed

Prior Work on Domain Adaptation
– Pre-process the data [Daumé 07]: replicate features to also create source- and target-specific versions; re-train the learner on the new features
– SVM-based methods [Yang 07], [Jiang 08], [Duan 09], [Duan 10]: adapt SVM parameters
– Kernel mean matching [Gretton 09]: re-weight the training data to match the test data distribution

Our paradigm: transform-based domain adaptation
Drawbacks of previous methods: they cannot transfer the learned shift to new categories, and they cannot handle new features. We can do both by learning a domain transformation* W (example: mapping between the green and blue domains).
* Saenko, Kulis, Fritz, and Darrell. Adapting visual category models to new domains. ECCV, 2010

Limitations of symmetric transforms
Saenko et al. (ECCV 2010) used metric learning, which yields symmetric transforms over the same features. The symmetric assumption fails! How do we learn more general shifts?

Latest approach*: asymmetric transforms
For an asymmetric transform such as a rotation, the metric learning model is no longer applicable. We propose to learn asymmetric transforms that
– map from target to source
– handle different dimensions
* Kulis, Saenko, and Darrell. What You Saw Is Not What You Get: Domain Adaptation Using Asymmetric Kernel Transforms. CVPR 2011

Model Details
Learn a linear transformation W to map points from one domain to the other. With X the matrix of source points and Y the matrix of target points, W relates the two domains through inner products x^T W y.
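
To make the asymmetry concrete, here is a minimal numpy sketch; the dimensions and random data are illustrative only, echoing the different-dimension example above.

import numpy as np

rng = np.random.default_rng(0)
d_src, d_tgt = 1000, 300                 # illustrative: domains with different dimensions

W = rng.standard_normal((d_src, d_tgt))  # asymmetric: W need not be square

x = rng.standard_normal(d_src)           # a source point
y = rng.standard_normal(d_tgt)           # a target point

sim = x @ W @ y                          # the transformed inner product x^T W y

# Equivalently, W maps the target point into the source feature space.
assert np.isclose(sim, x @ (W @ y))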

Loss Functions
Choose a point x from the source and y from the target, and consider the inner product x^T W y: it should be large for similar objects and small for dissimilar objects.

Loss Functions
The input to the problem includes a collection of m loss functions c_1, …, c_m. General assumption: the loss functions depend on the data only through the inner product matrix X^T W Y.

Regularized Objective Function
Minimize a linear combination of the sum of the loss functions and a regularizer r:
min_W r(W) + λ Σ_{i=1..m} c_i(X^T W Y)
We use the squared Frobenius norm as the regularizer, but the framework is not restricted to this choice.
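
A minimal sketch of optimizing this objective with plain gradient descent, assuming squared-hinge losses on pairwise similarity constraints; the loss choice, thresholds, and step size are illustrative assumptions, not prescribed by the slides.

import numpy as np

def learn_transform(X, Y, pairs, lam=1.0, u=1.0, l=-1.0, lr=1e-3, iters=500):
    """Minimize ||W||_F^2 + lam * sum over constrained pairs of c(x_i^T W y_j).

    X: (n_src, d_src) source points; Y: (n_tgt, d_tgt) target points.
    pairs: list of (i, j, same) constraints, `same` being True for same-class pairs.
    Squared-hinge losses: similar pairs should score above u, dissimilar below l.
    """
    W = np.zeros((X.shape[1], Y.shape[1]))
    for _ in range(iters):
        grad = 2.0 * W                       # gradient of the Frobenius regularizer
        for i, j, same in pairs:
            s = X[i] @ W @ Y[j]
            if same and s < u:               # similar pair scoring too low
                grad += lam * 2.0 * (s - u) * np.outer(X[i], Y[j])
            elif not same and s > l:         # dissimilar pair scoring too high
                grad += lam * 2.0 * (s - l) * np.outer(X[i], Y[j])
        W -= lr * grad
    return W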

The Model Has Drawbacks
A linear transformation may be insufficient, and the cost of optimization grows as the product of the dimensionalities of the source and target data. What to do?

Kernelization
Main idea: run in kernel space
– Use a non-linear kernel function (e.g., an RBF kernel) to learn non-linear transformations in the input space
– The resulting optimization is independent of the input dimensionality
– One additional assumption is necessary: the regularizer must be a spectral function

Kernelization
The derivation relates the original transformation learning problem, via the kernel matrices for the source and target, to a new kernel problem, and establishes the relationship between the original and new problems at optimality.
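
For concreteness, a LaTeX sketch of this derivation as I read it from Kulis et al. (CVPR 2011); the exact form of the kernelized regularizer should be checked against the paper.

% Original transformation learning problem:
\min_{W}\; r(W) + \lambda \sum_{i=1}^{m} c_i\big(X^\top W Y\big)

% With kernel matrices K_X = X^\top X and K_Y = Y^\top Y, pose an
% equivalent problem over a coefficient matrix L instead of W:
\min_{L}\; r\big(K_X^{1/2} L K_Y^{1/2}\big) + \lambda \sum_{i=1}^{m} c_i\big(K_X L K_Y\big)

% Relationship at optimality: W^* = X L^* Y^\top, so the learned similarity
% of any pair requires only kernel evaluations:
% x^\top W^* y = k_X(x)^\top L^* k_Y(y).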

Summary of approach
1. Collect multi-domain data
2. Generate constraints, learn W
3. Map points via W
4. Apply to new categories: classify a test point from the target via its transformed similarity to source points

Multi-domain dataset

Experimental Setup
We used a standard bag-of-words model, and also used different features in the target domain:
– SURF vs. SIFT
– different visual word dictionaries
The baseline for comparing such data: KCCA.
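
As background, a minimal sketch of a generic bag-of-words pipeline of this kind, assuming precomputed local descriptors (SURF or SIFT) per image; the authors' exact pipeline may differ.

import numpy as np
from sklearn.cluster import KMeans

def bow_histograms(descriptors_per_image, vocab_size):
    """Vector-quantize local descriptors into normalized bag-of-words histograms.

    descriptors_per_image: list of (n_i, d) arrays, one per image.
    vocab_size: dictionary size, e.g., 300 or 1000 as on the earlier slide.
    """
    all_desc = np.vstack(descriptors_per_image)
    km = KMeans(n_clusters=vocab_size, n_init=4, random_state=0).fit(all_desc)
    hists = []
    for desc in descriptors_per_image:
        words = km.predict(desc)                               # assign visual words
        h = np.bincount(words, minlength=vocab_size).astype(float)
        hists.append(h / h.sum())                              # L1-normalize
    return np.array(hists)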

Same-Category Results
(Plot of classification accuracy: our method vs. the baselines kNN, SVM, and metric learning, which are explained in the paper.)

Novel-class experiments
These test the method's ability to transfer the domain shift to unseen classes: train the transform on half of the classes, test on the other half. (Plot comparing the linear and kernelized versions of our method against the baselines.)

Extreme shift example
(Figure: a query image from the target domain; its nearest neighbors in the source found using the learned transformation, and those found using KCCA+kNN.)

Conclusion
We should not rely on hand-engineered features any more than we rely on hand-engineered models! Instead, learn the feature transformation across domains. We developed a domain adaptation method based on regularized non-linear transforms:
– The asymmetric transform achieves the best results on the more extreme shifts
– Saenko et al., ECCV 2010, and Kulis et al., CVPR 2011; journal version forthcoming