Total Variation and Euler's Elastica for Supervised Learning

Presentation transcript:

Total Variation and Euler's Elastica for Supervised Learning
Tong Lin, Hanlin Xue, Ling Wang, Hongbin Zha
Key Lab of Machine Perception, School of EECS, Peking University, China
Contact: tonglin123@gmail.com
2012-6-29

Background
Supervised learning: predict u: x -> y from training data (x1, y1), …, (xN, yN). Two tasks: classification and regression.
Prior work:
- SVM, with the hinge loss max(0, 1 - y·u(x))
- RLS: Regularized Least Squares (Rifkin, 2002), with the squared loss (y - u(x))²
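For reference, a minimal NumPy sketch of the two loss functions above (our own illustration, not code from the talk):

```python
import numpy as np

def hinge_loss(y, u):
    """SVM hinge loss max(0, 1 - y*u), with labels y in {-1, +1}."""
    return np.maximum(0.0, 1.0 - y * u)

def squared_loss(y, u):
    """RLS squared loss (y - u)^2."""
    return (y - u) ** 2

# A point with true label +1 and predicted score 0.3:
print(hinge_loss(1.0, 0.3))    # 0.7 -- inside the margin, so penalized
print(squared_loss(1.0, 0.3))  # 0.49
```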

Background
Prior work (cont.):
- Laplacian energy: "Manifold Regularization: A Geometric Framework for Learning from Labeled and Unlabeled Examples," Belkin et al., JMLR 7:2399-2434, 2006
- Hessian energy: "Semi-supervised Regression using Hessian Energy with an Application to Semi-supervised Dimensionality Reduction," K.I. Kim, F. Steinke, M. Hein, NIPS 2009
- GLS: "Classification using Geometric Level Sets," Varshney & Willsky, JMLR 11:491-516, 2010
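For intuition, the Laplacian energy in its graph form is fᵀLf = ½ Σᵢⱼ Wᵢⱼ (fᵢ - fⱼ)². A minimal NumPy sketch of this quantity, assuming a dense heat-kernel weight matrix (our own illustration, not the authors' code):

```python
import numpy as np

def laplacian_energy(X, f, sigma=1.0):
    """Graph Laplacian energy f^T L f with heat-kernel weights
    W_ij = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)  # pairwise squared distances
    W = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(axis=1)) - W  # unnormalized graph Laplacian
    return f @ L @ f

X = np.random.randn(20, 2)  # 20 points in R^2
f = np.random.randn(20)     # candidate label-function values
print(laplacian_energy(X, f))
```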

Motivation
[Figure: decision boundaries produced by SVM vs. our proposed method]

[Figure: 3D display of the output classification function u(x) learned by the proposed EE model]
Large margin should not be the sole criterion; we argue that sharper edges and smoother boundaries can also play significant roles.

Models
General form: minimize E(u) = Σᵢ (u(xᵢ) - yᵢ)² + λ S(u), where S(u) is the regularizer:
- Laplacian Regularization (LR): S(u) = ∫ |∇u|² dx
- Total Variation (TV): S(u) = ∫ |∇u| dx
- Euler's Elastica (EE): S(u) = ∫ (a + b κ²) |∇u| dx, with curvature κ = div(∇u/|∇u|)
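To make the three regularizers concrete, a minimal finite-difference sketch on a 2D grid (our own illustration; a, b, and the smoothing constant eps are assumed hyperparameters, and the paper evaluates these energies on the learned function rather than a fixed grid):

```python
import numpy as np

def energies(u, a=1.0, b=1.0, eps=1e-8):
    """Discrete LR, TV, and EE energies of a 2D array u."""
    ux, uy = np.gradient(u)                     # approximate partial derivatives
    grad_norm = np.sqrt(ux**2 + uy**2 + eps)    # eps avoids division by zero
    lr = np.sum(ux**2 + uy**2)                  # LR: sum |grad u|^2
    tv = np.sum(grad_norm)                      # TV: sum |grad u|
    # Curvature kappa = div(grad u / |grad u|)
    kx, _ = np.gradient(ux / grad_norm)
    _, ky = np.gradient(uy / grad_norm)
    kappa = kx + ky
    ee = np.sum((a + b * kappa**2) * grad_norm) # EE: sum (a + b*kappa^2)|grad u|
    return lr, tv, ee

x, y = np.meshgrid(np.linspace(-1, 1, 64), np.linspace(-1, 1, 64))
u = np.tanh(5 * (x**2 + y**2 - 0.5))  # a smooth circular "decision function"
print(energies(u))
```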

TV & EE in Image Processing
- TV measures the total quantity of value change; used for image denoising (Rudin, Osher, Fatemi, 1992)
- The elastica was introduced by Euler in 1744 to model torsion-free elastic rods; used for image inpainting (Chan et al., 2002)

TV can preserve sharp edges, while EE can produce smooth boundaries. For details, see T. Chan & J. Shen's textbook: Image Processing and Analysis: Variational, PDE, Wavelet, and Stochastic Methods, SIAM, 2005.

Decision boundary
The decision boundary is the zero level set of u. In d-dimensional space the mean curvature κ of a level set keeps the same expression up to the constant 1/(d-1): κ = (1/(d-1)) · div(∇u/|∇u|).

Framework

Energy Functional Minimization
The calculus of variations turns minimizing the energy functional into solving its Euler-Lagrange PDE.

Solutions
a. Laplacian Regularization (LR): Radial Basis Function approximation
b. TV & EE: we develop two solutions (a GD sketch follows below):
- Gradient descent time marching (GD)
- Lagged linear equation iteration (LagLE)
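A minimal sketch of the gradient-descent time-marching idea for the TV model on a 2D grid (our own illustration, with assumed step size tau, fidelity weight lam, and smoothing eps; the paper applies the analogous marching to the learned classifier u, not to a fixed grid):

```python
import numpy as np

def tv_gradient_descent(u0, f, lam=0.1, tau=0.1, eps=1e-8, n_iter=200):
    """Time-march u_t = div(grad u / |grad u|) - lam*(u - f),
    the gradient flow of lam/2 * sum (u-f)^2 + TV(u)."""
    u = u0.copy()
    for _ in range(n_iter):
        ux, uy = np.gradient(u)
        norm = np.sqrt(ux**2 + uy**2 + eps)
        div_x, _ = np.gradient(ux / norm)
        _, div_y = np.gradient(uy / norm)
        curvature = div_x + div_y              # div(grad u / |grad u|)
        u = u + tau * (curvature - lam * (u - f))
    return u

f = np.zeros((32, 32)); f[:, 16:] = 1.0
f += 0.3 * np.random.randn(32, 32)  # noisy step edge
u = tv_gradient_descent(f, f)       # TV smoothing preserves the sharp edge
```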

Experiments: Two-Moon Data
[Figure: classification results of SVM and EE on the two-moon data]
Both methods can achieve 100% accuracy with different parameter combinations.

Experiments: Binary Classification

Experiments: Multi-class Classification

Experiments: Multi-class Classification
Note: results of TV and EE are computed with the LagLE method.

Experiments: Regression

Conclusions
Contributions:
- Introduce TV & EE to the ML community
- Demonstrate the significance of curvature and gradient empirically
- Achieve superior performance on classification and regression
Future work:
- Hinge loss
- Other basis functions
- Extension to the semi-supervised setting
- Existence and uniqueness of the PDE solutions
- Fast algorithms to reduce running time
End, thank you!