Multiple Parameter Selection of Support Vector Machine
Hung-Yi Lo
Spoken Language Group, Chinese Information Processing Lab., Institute of Information Science, Academia Sinica, Taipei, Taiwan

2007/07/11 2 Outline
- Phonetic Boundary Refinement Using Support Vector Machine (ICASSP'07, ICSLP'07)
- Automatic Model Selection for Support Vector Machine (Distance Metric Learning for Support Vector Machine)

2007/07/11 3 Automatic Model Selection for Support Vector Machine (Distance Metric Learning for Support Vector Machine)

2007/07/11 4 Automatic Model Selection for SVM
Choosing a good parameter or model setting for better generalization ability is the so-called model selection problem. We have two parameters in the support vector machine:
- regularization variable C
- Gaussian kernel width parameter γ
Support vector machine formulation (QP):
  \min_{w, b, \xi} \; \tfrac{1}{2}\|w\|^2 + C \sum_i \xi_i
  \text{s.t. } y_i (w^T \phi(x_i) + b) \ge 1 - \xi_i, \quad \xi_i \ge 0
Gaussian kernel:
  K(x_i, x_j) = \exp(-\gamma \|x_i - x_j\|^2)
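For concreteness, a minimal Python sketch (using scikit-learn, whose SVC solves this QP) showing where the two hyperparameters enter; the data set and parameter values are purely illustrative.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Illustrative binary classification data; any data set would do.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# C is the regularization variable, gamma the Gaussian (RBF) kernel width.
clf = SVC(kernel="rbf", C=1.0, gamma=0.1)
clf.fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))
```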

2007/07/11 5 Automatic Model Selection for SVM
C.-M. Huang, Y.-J. Lee, Dennis K. J. Lin and S.-Y. Huang, "Model Selection for Support Vector Machines via Uniform Design," special issue on Machine Learning and Robust Data Mining, Computational Statistics and Data Analysis (to appear).

2007/07/11 6 Automatic Model Selection for SVM
Strengths:
- Automates the training process of SVM; nearly no human effort is needed.
- The objective of the model selection procedure is directly related to testing performance. In my experimental experience, testing accuracy has always been better than the results of human tuning.
- The nested uniform-design-based method is much faster than exhaustive grid search.
Weaknesses:
- No closed-form solution; an experimental search is required.
- Time consuming.
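As a point of reference, a minimal sketch of the exhaustive cross-validated grid search that the nested uniform-design method is compared against (not the uniform-design algorithm itself); the grid is illustrative.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Exhaustive search over a log-spaced (C, gamma) grid, scored by 5-fold CV.
param_grid = {"C": np.logspace(-2, 4, 7), "gamma": np.logspace(-4, 1, 6)}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X, y)
print("best parameters:", search.best_params_)
print("cross-validation accuracy:", search.best_score_)
```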

2007/07/11 7 Distance Metric Learning
L. Yang, "Distance Metric Learning: A Comprehensive Survey," Ph.D. survey.
Much work has been done on learning a quadratic (Mahalanobis) distance measure
  d(x_i, x_j) = \sqrt{(x_i - x_j)^T Q (x_i - x_j)},
where x_i is the input vector of the i-th training case and Q is a symmetric, positive semi-definite matrix. Writing Q = A^T A, distance metric learning is equivalent to a linear feature transformation:
  d(x_i, x_j) = \|A x_i - A x_j\|.
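A small numpy sketch of this equivalence, with a randomly generated positive semi-definite Q standing in for a learned metric:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 5))
Q = A.T @ A                       # any matrix of this form is symmetric PSD
x_i, x_j = rng.normal(size=5), rng.normal(size=5)

# Mahalanobis distance under Q ...
d_metric = np.sqrt((x_i - x_j) @ Q @ (x_i - x_j))
# ... equals the Euclidean distance after transforming the features by A.
d_transform = np.linalg.norm(A @ x_i - A @ x_j)
print(d_metric, d_transform)      # the two values agree (up to rounding)
```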

2007/07/11 8
Supervised Distance Metric Learning
- Local: Local Adaptive Distance Metric Learning, Neighborhood Components Analysis, Relevant Component Analysis
- Global: Distance Metric Learning by Convex Programming
Unsupervised Distance Metric Learning
- Linear embedding: PCA, MDS
- Nonlinear embedding: LLE, ISOMAP, Laplacian Eigenmaps
Distance Metric Learning based on SVM
- Large Margin Nearest Neighbor Based Distance Metric Learning
- Cast Kernel Margin Maximization into an SDP problem
Kernel Methods for Distance Metrics Learning
- Kernel Alignment with SDP
- Learning with Idealized Kernel

2007/07/11 9 Distance Metric Learning
Strength:
- Usually has a closed-form solution.
Weakness:
- The objective of distance metric learning is based on some data-distribution criterion, not on the evaluation performance.

2007/07/11 10 Automatic Multiple Parameter Selection for SVM
Gaussian kernel with one width per feature dimension:
  K(x_i, x_j) = \exp\!\left(-\sum_k \gamma_k (x_{ik} - x_{jk})^2\right)
Traditionally, each dimension of the feature vector is normalized to zero mean and unit standard deviation, so every dimension contributes equally to the kernel. However, some features should be more important than others. Learning a separate width \gamma_k per feature is equivalent to diagonal distance metric learning:
  K(x_i, x_j) = \exp\!\left(-(x_i - x_j)^T Q (x_i - x_j)\right), \quad Q = \mathrm{diag}(\gamma_1, \ldots, \gamma_d).
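A small numpy sketch of this equivalence: rescaling feature k by \sqrt{\gamma_k} and applying a unit-width Gaussian kernel reproduces the multi-parameter kernel (the widths below are illustrative).

```python
import numpy as np

rng = np.random.default_rng(0)
x_i, x_j = rng.normal(size=4), rng.normal(size=4)
gamma = np.array([0.5, 2.0, 0.1, 1.0])    # illustrative per-feature widths

# Kernel with one width per feature ...
k_multi = np.exp(-np.sum(gamma * (x_i - x_j) ** 2))
# ... equals a unit-width Gaussian kernel after the diagonal transform sqrt(gamma).
z_i, z_j = np.sqrt(gamma) * x_i, np.sqrt(gamma) * x_j
k_diag = np.exp(-np.sum((z_i - z_j) ** 2))
print(k_multi, k_diag)                    # identical values
```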

2007/07/11 11 Automatic Multiple Parameter Selection for SVM
I would like to approach this task by experimental search, incorporating data-distribution criteria as heuristics (see the sketch below).
- Much more time consuming; might only be applicable to small data sets.
Feature selection is a similar task that can also be solved by experimental search, where each diagonal entry of the matrix is constrained to be zero or one.
- Applicable to large data sets.
- But there are already many publications on it.
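To make the plan concrete, a hedged sketch of one possible experimental search over per-feature widths, scored by cross-validation; this illustrates only the search setup, not the proposed method, and the search budget and parameter ranges are assumptions.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=200, n_features=10, random_state=0)
rng = np.random.default_rng(0)

best_score, best_gamma = -np.inf, None
for _ in range(30):                                      # illustrative search budget
    gamma = 10 ** rng.uniform(-3, 1, size=X.shape[1])    # candidate per-feature widths
    Xs = X * np.sqrt(gamma)                              # diagonal metric applied as feature scaling
    score = cross_val_score(SVC(kernel="rbf", gamma=1.0, C=1.0), Xs, y, cv=5).mean()
    if score > best_score:
        best_score, best_gamma = score, gamma
print("best CV accuracy:", best_score)
```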

2007/07/11 12 Thank you!