1 NHDC and PHDC: Local and Global Heat Diffusion Based Classifiers
Haixuan Yang
Group Meeting, Sep 26, 2005

2 Outline
- Introduction
- Graph Heat Diffusion Model
- NHDC and PHDC algorithms
- Connections with other models
- Experiments
- Conclusions and future work

3 Introduction
- Kondor & Lafferty (NIPS 2002)
  - Construct a diffusion kernel on a graph
  - Handle discrete attributes
  - Apply to a large margin classifier
  - Achieve good performance in accuracy on 5 data sets from UCI
- Lafferty & Lebanon (JMLR 2005)
  - Construct a diffusion kernel on a special manifold
  - Handle continuous attributes
  - Restrict to text classification
  - Apply to SVM
  - Achieve good performance in accuracy on WebKB and Reuters
- Belkin & Niyogi (Neural Computation 2003)
  - Reduce dimension by heat kernel and local distance
- Tenenbaum et al. (Science 2000)
  - Reduce dimension by local distance

4 Introduction
We inherit the ideas:
- Local information is relatively accurate in a nonlinear manifold.
- The way heat diffuses on a manifold is related to the density of the data on the manifold: a point where heat diffuses rapidly is one that has high density.
- For example, in the ideal case when the manifold is the Euclidean space R^d, heat diffuses in the same way as the Gaussian density: K_t(x, y) = (4πt)^{-d/2} e^{-||x-y||^2/(4t)}.
- The way heat diffuses on a manifold can therefore be understood as a generalization of the Gaussian density from Euclidean space to a manifold.
- Learn local information by k nearest neighbors.
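For reference, the Euclidean case behind this bullet can be stated precisely. The following is the standard heat equation and its Gaussian solution, our restatement rather than a formula taken from the slides:

```latex
% Heat equation on R^d with initial heat distribution f_0:
%   \partial f / \partial t = \Delta f,   f(x, 0) = f_0(x)
% Its solution convolves f_0 with the Gaussian heat kernel:
\[
f(x,t) \;=\; \int_{\mathbb{R}^d} (4\pi t)^{-d/2}
  \exp\!\Big(-\tfrac{\lVert x-y\rVert^2}{4t}\Big)\, f_0(y)\, dy ,
\]
% i.e., a unit of heat placed at y spreads as a Gaussian centered at y with
% variance 2t, which is exactly the density analogy the slide draws.
```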

5 Introduction
We think differently:
- The manifold is unknown in most cases.
- Even for a known manifold, the solution is unknown in general.
- The explicit form of the approximation to the solution in (Lafferty & Lebanon, JMLR 2005) is a rare case.
- So we establish the heat diffusion equation directly on a graph formed by K nearest neighbors.
- The solution always has an explicit form, in any case.
- The solution directly forms a classifier.

6 Illustration
[Figure: two snapshots of heat diffusion on example data; panels titled "The first heat diffusion" and "The second heat diffusion"]

7 Illustration
[Figure]

8 [Figure]

9 [Figure: heat received from the A class vs. the B class at two test points, compared with an SVM decision boundary; one legible reading: heat received from the B class: 0.08]

10 Graph Heat Diffusion Model
Given a directed weighted graph G = (V, E, W), where V = {1, 2, …, n}, E = {(i, j) : there is an edge from i to j}, and W = (w(i, j)) is the weight matrix.
- The edge (i, j) is imagined as a pipe that connects i and j; w(i, j) is the pipe length.
- Let f(i, t) be the heat at node i at time t.
- At time t, node i receives M(i, j, t, dt) amount of heat from its neighbor j during a period dt.

11 Graph Heat Diffusion Model
- Suppose that M(i, j, t, dt) is proportional to the time period dt.
- Suppose that M(i, j, t, dt) is proportional to the heat difference f(j, t) - f(i, t).
- Moreover, the heat flows from j to i through the pipe, and therefore the heat diffuses in the pipe in the same way as it does in Euclidean space, as described before: the pipe length d(i, j) enters through the Gaussian factor e^{-d(i,j)^2/β}.

12 Graph Heat Diffusion Model
The heat difference between f(i, t+dt) and f(i, t) can be expressed as:
  f(i, t+dt) - f(i, t) = γ dt Σ_{j:(i,j)∈E} e^{-d(i,j)^2/β} ( f(j, t) - f(i, t) ).
It can be expressed in matrix form:
  f(t+dt) = (I + γ H dt) f(t),
where H(i, j) = e^{-d(i,j)^2/β} if (i, j) ∈ E, H(i, i) = -Σ_{j:(i,j)∈E} e^{-d(i,j)^2/β}, and H(i, j) = 0 otherwise.
Letting dt tend to zero, the above equation becomes:
  df(t)/dt = γ H f(t), whose solution is f(t) = e^{γtH} f(0).
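To make the closed form concrete, here is a minimal sketch that builds H from pairwise distances and evaluates f(t) = e^{γtH} f(0) with a matrix exponential. The function and parameter names (build_H, heat_at_time_t, edges, beta, gamma) are our own illustrative choices, not from the slides:

```python
import numpy as np
from scipy.linalg import expm

def build_H(X, edges, beta):
    """H(i, j) = exp(-d(i, j)^2 / beta) for each directed edge (i, j),
    meaning node i can receive heat from node j;
    H(i, i) = -(sum of row i's off-diagonal entries)."""
    n = len(X)
    H = np.zeros((n, n))
    for i, j in edges:
        H[i, j] = np.exp(-np.sum((X[i] - X[j]) ** 2) / beta)
    H[np.diag_indices(n)] = -H.sum(axis=1)   # rows sum to zero: heat is conserved
    return H

def heat_at_time_t(H, f0, gamma, t=1.0):
    # Closed-form solution of df/dt = gamma * H * f with initial heat f0.
    return expm(gamma * t * H) @ f0
```

The matrix exponential is exact but dense; for large graphs one would truncate its series, which is effectively what NHDC does with the first-order term I + γH.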

13 NHDC and PHDC algorithm - Step 1
[Construct neighborhood graph]
- Define a graph G over all data points, both in the training data set and in the test data set.
- Add an edge from j to i if j is one of the K nearest neighbors of i.
- Set the edge weight w(i, j) = d(i, j) if j is one of the K nearest neighbors of i, where d(i, j) is the Euclidean distance between point i and point j.
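A possible implementation of this step, using scikit-learn's NearestNeighbors for the search; the helper name knn_edges is ours:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_edges(X, K):
    """Directed edges (i, j) such that j is one of the K nearest neighbors
    of i, so that node i can receive heat from node j."""
    nn = NearestNeighbors(n_neighbors=K + 1).fit(X)  # +1: each point is its own
    _, idx = nn.kneighbors(X)                        # 0-th nearest neighbor
    return [(i, j) for i in range(len(X)) for j in idx[i, 1:]]
```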

14 NHDC and PHDC algorithm - Step 2
[Compute the Heat Kernel]
- Using the equation e^{γH} = I + γH + (γH)^2/2! + (γH)^3/3! + …, compute the heat kernel e^{γH}, with H built from the edge weights of Step 1.

15 NHDC and PHDC algorithm - Step 3
[Compute the Heat Distribution]
- Set f(0): for each class c, the nodes labeled by class c have an initial unit heat at time 0; all other nodes have no heat at time 0.
- In PHDC, use the equation f(1) = e^{γH} f(0) to compute the heat distribution.
- In NHDC, use the equation f(1) = (I + γH) f(0).

16 NHDC and PHDC algorithm - Step 4
[Classify the nodes]
- For each node in the test data set, classify it to the class from which it receives the most heat.
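Putting Steps 2-4 together, a minimal end-to-end sketch under the same assumptions as the earlier snippets. It reuses the hypothetical build_H and knn_edges helpers, and builds the graph over training and test points together, as Step 1 requires:

```python
import numpy as np
from scipy.linalg import expm

def hdc_predict(X, y_train, train_idx, test_idx, K, beta, gamma,
                propagating=True):
    """Steps 2-4: diffuse one unit of heat per class, then vote by heat."""
    H = build_H(X, knn_edges(X, K), beta)        # Steps 1-2 (sketches above)
    y_train = np.asarray(y_train)
    train_idx = np.asarray(train_idx)
    classes = np.unique(y_train)
    f0 = np.zeros((len(X), len(classes)))        # one heat column per class
    for c, label in enumerate(classes):
        f0[train_idx[y_train == label], c] = 1.0 # unit heat on labeled nodes
    # PHDC: f(1) = e^{gamma H} f(0);  NHDC: f(1) = (I + gamma H) f(0)
    kernel = expm(gamma * H) if propagating else np.eye(len(X)) + gamma * H
    heat = kernel @ f0
    return classes[np.argmax(heat[np.asarray(test_idx)], axis=1)]
```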

17 Connections with other models
The Parzen window approach (when the window function takes the normal form) is a special case of NHDC.
- It is a non-parametric method for probability density estimation. The class-conditional density for class k is
  p(x | k) = (1/n_k) Σ_{i ∈ class k} (2πσ^2)^{-d/2} e^{-||x - x_i||^2/(2σ^2)}.
- Assign x to the class whose density value is maximal.

18 Connections with other models
The Parzen window approach (when the window function takes the normal form) is a special case of NHDC.
- In our model, let K = n - 1; then the graph constructed in Step 1 is a complete graph.
- The matrix H then has off-diagonal entries H(p, i) = e^{-d(p,i)^2/β}, so the heat that x_p receives from the data points in class k is proportional to Σ_{i ∈ class k} e^{-d(p,i)^2/β}, which is the normal-window Parzen sum with β playing the role of 2σ^2.
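A quick numeric sanity check of this correspondence; the sample sizes and the value of beta below are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
Xtr = rng.normal(size=(20, 2))        # 20 training points in R^2
x = rng.normal(size=2)                # one test point
beta = 1.0
w = np.exp(-np.sum((Xtr - x) ** 2, axis=1) / beta)  # heat x receives per point
parzen = w / (np.pi * beta)           # normal window, sigma^2 = beta/2, d = 2
# Proportional, hence the same per-class ranking after normalization:
print(np.allclose(w / w.sum(), parzen / parzen.sum()))  # True
```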

19 Connections with other models
KNN is a special case of NHDC.
- For each test point, KNN assigns it to the class that appears most often among its K nearest neighbors.

20 Connections with other models
KNN is a special case of NHDC.
- In our model, let β tend to infinity; then every weight e^{-d(i,j)^2/β} tends to 1, and off the diagonal the matrix H becomes the 0/1 adjacency matrix of the K-nearest-neighbor graph.
- The heat that x_p receives from the data points in class k is then proportional to the number of cases of class k among its K nearest neighbors.
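A tiny illustration of the limit: as β grows, the edge weights flatten toward 1, so received heat reduces to neighbor counting. The squared distances below are made up:

```python
import numpy as np

d2 = np.array([0.3, 1.7, 4.2])        # squared distances to 3 neighbors
for beta in (1.0, 100.0, 1e6):
    print(beta, np.exp(-d2 / beta))   # weights flatten toward [1, 1, 1]
```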

21 Connections with other models
PHDC can approximate NHDC.
- If γ is small, then e^{γH} = I + γH + O(γ^2) ≈ I + γH.
- Since the identity matrix has no effect on the heat distribution, PHDC and NHDC have similar classification accuracy when γ is small.
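This first-order approximation is easy to verify numerically. The H below is a random matrix given the diffusion structure assumed earlier (nonnegative off-diagonal, zero row sums):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
H = rng.random((5, 5))
np.fill_diagonal(H, 0.0)
H[np.diag_indices(5)] = -H.sum(axis=1)  # rows sum to zero: heat is conserved
for gamma in (0.5, 0.05, 0.005):
    err = np.max(np.abs(expm(gamma * H) - (np.eye(5) + gamma * H)))
    print(gamma, err)                   # error shrinks roughly like gamma^2
```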

22 Connections with other models
[Diagram: PHDC generalizes NHDC, which in turn generalizes both KNN and the Parzen window approach (PWA)]

23 Experiments
- Two artificial data sets: Spiral-100 and Spiral-1000.
- Six UCI data sets: Credit-g, Diabetes, Glass, Iris, Sonar, and Vehicle.
- Compare with the Parzen window approach (the window function takes the normal form), KNN, and SVM.
- Each reported result is the average over ten-fold cross validation.
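A sketch of this evaluation protocol, reusing the hypothetical hdc_predict from the earlier snippet; the fold seed and hyperparameter defaults are placeholders:

```python
import numpy as np
from sklearn.model_selection import KFold

def cv_accuracy(X, y, K=10, beta=1.0, gamma=0.1, propagating=True):
    """Average test accuracy over ten folds for PHDC/NHDC."""
    y = np.asarray(y)
    accs = []
    for tr, te in KFold(n_splits=10, shuffle=True, random_state=0).split(X):
        pred = hdc_predict(X, y[tr], tr, te, K, beta, gamma, propagating)
        accs.append(np.mean(pred == y[te]))
    return float(np.mean(accs))
```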

24 Experiments Results
[Table: classification accuracy of NHDC, PHDC, KNN, PWA, and SVM on Spiral-100, Spiral-1000, Credit-g, Diabetes, Glass, Iris, Sonar, and Vehicle; the numeric entries are not recoverable from the transcript]

25 Conclusions and future work
- Avoid the difficulty of finding an explicit expression for an unknown geometry.
- Avoid the difficulty of finding a closed-form heat kernel for some complicated geometries.
- Both NHDC and PHDC perform well in accuracy.
- There is room to develop the model further:
  - The assumption in the local heat diffusion is not fully justified.
  - We are now using a directed graph. Converting it into an undirected graph may be more reasonable, because in reality heat diffuses symmetrically.
  - Apply it to SVM?