1 Further Investigations on Heat Diffusion Models. Haixuan Yang. Supervisors: Prof. Irwin King and Prof. Michael R. Lyu. Term Presentation, 2006.

2 Outline
- Introduction
- Input Improvement: three candidate graphs
- Outside Improvement: DiffusionRank
- Inside Improvement: volume-based heat diffusion model
- Summary

3 Introduction
[Diagram: PHDC (the HDM on graphs) at the center; Input Improvement leads to the candidate graphs, Inside Improvement to the Volume-based HDM, and Outside Improvement to DiffusionRank.]
PHDC: the model proposed last year.

4 PHDC
PHDC is a classifier motivated by:
- Tenenbaum et al. (Science 2000): approximate the manifold by a KNN graph; reduce dimension by shortest paths.
- Kondor & Lafferty (NIPS 2002): construct a diffusion kernel on an undirected graph; apply it to a large margin classifier.
- Belkin & Niyogi (Neural Computation 2003): approximate the manifold by a KNN graph; reduce dimension by heat kernels.
- Lafferty & Kondor (JMLR 2005): construct a diffusion kernel on a special manifold; apply it to SVM.

5 PHDC
Ideas we inherit:
- Local information is relatively accurate in a nonlinear manifold.
- Heat diffusion on a manifold is a generalization of the Gaussian density from Euclidean space to a manifold: in the ideal case where the manifold is Euclidean space, heat diffuses in the same way as the Gaussian density.
Ideas we treat differently:
- We establish the heat diffusion equation on a weighted directed graph; this broader setting enables its application to ranking Web pages.
- We construct a classifier directly from the solution.

6 Heat Diffusion Model in PHDC
- Notations: G is the KNN graph; connect a directed edge (j, i) if j is one of the K nearest neighbors of i.
- Initial condition: for each class k, f(i, 0) is set to 1 if data point i is labeled as k and 0 otherwise.
- Solution: the heat kernel applied to the initial heat distribution (see the sketch below).
- Classifier: assign data point j to class q if j receives the most heat from the data in class q.
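The notation and solution formulas on this slide were images that did not survive the transcript. As a hedged reconstruction, the standard graph heat-diffusion form used throughout this line of work is (the exact normalization of H in PHDC may differ):
\[
\frac{d\,\mathbf{f}(t)}{dt} = \gamma H\,\mathbf{f}(t)
\quad\Longrightarrow\quad
\mathbf{f}(1) = e^{\gamma H}\,\mathbf{f}(0),
\]
where f(t) collects the heat values f(i, t) of all nodes and H is built from the KNN graph, with an entry for each directed edge (j, i) and balancing negative entries on the diagonal.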

7 Input Improvement: Three candidate graphs
- KNN Graph: connect a directed edge from j to i if j is one of the K nearest neighbors of i, measured by the Euclidean distance.
- SKNN Graph: choose the K*n/2 shortest undirected edges, which amounts to K*n directed edges.
- Minimum Spanning Tree: choose the subgraph such that it is a tree connecting all vertices and the sum of its edge weights is minimum among all such trees.
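For illustration, here is a minimal Python sketch of how the three candidate graphs could be constructed with standard libraries (my own sketch; the paper's implementation was in C, and the function and variable names here are assumptions):

import numpy as np
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform

def candidate_graphs(X, K):
    """Build the three candidate input graphs for the n points in X (n x d)."""
    n = X.shape[0]

    # KNN graph: row i lists distances to the K nearest neighbors of i;
    # transpose if the (j, i) edge convention of the slides is needed.
    knn = kneighbors_graph(X, n_neighbors=K, mode='distance')

    # SKNN graph: keep the K*n/2 shortest undirected edges overall.
    D = squareform(pdist(X))                  # dense pairwise distances
    iu = np.triu_indices(n, k=1)              # each undirected edge once
    order = np.argsort(D[iu])[: K * n // 2]   # indices of the shortest edges
    sknn = np.zeros_like(D)
    rows, cols = iu[0][order], iu[1][order]
    sknn[rows, cols] = D[rows, cols]
    sknn[cols, rows] = D[cols, rows]          # symmetric: K*n directed edges

    # Minimum spanning tree over the complete distance graph (n-1 edges).
    mst = minimum_spanning_tree(D)

    return knn, sknn, mst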

8 Input Improvement: Illustration
[Figure: a manifold and the corresponding KNN graph, SKNN graph, and minimum spanning tree.]

9 Input Improvement: Advantages and disadvantages
- KNN Graph: democratic to each node; the resulting classifier is a generalization of KNN; may not be connected; long edges may exist while short edges are removed.
- SKNN Graph: not democratic; short edges are treated as more important than long edges; may not be connected.
- Minimum Spanning Tree: not democratic; long edges may exist while short edges are removed; connection is guaranteed; fewer parameters; faster in training and testing.

10 Experiments: Experimental Setup
- Experimental environment. Hardware: Nix dual Intel Xeon 2.2 GHz; OS: Linux kernel, SMP (RedHat 7.3); developing tool: C.
- Data: 3 artificial datasets and 6 datasets from UCI.
- Comparison algorithms: Parzen window, KNN, SVM, KNN-H, SKNN-H, MST-H.
- Results: average of ten-fold cross validation.

Dataset     Cases  Classes  Variables
Syn1
Syn2
Syn3
Breast-w    683    2        9
Glass       214    6        9
Iono
Iris        150    3        4
Sonar
Vehicle     846    4        18
(Entries left blank are not preserved in the transcript.)

11 Experiments: Results
[Table: ten-fold cross-validation accuracies of SVM, KNN, PWA, KNN-H, MST-H, and SKNN-H on Syn1-Syn3, Breast-w, Glass, Iono, Iris, Sonar, and Vehicle; the numeric entries are not preserved in the transcript.]

12 Conclusions: KNN-H, SKNN-H, and MST-H are candidates for the Heat Diffusion Classifier on a Graph.

13 Application Improvement: PageRank
- PageRank tries to find the importance of a Web page based on the link structure. The importance of a page i is defined recursively in terms of the pages that point to it (see the recursion below).
- Two problems:
  - Incomplete information about the Web structure.
  - Web pages manipulated by people for commercial interests: about 70% of all pages in the .biz domain are spam, and about 35% of the pages in the .us domain belong to the spam category.
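The recursive definition itself is not preserved in the transcript; the standard (damped) PageRank recursion it refers to is
\[
r(i) \;=\; \frac{1-\alpha}{n} \;+\; \alpha \sum_{j:(j,i)\in E} \frac{r(j)}{d^{+}(j)},
\]
where d^{+}(j) is the out-degree of page j, α is the damping factor, and n is the number of pages.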

14 Why is PageRank susceptible to web spam?
Two reasons:
- Over-democratic: all pages are born equal, with equal voting ability per page (the sum of each column is equal to one).
- Input-independent: for any given non-zero initial input, the iteration converges to the same stable distribution.
The Heat Diffusion Model is a natural way to avoid these two weaknesses of PageRank:
- Points are not equal: some points are born with high temperatures while others are born with low temperatures.
- Different initial temperature distributions give rise to different temperature distributions after a fixed time period.

15 DiffusionRank
- On an undirected graph. Assumption: the amount of heat that flows from j to i is proportional to the heat difference between i and j. Solution: see the sketch below.
- On a directed graph. Assumption: extra energy is imposed on the link (j, i), so that heat flows only from j to i when there is no link (i, j).
- On a random directed graph. Assumption: the heat flow is proportional to the probability of the link (j, i).
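The three solution formulas on this slide did not survive the transcript. For the undirected case, the stated assumption yields the standard form (a hedged reconstruction; the directed and random-graph cases modify the entries of H):
\[
\frac{d f_i(t)}{dt} = \gamma \sum_{j:(i,j)\in E} \big(f_j(t) - f_i(t)\big)
\quad\Longrightarrow\quad
\mathbf{f}(t) = e^{\gamma t H}\,\mathbf{f}(0),
\]
where H = A - D, the adjacency matrix minus the degree matrix.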

16 DiffusionRank: On a random directed graph
The initial value f(i, 0) in f(0) is set to 1 if page i is trusted and 0 otherwise, with trusted pages selected according to the inverse PageRank.

17 Computation consideration
- Approximation of the heat kernel: e^{γH} ≈ (I + (γ/N)H)^N, which tends to e^{γH} as N tends to infinity.
- How large should N be? When N >= 30, the real eigenvalues of the approximation error are less than 0.01; when N >= 100 they are smaller still. We use N = 100 in the paper.
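A minimal Python sketch of this iterative approximation (my own illustration; H is assumed to be a NumPy array or SciPy sparse matrix and f0 the initial heat vector):

import numpy as np

def diffusion_rank(H, f0, gamma=1.0, N=100):
    """Approximate f(1) = exp(gamma * H) @ f0 by the product form
    (I + gamma/N * H)^N @ f0, computed as N matrix-vector products."""
    f = np.asarray(f0, dtype=float)
    for _ in range(N):
        f = f + (gamma / N) * (H @ f)  # one application of (I + gamma/N * H)
    return f

Each step is one (sparse) matrix-vector product, so the whole computation costs N such products, roughly the price of running N PageRank-style iterations.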

18 Discussing γ
- γ can be understood as the thermal conductivity.
- When γ = 0, the ranking value is most robust to manipulation, since no heat is diffused, but the Web structure is completely ignored.
- When γ = ∞, DiffusionRank becomes PageRank and can be manipulated easily.
- When γ = 1, DiffusionRank works well in practice.

19 DiffusionRank: Advantages
- Can detect group-group relations.
- Can cut graphs.
- Anti-manipulation.
[Figure: illustration with γ = 0.5 or 1.]

20 DiffusionRank: Experiments
Data:
- a toy graph (6 nodes);
- a middle-size real-world graph ( nodes);
- a large-size real-world graph crawled from CUHK ( nodes).
Compared with TrustRank and PageRank.

21 Results: the tendency of DiffusionRank as γ becomes larger, shown on the toy graph.

22 Anti-manipulation on the toy graph.

23 Anti-manipulation on the middle graph and the large graph

24 Stability: the difference in ranking order produced by an algorithm before manipulation and after it.

25 Conclusions
- The anti-manipulation feature makes DiffusionRank a candidate penicillin for Web spamming.
- DiffusionRank is a generalization of PageRank (when γ = ∞).
- DiffusionRank can be employed to detect group-group relations.
- DiffusionRank can be used to cut graphs.

26 Inside Improvement: Motivations
The Finite Difference (FD) method is a possible way to solve the heat diffusion equation, via:
- the discretization of time;
- the discretization of space and time (see the standard forms below).
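The discretization formulas on the slide are not preserved; for the one-dimensional heat equation ∂f/∂t = α ∂²f/∂x², the standard forms they refer to are:
\[
f(x, t+\Delta t) \approx f(x, t) + \Delta t\,\alpha\,\frac{\partial^2 f}{\partial x^2}(x, t)
\qquad\text{(time only)},
\]
\[
\frac{f_i^{m+1} - f_i^{m}}{\Delta t} \;=\; \alpha\,\frac{f_{i+1}^{m} - 2 f_i^{m} + f_{i-1}^{m}}{(\Delta x)^2}
\qquad\text{(space and time)}.
\]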

27 Motivation: Problems that prevent us from employing FD directly in real data analysis
- The constructed graph is irregular.
- The density of the data varies, which also results in an irregular graph.
- The manifold is unknown.
- The differential equation expression is unknown even if the manifold is known.

28 Intuition

29 Volume-based Heat Diffusion Model
Assumptions:
- There is a small patch SP[j] of space containing node j.
- The volume of the small patch SP[j] is V(j), and the heat diffusion ability of the small patch is proportional to its volume.
- The temperature in the small patch SP[j] at time t is almost equal to f(j, t), because every unseen node in the small patch is near node j.
Solution: see the hedged sketch below.
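The solution formula is not preserved. A plausible reconstruction consistent with the stated assumptions (my own hedged sketch, not necessarily the paper's exact formula) weights each neighbor's contribution by its patch volume:
\[
\frac{d f(i,t)}{dt} = \gamma \sum_{j:(j,i)\in E} V(j)\,\big(f(j,t) - f(i,t)\big)
\quad\Longrightarrow\quad
\mathbf{f}(t) = e^{\gamma t H_V}\,\mathbf{f}(0),
\]
where H_V carries V(j) on each edge (j, i) and the matching negative sums on its diagonal.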

30 Volume Computation
- Define V(i) to be the volume of the hypercube whose side length is the average distance between node i and its neighbors, i.e., V(i) = (average neighbor distance)^v, where v plays the role of the dimension.
- v is found by a maximum likelihood estimation.
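A minimal Python sketch of this computation (my own illustration; the helper name and the exponent parameter v are assumptions, with v itself estimated by maximum likelihood as the slide states):

import numpy as np
from sklearn.neighbors import NearestNeighbors

def node_volumes(X, K, v):
    """V(i) = (average distance from node i to its K nearest neighbors) ** v,
    the volume of a hypercube whose side is the average neighbor distance."""
    nn = NearestNeighbors(n_neighbors=K + 1).fit(X)
    dist, _ = nn.kneighbors(X)          # dist[:, 0] is each point to itself
    side = dist[:, 1:].mean(axis=1)     # average distance to the K neighbors
    return side ** v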

31 Experiments: Abbreviations
- K: KNN
- P: Parzen window
- U: UniverSvm
- L: LightSVM
- C: consistency method
- VHD-v: VHD with the best v
- VHD: v is found by the estimation
- HD: without volume consideration
- C1: 1st variation of C
- C2: 2nd variation of C

32 Conclusions: The proposed VHDM has the following advantages:
- It can model the effect of unseen points by introducing the volume of a node.
- It avoids the difficulty of finding an explicit expression for the unknown geometry, by approximating the manifold with a finite neighborhood graph.
- It has a closed-form solution that describes the heat diffusion on a manifold.
- VHDC is a generalization of both the Parzen Window Approach (when the window function is a multivariate normal kernel) and KNN.

33 Summary
- The input improvement of PHDC provides us with more choices for the input graphs.
- The outside improvement provides a possible penicillin for Web spamming, and a potentially useful tool for group-group discovery and graph cut.
- The inside improvement gives us a promising classifier.