Learning using Graph Mincuts
Shuchi Chawla, Carnegie Mellon University
1/11/2003
Learning from Labeled and Unlabeled Data
- Unlabeled data is cheap and available in large amounts
- It gives information about the distribution of examples
- Useful when combined with a prior
- Our prior: "close" examples have a similar classification
Classification using Graph Mincut
- Suppose the quality of a classification is defined by pairwise relationships between examples: if two examples are similar but classified differently, we incur a penalty (e.g., Markov Random Fields)
- A graph mincut minimizes this penalty
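The penalty-minimizing cut can be computed as an s-t minimum cut: tie positively labeled examples to a source, negatively labeled ones to a sink with infinite-capacity edges, and separate the two sides as cheaply as possible. A minimal self-contained sketch (the function name, edge-list format, and the Edmonds-Karp max-flow routine are illustrative choices, not the talk's implementation):

```python
from collections import defaultdict, deque

def min_cut_classify(n, edges, pos, neg):
    """Label nodes 0..n-1 via an s-t minimum cut.

    edges: undirected (u, v, weight) similarity edges.
    pos/neg: sets of labeled seed nodes (hypothetical interface).
    """
    INF = float('inf')
    cap = defaultdict(lambda: defaultdict(float))
    for u, v, w in edges:          # undirected edge = capacity both ways
        cap[u][v] += w
        cap[v][u] += w
    s, t = n, n + 1                # auxiliary source and sink
    for u in pos:
        cap[s][u] = INF            # positive seeds glued to the source
    for u in neg:
        cap[u][t] = INF            # negative seeds glued to the sink

    def bfs():                     # shortest augmenting path (Edmonds-Karp)
        parent = {s: None}
        q = deque([s])
        while q:
            u = q.popleft()
            for v, c in cap[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    if v == t:
                        return parent
                    q.append(v)
        return None

    while (parent := bfs()):
        v, f = t, INF              # find the bottleneck capacity
        while parent[v] is not None:
            f = min(f, cap[parent[v]][v])
            v = parent[v]
        v = t                      # push flow along the path
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= f
            cap[v][u] += f
            v = u

    seen = {s}                     # source side of the residual graph = '+'
    q = deque([s])
    while q:
        u = q.popleft()
        for v, c in cap[u].items():
            if c > 1e-12 and v not in seen:
                seen.add(v)
                q.append(v)
    return ['+' if i in seen else '-' for i in range(n)]
```

On a 4-node chain with one weak middle edge, the cut severs the cheap edge and each unlabeled node inherits the label of its side.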
Design Issues
- What is the right energy function?
- Given an energy function, find a graph that represents it
- We deal with a simpler question: given a distance metric on the data, "learn" a graph (edge weights) that gives a good clustering
Assigning Edge Weights
- Some decreasing function of the distance between nodes, e.g., exponential decrease with an appropriate slope
- Unit-weight edges: connect nodes if they are within a distance δ. What is a good value of δ?
- Connect every node to its k nearest neighbours. What is a good value of k?
- Sparser graph => faster algorithm
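The first two weighting schemes might be sketched as follows (the function names and the `sigma` slope parameter are hypothetical, not from the talk):

```python
import math

def exp_weight(dist, sigma=1.0):
    """Exponentially decreasing weight; sigma controls the slope."""
    return math.exp(-dist / sigma)

def unit_edges(points, dist, delta):
    """Unit-weight edges between all pairs within distance delta."""
    n = len(points)
    return [(i, j, 1.0)
            for i in range(n) for j in range(i + 1, n)
            if dist(points[i], points[j]) <= delta]
```

For example, three points on a line at 0, 1, and 3 with delta = 1.5 yield the single edge (0, 1).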
Connecting "near-by" nodes
- Connect every pair with distance less than δ
- Need a method for finding a "good" δ; very problem-dependent
- Possible approach: use the degree of connectivity, the density of edges, or the value of the cut to pick the right value
Connecting "near-by" nodes
- As δ increases, the value of the cut increases
- Cut value = 0 => supposedly no-error situation ("Mincut-δ0")
- Very sensitive to ambiguity in classification or noise in the dataset
- Should allow longer-distance dependencies
Connecting "near-by" nodes
- Grow δ until the graph becomes sufficiently well connected
- Growing δ until the largest component contains half the nodes seems to work well ("Mincut-δ½")
- Reasonably robust to noise
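The Mincut-δ½ threshold can be found by sorting candidate edges by distance and merging components until one holds half the nodes, e.g. with a union-find structure (a sketch under assumed input conventions; `delta_half` is a made-up name):

```python
def delta_half(pairs, n):
    """Smallest distance threshold at which the largest connected
    component of the threshold graph contains at least n/2 nodes.

    pairs: iterable of (dist, i, j) candidate edges.
    Returns None if no threshold achieves this.
    """
    parent = list(range(n))
    size = [1] * n

    def find(x):                       # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for d, i, j in sorted(pairs):      # add edges shortest-first
        ri, rj = find(i), find(j)
        if ri != rj:
            if size[ri] < size[rj]:
                ri, rj = rj, ri
            parent[rj] = ri            # union by size
            size[ri] += size[rj]
            if size[ri] * 2 >= n:      # half the nodes reached
                return d
    return None
```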
A sample of results (accuracies in %):

Dataset   Mincut-δopt   Mincut-δ0   Mincut-δ½   3-NN
MUSH      97.7          97.0        91.1
MUSH*     88.7          56.9        87.0        83.3
VOTING    91.3          66.1        83.3        89.6
PIMA      72.3          48.8        72.3        68.1
Which mincut is the "correct" mincut?
- There can be many mincuts in the graph
- Assign a high confidence value to examples on which all mincuts agree
- Overall accuracy is related to the fraction of examples that get a "high-confidence" label
- Grow δ until a reasonable fraction of examples gets a high-confidence label
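For tiny graphs, the "examples on which all mincuts agree" idea can be illustrated by brute force: enumerate every labeling consistent with the seeds, keep those of minimum cut value, and report the nodes on which they all agree (purely illustrative, not the talk's algorithm; `high_confidence` is a made-up helper, and real graphs need flow-based machinery instead of enumeration):

```python
from itertools import product

def high_confidence(n, edges, pos, neg):
    """Nodes labeled identically by every minimum cut (brute force).

    edges: undirected (u, v, weight) edges; pos/neg: seed node sets.
    """
    best, argbest = None, []
    for bits in product([0, 1], repeat=n):
        # keep only labelings consistent with the labeled seeds
        if any(bits[u] != 1 for u in pos) or any(bits[u] != 0 for u in neg):
            continue
        cost = sum(w for u, v, w in edges if bits[u] != bits[v])
        if best is None or cost < best - 1e-12:
            best, argbest = cost, [bits]       # strictly better cut
        elif abs(cost - best) <= 1e-12:
            argbest.append(bits)               # another minimum cut
    # high confidence = same label in every minimum cut
    return {i for i in range(n) if len({b[i] for b in argbest}) == 1}
```

On a unit-weight 4-node chain with seeds at the two ends, any one of the three edges gives a minimum cut, so only the two seeds themselves get a high-confidence label.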
Connecting to nearest neighbours
- Connect every node to its k nearest neighbours
- As k increases, it is more likely to have small disconnected components
- Fix: connect each node to its m nearest labeled examples and k other nearest neighbours
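The m-nearest-labeled-plus-k-nearest connection rule might look like this (a sketch; the slide leaves "k other nearest neighbours" open to interpretation, so here the k are taken from all remaining examples, and the function name is hypothetical):

```python
def neighbors(i, points, labeled, dist, m, k):
    """Indices to connect node i to: its m nearest labeled examples
    plus its k nearest other examples."""
    others = sorted((j for j in range(len(points)) if j != i),
                    key=lambda j: dist(points[i], points[j]))
    lab = [j for j in others if j in labeled][:m]   # m nearest labeled
    rest = [j for j in others if j not in lab][:k]  # k nearest of the rest
    return lab + rest
```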
Other "hacks"
- Weight edges to labeled and unlabeled examples differently
- Weight different attributes differently, e.g., use information gain as in decision trees
- Weight edges to positive and negative examples differently, for a more balanced cut