Slide 1: Learning using Graph Mincuts
Shuchi Chawla, Carnegie Mellon University
1/11/2003

Slide 2: Learning from Labeled and Unlabeled Data
• Unlabeled data is cheap and available in large amounts
• It gives information about the distribution of examples, which is useful with a prior
• Our prior: ‘close’ examples have a similar classification

Slide 3: Classification using Graph Mincut
• Suppose the quality of a classification is defined by pairwise relationships between examples: if two examples are similar but classified differently, we incur a penalty (e.g., Markov Random Fields)
• A graph mincut minimizes this penalty
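
The talk gives no code; the following is a minimal sketch of the standard construction behind this slide, assuming binary labels and an edge list of pairwise similarities. Labeled examples are tied to an artificial source and sink, and networkx computes the minimum s-t cut; the name mincut_classify and the node names 's'/'t' are illustrative, not from the talk.

```python
import networkx as nx

def mincut_classify(edges, positives, negatives):
    """Label every node +1 or -1 via a minimum s-t cut.

    edges: iterable of (u, v, weight) similarity edges.
    positives / negatives: collections of nodes with known labels.
    """
    G = nx.DiGraph()
    for u, v, w in edges:
        # Each similarity edge carries its penalty in both directions.
        G.add_edge(u, v, capacity=w)
        G.add_edge(v, u, capacity=w)
    # Tie labeled examples to the source/sink with infinite capacity,
    # so no finite cut can separate an example from its known label.
    for p in positives:
        G.add_edge('s', p, capacity=float('inf'))
    for q in negatives:
        G.add_edge(q, 't', capacity=float('inf'))
    cut_value, (source_side, _) = nx.minimum_cut(G, 's', 't')
    return {v: (1 if v in source_side else -1)
            for v in G if v not in ('s', 't')}
```

The value of the cut is exactly the penalty the slide describes: the total weight of similar pairs that end up on opposite sides.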

Slide 4: Design Issues
• What is the right energy function?
• Given an energy function, find a graph that represents it
• We deal with a simpler question: given a distance metric on the data, “learn” a graph (edge weights) that gives a good clustering

Slide 5: Assigning Edge Weights
• Use some decreasing function of the distance between nodes, e.g., an exponential decrease with an appropriate slope
• Or use unit-weight edges:
  - Connect nodes if they are within a distance of δ. What is a good value of δ?
  - Connect every node to its k nearest neighbours. What is a good value of k?
• A sparser graph means a faster algorithm
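
As a concrete reference for these schemes, here is a small sketch assuming Euclidean distances; the bandwidth sigma and the threshold delta stand in for the open parameters the slide asks about, and the function name is mine.

```python
import numpy as np
from scipy.spatial.distance import cdist

def edge_weights(X, scheme='exp', sigma=1.0, delta=1.0):
    """X: (n, d) array of examples. Returns an (n, n) weight matrix.

    'exp'  : weights decay exponentially with squared distance.
    'delta': unit-weight edge iff the distance is below delta.
    """
    D = cdist(X, X)                       # pairwise Euclidean distances
    if scheme == 'exp':
        W = np.exp(-(D / sigma) ** 2)
    else:
        W = (D < delta).astype(float)
    np.fill_diagonal(W, 0.0)              # no self-loops
    return W
```

A weight matrix converts to the edge list expected by mincut_classify above with [(i, j, W[i, j]) for i in range(len(W)) for j in range(i + 1, len(W)) if W[i, j] > 0].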

Slide 6: Connecting “near-by” nodes
• Connect every pair with distance less than δ
• We need a method for finding a “good” δ; this is very problem dependent
• Possible approach: use the degree of connectivity, the density of edges, or the value of the cut to pick the right value
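
The slide names these diagnostics but gives no procedure; one plausible scan over candidate thresholds, reporting edge density and the size of the largest component (the function name and printed format are mine):

```python
import numpy as np
from scipy.spatial.distance import cdist
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def delta_diagnostics(X, deltas):
    """For each candidate delta, report the edge density and the fraction
    of nodes in the largest component of the unit-weight graph."""
    D = cdist(X, X)
    n = len(X)
    for d in deltas:
        A = ((D <= d) & (D > 0)).astype(float)
        _, labels = connected_components(csr_matrix(A), directed=False)
        density = A.sum() / (n * (n - 1))      # fraction of ordered pairs joined
        giant = np.bincount(labels).max() / n  # largest-component fraction
        print(f"delta={d:.3f}  density={density:.3f}  giant={giant:.2f}")
```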

Slide 7: Connecting “near-by” nodes
• As δ increases, the value of the cut increases
• Cut value = 0 ⇒ a supposedly no-error situation (“Mincut-δ0”)
• Very sensitive to ambiguity in the classification or to noise in the dataset
• Should allow longer-distance dependencies
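
Reading the slide's rule as "pick the largest δ for which the cut still has value 0": since all edge capacities are positive, the minimum s-t cut is 0 exactly when no path joins a positive to a negative example, so a sketch only needs connectivity tests, not flow. The name delta_zero and the candidate grid are mine.

```python
import networkx as nx
from scipy.spatial.distance import cdist

def delta_zero(X, positives, negatives, deltas):
    """Largest delta whose graph still has a zero-value min s-t cut,
    i.e. no path yet connects the two labeled classes."""
    D = cdist(X, X)
    n = len(X)
    best = None
    for d in sorted(deltas):
        G = nx.Graph()
        G.add_nodes_from(range(n))
        G.add_edges_from((i, j) for i in range(n) for j in range(i + 1, n)
                         if D[i, j] <= d)
        if any(nx.has_path(G, p, q) for p in positives for q in negatives):
            break                 # the two classes just became connected
        best = d
    return best
```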

Slide 8: Connecting “near-by” nodes
• Grow δ till the graph becomes sufficiently well connected
• Growing δ till the largest component contains half the nodes seems to work well (“Mincut-δ½”)
• Reasonably robust to noise
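
A sketch of the Mincut-δ½ rule as stated: grow δ over every distinct pairwise distance until the largest connected component covers at least half the nodes. This assumes Euclidean distances and brute-force scanning, fine for a sketch but too slow for large n.

```python
import numpy as np
from scipy.spatial.distance import cdist
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def delta_half(X):
    """Smallest delta whose unit-weight graph has a component containing
    at least half the nodes (the 'Mincut-delta-1/2' threshold)."""
    D = cdist(X, X)
    n = len(X)
    for d in np.sort(np.unique(D[D > 0])):    # every distinct distance
        A = ((D <= d) & (D > 0)).astype(float)
        _, labels = connected_components(csr_matrix(A), directed=False)
        if np.bincount(labels).max() >= n / 2:
            return d
```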

Slide 9: A sample of results

Accuracy (%) by dataset and method:

Dataset    Mincut-δopt    Mincut-δ0    Mincut-δ½    3-NN
MUSH       97.7           97.0         91.1         —
MUSH*      88.7           56.9         87.0         83.3
VOTING     91.3           66.1         83.3         89.6
PIMA       72.3           48.8         72.3         68.1

Slide 10: Which mincut is the “correct” mincut?
• There can be “many” mincuts in the graph
• Assign a high confidence value to examples on which all mincuts agree
• Overall accuracy is related to the fraction of examples that get a “high confidence” label
• Grow δ until a reasonable fraction of examples gets a high-confidence label
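
The slide does not say how the family of mincuts is sampled. One proxy, in the spirit of the later randomized-mincut idea of Blum et al. (2004), is to jitter the edge weights a few times, recompute the cut, and score each node by how often the cuts agree; this scheme and all names here are assumptions, and mincut_classify is the sketch from slide 3.

```python
import numpy as np

def mincut_confidence(edges, positives, negatives,
                      trials=20, noise=0.1, seed=0):
    """Per-node agreement across mincuts of randomly jittered graphs.

    Returns values in [0.5, 1.0]; 1.0 means every cut labeled the node
    the same way. Reuses mincut_classify from the earlier sketch.
    """
    rng = np.random.default_rng(seed)
    votes = {}
    for _ in range(trials):
        jittered = [(u, v, w * rng.uniform(1 - noise, 1 + noise))
                    for u, v, w in edges]
        for node, label in mincut_classify(jittered, positives,
                                           negatives).items():
            votes.setdefault(node, []).append(label)
    # Fraction of cuts agreeing with the majority label for each node.
    return {node: (len(ls) + abs(sum(ls))) / (2 * len(ls))
            for node, ls in votes.items()}
```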

Slide 11: Connecting to nearest neighbors
• Connect every node to its k nearest neighbours
• As k increases, small disconnected components become more likely
• Connect to the m nearest labeled and the k other nearest neighbors
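
A small sketch of the last bullet's construction: connect each node to its k nearest neighbours overall plus its m nearest labeled examples. The names knn_edges, k, and m are mine; the resulting edge list plugs directly into mincut_classify above.

```python
import numpy as np
from scipy.spatial.distance import cdist

def knn_edges(X, labeled, k=3, m=1):
    """Unit-weight edges: each node to its k nearest neighbours overall,
    plus its m nearest labeled examples (indices in 'labeled')."""
    D = cdist(X, X)
    np.fill_diagonal(D, np.inf)               # never pick yourself
    edges = set()
    for i in range(len(X)):
        for j in np.argsort(D[i])[:k]:        # k nearest overall
            edges.add((min(i, j), max(i, j)))
        for j in sorted(labeled, key=lambda t: D[i, t])[:m]:
            if j != i:                        # m nearest labeled
                edges.add((min(i, j), max(i, j)))
    return [(i, j, 1.0) for i, j in edges]
```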

Slide 12: Other “hacks”
• Weigh edges to labeled and unlabeled examples differently
• Weigh different attributes differently, e.g., using information gain as in decision trees
• Weigh edges to positive and negative examples differently, for a more balanced cut
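
As an illustration of the attribute-weighting hack, here is a sketch of an information-gain-weighted Hamming distance for discrete attributes, with the gain estimated on the labeled examples only; the helper names are mine and the slide does not commit to this exact formula.

```python
import numpy as np

def information_gain(x, y):
    """Information gain of a discrete attribute x about binary labels y."""
    def entropy(labels):
        if len(labels) == 0:
            return 0.0
        p = np.bincount(labels, minlength=2) / len(labels)
        p = p[p > 0]
        return float(-(p * np.log2(p)).sum())
    gain = entropy(y)
    for v in np.unique(x):
        mask = x == v
        gain -= mask.mean() * entropy(y[mask])
    return gain

def weighted_hamming(X, y, labeled):
    """Pairwise distances where each attribute counts in proportion to its
    information gain, estimated from the labeled examples only."""
    w = np.array([information_gain(X[labeled, a], y[labeled])
                  for a in range(X.shape[1])])
    mismatches = X[:, None, :] != X[None, :, :]   # (n, n, d) boolean
    return (mismatches * w).sum(axis=-1)
```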

