Learning using Graph Mincuts
Shuchi Chawla, Carnegie Mellon University
1/11/2003

Learning from Labeled and Unlabeled Data
- Unlabeled data is cheap and available in large amounts
- It gives information about the distribution of examples, which is useful with a prior
- Our prior: "close" examples have a similar classification

Classification using Graph Mincut
- Suppose the quality of a classification is defined by pairwise relationships between examples: if two examples are similar but classified differently, we incur a penalty (e.g. Markov Random Fields)
- A graph mincut minimizes this penalty (a small sketch follows below)
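The mincut step can be illustrated as follows: tie the positively labeled examples to an artificial source and the negatively labeled ones to an artificial sink, then read each example's label off the side of a minimum s-t cut it lands on. This is a minimal sketch using networkx; the node names v+ and v-, the "capacity" edge attribute, and the helper's signature are assumptions for illustration, not details from the slides.

```python
# Minimal sketch of classification by graph mincut, assuming `graph` is an
# undirected networkx Graph over the examples whose edges carry a
# "capacity" attribute equal to the similarity weight.
import networkx as nx

def mincut_classify(graph, pos_labeled, neg_labeled):
    """Label every node by the side of the minimum s-t cut it falls on."""
    g = graph.copy()
    source, sink = "v+", "v-"
    # Edges added without a "capacity" attribute are treated as having
    # unbounded capacity by networkx, so the cut can never separate a
    # labeled example from its own side.
    for v in pos_labeled:
        g.add_edge(source, v)
    for v in neg_labeled:
        g.add_edge(sink, v)
    cut_value, (pos_side, neg_side) = nx.minimum_cut(g, source, sink)
    labels = {v: +1 for v in pos_side if v != source}
    labels.update({v: -1 for v in neg_side if v != sink})
    return cut_value, labels
```

Any polynomial-time max-flow routine would do here; the returned cut value is exactly the total pairwise penalty the slide refers to.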

Design Issues
- What is the right energy function?
- Given an energy function, find a graph that represents it
- We deal with a simpler question: given a distance metric on the data, "learn" a graph (edge weights) that gives a good clustering

Assigning Edge Weights
- Some decreasing function of the distance between nodes, e.g. an exponential decrease with an appropriate slope
- Unit-weight edges: connect nodes if they are within a distance δ of each other. What is a good value of δ?
- Connect every node to its k nearest neighbours. What is a good value of k?
- A sparser graph => a faster algorithm; the three weighting options above are sketched below
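A small sketch of the three weighting schemes just listed, built on a precomputed pairwise distance matrix. The parameter names (alpha for the exponential slope, delta for the distance threshold, k for the neighbour count) are illustrative choices, not values from the talk.

```python
import numpy as np

def exponential_weights(dist, alpha=1.0):
    """w(u, v) = exp(-alpha * d(u, v)): weight decays with distance."""
    return np.exp(-alpha * dist)

def unit_weights_within_delta(dist, delta):
    """Unit-weight edge whenever d(u, v) <= delta, no edge otherwise."""
    return (dist <= delta).astype(float)

def knn_edges(dist, k):
    """Unit-weight adjacency: each node linked to its k nearest neighbours."""
    n = dist.shape[0]
    adj = np.zeros_like(dist, dtype=float)
    for i in range(n):
        neighbours = [j for j in np.argsort(dist[i]) if j != i][:k]
        adj[i, neighbours] = 1.0
    return np.maximum(adj, adj.T)  # symmetrise: k-NN relations are not symmetric
```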

Connecting "near-by" nodes
- Connect every pair with distance less than δ
- Need a method for finding a "good" δ; this is very problem dependent
- Possible approach: use the degree of connectivity, the density of edges, or the value of the cut to pick the right value

Connecting "near-by" nodes
- As δ increases, the value of the cut increases
- Cut value = 0 ⇒ supposedly a no-error situation ("Mincut-δ0"; see the sketch below)
- Very sensitive to ambiguity in classification or noise in the dataset
- Should allow longer-distance dependencies
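A sketch of the Mincut-δ0 rule described above: scan candidate values of δ in increasing order and keep the largest one for which the labeled examples can still be separated by a cut of value zero. It reuses the hypothetical mincut_classify helper from the earlier sketch; taking the observed pairwise distances as the candidate δ values is an assumption.

```python
import networkx as nx
import numpy as np

def delta_zero(dist, pos_labeled, neg_labeled):
    """Largest delta whose unit-weight graph still admits a zero-value mincut."""
    n = dist.shape[0]
    best = 0.0
    for delta in np.unique(dist[np.triu_indices(n, k=1)]):
        g = nx.Graph()
        g.add_nodes_from(range(n))
        rows, cols = np.where(dist <= delta)
        g.add_edges_from((int(i), int(j), {"capacity": 1.0})
                         for i, j in zip(rows, cols) if i < j)
        cut_value, _ = mincut_classify(g, pos_labeled, neg_labeled)
        if cut_value > 0:   # the cut value only grows with delta, so stop here
            break
        best = delta
    return best
```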

Connecting "near-by" nodes
- Grow δ until the graph becomes sufficiently well connected
- Growing δ until the largest component contains half the nodes seems to work well ("Mincut-δ½"; sketched below)
- Reasonably robust to noise
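The Mincut-δ½ heuristic can be sketched in the same style: grow δ over the observed pairwise distances until the largest connected component of the unit-weight graph covers at least half of the nodes. The linear scan over sorted distances is an implementation assumption; an incremental union-find structure would be faster but is omitted for brevity.

```python
import networkx as nx
import numpy as np

def delta_half(dist):
    """Smallest delta whose graph has a component covering half the nodes."""
    n = dist.shape[0]
    candidates = np.unique(dist[np.triu_indices(n, k=1)])
    for delta in candidates:
        g = nx.Graph()
        g.add_nodes_from(range(n))
        rows, cols = np.where(dist <= delta)
        g.add_edges_from((int(i), int(j)) for i, j in zip(rows, cols) if i < j)
        largest = max(nx.connected_components(g), key=len)
        if len(largest) >= n / 2:   # half the nodes sit in one component
            return delta
    return candidates[-1]
```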

A sample of results
[Table on slide: results of Mincut-δopt, Mincut-δ0, Mincut-δ½, and 3-NN on the MUSH, MUSH*, VOTING, and PIMA datasets]

Which mincut is the "correct" mincut?
- There can be "many" mincuts in the graph
- Assign a high confidence value to examples on which all mincuts agree (one way to probe this is sketched below)
- Overall accuracy is related to the fraction of examples that get a "high confidence" label
- Grow δ until a reasonable fraction of examples gets a high-confidence label
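One simple way to probe agreement among several mincuts, offered here as an assumption rather than the method on the slide, is to perturb the edge capacities slightly, recompute the cut a few times, and call an example high-confidence only if every run assigns it the same label. This reuses the hypothetical mincut_classify helper from the earlier sketch.

```python
import random

def confident_labels(graph, pos_labeled, neg_labeled, rounds=10, noise=1e-3):
    """Nodes labeled identically by every perturbed mincut (high confidence)."""
    votes = {}
    for _ in range(rounds):
        g = graph.copy()
        for _, _, data in g.edges(data=True):
            # Tiny multiplicative noise breaks ties between equal-value mincuts.
            if "capacity" in data:
                data["capacity"] *= 1.0 + noise * random.random()
        _, labels = mincut_classify(g, pos_labeled, neg_labeled)
        for node, lab in labels.items():
            votes.setdefault(node, set()).add(lab)
    # Keep only the nodes whose label never changed across the sampled cuts.
    return {node: labs.pop() for node, labs in votes.items() if len(labs) == 1}
```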

Connecting to nearest neighbours
- Connect every node to its k nearest neighbours
- As k increases, it is more likely to have small disconnected components
- Connect to the m nearest labeled examples and the k other nearest neighbours (see the sketch below)
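A sketch of the graph construction just mentioned, under the assumed reading that each node is joined to its m nearest labeled examples and its k nearest unlabeled ones; m, k, and the distance-matrix layout are illustrative.

```python
import numpy as np

def labeled_plus_knn_edges(dist, labeled_idx, m=1, k=3):
    """Edge set: each node linked to its m nearest labeled and k nearest
    unlabeled examples (this split is an assumed interpretation)."""
    n = dist.shape[0]
    labeled = set(labeled_idx)
    unlabeled = [j for j in range(n) if j not in labeled]
    edges = set()
    for i in range(n):
        near_lab = sorted((j for j in labeled if j != i),
                          key=lambda j: dist[i, j])[:m]
        near_unl = sorted((j for j in unlabeled if j != i),
                          key=lambda j: dist[i, j])[:k]
        for j in near_lab + near_unl:
            edges.add((min(i, j), max(i, j)))   # store each undirected edge once
    return edges
```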

Other "hacks"
- Weight edges to labeled and unlabeled examples differently
- Weight different attributes differently, e.g. use information gain as in decision trees
- Weight edges to positive and negative examples differently, for a more balanced cut