Improving the Graph Mincut Approach to Learning from Labeled and Unlabeled Examples Avrim Blum, John Lafferty, Raja Reddy, Mugizi Rwebangira.



Outline
- We often have little labeled data but lots of unlabeled data.
- Graph mincuts: based on the belief that most 'close' examples have the same classification.
- Problem: mincut does not say where it is most confident.
- Our approach: add noise to the edges to extract confidence scores.

Learning using Graph Mincuts: Blum and Chawla (ICML 2001)

Construct a Graph

Add sink (-) and source (+)

Obtain s-t mincut

Classification: examples on the source side of the mincut are labeled +, examples on the sink side are labeled -
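
The four steps above (construct a graph, add source and sink, obtain the s-t mincut, classify) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `mincut_labels`, the edge-list input format, and the use of Edmonds-Karp augmenting paths are all assumptions made for the example. It also assumes that no example is labeled both + and -.

```python
from collections import deque

def mincut_labels(n, edges, positives, negatives):
    """Label examples 0..n-1 as +1 / -1 using an s-t minimum cut.

    edges: list of (u, v, weight) undirected similarity edges (hypothetical format).
    positives / negatives: indices of the labeled examples.
    """
    s, t = n, n + 1                      # extra nodes: source (+) and sink (-)
    inf = float("inf")
    cap = [dict() for _ in range(n + 2)] # residual capacities

    def add_arc(u, v, c):
        cap[u][v] = cap[u].get(v, 0) + c
        cap[v].setdefault(u, 0)          # reverse arc for the residual graph

    for u, v, w in edges:                # undirected edge = one arc each way
        add_arc(u, v, w)
        add_arc(v, u, w)
    for p in positives:                  # labeled examples can never be cut off
        add_arc(s, p, inf)
    for q in negatives:
        add_arc(q, t, inf)

    def reachable_from_source():
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    # Edmonds-Karp: push flow along shortest augmenting paths until none remain.
    while True:
        parent = reachable_from_source()
        if t not in parent:
            break
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        flow = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= flow
            cap[v][u] += flow

    # Nodes still reachable from the source lie on the + side of the mincut.
    side = reachable_from_source()
    return [1 if i in side else -1 for i in range(n)]

# Toy chain 0 - 1 - 2 - 3 with a weak middle edge; 0 is labeled +, 3 is labeled -.
labels = mincut_labels(4, [(0, 1, 3), (1, 2, 1), (2, 3, 3)], [0], [3])
print(labels)  # [1, 1, -1, -1]
```

In the toy chain the cheapest way to separate the two labeled examples is to cut the weak edge between 1 and 2, so the unlabeled examples inherit the label of the side they stay attached to.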

Goal
- Obtain a measure of confidence on each classification.

Our approach
- Add random noise to the edges.
- Run min cut several times.
- For each unlabeled example, take a majority vote.
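
The approach above might look like the sketch below. Several details are assumptions, since the slides do not specify them: the noise is drawn uniformly from `[0, noise]`, the confidence score is the absolute vote margin divided by the number of runs, and the names `randomized_mincut` and `brute_force_mincut` are hypothetical. To keep the example short, the exact min cut is found by enumerating all labelings, which is only workable on tiny graphs; a max-flow solver would be swapped in through the `solver` parameter for real data.

```python
import random
from itertools import product

def brute_force_mincut(n, edges, labeled):
    """Exact min cut by enumeration. Only sensible for a handful of nodes."""
    free = [i for i in range(n) if i not in labeled]
    best, best_cost = None, float("inf")
    for signs in product((1, -1), repeat=len(free)):
        y = dict(labeled)
        y.update(zip(free, signs))
        # Cut cost = total weight of edges whose endpoints disagree.
        cost = sum(w for u, v, w in edges if y[u] != y[v])
        if cost < best_cost:
            best, best_cost = y, cost
    return [best[i] for i in range(n)]

def randomized_mincut(n, edges, labeled, runs=100, noise=0.5, seed=0,
                      solver=brute_force_mincut):
    """Majority vote over min cuts of randomly perturbed graphs.

    labeled: dict {example index: +1 or -1}.
    Returns (labels, confidence), with confidence in [0, 1].
    """
    rng = random.Random(seed)
    votes = [0] * n
    for _ in range(runs):
        # Assumption: uniform noise added independently to every edge weight.
        noisy = [(u, v, w + rng.uniform(0, noise)) for u, v, w in edges]
        for i, y in enumerate(solver(n, noisy, labeled)):
            votes[i] += y
    labels = [1 if v >= 0 else -1 for v in votes]
    confidence = [abs(v) / runs for v in votes]  # assumed confidence score
    return labels, confidence

# Toy chain 0 - 1 - 2 - 3: node 1 is tied strongly to the + example,
# node 2 sits between two equally weak edges.
edges = [(0, 1, 5.0), (1, 2, 1.0), (2, 3, 1.0)]
labels, confidence = randomized_mincut(4, edges, {0: 1, 3: -1})
print(labels, confidence)
```

On this toy graph node 1 lands on the + side in every run, while node 2 flips between sides depending on which weak edge the noise makes cheaper, so its vote margin is much smaller. That difference is the confidence signal a single mincut cannot provide.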

Experiments
- Digits data set (each digit is a 16 x 16 integer array)
- 100 labeled examples
- 3900 unlabeled examples
- 100 runs of mincut

Results

Conclusions
- 3% error on 80% of the data
- Standard mincut only gives us 6% error on all the data

Future Work
- Conduct further experiments on other data sets
- Compare with the similar algorithm of Jerry Zhu
- Investigate the properties of the distribution we get by selecting minimum cuts in this way
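
A figure such as "error on 80% of the data" can be computed by ranking examples by confidence and scoring only the most confident fraction. The helper below is a hedged sketch of that bookkeeping. The name `error_at_coverage` and the toy numbers are illustrative assumptions and are not taken from the experiments.

```python
def error_at_coverage(predictions, confidence, truth, coverage):
    """Error rate on the `coverage` fraction of examples with highest confidence."""
    order = sorted(range(len(predictions)), key=lambda i: -confidence[i])
    k = max(1, round(coverage * len(order)))   # number of examples kept
    kept = order[:k]
    errors = sum(predictions[i] != truth[i] for i in kept)
    return errors / k

# Toy data: the only mistake is on the least confident example.
predictions = [1, 1, -1, -1, 1]
confidence = [1.0, 0.9, 0.8, 0.7, 0.2]
truth = [1, 1, -1, -1, -1]

print(error_at_coverage(predictions, confidence, truth, 0.8))  # 0.0
print(error_at_coverage(predictions, confidence, truth, 1.0))  # 0.2
```

When the confidence scores are informative, dropping the least confident examples lowers the error on what remains, which is the kind of trade-off the conclusions report.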

Questions?