Presentation is loading. Please wait.

Presentation is loading. Please wait.

Combining Labeled and Unlabeled Data with Co-Training

Similar presentations


Presentation on theme: "Combining Labeled and Unlabeled Data with Co-Training"— Presentation transcript:

1 Combining Labeled and Unlabeled Data with Co-Training
Avrim Blum Tom Mitchell Madison, 1998

2 Semi-Supervised Learning
Most of ML assumes labeled training data But labels can be expensive to maintain Semi-supervised learning: Make use of unlabeled data in combination of small amount of labeled data

3 Some Co-Training Slides from a Tutorial by Avrim Blum

4 Co-training Many problems have two different sources of info you can use to determine label. E.g., classifying webpages: can use words on page or words on links pointing to the page. My Advisor Prof. Avrim Blum x2- Text info x1- Link info x - Link info & Text info

5 Co-training Idea: Use small labeled sample to learn initial rules.
E.g., “my advisor” pointing to a page is a good indicator it is a faculty home page. E.g., “I am teaching” on a page is a good indicator it is a faculty home page. my advisor

6 Co-training Idea: Use small labeled sample to learn initial rules. E.g., “my advisor” pointing to a page is a good indicator it is a faculty home page. E.g., “I am teaching” on a page is a good indicator it is a faculty home page. Then look for unlabeled examples where one rule is confident and the other is not. Have it label the example for the other. Training 2 classifiers, one on each type of info. Using each to help train the other. hx1,x2i

7 Results: webpages 12 labeled examples, 1000 unlabeled

8 Sample Run

9 Co-Training Theorems [BM98] if x1, x2 are independent given the label, and if have alg that is robust to noise, then can learn from an initial “weakly-useful” rule plus unlabeled data. Faculty home pages Faculty with advisees “My advisor”


Download ppt "Combining Labeled and Unlabeled Data with Co-Training"

Similar presentations


Ads by Google