CS 678: Semi-Supervised Learning

Can we improve the quality of our learning by combining labeled and unlabeled data? Usually a lot more unlabeled data is available than labeled data.
Assume a set L of labeled data and a set U of unlabeled data, drawn from the same distribution.
We focus on semi-supervised classification, though there are many other variations:
– Aiding clustering with some labeled data
– Regression
– Model selection with unlabeled data (COD)
Transduction vs. induction: transduction only predicts labels for the given set U, while induction learns a model that also generalizes to unseen data.
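As a running setup for the sketches that follow, here is one minimal way to build L and U in Python. The digits dataset, the 90/10 split, and scikit-learn's convention of marking unlabeled points with -1 are illustrative choices, not something the slides prescribe.

```python
import numpy as np
from sklearn.datasets import load_digits

X, y = load_digits(return_X_y=True)
rng = np.random.default_rng(0)
unlabeled = rng.random(len(y)) < 0.9        # hide ~90% of the labels

y_semi = y.copy()
y_semi[unlabeled] = -1                      # scikit-learn convention: -1 = unlabeled

X_L, y_L = X[~unlabeled], y[~unlabeled]     # labeled set L
X_U = X[unlabeled]                          # unlabeled set U
print(f"|L| = {len(y_L)}, |U| = {len(X_U)}")
```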

How Semi-Supervised Learning Works

Approaches make strong model assumptions (guesses); if the assumptions are wrong, unlabeled data can make things worse. Some commonly used assumptions:
– Clusters of data are from the same class
– Data can be represented as a mixture of parameterized distributions
– Decision boundaries should go through non-dense areas of the data
– The model should be as simple as possible (Occam's razor)

Unsupervised Learning of Domain Features
– PCA, SVD
– NLDR: Non-Linear Dimensionality Reduction
– Deep Learning: Deep Belief Nets, Sparse Auto-encoders, Self-Taught Learning
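A hedged sketch of this recipe with PCA: the representation is learned from both L and U, but only L is used for the supervised step. The choice of PCA, 20 components, and logistic regression are assumptions for illustration; X_L, y_L, X_U come from the setup sketch above.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

# Learn a low-dimensional representation from ALL of the data, labeled or not.
pca = PCA(n_components=20)                  # 20 components is an arbitrary choice
pca.fit(np.vstack([X_L, X_U]))

# The supervised step sees only the (transformed) labeled set L.
clf = LogisticRegression(max_iter=1000)
clf.fit(pca.transform(X_L), y_L)
```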

Self-Training (Bootstrap)

Self-Training:
– Train a supervised model on the labeled data L
– Classify the unlabeled data U
– Add the most confidently classified members of U, with their predicted labels, to L
– Repeat
Multi-Model:
– Use an ensemble of trained models for self-training
– Co-Training: train two models on different, independent feature sets; add the most confident instances from U of one model into L of the other
– Multi-View Training: find an ensemble of multiple diverse models trained on L which also tend to agree well on U
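The basic self-training loop is short enough to sketch directly. This version assumes a probabilistic classifier; the 0.95 confidence threshold and the iteration cap are illustrative choices, not part of the slides.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_train(X_L, y_L, X_U, threshold=0.95, max_iters=10):
    """Grow L by repeatedly absorbing U's most confident predictions."""
    X_L, y_L, X_U = X_L.copy(), y_L.copy(), X_U.copy()
    clf = LogisticRegression(max_iter=1000)
    for _ in range(max_iters):
        clf.fit(X_L, y_L)
        if len(X_U) == 0:
            break
        proba = clf.predict_proba(X_U)
        confident = proba.max(axis=1) >= threshold
        if not confident.any():
            break                           # nothing we trust enough to add
        # Move the confident points from U into L under their predicted labels.
        y_new = clf.classes_[proba[confident].argmax(axis=1)]
        X_L = np.vstack([X_L, X_U[confident]])
        y_L = np.concatenate([y_L, y_new])
        X_U = X_U[~confident]
    return clf
```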

More Models

Generative:
– Assume the data can be represented by some mixture of parameterized models (e.g., Gaussians) and use EM to learn the parameters (à la Baum-Welch)
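A rough sketch of the generative approach using scikit-learn's GaussianMixture, which runs EM internally: fit the mixture on L and U together, then label each component by majority vote over the labeled points it claims, which leans on the earlier "clusters share a class" assumption. One component per class is an assumption for illustration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Fit one Gaussian component per class on L and U together (EM does the work).
gmm = GaussianMixture(n_components=len(np.unique(y_L)), random_state=0)
gmm.fit(np.vstack([X_L, X_U]))

# Label each component by majority vote over the labeled points it claims.
comp_L = gmm.predict(X_L)
comp_to_class = {c: np.bincount(y_L[comp_L == c]).argmax()
                 for c in np.unique(comp_L)}

# Components that captured no labeled point stay unlabeled (-1).
y_U_pred = np.array([comp_to_class.get(c, -1) for c in gmm.predict(X_U)])
```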

Graph Models
– Neighboring nodes are assumed to be similar, with greater similarity encoded as larger edge weights
– Force members of the same class in L to be close, while maintaining smoothness with respect to the graph for U
– Add members of U as neighbors based on some similarity measure
– Iteratively label U (breadth-first)
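scikit-learn ships a graph-based method of exactly this flavor; a minimal sketch with LabelPropagation, which builds an RBF-kernel similarity graph over all points and iteratively spreads the known labels of L across the graph to U. The kernel and gamma value are arbitrary choices; X, y_semi, and unlabeled come from the setup sketch.

```python
from sklearn.semi_supervised import LabelPropagation

# y_semi marks unlabeled points with -1 (see the setup sketch above).
lp = LabelPropagation(kernel="rbf", gamma=0.01)  # kernel/gamma are arbitrary
lp.fit(X, y_semi)
y_U_pred = lp.transduction_[unlabeled]           # inferred labels for U
```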

TSVM

Transductive SVM (TSVM), also known as Semi-Supervised SVM (S3VM):
– Maximize the margin over both L and U, placing the decision surface in non-dense regions of the input space
– Assumes the classes are "well-separated"
– Can also try to simultaneously keep the class proportions on each side of the boundary similar to the proportions in the labeled data
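scikit-learn has no TSVM, so the following is only a loose sketch of the idea in the spirit of an annealed self-labeling scheme: guess labels for U, then retrain the SVM with the unlabeled points gradually gaining weight, so the boundary settles in low-density regions. It omits the label swapping and class-proportion constraint of a real TSVM, and the step count is arbitrary.

```python
import numpy as np
from sklearn.svm import SVC

def tsvm_sketch(X_L, y_L, X_U, steps=5):
    """Annealed self-labeling, a loose approximation of a TSVM."""
    clf = SVC(kernel="linear")
    clf.fit(X_L, y_L)
    for step in range(1, steps + 1):
        y_U = clf.predict(X_U)              # current label guesses for U
        w_U = step / steps                  # U's influence grows each round
        X_all = np.vstack([X_L, X_U])
        y_all = np.concatenate([y_L, y_U])
        w_all = np.concatenate([np.ones(len(y_L)),
                                np.full(len(y_U), w_U)])
        clf.fit(X_all, y_all, sample_weight=w_all)
    return clf
```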
