COP5992 – DATA MINING TERM PROJECT RANDOM SUBSPACE METHOD + CO-TRAINING by SELIM KALAYCI.

RANDOM SUBSPACE METHOD (RSM)
Proposed by Tin Kam Ho: "The Random Subspace Method for Constructing Decision Forests", 1998.
Another technique for combining weak classifiers, like Bagging and Boosting.

RSM ALGORITHM
1. Repeat for b = 1, 2, ..., B:
   (a) Select an r-dimensional random subspace X_b from the original p-dimensional feature space X.
   (b) Construct a classifier C_b(x) using only the features in X_b.
2. Combine the classifiers C_b(x), b = 1, 2, ..., B, by simple majority voting into a final decision rule.
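A minimal sketch of the RSM loop above (not from the original slides), assuming scikit-learn decision trees as the base classifiers and integer class labels; B, r, and the tree settings are illustrative choices.

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def rsm_fit(X, y, B=50, r=5, seed=0):
        # Step 1: train B base classifiers, each on a random r-dimensional feature subspace.
        rng = np.random.default_rng(seed)
        p = X.shape[1]
        ensemble = []
        for _ in range(B):
            feats = rng.choice(p, size=r, replace=False)         # 1(a) pick a random subspace
            clf = DecisionTreeClassifier().fit(X[:, feats], y)   # 1(b) build classifier C_b on it
            ensemble.append((feats, clf))
        return ensemble

    def rsm_predict(ensemble, X):
        # Step 2: combine C_1 ... C_B by simple majority voting (integer labels assumed).
        votes = np.array([clf.predict(X[:, feats]) for feats, clf in ensemble])
        return np.array([np.bincount(col).argmax() for col in votes.T.astype(int)])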

MOTIVATION FOR RSM
Redundancy in the data feature space:
 - a completely redundant feature set
 - redundancy spread over many features
Weak classifiers that have critical training sample sizes.

RSM PERFORMANCE ISSUES
RSM performance depends on:
 - the training sample size
 - the choice of base classifier
 - the choice of combining rule (simple majority vs. weighted voting)
 - the degree of redundancy in the dataset
 - the number of features chosen

DECISION FORESTS (by Ho)
A combination of trees instead of a single tree.
Assumption: the dataset has some redundant features.
Works efficiently with any decision tree algorithm and data splitting method.
Ideally, look for the best individual trees with the lowest tree similarity.
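As a usage note (not from the slides): a random-subspace decision forest of this kind can be assembled with scikit-learn's BaggingClassifier by disabling bootstrap sampling of examples and subsampling features instead; the parameter values below are illustrative.

    from sklearn.ensemble import BaggingClassifier
    from sklearn.tree import DecisionTreeClassifier

    # Every tree sees all training examples (bootstrap=False) but only a random
    # half of the features, i.e. one random subspace per tree.
    forest = BaggingClassifier(
        DecisionTreeClassifier(),
        n_estimators=100,
        bootstrap=False,
        max_features=0.5,
        random_state=0,
    )
    # forest.fit(X_train, y_train); forest.predict(X_test)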

UNLABELED DATA
A small number of labeled documents.
A large pool of unlabeled documents.
How can we classify the unlabeled documents accurately?

EXPECTATION-MAXIMIZATION (E-M)
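One common way to apply E-M in this setting (a minimal hard-EM sketch, not from the original slides; the naive Bayes learner, hard label assignments, and dense count matrices are illustrative assumptions): fit a classifier on the labeled documents, then alternate between estimating labels for the unlabeled pool (E-step) and refitting on all documents with those estimated labels (M-step).

    import numpy as np
    from sklearn.naive_bayes import MultinomialNB

    def em_classify(X_labeled, y_labeled, X_unlabeled, iters=10):
        # Initialize from the labeled documents only.
        clf = MultinomialNB().fit(X_labeled, y_labeled)
        for _ in range(iters):
            pseudo = clf.predict(X_unlabeled)               # E-step: estimate labels for U
            X_all = np.vstack([X_labeled, X_unlabeled])     # M-step: refit on L + pseudo-labeled U
            y_all = np.concatenate([y_labeled, pseudo])
            clf = MultinomialNB().fit(X_all, y_all)
        return clf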

CO-TRAINING
Blum and Mitchell, "Combining Labeled and Unlabeled Data with Co-Training", 1998.
Requirements:
 - two sufficiently strong feature sets (views)
 - the two views are conditionally independent given the class

CO-TRAINING

APPLICATION OF CO-TRAINING TO A SINGLE FEATURE SET
Algorithm (see the runnable sketch below):
 - Obtain a small set L of labeled examples.
 - Obtain a large set U of unlabeled examples.
 - Obtain two sets F1 and F2 of features that are sufficiently redundant.
 - While U is not empty do:
   - Learn classifier C1 from L based on F1.
   - Learn classifier C2 from L based on F2.
   - For each classifier Ci do:
     - Ci labels examples from U based on Fi.
     - Ci chooses the most confidently predicted examples E from U.
     - E is removed from U and added (with the labels assigned by Ci) to L.
 - End loop.
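A minimal runnable sketch of the loop above (not from the original slides): the two views F1 and F2 are given as column-index lists, Gaussian naive Bayes is an illustrative base learner, and k (how many confident examples each classifier moves from U to L per round) is an assumed parameter; dense NumPy arrays are assumed throughout.

    import numpy as np
    from sklearn.naive_bayes import GaussianNB

    def co_train(X_l, y_l, X_u, view1, view2, k=5, max_rounds=50):
        X_l, y_l, X_u = X_l.copy(), y_l.copy(), X_u.copy()
        for _ in range(max_rounds):                            # "while U is not empty"
            if len(X_u) == 0:
                break
            for view in (view1, view2):
                if len(X_u) == 0:
                    break
                clf = GaussianNB().fit(X_l[:, view], y_l)      # learn C_i from L based on F_i
                proba = clf.predict_proba(X_u[:, view])        # C_i labels examples from U
                pred = clf.classes_[proba.argmax(axis=1)]
                top = np.argsort(proba.max(axis=1))[-k:]       # most confidently predicted E
                X_l = np.vstack([X_l, X_u[top]])               # add E (with C_i's labels) to L
                y_l = np.concatenate([y_l, pred[top]])
                X_u = np.delete(X_u, top, axis=0)              # remove E from U
        # Final per-view classifiers trained on the augmented labeled set.
        return GaussianNB().fit(X_l[:, view1], y_l), GaussianNB().fit(X_l[:, view2], y_l)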

THINGS TO DO
 - How can we measure redundancy and use it efficiently?
 - Can we improve co-training?
 - How can we apply RSM efficiently to:
   - supervised learning
   - semi-supervised learning
   - unsupervised learning

QUESTIONS?