Web-Mining Agents: Transfer Learning TrAdaBoost R. Möller Institute of Information Systems University of Lübeck

By Haitham Bou Ammar, Maastricht University. Based on an excerpt of: Transfer for Supervised Learning Tasks.

Traditional Machine Learning vs. Transfer Learning

In traditional machine learning, each task gets its own learning system, trained from scratch on that task's data, even when the tasks are related. In transfer learning, knowledge extracted from a source task is passed to the learning system of the target task, so the target learner does not start from zero.

Transfer Learning Definition

Notation:
- Domain: $\mathcal{D} = \{\mathcal{X}, P(X)\}$
  - Feature space: $\mathcal{X}$
  - Marginal probability distribution: $P(X)$, with $X = \{x_1, \dots, x_n\} \in \mathcal{X}$
- Given a domain $\mathcal{D} = \{\mathcal{X}, P(X)\}$, a task is $\mathcal{T} = \{\mathcal{Y}, P(Y \mid X)\}$:
  - Label space: $\mathcal{Y}$
  - Predictive distribution: $P(Y \mid X)$

Transfer Learning Definition

Given a source domain $\mathcal{D}_S$ and source learning task $\mathcal{T}_S$, and a target domain $\mathcal{D}_T$ and target learning task $\mathcal{T}_T$, transfer learning aims to improve the learning of the target predictive function $f_T(\cdot)$ using the source knowledge, where $\mathcal{D}_S \neq \mathcal{D}_T$ or $\mathcal{T}_S \neq \mathcal{T}_T$.

Transfer Learning Definition

Transfer learning therefore applies if at least one of the following holds:
- Domain differences: $\mathcal{D}_S \neq \mathcal{D}_T$
- Task differences: $\mathcal{T}_S \neq \mathcal{T}_T$

Questions to Answer When Transferring
- What to transfer? Instances, a model, or features?
- How to transfer? Weight instances, map the model, or unify the feature spaces?
- When to transfer? In which situations does transfer actually help?

Algorithms: TrAdaBoost

Assumptions:
- Source and target tasks share the same feature space: $\mathcal{X}_S = \mathcal{X}_T$
- The marginal distributions differ: $P(X_S) \neq P(X_T)$

Consequently, not all source data may be helpful!

Algorithm: TrAdaBoost

Idea: iteratively reweight the source samples so as to
- reduce the effect of "bad" source instances, and
- encourage the effect of "good" source instances.

Requires:
- a labeled source-task data set,
- a very small labeled target-task data set,
- an unlabeled target data set, and
- a base learner.

Algorithm: TrAdaBoost

The algorithm has three components: weight initialization, hypothesis learning with error calculation, and the weight update. All instances start with equal weights; in each round, the base learner is trained on the weighted union of source and target data, its error is measured on the target data only, and the weights are then updated: misclassified target instances are up-weighted, as in AdaBoost, while misclassified source instances are down-weighted, so unhelpful source data gradually loses influence.
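
The update formulas appeared as images on the original slide. Below is a minimal sketch of the algorithm as described by Dai et al. (ICML 2007); the names (`Xs`, `ys` for source data, `Xt`, `yt` for target data, `n_rounds`) are illustrative, and a decision stump stands in for the base learner.

```python
# A minimal TrAdaBoost sketch (after Dai et al., 2007). Labels are in {0, 1}.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def tradaboost(Xs, ys, Xt, yt, n_rounds=20):
    n, m = len(Xs), len(Xt)
    X = np.vstack([Xs, Xt])
    y = np.concatenate([ys, yt])
    w = np.ones(n + m)                      # one weight per instance
    beta_src = 1.0 / (1.0 + np.sqrt(2.0 * np.log(n) / n_rounds))
    hypotheses, betas = [], []
    for _ in range(n_rounds):
        p = w / w.sum()
        h = DecisionTreeClassifier(max_depth=1)   # base learner
        h.fit(X, y, sample_weight=p)
        err = np.abs(h.predict(X) - y)            # 0/1 loss per instance
        # The error is measured on the *target* portion only.
        eps = np.sum(w[n:] * err[n:]) / np.sum(w[n:])
        eps = min(max(eps, 1e-10), 0.499)         # keep the update well defined
        beta_t = eps / (1.0 - eps)
        # Source: shrink the weights of misclassified instances.
        w[:n] *= beta_src ** err[:n]
        # Target: grow the weights of misclassified instances (AdaBoost-style).
        w[n:] *= beta_t ** (-err[n:])
        hypotheses.append(h)
        betas.append(beta_t)
    return hypotheses, betas

def predict(hypotheses, betas, X):
    # Vote only over the second half of the boosting rounds, as in the paper.
    start = len(hypotheses) // 2
    score = sum(-np.log(betas[t]) * hypotheses[t].predict(X)
                for t in range(start, len(hypotheses)))
    thresh = sum(-np.log(betas[t]) * 0.5 for t in range(start, len(hypotheses)))
    return (score >= thresh).astype(int)
```

The source update factor $\beta = 1/(1 + \sqrt{2 \ln n / N})$ and the target factor $\beta_t = \epsilon_t / (1 - \epsilon_t)$ follow the paper; the final hypothesis votes only over the later, more reliable rounds.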

Algorithms: Self-Taught Learning

The targeted problem:
1. little labeled data (e.g., labeled images of cars and motorcycles), and
2. a lot of unlabeled data (e.g., arbitrary natural scenes).

The goal is to build a model on the labeled data while exploiting the unlabeled data.

Algorithms: Self-Taught Learning

Assumptions:
- Source and target tasks have different feature spaces: $\mathcal{X}_S \neq \mathcal{X}_T$
- The marginal distributions differ: $P(X_S) \neq P(X_T)$
- The label spaces differ: $\mathcal{Y}_S \neq \mathcal{Y}_T$

Algorithms: Self-Taught Learning

Framework:
- Source: an unlabeled data set (e.g., natural scenes)
- Target: a labeled data set; the goal is to build a classifier for cars and motorbikes

Algorithms: Self-Taught Learning

Step one: discover high-level features from the source data by solving

$$\min_{b,\,a} \; \sum_i \Big\| x_u^{(i)} - \sum_j a_j^{(i)} b_j \Big\|_2^2 \;+\; \beta \sum_i \big\|a^{(i)}\big\|_1 \qquad \text{s.t. } \|b_j\|_2 \le 1 \;\; \forall j$$

The first term is the reconstruction error, the second is the sparsity-inducing regularization term, and the constraint keeps the bases $b_j$ bounded.
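
As a sketch of step one, scikit-learn's dictionary learner can stand in for the sparse-coding solver of the self-taught learning paper (Raina et al., ICML 2007). The data shapes and hyperparameters below are illustrative only.

```python
# Learn a set of bases (high-level features) from unlabeled source data.
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

rng = np.random.default_rng(0)
unlabeled_patches = rng.standard_normal((5000, 64))  # e.g., flattened 8x8 patches

learner = MiniBatchDictionaryLearning(
    n_components=128,  # number of bases b_j
    alpha=1.0,         # sparsity penalty beta
    random_state=0,
)
learner.fit(unlabeled_patches)
bases = learner.components_  # shape (128, 64): one learned basis per row
```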

Algorithm: Self-Taught Learning

[Figure: the unlabeled data set is transformed into a set of learned high-level features.]

Algorithm: Self-Taught Learning

Step two: project the target data onto the attained features by solving

$$\hat{a}(x_l) = \arg\min_a \Big\| x_l - \sum_j a_j b_j \Big\|_2^2 + \beta\,\|a\|_1$$

Informally: find the activations in the attained bases such that
1. the reconstruction error is minimized, and
2. the attained activation vector is sparse.

(A code sketch covering steps two and three follows step three below.)

Algorithms: Self-Taught Learning

[Figure: target examples re-expressed as sparse activations over the high-level features.]

Algorithms: Self-Taught Learning

Step three: learn a classifier with the new features.

The full pipeline: learn new features from the source task (step 1), project the target data onto them (step 2), and learn a model on the projected target data (step 3).
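
A minimal sketch of steps two and three together, assuming bases learned as in the step-one sketch (here replaced by random stand-ins so the snippet runs on its own); all names, shapes, and labels are illustrative.

```python
# Sparse-code the labeled target data over the bases, then classify.
import numpy as np
from sklearn.decomposition import sparse_encode
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
bases = rng.standard_normal((128, 64))      # stand-in for the learned bases
target_X = rng.standard_normal((300, 64))   # labeled target inputs
target_y = rng.integers(0, 2, size=300)     # e.g., car vs. motorbike labels

# Step two: sparse activations of the target data over the bases.
codes = sparse_encode(target_X, bases, alpha=1.0)

# Step three: an ordinary classifier on the new representation.
clf = LogisticRegression(max_iter=1000).fit(codes, target_y)

# At test time, encode new inputs with the same bases, then classify.
test_codes = sparse_encode(rng.standard_normal((5, 64)), bases, alpha=1.0)
print(clf.predict(test_codes))
```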

Conclusions
- Transfer learning re-uses source knowledge to help a target learner.
- Transfer learning is not the same as generalization.
- TrAdaBoost transfers instances.
- Self-taught learning transfers features learned from unlabeled data.

Next in Web-Mining Agents: unlabeled features revisited; unsupervised learning: clustering.