
Transfer Learning Task

Problem Identification
Dataset A — Year: 2000, Features: 48. Model 'M' is trained and tested on A: 98.6% accuracy in training, 97% in testing.
Dataset B — Year: 2006, Features: 96. The same model 'M', trained on A, achieves only 60.9% in testing. Why??

Transfer Learning Transfer learning is the improvement of learning in a new task through the transfer of knowledge from a related task that has already been learned.

Traditional Machine Learning vs. Transfer Learning
Traditional machine learning: different tasks are handled by separate learning systems, each trained from scratch on its own task.
Transfer learning: knowledge from the source task is passed to the learning system for the target task.

Transfer Learning Definition
Given a source domain D_S and source learning task T_S, and a target domain D_T and target learning task T_T, transfer learning aims to help improve the learning of the target predictive function f_T(·) using the source knowledge, where D_S ≠ D_T or T_S ≠ T_T.

Transfer Learning Definition
Transfer learning therefore applies if either holds:
Domain differences: D_S ≠ D_T (different feature spaces, or different marginal distributions)
Task differences: T_S ≠ T_T (different label spaces, or different conditional distributions)

Examples: Cancer Data
Source domain features: Age, Smoking.
Target domain features: Age, Height, Smoking.

Examples: Cancer Data
Source task: classify into cancer or no cancer.
Target task: classify into cancer level one, cancer level two, or cancer level three.

Settings of Transfer Learning
Inductive Transfer Learning — labelled source data: × or √; labelled target data: √. Tasks: classification, regression, …
Transductive Transfer Learning — labelled source data: √; labelled target data: ×. Tasks: classification, regression, …
Unsupervised Transfer Learning — labelled source data: ×; labelled target data: ×. Tasks: clustering, …

Questions to Answer When Transferring
What to transfer? Instances? Features? Model?
How to transfer? Weight instances? Unify features? Map model?
When to transfer? In which situations?

What to Transfer?
Instance-transfer: re-weight some labelled data in the source domain for use in the target domain.
Feature-representation-transfer: find a "good" feature representation that reduces the difference between the source and target domains, or minimizes model error.
Model-transfer: discover shared parameters or priors of models between the source domain and the target domain.
Relational-knowledge-transfer: build a mapping of relational knowledge between the source domain and the target domain.

Inductive Transfer Learning (Instance-transfer)
Assumption: the source-domain and target-domain data use exactly the same features and labels.
Motivation: although the source-domain data cannot be reused directly, some parts of it can still be reused after re-weighting.
Main idea: discriminatively adjust the weights of source-domain data for use in the target domain.

Instance-transfer Assumptions
Source and target tasks share the same feature space: X_S = X_T.
Their marginal distributions differ: P(X_S) ≠ P(X_T).
Consequently, not all source data might be helpful!
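A minimal sketch of this setting: when X_S = X_T but P(X_S) ≠ P(X_T), source instances can be re-weighted by an estimate of the density ratio P_T(x)/P_S(x). The slide does not prescribe an estimator; the "domain classifier" trick below is one common choice, with synthetic Gaussian data standing in for real domains:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
Xs = rng.normal(0.0, 1.0, size=(500, 1))   # source samples ~ P_S(x)
Xt = rng.normal(1.0, 1.0, size=(500, 1))   # shifted target samples ~ P_T(x)

# Train a classifier to separate source (label 0) from target (label 1).
# Then w(x) = p(target|x) / p(source|x) approximates P_T(x)/P_S(x).
X = np.vstack([Xs, Xt])
d = np.concatenate([np.zeros(500), np.ones(500)])
domain_clf = LogisticRegression().fit(X, d)

p = domain_clf.predict_proba(Xs)[:, 1]
weights = p / (1.0 - p)   # large where a source point "looks like" the target
```

Source points closest to the target distribution receive the highest weights, so a learner trained with these sample weights emphasizes the reusable part of the source data.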

Algorithms
What to transfer? Instances.
How to transfer? Weight instances.

Algorithm: TrAdaBoost
Idea: iteratively re-weight source samples so as to:
reduce the effect of "bad" source instances
encourage the effect of "good" source instances
Requires:
a labelled source data set
a very small labelled target data set
an unlabelled target data set
a base learner
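The re-weighting idea above can be sketched as follows. This is a simplified, illustrative version of the TrAdaBoost update rules (Dai et al., 2007), not a faithful reimplementation: misclassified source instances are multiplicatively shrunk, while misclassified target instances get AdaBoost-style boosts. The function name and toy data are this sketch's own:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def tradaboost_sketch(Xs, ys, Xt, yt, n_iter=10):
    """Simplified TrAdaBoost: down-weight 'bad' source instances,
    boost hard target instances, using a decision stump as base learner."""
    ns = len(Xs)
    X = np.vstack([Xs, Xt])
    y = np.concatenate([ys, yt])
    w = np.ones(len(X)) / len(X)                     # uniform initial weights
    beta_src = 1.0 / (1.0 + np.sqrt(2.0 * np.log(ns) / n_iter))
    clf = None
    for _ in range(n_iter):
        clf = DecisionTreeClassifier(max_depth=1, random_state=0)
        clf.fit(X, y, sample_weight=w / w.sum())
        err = (clf.predict(X) != y).astype(float)    # 0/1 loss per instance
        # weighted error on the target portion only
        eps = float(np.sum(w[ns:] * err[ns:]) / np.sum(w[ns:]))
        eps = min(max(eps, 1e-10), 0.49)             # keep updates well defined
        beta_tgt = eps / (1.0 - eps)
        w[:ns] *= beta_src ** err[:ns]               # shrink bad source weights
        w[ns:] *= beta_tgt ** (-err[ns:])            # boost hard target points
    return clf, w
```

After a few rounds, persistently misclassified source instances carry little weight, so the base learner is effectively fitted to the target distribution plus the compatible part of the source.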

Our Case
Model 'M' is trained on dataset D1 and then adapted, via transfer learning, to dataset D2 to recover its accuracy.

Self-Taught Clustering
Unsupervised transfer learning: co-clustering, no labelled data.
Feature-based transfer learning: the features are not the same, and the tasks may not be the same.
First applied to image clustering.
Key idea: find high-level shared features, i.e. a new feature representation.

Self Taught Learning

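One way to realize the "high-level shared features" idea is sparse coding over unlabelled data, as in self-taught learning (Raina et al., 2007). The sketch below uses scikit-learn's dictionary learning as a stand-in for sparse coding; the data sizes are synthetic placeholders, not anything from the slides:

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

rng = np.random.default_rng(0)
unlabeled = rng.normal(size=(200, 16))   # plentiful unlabelled data
labeled = rng.normal(size=(10, 16))      # scarce labelled target data

# Step 1: learn a dictionary of basis vectors ("high-level features")
# from the unlabelled data alone.
dico = MiniBatchDictionaryLearning(n_components=8, random_state=0)
dico.fit(unlabeled)

# Step 2: re-express the labelled data as sparse codes over that
# dictionary; a classifier or clusterer is then trained on the codes
# instead of the raw inputs.
codes = dico.transform(labeled)
print(codes.shape)   # (10, 8)
```

The new representation is shared between domains even though the raw features and tasks may differ, which is exactly the transfer mechanism the slides describe.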

Latent Dirichlet Allocation (LDA)
LDA is a generative probabilistic model of a corpus. The basic idea is that documents are represented as random mixtures over latent topics, where each topic is characterized by a distribution over words.
Typically used for topic modelling: forums, Twitter messages, text corpora.
It does not consider word order (a bag-of-words assumption).
It can also be viewed as a dimensionality-reduction technique.