Bridging Domains Using World Wide Knowledge for Transfer Learning

Presentation transcript:

Bridging Domains Using World Wide Knowledge for Transfer Learning
Evan Wei Xiang, Bin Cao, Derek Hao Hu, and Qiang Yang
TKDE, 2010
Presented by Wen-Chung Liao, 2010/05/12

Outline
Motivation
Objectives
Methodology
Experiments
Conclusions
Comments

Motivation
Supervised learning requires sufficient labeled instances, but it is often not easy or feasible to obtain new labeled data in a domain of interest.
To solve this problem, transfer learning techniques (domain adaptation techniques) capture the shared knowledge from related domains (source domains) where labeled data are available, and use that knowledge to improve the performance of data mining tasks in a target domain.
However, transfer learning may not work well when the difference (information gap) between the source and target domains is large.

Objectives
To solve this problem, introduce a bridge between the two different domains by leveraging additional knowledge sources such as Wikipedia or the Open Directory Project (ODP).
Treat the two domains as drawn from a single underlying distribution, so that the "domain adaptation problem" becomes a classification problem under the supervised setting or a semisupervised (transductive) setting.
The paper introduces a novel domain adaptation algorithm called BIG (Bridging Information Gap), which applies semisupervised learning (SSL) to domain adaptation problems based on the use of the auxiliary data (bridge).
Data used:
the labeled data from the source domain
the unlabeled data from the target domain
an auxiliary data source such as Wikipedia
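
As a rough illustration of pooling the three data sources into one semisupervised problem, the sketch below runs a simple self-training loop. It is only a stand-in under stated assumptions: the nearest-centroid learner, the function names, and the toy data are all hypothetical, and the learner is not the TSVM that BIG relies on.

```python
# Illustrative only: a nearest-centroid self-training loop stands in for the
# semisupervised learner. It is NOT the TSVM used by BIG.

def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def self_train(labeled, unlabeled, rounds=3):
    """labeled: list of (x, y) with y in {-1, +1}; unlabeled: list of x.
    Returns the pseudo-labels given to `unlabeled` after the last round."""
    pseudo = []
    for _ in range(rounds):
        # Refit class centroids on true labels plus current pseudo-labels.
        train = labeled + pseudo
        pos = centroid([x for x, y in train if y == +1])
        neg = centroid([x for x, y in train if y == -1])
        # Relabel every unlabeled point by its nearest centroid.
        pseudo = [(x, +1 if sq_dist(x, pos) < sq_dist(x, neg) else -1)
                  for x in unlabeled]
    return [y for _, y in pseudo]

# Hypothetical toy data for the three sources named on the slide.
source_labeled = [([0.0, 0.0], -1), ([1.0, 0.0], +1)]
target_unlabeled = [[3.0, 0.2], [-2.0, 0.1]]
auxiliary_bridge = [[2.0, 0.0], [-1.0, 0.0]]

# Target and auxiliary data are pooled into a single unlabeled set.
labels = self_train(source_labeled, target_unlabeled + auxiliary_bridge)
```

In this toy setup the auxiliary points sit between the source and target points, which is the intuition behind calling them a bridge.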

Support vector machines (SVMs)
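
For reference, the textbook soft-margin SVM and the transductive SVM (TSVM) objectives can be written as follows. The notation here (w, b, the slack variables, C, and C^*) is an assumption taken from the standard formulations and may differ from the symbols on the original slide.

```latex
% Soft-margin SVM on n labeled examples (x_i, y_i), y_i \in \{-1,+1\}
\min_{w,\,b,\,\xi}\ \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\xi_i
\quad\text{s.t.}\quad y_i\,(w^{\top}x_i + b) \ge 1-\xi_i,\ \ \xi_i \ge 0 .

% TSVM: additionally chooses labels y_j^* for m unlabeled examples x_j^*
\min_{w,\,b,\,\xi,\,\xi^{*},\,y^{*}}\ \frac{1}{2}\lVert w\rVert^{2}
  + C\sum_{i=1}^{n}\xi_i + C^{*}\sum_{j=1}^{m}\xi_j^{*}
\quad\text{s.t.}\quad
\begin{cases}
 y_i\,(w^{\top}x_i + b) \ge 1-\xi_i, & \xi_i \ge 0,\\
 y_j^{*}\,(w^{\top}x_j^{*} + b) \ge 1-\xi_j^{*}, & \xi_j^{*} \ge 0,\ \ y_j^{*}\in\{-1,+1\}.
\end{cases}
```

The search over the discrete labels y_j^* is what makes exact TSVM training combinatorial, so practical solvers use approximations.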

Methodology
Information gap with no background knowledge available: SVM
Information gap with background knowledge: TSVM
Selecting the set of unlabeled data {x_i} from K to minimize the margin: NP-hard
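
Because exact selection is NP-hard, one natural approximation is a greedy ranking by margin. The sketch below assumes a linear classifier (w, b) has already been trained and picks the k candidates from the auxiliary pool K that lie closest to its decision boundary. The function names, the linear model, and the toy values are hypothetical, and this is a simplified heuristic rather than the paper's exact procedure.

```python
import math

def margin_distance(w, b, x):
    """Geometric distance of x from the hyperplane w.x + b = 0."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(score) / norm

def select_min_margin(w, b, pool, k):
    """Return indices of the k pool items with the smallest margin.

    Greedy heuristic: rank each candidate independently instead of
    searching over all subsets of the pool.
    """
    ranked = sorted(range(len(pool)),
                    key=lambda i: margin_distance(w, b, pool[i]))
    return ranked[:k]

# Hypothetical linear model and auxiliary pool K.
w, b = [1.0, 0.0], 0.0
K = [[3.0, 1.0], [0.5, 2.0], [-0.1, 0.0], [2.0, 2.0]]

chosen = select_min_margin(w, b, K, k=2)
```

In an iterative scheme, the chosen items would be added to the unlabeled set, the classifier retrained, and the selection repeated.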

Experiments

Conclusions
Three major contributions:
1) We view the problem from a new perspective, i.e., we consider the problem of transfer learning as one of filling in the information gap based on a large document corpus.
2) We show that we can successfully bridge the source and target domains using well-developed semisupervised learning algorithms.
3) We propose a min-margin algorithm that can effectively identify and reduce the information gap between two domains.
Future work:
First, we plan to validate the effectiveness of our approach through other semisupervised learning algorithms and other relational knowledge bases.
Second, we plan to extend our approach to handle heterogeneous transfer learning.
Finally, we will try to develop online TSVM methods for incremental cross-domain transductive learning.

Comments
Advantage: new perspective
Shortage:
Applications: Web and document data mining applications, information retrieval, spam detection, online advertisement, Web search