Multi-view Exploratory Learning for AKBC Problems
Bhavana Dalvi and William W. Cohen
School of Computer Science, Carnegie Mellon University


Motivation

- The traditional EM method for semi-supervised learning (SSL) jointly learns the missing labels of unlabeled data points and the model parameters.
- We consider two extensions of traditional EM for SSL:
  - Introducing a new latent variable, unobserved classes, by dynamically creating new classes when appropriate.
  - Assigning multiple labels from multiple levels of a class hierarchy while satisfying ontological constraints, and considering multiple data views.
- Our proposed framework combines structural search for the best class hierarchy with SSL, reducing the semantic drift associated with erroneously grouping unanticipated classes with expected classes.

Modeling Unobserved Classes

Model Selection
- This step ensures that we do not create too many new classes.
- We tried the BIC, AIC, and AICc criteria; Extended AIC (AICc) worked best for our tasks.

  AICc(g) = AIC(g) + 2 * v * (v + 1) / (n - v - 1)

  Here g is the model being evaluated, L(g) is the log-likelihood of the data given g, v is the number of free parameters of the model, and n is the number of data points. AIC(g) is the standard Akaike criterion, -2 * L(g) + 2 * v.

Different Document Representations
- Naïve Bayes: assumes a multinomial distribution for feature occurrences and explicitly models the class prior.
- Seeded K-Means: similarity is based on the cosine distance between centroids and data points.
- Seeded von Mises-Fisher: an SSL method for data distributed on the unit hypersphere.

[Figure: Example use-case of Exploratory EM on the 20 Newsgroups dataset (#seed classes = 6)]

Multi-view Exploratory EM

[Figures: Multiple Data Views; Ontological Constraints]

Incorporating Multiple Views and Ontological Constraints
- Each data point is assigned a bit vector of labels. Subset and mutual exclusion constraints determine which candidate bit vectors are consistent.
- GLOFIN: a mixed integer program is solved for each data point to obtain the optimal label vector. [Dalvi et al. WSDM 2015]
- Optimized Divide and Conquer (OptDAC): combines (1) a divide-and-conquer, top-down strategy to detect and place new categories in the ontology with (2) the mixed integer programming technique (GLOFIN) to select the optimal set of labels for a data point, consistent with the ontological constraints.

AKBC Tasks

Macro-reading (Explore-EM)
- Semi-supervised classification of noun phrases into categories, using distributional features.
- Exploratory learning can reduce semantic drift of seed classes. [Dalvi et al. ECML 2013]

Micro-reading
- Task: classify an entity mention using context-specific features.
- Clustering NIL entities for the KBP entity discovery and linking (EDL) task. [Mazaitis et al., KBP 2014]

Multi-view Hierarchical SSL (MaxAgree)
- The MaxAgree method exploits clues from different data views.
- We define multi-view clustering as an optimization problem and compare various methods for combining scores across views. MaxAgree is more robust than Prod-Score as the difference in performance between views varies.
- Our proposed Hier-MaxAgree method can incorporate both the clues from multiple views and ontological constraints. [Dalvi and Cohen, in submission]
- On entity classification for the NELL KB, Hier-MaxAgree gave state-of-the-art performance.

[Figure: Performance improvement over best view; correlation w.r.t. difference in views (coefficient, p-value) for Prod-Score and MaxAgree. Settings: Text-patterns + Ontology-1, Text-patterns + Ontology-2, HTML-tables + Ontology-1, HTML-tables + Ontology-2]

Automatic Gloss Finding for KBs (GLOFIN)
- We developed the GLOFIN method, which takes a gloss-free KB and a large collection of glosses, and automatically matches glosses to entities in the KB. [Dalvi et al. WSDM 2015]
- Glosses with only one candidate KB entity (unambiguous glosses) are used as training data for a hierarchical classification model over the categories in the KB. Ambiguous glosses are then disambiguated based on the KB category they are assigned to.
- Our method outperformed SVM and a label propagation baseline, especially when the amount of training data is small.
- Future work: apply GLOFIN to word sense disambiguation w.r.t. the WordNet synset hierarchy.

Hierarchical Exploratory Learning (OptDAC)
- We proposed OptDAC, which performs hierarchical SSL in the presence of incomplete class ontologies.
- It uses a mixed integer programming formulation to find optimal label assignments for a data point while traversing the class ontology top-down to detect whether a new class needs to be added and where to place it. [Dalvi and Cohen, under review]

[Figure: An example of an ontology extended by OptDAC. Node labels: Root, Food, Location, Country, State, Vegetable, Condiment, Coke, C8]

Conclusions
- Exploratory learning helps reduce semantic drift of seeded classes. It becomes more powerful in conjunction with multiple data views and a class hierarchy, when these are imposed as soft constraints on the label vectors.
- It can be applied to multiple AKBC tasks such as macro-reading, gloss finding, and ontology extension.
- Datasets and code can be downloaded from:

Acknowledgements: This work is supported by a Google PhD Fellowship in Information Extraction and a Google Research Grant.
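
As an illustration of the Model Selection step, the AICc criterion can be sketched in a few lines of Python. This is a minimal sketch, not the authors' code: the function names, the candidate tuples, and the log-likelihood numbers are hypothetical, and AIC(g) is taken to be the standard -2 * L(g) + 2 * v. The candidate with the lowest AICc is kept, which penalizes adding a new class when the gain in likelihood does not justify the extra parameters.

```python
from typing import List, Tuple


def aic(log_likelihood: float, num_params: int) -> float:
    """Standard AIC: -2 * L(g) + 2 * v."""
    return -2.0 * log_likelihood + 2.0 * num_params


def aicc(log_likelihood: float, num_params: int, num_points: int) -> float:
    """AICc(g) = AIC(g) + 2 * v * (v + 1) / (n - v - 1)."""
    denom = num_points - num_params - 1
    if denom <= 0:
        raise ValueError("AICc requires n > v + 1")
    correction = 2.0 * num_params * (num_params + 1) / denom
    return aic(log_likelihood, num_params) + correction


def select_model(candidates: List[Tuple[str, float, int]], num_points: int) -> str:
    """Return the name of the candidate (name, L(g), v) with the lowest AICc."""
    return min(candidates, key=lambda c: aicc(c[1], c[2], num_points))[0]


# Hypothetical comparison: keep the current classes, or add one more class.
candidates = [
    ("6 classes", -100.0, 5),   # (name, log-likelihood, free parameters)
    ("7 classes", -98.0, 10),
]
best = select_model(candidates, num_points=50)
print(best)
```

In this made-up comparison the small likelihood gain of the larger model does not offset its extra parameters, so the smaller model is selected.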
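
The consistency check on label bit vectors (subset and mutual exclusion constraints) can be sketched as follows. This is only an illustrative check, not the mixed integer program used by GLOFIN or OptDAC. The class names come from the example ontology on the poster; the parent-child layout, the mutual exclusion pairs, and all function names are assumptions made for the example.

```python
from typing import List, Set, Tuple

# Class names from the poster's example ontology; ordering is arbitrary.
CLASSES: List[str] = ["Food", "Location", "Country", "State", "Vegetable", "Condiment"]
INDEX = {name: i for i, name in enumerate(CLASSES)}

# Assumed subset constraints: (child, parent) means child implies parent.
SUBSET: List[Tuple[str, str]] = [
    ("Country", "Location"),
    ("State", "Location"),
    ("Vegetable", "Food"),
    ("Condiment", "Food"),
]

# Assumed mutual exclusion constraints: both labels cannot be set together.
MUTEX: List[Tuple[str, str]] = [
    ("Food", "Location"),
    ("Country", "State"),
    ("Vegetable", "Condiment"),
]


def to_bits(labels: Set[str]) -> List[int]:
    """Encode a set of class names as a 0/1 vector aligned with CLASSES."""
    bits = [0] * len(CLASSES)
    for name in labels:
        bits[INDEX[name]] = 1
    return bits


def is_consistent(bits: List[int]) -> bool:
    """Check a label bit vector against subset and mutual exclusion constraints."""
    for child, parent in SUBSET:
        if bits[INDEX[child]] and not bits[INDEX[parent]]:
            return False  # child label set without its parent
    for a, b in MUTEX:
        if bits[INDEX[a]] and bits[INDEX[b]]:
            return False  # two mutually exclusive labels set together
    return True


print(is_consistent(to_bits({"Location", "Country"})))
print(is_consistent(to_bits({"Country"})))
print(is_consistent(to_bits({"Food", "Location"})))
```

A solver-based approach would search over only the consistent vectors for the one with the best score; the check above just shows what "consistent" means for a single candidate.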