CLASSIFYING ENTITIES INTO AN INCOMPLETE ONTOLOGY Bhavana Dalvi, William W. Cohen, Jamie Callan School of Computer Science, Carnegie Mellon University

Motivation  Existing Techniques  Semi-supervised Hierarchical Classification: Carlson WSDM’10  Extending knowledge bases: Finding new relations or attributes of existing concepts Mohamed et al. EMNLP’11  Unsupervised ontology discovery: Adams et al. NIPS’10, Blei et al. JACM’10, Reisinger et al. ACL’09  Evolving Web-scale datasets  Billions of entities and hundreds of thousands of concepts  Difficult to create a complete ontology  Hierarchical classification of entities into incomplete ontologies is needed

Contributions
 Hierarchical Exploratory EM:
   Adds new instances to the existing classes
   Discovers new classes and adds them at appropriate places in the ontology
 Class constraints:
   Inclusion: every entity that is a "Mammal" is also an "Animal"
   Mutual exclusion: if an entity is an "Electronic Device" then it is not a "Mammal"
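To make the two constraint types concrete, here is a minimal Python sketch of how they can be checked for one entity's set of class labels. The mini-ontology, the parent map, and all function names are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical mini-ontology, used only for illustration.
PARENT = {"Mammal": "Animal", "Animal": "Root", "Electronic Device": "Root"}

def ancestors(cls):
    """Classes implied by `cls` through the Inclusion constraint."""
    out = []
    while cls in PARENT:
        cls = PARENT[cls]
        out.append(cls)
    return out

def consistent(labels):
    """Check one entity's set of class labels against both constraints:
    inclusion  - every (non-root) ancestor of an assigned class is also assigned;
    mutual exclusion - no two assigned classes share the same parent."""
    for c in labels:
        if not set(ancestors(c)) - {"Root"} <= labels:
            return False
    parents = [PARENT.get(c) for c in labels]
    return len(parents) == len(set(parents))

print(consistent({"Animal", "Mammal"}))                       # True
print(consistent({"Animal", "Mammal", "Electronic Device"}))  # False: Animal and
                                                              # Electronic Device are siblings
```

With this representation, assigning "Mammal" forces "Animal" (inclusion), while "Animal" and "Electronic Device" cannot co-occur because they are siblings under the root (mutual exclusion).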

Problem Definition

Review: Exploratory EM [Dalvi et al., ECML 2013]
 Initialize the model with a few seeds per class
 Iterate until convergence (in data likelihood and number of classes):
   E step: predict labels for the unlabeled points; if P(Cj | Xi) is nearly uniform over j = 1..k for a data point Xi, create a new class Ck+1 and assign Xi to it
   M step: recompute model parameters using the seeds plus the predicted labels of the unlabeled points
   The number of classes might therefore increase in each iteration
   Check whether the model selection criterion is satisfied; if not, revert to the model from iteration t-1
 Design choices: the classification/clustering model (K-Means, Naive Bayes, von Mises-Fisher, ...), the class-creation criterion (max/min ratio, JS divergence), and the model selection criterion (AIC, BIC, AICc, ...)
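The loop above can be sketched in Python. This is a hedged illustration rather than the authors' code: it uses a diagonal-Gaussian naive Bayes model, the max/min-ratio class-creation check, and creates at most one new class per E step; all names, thresholds, and fallbacks are assumptions.

```python
import numpy as np

def naive_bayes_log_joint(X, labels, k):
    """Fit a diagonal-Gaussian naive Bayes model from the current labels and
    return log P(x_i, C_j) for every point i and class j."""
    means, var, priors = [], [], []
    for c in range(k):
        Xc = X[labels == c] if (labels == c).any() else X   # guard against empty classes
        means.append(Xc.mean(axis=0))
        var.append(Xc.var(axis=0) + 1e-6)
        priors.append(max((labels == c).mean(), 1e-12))
    means, var = np.array(means), np.array(var)
    log_lik = -0.5 * (np.log(2 * np.pi * var) + (X[:, None, :] - means) ** 2 / var).sum(axis=2)
    return log_lik + np.log(priors)

def exploratory_em(X, seeds, k, max_iter=30, ratio=1.2):
    """Minimal Exploratory EM sketch.
    X     : (n, d) feature matrix
    seeds : dict {point index: class id} with the labeled seed entities
    k     : number of seed classes
    """
    n = X.shape[0]
    rng = np.random.default_rng(0)
    labels = np.array([seeds.get(i, rng.integers(k)) for i in range(n)])
    for _ in range(max_iter):
        log_joint = naive_bayes_log_joint(X, labels, k)        # M step: refit parameters
        post = np.exp(log_joint - log_joint.max(axis=1, keepdims=True))
        post /= post.sum(axis=1, keepdims=True)                # E step: P(C_j | x_i)

        new_labels = post.argmax(axis=1)
        # Class creation: if the posterior is nearly uniform (max/min ratio small),
        # no existing class explains the point, so open one new class this round
        # and route all such points to it (a simplification of the per-point rule).
        uniform = post.max(axis=1) / (post.min(axis=1) + 1e-12) < ratio
        if uniform.any():
            new_labels[uniform] = k
            k += 1
        for i, c in seeds.items():                             # seed labels stay fixed
            new_labels[i] = c
        if np.array_equal(new_labels, labels):                 # converged in labels and #classes
            break
        labels = new_labels
        # A fuller version would also score each candidate model (e.g. with AICc,
        # see the model-selection slide) and revert when the score gets worse.
    return labels, k
```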

Hierarchical Exploratory EM

Divide-And-Conquer Exploratory EM
Example ontology (Levels 1-3): Root → {Food, Location}; Food → {Vegetable, Condiment}; Location → {Country, State}
 Inclusion links a class to its parent: e.g., Spinach, Potato, and Pepper are Vegetables, hence also Food.
 Mutual exclusion holds between siblings: e.g., Food vs. Location, Vegetable vs. Condiment.
Assumptions:
 Classes are arranged in a tree-structured hierarchy.
 Classes at any level of the hierarchy are mutually exclusive.

Divide-And-Conquer Exploratory EM: walkthrough
[Animation slides, summarized: entities are classified top-down through the example tree. "California" follows existing classes with high confidence (Root → Location → State). "Coke" is confidently placed under Food but fits neither Vegetable nor Condiment, so a new class C8 is created under Food and Coke is assigned to it. "Cat" likewise fits no existing class well and triggers creation of another new class.]
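A compact sketch of the divide-and-conquer traversal illustrated by the walkthrough: classify top-down, and open a new class at the level where no existing child fits. The Node class, the score callback, the fit threshold, and the class-naming counter are hypothetical stand-ins for the paper's per-level classifier and class-creation criterion.

```python
import itertools
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)

_new_ids = itertools.count(8)       # hypothetical counter for new class names (C8, C9, ...)

def classify_top_down(entity, node, score, fit_threshold=0.5):
    """Divide-and-conquer sketch: walk the tree level by level, picking the
    best-scoring child at each step; if no child fits well enough, create a
    new class at that level (the exploratory step).  `score(entity, name)`
    is an assumed black box returning a fit probability in [0, 1]."""
    path = [node.name]
    while node.children:
        best = max(node.children, key=lambda c: score(entity, c.name))
        if score(entity, best.name) < fit_threshold:
            best = Node(f"C{next(_new_ids)}")    # e.g. C8 for 'Coke' under Food
            node.children.append(best)
        path.append(best.name)
        node = best
    return path

# Example tree from the slide; `score` would come from the per-level classifier.
root = Node("Root", [Node("Food", [Node("Vegetable"), Node("Condiment")]),
                     Node("Location", [Node("Country"), Node("State")])])
# classify_top_down("California", root, score=...)  ->  ["Root", "Location", "State"]
```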

What are we trying to optimize?
Objective function: maximize { log data likelihood - model penalty } over the number of clusters m, the parameters of clusters C1, ..., Cm, and the assignment Z of entities to clusters, subject to the class constraints (inclusion and mutual exclusion) on Z.
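One plausible way to write the objective out, assuming per-cluster parameters θ and an assignment Z of entities to clusters; the exact likelihood and penalty terms in the paper may differ:

\[
\max_{m,\;\theta_1,\dots,\theta_m,\;Z}\;\;
\sum_{i=1}^{n} \log P(x_i \mid \theta_{z_i}) \;-\; \mathrm{Penalty}(m)
\qquad \text{s.t. } Z \text{ satisfies the inclusion and mutual-exclusion constraints}
\]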

Datasets: two subsets of NELL entities, DS-1 (built on Ontology 1) and DS-2 (built on Ontology 2), with textual contexts drawn from the ClueWeb09 corpus. [Table residue: the per-dataset #classes, #levels, and #NELL entities (given in thousands) did not survive extraction; #contexts is 3.4M for DS-1 and 6.7M for DS-2.]

Results

Dataset | #Train / #Test points | Level | #Seed classes / #Ideal classes
DS-1    | 335 / 2.2K            | 2     | 2 / 3
DS-1    |                       | 3     | 4 / 7
DS-2    | 1.5K / 11.4K          | 2     | 3.9 / 4
DS-2    |                       | 3     | 9.4 / (value lost)
DS-2    |                       | 4     | 2.4 / 10

Macro-averaged seed-class F1 is reported for each row under four settings: FLAT vs. DAC (divide-and-conquer), each with SemisupEM and ExploratoryEM; '*' marks statistically significant results. The numeric F1 scores did not survive when the slide tables were flattened.

Conclusions

Thank You Questions?

Extra Slides

Class Creation Criterion
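The body of this slide is missing from the transcript, but the main deck names two candidate criteria (max/min ratio and JS divergence). Below is a hedged sketch of both checks; the thresholds are assumed values for illustration only.

```python
import numpy as np

def nearly_uniform_maxmin(posterior, threshold=2.0):
    """Max/min-ratio criterion: the posterior over existing classes is 'nearly
    uniform' (so a new class should be created) when max/min is below the
    threshold.  The threshold value is an assumption, not from the paper."""
    p = np.asarray(posterior, dtype=float)
    return p.max() / (p.min() + 1e-12) < threshold

def nearly_uniform_js(posterior, threshold=0.05):
    """JS-divergence criterion: compare the posterior to the uniform
    distribution; a small divergence again signals 'no existing class fits'."""
    p = np.asarray(posterior, dtype=float)
    p = p / p.sum()
    u = np.full_like(p, 1.0 / len(p))
    m = 0.5 * (p + u)
    kl = lambda a, b: np.sum(a * np.log(np.where(a > 0, a / b, 1.0)))
    return 0.5 * kl(p, m) + 0.5 * kl(u, m) < threshold

print(nearly_uniform_maxmin([0.34, 0.33, 0.33]))   # True  -> create a new class
print(nearly_uniform_maxmin([0.90, 0.05, 0.05]))   # False -> assign to the best class
```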

Model Selection: Extended Akaike Information Criterion

AICc(g) = -2*L(g) + 2*v + 2*v*(v+1) / (n - v - 1)

where g is the model being evaluated, L(g) is the log-likelihood of the data given g, v is the number of free parameters of the model, and n is the number of data points.
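The formula translates directly into a small helper; only the function and argument names are mine, and the usage comment reflects how the main deck describes the revert step.

```python
def aicc(log_likelihood, v, n):
    """Extended (corrected) Akaike Information Criterion from the slide:
    AICc(g) = -2*L(g) + 2*v + 2*v*(v + 1) / (n - v - 1)
    log_likelihood : L(g), log-likelihood of the data under model g
    v              : number of free parameters of the model
    n              : number of data points
    Lower is better."""
    return -2.0 * log_likelihood + 2.0 * v + 2.0 * v * (v + 1) / (n - v - 1)

# e.g. keep a (k+1)-class candidate model only if its AICc does not get worse:
# keep_new = aicc(L_new, v_new, n) <= aicc(L_old, v_old, n)
```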