Learning Clusterwise Similarity with First-Order Features Aron Culotta and Andrew McCallum University of Massachusetts - Amherst NIPS Workshop on Theoretical.

Learning Clusterwise Similarity with First-Order Features
Aron Culotta and Andrew McCallum
University of Massachusetts - Amherst
NIPS Workshop on Theoretical Foundations of Clustering
December 10, 2005

Supervised Clustering
Estimate pairwise similarity metric

Supervised Clustering

Conditional Models of Identity Uncertainty with Application to Noun Coreference [McCallum, Wellner 04]
[Figure: mentions x_1, x_2, x_3 ("He", "Jon", "Jonathan") linked by pairwise variables y_12, y_13, y_23. The factors g_12, g_13, g_23 are the learned pairwise metric; g_123 is a transitivity checking function.]

Inference = Graph Partitioning
[McCallum, Wellner 04] [Boykov et al 99] [Bansal et al 02]
[Figure: graph over mentions x_1, x_2, x_3 ("He", "Jon", "Jonathan")]
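A minimal sketch of inference as graph partitioning: enumerate every partition of a few mentions and keep the one with the highest total within-cluster edge weight. The edge weights are made-up illustrative values, not learned ones, and exhaustive enumeration stands in for the approximate partitioning algorithms cited on the slide. The function names are hypothetical.

```python
from itertools import combinations


def partitions(items):
    """Yield every partition of `items` as a list of clusters (lists)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for smaller in partitions(rest):
        # Put `first` into each existing cluster in turn...
        for i in range(len(smaller)):
            yield smaller[:i] + [[first] + smaller[i]] + smaller[i + 1:]
        # ...or into a new cluster of its own.
        yield [[first]] + smaller


def partition_score(clusters, weight):
    """Sum the edge weights of all pairs that share a cluster."""
    return sum(
        weight[frozenset(pair)]
        for cluster in clusters
        for pair in combinations(cluster, 2)
    )


def best_partition(items, weight):
    """Exhaustive search; only feasible for a handful of nodes."""
    return max(partitions(items), key=lambda c: partition_score(c, weight))


# Hypothetical edge weights: positive means "probably coreferent".
mentions = ["He", "Jon", "Jonathan"]
weight = {
    frozenset(("Jon", "Jonathan")): 2.0,
    frozenset(("He", "Jon")): 0.5,
    frozenset(("He", "Jonathan")): -1.0,
}
best = best_partition(mentions, weight)
print(best, partition_score(best, weight))
```

With these weights, merging all three mentions is penalized by the negative He/Jonathan edge, so the search prefers to keep "He" apart.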

Inside the Pairwise Metric
–String x_i has low edit distance to x_j
–x_i is a pronoun in the same sentence as x_j
–x_i has the same number and gender as x_j
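The three pairwise features could be sketched as follows. The mention dictionaries, their keys, the `max_edits` threshold, and the weights are all hypothetical; in practice the weights would be learned from labeled clusterings.

```python
def edit_distance(a, b):
    """Levenshtein distance via the standard dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def pairwise_features(xi, xj, max_edits=3):
    """Binary features over one pair of mentions."""
    dist = edit_distance(xi["text"].lower(), xj["text"].lower())
    return {
        "low_edit_distance": dist <= max_edits,
        "pronoun_same_sentence": xi["is_pronoun"] and xi["sentence"] == xj["sentence"],
        "same_number_gender": xi["number"] == xj["number"] and xi["gender"] == xj["gender"],
    }


def pairwise_score(xi, xj, weights):
    """Weighted sum of the features that fire for this pair."""
    feats = pairwise_features(xi, xj)
    return sum(weights[name] for name, on in feats.items() if on)


# Hypothetical mentions and weights, for illustration only.
he = {"text": "He", "is_pronoun": True, "sentence": 1, "number": "sg", "gender": "m"}
jonathan = {"text": "Jonathan", "is_pronoun": False, "sentence": 1, "number": "sg", "gender": "m"}
weights = {"low_edit_distance": 1.0, "pronoun_same_sentence": 0.5, "same_number_gender": 0.5}
print(pairwise_features(he, jonathan), pairwise_score(he, jonathan, weights))
```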

Drawbacks of Pairwise Metric
Cannot represent cluster-wide constraints, e.g.
–A cluster of pronouns should have at least one non-pronoun.
–A researcher is unlikely to publish in more than 5 different conferences in the same year.
–A person is unlikely to have more than 3 different job titles in the same year.
[Milch et al 04]

Clusterwise Metric
Measures compatibility of all nodes in a cluster
Enables first-order features
–mean, median, mode of attributes
–maximum string edit distance is K
–cluster size is greater than N
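A rough sketch of first-order features computed over a whole cluster rather than a pair. The record layout, the `year` attribute, and the thresholds `k` and `n` are assumptions. The `distance` argument can be any pairwise string distance; the example passes a crude length-gap stand-in, not true edit distance, and treats the edit-distance feature as a "maximum at most K" indicator.

```python
from itertools import combinations
from statistics import mean, median, mode


def clusterwise_features(cluster, distance, k=3, n=2, attribute="year"):
    """Features that aggregate over every mention in the cluster at once."""
    values = [m[attribute] for m in cluster]
    dists = [distance(a["text"], b["text"]) for a, b in combinations(cluster, 2)]
    return {
        # Aggregates of one attribute across the cluster.
        "mean": mean(values),
        "median": median(values),
        "mode": mode(values),
        # Largest pairwise string distance stays within K.
        "max_distance_at_most_k": max(dists, default=0) <= k,
        # Cluster has more than N members.
        "size_gt_n": len(cluster) > n,
    }


def length_gap(a, b):
    """Stand-in distance (assumption): absolute difference in string length."""
    return abs(len(a) - len(b))


# Hypothetical author-mention records.
cluster = [
    {"text": "A. McCallum", "year": 2004},
    {"text": "Andrew McCallum", "year": 2004},
    {"text": "McCallum, A.", "year": 2005},
]
feats = clusterwise_features(cluster, length_gap)
print(feats)
```

None of these values can be obtained from a single pair of nodes, which is what distinguishes them from the pairwise features.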

Probabilistic Interpretation of Pairwise Metric Learning
[Figure: nodes x_1, x_2, x_3 with pairwise variables y_12, y_13, y_23 and pairwise factors g_12, g_13, g_23]

Probabilistic Interpretation of Clusterwise Metric Learning
[Figure: nodes x_1, x_2, x_3 with pairwise variables y_12, y_13, y_23 and factors g_12, g_13, g_23, plus a clusterwise variable y_123 and factor g_123]
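One way to write down what the two probabilistic-interpretation slides suggest for three nodes. This is a sketch under assumptions the slides do not spell out: each factor g is a non-negative potential, Z and Z' are normalizers, and y_123 is read as a binary variable that is 1 when x_1, x_2, x_3 all belong to the same cluster.

```latex
% Pairwise model: one factor per pair of nodes.
P(\mathbf{y} \mid \mathbf{x})
  = \frac{1}{Z(\mathbf{x})} \prod_{i<j} g_{ij}(x_i, x_j, y_{ij})

% Clusterwise model: an additional factor over the whole candidate cluster.
P(\mathbf{y} \mid \mathbf{x})
  = \frac{1}{Z'(\mathbf{x})}
    \Big[ \prod_{i<j} g_{ij}(x_i, x_j, y_{ij}) \Big]
    \, g_{123}(x_1, x_2, x_3, y_{123})
```

Under this reading, the clusterwise factor g_123 is where first-order features of the whole cluster would enter the model.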

Empirical Results
Citation matching
–paper deduplication
–author deduplication/disambiguation
Proper noun coreference
Modest but consistent improvements over pairwise metric (10-30% error reduction)

Implications of Clusterwise Metric
[Figure: nodes x_1, x_2, x_3 labeled "Locally compatible" and "Globally incompatible"; numeric labels: -122]
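A small sketch of how a cluster can be locally compatible yet globally incompatible, using the pronoun constraint from the drawbacks slide. The mention dictionaries, their keys, and the function names are hypothetical.

```python
from itertools import combinations


def pairwise_compatible(a, b):
    """Local check: two mentions agree in number and gender."""
    return a["number"] == b["number"] and a["gender"] == b["gender"]


def locally_compatible(cluster):
    """True when every pair in the cluster passes the local check."""
    return all(pairwise_compatible(a, b) for a, b in combinations(cluster, 2))


def globally_compatible(cluster):
    """Cluster-wide check: at least one mention must be a non-pronoun."""
    return any(not m["is_pronoun"] for m in cluster)


def mention(text, is_pronoun):
    """Build a singular masculine mention (illustrative defaults)."""
    return {"text": text, "is_pronoun": is_pronoun, "number": "sg", "gender": "m"}


pronouns = [mention("he", True), mention("him", True), mention("his", True)]
with_name = pronouns + [mention("Jon", False)]

print(locally_compatible(pronouns), globally_compatible(pronouns))
print(locally_compatible(with_name), globally_compatible(with_name))
```

Every pair of pronouns agrees, so a purely pairwise metric would merge them, while the cluster-wide check rejects the all-pronoun cluster until a non-pronoun joins it.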

Open Questions
–What is the geometric interpretation of a clusterwise metric?
–What are the implications of clusterwise metrics for common clustering methods?
–What is the kernel interpretation of a clusterwise metric?

References
N. Bansal et al. Correlation Clustering. FOCS 02.
Yuri Boykov et al. Fast Approximate Energy Minimization via Graph Cuts. ICCV.
A. Culotta and A. McCallum. Practical Markov logic containing first-order quantifiers with application to identity uncertainty. Technical Report IR-430, University of Massachusetts, September.
A. McCallum and B. Wellner. Conditional models of identity uncertainty with applications to proper noun coreference. NIPS 2004.
B. Milch et al. BLOG: Relational modeling with unknown objects. Statistical Relational Learning Workshop, ICML 2004.