Unsupervised Relation Detection using Automatic Alignment of Query Patterns extracted from Knowledge Graphs and Query Click Logs
Panupong Pasupat (Stanford University) and Dilek Hakkani-Tür (Microsoft Research)
Spoken Language Understanding (SLU)
Input: Transcribed query (e.g., “Who played Jake Sully in Avatar”)
Output: Semantic information (e.g., dialog acts, slot values, relations)
[Pipeline: Speech Recognition → Spoken Language Understanding → Dialog Management → Natural Language Generation → Speech Synthesis]
Knowledge Graph Relations
A knowledge graph contains entities and relations.
[Figure: a KG fragment around Avatar — genre → Action, Sci-fi; directed by → James Cameron; starring.actor → Sam Worthington; starring.character → Jake Sully; initial release date edge]
Knowledge Graph Relations
A knowledge graph contains entities and relations. Determining the correct KG relations is an important step toward finding the correct response to a query.
[Figure: for the query “Who played Jake Sully in Avatar”, the answer is the unknown entity linked to Avatar via starring.actor and to Jake Sully via starring.character]
Task: Relation Detection
Inputs:
◦ Natural language query (e.g., “Who played Jake Sully in Avatar”)
◦ KG relations of interest
Output:
◦ List of all KG relations expressed in the query (e.g., acted by, movie character, character name, movie name)
Types of Relations
Explicit relations: the value of the relation appears in the query; very similar to semantic slots. Example: in “Who played [Jake Sully] in Avatar”, the span “Jake Sully” carries the relation character name.
Implicit relations: the value of the relation is not explicitly stated. Example: “Who played Jake Sully in Avatar” expresses the relation movie actor, whose value (the actor) does not appear in the query.
Approach 1.Mine queries related to the entities of interest 2.Infer explicit and implicit relations in the mined queries 3.Use the annotated queries to train a classifier
Mining Entities
Given a domain of interest (e.g., movie), we mine relevant entities from KGs. Start with entities of the central type (e.g., movie), such as Avatar.
Mining Entities
Given a domain of interest (e.g., movie), we mine relevant entities from KGs. Traverse edges in the KG to get related entities.
[Figure: from Avatar — genre → Action, Sci-fi; directed by → James Cameron; initial release date → 2009; starring.actor → Sam Worthington; starring.character → Jake Sully]
(All entities shown here, including Avatar itself via the identity relation, are valid entities.)
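The entity-mining step can be sketched as a one-hop traversal over a KG stored as an adjacency list. This is an illustrative sketch only; the graph data and relation names below are toy examples, not the actual KG.

```python
# Illustrative sketch (not the paper's implementation): mine related
# entities by taking the central entity plus everything one KG edge away.
# The toy graph maps each subject to a list of (relation, object) edges.

def related_entities(kg, central):
    """Return {entity: relation-from-central}, including central itself."""
    related = {central: "(identity)"}       # the central entity is valid too
    for relation, obj in kg.get(central, []):
        related[obj] = relation
    return related

toy_kg = {
    "Avatar": [
        ("genre", "Action"),
        ("genre", "Sci-fi"),
        ("directed by", "James Cameron"),
        ("initial release date", "2009"),
        ("starring.actor", "Sam Worthington"),
        ("starring.character", "Jake Sully"),
    ]
}

entities = related_entities(toy_kg, "Avatar")
```

Each mined entity keeps the relation that connected it to the central entity, which is exactly the information the later relation-inference steps need.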
Mining Queries
After we get an entity of interest (e.g., James Cameron, reached from Avatar via directed by), we mine queries that are related to that entity.
Query Click Log (QCL)
Our queries come from query click logs (QCLs). A query click log is a weighted graph between search queries and the URLs that search engine users click on.
[Figure: queries such as “james cameron movies”, “cameron 2009 movie”, and “avatar” linked to clicked URLs such as en.wikipedia.org/wiki/Avatar_(2009_film)]
Mining Queries
Method 1: Construct seed queries by applying templates to the entity (e.g., “james cameron films”), and then traverse the QCL twice to reach related queries such as “action movies by james cameron”.
Does not perform as well as expected due to lexical ambiguities (e.g., the comic character Flash vs. “flash movie”).
Mining Queries
Method 2: Get the entity's URLs (e.g., en.wikipedia.org/wiki/James_Cameron) from the KG, and then traverse the QCL once to reach queries such as “action movies by james cameron”.
Gives better queries in general, but cannot be applied to some entity types (e.g., dates like 2009). We use this method in the experiments.
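Method 2 can be sketched as a single lookup in a weighted bipartite click graph. The QCL structure, URLs, and click counts below are hypothetical toy data.

```python
# Illustrative sketch (toy data): mine queries for an entity by taking one
# step in the query click log (QCL) from the entity's KG-provided URLs to
# the queries whose clicks landed on those URLs.

def mine_queries(entity_urls, qcl, min_clicks=1):
    """qcl: {url: [(query, click_count), ...]}. Return the mined queries."""
    mined = set()
    for url in entity_urls:
        for query, clicks in qcl.get(url, []):
            if clicks >= min_clicks:        # optionally prune rare clicks
                mined.add(query)
    return mined

toy_qcl = {
    "en.wikipedia.org/wiki/James_Cameron": [
        ("james cameron films", 30),
        ("action movies by james cameron", 12),
    ],
    "en.wikipedia.org/wiki/Avatar_(2009_film)": [
        ("avatar 2009 movie", 25),
    ],
}

queries = mine_queries(["en.wikipedia.org/wiki/James_Cameron"], toy_qcl)
```

Because the lookup starts from URLs rather than from textually ambiguous seed queries, it avoids the lexical-ambiguity problem of Method 1, at the cost of requiring the entity to have URLs in the KG.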
Approach 1.Mine queries related to the entities of interest 2.Infer explicit and implicit relations in the mined queries 3.Use the annotated queries to train a classifier
Inferring Explicit Relations
Example: the query “Who played Jake Sully in Avatar” was mined from the QCL for the entity Avatar (via the identity relation).
Inferring Explicit Relations
Idea: If a query is mined from an entity e, it should explicitly contain either some other entities related to e, or e itself.
Example: for e = Avatar, the query “Who played Jake Sully in Avatar” contains “Jake Sully”, related to e via starring.character → explicit relation character name.
Inferring Explicit Relations
Idea: If a query is mined from an entity e, it should explicitly contain either some other entities related to e, or e itself.
Example: the query “Who played Jake Sully in Avatar” also contains e itself (Avatar, via the identity relation) → explicit relation movie name.
Inferring Explicit Relations
Bonus: By inferring all explicit relations, we get an automatic slot annotation: “Who played [Jake Sully](character name) in [Avatar](movie name)”.
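A minimal sketch of the explicit-relation idea: match the surface forms of e and its related entities inside the mined query. The entity set and relation names below are illustrative, not the paper's data.

```python
# Illustrative sketch (toy data): a query mined from entity e is annotated
# with an explicit relation wherever e itself, or an entity related to e,
# literally appears in the query text.

def infer_explicit_relations(query, related):
    """related: {surface form: relation name}. Return matched annotations."""
    q = query.lower()
    return [(surface, relation)
            for surface, relation in related.items()
            if surface.lower() in q]

# Entities related to e = "Avatar" (relation names are illustrative)
related = {
    "Avatar": "movie name",            # e itself, via the identity relation
    "Jake Sully": "character name",
    "James Cameron": "director name",
}

annotations = infer_explicit_relations("Who played Jake Sully in Avatar", related)
```

The matched (span, relation) pairs double as the automatic slot annotation mentioned on the slide.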
Inferring Implicit Relations
Sometimes the entity e is absent from the query. Example: e = James Cameron (related to Avatar via directed by) does not appear in the query “Who directed the movie Avatar”, which was mined from the QCL for e.
Inferring Implicit Relations
Idea: If the entity e is absent from the query, then we infer that e is the object of an implicit relation. Example: for e = James Cameron and the query “Who directed the movie Avatar”, we infer the implicit relation directed by.
Inferring Implicit Relations
Bonus: By collapsing entities related to e into placeholders, we get generic patterns for implicit relations. Example: “Who directed the movie Avatar” → “Who directed the movie [film]” for the relation directed by.
Inferring Implicit Relations
Bonus: By collapsing entities related to e into placeholders, we get generic patterns for implicit relations.
Example frequent patterns:
directed by: director of [film] · who directed [film] · [film] the movie · [film] director
acted by: [profession] in [film] · [character] from [film] · who played [character] · cast of [film]
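The implicit-relation step can be sketched as: if e is absent, label the query with the relation that linked e to the central entity, and collapse the entities that do appear into typed placeholders. Data and type names are toy examples.

```python
# Illustrative sketch (toy data): when the mined entity e does not appear
# in the query, infer that e is the object of an implicit relation, and
# replace the entities that DO appear with typed placeholders to obtain
# a generic pattern for that relation.

def implicit_pattern(query, e, relation, appearing_types):
    """appearing_types: {surface form: entity type}.
    Return (pattern, relation), or None if e is explicit in the query."""
    if e.lower() in query.lower():
        return None                      # e is explicit, handled elsewhere
    pattern = query
    for surface, etype in appearing_types.items():
        pattern = pattern.replace(surface, f"[{etype}]")
    return pattern, relation

result = implicit_pattern(
    "Who directed the movie Avatar",
    e="James Cameron",                   # mined from Avatar via "directed by"
    relation="directed by",
    appearing_types={"Avatar": "film"},
)
```

Aggregating these (pattern, relation) pairs over many mined queries yields frequent patterns like the ones in the table above.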
Approach
1. Mine queries related to the entities of interest
2. Infer explicit and implicit relations in the mined queries (produces 2 datasets: D_E for inferred explicit relations and D_I for inferred implicit relations)
3. Use the annotated queries to train a classifier
Approach
1. Mine queries related to the entities of interest
2. Infer explicit and implicit relations in the mined queries (produces 2 datasets: D_E for inferred explicit relations and D_I for inferred implicit relations)
3. Use the annotated queries to train a classifier
   i. Train an implicit relation classifier on D_I
   ii. Apply the implicit relation classifier to the queries in D_E and add the predicted implicit relations to D_E
   iii. Train a final classifier on the augmented D_E
(Classifiers are multiclass multilabel linear classifiers trained using AdaBoost on decision tree stumps.)
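The three-stage training can be sketched as follows. This is a toy illustration: scikit-learn's one-vs-rest logistic regression stands in for the paper's AdaBoost-on-stumps learner (which scikit-learn does not provide in multilabel form out of the box), and all queries and labels are invented.

```python
# Illustrative sketch (toy data; LogisticRegression stands in for the
# paper's AdaBoost-on-stumps learner): train an implicit-relation
# classifier on D_I, use its predictions to augment D_E, then train
# the final multilabel classifier on the augmented D_E.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer

def train_multilabel(queries, label_sets):
    mlb = MultiLabelBinarizer()
    clf = make_pipeline(
        CountVectorizer(ngram_range=(1, 2)),            # n-gram features
        OneVsRestClassifier(LogisticRegression(max_iter=1000)),
    )
    clf.fit(queries, mlb.fit_transform(label_sets))
    return clf, mlb

def predict_labels(clf, mlb, queries):
    return [set(t) for t in mlb.inverse_transform(clf.predict(queries))]

# i. Implicit-relation classifier trained on D_I (toy patterns)
di_queries = ["who directed [film]", "[film] director",
              "cast of [film]", "who played [character] in [film]"]
di_labels = [{"directed by"}, {"directed by"}, {"acted by"}, {"acted by"}]
implicit_clf, implicit_mlb = train_multilabel(di_queries, di_labels)

# ii. Augment D_E with the predicted implicit relations
de_queries = ["who directed the movie avatar"]
de_labels = [{"movie name"}]                            # inferred explicit labels
predicted = predict_labels(implicit_clf, implicit_mlb, de_queries)
augmented = [exp | imp for exp, imp in zip(de_labels, predicted)]

# iii. Final classifier trained on the augmented D_E
final_clf, final_mlb = train_multilabel(de_queries, augmented)
```

The augmentation in step ii is a simple set union per query, so the final classifier sees both the inferred explicit labels and the predicted implicit ones.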
Experiments
Dataset:
◦ Movie domain relation dataset (Hakkani-Tür et al., 2014)
◦ 3338 training / 1084 test queries
◦ Features: n-grams + weighted gazetteers
Main Results
Classifier — Micro F1
Majority: 27.6
Chen et al., SLT 2014 (also unsupervised): 43.3
Mine queries with URLs from KG:
◦ trained on D_E only: 42.7
◦ trained on D_I only: 29.3
◦ final classifier: 55.5
Both datasets (D_E and D_I) help boost the performance of the final classifier.
Main Results
Classifier — Micro F1
Majority: 27.6
Chen et al., SLT 2014 (also unsupervised): 43.3
Mine queries with URLs from KG:
◦ trained on D_E only: 42.7
◦ trained on D_I only: 29.3
◦ final classifier: 55.5
supervised: 86.0
semi-supervised (self-training): 86.5
Self-training with the bootstrapped classifier also improves over the fully supervised model.
Conclusion
We have presented techniques for:
1. Mining queries related to the domain of interest.
2. Inferring explicit and implicit relations in the mined queries.
3. Training a classifier to detect both types of relations without any hand-labeled data.
As by-products, we also get automatic slot annotations and implicit relation patterns.
Thank you!