Enriching Structured Knowledge with Open Information


Outline
- Introduction
- Related Work
- Overview
- Clustering & Mapping
- Experiments

Introduction
State-of-the-art IE systems
- Open IE (OIE): NELL, REVERB, OLLIE, …
- Knowledge bases (KB): YAGO, DBPEDIA, FREEBASE, …
Differences
- OIE: no fixed schema, unstructured
- KB: assertions, URIs, ontology
Focus
- Mapping OIE facts onto the KB ontology (OIE -> ontology)
- Results: precise and unambiguous assertions

Introduction
Scenario
- Domain: unlimited
- Applicable to natural-language (NL) facts of the form rel(s, o)
Example
- REVERB fact (input): is a town in (Croydon, London)
- Target KB: DBPEDIA
- Mappings
  - is a town in -> dbo:county
  - Croydon -> db:Croydon
  - London -> db:London
- Assertion (output): dbo:county(db:Croydon, db:London)
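The example on this slide can be traced in a few lines of code. The sketch below is only an illustration: the two lookup tables are hypothetical and hard-code the slide's mappings, whereas the actual system derives them with the modules described later.

```python
from typing import NamedTuple, Optional


class Fact(NamedTuple):
    rel: str
    subj: str
    obj: str


# Hypothetical mapping tables holding only the slide's example (assumption).
PROPERTY_MAP = {"is a town in": "dbo:county"}
INSTANCE_MAP = {"Croydon": "db:Croydon", "London": "db:London"}


def to_kb_assertion(fact: Fact) -> Optional[Fact]:
    """Translate an OIE fact into KB vocabulary, or None if a term is unmapped."""
    try:
        return Fact(
            PROPERTY_MAP[fact.rel],
            INSTANCE_MAP[fact.subj],
            INSTANCE_MAP[fact.obj],
        )
    except KeyError:
        return None


def render(fact: Fact) -> str:
    """Format a fact in the rel(s, o) notation used on the slides."""
    return f"{fact.rel}({fact.subj}, {fact.obj})"


assertion = to_kb_assertion(Fact("is a town in", "Croydon", "London"))
print(render(assertion))
```

A fact whose relation phrase or terms have no mapping yields `None` here, which is one simple way to keep unmapped facts out of the KB.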

Introduction
Problems
- Polysemy and ambiguity
- Multiple references
Strategy
- Relation-phrase clustering
Contributions
- Modularised mapping workflow: OIE -> KB
- Markov clustering
- Feedback: improves the overall quality of the results

Related Work
- Instance matching
- Knowledge base construction and debugging
- Distant-supervision-based approaches
- Semantifying open information

Overview Framework

Overview
Modules
- Instance matching (IM)
- Look up (LU)
- Clustering (CL)
- Property mapping (PM)

Module Description: IM
- Input: OIE facts
- Output: mapping of subject and object terms to DBpedia entities
- Working mechanism (disambiguation)
  - OIE instances: possible referred entities in the KB (DBpedia)
  - Candidate matching: probabilistic ranking based on KB relation patterns (domain/range)
  - Filtering (MAP state)

Module Description: LU
- Function: search for facts in the target KB
- Input: set of instance mappings from IM
- Process
  - For an OIE fact f with subject x and object y, search for KB assertions relating x and y
  - Judge
    - f+: facts with KB assertions (used as mapping evidence by PM)
    - f-: facts without KB assertions (translated to the DBpedia vocabulary to extend the KB)
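The f+/f- split can be sketched as a lookup over subject-object pairs. This is a minimal sketch under assumptions: the KB is modelled as an in-memory set of (property, subject, object) triples, and `look_up` and its argument names are illustrative, not taken from the slides.

```python
def look_up(facts, instance_map, kb):
    """Split OIE facts into f+ (pair is related in the KB) and f- (it is not).

    facts:        iterable of (rel_phrase, subject_term, object_term)
    instance_map: dict from OIE term to KB entity (the output of IM)
    kb:           set of (property, subject_entity, object_entity) triples
    """
    # Index the KB by (subject, object) so each fact needs one dictionary lookup.
    props_by_pair = {}
    for prop, s, o in kb:
        props_by_pair.setdefault((s, o), set()).add(prop)

    f_plus, f_minus = [], []
    for rel, subj, obj in facts:
        x, y = instance_map.get(subj), instance_map.get(obj)
        if x is None or y is None:
            continue  # no instance mapping: the fact cannot be looked up
        props = props_by_pair.get((x, y))
        if props:
            # Evidence for PM: the phrase co-occurs with these KB properties.
            f_plus.append((rel, (x, y), sorted(props)))
        else:
            # Candidate for KB extension once the phrase has a property mapping.
            f_minus.append((rel, (x, y)))
    return f_plus, f_minus
```

Facts whose terms have no instance mapping are skipped in this sketch, since they can be neither confirmed nor translated.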

Module Description: CL
- Input: OIE facts
- 3 different clusters
  - wf1: trivial, each relational phrase is a one-element cluster
  - wf2: non-trivial, without DBpedia seeds (properties)
  - wf3: non-trivial, with DBpedia seeds
- Output
  - wf1: clusters of similar relational phrases
  - wf2, wf3: clusters (forwarded to IM)

Module Description: PM
- Aim: map relation phrases (OIE properties) to object properties (KB properties)
- Mechanism
  - Association rule (frequent rule pattern) mining: rel -> (domain, range)
- Input: f+
  - Evidence for forming association rules
  - Evidence for possible mappings
- Output: set of property mappings
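One simple reading of the rule-mining step is to count how often each relation phrase co-occurs with a (domain, range) pair in the f+ evidence and keep the frequent patterns. The sketch below assumes that reading; the confidence measure, the `min_conf` threshold and the class names in the usage example are illustrative assumptions, not values from the slides.

```python
from collections import Counter, defaultdict


def mine_rules(evidence, min_conf=0.5):
    """Mine rules rel -> (domain, range) from a list of f+ evidence pairs.

    evidence: list of (rel_phrase, (domain, range)) observations
    Returns a dict from rel_phrase to [((domain, range), confidence), ...],
    where confidence = count(rel, target) / count(rel).
    """
    pair_counts = Counter(evidence)
    rel_counts = Counter(rel for rel, _ in evidence)

    rules = defaultdict(list)
    for (rel, target), n in pair_counts.items():
        confidence = n / rel_counts[rel]
        if confidence >= min_conf:
            rules[rel].append((target, confidence))

    # Strongest rule first for each relation phrase.
    for rel in rules:
        rules[rel].sort(key=lambda item: item[1], reverse=True)
    return dict(rules)


# Hypothetical evidence: three observations agree, one disagrees.
evidence = [("is a town in", ("dbo:Settlement", "dbo:Place"))] * 3
evidence += [("is a town in", ("dbo:Person", "dbo:Place"))]
print(mine_rules(evidence))
```

With this toy evidence the first pattern has confidence 3/4 and survives the threshold, while the second (1/4) is dropped.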

Clustering & Mapping
Similarity Metrics
- jac(): Jaccard similarity
- wn(): WordNet similarity
- β: weighting factor, β ∈ [0, 1]
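The slide names the two component scores and the weighting factor, but the combining formula did not survive in the transcript. The sketch below assumes the usual convex combination, sim = β · jac + (1 − β) · wn. The Jaccard score is computed over lower-cased tokens (also an assumption), and the WordNet score is passed in as a function so that the example has no external dependencies.

```python
from typing import Callable


def jac(phrase_a: str, phrase_b: str) -> float:
    """Jaccard similarity over the token sets of two relation phrases."""
    a, b = set(phrase_a.lower().split()), set(phrase_b.lower().split())
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def sim(
    phrase_a: str,
    phrase_b: str,
    wn: Callable[[str, str], float],
    beta: float = 0.5,
) -> float:
    """Assumed combination: beta * jac + (1 - beta) * wn, with beta in [0, 1]."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    return beta * jac(phrase_a, phrase_b) + (1.0 - beta) * wn(phrase_a, phrase_b)


def constant_wn(phrase_a: str, phrase_b: str) -> float:
    """Placeholder for a real WordNet-based score (hypothetical stand-in)."""
    return 1.0


print(sim("is a town in", "is a city in", wn=constant_wn, beta=0.5))
```

Setting β = 1 reduces the score to pure token overlap, and β = 0 relies on the WordNet score alone.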

Clustering & Mapping
Markov Clustering
- Node: rel-phrase
- Edge: affinity score
- Transition probability
- Mechanism
  - Random walk (Markov)
  - Iterate to a steady-state probability distribution
  - Strong links get stronger, weak links get weaker

Clustering & Mapping
Markov Clustering
Inflation
- Parameter: inflation factor, I
Choice of I
- If I is set too small the clusters are coarse, and vice versa
- Even with a reasonable I, some final clusters had sub-clusters
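The random-walk mechanism and the role of the inflation factor can be illustrated with a small pure-Python version of the standard Markov clustering loop (expansion followed by inflation). This is a generic textbook sketch, not the authors' implementation. The self-loops, the convergence tolerance and the 1e-3 read-out threshold are all assumptions.

```python
def normalize_columns(m):
    """Scale every column to sum to 1 (column-stochastic transition matrix)."""
    n = len(m)
    for j in range(n):
        total = sum(m[i][j] for i in range(n))
        for i in range(n):
            m[i][j] = m[i][j] / total if total else 0.0
    return m


def expand(m):
    """Expansion: one more random-walk step, i.e. the matrix product m * m."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, inflation):
    """Inflation: raise entries to the power I and renormalise the columns.
    Strong links get stronger, weak links get weaker."""
    return normalize_columns([[x ** inflation for x in row] for row in m])


def markov_cluster(affinity, inflation=2.0, max_iter=100, eps=1e-6):
    """Cluster nodes of a symmetric affinity matrix; returns sorted index lists."""
    n = len(affinity)
    # Self-loops (assumption) stabilise the walk.
    m = normalize_columns(
        [[float(affinity[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    )
    for _ in range(max_iter):
        nxt = inflate(expand(m), inflation)
        done = all(abs(nxt[i][j] - m[i][j]) < eps
                   for i in range(n) for j in range(n))
        m = nxt
        if done:
            break
    # Each row that retains mass is an attractor; its non-zero columns form a cluster.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-3)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Two tight groups of rel-phrases joined by one weak edge (toy affinities).
affinity = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0.1, 0],
    [0, 0, 0.1, 0, 1],
    [0, 0, 0, 1, 0],
]
print(markov_cluster(affinity, inflation=2.0))
```

Raising `inflation` sharpens the contrast between strong and weak links and so tends to produce finer clusters, which matches the slide's remark that a small I gives coarse clusters.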

Clustering & Mapping
PM Strategies
- wf1: each REVERB rel-phrase is a one-element cluster, mapped to a DBpedia property
- wf2: extension of wf1; clusters the REVERB rel-phrases
- wf3: adds DBpedia properties as seeds, clustered together with the REVERB rel-phrases using Markov clustering
Pairwise similarity

Experiments
Dataset
- REVERB ClueWeb extractions
- Filtering: confidence score >= 95%; facts with numeric expressions removed
- Result: 3.5 million triples with 474,325 rel-phrases
- 500 most frequent REVERB properties
- 100 most frequent DBpedia properties
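The filtering step can be written as a small predicate over extracted facts. This is only a sketch: it assumes that "numeric expression" means any term containing a digit, and `keep_fact` and its parameters are illustrative names, not part of REVERB.

```python
import re

_DIGIT = re.compile(r"\d")


def keep_fact(rel, subj, obj, confidence, min_conf=0.95):
    """Keep a fact only if it is confident enough and contains no digits.

    Treating any digit as a numeric expression is a simplifying assumption.
    """
    if confidence < min_conf:
        return False
    return not any(_DIGIT.search(term) for term in (rel, subj, obj))


facts = [
    ("is a town in", "Croydon", "London", 0.97),
    ("is a town in", "Croydon", "London", 0.90),   # below the threshold
    ("was born in", "Mozart", "1756", 0.99),       # numeric object
]
kept = [f for f in facts if keep_fact(*f)]
print(kept)
```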

Experiments-Evaluation
Metric
- S: cluster score
- comp(ci): intra-cluster sparseness
- iso(C): inter-cluster sparseness
- The higher comp(ci) and the lower iso(C), the better.

Experiments-Evaluation Analysis Control parameter

Experiments-Evaluation

THANKS!