Type-directed Topic Segmentation of Entity Descriptions


Type-directed Topic Segmentation of Entity Descriptions
Saisai Gong (龚赛赛), 2015-03-16

Contents
- Background and Motivation
- Related Work
- Method Framework

Background
- The descriptions of an entity in Linked Data usually cover more than one topic, e.g. the family and academic information of a researcher.
- Topic segmentation splits entity descriptions into a sequence of topically coherent segments.
- It is useful for many tasks, including entity browsing and entity summarization.

Background
- Manual segmentation costs people a large amount of time and effort.
- Several works leverage a variety of cues for automatic segmentation, e.g. property name relatedness, property value overlap, relatedness derived from property axioms, and distributional relatedness.

Motivation
- Existing works mainly follow one paradigm: first characterize the relatedness between or among descriptions using various measures, then derive segmentations with a clustering algorithm.
- Their performance relies heavily on the settings of the clustering algorithm and remains far from perfect.

Motivation
- Entity types contain important cues for segmentation, e.g. the types of Arnold Schwarzenegger: person, politician, artist, body builder, etc.
- We propose a new approach to topic segmentation for entity browsing that uses the entity types to guide segmentation:
  - Select a subset of entity types that is sensible and covers a high proportion of the descriptions, then allocate the entity descriptions to these types to form the initial segmentations.
  - Split the large segmentations among the initial ones.

Related work
- Manual segmentation
  - Haystack, Marble, Fresnel
  - Costs users a lot of time and effort
- Various property relatedness measures (cues to determine segmentation)
  - E.g. property name, WordNet-based, Wikipedia-based, search-engine-based, value overlap, property-axiom-based, and so on
- Lloyd Rutledge et al. [1] mainly use property value overlap and formal concept analysis to generate [...]
- FACES [2] uses topic segmentation to improve entity summarization

[1] Making RDF Presentable. WWW.
[2] FACES: Diversity-Aware Entity Summarization using Incremental Hierarchical Conceptual Clustering. AAAI.

Method Framework
- Entity description: a property together with its value set.
- Segmentation: a set of descriptions.
- Task: given the set of entity descriptions of an entity, produce a sequence of segmentations.
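The definitions on this slide can be pictured with a small data model. This is only a sketch: the class name `Description`, the helper `make_descriptions`, and the `ex:` identifiers are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Description:
    """One entity description: a property together with its set of values."""
    prop: str
    values: frozenset


def make_descriptions(pairs):
    """Group the (property, value) pairs of one entity into descriptions."""
    grouped = {}
    for prop, value in pairs:
        grouped.setdefault(prop, set()).add(value)
    return [Description(p, frozenset(vs)) for p, vs in grouped.items()]


# Hypothetical facts about one researcher (illustrative identifiers only).
pairs = [
    ("ex:spouse", "ex:Alice"),
    ("ex:child", "ex:Bob"),
    ("ex:child", "ex:Carol"),
    ("ex:affiliation", "ex:SomeUniversity"),
]
descriptions = make_descriptions(pairs)

# A segmentation is a set of descriptions; the result for an entity is an
# ordered sequence of such sets (here chosen by hand, one per topic).
family = frozenset(d for d in descriptions if d.prop in {"ex:spouse", "ex:child"})
academic = frozenset(d for d in descriptions if d.prop == "ex:affiliation")
segmentations = [family, academic]
```

Using frozen dataclasses keeps descriptions hashable, so a segmentation can literally be a set of descriptions as the slide defines it.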

Method Framework
- Cover: a type covers a property if the property's domain is a superclass of the type.
- Select a sensible subset of types with high coverage:
  - The size of the subset is limited to at most k.
  - A type covering a suitable number of properties is preferred, i.e. about 1/k of the properties.
  - A type deeper in the type hierarchy is preferred.
- Each property is allocated to a type that covers it.
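One way to read the cover relation and the size-limited selection is as a greedy maximum-coverage procedure. The sketch below is an assumption, not the method from the slides: the hierarchy `PARENT`, the property names, the single-parent simplification, and the greedy strategy are all hypothetical, and the preference for types covering about 1/k of the properties is not modelled.

```python
# Hypothetical type hierarchy (child -> parent), single inheritance assumed.
PARENT = {"Person": None, "Politician": "Person",
          "Artist": "Person", "Actor": "Artist"}


def ancestors(t):
    """Return t followed by all of its superclasses, nearest first."""
    chain = []
    while t is not None:
        chain.append(t)
        t = PARENT[t]
    return chain


def covers(t, domain):
    """A type covers a property if the property's domain is t or a superclass of t."""
    return domain in ancestors(t)


def depth(t):
    return len(ancestors(t)) - 1


def select_types(types, prop_domain, k):
    """Greedily pick at most k types; ties are broken in favour of deeper types."""
    uncovered = set(prop_domain)
    chosen = []
    while uncovered and len(chosen) < k:
        def gain(t):
            return sum(1 for p in uncovered if covers(t, prop_domain[p]))
        candidates = [t for t in types if t not in chosen]
        if not candidates:
            break
        best = max(candidates, key=lambda t: (gain(t), depth(t)))
        if gain(best) == 0:
            break
        chosen.append(best)
        uncovered -= {p for p in uncovered if covers(best, prop_domain[p])}
    return chosen


def allocate(chosen, prop_domain):
    """Assign each property to the first chosen type that covers it."""
    result = {t: [] for t in chosen}
    for p, domain in prop_domain.items():
        for t in chosen:
            if covers(t, domain):
                result[t].append(p)
                break
    return result


# Hypothetical properties with their domains.
prop_domain = {"birthDate": "Person", "party": "Politician",
               "genre": "Artist", "starring": "Actor"}
chosen = select_types(list(PARENT), prop_domain, k=2)
initial = allocate(chosen, prop_domain)
```

In this toy run the deepest type covers the most properties because it inherits the domains of all its superclasses, which is one reason the depth preference and the coverage preference tend to agree.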

Method Framework
- EBMC
  - Property weight: 1
  - Grade of membership: based on the distance between the property's domain and the type
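The slide does not give the formula for the grade of membership. As a hedged illustration only, the sketch below assumes the grade decays as 1 / (1 + d), where d is the number of hierarchy steps from the type up to the property's domain, and scores a set of selected types by the weighted sum of each property's best grade. The hierarchy, the decay function, and the function names are all assumptions.

```python
# Hypothetical type hierarchy (child -> parent), single inheritance assumed.
PARENT = {"Person": None, "Politician": "Person",
          "Artist": "Person", "Actor": "Artist"}


def ancestors(t):
    """Return t followed by all of its superclasses, nearest first."""
    chain = []
    while t is not None:
        chain.append(t)
        t = PARENT[t]
    return chain


def membership(t, domain):
    """Assumed grade: 1 / (1 + steps from type t up to the property's domain).

    Returns 0.0 when t does not cover the property at all.
    """
    chain = ancestors(t)
    if domain not in chain:
        return 0.0
    return 1.0 / (1 + chain.index(domain))


def coverage_score(selected, prop_domain, weight=1.0):
    """Sum, over all properties, of weight times the best grade among selected types."""
    return sum(
        weight * max((membership(t, d) for t in selected), default=0.0)
        for d in prop_domain.values()
    )


prop_domain = {"birthDate": "Person", "party": "Politician",
               "genre": "Artist", "starring": "Actor"}
score = coverage_score(["Actor", "Politician"], prop_domain)
```

With every property weight set to 1, as on the slide, the score only depends on how close each property's domain is to the nearest selected type.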

Method Framework
- Split large initial segmentations using a linear combination of measures and clustering.
  - Measures: property name I-Sub, WordNet relatedness, property value overlap, distributional relatedness
  - Clustering algorithm: DBSCAN
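The splitting step can be sketched as follows, under stated assumptions: only two of the four measures are combined (a precomputed name similarity standing in for I-Sub, and Jaccard value overlap), the weights, `eps`, and `min_pts` are arbitrary, and DBSCAN is written out in plain Python so the example has no dependencies. All property names and numbers are hypothetical.

```python
def jaccard(a, b):
    """Value overlap between two value sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def combine(scores, weights):
    """Linear combination of per-measure relatedness scores in [0, 1]."""
    return sum(weights[name] * scores[name] for name in weights)


def dbscan(n, dist, eps, min_pts):
    """Minimal DBSCAN over items 0..n-1; returns one label per item, -1 is noise."""
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        neighbours = [j for j in range(n) if dist(i, j) <= eps]
        if len(neighbours) < min_pts:
            labels[i] = -1  # provisional noise, may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = [m for m in range(n) if dist(j, m) <= eps]
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)
    return labels


# Hypothetical properties of one large initial segmentation.
PROPS = ["coauthor", "collaborator", "spouse", "child"]
VALUES = {"coauthor": {"a", "b"}, "collaborator": {"a", "b", "c"},
          "spouse": {"x"}, "child": {"y"}}
NAME_SIM = {frozenset({"coauthor", "collaborator"}): 0.8,
            frozenset({"spouse", "child"}): 0.6}
WEIGHTS = {"name": 0.5, "overlap": 0.5}


def dist(i, j):
    """Distance = 1 - combined relatedness; an item is at distance 0 from itself."""
    if i == j:
        return 0.0
    p, q = PROPS[i], PROPS[j]
    scores = {"name": NAME_SIM.get(frozenset({p, q}), 0.1),
              "overlap": jaccard(VALUES[p], VALUES[q])}
    return 1.0 - combine(scores, WEIGHTS)


labels = dbscan(len(PROPS), dist, eps=0.75, min_pts=2)
```

Because DBSCAN does not need the number of clusters in advance, the number of sub-segmentations falls out of `eps` and `min_pts`, which is also why results are sensitive to those two settings.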