TRank: Ranking Entity Types Using the Web of Data

Similar presentations
CONTRIBUTIONS Ground-truth dataset Simulated search tasks environment Multiple everyday applications (MS Word, MS PowerPoint, Mozilla Browser) Implicit.

Which Feature Location Technique is Better? Emily Hill, Alberto Bacchelli, Dave Binkley, Bogdan Dit, Dawn Lawrie, Rocco Oliveto.
Evaluating Novelty and Diversity Charles Clarke School of Computer Science University of Waterloo two talks in one!
Introduction to Information Retrieval
Super Awesome Presentation Dandre Allison Devin Adair.
DQR: A Probabilistic Approach to Diversified Query Recommendation Date: 2013/05/20 Author: Ruirui Li, Ben Kao, Bin Bi, Reynold Cheng, Eric Lo Source:
Collaborative QoS Prediction in Cloud Computing Department of Computer Science & Engineering The Chinese University of Hong Kong Hong Kong, China Rocky.
Exploring Reduction for Long Web Queries Niranjan Balasubramanian, Giridhar Kumaran, Vitor R. Carvalho Speaker: Razvan Belet 1.
GENERATING AUTOMATIC SEMANTIC ANNOTATIONS FOR RESEARCH DATASETS AYUSH SINGHAL AND JAIDEEP SRIVASTAVA CS DEPT., UNIVERSITY OF MINNESOTA, MN, USA.
Sensemaking and Ground Truth Ontology Development Chinua Umoja William M. Pottenger Jason Perry Christopher Janneck.
Digital Library Service Integration (DLSI) --> Looking for Collections and Services to be DLSI Testbeds
J. Chen, O. R. Zaiane and R. Goebel An Unsupervised Approach to Cluster Web Search Results based on Word Sense Communities.
Affinity Rank Yi Liu, Benyu Zhang, Zheng Chen MSRA.
Undue Influence: Eliminating the Impact of Link Plagiarism on Web Search Rankings Baoning Wu and Brian D. Davison Lehigh University Symposium on Applied.
SEEKING STATEMENT-SUPPORTING TOP-K WITNESSES Date: 2012/03/12 Source: Steffen Metzger (CIKM’11) Speaker: Er-gang Liu Advisor: Dr. Jia-ling Koh 1.
Learning to Classify Short and Sparse Text & Web with Hidden Topics from Large-scale Data Collections Xuan-Hieu Phan, Le-Minh Nguyen, Susumu Horiguchi GSIS,
Improving Web Search Ranking by Incorporating User Behavior Information Eugene Agichtein Eric Brill Susan Dumais Microsoft Research.
A Comparison of Statistical Significance Tests for Information Retrieval Evaluation CIKM'07, November 2007.
Querying Structured Text in an XML Database By Xuemei Luo.
Clustering Top-Ranking Sentences for Information Access Anastasios Tombros, Joemon Jose, Ian Ruthven University of Glasgow & University of Strathclyde.
Personalized Search Xiao Liu
Recommending Twitter Users to Follow Using Content and Collaborative Filtering Approaches John Hannon, Mike Bennett, Barry Smyth.
Exploiting Context Analysis for Combining Multiple Entity Resolution Systems -Ramu Bandaru Zhaoqi Chen, Dmitri V. Kalashnikov, Sharad Mehrotra.
Contextual Ranking of Keywords Using Click Data Utku Irmak, Vadim von Brzeski, Reiner Kraft Yahoo! Inc. ICDE '09 Data Mining session. Summarized.
 Examine two basic sources for implicit relevance feedback on the segment level for search personalization. Eye tracking Display time.
Overview of the INFILE track at CLEF 2009 (multilingual INformation FILtering Evaluation). CEA LIST, ELDA, Univ. Lille 3 - Geriico.
Qi Guo Emory University Ryen White, Susan Dumais, Jue Wang, Blake Anderson Microsoft Presented by Tetsuya Sakai, Microsoft Research.
Ranking Clusters for Web Search Gianluca Demartini, Paul-Alexandru Chirita, Ingo Brunkhorst, Wolfgang Nejdl. L3S Info Lunch, Hannover.
Chapter 8 Evaluating Search Engines. Evaluation is key to building effective and efficient search engines; measurement usually carried out.
Computing Scientometrics in Large-Scale Academic Search Engines with MapReduce Leonidas Akritidis Panayiotis Bozanis Department of Computer & Communication.
Evgeniy Gabrilovich and Shaul Markovitch
Multi-Abstraction Concern Localization Tien-Duy B. Le, Shaowei Wang, and David Lo School of Information Systems Singapore Management University 1.
 Used MapReduce algorithms to process a corpus of web pages and develop required index files  Inverted Index evaluated using TREC measures  Used Hadoop.
Ranking Definitions with Supervised Learning Methods J.Xu, Y.Cao, H.Li and M.Zhao WWW 2005 Presenter: Baoning Wu.
COLLABORATIVE SEARCH TECHNIQUES Submitted By: Shikha Singla MIT-872-2K11 M.Tech (2nd Sem) Information Technology.
+ User-induced Links in Collaborative Tagging Systems Ching-man Au Yeung, Nicholas Gibbins, Nigel Shadbolt CIKM’09 Speaker: Nonhlanhla Shongwe 18 January.
Performance Measures. Why Conduct Performance Evaluation? Evaluation is the key to building effective & efficient IR (information retrieval) systems.
Advantages of Query Biased Summaries in Information Retrieval by A. Tombros and M. Sanderson Presenters: Omer Erdil Albayrak Bilge Koroglu.
UWMS Data Mining Workshop Content Analysis: Automated Summarizing Prof. Marti Hearst SIMS 202, Lecture 16.
Comparing Document Segmentation for Passage Retrieval in Question Answering Jorg Tiedemann University of Groningen presented by: Moy’awiah Al-Shannaq
Mining Dependency Relations for Query Expansion in Passage Retrieval Renxu Sun, Chai-Huat Ong, Tat-Seng Chua National University of Singapore SIGIR2006.
Answer Mining by Combining Extraction Techniques with Abductive Reasoning Sanda Harabagiu, Dan Moldovan, Christine Clark, Mitchell Bowden, John Williams.
Is Top-k Sufficient for Ranking? Yanyan Lan, Shuzi Niu, Jiafeng Guo, Xueqi Cheng Institute of Computing Technology, Chinese Academy of Sciences.
Ranking Categories for Faceted Search Gianluca Demartini. L3S Research Seminars, Hannover, 09 June 2006.
Predicting User Interests from Contextual Information R. W. White, P. Bailey, L. Chen Microsoft (SIGIR 2009) Presenter : Jae-won Lee.
Learning to Rank: From Pairwise Approach to Listwise Approach Authors: Zhe Cao, Tao Qin, Tie-Yan Liu, Ming-Feng Tsai, and Hang Li Presenter: Davidson Date:
Introduction to Information Retrieval Introduction to Information Retrieval Lecture 10 Evaluation.
Digital Enterprise Research Institute From Web 1.0 to Web 2.0.
An Adaptive User Profile for Filtering News Based on a User Interest Hierarchy Sarabdeep Singh, Michael Shepherd, Jack Duffy and Carolyn Watters Web Information.
WP4 Models and Contents Quality Assessment
A Collaborative Quality Ranking Framework for Cloud Components
Queensland University of Technology
Addresses ALL aspects of the task
Neighborhood - based Tag Prediction
An Empirical Study of Learning to Rank for Entity Search
Lecture 10 Evaluation.
Evaluation strategies and models
EDIUM: Improving Entity Disambiguation via User modelling
Liang Zheng, Yuzhong Qu Nanjing University, China
CS 456 Interactive Software.
Lecture 6 Evaluation.
A Framework for Benchmarking Entity-Annotation Systems
Evaluating Information Retrieval Systems
Measuring Complexity of Web Pages Using Gate
Summarization for entity annotation Contextual summary
deepschema.org: An Ontology for Typing Entities in the Web of Data
Feature Selection for Ranking
Cumulated Gain-Based Evaluation of IR Techniques
Micheal T. Adenibuyan, Oluwatoyin A. Enikuomehin and Benjamin S
Click Chain Model in Web Search
Presentation transcript:

TRank: Ranking Entity Types Using the Web of Data Alberto Tonon, Michele Catasta, Gianluca Demartini, Philippe Cudré-Mauroux, Karl Aberer

Motivation Entity-centric search: an entity's type information is important. Entities typically have more than one type, so we need to identify the most relevant ones.

Task Definition Task: entity type ranking. Given an entity e appearing in a document d, with candidate types Te = {t1, …, tn} collected via <rdf:type> and <owl:sameAs>, rank the types by their relevance to the textual context ce taken from d. Context sizes: three paragraphs, one paragraph, sentence, entity itself.
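A minimal sketch of the task as stated on this slide: given an entity's candidate types and a textual context, return the types ordered by relevance. The word-overlap scorer and all names below are illustrative assumptions, not one of the paper's actual features.

def rank_types(candidate_types, context):
    # Return candidate types sorted by a simple word-overlap score with the context
    context_words = set(context.lower().split())
    def score(type_uri):
        label_words = set(type_uri.replace("_", " ").lower().split())
        return len(label_words & context_words)
    return sorted(candidate_types, key=score, reverse=True)

# Toy usage: a sentence-sized context ce pushes the football-related type to the top.
types = ["Person", "Athlete", "Soccer_Player"]
print(rank_types(types, "The soccer player scored twice last night"))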

Approaches

Approaches Three families of type-ranking approaches: Entity-Centric (FREQ, WIKILINK, LABEL), Context-Aware (SAMETYPE, PATH), and Hierarchy-Based (DEPTH, ANCESTORS, ANC_DEPTH).
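To make the feature families concrete, here is an illustrative sketch (not the authors' code) of three of the signals named above, computed over a tiny hand-made type hierarchy; the hierarchy, the entity/type assignments, and the way the features are combined are assumptions.

PARENT = {  # hypothetical type hierarchy fragment (child -> parent)
    "SoccerPlayer": "Athlete",
    "Athlete": "Person",
    "Person": "Thing",
    "Politician": "Person",
}

def ancestors(t):
    # ANCESTORS-style feature: the super-types of t (their number can serve as a score)
    result = []
    while t in PARENT:
        t = PARENT[t]
        result.append(t)
    return result

def depth(t):
    # DEPTH-style feature: distance of t from the root, i.e. how specific the type is
    return len(ancestors(t))

def same_type(t, context_entity_types):
    # SAMETYPE-style feature: how many other entities in the context carry type t
    return sum(1 for types in context_entity_types if t in types)

# Toy usage: rank one entity's candidate types, preferring specific, context-supported ones.
candidates = ["Person", "Athlete", "SoccerPlayer"]
context_types = [{"SoccerPlayer", "Athlete"}, {"Politician"}]
print(sorted(candidates, key=lambda t: (depth(t), same_type(t, context_types)), reverse=True))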

Experiments Data: New York Times articles, Feb 21 - Mar 7, 2013; 128 articles, each with 12 entities and on average 10.2 types per entity. Ground truth built by crowdsourcing: workers mark which types are relevant, giving each type a relevance score.
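As a rough illustration of how crowdsourced judgments can be turned into the per-type relevance scores mentioned on this slide, the sketch below simply averages binary worker votes; the record layout and the averaging rule are assumptions, not the authors' exact procedure.

from collections import defaultdict

# judgments: (entity, type, worker_vote), vote 1 = relevant, 0 = not relevant
judgments = [
    ("Barack_Obama", "Politician", 1),
    ("Barack_Obama", "Politician", 1),
    ("Barack_Obama", "Author", 0),
    ("Barack_Obama", "Author", 1),
]

votes = defaultdict(list)
for entity, etype, vote in judgments:
    votes[(entity, etype)].append(vote)

relevance = {key: sum(v) / len(v) for key, v in votes.items()}
print(relevance)  # e.g. {('Barack_Obama', 'Politician'): 1.0, ('Barack_Obama', 'Author'): 0.5}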

Experiments Evaluation measures: MAP and NDCG (cumulated gain-based evaluation of IR techniques). 4 datasets, 770 distinct entities: Sentence: 419 samples, average 32 words, 2.45 entities; Paragraph: 339 samples, average 66 words, 2.72 entities; 3-Paragraph: 339 samples, average 165 words, 11.8 entities.
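For reference, a short sketch of the two measures named on this slide in their standard textbook form; the cutoff handling and the graded-gain encoding are assumptions.

import math

def average_precision(ranked, relevant):
    # AP for one query: mean precision@k over the ranks of relevant items; MAP averages AP over queries
    hits, score = 0, 0.0
    for k, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            score += hits / k
    return score / max(len(relevant), 1)

def ndcg(ranked, gains, k=None):
    # NDCG: discounted cumulated gain of the ranking divided by that of the ideal ordering
    k = k or len(ranked)
    dcg = sum(gains.get(item, 0) / math.log2(i + 1) for i, item in enumerate(ranked[:k], start=1))
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(i + 1) for i, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg > 0 else 0.0

print(average_precision(["Athlete", "Person", "Author"], {"Athlete", "Person"}))      # 1.0
print(ndcg(["Athlete", "Person", "Author"], {"Athlete": 3, "Person": 1}))             # 1.0, ranking is already ideal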

Results

Results Scalability: the TRank pipeline runs as MapReduce jobs over a 71TB web crawl on a cluster of 8 servers, each with 12 cores at 2.33GHz and 32GB of RAM.
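The scalability claim above refers to running the pipeline as MapReduce jobs; the toy, in-memory sketch below only illustrates that map -> shuffle -> reduce shape (here counting (entity, type) pairs, a FREQ-like signal). The documents, the type lookup, and the aggregation are assumptions, not the production Hadoop code.

from collections import defaultdict

ENTITY_TYPES = {"Barack_Obama": ["Politician", "Person"], "Lionel_Messi": ["SoccerPlayer", "Person"]}
documents = ["Barack_Obama met Lionel_Messi", "Lionel_Messi scored"]

def map_doc(doc):
    # Map: emit ((entity, type), 1) for every typed entity mentioned in the document
    for token in doc.split():
        for etype in ENTITY_TYPES.get(token, []):
            yield (token, etype), 1

shuffled = defaultdict(int)
for doc in documents:               # shuffle: group intermediate pairs by key
    for key, count in map_doc(doc):
        shuffled[key] += count      # reduce: sum the counts per (entity, type)

print(dict(shuffled))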

Conclusions Type hierarchy, regression model, interaction among entities, user impact.

Thanks!