Gong Cheng, Danyun Xu, Yuzhong Qu

Presentation transcript:

Summarizing Entity Descriptions for Effective and Efficient Human-centered Entity Linking. Gong Cheng, Danyun Xu, Yuzhong Qu

Motivation: Entity linking. Manual efforts: gold-standard links for evaluation; crowdsourcing.

Approaches: Characterizing power. Characterizing: a unique feature has high characterizing power. Diversity: information overlap (logical inference, string/numerical similarity).

Approaches: Characterizing power. Combinatorial optimization problem: binary quadratic knapsack problem (details).
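As a rough illustration of the binary quadratic knapsack formulation named on this slide, the sketch below picks the subset of features that maximizes total characterizing power minus pairwise overlap penalties, subject to a length budget. The profit, overlap, and weight values, the function name `solve_qkp`, and the exhaustive search are all assumptions made for illustration; the slides do not give the exact objective or the solver that was used.

```python
from itertools import combinations

def solve_qkp(profit, overlap, weight, capacity):
    """Exhaustively solve a tiny binary quadratic knapsack instance.

    profit[i]     -- hypothetical characterizing power of feature i
    overlap[i][j] -- hypothetical information overlap between features i and j
    weight[i]     -- hypothetical length of feature i
    capacity      -- hypothetical length budget of the summary
    """
    n = len(profit)
    best_value, best_subset = 0.0, ()
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            # Knapsack constraint: the summary must fit the budget.
            if sum(weight[i] for i in subset) > capacity:
                continue
            # Linear term: reward each selected feature.
            value = sum(profit[i] for i in subset)
            # Quadratic term: penalize each overlapping pair once.
            value -= sum(overlap[i][j] for i, j in combinations(subset, 2))
            if value > best_value:
                best_value, best_subset = value, subset
    return best_value, best_subset

# Toy instance: features 0 and 1 overlap heavily, feature 2 is independent.
profit = [5.0, 4.0, 3.0]
overlap = [[0, 3.5, 0], [3.5, 0, 0], [0, 0, 0]]
weight = [2, 2, 2]
best_value, best_subset = solve_qkp(profit, overlap, weight, capacity=4)
print(best_value, best_subset)
```

Exhaustive search is only practical for a handful of features; it is used here because it makes the objective and the constraint easy to read.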

Approaches: Differentiating power. Logical inference; string/numerical similarity.

Approaches: Differentiating power. Combinatorial optimization problem: binary quadratic multidimensional knapsack problem (details).
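For reference, a binary quadratic multidimensional knapsack problem is commonly written in the generic form below. The symbols are assumptions chosen for illustration and are not the notation of the slides: $x_i$ indicates whether item $i$ is selected, $p_i$ is its individual profit, $p_{ij}$ is the pairwise profit (possibly negative) of selecting $i$ and $j$ together, $w_{ki}$ is the weight of item $i$ in dimension $k$, and $c_k$ is the capacity of dimension $k$.

```latex
\begin{aligned}
\max_{x \in \{0,1\}^n} \quad & \sum_{i=1}^{n} p_i x_i + \sum_{i=1}^{n} \sum_{j=i+1}^{n} p_{ij} x_i x_j \\
\text{s.t.} \quad & \sum_{i=1}^{n} w_{ki} x_i \le c_k, \qquad k = 1, \dots, m
\end{aligned}
```

With $m = 1$ this reduces to the single-constraint binary quadratic knapsack problem.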

Approaches: Relevance. Class vector model: CF-IIF (class frequency-inverse instance frequency), cosine similarity. MMR.
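The sketch below shows one way the pieces named on this slide could fit together. It assumes that CF-IIF is defined by analogy with TF-IDF (class frequency times the log of total instances over instances of that class), that MMR refers to maximal marginal relevance, and that each feature and the mention context can be represented as a bag of classes. The function names, the `lam` trade-off parameter, and all data are hypothetical; the slides do not give the exact formulas.

```python
import math
from collections import Counter

def cf_iif_vector(classes, instance_count, total_instances):
    """Weight each class by class frequency times inverse instance frequency."""
    cf = Counter(classes)
    return {c: f * math.log(total_instances / instance_count[c])
            for c, f in cf.items()}

def cosine(u, v):
    """Cosine similarity between two sparse class vectors stored as dicts."""
    dot = sum(w * v.get(c, 0.0) for c, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def mmr_select(context_vec, feature_vecs, k, lam=0.5):
    """Greedily pick k features, trading relevance against redundancy."""
    selected = []
    remaining = list(feature_vecs)
    while remaining and len(selected) < k:
        def score(name):
            relevance = cosine(feature_vecs[name], context_vec)
            redundancy = max(
                (cosine(feature_vecs[name], feature_vecs[s]) for s in selected),
                default=0.0,
            )
            return lam * relevance - (1 - lam) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical knowledge-base statistics: number of instances per class.
instance_count = {"City": 100, "Person": 1000}
total_instances = 10000

# Hypothetical classes observed in the mention context and in each feature.
context_vec = cf_iif_vector(["City", "City", "Person"], instance_count, total_instances)
feature_vecs = {
    "birthPlace": cf_iif_vector(["City"], instance_count, total_instances),
    "hometown": cf_iif_vector(["City"], instance_count, total_instances),
    "spouse": cf_iif_vector(["Person"], instance_count, total_instances),
}
selected = mmr_select(context_vec, feature_vecs, k=2)
print(selected)
```

In this toy data the two City-valued features are equally relevant to the context, so once one of them is chosen the redundancy term pushes the second pick toward the Person-valued feature.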

Approaches: Combination. Binary quadratic multidimensional knapsack problem.

Experiments. Dataset: KB: DBpedia; text corpora: AQUAINT, IITB. Approaches: DESC, CHR, DFF, CNT, COMB, RELIN. Task: choose the correct entity from three candidate entities according to the entity mention and its context; rate and comment.

Evaluation: Extrinsic evaluation. 30 students, each completing a total of 72 tasks (36 tasks from each corpus).

Evaluation: Intrinsic evaluation. Dominating factors: characterizing power and information overlap.

Evaluation: Intrinsic evaluation. Comments (CHR, DFF, COMB): 53% IITB: highly distinguishing features; 50% IITB: different types helped them filter out noise entities easily; 60% AQUAINT: apart from different types, almost no useful information; 80% IITB: some highly distinguishing features; 90% AQUAINT; 53%: comprehensive information was provided.

Thanks Q&A