Summarization for Entity Annotation: Contextual Summary
徐丹云
2014.6.30
Outline
Part One: Introduction
Part Two: Background
Part Three: Problem
Part Four: Approaches
Part Five: Experiments
Introduction
Entity annotation (also called entity linking or entity disambiguation)
Three steps: identify mentions, rank candidates, disambiguate
Figure [2]
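The three-step pipeline (identify, rank, disambiguate) can be sketched as below. This is a minimal, self-contained illustration: the toy knowledge base, the mention detector, and the word-overlap scoring are assumptions for exposition, not the method of any cited system.

```python
import re

# Toy knowledge base: mention string -> {candidate entity: description}.
# Purely illustrative; real systems use Wikipedia/DBpedia/YAGO2.
TOY_KB = {
    "Washington": {
        "George_Washington": "first president of the united states",
        "Washington,_D.C.": "capital city of the united states",
        "Washington_(state)": "state in the pacific northwest of the united states",
    },
}

def identify(text):
    """Step 1 (identify): find mention strings known to the KB."""
    return [m for m in TOY_KB if m in text]

def rank(mention, context):
    """Step 2 (rank): order candidates by word overlap with the context."""
    ctx = set(re.findall(r"\w+", context.lower()))
    cands = TOY_KB[mention]
    return sorted(cands, key=lambda e: -len(set(cands[e].split()) & ctx))

def disambiguate(ranked):
    """Step 3 (disambiguate): commit to the top-ranked candidate."""
    return ranked[0] if ranked else None

def annotate(text):
    """Run all three steps: mention -> linked entity."""
    return {m: disambiguate(rank(m, text)) for m in identify(text)}
```

For example, `annotate("Washington is the capital city of the United States.")` links the mention to `Washington,_D.C.` because its description overlaps the context most.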
Background
System classification:
Manual annotation: New York Times
Semi-automatic annotation: ZenCrowd [5]
Automatic annotation: AIDA [7], DBpedia Spotlight [6], ...
Background
Approaches:
Machine learning: supervised, semi-supervised
Crowdsourcing
Motivation
Keep humans in the loop of entity annotation (crowdsourcing), e.g. [5]
Problem: entity descriptions are lengthy, so verification is inefficient
Goals: annotate more efficiently while guaranteeing accuracy
Problem
Automatically generate a summary (or summaries) for candidate entities
Input: an entity mention with its surrounding text, and the candidate entity (or entities)
Output: a summary of each candidate entity
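The input/output contract above can be sketched as a single function: given a mention, its surrounding text, and candidate entities with long descriptions, return a short summary per candidate. The overlap-based sentence selection here is an illustrative assumption, not the proposed method.

```python
import re

def summarize_candidates(mention, context, candidates, max_sentences=1):
    """Return a short summary for each candidate entity.

    candidates: dict mapping entity name -> full description text.
    Illustrative strategy: keep the description sentences that share
    the most words with the mention's surrounding text.
    """
    ctx = set(re.findall(r"\w+", context.lower()))
    summaries = {}
    for entity, description in candidates.items():
        sentences = re.split(r"(?<=[.!?])\s+", description.strip())
        # Rank sentences by word overlap with the context.
        ranked = sorted(
            sentences,
            key=lambda s: -len(set(re.findall(r"\w+", s.lower())) & ctx),
        )
        summaries[entity] = " ".join(ranked[:max_sentences])
    return summaries
```

A worker verifying the annotation then reads one or two context-relevant sentences instead of the entity's whole description page.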
Approaches
Model features:
Popularity of the entity
Similarity between the entity and the surrounding text
Entity type
Name string comparison
...
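The listed features could be computed and combined as sketched below. The Jaccard similarity, the difflib string ratio, and the linear weights are expository assumptions, not the model actually proposed.

```python
import difflib
import re

def popularity(entity, link_counts):
    """Prior from how often the entity is the link target in a corpus."""
    total = sum(link_counts.values())
    return link_counts.get(entity, 0) / total if total else 0.0

def context_similarity(description, context):
    """Jaccard word overlap between entity description and text."""
    a = set(re.findall(r"\w+", description.lower()))
    b = set(re.findall(r"\w+", context.lower()))
    return len(a & b) / len(a | b) if a | b else 0.0

def name_similarity(mention, entity_name):
    """Character-level comparison of the mention and the entity name."""
    clean = entity_name.replace("_", " ").lower()
    return difflib.SequenceMatcher(None, mention.lower(), clean).ratio()

def score(mention, context, entity, description, link_counts,
          weights=(0.3, 0.4, 0.3)):
    """Linear combination of the three features (weights are assumed)."""
    features = (popularity(entity, link_counts),
                context_similarity(description, context),
                name_similarity(mention, entity))
    return sum(w * f for w, f in zip(weights, features))
```

With equal popularity priors, the context-similarity term dominates, so a mention of "Washington" in a text about a capital city scores `Washington,_D.C.` above `George_Washington`.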
Experiments
Datasets (ground truth):
AQUAINT [8] (http://www.nzdl.org/wikification/docs.html): 50 newswire documents from the Xinhua News Service, the New York Times, and the Associated Press; an average of 9 links per document; linked to Wikipedia; both wrong and right links
IITB [9] (http://www.cse.iitb.ac.in/soumen/doc/CSAW/): 100 manually annotated texts (web pages about sport, entertainment, science and technology, and health); 19000 annotations over 3800 distinct Wikipedia entities
AIDA/CoNLL [2] (http://www.mpi-inf.mpg.de/departments/databases-and-information-systems/research/yago-naga/aida/downloads/): newswire articles annotated with YAGO2 entities
...
Knowledge base: DBpedia
Experiments
Evaluation:
Metrics: time, accuracy, running efficiency
Baseline: show the total description (the entity's corresponding web page), as in [5]
One summary vs. multiple summaries
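The evaluation loop could look like the sketch below: compare each system's annotations against the ground truth for accuracy, and time the annotator for running efficiency. The `annotator` callable is a stand-in for any system under test; its mention-to-entity dict interface is an assumption.

```python
import time

def evaluate(annotator, documents, ground_truth):
    """Measure annotation accuracy and wall-clock efficiency.

    documents: list of texts.
    ground_truth: per-document dict of mention -> gold entity.
    Returns (accuracy, seconds_per_document).
    """
    correct = total = 0
    start = time.perf_counter()
    for doc, gold in zip(documents, ground_truth):
        predicted = annotator(doc)  # mention -> predicted entity
        for mention, entity in gold.items():
            total += 1
            correct += predicted.get(mention) == entity
    seconds_per_doc = (time.perf_counter() - start) / len(documents)
    return (correct / total if total else 0.0), seconds_per_doc
```

The same harness can compare the summary-based interface against the total-description baseline by swapping in different `annotator` callables.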
References
[1] Cornolti M, Ferragina P, Ciaramita M. A framework for benchmarking entity-annotation systems[C]//Proceedings of the 22nd International Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 2013: 249-260.
[2] Hoffart J, Yosef M A, Bordino I, et al. Robust disambiguation of named entities in text[C]//Proceedings of the Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2011: 782-792.
[3] Uren V, Cimiano P, Iria J, et al. Semantic annotation for knowledge management: Requirements and a survey of the state of the art[J]. Web Semantics: Science, Services and Agents on the World Wide Web, 2006, 4(1): 14-28.
[4] Shen W, Wang J, Han J. Entity linking with a knowledge base: Issues, techniques, and solutions[J].
[5] Demartini G, Difallah D E, Cudré-Mauroux P. ZenCrowd: leveraging probabilistic reasoning and crowdsourcing techniques for large-scale entity linking[C]//Proceedings of the 21st International Conference on World Wide Web. ACM, 2012: 469-478.
[6] Mendes P N, Jakob M, García-Silva A, et al. DBpedia Spotlight: shedding light on the web of documents[C]//Proceedings of the 7th International Conference on Semantic Systems. ACM, 2011: 1-8.
[7] Yosef M A, Hoffart J, Bordino I, et al. AIDA: An online tool for accurate disambiguation of named entities in text and tables[J]. Proceedings of the VLDB Endowment, 2011, 4(12): 1450-1453.
[8] Milne D, Witten I H. Learning to link with Wikipedia[C]//Proceedings of the 17th ACM Conference on Information and Knowledge Management. ACM, 2008: 509-518.
[9] Kulkarni S, Singh A, Ramakrishnan G, et al. Collective annotation of Wikipedia entities in web text[C]//Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2009: 457-466.
Thank You @江湖人称大皇兄