Einat Minkov University of Haifa, Israel CL course, U

Slides:



Advertisements
Similar presentations
Research Internships Advanced Research and Modeling Research Group.
Advertisements

YAGO-NAGA Project Presented By: Mohammad Dwaikat To: Dr. Yuliya Lierler CSCI 8986 – Fall 2012.
Jing-Shin Chang National Chi Nan University, IJCNLP-2013, Nagoya 2013/10/15 ACLCLP – Activities ( ) & Text Corpora.
Social networks, in the form of bibliographies and citations, have long been an integral part of the scientific process. We examine how to leverage the.
Graph Data Management Lab, School of Computer Science Put conference information here.
Extracting Academic Affiliations Alicia Tribble Einat Minkov Andy Schlaikjer Laura Kieras.
Introduction to Computational Linguistics Lecture 2.
Learning Random Walk Models for Inducing Word Dependency Distributions Kristina Toutanova Christopher D. Manning Andrew Y. Ng.
Semantic Web and Web Mining: Networking with Industry and Academia İsmail Hakkı Toroslu IST EVENT 2006.
An Overview of Text Mining Rebecca Hwa 4/25/2002 References M. Hearst, “Untangling Text Data Mining,” in the Proceedings of the 37 th Annual Meeting of.
Article by: Feiyu Xu, Daniela Kurz, Jakub Piskorski, Sven Schmeier Article Summary by Mark Vickers.
Which Nobel laureate survived both world wars and all his four children? Tandem What‘s this? Question AnsweringPhoto AnnotationTimeline Analysis.
Enhance legal retrieval applications with an automatically induced knowledge base Ka Kan Lo.
Artificial Intelligence Research Centre Program Systems Institute Russian Academy of Science Pereslavl-Zalessky Russia.
Lecture 1, 7/21/2005Natural Language Processing1 CS60057 Speech &Natural Language Processing Autumn 2005 Lecture 1 21 July 2005.
Empirical Methods in Information Extraction Claire Cardie Appeared in AI Magazine, 18:4, Summarized by Seong-Bae Park.
CIG Conference Norwich September 2006 AUTINDEX 1 AUTINDEX: Automatic Indexing and Classification of Texts Catherine Pease & Paul Schmidt IAI, Saarbrücken.
Author: William Tunstall-Pedoe Presenter: Bahareh Sarrafzadeh CS 886 Spring 2015.
DBrev: Dreaming of a Database Revolution Gjergji Kasneci, Jurgen Van Gael, Thore Graepel Microsoft Research Cambridge, UK.
Natural Language Processing Guangyan Song. What is NLP  Natural Language processing (NLP) is a field of computer science and linguistics concerned with.
Ontology-Based Information Extraction: Current Approaches.
Flexible Text Mining using Interactive Information Extraction David Milward
EMNLP’01 19/11/2001 ML: Classical methods from AI –Decision-Tree induction –Exemplar-based Learning –Rule Induction –TBEDL ML: Classical methods from AI.
Mining Topic-Specific Concepts and Definitions on the Web Bing Liu, etc KDD03 CS591CXZ CS591CXZ Web mining: Lexical relationship mining.
1 CSI 5180: Topics in AI: Natural Language Processing, A Statistical Approach Instructor: Nathalie Japkowicz Objectives of.
GTRI.ppt-1 NLP Technology Applied to e-discovery Bill Underwood Principal Research Scientist “The Current Status and.
Copyright © 2012, SAS Institute Inc. All rights reserved. ANALYTICS IN BIG DATA ERA ANALYTICS TECHNOLOGY AND ARCHITECTURE TO MANAGE VELOCITY AND VARIETY,
Bootstrapping for Text Learning Tasks Ramya Nagarajan AIML Seminar March 6, 2001.
Computational Linguistics. The Subject Computational Linguistics is a branch of linguistics that concerns with the statistical and rule-based natural.
Ontology-based search and knowledge sharing using domain ontologies
Towards the Semantic Web 6 Generating Ontologies for the Semantic Web: OntoBuilder R.H.P. Engles and T.Ch.Lech 이 은 정
Measuring Semantic Similarity between Words Using Web Search Engines WWW 07.
1 CSC 594 Topics in AI – Text Mining and Analytics Fall 2015/16 3. Word Association.
Document Databases for Information Management Gregor Erbach FTW, Wien DFKI, Saarbrucken ETL, Tsukuba
1 NAGA: Searching and Ranking Knowledge Gjergji Kasneci Joint work with: Fabian M. Suchanek, Georgiana Ifrim, Maya Ramanath, and Gerhard Weikum.
CS460/IT632 Natural Language Processing/Language Technology for the Web Lecture 1 (03/01/06) Prof. Pushpak Bhattacharyya IIT Bombay Introduction to Natural.
TWC Illuminate Knowledge Elements in Geoscience Literature Xiaogang (Marshall) Ma, Jin Guang Zheng, Han Wang, Peter Fox Tetherless World Constellation.
Unsupervised Relation Detection using Automatic Alignment of Query Patterns extracted from Knowledge Graphs and Query Click Logs Panupong PasupatDilek.
CONTEXTUAL SEARCH AND NAME DISAMBIGUATION IN USING GRAPHS EINAT MINKOV, WILLIAM W. COHEN, ANDREW Y. NG SIGIR’06 Date: 2008/7/17 Advisor: Dr. Koh,
1 Question Answering and Logistics. 2 Class Logistics  Comments on proposals will be returned next week and may be available as early as Monday  Look.
Department of Computer Science The University of Texas at Austin USA Joint Entity and Relation Extraction using Card-Pyramid Parsing Rohit J. Kate Raymond.
Overview of Statistical NLP IR Group Meeting March 7, 2006.
Semantic Web Technologies Readings discussion Research presentations Projects & Papers discussions.
Trends in NL Analysis Jim Critz University of New York in Prague EurOpen.CZ 12 December 2008.
Ensembling Diverse Approaches to Question Answering
Measuring Monolinguality
A Brief Introduction to Distant Supervision
6 ~ GIR.
Kenneth Baclawski et. al. PSB /11/7 Sa-Im Shin
Reading Report: Open QA Systems
Terminology problems in literature mining and NLP
Statistical NLP: Lecture 13
CSC 594 Topics in AI – Natural Language Processing
Associative Query Answering via Query Feature Similarity
--Mengxue Zhang, Qingyang Li
Web IR: Recent Trends; Future of Web Search
Research at Open Systems Lab IIIT Bangalore
Social Knowledge Mining
CSC 594 Topics in AI – Applied Natural Language Processing
Machine Learning in Natural Language Processing
WordNet WordNet, WSD.
Automatic Detection of Causal Relations for Question Answering
CSE 635 Multimedia Information Retrieval
How to publish in a format that enhances literature-based discovery?
Effective Entity Recognition and Typing by Relation Phrase-Based Clustering
CS246: Information Retrieval
Knowledge Representation for Natural Language Understanding
ProBase: common Sense Concept KB and Short Text Understanding
Learning to Rank Typed Graph Walks: Local and Global Approaches
CS565: Intelligent Systems and Interfaces
Presentation transcript:

Einat Minkov University of Haifa, Israel CL course, U Einat Minkov University of Haifa, Israel CL course, U. Trento March 2, 2017

About myself 2008, PhD from the Language Technologies Institute at Carnegie Mellon University, USA 2008-10, Nokia Research, Cambridge, USA 2010-present, U.Haifa, Israel Teaching: Intro. to AI, text mining, Databases Main research focus: semantics, graph-based inference, information extraction

Information extraction Extraction of structured factual information from text This should be useful for: Question answering Information aggregation Main tasks: Named entity recognition Event extraction Auto. construction of knowledge bases (ontologies)

Information extraction Extraction of structured factual information from text Event extraction Question answering Text (processed) Named entity recognition Document indexing and search Ontology construction Improved syntactic processing

Named entity recognition Find and classify names in text.

Named entity recognition Find and classify names in text.

Named entity recognition Often addressed as a tagging task, using rule-based or using statistical learning, considering: the string value formatting (is it capitalized? Does the word end with `ski’?) lexicon lookups (does `shen’ appear in a dictionary of first person names? or, `ltd’ in the lexicon of company suffixes?) Syntactic and lexical neighborhood (DET on the left? `MR.’ on the left?)

Named entity recognition Applications: Question answering (e.g., “WHO invented.. ?” “WHERE is the CL class?”) Document indexing and linking Preliminary step for higher level IE tasks…

Extraction of events Minkov & Zettlemoyer, ACL’12

Extraction of events Minkov & Zettlemoyer, ACL’12

Extraction of events Seminar slot population Minkov & Zettlemoyer, ACL’12

Extraction of events Seminar slot population Minkov & Zettlemoyer, ACL’12

Extraction of events The system’s output should `make sense’: Start time of seminar before the end time.. The seminar doesn’t take place at night And its duration is longer than 10 minutes and shorter than 2 hours.. The location is one of the rooms at CMU We wish to have relevant world knowledge available Minkov & Zettlemoyer, ACL’12

The holy grail: ontology of world knowledge In addition to ‘is-a’ relations: Synonym / antonym Holonym / meronym related / similar-to …

Lexical knowledge for parsing Structural ambiguities: He broke [the window] [with a hammer] He broke [the window] [with the white curtains] Good probability estimates of P(hammer | broke, with) and P(curtains| window, with) will help with disambiguation Toutanova, Manning & Ng, ICML’04

Lexical knowledge for parsing Pair-wise statistics involving two words are very sparse, even on topics central to the domain of the corpus. Examples from WSJ (1million words): stocks plummeted 2 occurrences stocks stabilized 1 occurrence stocks rose 50 occurrences stocks skyrocketed 0 occurrences stocks laughed 0 occurrences Toutanova, Manning & Ng, ICML’04

Lexical knowledge for parsing stabilize stabilized stabilizing rise climb morphology synonyms skyrocket “is-a” relationships Examples with pictures Toutanova, Manning & Ng, ICML’04

Using multiple similarity measures and chaining inferences stocks rose rise More nodes and links; show a path skyrocket skyrocketed Toutanova, Manning & Ng, ICML’04

The holy grail: ontology of world knowledge Why `holy grail’? 1. It is a hard task 2. World knowledge is very dynamic Dalvi, Minkov, Talukdar & Cohen, WSDM’04

Relation extraction YAGO: Yet Another Great Ontology Entity subclass subclass subclass Organization Person Location subclass subclass subclass subclass subclass Country Scientist Politician subclass subclass State instanceOf instanceOf Biologist instanceOf Physicist City instanceOf Germany instanceOf instanceOf locatedIn Oct 23, 1944 Erwin_Planck diedOn locatedIn Kiel Schleswig-Holstein FatherOf bornIn Nobel Prize hasWon instanceOf citizenOf diedOn Oct 4, 1947 Max_Planck Society Max_Planck Angela Merkel Apr 23, 1858 bornOn means(0.9) means means means means(0.1) “Max Planck” “Max Karl Ernst Ludwig Planck” “Angela Merkel” “Angela Dorothea Merkel” YAGO: Yet Another Great Ontology [Suchanek et al.: WWW’07]

Automatic KB population Gazetteers, tables, and text-based: General `Hearst patterns’ (92): Learning type-specific contexts:

Knowledge Bases and NLP KBs used for text processing tasks: Named entity recognition Event extraction Entity linking and disambiguation Question answering Syntactic structures being increasingly used for KB population and fact extraction e.g., “Leveraging Linguistic Structure For Open Domain Information Extraction”, Angeli, Premkumar & Manning, ACL’15 Still an open question how to effectively interface language with world knowledge

Thank you! einatm@is.haifa.ac.il