Automatic Ontology Extraction Miloš Husák RASLAN 2010.

Slides:



Advertisements
Similar presentations
CILC2011 A framework for structured knowledge extraction and representation from natural language via deep sentence analysis Stefania Costantini Niva Florio.
Advertisements

Statistical NLP: Lecture 3
Albert Gatt LIN3022 Natural Language Processing Lecture 8.
Amirkabir University of Technology Computer Engineering Faculty AILAB Efficient Parsing Ahmad Abdollahzadeh Barfouroush Aban 1381 Natural Language Processing.
Information Retrieval and Extraction 資訊檢索與擷取 Chia-Hui Chang National Central University
Machine Learning in Natural Language Processing Noriko Tomuro November 16, 2006.
1/17 Acquiring Selectional Preferences from Untagged Text for Prepositional Phrase Attachment Disambiguation Hiram Calvo and Alexander Gelbukh Presented.
Comments on Guillaume Pitel: “Using bilingual LSA for FrameNet annotation of French text from generic resources” Gerd Fliedner Computational Linguistics.
Syntactic Pattern Recognition Statistical PR:Find a feature vector x Train a system using a set of labeled patterns Classify unknown patterns Ignores relational.
(2.1) Grammars  Definitions  Grammars  Backus-Naur Form  Derivation – terminology – trees  Grammars and ambiguity  Simple example  Grammar hierarchies.
The College of Saint Rose CIS 433 – Programming Languages David Goldschmidt, Ph.D. from Concepts of Programming Languages, 9th edition by Robert W. Sebesta,
Empirical Methods in Information Extraction Claire Cardie Appeared in AI Magazine, 18:4, Summarized by Seong-Bae Park.
Probabilistic Parsing Reading: Chap 14, Jurafsky & Martin This slide set was adapted from J. Martin, U. Colorado Instructor: Paul Tarau, based on Rada.
“How much context do you need?” An experiment about context size in Interactive Cross-language Question Answering B. Navarro, L. Moreno-Monteagudo, E.
Survey of Semantic Annotation Platforms
Authors: Ting Wang, Yaoyong Li, Kalina Bontcheva, Hamish Cunningham, Ji Wang Presented by: Khalifeh Al-Jadda Automatic Extraction of Hierarchical Relations.
Interpreting Dictionary Definitions Dan Tecuci May 2002.
A view-based approach for semantic service descriptions Carsten Jacob, Heiko Pfeffer, Stephan Steglich, Li Yan, and Ma Qifeng
Annotating Words using WordNet Semantic Glosses Julian Szymański Department of Computer Systems Architecture, Faculty of Electronics, Telecommunications.
Using Short-Answer Format Questions for an English Grammar Tutoring System Conceptualization & Research Planning Jonggun Gim.
LOGIC AND ONTOLOGY Both logic and ontology are important areas of philosophy covering large, diverse, and active research projects. These two areas overlap.
Content analysis and CERN Roman Chyla. Artificial intelligence Natural language processing Web of data Content analysis.
1 Syntax In Text: Chapter 3. 2 Chapter 3: Syntax and Semantics Outline Syntax: Recognizer vs. generator BNF EBNF.
Approaches to Machine Translation CSC 5930 Machine Translation Fall 2012 Dr. Tom Way.
Parsing Introduction Syntactic Analysis I. Parsing Introduction 2 The Role of the Parser The Syntactic Analyzer, or Parser, is the heart of the front.
Machine Learning Chapter 5. Artificial IntelligenceChapter 52 Learning 1. Rote learning rote( โรท ) n. วิถีทาง, ทางเดิน, วิธีการตามปกติ, (by rote จากความทรงจำ.
1/21 Automatic Discovery of Intentions in Text and its Application to Question Answering (ACL 2005 Student Research Workshop )
Artificial Intelligence: Natural Language
For Monday Read chapter 24, sections 1-3 Homework: –Chapter 23, exercise 8.
For Monday Read chapter 26 Last Homework –Chapter 23, exercise 7.
For Friday Finish chapter 23 Homework –Chapter 23, exercise 15.
CS 4705 Lecture 17 Semantic Analysis: Robust Semantics.
Syntax and Semantics Form and Meaning of Programming Languages Copyright © by Curt Hill.
FILTERED RANKING FOR BOOTSTRAPPING IN EVENT EXTRACTION Shasha Liao Ralph York University.
Parsing and Code Generation Set 24. Parser Construction Most of the work involved in constructing a parser is carried out automatically by a program,
CSC312 Automata Theory Lecture # 26 Chapter # 12 by Cohen Context Free Grammars.
Concepts and Realization of a Diagram Editor Generator Based on Hypergraph Transformation Author: Mark Minas Presenter: Song Gu.
Chapter 4: Syntax analysis Syntax analysis is done by the parser. –Detects whether the program is written following the grammar rules and reports syntax.
For Monday Read chapter 26 Homework: –Chapter 23, exercises 8 and 9.
Overview of Statistical NLP IR Group Meeting March 7, 2006.
1 OOAD Mehwish Shafiq. 2 The Requirements specification must provide sufficient information to enable the system to be constructed “ successfully. ” Functional.
5 Lecture in math Predicates Induction Combinatorics.
An Ontology-based Automatic Semantic Annotation Approach for Patent Document Retrieval in Product Innovation Design Feng Wang, Lanfen Lin, Zhou Yang College.
Natural Language Processing Vasile Rus
1 Commonsense Reasoning in and over Natural Language Hugo Liu Push Singh Media Laboratory Massachusetts Institute of Technology Cambridge, MA 02139, USA.
Trends in NL Analysis Jim Critz University of New York in Prague EurOpen.CZ 12 December 2008.
5. Context-Free Grammars and Languages
Chapter 3 – Describing Syntax
Approaches to Machine Translation
Statistical NLP: Lecture 3
Basic Parsing with Context Free Grammars Chapter 13
What does it mean? Notes from Robert Sebesta Programming Languages
Syntax Analysis Chapter 4.
Machine Learning in Natural Language Processing
CSCI 5832 Natural Language Processing
Chapter 2: A Simple One Pass Compiler
R.Rajkumar Asst.Professor CSE
Parts of Speech Mr. White English I.
KNOWLEDGE REPRESENTATION
Automatic Detection of Causal Relations for Question Answering
Approaches to Machine Translation
Towards Semantics Generation
Linguistic Essentials
Information Networks: State of the Art
Theory of Computation Lecture #
USING AMBIGUOUS GRAMMARS
Artificial Intelligence 2004 Speech & Natural Language Processing
Information Retrieval
COMPILER CONSTRUCTION
CH 4 - Language semantics
Presentation transcript:

Automatic Ontology Extraction Miloš Husák RASLAN 2010

My apologies for “I believe” and “I think”

“I believe that ontologies can improve performance of word-sense desambiguation, and therefore improve parsing and many other tasks.” “Existing ontologies do not seem to be suitable for word-sense desambiguation” “Manual building of language resources is slow, expensive, tiresome and inaccurate.” Motivation

What kind of information do YOU think we need for word-sense desambiguation?

Comparison of existing technologies ● Semantic hierarchy: WordNet ● Semantic roles: FrameNet ● Grammars: Synt ● Syntactic properties: Verbalex ● Problems: Incomplete/overly complete, inconsistent, contradicting, expensive, no signs of use for analysis

What is the ontology? ”Ontology deals with questions concerning whether entities exist or can be said to exist, and how such entities can be grouped, related within a hierarchy, and subdivided according to similarities and differences.” “Some claim that ontology should contain only nouns, some claim it should contain all words, some people prefer concepts or synstes.” “I see ontology as a sort of knowledge representation – dense network of multitude of relations between any relevant objects, that can help desambiguate between senses (or at least parts of speech).

Ontology tiger name swimmin g White fur mammal lion In the pool is Has is attribute is likes animal Is furwhite Attribute Odin called is

Three approaches From the least ambitious to the least probable: ● Use of semantically annotated patterns ● Use of syntactic parser with semantically annotated rules ● Generation of the on-the-fly grammar ● Which of them get the best results?

Semantically annotated patterns ● Patterns simillar to the word-sketch definitions with relations ● The alghoritm would probably extract spare ontology for small corpora... (the more the better)

Semantically annotated grammar ● Can use parser – e.g. Synt to analyze sentences. ● It would remember frequencies of all possible parses. ● It should be possible to decide, which of the possible pareses is correct based on the statistics for each word (in contrast with statistics for grammar rules).

On-the-fly grammar generation ● Extract new words from sentences that contain only one unknown word. ● Put the new words into ontology based on the location of known fitting words. ● Define the relations by used grammar rules as well as the expressions itself. ● Remember all derivation trees in the ontology along with the frequencies of every word in every leaf. ● Learn the relations on unambiguous simple sentences and rememeber statistics for each node. ● Use the learned values to decide ambiguous sentences. ●

Additional notes ● As a postprocessing, it should be possible to express the relations between different forms of otherwise identical relations (e.g. passive and active verb forms, cases etc...) and cross-add the relations that were learned for one form, but not for the other one.

Anti-nuclear protestors released live cockroaches inside the White House, and these were arrested when they left and blocked a security gate. Example Application

Questions? Suggestions?

Thank you for your attention! Miloš Husák

References ● CONCEPT ACQUISITION IN EXAMPLE-BASED GRAMMAR AUTHORING; Ye-Yi Wang and Alex Acero; Microsoft Research (2003)