1 An Integrated Annotation DB in OntoNotes Sameer Pradhan, Eduard Hovy, Mitchell Marcus, Martha Palmer, Lance Ramshaw, and Ralph Weischedel

Slides:



Advertisements
Similar presentations
C6 Databases.
Advertisements

Layering Semantics (Putting meaning into trees) Treebank Workshop Martha Palmer April 26, 2007.
Introduction to Databases
Prentice Hall, Database Systems Week 1 Introduction By Zekrullah Popal.
Management Information Systems, Sixth Edition
Steven Schoonover.  What is VerbNet?  Levin Classification  In-depth look at VerbNet  Evolution of VerbNet  What is FrameNet?  Applications.
1 OntoNotes: A Unified Relational Semantic Representation Sameer Pradhan, Eduard Hovy, Mitchell Marcus, Martha Palmer, Lance Ramshaw, and Ralph Weischedel.
Annotation Types for UIMA Edward Loper. UIMA Unified Information Management Architecture Analytics framework –Consists of components that perform specific.
File Systems and Databases
1 Introduction The Database Environment. 2 Web Links Google General Database Search Database News Access Forums Google Database Books O’Reilly Books Oracle.
1 COS 425: Database and Information Management Systems XML and information exchange.
Integrating data sources on the World-Wide Web Ramon Lawrence and Ken Barker U. of Manitoba, U. of Calgary
Ch1: File Systems and Databases Hachim Haddouti
Lance Ramshaw (with Ralph Weischedel) BBN. 2 Ontobank Coreference Part of the multi-site Ontobank effort –Intended to combine with word-sense and propositional.
OntoNotes project Treebank Syntax Training Data Decoders Propositions Verb Senses and verbal ontology links Noun Senses and targeted nominalizations Coreference.
Type Systems, Interoperability and Database Population Eric Nyberg, CMU Shilpa Arora, CMU Lance Ramshaw, BBN.
PropBank Martha Palmer University of Colorado. Unified Linguistic Annotation: Merging PropBank, NomBank, TimeBank, Penn Discourse Treebank, Coreference,
Prevalent Database Models (Advantages of a database over flat files)
Database Management COP4540, SCS, FIU An Introduction to database system.
Chapter One Overview of Database Objectives: -Introduction -DBMS architecture -Definitions -Data models -DB lifecycle.
Database Design and Introduction to SQL
Introduction to Database Concepts
Databases From A to Boyce Codd. What is a database? It depends on your point of view. For Manovich, a database is a means of structuring information in.
Database Design - Lecture 1
 Introduction Introduction  Purpose of Database SystemsPurpose of Database Systems  Levels of Abstraction Levels of Abstraction  Instances and Schemas.
PropBank, VerbNet & SemLink Edward Loper. PropBank 1M words of WSJ annotated with predicate- argument structures for verbs. –The location & type of each.
STORING ORGANIZATIONAL INFORMATION— DATABASES CIS 429—Chapter 7.
1 INTRODUCTION TO DATABASE MANAGEMENT SYSTEM L E C T U R E
Database System Concepts and Architecture
Management Information Systems By Effy Oz & Andy Jones
Interpreting Dictionary Definitions Dan Tecuci May 2002.
Microsoft Access 2003 Define some key Access terminology: Field – A single characteristic or attribute of a person, place, object, event, or idea. Record.
CHAPTER 8: MANAGING DATA RESOURCES. File Organization Terms Field: group of characters that represent something Record: group of related fields File:
The main mathematical concepts that are used in this research are presented in this section. Definition 1: XML tree is composed of many subtrees of different.
311: Management Information Systems Database Systems Chapter 3.
Component 4: Introduction to Information and Computer Science Unit 6: Databases and SQL Lecture 2 This material was developed by Oregon Health & Science.
Databases From A to Boyce Codd. What is a database? It depends on your point of view. For Manovich, a database is a means of structuring information in.
1 Schema Registries Steven Hughes, Lou Reich, Dan Crichton NASA 21 October 2015.
1.file. 2.database. 3.entity. 4.record. 5.attribute. When working with a database, a group of related fields comprises a(n)…
Storing Organizational Information - Databases
Announcements. Data Management Chapter 12 Traditional File Approach  Structure Field  Record  File  Fixed All records have common fields, and a field.
MIS 327 Database Management system 1 MIS 327: DBMS Dr. Monther Tarawneh Dr. Monther Tarawneh Week 2: Basic Concepts.
AQUAINT Workshop – June 2003 Improved Semantic Role Parsing Kadri Hacioglu, Sameer Pradhan, Valerie Krugler, Steven Bethard, Ashley Thornton, Wayne Ward,
McGraw-Hill/Irwin Copyright © 2013 by The McGraw-Hill Companies, Inc. All rights reserved. Chapter 5 Data Resource Management.
Prepared By Prepared By : VINAY ALEXANDER ( विनय अलेक्सजेंड़र ) PGT(CS),KV JHAGRAKHAND.
1 CSE 2337 Introduction to Data Management Textbook: Chapter 1.
Component 4/Unit 6b Topic II Relational Databases Keys and relationships Data modeling Database acquisition Database Management System (DBMS) Database.
Agenda for Presenation  What is NLIDB  What has been done  What is to be done.
Management Information Systems, 4 th Edition 1 Chapter 8 Data and Knowledge Management.
Foundations of Business Intelligence: Databases and Information Management.
Assoc. Prof. Dr. Ahmet Turan ÖZCERİT.  The concept of Data, Information and Knowledge  The fundamental terms:  Database and database system  Database.
Object storage and object interoperability
Towards Unifying Vector and Raster Data Models for Hybrid Spatial Regions Philip Dougherty.
A Portrait of the Semantic Web in Action Jeff Heflin and James Hendler IEEE Intelligent Systems December 6, 2010 Hyewon Lim.
Hoi Le. Why database? Spreadsheet is not good to: Store very large information Efficiently update data Use in multi-user mode Hoi Le2.
What is a database? (a supplement, not a substitute for Chapter 1…) some slides copied/modified from text Collection of Data? Data vs. information Example:
Management Information Systems by Prof. Park Kyung-Hye Chapter 7 (8th Week) Databases and Data Warehouses 07.
Geographic Information Systems GIS Data Databases.
An Introduction to database system
Overview of MDM Site Hub
Relational Databases.
Microsoft Access 2003 Illustrated Complete
MANAGING DATA RESOURCES
Database.
File Systems and Databases
Chapter 5 Data Resource Management.
Relational Database Model
MANAGING DATA RESOURCES
Database Systems Instructor Name: Lecture-3.
Presentation transcript:

1 An Integrated Annotation DB in OntoNotes Sameer Pradhan, Eduard Hovy, Mitchell Marcus, Martha Palmer, Lance Ramshaw, and Ralph Weischedel

2 Goals  Capture multiple layers of annotation and modeling –Syntax –Propositions –Word sense –Ontology –Coreference –Names  Using an integrated relational database representation –Enforces consistency across the different annotations –Supports integrated models that can combine evidence from different layers

3 Unified Representation  Provide a bare-bones representation independent of the individual semantics that can –Efficiently capture intra- and inter- layer semantics –Maintain component independence –Provide mechanism for flexible integration –Integrate information at the lowest level of granularity  A Relational Database

4 Unified Relational Representation Corpus Trees Coreference Names Propositions Senses

5 Example: DB Representation of Syntax Treebank tokens (stored in the Token table) provide the common base The Tree table stores the recursive tree nodes, each with its span Subsidiary tables define the sets of function tags, phase types, etc.

6 Advantages of an Integrated Representation  Each layer translates into a common representation  Clean, consistent layers –Resolve the inconsistencies and problems that this reveals  Well defined relationships –Database schema defines the merged structure efficiently  Original representations available as predefined views –Treebank, PropBank, etc.  SQL queries can extract examples based on multiple layers or define new views  Python Object-oriented API allows for programmatic access to tables and queries

7 Syntax Layer  Identifies meaningful phrases in the text  Lays out the structure of how they are related Concerns about the pace of the Vienna talks -- which are aimed at the destruction of some 100,000 weapons, as well as major reductions and realignments of troops in central Europe – also are being registered at the Pentagon. S major reductions and realignments of troops in central Europe... major reductions and realignments of troops in central Europe –... NP JJNNSCCNNSINNP NNS PP INNP JJNNP PP SYNTAX

8 ARG2 ARG1 ARGM-LOC Propositional Structure  Tells who did what to whom  For both verbs and nouns Concerns about the pace of the Vienna talks -- which are aimed at the destruction of some 100,000 weapons, as well as major reductions and realignments of troops in central Europe – also are being registered at the Pentagon.... major reductions and realignments of troops in central Europe –... NP JJNNSCCNNSINNP NNS PP INNP JJNNP PP S Concerns about the pace of the Vienna talks -- which are aimed at the destruction of some 100,000 weapons, as well as major reductions and realignments of troops in central Europe – also are being registered at the Pentagon.

9 reduce.01 – Make less aim.02 – Directed motion Predicate Frames Concerns about the pace of the Vienna talks -- which are aimed at the destruction of some 100,000 weapons, as well as major reductions and realignments of troops in central Europe – also are being registered at the Pentagon. Predicate Frames aim aim.01 – Plan aim.02 – Directed motion ARG0 – Aimer ARG1 – Action ARG0 – Aimer ARG1 – Thing in motion ARG2 – Target Predicate Frames reduction reduce.01 – Make less ARG0 – Agent ARG1 – Thing falling ARG2 – Amount fallen ARG3 – Starting point ARG4 – Ending point  Predicate frames define the meanings of the numbered arguments

10 Word Sense and Ontology  Meaning of nouns and verbs are specified  All the senses are annotatable at 90% inter-annotator agreement  Catalog of possible meanings supplied in the sense inventory files  Ontology links (currently being added) will capture similarities between related senses of different words Concerns about the pace of the Vienna talks -- which are aimed at the destruction of some 100,000 weapons, as well as major reductions and realignments of troops in central Europe – also are being registered at the Pentagon. Word Sense aim 1.Point or direct object, weapon, at something... 2.Wish, purpose or intend to achieve something Word Sense register 1.Enter into an official record 2.Be aware of, enter into someone’s conciousness 3.Indicate a measurement 4.Show in one’s face 2.Wish, purpose or intend to achieve something 1.Enter into an official record Concerns about the pace of the Vienna talks -- which are aimed at the destruction of some 100,000 weapons, as well as major reductions and realignments of troops in central Europe – also are being registered at the Pentagon.

11 Coreference  Identifies different mentions of the same entity in text – especially links definite, referring noun phrases, and pronouns in text  Two types – Identity as well as Attributive coreference tagged. Concerns about the pace of the Vienna talks -- which are aimed at the destruction of some 100,000 weapons, as well as major reductions and realignments of troops in central Europe – also are being registered at the Pentagon. President Bush conventional arms talk the Pentagon Vienna talks – which are aimed at the destruction of some 100,000 weapons, as well as major reductions and realignments of troops in central Europe the Pentagon Pentagon He e0

12 Example of DB Query Function for a_proposition in a_proposition_bank: if(a_proposition.lemma != "say"): arg_in_p_q = "select * from argument where proposition_id = '%s';" % (a_proposition.id) a_cursor.execute(arg_in_p_query) argument_rows = a_cursor.fetchall() for a_argument_row in argument_rows: a_argument_id = a_argument_row["id"] a_argument_type = a_argument_row["type"] if(a_argument_type != "ARG0"): n_in_arg_q = "select * from argument_node where argument_id = '%s';" % (a_argument_id) a_cursor.execute(n_in_arg_q) argument_node_rows = a_cursor.fetchall() for a_argument_node_row in argument_node_rows: a_node_id = a_argument_node_row["node_id"] a_ne_node_query = "select * from name_entity where subtree_id = '%s';" % (a_node_id) a_cursor.execute(a_ne_node_query) ne_rows = a_cursor.fetchall() for a_ne_row in ne_rows: a_ne_type = a_ne_row["type"] ne_hash[a_ne_type] = ne_hash[a_ne_type] + 1 a_tree = a_tree_document.get_tree(a_tree_id) a_node = a_tree.get_subtree(a_node_id) for a_child in a_node.subtrees(): a_ne_subtree_query = "select * from name_entity where subtree_id = '%s';" % (a_child.id) subtree_ne_rows = a_cursor.execute(a_ne_subtree_query) ne_subtree_rows = a_cursor.fetchall() for a_ne_subtree_row in ne_subtree_rows: a_subtree_ne_type = a_ne_subtree_row["type"] ne_hash[a_subtree_ne_type] = ne_hash[a_subtree_ne_type] + 1 if (proposition.lemma == “say”): query = “select * from argument where proposition_id = '%s';”.. What is the distribution of named entities that are ARG0s of the predicate “say”? if (argument_type == "ARG0"): for child in node.subtrees(): Name EntityFrequency Person84 GPE34 Organization29 NORP15...

13 Conclusion  Integrating the annotation layers using a relational schema –Improves consistency –Allows predictive features that combine evidence from multiple layers  Easily Accessible –Through Python API –SQL queries