SemAF – Basics: Semantic annotation framework Harry Bunt Tilburg University isa -6 Joint ISO - ACL/SIGSEM workshop Oxford, 11 - 12 January 2011 TC 37/SC.

Presentation transcript:

SemAF – Basics: Semantic annotation framework
Harry Bunt, Tilburg University
ISA-6, Joint ISO - ACL/SIGSEM workshop, Oxford, January 2011
TC 37/SC 4/WG 2, Kiyong Lee, convenor

Outline
- Background: ad-hoc Task Domain Group TDG 3; LIRICS; SemAF part 1 (time and events); part 2 (dialogue acts); ...
- General (ISO, LAF) considerations on annotation standards
- Specific LAF requirements
- Additional or elaborated methodological requirements
  - Principle of Additivity (Complementarity)
  - Abstract versus concrete syntax
  - Semantics for abstract syntax
  - Requirements on representation formats
  - Metamodel and abstract syntax
  - Core entities, extensions and subschemas
  - Layers and integrated annotation/representation
- Conclusion: How to move further?

Aims
- Make explicit what is, or should be, common to the various parts of SemAF (24617)
- Ensure consistency of the various parts of SemAF (24617):
  - Their aims
  - Their methodology
  - Their annotation schemes
  - Their representation schemes
- Provide guidelines for future parts of SemAF

General requirements on linguistic annotation standards
- Media independence (common mechanisms should be provided to handle all media types, including text, audio, video, etc.)
- Data integrity (use a standoff rather than an inline representation format)
- Machine processability (representations must be machine readable and interpretable; the burden of interpretation should not be left to the processing software)
- Human readability (representations must be human readable, at least for creation and editing)

LAF requirements:
- Distinguish annotation from representation.
  - An annotation is certain linguistic information that is added to language data, independent of its representation.
  - A representation is the format into which annotation is rendered, independent of its content.
- Distinguish systematically between content and reference in annotation representations
- Uniform and TEI-compliant way of referring to relevant segments of source data
- Uniform way of cross-referencing between different layers of annotation
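The content/reference distinction can be illustrated with a minimal standoff sketch. This is not LAF's own format: the class names `Segment` and `Annotation`, the use of character offsets as anchors, and the example sentence are all assumptions made for illustration.

```python
from dataclasses import dataclass

# The source data is never modified; annotations only point into it.
SOURCE = "John arrived at noon."


@dataclass(frozen=True)
class Segment:
    """Reference part: a span of the source, given by character offsets
    (a hypothetical anchoring scheme, not the LAF one)."""
    start: int
    end: int

    def text(self, source: str) -> str:
        return source[self.start:self.end]


@dataclass(frozen=True)
class Annotation:
    """Content part: linguistic information attached to a segment."""
    ident: str
    target: Segment
    label: str


# Two standoff annotations over the same, untouched source string.
event = Annotation("e1", Segment(5, 12), "EVENT")
time_expr = Annotation("t1", Segment(16, 20), "TIME")

print(event.label, "->", event.target.text(SOURCE))
print(time_expr.label, "->", time_expr.target.text(SOURCE))
```

Because the labels live apart from the text, a second annotation layer could refer to `e1` or `t1` by identifier without touching the source, which is the kind of cross-referencing the last requirement asks for.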

SemAF-specific requirements (1)
- Semantic additivity (semantic annotations should add semantic information to source data, rather than, e.g., 'flag' semantic phenomena)
- Semantic explicitness (information in an annotation scheme must be explicit: the burden of interpretation should not be left to the processing software)
- Conceptual consistency (concepts used in annotations in different SemAF parts should have the same meaning; related concepts in different SemAF parts should be semantically consistent; underlying metamodels should be mutually consistent)
- Representational consistency (a single mechanism should be used to represent the same type of information; there must be a consistent underlying data model)

SemAF-specific requirements (2)
- Methodological consistency (Bunt, ICGL-2, Hong Kong, January 2010; Ide & Bunt, LAW-IV, Uppsala, July 2010):
  - Conceptual analysis: metamodel
  - Abstract syntax: extended formal specification of metamodel
  - Definition of formal semantics of abstract syntax
  - Concrete syntax: definition of 'ideal' representation format
  - Core entities; extensions; subschemas
  - Relation to Data Category Registry

Additivity and Explicitness
- Annotations (ad notare ≈ adding notes to) add information to portions of source text (cf. LAF); semantic annotations add semantic information to source text.
- Semantic annotations can only count as such if they have a formal semantics (Bunt & Romary, 2002), which makes them machine-interpretable.

Conceptual consistency
- ISO-TimeML: events subdivided into transitions, processes, and states; ISO-Semantic Roles?
- ISO-TimeML: event-time relations like AT, DURING, DURATION; ISO-Semantic Roles: temporal semantic roles
- ISO-Space: event-location relations; ISO-Semantic Roles: semantic roles relating motion events to locations etc. (Location, Source, Goal, Distance, ...)
- ISO-Dialogue Acts: rhetorical relations between dialogue acts like Explanation, Justification, Exemplification; ISO-DS: similar discourse relations

Abstract and concrete syntax of an annotation language
- Abstract syntax is a formal specification of the categories of objects and relations in a metamodel, describing how these elements may be combined to form annotations, defined as set-theoretical constructs;
- Concrete syntax specifies a particular format for the representation of annotations.
→ The abstract/concrete syntax distinction implements the fundamental distinction between annotations and representations made by LAF.
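A small sketch may make the distinction concrete. Everything here is assumed for illustration: the shape of the structures (pairs and triples), the markable identifiers `m1` and `m2`, and the XML element and attribute names. Only the relation name AT is taken from the slides.

```python
import xml.etree.ElementTree as ET

# Abstract syntax: an annotation structure as a set-theoretical construct.
# Entity structure: (markable, category).
# Link structure: (entity, entity, relation).
event = ("m1", "EVENT")
instant = ("m2", "TIME")
annotation = (
    frozenset({event, instant}),
    frozenset({(event, instant, "AT")}),
)


def to_xml(structure) -> str:
    """One possible concrete syntax: render the abstract structure as XML.
    Element and attribute names are hypothetical."""
    entities, links = structure
    root = ET.Element("annotation")
    for markable, category in sorted(entities):
        ET.SubElement(root, "entity", target=markable, cat=category)
    for source, target, relation in sorted(links):
        ET.SubElement(root, "link", rel=relation,
                      source=source[0], target=target[0])
    return ET.tostring(root, encoding="unicode")


xml_text = to_xml(annotation)
print(xml_text)
```

The tuple-and-set structure carries no commitment to XML; a different rendering function over the same `annotation` value would give a different concrete syntax for the same abstract one.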

Semantics, abstract and concrete syntax
- Semantics of semantic annotations should be defined for abstract syntax, rather than for some concrete representation format.
→ Advantage: every representation format for the same abstract syntax has the same semantics

Requirements on representation formats
- Expressive adequacy: each annotation structure can be represented in this format;
- 'Unambiguity': each representation encodes a unique annotation structure.
- A representation format that satisfies these requirements is called ideal (Bunt, ICGL-2, Hong Kong, January 2010)
→ Representations in one ideal format can be converted in a meaning-preserving way to any other ideal format.
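One way to read the conversion claim is that each ideal format comes with an encoding function that has an inverse, so a conversion can be obtained by decoding from one format and encoding into the other. The sketch below assumes this reading. The two toy formats (XML and JSON), the function names, and the structure type are hypothetical and not taken from the standard.

```python
import json
import xml.etree.ElementTree as ET

# Abstract annotation structure (hypothetical): a set of
# (markable, category) pairs.


def encode_xml(structure) -> str:
    """Format 1: total (expressive adequacy), deterministic output."""
    root = ET.Element("annotation")
    for markable, category in sorted(structure):
        ET.SubElement(root, "entity", target=markable, cat=category)
    return ET.tostring(root, encoding="unicode")


def decode_xml(text: str) -> frozenset:
    """Inverse of encode_xml: each representation yields one structure."""
    return frozenset((e.get("target"), e.get("cat"))
                     for e in ET.fromstring(text))


def encode_json(structure) -> str:
    """Format 2: the same structures as a sorted JSON list."""
    return json.dumps([{"target": m, "cat": c} for m, c in sorted(structure)])


def decode_json(text: str) -> frozenset:
    """Inverse of encode_json."""
    return frozenset((d["target"], d["cat"]) for d in json.loads(text))


def xml_to_json(text: str) -> str:
    """Conversion by composition: decode from format 1, encode in format 2."""
    return encode_json(decode_xml(text))


def json_to_xml(text: str) -> str:
    """Conversion in the opposite direction."""
    return encode_xml(decode_json(text))


structure = frozenset({("m1", "EVENT"), ("m2", "TIME")})
as_xml = encode_xml(structure)
as_json = xml_to_json(as_xml)
print(as_json)
```

Since both conversions pass through the abstract structure, any semantics defined on that structure is carried over unchanged, which is the sense in which the conversion is meaning-preserving in this sketch.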

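Ideal concrete syntax
[Diagram: the abstract syntax is linked to the semantics by I_a, to ideal concrete syntax-1 by F_1 and F_1^-1, and to ideal concrete syntax-2 by F_2 and F_2^-1; C_12 and C_21 link the two ideal concrete syntaxes.]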

Core concepts, extensions, and subschemas; and the DCR
- A standard specifies:
  - core concepts;
  - principles for adding elements to the set of core concepts;
  - principles for subschemas of a standard annotation schema.
- Core concepts should be entered into the ISO DCR

Things that cut across SemAF parts
- Overlaps, e.g.
  - Events and their classification (ISO-TimeML, ISO-Space, Semantic roles)
  - Time and place (ISO-TimeML, ISO-Space, Semantic roles, ISO-NE)
  - Rhetorical and other coherence relations in dialogue and discourse (ISO-Dialogue acts, ISO-DS)
- Cutting across:
  - Negation; modality
  - Quantification; modification

References
- Bunt, Harry (2010) A methodology for designing semantic annotation languages. In Proceedings of the 2nd International Conference on Global Interoperability for Language Resources (ICGL-2), Hong Kong, January 2010.
- Bunt, Harry (2011) Multifunctionality in dialogue. Computer Speech and Language 25.
- Ide, Nancy and Harry Bunt (2010) Anatomy of semantic annotation schemes: Mappings to GrAF. In Proceedings of the 4th Linguistic Annotation Workshop (LAW-IV), Uppsala, July 2010.