Published by Nigel Tyler. Modified over 9 years ago.
1
“How to put an annotation in HTML?”
Ioannis Stavrakantonakis
© Copyright 2013 STI INNSBRUCK, www.sti-innsbruck.at
2
Outline
– Research question
– ITS 2.0
– NIF
– What about Microdata?
– Demo
– References
3
Research question
We want to annotate Springfield with an URI to make sure that the computer understands we mean the Springfield in Massachusetts.
HTML: It is well known, that Springfield has mild summers and short, but hard winters.
HTML with annotation (something like that): It is well known, that Springfield http://sws.geonames.org/4951788/ has mild summers and short, but hard winters.
We don't want to add whole triples, but just annotate the HTML and say "this element refers to the following URI".
From: Denny Vrandečić
Sent: Wednesday, April 24, 2013 1:59 PM
To: semantic-web at W3C
Subject: How to put an annotation in HTML?
4
ITS 2.0
International Tag Set (ITS) [2]
– enhances the foundation to integrate automated processing of human language into core Web technologies;
– focuses on HTML and XML-based formats in general, and can leverage processing based on the XML Localization Interchange File Format (XLIFF) as well as the Natural Language Processing Interchange Format (NIF);
– is a technology to add metadata to Web content, for the benefit of localization, language technologies, and internationalization (see [5] for more on localization (l10n) and internationalization (i18n)).
5
ITS 2.0
Potential users of ITS [2]:
– Schema developers starting a schema from the ground up (proposals for attribute and element names to be included in their new schema)
– Schema developers working with an existing schema (should check whether their schemas support the markup proposed in this specification and, where appropriate, add the markup proposed here to their schema)
– Vendors of content-related tools (e.g. tools for authoring, translation, etc.)
– Content producers (may use it to mark up specific bits of content)
– Machine translation systems
– Text analytics (automatically generated metadata for improving localization, data integration or knowledge management workflows)
– Localization workflow managers
6
ITS 2.0
The Text Analysis use case: this data category is used to annotate content with lexical or conceptual information for the purpose of contextual disambiguation.
Three pieces of annotation:
– Confidence: the confidence of the agent (that produced the annotation) in its own computation. XSD double data type (e.g. 0.63)
– Entity type: the type of entity, or concept class, of the text analysis target. IRI (e.g. http://nerd.eurecom.fr/ontology#Location [8])
– Entity identifier: a unique identifier for the text analysis target. IRI or string (e.g. http://dbpedia.org/page/Innsbruck or the identifier for “Capital” from WordNet [9])
7
ITS 2.0
Rendered HTML: HTML with ITS metadata: Welcome to Innsbruck in Austria!
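The markup of this example did not survive in the transcript, so the following is only a sketch of what such HTML might look like and how the three pieces of annotation could be read back out. The attribute names (its-ta-confidence, its-ta-class-ref, its-ta-ident-ref, its-annotators-ref) are those of the ITS 2.0 Text Analysis data category; the annotator IRI, the choice of the DBpedia /resource/ form, and all Python names are assumptions made for illustration. The parser assumes an annotated element contains only text.

```python
from html.parser import HTMLParser

# Hypothetical markup: a guess at the slide's example, not the original.
# http://example.org/annotator is a placeholder annotator IRI.
HTML_WITH_ITS = """<!DOCTYPE html>
<html lang="en" its-annotators-ref="text-analysis|http://example.org/annotator">
<body>
<p>Welcome to <span its-ta-confidence="0.63"
  its-ta-class-ref="http://nerd.eurecom.fr/ontology#Location"
  its-ta-ident-ref="http://dbpedia.org/resource/Innsbruck">Innsbruck</span> in Austria!</p>
</body>
</html>"""


class ItsTextAnalysisParser(HTMLParser):
    """Collects elements carrying ITS 2.0 Text Analysis attributes.

    Simplification: an annotated element is assumed to contain text only
    (no nested elements), so one open annotation is tracked at a time.
    """

    def __init__(self):
        super().__init__()
        self.annotations = []
        self._current = None
        self._open_tag = None

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if self._current is None and any(k.startswith("its-ta-") for k in attr_map):
            confidence = attr_map.get("its-ta-confidence")
            self._current = {
                "text": "",
                "confidence": float(confidence) if confidence is not None else None,
                "class_ref": attr_map.get("its-ta-class-ref"),
                "ident_ref": attr_map.get("its-ta-ident-ref"),
            }
            self._open_tag = tag

    def handle_data(self, data):
        # Accumulate the text content of the currently open annotated element.
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if self._current is not None and tag == self._open_tag:
            self.annotations.append(self._current)
            self._current = None
            self._open_tag = None


def extract_annotations(html):
    """Return a list of dicts, one per annotated element found in `html`."""
    parser = ItsTextAnalysisParser()
    parser.feed(html)
    parser.close()
    return parser.annotations


if __name__ == "__main__":
    for annotation in extract_annotations(HTML_WITH_ITS):
        print(annotation)
```

The annotation lives entirely in attributes, so the rendered page reads the same with or without it.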
8
ITS 2.0
Conversion to NIF [2]:
– Converts XML or HTML documents that contain ITS metadata to an RDF-based format based on NIF. The conversion results in RDF.
– The conversion algorithm to generate NIF consists of seven steps. The output of the algorithm uses the ITS RDF ontology [7].
– The conversion to NIF is a possible basis for a natural language processing (NLP) application that creates, for example, named entity annotations.
– How to integrate the RDF annotations back into the original input document is described in [6] (NIF2ITS).
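To give a feeling for the output of such a conversion, here is a rough sketch that writes one annotation as Turtle. It is not the seven-step algorithm from the specification: it handles a single annotation on already extracted plain text and builds the Turtle by hand instead of using an RDF library. The nif: and itsrdf: terms are taken from the NIF Core and ITS RDF ontologies; the function name and the document IRI in the usage are hypothetical.

```python
import json

NIF = "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#"
ITSRDF = "http://www.w3.org/2005/11/its/rdf#"
XSD = "http://www.w3.org/2001/XMLSchema#"


def annotation_to_nif(text, doc_uri, surface, ident_ref,
                      class_ref=None, confidence=None):
    """Return Turtle describing one annotated substring of `text`.

    `doc_uri` is a hypothetical document IRI. Offsets are character
    indices; only the first occurrence of `surface` is annotated.
    """
    begin = text.index(surface)
    end = begin + len(surface)
    context = f"<{doc_uri}#char=0,{len(text)}>"
    mention = f"<{doc_uri}#char={begin},{end}>"

    # Properties of the annotated substring, offset-addressed via #char=begin,end.
    props = [
        "a nif:RFC5147String",
        f"nif:referenceContext {context}",
        f'nif:beginIndex "{begin}"^^xsd:nonNegativeInteger',
        f'nif:endIndex "{end}"^^xsd:nonNegativeInteger',
        f"nif:anchorOf {json.dumps(surface, ensure_ascii=False)}",
        f"itsrdf:taIdentRef <{ident_ref}>",
    ]
    if class_ref is not None:
        props.append(f"itsrdf:taClassRef <{class_ref}>")
    if confidence is not None:
        props.append(f'itsrdf:taConfidence "{confidence}"^^xsd:double')

    lines = [
        f"@prefix nif: <{NIF}> .",
        f"@prefix itsrdf: <{ITSRDF}> .",
        f"@prefix xsd: <{XSD}> .",
        "",
        # The whole text is the context that the substring refers back to.
        f"{context} a nif:Context ;",
        f"    nif:isString {json.dumps(text, ensure_ascii=False)} .",
        "",
        f"{mention} " + " ;\n    ".join(props) + " .",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(annotation_to_nif(
        "Welcome to Innsbruck in Austria!",
        "http://example.org/doc.html",   # placeholder document IRI
        "Innsbruck",
        "http://dbpedia.org/resource/Innsbruck",
        class_ref="http://nerd.eurecom.fr/ontology#Location",
        confidence=0.63,
    ))
```

Addressing the substring by character offsets is what lets the annotation be mapped back onto the original document later.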
9
NLP Interchange Format (NIF)
NIF is an RDF/OWL-based format that aims to achieve interoperability between Natural Language Processing (NLP) tools, language resources and annotations.
NIF will soon be a normative part of ITS 2.0.
NIF and its community project NLP2RDF serve as an umbrella project liaising with other communities of practice, especially:
– LOD2 FP7 EU project
– MultilingualWeb-LT Working Group
– Best Practices for Multilingual Linked Open Data Community Group
– Ontology-Lexica Community Group
– Named Entity Recognition and Disambiguation (NERD)
– Ontologies of Linguistic Annotation (OLiA)
University of Leipzig
10
How is it different to Microdata annotations?
What is the latitude and longitude of the Empire State Building? Empire State Building
What is the latitude and longitude of the Empire State Building?
Microdata + schema.org
ITS 2.0 + DBpedia resource
11
How is it different to Microdata annotations?
What is the latitude and longitude of the Empire State Building?
Semantics of ITS 2.0 annotations: specify entity identifiers (IRIs) for the presented information item.
Semantics of Microdata annotations: specify the type of information that is presented.
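The two markup variants were lost from the transcript. As a sketch of the distinction only, the snippet below holds one plausible Microdata version and one plausible ITS 2.0 version of the sentence and lists which attributes each carries. Both HTML strings are assumptions (including the choice of http://schema.org/Place as the type), and the helper names are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical Microdata markup: says WHAT KIND of thing is mentioned.
MICRODATA_HTML = """<p>What is the latitude and longitude of the
<span itemscope itemtype="http://schema.org/Place">
<span itemprop="name">Empire State Building</span></span>?</p>"""

# Hypothetical ITS 2.0 markup: says WHICH thing is mentioned.
ITS_HTML = """<p>What is the latitude and longitude of the
<span its-ta-ident-ref="http://dbpedia.org/resource/Empire_State_Building">Empire State Building</span>?</p>"""


class AttributeCollector(HTMLParser):
    """Records the values of selected attributes across all start tags."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.found = {}

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in self.wanted:
                self.found.setdefault(name, []).append(value)


def collect(html, wanted=("itemtype", "its-ta-ident-ref")):
    """Return {attribute name: [values]} for the attributes of interest."""
    collector = AttributeCollector(wanted)
    collector.feed(html)
    collector.close()
    return collector.found


if __name__ == "__main__":
    print("Microdata:", collect(MICRODATA_HTML))
    print("ITS 2.0:  ", collect(ITS_HTML))
```

In this sketch the Microdata version yields only a type IRI, while the ITS version yields only an entity IRI, which mirrors the two semantics stated above.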
12
Hands-on / Demo
– HTML with ITS metadata
– Transformation of HTML with ITS metadata to NIF
Notes:
– Based on the XSLT files shared by the W3C Working Group member Felix Sasaki (@fsasaki) [4].
– The Java internal XSLTC processor fails to compile the XSLTs. Use Saxon 9 HE.
13
References
[1] W3C semantic web list thread: http://lists.w3.org/Archives/Public/semantic-web/2013Apr/0218.html
[2] ITS 2.0 W3C working draft: http://www.w3.org/TR/its20/
[3] NIF Core Ontology: http://persistence.uni-leipzig.org/nlp2rdf/
[4] Felix Sasaki, ITS 2.0 extractor (GitHub): https://github.com/fsasaki/its20-extractor
[5] W3C, Localization vs. Internationalization: http://www.w3.org/International/questions/qa-i18n
[6] W3C, Conversion NIF2ITS: http://www.w3.org/TR/its20/#nif-backconversion
[7] W3C, ITS 2.0 / RDF Ontology: http://www.w3.org/2005/11/its/rdf-content/its-rdf.html
[8] Named Entity Recognition and Disambiguation (NERD): http://nerd.eurecom.fr/ontology
[9] WordNet Search 3.1: http://wordnetweb.princeton.edu/perl/webwn