Semantic Annotation. Semantic Web Lecture, Dieter Fensel. © Copyright 2008 STI INNSBRUCK, www.sti-innsbruck.at

Presentation transcript:

© Copyright 2008 STI INNSBRUCK Semantic Annotation Semantic Web Lecture Dieter Fensel

Today's lecture
#   Title
1   Introduction
2   Semantic Web Architecture
3   RDF and RDFs
4   Web of hypertext (RDFa, Microformats) and Web of data
5   Semantic Annotation
6   Repositories and SPARQL
7   OWL
8   RIF
9   Web-scale reasoning
10  Social Semantic Web
11  Ontologies and the Semantic Web
12  SWS
13  Tools
14  Applications
15  Exam

Agenda
Overview
Semantic annotation of text
Semantic annotation of multimedia

Semantic Annotation
Creating semantic labels within documents for the Semantic Web
Used to support:
  – Advanced searching (e.g. concept)
  – Information visualization (using ontology)
  – Reasoning about Web resources
Converting syntactic structures into knowledge structures (human → machine)

Semantic Annotation Process

Semantic Annotation
Manual
Semi-automatic
Automatic

Semantic Annotation
Concerns
  – Scale, Volume
      Existing & new documents on the Web
      Manual annotation
        – Expensive: economic, time
        – Subject to personal motivation
  – Schema Complexity
  – Storage
      support for multiple ontologies
      within or external to source document?
      Knowledge base refinement
  – Access: How are annotations accessed?
      API, custom UI, plug-ins

Language Toolkits
GATE: language processing system
  – Component architecture, SDK, IDE
  – ANNIE ("A Nearly-New IE system"): tokenizer, gazetteer, POS tagger, sentence splitter, etc.
  – JAPE (Java Annotation Patterns Engine): provides regular-expression-based pattern/action rules
Amilcare
  – adaptive IE system designed for document annotation
  – based on (LP)²
  – uses ANNIE
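The idea of gazetteer lookup combined with regular-expression pattern/action rules can be sketched in a few lines of Python. This is not JAPE syntax and not the GATE API; the gazetteer entries, the two rules and the function names are all invented for illustration.

```python
import re

# Hypothetical gazetteer: surface string -> semantic class (illustrative data only).
GAZETTEER = {"Innsbruck": "Location", "Tyrol": "Location"}


def person_action(m):
    # Action for the title rule: annotate only the name, not the title itself.
    return {"type": "Person", "start": m.start(1), "end": m.end(1), "text": m.group(1)}


def date_action(m):
    # Action for the date rule: annotate the whole match.
    return {"type": "Date", "start": m.start(), "end": m.end(), "text": m.group()}


# Pattern/action rules: a regular expression plus a function that turns
# a match into an annotation. Both rules are assumptions made for this sketch.
RULES = [
    (re.compile(r"\b(?:Dr|Prof)\.\s+([A-Z][a-z]+(?:\s[A-Z][a-z]+)*)"), person_action),
    (re.compile(r"\b\d{4}-\d{2}-\d{2}\b"), date_action),
]


def annotate(text):
    """Return stand-off annotations (type, character offsets, text) sorted by position."""
    annotations = []
    # Step 1: gazetteer lookup.
    for surface, cls in GAZETTEER.items():
        for m in re.finditer(r"\b" + re.escape(surface) + r"\b", text):
            annotations.append(
                {"type": cls, "start": m.start(), "end": m.end(), "text": m.group()}
            )
    # Step 2: pattern/action rules.
    for pattern, action in RULES:
        for m in pattern.finditer(text):
            annotations.append(action(m))
    return sorted(annotations, key=lambda a: a["start"])


for annotation in annotate("Prof. Jane Doe visited Innsbruck on 2008-05-12."):
    print(annotation)
```

The annotations are kept as offsets into the text rather than written into it, so several rule sets can annotate the same document independently.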

Annotation of text

Annotation of text
Many systems apply manually created rules or wrappers that try to recognize patterns for the annotations. Some systems learn how to annotate with the help of the user. Supervised systems learn how to annotate from a training set that was manually created beforehand. Semi-automatic approaches often apply information extraction technology, which analyzes natural language to pull out the information the user is interested in. In information extraction, one distinguishes between five different types of information:

Information extraction
Recognition of
  – Entities (people, places, organizations, etc.). With up to 95% accuracy, entity recognition is the most reliable IE technology, even though it is strongly domain-dependent.
  – Mentions (referring to entities). Mentions identify references to entities in text. In the sentence "That's Mickey, I know him.", "him" would be identified as a reference to "Mickey".
  – Descriptions of entities. This combines several IE technologies and achieves a rather good score of around 80%.
  – Relations between entities. These can be discovered by defining the possible relations between entities, which of course depends strongly on the domain.
  – Events involving entities. Event extraction is a difficult task because it is heavily dependent on the domain and tied to the scenarios of interest to the user: scenario templates achieve only 60% accuracy.
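The slide does not say how these accuracy figures are measured. One common way to score an extractor, shown here purely as an assumption, is to compare its annotations with a manually annotated gold standard and report precision, recall and F1. The annotation tuples below are invented sample data.

```python
def precision_recall_f1(predicted, gold):
    """Score predicted annotations against gold annotations (exact match)."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Annotations as (start, end, type) tuples; values are illustrative only.
gold = {(0, 6, "Person"), (20, 29, "Location"), (35, 45, "Date")}
predicted = {(0, 6, "Person"), (20, 29, "Organization")}

print(precision_recall_f1(predicted, gold))
```

Exact matching on offsets and type is the strictest option; partial-overlap scoring is a possible alternative design.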

Tools for semantic annotation of text
Armadillo
  – [Dingli2003]
  – pattern-based approach
  – Amilcare information extraction system [Ciravegna2003]
  – especially suitable for highly structured Web pages
  – starts from a seed pattern and does not require human input initially
  – does not require a manually annotated training set
  – patterns for entity recognition have to be added manually
KIM
  – Knowledge and information management platform (KIM) [Popov2003]
  – ontology and knowledge base
  – indexing and retrieval server
  – RDF data is stored in an RDF repository [Kiryakov2005]
  – search is performed by the Lucene system
  – based on an underlying ontology (KIMO or PROTON)
  – relies on GATE [Bontcheva2003] as an information extraction tool

KIM (2003)
  – ontology, KB, semantic annotation, indexing and retrieval server, front-ends (Web UI, IE plug-in)
  – KIMO ontology
      250 classes, 100 properties
      80,000 entities from general news corpus in KB (plus >100,000 aliases)
IE
  – Uses GATE, JAPE
  – Gazetteers (from KB)
Source:

Tools for semantic annotation of text
Magpie
  – [Domingue2004]
  – suite of tools that supports the fully automatic annotation of Web pages
  – maps entities that are found in the knowledge base against those identified on the Web pages
  – quality of the results depends on the ontology in the background
Melita
  – [Ciravegna2002]
  – Amilcare
  – can learn from user input by taking annotated content and generalizing this content in order to make new annotations
  – system requires the user to fully annotate documents and starts to learn
  – makes suggestions to the user
  – learns from the user's feedback
  – possibilities for rule writing, i.e. advanced users can define rules for automatic annotation

Tools for semantic annotation of text
MnM
  – [Vargas-Vera2002]
  – semi-automatic annotation based on the Amilcare system
  – needs a set of training data in order to carry out the annotation semi-automatically
  – system gradually takes over the annotation process
  – while browsing the Web, the user manually annotates chosen Web pages in the MnM Web browser
S-Cream
  – [Handschuh2002]
  – semi-automatic annotation tool that combines two tools: Ont-O-Mat, a manual annotation editor implementing the CREAM framework, and Amilcare
  – can be trained for different domains with the appropriate training data
  – both manual and semi-automatic annotation of Web pages

Ont-O-Mat (2002)
  – Uses Amilcare
      Wrapper induction ((LP)²)
  – Extensible
      Adapted in 2004 for the PANKOW algorithm
        – Disambiguation by maximal evidence
        – Proper nouns + ontology → linguistic phrases
Source: kcap2001-annotate-sub.pdf
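The two PANKOW bullets (combine a proper noun with ontology concepts into linguistic phrases, then pick the concept with maximal evidence) can be sketched as follows. This is a simplified illustration, not the published algorithm: the phrase patterns are assumed, and a tiny in-memory corpus stands in for whatever source the phrase counts actually come from.

```python
# Assumed phrase patterns; the real pattern set may differ.
PATTERNS = [
    "{noun} is a {concept}",
    "the {concept} of {noun}",
    "the {noun} {concept}",
]


def count_hits(phrase, corpus):
    """Count case-insensitive occurrences of a phrase in a list of documents."""
    phrase = phrase.lower()
    return sum(doc.lower().count(phrase) for doc in corpus)


def classify(noun, concepts, corpus):
    """Return (best concept or None, evidence per concept) for a proper noun."""
    evidence = {}
    for concept in concepts:
        # Instantiate every pattern with the noun and the candidate concept.
        phrases = [p.format(noun=noun, concept=concept) for p in PATTERNS]
        evidence[concept] = sum(count_hits(phrase, corpus) for phrase in phrases)
    best = max(evidence, key=evidence.get)
    # Disambiguation by maximal evidence; no evidence at all means no annotation.
    return (best if evidence[best] > 0 else None), evidence


# Illustrative corpus, invented for this sketch.
corpus = [
    "Innsbruck is a city in Tyrol.",
    "The city of Innsbruck hosts the lecture.",
    "The Inn is a river.",
    "We stayed at the Innsbruck hotel.",
]

print(classify("Innsbruck", ["city", "river", "hotel"], corpus))
```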

SemTag (2003)
  – Large-scale annotation
      Annotations separate from source
      "Semantic Label Bureau"
  – Uses the TAP taxonomy
  – Approach is:
      Find match to label in taxonomy
      Save window before & after match
      Perform disambiguation
  – Main contribution is using taxonomy for disambiguation
Source: resources/semtag.pdf
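The three steps of the approach (match a label, save a window of context, disambiguate) can be illustrated with a minimal Python sketch. The taxonomy entries are invented, and the word-overlap score is a simple stand-in chosen for this example; it is not SemTag's actual disambiguation algorithm.

```python
import re

# Hypothetical taxonomy: an ambiguous label maps to candidate nodes,
# each described by a few typical context words (illustrative data only).
TAXONOMY = {
    "jaguar": {
        "Animal/Jaguar": {"cat", "jungle", "prey", "species"},
        "Car/Jaguar": {"engine", "drive", "dealer", "sedan"},
    },
}


def spot(text, window=5):
    """Steps 1 and 2: find label matches and save the tokens before and after."""
    tokens = re.findall(r"\w+", text.lower())
    for i, token in enumerate(tokens):
        if token in TAXONOMY:
            context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            yield i, token, context


def disambiguate(label, context):
    """Step 3: choose the node whose context words overlap most with the window."""
    scores = {
        node: len(words & set(context)) for node, words in TAXONOMY[label].items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def sem_tag(text, window=5):
    """Return annotations as (token index, label, taxonomy node), kept apart from the text."""
    return [(i, label, disambiguate(label, ctx)) for i, label, ctx in spot(text, window)]


print(sem_tag("The dealer let me drive the new Jaguar with a bigger engine."))
print(sem_tag("A jaguar stalks its prey in the jungle."))
```

Because the result is a separate list of annotations, it mirrors the "annotations separate from source" point on the slide.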

Platform Effectiveness *as reported by platform authors

Multimedia annotation

Multimedia Annotation
Different levels of annotations
  – Metadata
      Often technical metadata
      EXIF, Dublin Core, access rights
  – Content level
      Semantic annotations
      Keywords, domain ontologies, free-text
  – Multimedia level
      low-level annotations
      Visual descriptors, such as dominant color

Metadata
refers to information about technical details
creation details
  – creator, creationDate, …
  – Dublin Core
camera details
  – settings
  – resolution
  – format
  – EXIF
access rights
  – administered by the OS
  – owner, access rights, …

Content Level
Describes what is depicted and directly perceivable by a human
usually provided manually
  – keywords/tags
  – classification of content
seldom generated automatically
  – scene classification
  – object detection
different types of annotations
  – global vs. local
  – different semantic levels

Global vs. Local Annotations
Global annotations most widely used
  – flickr: tagging is only global
  – organization within categories
  – free-text annotations
  – provide information about the content as a whole
  – no detailed information
Local annotations are less supported
  – e.g. flickr, PhotoStuff allow annotating regions
  – especially important for semantic image understanding
      allow extracting relations
      provide a more complete view of the scene
  – provide information about different regions
  – and about the depicted relations and arrangements of objects
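The difference between the two kinds of annotation, and why local annotations make it possible to extract relations, can be shown with a small data-structure sketch. The `Region` class, the relation names and the sample coordinates are assumptions made for illustration; they are not taken from flickr or PhotoStuff.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A local annotation: a label attached to a bounding box.

    Coordinates follow the usual image convention (origin at top-left).
    """
    label: str
    x: int
    y: int
    width: int
    height: int


def spatial_relations(regions):
    """Derive simple 'above' and 'left-of' relations between annotated regions."""
    relations = []
    for a in regions:
        for b in regions:
            if a is b:
                continue
            if a.y + a.height <= b.y:  # a ends before b starts, vertically
                relations.append((a.label, "above", b.label))
            if a.x + a.width <= b.x:  # a ends before b starts, horizontally
                relations.append((a.label, "left-of", b.label))
    return relations


# Global annotation: tags describe the image as a whole.
global_tags = ["beach", "holiday"]

# Local annotations: each label is tied to a region of the image.
local_regions = [
    Region("sky", 0, 0, 100, 40),
    Region("sea", 0, 40, 100, 30),
    Region("sand", 0, 70, 100, 30),
]

print(spatial_relations(local_regions))
```

Nothing comparable can be derived from `global_tags`, since the tags carry no position.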

Semantic Levels
Free-text annotations cover large aspects, but are less appropriate for sharing, organization and retrieval
  – Free-text annotations are probably the most natural for humans, but provide the least formal semantics
Tagging provides light-weight semantics
  – Only useful if a fixed vocabulary is used
  – Allows some simple inference of related concepts by tag analysis (clustering)
  – No formal semantics, but provides benefits due to fixed vocabulary
  – Requires more effort from the user
Ontologies
  – Provide syntax and semantics to define complex domain vocabularies
  – Allow for the inference of additional knowledge
  – Leverage interoperability
  – Powerful way of semantic annotation, but hardly comprehensible by "normal users"
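The contrast between tag analysis and ontology-based inference can be made concrete with a short sketch. Tag co-occurrence counting is used here as a very simple stand-in for clustering, and the subclass hierarchy is an invented toy ontology; both are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations


def related_tags(photos, tag, min_count=2):
    """Tag analysis: tags that co-occur with `tag` on at least `min_count` photos."""
    pairs = Counter()
    for tags in photos:
        for a, b in combinations(sorted(set(tags)), 2):
            pairs[(a, b)] += 1
    related = {}
    for (a, b), count in pairs.items():
        if count >= min_count and tag in (a, b):
            related[b if a == tag else a] = count
    return related


# Toy ontology: each class points to its direct superclass (hypothetical names).
SUBCLASS_OF = {"Beach": "Landscape", "Landscape": "Scene"}


def superclasses(cls):
    """Ontology inference: follow subclass links to collect all superclasses."""
    result = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        result.append(cls)
    return result


photos = [
    ["beach", "sea", "sand"],
    ["beach", "sea"],
    ["city", "night"],
    ["sea", "boat"],
]

print(related_tags(photos, "sea"))
print(superclasses("Beach"))
```

The tag result is only a statistical hint, while the superclass result follows from the stated hierarchy, which is the sense in which ontologies "allow for the inference of additional knowledge".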

Tools
Web-based Tools
  – flickr
  – riya
Stand-Alone Tools
  – PhotoStuff
  – AktiveMedia
Annotation for Feature Extraction
  – M-OntoMat-Annotizer

flickr
Web 2.0 application
tagging photos globally
add comments to image regions marked by bounding box
large user community and tagging allows for easy sharing of images
partly fixed vocabularies evolved
  – e.g. Geo-Tagging

riya
Similar to flickr in functionality
Adds automatic annotation features
  – Face Recognition
      Mark faces in photos
      associate name
      train system
      automatic recognition of the person in the future

PhotoStuff
Java application for the annotation of images and image regions with domain ontologies
Used during ESWC2006 for annotating images and sharing metadata
Developed within Mindswap

AktiveMedia
Text and image annotation tool
Region-based annotation
Uses ontologies
  – suggests concepts during annotation
  – providing a simpler interface for the user
Provides semi-automatic annotation of content, using
  – Context
  – Simple image understanding techniques
  – flickr tagging data

M-OntoMat-Annotizer
Extracts knowledge from image regions for automatic annotation of images
Extracting features:
  – User can mark image regions manually or using an automatic segmentation tool
  – MPEG-7 descriptors are extracted
  – Stored within domain ontologies as prototypical, visual knowledge
Developed within aceMedia
Currently Version 2 is under development, incorporating
  – true image annotation
  – central storage
  – extended knowledge extraction
  – extensible architecture using a high-level multimedia ontology

Multimedia Ontologies
Semantic annotation of images requires multimedia ontologies
  – several vocabularies exist (Dublin Core, FOAF)
  – they don't provide appropriate models to describe multimedia content sufficiently for sophisticated applications
MPEG-7 provides an extensive standard, but semantic annotations in particular are insufficiently supported
Several mappings of MPEG-7 into RDF or OWL exist
  – now: VDO and MSO developed within aceMedia
  – later: Engineering a multimedia upper ontology

aceMedia Ontology Infrastructure
aceMedia Multimedia Ontology Infrastructure
  – DOLCE as core ontology
  – Multimedia Ontologies
      Visual Descriptors Ontology (VDO)
      Multimedia Structures Ontology (MSO)
      Annotation and Spatio-Temporal Ontology augmenting VDO and MSO
  – Domain Ontologies
      capture domain specific knowledge

Visual Descriptors Ontology
Representation of MPEG-7 Visual Descriptors in RDF
  – Visual Descriptors represent low-level features of multimedia content
  – e.g. dominant color, shape or texture
Mapping to RDF allows for
  – linking of domain ontology concepts with visual features
  – better integration with semantic annotations
  – a common underlying model for visual and semantic features
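To make "linking of domain ontology concepts with visual features" concrete, the sketch below computes a crude dominant colour and expresses it as triples. The colour computation is a coarse histogram, far simpler than the MPEG-7 Dominant Color descriptor. The identifiers `Sky_Prototype_1`, `#Sky`, `#vde-inst1` and `vdoext:hasDescriptor` appear in the transformation example later in this lecture; the property `ex:dominantColor` is invented here and is not part of the VDO.

```python
from collections import Counter


def dominant_color(pixels, levels=4):
    """Return the centre of the most populated RGB bin (a coarse colour histogram)."""
    step = 256 // levels
    bins = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    (r_bin, g_bin, b_bin), _count = bins.most_common(1)[0]
    half = step // 2
    return (r_bin * step + half, g_bin * step + half, b_bin * step + half)


def descriptor_triples(prototype, concept, descriptor, color):
    """Link a domain concept to a visual feature as (subject, predicate, object) triples."""
    return [
        (prototype, "rdf:type", concept),
        (prototype, "vdoext:hasDescriptor", descriptor),
        # Hypothetical property, used only for this sketch.
        (descriptor, "ex:dominantColor", "rgb(%d,%d,%d)" % color),
    ]


# Illustrative pixels of a marked sky region: mostly blue, a few grey.
region_pixels = [(10, 20, 250)] * 5 + [(200, 200, 200)] * 2

color = dominant_color(region_pixels)
for triple in descriptor_triples("Sky_Prototype_1", "#Sky", "#vde-inst1", color):
    print(triple)
```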

Visual Knowledge
Used for automatic annotation of images
Idea:
  – Describe the visual appearance of domain concepts by providing examples
  – User annotates instances of concepts and extracts features
  – features are represented with the VDO
  – the examples are then stored in the domain ontology as prototype instances of the domain concepts
Thus the names: prototype and prototypical knowledge
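How stored prototypes can drive automatic labelling is sketched below as a nearest-prototype lookup. This is an assumed, deliberately simple matching strategy, not the knowledge-assisted analysis used in aceMedia; the feature vectors are plain RGB values and the distance threshold is arbitrary.

```python
import math

# Prototype instances per domain concept: feature vectors extracted from
# regions the user annotated by hand (illustrative RGB values).
PROTOTYPES = {
    "Sky": [(110, 160, 230), (130, 180, 240)],
    "Sand": [(220, 200, 150)],
    "Sea": [(20, 80, 140)],
}


def label_segment(feature, prototypes, max_distance=80.0):
    """Return the concept of the closest prototype, or None if nothing is close enough."""
    best_label, best_distance = None, math.inf
    for concept, examples in prototypes.items():
        for example in examples:
            distance = math.dist(feature, example)  # Euclidean distance
            if distance < best_distance:
                best_label, best_distance = concept, distance
    return best_label if best_distance <= max_distance else None


print(label_segment((120, 170, 235), PROTOTYPES))
print(label_segment((215, 205, 140), PROTOTYPES))
print(label_segment((0, 255, 0), PROTOTYPES))
```

Adding a new concept only requires adding prototype instances, with no retraining step, which matches the idea of describing concepts "by providing examples".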

Extraction of Prototype

Transformation to VDO
[Figure: extract, then transform. The extracted descriptor instance "vde-inst1" is referenced from the prototype "Sky_Prototype_1" of the concept "#Sky" via <vdoext:hasDescriptor rdf:resource="#vde-inst1"/>; the remaining markup is elided as […] in the source.]

Using Prototypes for Automatic Labelling
[Figure: Knowledge Assisted Analysis pipeline (extract, segment, labeling). Example segment labels: rock, sky, sea, beach, beach/rock, rock/beach, sea, sky, person/bear.]

Multimedia Structure Ontology
RDF representation of the MPEG-7 Multimedia Description Schemes
Contains only classes and relations relevant for representing a decomposition of images or videos
Contains classes for different types of segments
  – temporal and spatial segments
Contains relations to describe different decompositions
Augmented by annotation ontology and spatio-temporal ontology, which make it possible to describe
  – regions of an image or video
  – the spatial and temporal arrangement of the regions
  – what is depicted in a region

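MSO Example
[Figure: RDF graph for a segmented image. Nodes: image01, segment01, segment02, segment03, sky01, sea01, sand01. Classes: Image, Segment, Sky, Sea, Sand. Properties: spatial-decomposition, rdf:type, depicts. Region labels in the picture: Sky/Sea, Sea, Sand, Sea/Sky, Person/Sand, Person.]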
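A graph like the one in this example can be written down as triples and queried with a few lines of Python. The node, class and property names come from the figure, but the exact edges are an assumption: the transcript does not preserve which segment depicts which object, so the pairing below (segment01/sky01, segment02/sea01, segment03/sand01) is a guess made for illustration.

```python
# Assumed edges; only the names are taken from the figure.
TRIPLES = [
    ("image01", "rdf:type", "Image"),
    ("image01", "spatial-decomposition", "segment01"),
    ("image01", "spatial-decomposition", "segment02"),
    ("image01", "spatial-decomposition", "segment03"),
    ("segment01", "rdf:type", "Segment"),
    ("segment02", "rdf:type", "Segment"),
    ("segment03", "rdf:type", "Segment"),
    ("segment01", "depicts", "sky01"),
    ("segment02", "depicts", "sea01"),
    ("segment03", "depicts", "sand01"),
    ("sky01", "rdf:type", "Sky"),
    ("sea01", "rdf:type", "Sea"),
    ("sand01", "rdf:type", "Sand"),
]


def match(triples, s=None, p=None, o=None):
    """Return all triples matching the given subject, predicate and object (None = any)."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def depicted_concepts(triples, image):
    """Follow image -> segment -> depicted object -> class."""
    concepts = set()
    for _, _, segment in match(triples, s=image, p="spatial-decomposition"):
        for _, _, obj in match(triples, s=segment, p="depicts"):
            for _, _, cls in match(triples, s=obj, p="rdf:type"):
                concepts.add(cls)
    return concepts


print(sorted(depicted_concepts(TRIPLES, "image01")))
```

In practice such a query would be written in SPARQL against an RDF store; the plain-Python version only shows the shape of the graph traversal.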

Questions?