OntoELAN: An Ontology-Based Linguistic Multimedia Annotator
Speaker: Artem Chebotko
Nov. 17, 2004 © Artem Chebotko, 2004

Presentation transcript:

OntoELAN: An Ontology-Based Linguistic Multimedia Annotator
Speaker: Artem Chebotko, Department of Computer Science, Wayne State University
Nov. 17, 2004 © Artem Chebotko

Coauthors
- From left: Ms. Yu Deng, graduated with an M.S. in Computer Science in 2004; Prof. Shiyong Lu, Computer Science, my advisor; Prof. Farshad Fotouhi, Computer Science, Chair of the department; Prof. Anthony Aristar, Dept. of English, Linguistics Program. All at Wayne State University.
- Hennie Brugman, Alexander Klassmann, Han Sloetjes, Albert Russel, Peter Wittenburg, Max Planck Institute for Psycholinguistics, Nijmegen, Netherlands.
- Acknowledgements: Laura Buszard-Welcher and Andrea Berez, Dept. of English, Linguistics Program, WSU.

The Outline of the Talk
- Background and Motivation
- The Limitations of Existing Tools
- Our Approach and Advantages
- An Overview of OntoELAN
- Demo

Background and Motivation: Linguistics
- Many languages are in serious danger of being lost
- In fact, half of the world's approximately 6,500 languages may disappear in the next 100 years
- Language data is critical to research in linguistics, anthropology, history, sociology, political science, etc.
- Language data is also important for the community that speaks the language

Background and Motivation: Multimedia
- Much language data is collected as audio and video recordings
- Indexing and retrieval are difficult because multimedia data are unstructured and their semantics are implicit in their contents
- Annotation of multimedia data provides an opportunity to make the semantics explicit

Background and Motivation: Ontology-based annotation
- An ontology is an explicit specification of a shared conceptualization. It formalizes the knowledge of various concepts and their relationships in a particular domain
- Annotation with ontological terms, whose meaning is known and understood by the domain community

Requirements for a Linguistic Multimedia Annotator
- Support for the annotation of descriptive metadata such as title, authors, date, time, etc.
- Support for a time axis and temporal segmentation of clips into slots
- Support for multiple-tier annotation, with each tier providing one avenue for annotation
- Support for ontology-based annotation to avoid incompatible formats and vocabularies

The Limitations of Existing Tools
- Either don't support ontology: IBM MPEG-7 Annotation Tool, ELAN
- or provide limited support of multimedia: Protégé, ImageSpace, IBM MPEG-7 Annotation Tool

Tool         Descriptive annotation   Temporal segmentation   Multi-tier annotation   Ontology support
Protégé      Yes                      No                      No                      Yes
IBM MPEG-7   Yes                      No                      No                      No
ImageSpace   Yes                      No                      No                      Yes
ELAN         Yes                      Yes                     Yes                     No

Our Approach and Advantages
- We developed an ontology-based annotation tool, OntoELAN, for linguistic multimedia data that satisfies all the above requirements
- The ontological approach eliminates multiple incompatible annotation formats if the whole community can agree upon one domain ontology
- Annotations are formally defined and machine interpretable
- Additional, implicit information can be deduced
- Search is more precise and easier

An Overview of OntoELAN
- Developed on top of the ELAN annotator (Max Planck Institute for Psycholinguistics team)
- Features inherited from ELAN:
  - display of speech and/or video signals, together with their annotations;
  - time linking of annotations to media streams;
  - linking of annotations to other annotations;
  - unlimited number of annotation tiers as defined by a user;
  - different character sets;
  - basic search facilities.

An Overview of OntoELAN
- Ontology support (Wayne State University team)
- New features:
  - language profile creation;
  - ontology-based annotation;
  - storing annotations in the XML format based on the General Multimedia Ontology and domain ontologies.
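
To make the last feature more concrete, here is a minimal Python sketch of what storing ontology-based annotations as XML could look like. It is not OntoELAN's actual file format: the element names are simply borrowed from the General Multimedia Ontology class names listed later in the talk, and the attribute names, the file name clip.mpg, and the function build_annotation_document are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_annotation_document(media_file, annotations):
    """Serialize (begin_ms, end_ms, ontology_term) triples into an XML string.

    Element and attribute names are illustrative assumptions, not the
    schema actually used by OntoELAN.
    """
    root = ET.Element("AnnotationDocument", {"media": media_file})
    tier = ET.SubElement(root, "Tier", {"id": "pos"})
    for index, (begin_ms, end_ms, term) in enumerate(annotations, start=1):
        node = ET.SubElement(
            tier,
            "AlignableAnnotation",
            {"id": f"a{index}", "begin": str(begin_ms), "end": str(end_ms)},
        )
        # The annotation value is a term from a domain ontology (here GOLD).
        value = ET.SubElement(node, "OntologyAnnotation", {"ontology": "GOLD"})
        value.text = term
    return ET.tostring(root, encoding="unicode")


xml_text = build_annotation_document(
    "clip.mpg", [(0, 450, "Noun"), (450, 900, "Verb")]
)
print(xml_text)
```

Because each value names its ontology and term explicitly, another tool could interpret the file without knowing the annotator's private vocabulary, which is the interoperability argument the talk makes.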

Linguistic Domain Ontology
- One example is the General Ontology for Linguistic Description (GOLD), developed at the University of Arizona
- Expressions: OrthographicExpression, Utterance, SignedExpression, Word, WordPart
- Grammar: Tense, Number, Agreement, PartOfSpeech
  - PartOfSpeech: Noun, Verb, Participle, Preverb
- Data structures: a lexical entry, a phoneme table, and a syntactic tree
- Metaconcepts: Language itself
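
The talk's earlier claim that an ontology allows "deduction of additional, implicit information" and more precise search can be illustrated with a small Python sketch. It assumes that Noun, Verb, Participle and Preverb are subconcepts of PartOfSpeech, as the slide suggests; the PARENT table, the sample annotations, and the function names are hypothetical and stand in for a real ontology reasoner.

```python
# Hypothetical fragment of a concept hierarchy: child concept -> parent concept.
# Assumption: the four terms listed under PartOfSpeech are its subconcepts.
PARENT = {
    "Noun": "PartOfSpeech",
    "Verb": "PartOfSpeech",
    "Participle": "PartOfSpeech",
    "Preverb": "PartOfSpeech",
}


def ancestors(concept):
    """Yield the concept itself and every concept above it in the hierarchy."""
    while concept is not None:
        yield concept
        concept = PARENT.get(concept)


def search(annotations, query_concept):
    """Return annotations tagged with query_concept or any of its subconcepts."""
    return [a for a in annotations if query_concept in ancestors(a[1])]


# Each annotation is (annotated text, ontological term); sample data only.
annotations = [("dog", "Noun"), ("runs", "Verb"), ("past", "Tense")]
matches = search(annotations, "PartOfSpeech")
print(matches)
```

A query for PartOfSpeech finds the Noun and Verb annotations even though neither was tagged PartOfSpeech directly, while the Tense annotation is left out. A plain keyword search over free-text labels could not make that distinction.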

General Multimedia Ontology
- Simple semantic framework for multimedia annotation
- Developed at Wayne State University especially for OntoELAN
- Classes:
  - AnnotationDocument
  - Tier
  - TimeSlot
  - Annotation
  - AlignableAnnotation
  - ReferringAnnotation
  - AnnotationValue
  - StringAnnotation
  - OntologyAnnotation
  - etc.
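
As a rough illustration of how these classes might relate, here is a Python sketch using dataclasses. The class names come from the slide; everything else is an assumption made for the example: that alignable annotations carry their own time slots, that referring annotations point at another annotation, that StringAnnotation and OntologyAnnotation are the two kinds of AnnotationValue, and all field names.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class TimeSlot:
    time_ms: int  # position on the media time axis (assumed unit: milliseconds)


@dataclass
class StringAnnotation:
    text: str  # free-text value


@dataclass
class OntologyAnnotation:
    ontology: str  # e.g. "GOLD"
    term: str      # a term defined in that ontology


# Assumption: these are the two kinds of annotation value.
AnnotationValue = Union[StringAnnotation, OntologyAnnotation]


@dataclass
class AlignableAnnotation:
    begin: TimeSlot
    end: TimeSlot
    value: AnnotationValue


@dataclass
class ReferringAnnotation:
    refers_to: Union[AlignableAnnotation, "ReferringAnnotation"]
    value: AnnotationValue

    def time_span(self):
        """Follow references until an alignable annotation supplies the times."""
        target = self.refers_to
        while isinstance(target, ReferringAnnotation):
            target = target.refers_to
        return target.begin.time_ms, target.end.time_ms


@dataclass
class Tier:
    name: str
    annotations: List[Union[AlignableAnnotation, ReferringAnnotation]] = field(
        default_factory=list
    )


@dataclass
class AnnotationDocument:
    media_file: str
    tiers: List[Tier] = field(default_factory=list)


word = AlignableAnnotation(TimeSlot(0), TimeSlot(450), StringAnnotation("dog"))
pos = ReferringAnnotation(word, OntologyAnnotation("GOLD", "Noun"))
doc = AnnotationDocument("clip.mpg", [Tier("words", [word]), Tier("pos", [pos])])
print(pos.time_span())
```

In this sketch the part-of-speech annotation has no time slots of its own. It inherits its position on the time axis from the word it refers to, which mirrors the "linking of annotations to other annotations" feature inherited from ELAN.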

Language Profile
- … is a subset of ontological terms, possibly renamed, that are used in the annotation of a particular multimedia resource
  - ontological terms
  - user-defined terms
  - a mapping between ontological terms and user-defined terms
  - a reference to an ontology

Language Profile
- Advantages
  - Only a subset of ontological terms is useful for annotating a particular resource
  - Ontological terms can be renamed, e.g. to use another language, an abbreviation, or a synonym
  - The meaning of two or more ontological terms can be combined in one user-defined term
- Disadvantage
  - More work
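
The language profile described above can be pictured as a small mapping structure. The Python sketch below is only an illustration under stated assumptions: the class LanguageProfile, its methods, and the sample user-defined terms "N" and "V.PST" are hypothetical and do not reflect OntoELAN's internal representation.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class LanguageProfile:
    """A subset of ontological terms, possibly renamed, for one resource."""

    ontology_ref: str  # reference to the ontology the terms come from
    mapping: Dict[str, Tuple[str, ...]] = field(default_factory=dict)

    def define(self, user_term, *ontology_terms):
        """Map a user-defined term to one or more ontological terms."""
        if not ontology_terms:
            raise ValueError("a user-defined term needs at least one ontological term")
        self.mapping[user_term] = tuple(ontology_terms)

    def resolve(self, user_term):
        """Return the ontological terms behind a user-defined term."""
        return self.mapping[user_term]

    def ontological_terms(self):
        """Return the subset of ontological terms this profile uses."""
        return {term for terms in self.mapping.values() for term in terms}


profile = LanguageProfile("GOLD")
profile.define("N", "Noun")               # renaming: an abbreviation
profile.define("V.PST", "Verb", "Tense")  # combining two ontological terms
print(profile.resolve("V.PST"))
```

The annotator works with short labels such as "N", yet every label still resolves to terms in the shared ontology, so annotations from different profiles remain comparable. The extra step of defining the mapping is the "more work" listed as the disadvantage.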

Annotation Tiers and Linguistic Types
- Annotation tiers
  - contain annotation values
  - can be either alignable or referring
  - are associated with their linguistic types
- Linguistic types
  - None
  - Time Subdivision
  - Symbolic Subdivision
  - Symbolic Association
  - Ontological tier

Linguistic Multimedia Annotation with OntoELAN
- Language profile creation
- Creation of tiers
- Creation of annotations

Demos
- Language profile creation: profile01.swf, profile01.AVI; profile02.swf, profile02.AVI
- Creation of tiers and creation of annotations: annotate01.swf, annotate01.AVI; annotate02.swf, annotate02.AVI

Conclusions and Future Work
- OntoELAN is the first attempt at annotating linguistic multimedia data with a linguistic ontology
- Future work
  - provide more channels for sharing data on the Web, such as the multimedia descriptions, the language words, etc.
  - improve the current search system
  - integrate text document annotation

References
- Artem Chebotko, Yu Deng, Shiyong Lu and Farshad Fotouhi. An Ontology-based Multimedia Annotator for the Semantic Web of Language Engineering. International Journal on Semantic Web and Information Systems, January.
- Artem Chebotko et al. OntoELAN: An Ontology-based Linguistic Multimedia Annotator. Proc. of the IEEE Sixth International Symposium on Multimedia Software Engineering (IEEE-MSE'2004), Miami, FL, USA, December 2004.

References
- OntoELAN
- LangDL: A Digital Library For Language Engineering And Research
- ELAN
- E-MELD
- GOLD
- General Multimedia Ontology

Questions?
Contact information: Artem Chebotko