OntoELAN: An Ontology-Based Linguistic Multimedia Annotator. Speaker: Artem Chebotko. Nov. 17, 2004.

1 Nov. 17, 2004 © Artem Chebotko, 2004
OntoELAN: An Ontology-Based Linguistic Multimedia Annotator
Speaker: Artem Chebotko (artem@cs.wayne.edu)
Department of Computer Science, Wayne State University

2 Coauthors
From left:
- Ms. Yu Deng, graduated with an M.S. in Computer Science in 2004
- Prof. Shiyong Lu, Computer Science, my advisor
- Prof. Farshad Fotouhi, Computer Science, Chair of the department
- Prof. Anthony Aristar, Dept. of English, Linguistics Program
All at Wayne State University.
Hennie Brugman, Alexander Klassmann, Han Sloetjes, Albert Russel, Peter Wittenburg: Max Planck Institute for Psycholinguistics, Nijmegen, Netherlands.
Acknowledgements: Laura Buszard-Welcher and Andrea Berez, Dept. of English, Linguistics Program, WSU.

3 The Outline of the Talk
- Background and Motivation
- The Limitations of Existing Tools
- Our Approach and Advantages
- An Overview of OntoELAN
- Demo

4 Background and Motivation: Linguistics
- Many languages are in serious danger of being lost.
- Half of the world's approximately 6,500 languages may disappear in the next 100 years.
- Language data is critical to research in linguistics, anthropology, history, sociology, political science, and other fields.
- Language data is also important to the community that speaks the language.

5 Background and Motivation: Multimedia
- Much language data is collected as audio and video recordings.
- Multimedia data is difficult to index and retrieve because it is unstructured and its semantics are implicit in its content.
- Annotating multimedia data provides an opportunity to make the semantics explicit.

6 Background and Motivation: Ontology-based annotation
- An ontology is an explicit specification of a shared conceptualization.
- It formalizes the knowledge of various concepts and their relationships in a particular domain.
- Annotation uses ontological terms whose meaning is known and understood by the domain community.

7 Requirements for a Linguistic Multimedia Annotator
- Support for the annotation of descriptive metadata such as title, authors, date, time, etc.
- Support for a time axis and temporal segmentation of clips into slots
- Support for multiple-tier annotation, with each tier providing one avenue for annotation
- Support for ontology-based annotation to avoid incompatible formats and vocabularies

8 The Limitations of Existing Tools
Existing tools either
- do not support ontologies (IBM MPEG-7 Annotation Tool, ELAN), or
- provide limited support for multimedia (Protégé, ImageSpace, IBM MPEG-7 Annotation Tool).

Tool         Descriptive annotation   Temporal segmentation   Multi-tier annotation   Ontology support
Protégé      Yes                      No                      No                      Yes
IBM MPEG-7   Yes                      No                      No                      No
ImageSpace   Yes                      No                      No                      Yes
ELAN         Yes                      Yes                     Yes                     No
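The comparison on this slide can be checked mechanically against the four requirements from the previous slide. The following is only a sketch: the Yes/No values are read from the flattened slide table (merged cells are assumed to repeat their value across the columns they span), and the names `REQUIREMENTS`, `TOOLS`, and `missing_requirements` are illustrative, not part of any tool.

```python
# Requirement columns, in the order used by the slide's comparison table.
REQUIREMENTS = ("descriptive", "temporal", "multi_tier", "ontology")

# Yes/No values as read from the slide (assumption: merged cells repeat
# their value across every column they span).
TOOLS = {
    "Protégé":    (True, False, False, True),
    "IBM MPEG-7": (True, False, False, False),
    "ImageSpace": (True, False, False, True),
    "ELAN":       (True, True, True, False),
}


def missing_requirements(tool):
    """Return the names of the requirements the given tool does not meet."""
    return [name for name, ok in zip(REQUIREMENTS, TOOLS[tool]) if not ok]


if __name__ == "__main__":
    for tool in TOOLS:
        gaps = missing_requirements(tool)
        print(f"{tool}: missing {', '.join(gaps) if gaps else 'nothing'}")
```

Under this reading, no listed tool meets all four requirements, and ELAN lacks only ontology support.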

9 Our Approach and Advantages
- We developed OntoELAN, an ontology-based annotation tool for linguistic multimedia data that satisfies all of the requirements above.
- The ontological approach eliminates multiple incompatible annotation formats, provided the whole community can agree on one domain ontology.
- Annotations are formally defined and machine interpretable:
  - additional, implicit information can be deduced
  - search is more precise and easier

10 An Overview of OntoELAN
Developed on top of the ELAN annotator (Max Planck Institute for Psycholinguistics team).
Features inherited from ELAN:
- display of speech and/or video signals, together with their annotations
- time linking of annotations to media streams
- linking of annotations to other annotations
- unlimited number of annotation tiers as defined by a user
- different character sets
- basic search facilities

11 An Overview of OntoELAN
Ontology support (Wayne State University team).
New features:
- language profile creation
- ontology-based annotation
- storing annotations in the XML format based on the General Multimedia Ontology and domain ontologies

12 An Overview of OntoELAN

13 An Overview of OntoELAN

14 Linguistic Domain Ontology
One example is the General Ontology for Linguistic Description (GOLD), developed at University of Arizona.
- Expressions: OrthographicExpression, Utterance, SignedExpression, Word, WordPart
- Grammar: Tense, Number, Agreement, PartOfSpeech
  - PartOfSpeech: Noun, Verb, Participle, Preverb
- Data structures: a lexical entry, a phoneme table and a syntactic tree
- Metaconcepts: Language itself

15 General Multimedia Ontology
A simple semantic framework for multimedia annotation, developed at Wayne State University especially for OntoELAN.
- AnnotationDocument
- Tier
- TimeSlot
- Annotation
- AlignableAnnotation
- ReferringAnnotation
- AnnotationValue
- StringAnnotation
- OntologyAnnotation
- etc.
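One way to picture how these classes might fit together is the sketch below. Only the class names come from the slide. The subclass relationships (AlignableAnnotation and ReferringAnnotation under Annotation, StringAnnotation and OntologyAnnotation under AnnotationValue) and every field are assumptions made for illustration; the actual ontology may be organized differently.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TimeSlot:
    time_ms: int  # assumed unit: milliseconds from the start of the media


@dataclass
class AnnotationValue:
    """Base class for what an annotation says (assumed hierarchy)."""


@dataclass
class StringAnnotation(AnnotationValue):
    text: str  # free-text value


@dataclass
class OntologyAnnotation(AnnotationValue):
    term: str  # an ontological term, e.g. "Noun"


@dataclass
class Annotation:
    value: AnnotationValue


@dataclass
class AlignableAnnotation(Annotation):
    begin: TimeSlot  # linked directly to the time axis
    end: TimeSlot


@dataclass
class ReferringAnnotation(Annotation):
    parent: Annotation  # linked to another annotation instead of a time


@dataclass
class Tier:
    name: str
    annotations: List[Annotation] = field(default_factory=list)


@dataclass
class AnnotationDocument:
    media_file: str
    tiers: List[Tier] = field(default_factory=list)


# Usage: a word aligned to the first half second, tagged as a Noun.
word = AlignableAnnotation(StringAnnotation("dog"), TimeSlot(0), TimeSlot(500))
tag = ReferringAnnotation(OntologyAnnotation("Noun"), parent=word)
doc = AnnotationDocument(
    "example.mpg",  # hypothetical file name
    [Tier("words", [word]), Tier("part-of-speech", [tag])],
)
```

The referring annotation carries no time of its own; it inherits its position from the annotation it points to.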

16 General Multimedia Ontology

17 Language Profile
A language profile is a subset of ontological terms, possibly renamed, that are used in the annotation of a particular multimedia resource. It consists of:
- ontological terms
- user-defined terms
- a mapping between ontological terms and user-defined terms
- a reference to an ontology

18 Language Profile
Advantages:
- Only a subset of ontological terms is useful for annotating a particular resource.
- Ontological terms can be renamed, e.g. to use another language, an abbreviation, or a synonym.
- The meaning of two or more ontological terms can be combined in one user-defined term.
Disadvantage:
- More work
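A language profile as described on these two slides can be sketched as a small mapping structure. This is a minimal illustration, not OntoELAN's implementation: the class `LanguageProfile`, its methods, and the user-defined terms "N" and "VN" are hypothetical, while the ontological terms and the GOLD URL are taken from the slides.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet


@dataclass
class LanguageProfile:
    ontology_uri: str  # the reference to an ontology
    # user-defined term -> one or more ontological terms
    mapping: Dict[str, FrozenSet[str]] = field(default_factory=dict)

    def define(self, user_term: str, *ontological_terms: str) -> None:
        """Rename one ontological term, or combine several into one."""
        if not ontological_terms:
            raise ValueError("a user-defined term needs at least one ontological term")
        self.mapping[user_term] = frozenset(ontological_terms)

    def resolve(self, user_term: str) -> FrozenSet[str]:
        """Return the ontological terms behind a user-defined term."""
        return self.mapping[user_term]

    def ontological_terms(self) -> FrozenSet[str]:
        """The subset of the ontology that this profile actually uses."""
        return frozenset().union(*self.mapping.values())


profile = LanguageProfile("http://www.emeld.org/gold")
profile.define("N", "Noun")            # abbreviation of a single term
profile.define("VN", "Verb", "Noun")   # hypothetical combined term
```

Storing a set of ontological terms per user-defined term covers both renaming (a set of one) and combining (a set of several) with the same structure.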

19 Language Profile

20 Annotation Tiers and Linguistic Types
Annotation tiers:
- contain annotation values
- can be either alignable or referring
- are associated with their linguistic types
Linguistic types:
- None
- Time Subdivision
- Symbolic Subdivision
- Symbolic Association
Ontological tier
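The relationship between linguistic types and the alignable/referring distinction can be sketched as follows. The four type names come from the slide. Which types are alignable and which need a parent tier is not stated on the slide; the split below follows the usual description of ELAN's tier model and should be treated as an assumption. The ontological tier is left out because the slide does not say how it relates to the other types.

```python
from enum import Enum


class LinguisticType(Enum):
    NONE = "None"
    TIME_SUBDIVISION = "Time Subdivision"
    SYMBOLIC_SUBDIVISION = "Symbolic Subdivision"
    SYMBOLIC_ASSOCIATION = "Symbolic Association"


# Assumption: tiers of these types hold annotations linked to the time axis.
ALIGNABLE_TYPES = {LinguisticType.NONE, LinguisticType.TIME_SUBDIVISION}


def is_alignable(ltype: LinguisticType) -> bool:
    """True if annotations on a tier of this type are time-aligned."""
    return ltype in ALIGNABLE_TYPES


def requires_parent_tier(ltype: LinguisticType) -> bool:
    """Assumption: every type except None constrains a tier to a parent tier."""
    return ltype is not LinguisticType.NONE


if __name__ == "__main__":
    for ltype in LinguisticType:
        kind = "alignable" if is_alignable(ltype) else "referring"
        print(f"{ltype.value}: {kind}, parent tier required: {requires_parent_tier(ltype)}")
```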

21 Linguistic Multimedia Annotation with OntoELAN
- Language profile creation
- Creation of tiers
- Creation of annotations

22 Linguistic Multimedia Annotation with OntoELAN

23 Demos
Language profile creation:
- profile01.swf, profile01.AVI
- profile02.swf, profile02.AVI
Creation of tiers and creation of annotations:
- annotate01.swf, annotate01.AVI
- annotate02.swf, annotate02.AVI

24 Conclusions and Future Work
OntoELAN is the first attempt at annotating linguistic multimedia data with a linguistic ontology.
Future work:
- provide more channels for sharing data on the Web, such as the multimedia descriptions, the language words, etc.
- improve the current search system
- integrate text document annotation

25 References
- Artem Chebotko, Yu Deng, Shiyong Lu and Farshad Fotouhi. An Ontology-based Multimedia Annotator for the Semantic Web of Language Engineering. International Journal on Semantic Web and Information Systems, January 2005.
- Artem Chebotko et al. OntoELAN: An Ontology-based Linguistic Multimedia Annotator. Proc. of the IEEE Sixth International Symposium on Multimedia Software Engineering (IEEE-MSE'2004), Miami, FL, USA, December 2004.

26 References
- OntoELAN: http://www.cs.wayne.edu/~yudeng/projects.htm
- LangDL: A Digital Library For Language Engineering And Research: http://database.cs.wayne.edu/proj/langdl/index.html
- ELAN: http://www.mpi.nl/tools/elan.html
- E-MELD: http://www.emeld.org
- GOLD: http://www.emeld.org/gold
- General Multimedia Ontology: http://database.cs.wayne.edu/proj/OntoELAN/multimedia.owl

27 Questions?
Contact information: Artem Chebotko, artem@cs.wayne.edu, 313-577-6711
