A Reduced Yet Extensible Audio-Visual Description Language: How to Escape From the MPEG-7 Bottleneck
Thursday 28th of October, 2004
Raphaël Troncy, Jean Carrive

10/28/2004 ACM DocEng'04 - Raphaël Troncy

Description of the AV content
Various uses / different granularity:
  – identification of the content creator and the content provider: Dublin Core metadata, VRA Core categories, TV-Anytime metadata, …
  – feature extraction from the audio/video signal: storing and exchanging the results of automatic tools (MPEG-7)
  – structural decomposition into video segments corresponding to the logical structure of the program: time-code, spatial coordinates
  – semantic description of these segments: controlled vocabulary, thesaurus, free-text annotation

Description of the AV content (cultural heritage point of view)
Segmentation
  – locate and date some events
Description
  – type each segment with an AV genre
  – type each segment with a general theme
  – give hints on the production
  – describe the scene (who, when, where, what, …)
⇒ needs a powerful description language

MPEG-7, the natural candidate description language?
ISO standard since December 2001
Main components:
  – Descriptors (Ds) and Description Schemes (DSs)
  – DDL (XML Schema + extensions)
Concerns all types of media
XML syntax
Part 5 - MDS

MPEG-7: an ineffective description language for intelligent access to AV
1. A non-extensible language
  – closed set of descriptors
2. An exchange syntax rather than a real machine-processable multimedia description language
  – non-object-based data model
  – non-modular language (universal approach)
3. No formal semantics provided
  – applications cannot access the meaning of the documents
⇒ is the DDL (XML Schema) at fault?

Motivating scenario
Generic application for manually describing TV programs with respect to:
  – structural constraints: patterns represent the logical structure of a document
  – semantic constraints: the description of the content is machine understandable
Let us define the temporal structure of a sports magazine

MPEG-7 cannot carry out this scenario
⇒ how to reconcile the critical issue: object-oriented semantic expression versus structural validation
How to define new descriptors?
How to define new description schemes?
How to make the description machine understandable?

Our proposition: AVDL
AVDL: a reduced yet extensible audio-visual description language
  – an object meta-model (an instance of the meta-model specifies the vocabulary for, and the rules followed by, the descriptions)
  – an XML syntax
  – a semantics (close to DL, i.e. description logics, for the descriptors)
Description Schemes
  – Descriptors
  – Properties
  – Structures
Descriptions
  – valid instances w.r.t. description schemes
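
To make the split between description schemes and descriptions more concrete, here is a minimal Python sketch. It is not the AVDL meta-model itself (the slides implement it in C#/.NET); the class names `Property`, `Descriptor` and `DescriptionScheme`, the `validate` method, and the example descriptors `Segment` and `Interview` are all assumptions made for illustration. It only shows the idea that a scheme declares descriptors with typed, inheritable properties, and that a description is accepted only if it is a valid instance of the scheme.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Property:
    """A named, typed property attached to a descriptor (hypothetical)."""
    name: str
    datatype: type


@dataclass
class Descriptor:
    """A descriptor with its own properties and an optional parent."""
    name: str
    properties: Dict[str, Property] = field(default_factory=dict)
    parent: Optional["Descriptor"] = None

    def all_properties(self) -> Dict[str, Property]:
        # Properties are inherited from the parent descriptor, if any.
        inherited = self.parent.all_properties() if self.parent else {}
        return {**inherited, **self.properties}


@dataclass
class DescriptionScheme:
    """The vocabulary against which descriptions are checked."""
    descriptors: Dict[str, Descriptor] = field(default_factory=dict)

    def add(self, descriptor: Descriptor) -> Descriptor:
        self.descriptors[descriptor.name] = descriptor
        return descriptor

    def validate(self, descriptor_name: str, values: dict) -> List[str]:
        """Return a list of error messages; an empty list means valid."""
        if descriptor_name not in self.descriptors:
            return [f"unknown descriptor: {descriptor_name}"]
        props = self.descriptors[descriptor_name].all_properties()
        errors = []
        for key, value in values.items():
            if key not in props:
                errors.append(f"unknown property: {key}")
            elif not isinstance(value, props[key].datatype):
                expected = props[key].datatype.__name__
                errors.append(f"property {key} expects {expected}")
        return errors


# Hypothetical scheme: an Interview is a Segment with a guest name.
scheme = DescriptionScheme()
segment = scheme.add(Descriptor("Segment", {"start": Property("start", float)}))
scheme.add(Descriptor("Interview", {"guest": Property("guest", str)}, parent=segment))

valid = scheme.validate("Interview", {"start": 12.0, "guest": "a guest"})
invalid = scheme.validate("Interview", {"start": "12"})
```

In this sketch `valid` is an empty list, while `invalid` reports that `start` expects a float. A real implementation would also have to cover structures, which this example leaves out.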

The meta-class level

The class level

Location

Document, Content and Media
Distinction:
  – Document vs. Content vs. Media
  – Virtual content vs. physical content
Media: a content abstraction for decomposition
  – audio tracks, subtitles

Defining Structures
A structure defines how the descriptors may, and must, be combined
  – allows the descriptions to be controlled
  – allows the descriptions to be completed automatically
AVDL provides some predefined structure models
  – containment: gives the list of the possible sub-segments of an AV segment (in space and in time)
  – regular expression: by analogy with a grammar, for temporal succession
Other models are currently being studied: temporal constraints, etc.
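
The two predefined structure models can be illustrated with a short Python sketch. It is an assumption-laden example, not AVDL's actual mechanism: the segment types (`Opening`, `Report`, `Interview`, `Closing`, `Shot`), the `CONTAINMENT` table, the one-letter encoding, and the pattern itself are invented for a hypothetical sports magazine. Containment is modelled as a table of allowed sub-segment types, and temporal succession as a regular expression over the sequence of sub-segment types.

```python
import re

# Hypothetical containment model: which sub-segment types each
# segment type may contain.
CONTAINMENT = {
    "SportsMagazine": {"Opening", "Report", "Interview", "Closing"},
    "Report": {"Shot"},
}

# Hypothetical temporal model: each segment type is encoded as one
# letter, and the allowed order is a regular expression over the letters.
SYMBOLS = {"Opening": "O", "Report": "R", "Interview": "I", "Closing": "C"}

# An opening, then one or more reports (each optionally followed by an
# interview), then a closing.
TEMPORAL_PATTERN = re.compile(r"O(RI?)+C")


def check_containment(parent, children):
    """Return the children that the parent type is not allowed to contain."""
    allowed = CONTAINMENT.get(parent, set())
    return [child for child in children if child not in allowed]


def check_temporal_order(children):
    """Return True if the sequence of segment types matches the pattern."""
    try:
        word = "".join(SYMBOLS[child] for child in children)
    except KeyError:
        # A type with no symbol cannot appear in the temporal pattern.
        return False
    return TEMPORAL_PATTERN.fullmatch(word) is not None


program = ["Opening", "Report", "Interview", "Report", "Closing"]
program_ok = check_temporal_order(program)
misplaced = check_containment("SportsMagazine", ["Opening", "Shot"])
```

Here `program_ok` is `True` and `misplaced` contains only `"Shot"`. The same pattern could in principle drive automatic completion by proposing the segment types that may legally follow a partial sequence, but this sketch does not attempt that.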

AVDL Implementation
XML serialization
  – independent of any schema language
  – uses XML Schema validation (mainly for datatypes)
C#
  – object inheritance
  – use of .NET reflection

XML Serialization
[Diagram. Levels: Audio-Visual Description Language, Description Schemes, Descriptions. Files: avdl.xsd, ds-17.xml, ds-17.xsd, d-162.xml. Arrow labels: partial control, transformation, partial control.]

XML Syntax (DS)
<Constraint type="temporal" validation="full" method="system" parser="XMLSchema">

XML Syntax (Descriptions)
<Media id="CPB mpg" name="CPB mpg" contentID="CPB mpg" frameHeight="288" frameWidth="352"/>
...
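
As an illustration of how an application might read such a description, the following Python sketch parses the `Media` element shown on the slide with the standard library's `xml.etree.ElementTree`. The attribute values are copied from the slide as they appear there. The function name `read_media` and the shape of the returned dictionary are assumptions. The explicit `int()` conversions only stand in for the datatype checks that the slides delegate to XML Schema validation.

```python
import xml.etree.ElementTree as ET

# The Media element as it appears on the slide.
MEDIA_XML = (
    '<Media id="CPB mpg" name="CPB mpg" contentID="CPB mpg" '
    'frameHeight="288" frameWidth="352"/>'
)


def read_media(xml_text):
    """Parse a Media element into a plain dictionary (hypothetical helper)."""
    element = ET.fromstring(xml_text)
    if element.tag != "Media":
        raise ValueError(f"expected a Media element, got {element.tag}")
    return {
        "id": element.get("id"),
        "name": element.get("name"),
        "content_id": element.get("contentID"),
        # int() raises ValueError if a frame dimension is not an integer.
        "frame_size": (
            int(element.get("frameWidth")),
            int(element.get("frameHeight")),
        ),
    }


media = read_media(MEDIA_XML)
```

After this runs, `media["frame_size"]` is the pair `(352, 288)`, that is, width then height.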

Carrying out the scenario
Definition of new descriptors and properties
  – associating behavior with the corresponding classes
  – performing reasoning on the descriptions with the formal definitions in OWL
Definition of logical and temporal structures
  – the description is controlled and validated by a grammar

Conclusion and Future Work
AVDL: a reduced yet extensible Audio-Visual Description Language
  – descriptors, properties, structures
  – XML syntax and DL semantics
  – .NET implementation and APIs
About structure validation:
  – which constructors to use? with which semantics?
Trade-off between expressivity and computability
  – OWL Full is undecidable
  – constraint satisfaction problems can be complex

.NET implementation
[Diagram. Labels: Memory, Description Schemes, Descriptions. Files: d-162.xml, ds-17.xml, ds-17.dll. Arrow labels: parsing, read/write, .NET instantiation.]

Two kinds of applications
Static Description Schemes
  – DSs are well known
  – the developer uses generated libraries
Dynamic Description Schemes
  – DSs are created by the application
  – use of the dynamic instantiation mechanism (reflection) of .NET
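
The dynamic case can be sketched in Python, which is swapped in here for the .NET reflection mechanism the slides refer to. This is an analogy, not the authors' implementation. The helper `make_descriptor_class` and the descriptor names `Segment` and `Interview` are assumptions. The point is only that classes for descriptors can be built at run time from a scheme the application has just created, instead of being taken from a library generated in advance.

```python
def make_descriptor_class(name, property_names, base=object):
    """Build a descriptor class at run time (hypothetical helper).

    The new class inherits the property list of its base class, if any,
    and accepts only declared properties as keyword arguments.
    """
    inherited = getattr(base, "_properties", ())
    all_properties = tuple(inherited) + tuple(property_names)

    def __init__(self, **values):
        unknown = set(values) - set(self._properties)
        if unknown:
            raise TypeError(f"unknown properties: {sorted(unknown)}")
        # Undeclared values default to None.
        for prop in self._properties:
            setattr(self, prop, values.get(prop))

    namespace = {"__init__": __init__, "_properties": all_properties}
    return type(name, (base,), namespace)


# Classes created by the application while it runs, not written by hand.
Segment = make_descriptor_class("Segment", ["start", "end"])
Interview = make_descriptor_class("Interview", ["guest"], base=Segment)

interview = Interview(start=0.0, end=42.5, guest="a guest")
```

In the static case the equivalent of `Segment` and `Interview` would simply be ordinary classes shipped in a generated library, and the application would import them directly.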