Published by Wendy Todd. Modified over 9 years ago.
1
MPEG-7 Audio Overview
Ichiro Fujinaga
MUMT 611, McGill University
2
2 / 18 MUMT611 Fujinaga
Content
- MPEG-7 overview
  - Objectives and scope
  - Main elements and organization
- MPEG-7 audio
  - Low-level features
  - High-level features and tools
3
Introduction
- (formally) Multimedia Content Description Interface
- MPEG-1, 2, 4: Content coding and representation
- MPEG-7: Metadata (1998-2001)
  - standardized descriptions and description schemes of structures and content of multimedia
  - a language to specify such descriptions and description schemes
- Interoperable interface that defines syntax and semantics
- Modalities: audio, visual, or multimedia
- Aspects: media, meta, structural, or semantic
- Applications: searching, filtering, navigation
4
Scope
- The goal is to provide interoperability among multimedia applications in
  - Generation
  - Management
  - Distribution
  - Consumption
5
Application domains
- Broadcast media selection (radio channel, TV channel)
- Digital libraries (film, video, audio and radio archives)
- E-Commerce (personalized advertising)
- Education (repositories of multimedia courses, multimedia search for support material)
- Home Entertainment (management of personal multimedia collections, including manipulation of content, e.g. karaoke)
- Journalism (searching speeches of a certain politician using his name, his voice or his face)
- Multimedia directory services (yellow pages)
- Surveillance and remote sensing
6
Components (XML)
- MPEG-7 Systems
- MPEG-7 Description Definition Language
- MPEG-7 Visual
- MPEG-7 Audio
- MPEG-7 Multimedia Description Schemes
- Reference Software: the eXperimentation Model (test)
- MPEG-7 Conformance (syntax checking)
- MPEG-7 Extraction and use of descriptions (technical report)
7
Other Standards
- SMPTE
- EBU
- TV-Anytime
- DIG-35
- Dublin Core
- OCLC/RLG
8
MPEG-7 Objectives
- Information about the content
  - Form: e.g. the coding format used
  - Conditions for accessing the material: intellectual property rights / price
  - Classification: e.g. parental rating
  - Links to other relevant materials
  - Context: e.g. “Olympic Games 1996, final of 200 meter hurdles, men”
- Information present in the content
  - Combination of low-level and high-level descriptors
9
Where do the descriptions come from?
- Preservation of existing descriptive data through the production/delivery
- Generated automatically by capture devices (e.g. time or GPS location in a camera)
- Extracted automatically & semi-automatically
- Manually produced (e.g. for legacy material such as existing film archives)
10
Main Elements of MPEG-7
- Description Tools (textual / binary)
  - Descriptors (D): define the syntax and the semantics of each feature (metadata element)
  - Description Schemes (DS): relationships between components
- Description Definition Language (DDL)
  - Define the syntax of the MPEG-7 Description Tools
  - Creation, extension, and modification of DSs
- System tools
  - Storage and transmission, synchronization of descriptions with content, multiplexing of descriptions, etc.
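The split between Descriptors (single features) and Description Schemes (groupings that relate components) can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the classes `Descriptor` and `DescriptionScheme` and every element name ending in `Sketch` are hypothetical, the output is not valid against the MPEG-7 schema, and no DDL validation is attempted.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Descriptor:
    """One feature (metadata element): a name plus its value."""
    name: str
    value: str


@dataclass
class DescriptionScheme:
    """Groups descriptors and nested schemes, i.e. relates components."""
    name: str
    descriptors: list = field(default_factory=list)
    children: list = field(default_factory=list)

    def to_element(self) -> ET.Element:
        # Emit descriptors first, then nested schemes, as child elements.
        elem = ET.Element(self.name)
        for d in self.descriptors:
            ET.SubElement(elem, d.name).text = d.value
        for child in self.children:
            elem.append(child.to_element())
        return elem


def build_example() -> str:
    # Hypothetical element names, chosen for illustration only;
    # they are NOT taken from the MPEG-7 schema.
    creation = DescriptionScheme(
        "CreationSketch",
        descriptors=[Descriptor("TitleSketch", "Example recording")],
    )
    segment = DescriptionScheme(
        "AudioSegmentSketch",
        descriptors=[Descriptor("PowerSketch", "0.42")],
    )
    root = DescriptionScheme("DescriptionSketch", children=[creation, segment])
    # Textual form of the description (the binary form is not sketched here).
    return ET.tostring(root.to_element(), encoding="unicode")


if __name__ == "__main__":
    print(build_example())
```

In the real standard the DDL would define which elements may appear and how they nest; here that role is played informally by the two dataclasses.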
11
Main Elements of MPEG-7
Salembier and Avaro (2001)
12
Description Tools
- Creation and production processes: (director, title)
- Usage: (broadcast schedule)
- Storage features
- Structural information: (spatial-temporal components)
- Segmentations
- Low-level features: (sound timbres, melody description)
- Conceptual information: (objects and events, interactions)
- Navigation and access: (summaries, variations)
- Collections of objects
- User-content interactions: (user preferences, usage history)
13
MPEG-7 Audio
- Audio provides structures, building upon some basic structures from the MDS, for describing audio content.
- Low-level features
  - audio features that cut across many applications
- High-level features and tools
  - more specific to a set of applications
14
Low-level Features
- Two low-level descriptor types (for sample and segment)
  - Scalar: (e.g. power or fundamental frequency)
  - Vector: (e.g. spectra)
- Hierarchical, consistent interface
  - Any descriptor inheriting from these types can be instantiated, describing a segment with a single summary value or a series of sampled values, as the application requires.
- Scalable series (hierarchical re-sampling)
  - Progressively down-sample the data contained in a series (application-oriented)
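The idea of a scalable series, where a scalar descriptor can be stored either as sampled values or as one summary value, can be sketched as repeated down-sampling. This is a minimal sketch, not the standard's definition: it assumes the summary operation is a plain mean and a fixed ratio of 2, and the function names `downsample` and `scalable_series` are hypothetical.

```python
from statistics import fmean


def downsample(series, ratio=2):
    """Replace each run of `ratio` samples with its mean.

    A shorter final run is averaged over the samples it has.
    """
    return [fmean(series[i:i + ratio]) for i in range(0, len(series), ratio)]


def scalable_series(series, ratio=2):
    """Return successively coarser versions of `series`.

    The first level is the original data; the last level is a single
    summary value (assumption: mean, ratio fixed at every level).
    """
    if ratio < 2:
        raise ValueError("ratio must be at least 2")
    levels = [list(series)]
    while len(levels[-1]) > 1:
        levels.append(downsample(levels[-1], ratio))
    return levels


# Example: a made-up scalar feature (e.g. power) sampled over one segment.
levels = scalable_series([1.0, 3.0, 2.0, 6.0, 4.0, 4.0, 0.0, 8.0])
for level in levels:
    print(level)
# [1.0, 3.0, 2.0, 6.0, 4.0, 4.0, 0.0, 8.0]
# [2.0, 4.0, 4.0, 4.0]
# [3.0, 4.0]
# [3.5]
```

An application would keep only the level it needs: the full series for detailed comparison, or the single value when a per-segment summary is enough.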
15
Low-level Features
Salembier and Avaro (2001)
16
High-level Features
- Exchange some generality for descriptive richness:
  - a smaller set of audio features (as compared to visual features) that may canonically represent a sound without domain-specific knowledge.
- Audio Signature (DS)
- Musical Instrument Timbre
- Melody
- General Sound Recognition and Indexing
- Spoken Content
17
Recent Development
- New audio description tools specified (MPEG-7 version 2):
  - Audio signal quality
  - Audio tempo
  - Chord pattern
  - Rhythm pattern
  - Multi-channel
18
References
- Chang, S., T. Sikora, and A. Puri. 2001. Overview of MPEG-7 Standard. IEEE Transactions on Circuits and Systems for Video Technology 11 (6): 688-95.
- Martinez, J. 2004. MPEG-7 Overview. http://www.chiariglione.org/mpeg/standards/mpeg-7/mpeg-7.htm
- Quackenbush, S., and A. Lindsay. 2001. Overview of MPEG-7 audio. IEEE Transactions on Circuits and Systems for Video Technology 11 (6): 725-9.
- Salembier, P., and O. Avaro. 2000. MPEG-7: Multimedia Content Description Interface. http://gps-tsc.upc.es/imatge/_Philippe/demo/MPEG21_MPEG7.pdf