MPEG-7 Audio Overview Beinan Li MUMT 611 Week 2 2005. 1. 20.


Content
  MPEG-7 overview
    What is...
    Why?
    Objectives and scope
    Main elements and organization
  MPEG-7 Audio
    Low-level features
    High-level tools

What is MPEG-7
  "Multimedia Content Description Interface"
  ISO/IEC standard by MPEG (Moving Picture Experts Group)
  Provides metadata for multimedia
  MPEG-1, -2, -4 make content available; MPEG-7 makes content accessible, retrievable, filterable and manageable (via device / computer).
  Multiple degrees of interpretation of the information's meaning
  Supports as broad a range of applications as possible.
  A standard that is extensible and compatible with existing technology.

Why MPEG-7
  "The value of information often depends on how easy it can be found, retrieved, accessed, filtered and managed."
  Past: scarcity of digital multimedia sources -> simple access mechanisms
  Now: growing amount of audiovisual information -> identifying and managing it efficiently is becoming more difficult.
    e.g. "record only news about sport."

Why MPEG-7
  For future multimedia services, content representation and description may have to be addressed jointly.
    Many services dealing with content representation will have to deal first with content description.
    "A non-described content may be useless."
  Need for access to the content description only:
    New original services (e.g. optimizing personal time)
    Adaptation to network and terminal capabilities

Application domains (incomplete)
  Broadcast media selection (e.g. radio channel, TV channel).
  Digital libraries (e.g. film, video, audio and radio archives).
  E-commerce (e.g. personalized advertising).
  Education (e.g. repositories of multimedia courses, multimedia search for support material).
  Home entertainment (e.g. management of personal multimedia collections, including manipulation of content such as karaoke).
  Journalism (e.g. searching for speeches of a certain politician using his name, his voice or his face).
  Multimedia directory services (e.g. yellow pages, GIS).
  Surveillance and remote sensing.

MPEG-7 Objectives
  Standardize content-based description for various types of audiovisual information
    Independent of media support (encoding and storage)
  Different granularity
    Low-level features: shape, size, key, tempo changes
    High-level semantic info: "scene with a barking brown dog on the left and with the sound of passing cars in the background."
  Meaningful in the context of the application
    Same material -> different types of features and combinations, e.g. timbre vs. loudness

MPEG-7 Objectives
  Information about the content
    The form: e.g. the coding format used
    Conditions for accessing the material: e.g. intellectual property rights / price
    Classification: e.g. parental rating
    Links to other relevant materials
    The context: e.g. "Olympic Games 1996, final of 200 meter hurdles, men"
  Information present in the content
    Combination of low-level and high-level descriptors

Scope of the Standard processing chain:

An example of architecture
  Pull: client queries -> descriptions repository -> matched Ds
  Push: filter descriptions -> programmed actions
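The pull and push paths can be sketched in a few lines. This is only an illustration: the `Description` class, the feature keys and the exact-match rule are assumptions made for the example, not part of MPEG-7, which standardizes the descriptions and leaves search and filtering engines to the application.

```python
from dataclasses import dataclass, field


@dataclass
class Description:
    """Hypothetical stand-in for an MPEG-7 description of one piece of content."""
    content_id: str
    features: dict = field(default_factory=dict)


def matches(description, criteria):
    """True when every key/value pair in the criteria appears in the description."""
    return all(description.features.get(k) == v for k, v in criteria.items())


def pull(repository, query):
    """Pull: a client query is run against a repository of descriptions."""
    return [d for d in repository if matches(d, query)]


def push(stream, criteria, action):
    """Push: incoming descriptions are filtered and trigger a programmed action."""
    for d in stream:
        if matches(d, criteria):
            action(d)


repository = [
    Description("clip-1", {"genre": "news", "topic": "sport"}),
    Description("clip-2", {"genre": "news", "topic": "weather"}),
    Description("clip-3", {"genre": "film", "topic": "sport"}),
]

# Pull: search the repository for sport news.
matched = pull(repository, {"genre": "news", "topic": "sport"})

# Push: "record only news about sport" as descriptions arrive.
recorded = []
push(repository, {"genre": "news", "topic": "sport"},
     lambda d: recorded.append(d.content_id))
```

Both paths reuse the same matching rule; what differs is who initiates the exchange.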

Workplan

Where are the descriptions from?
  Preservation of existing descriptive data (e.g. scripts) through production and delivery
  Generated automatically by capture devices (e.g. time or GPS location in a camera)
  Extracted automatically or semi-automatically (i.e. with some human assistance)
  Manually produced (e.g. for legacy material such as existing film archives)

Main Elements of MPEG-7
  Description Tools (textual / binary):
    Descriptors (D): define the syntax and the semantics of each feature (metadata element)
    Description Schemes (DS): specify the relationships between components
  Description Definition Language (DDL):
    Defines the syntax of the MPEG-7 Description Tools
    Allows creation, extension and modification of DSs
  System tools:
    Storage and transmission, synchronization of descriptions with content, multiplexing of descriptions, etc.

Main Elements of MPEG-7 Relationship among elements introduced above.

Description Tools
  Creation and production processes (director, title)
  Usage (broadcast schedule)
  Storage features
  Structural information (spatio-temporal components)
  Segmentations
  Low-level features (sound timbres, melody description)
  Conceptual information (objects and events, interactions)
  Navigation and access (summaries, variations)
  Collections of objects
  User-content interactions (user preferences, usage history)

Organization of Description Tools

Descriptions (further)
  MPEG-7 approaches the description of content from several viewpoints.
    A set of methods and tools for the different viewpoints of the description (not a monolithic system)
    Interrelated, and can be combined in many ways.
  Associated with the content itself (searching, filtering)
  Location (document vs. stream):
    physically located with the material, or somewhere else on the globe
  Interoperability with other metadata standards (XML)

Use of Description Tools
  The description tools are presented on the basis of the functionality they provide.
  In practice, they are combined into meaningful sets of description units.
  Furthermore, each application will have to select a subset of Descriptors and DSs.
    A library of tools!
  The DDL can be used to handle the specific needs of the application (like scripting in many current applications).

Major Functionalities
  MPEG-7 Systems
  MPEG-7 Description Definition Language
  MPEG-7 Visual
  MPEG-7 Audio
  MPEG-7 Multimedia Description Schemes (D.T.)
  Reference Software: the eXperimentation Model (test)
  MPEG-7 Conformance (syntax checking)
  MPEG-7 Extraction and use of descriptions (technical report)

MPEG-7 Audio
  MPEG-7 Audio provides structures for describing audio content, building upon some basic structures from the MDS (Multimedia Description Schemes).
  Low-level Descriptors: audio features that cut across many applications
  High-level Description Tools: more specific to a set of applications

Low-level Features
  "MPEG-7 Audio Framework":
    Two low-level descriptor types (for sample and segment):
      Scalar (e.g. power or fundamental frequency)
      Vector (e.g. spectra)
    Hierarchical, consistent interface
      Any descriptor inheriting from these types can be instantiated, describing a segment with a single summary value or a series of sampled values, as the application requires.
  Scalable Series (hierarchical re-sampling):
    Progressively down-sample the data contained in a series (application-oriented)
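The idea of progressive down-sampling can be sketched as below. This is an illustrative assumption, not the normative Scalable Series definition: each level simply averages pairs of neighbouring values, and the function names are made up for the example.

```python
def scale_series(values, ratio=2):
    """Down-sample by averaging each group of `ratio` consecutive values."""
    scaled = []
    for start in range(0, len(values), ratio):
        group = values[start:start + ratio]
        scaled.append(sum(group) / len(group))
    return scaled


def scalable_series(values, levels):
    """Return the original series followed by progressively coarser versions."""
    series = [list(values)]
    for _ in range(levels):
        series.append(scale_series(series[-1]))
    return series


# Eight sampled values reduced to four, two, and finally one summary value.
levels = scalable_series([1.0, 3.0, 2.0, 4.0, 6.0, 8.0, 5.0, 7.0], 3)
```

An application can then pick the level of detail it needs, from the full series down to a single summary value for the segment.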

Low-level Features (types)
  Basic
  Basic Spectral
  Signal Parameters
  Timbral Temporal
  Timbral Spectral
  Spectral Basis
  MPEG-7 Silence Descriptor

Low-level Features (graph)

Low-level Features (details)
  Basic (temporally sampled scalar values for general use)
    AudioWaveform Descriptor: waveform envelope (for display purposes)
    AudioPower Descriptor: temporally smoothed instantaneous power (quick summary of a signal)
    Applicable to all kinds of signals
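A rough sketch of the two Basic descriptors follows. The frame size, the non-overlapping framing and the function names are assumptions for illustration; the standard defines its own sampling and summarization rules.

```python
def frames(samples, frame_size):
    """Split the signal into consecutive non-overlapping frames (tail dropped)."""
    return [samples[start:start + frame_size]
            for start in range(0, len(samples) - frame_size + 1, frame_size)]


def waveform_envelope(samples, frame_size):
    """Per-frame (minimum, maximum) pairs, enough to draw a waveform overview."""
    return [(min(f), max(f)) for f in frames(samples, frame_size)]


def audio_power(samples, frame_size):
    """Per-frame mean of the squared samples, a smoothed instantaneous power."""
    return [sum(x * x for x in f) / frame_size for f in frames(samples, frame_size)]


signal = [0.0, 1.0, 0.0, -1.0, 0.5, 0.5, -0.5, -0.5]
envelope = waveform_envelope(signal, 4)
power = audio_power(signal, 4)
```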

Low-level Features (details)
  Basic Spectral (single time-frequency analysis of the signal)
    AudioSpectrumEnvelope (base class): the short-term power spectrum (display, synthesis, general-purpose search)
    AudioSpectrumCentroid: is the spectrum dominated by high or low frequencies?
    AudioSpectrumSpread: is the power spectrum centered near the spectral centroid, or spread out over the spectrum? Distinguishes pure-tone and noise-like sounds.
    AudioSpectrumFlatness: indicates the presence of tonal components
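Spectral flatness is commonly computed as the ratio of the geometric mean to the arithmetic mean of the power spectrum. The sketch below uses that textbook definition over a whole spectrum; the per-band processing of the actual AudioSpectrumFlatness descriptor is not reproduced, and the function name is an assumption.

```python
import math


def spectral_flatness(power):
    """Geometric mean divided by arithmetic mean of strictly positive power values.

    Values near 1 suggest a flat, noise-like spectrum; values near 0 suggest
    that a few tonal components dominate.
    """
    geometric = math.exp(sum(math.log(p) for p in power) / len(power))
    arithmetic = sum(power) / len(power)
    return geometric / arithmetic


noise_like = spectral_flatness([1.0, 1.0, 1.0, 1.0])
tonal = spectral_flatness([1e-6, 1e-6, 1.0, 1e-6])
```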

Low-level Features (details)
  Signal Parameters (periodic or quasi-periodic signals)
    AudioFundamentalFrequency: carries a "confidence measure", because "pitch-tracking" extraction methods are not perfectly accurate
    AudioHarmonicity: distinguishes between sounds with a harmonic, inharmonic or non-harmonic spectrum
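To make the pairing of a fundamental frequency with a confidence value concrete, here is a minimal autocorrelation-based estimate. The method, the search range and the way confidence is normalized are all assumptions chosen for the sketch; MPEG-7 does not prescribe an extraction algorithm.

```python
import math


def estimate_f0(samples, sample_rate, min_hz=50.0, max_hz=400.0):
    """Return (f0_hz, confidence) from the strongest autocorrelation peak."""
    energy = sum(x * x for x in samples)
    if energy == 0.0:
        return 0.0, 0.0  # silent input: no pitch, no confidence
    min_lag = int(sample_rate / max_hz)
    max_lag = int(sample_rate / min_hz)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    if best_lag == 0:
        return 0.0, 0.0  # no positive correlation found in the search range
    # Confidence: autocorrelation at the chosen lag relative to signal energy.
    return sample_rate / best_lag, best_score / energy


rate = 8000
tone = [math.sin(2 * math.pi * 100.0 * n / rate) for n in range(800)]
f0, confidence = estimate_f0(tone, rate)
```

For the 100 Hz test tone the strongest peak falls at a lag of 80 samples, so the estimate is 100 Hz with a confidence close to 1; noisy or inharmonic input would yield a lower confidence.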

Low-level Features (details)
  Timbral Temporal (temporal characteristics of segments of sounds, musical timbre)
    LogAttackTime
    TemporalCentroid: where in time the energy of a signal is focused. Useful when attack times are identical.
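Both timbral temporal features can be sketched on a sampled amplitude envelope. The attack thresholds (2% and 90% of the peak) and the function names are assumptions for the example, not values taken from the standard.

```python
import math


def temporal_centroid(envelope, rate):
    """Envelope-weighted mean time in seconds: where the energy is focused."""
    total = sum(envelope)
    return sum((n / rate) * e for n, e in enumerate(envelope)) / total


def log_attack_time(envelope, rate, start_frac=0.02, stop_frac=0.9):
    """log10 of the time taken to rise from start_frac to stop_frac of the peak."""
    peak = max(envelope)
    start = next(n for n, e in enumerate(envelope) if e >= start_frac * peak)
    stop = next(n for n, e in enumerate(envelope) if e >= stop_frac * peak)
    # Guard against a zero-length attack so the logarithm stays defined.
    return math.log10(max(stop - start, 1) / rate)


rate = 10  # envelope samples per second
short_note = [0.0, 0.5, 1.0, 1.0, 0.5, 0.0]
long_note = [0.0, 0.5, 1.0, 1.0, 1.0, 1.0, 0.5, 0.0]

attacks = (log_attack_time(short_note, rate), log_attack_time(long_note, rate))
centroids = (temporal_centroid(short_note, rate), temporal_centroid(long_note, rate))
```

The two notes share the same attack, so only the temporal centroid tells them apart.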

Low-level Features (details)
  Timbral Spectral (spectral features in a linear-frequency space)
    SpectralCentroid: power-weighted average of the frequency of the bins in the linear power spectrum. Used for distinguishing musical instrument timbres.
    4 Ds for the harmonic, regularly spaced components of signals:
      HarmonicSpectralCentroid
      HarmonicSpectralDeviation
      HarmonicSpectralSpread
      HarmonicSpectralVariation
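The power-weighted average described for SpectralCentroid can be written directly. The bin frequencies and power values below are made-up inputs, and the function name is an assumption.

```python
def spectral_centroid(bin_freqs_hz, power):
    """Power-weighted average of the bin frequencies of a linear power spectrum."""
    total = sum(power)
    return sum(f * p for f, p in zip(bin_freqs_hz, power)) / total


bins = [100.0, 200.0, 300.0, 400.0]
flat = spectral_centroid(bins, [1.0, 1.0, 1.0, 1.0])    # energy spread evenly
dark = spectral_centroid(bins, [3.0, 1.0, 0.0, 0.0])    # energy in the low bins
bright = spectral_centroid(bins, [0.0, 0.0, 0.0, 2.0])  # energy in the top bin
```

A higher centroid corresponds to a brighter sound, which is why the feature helps separate instrument timbres.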

Low-level Features (details)
  Spectral Basis (low-dimensional projections of a spectral space to aid compactness and recognition)
    AudioSpectrumBasis: a series of (time-varying / statistically independent) basis functions derived from the singular value decomposition of a normalized power spectrum.
    AudioSpectrumProjection: low-dimensional features of a spectrum after projection upon a reduced-rank basis.
    Independent subspaces of a spectrum correlate strongly with different sound sources.
    Provide more salience using less space.
    Used with the Sound Classification and Indexing Description Tools.

Low-level Features (details)
  Silence segment (no significant sound)
    Aids further segmentation of the audio stream, or serves as a hint not to process a segment
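One simple way to find candidate silence segments is to threshold per-frame power. The threshold, the minimum duration and the function name are assumptions for this sketch; the standard describes the Silence descriptor, not how silence is detected.

```python
def silence_segments(frame_powers, threshold, min_frames=2):
    """Return (start, end) frame index pairs, end exclusive, of quiet runs."""
    segments = []
    start = None
    for i, p in enumerate(frame_powers):
        if p < threshold:
            if start is None:
                start = i  # a quiet run begins here
        else:
            if start is not None and i - start >= min_frames:
                segments.append((start, i))
            start = None
    # Close a quiet run that lasts until the end of the stream.
    if start is not None and len(frame_powers) - start >= min_frames:
        segments.append((start, len(frame_powers)))
    return segments


powers = [0.5, 0.001, 0.0, 0.002, 0.4, 0.0, 0.3, 0.0, 0.0]
quiet = silence_segments(powers, threshold=0.01)
```

The single quiet frame at index 5 is too short to count, so only the two longer runs are reported.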

High-level audio Description Tools (Ds and DSs)
  Exchange some generality for descriptive richness:
    a smaller set of audio features (as compared to visual features) that may canonically represent a sound without domain-specific knowledge.
  Audio Signature (DS)
  Musical Instrument Timbre
  Melody
  General Sound Recognition and Indexing
  Spoken Content

High-level audio Description Tools (details)
  Audio Signature Description Scheme
    Based on SpectralFlatness Ds
    A unique content identifier for the purpose of robust automatic identification, e.g. audio fingerprinting

High-level audio Description Tools (details)
  Musical Instrument Timbre Description Tools
    HarmonicInstrumentTimbre Ds: LogAttackTime Descriptor
    PercussiveInstrumentTimbre Ds: SpectralCentroid Descriptor

High-level audio Description Tools (details)
  Melody Description Tools: efficient, robust, and expressive melodic similarity matching.
    MelodyContour Description Scheme: terse, efficient melody contour / rhythm
    MelodySequence Description Scheme: verbose, complete, expressive melody / rhythm. Interval encoding.
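The difference between a full interval encoding and a terse contour can be sketched as follows. The five-level quantization and its semitone thresholds are assumptions for illustration (step of up to two semitones vs. a larger leap), and MIDI note numbers are used only as a convenient input format.

```python
def intervals(midi_notes):
    """Signed interval in semitones between each pair of successive notes."""
    return [cur - prev for prev, cur in zip(midi_notes, midi_notes[1:])]


def contour(midi_notes):
    """Quantize each interval to five levels: -2, -1, 0, 1, 2."""
    levels = []
    for step in intervals(midi_notes):
        if step == 0:
            level = 0                       # repeated note
        elif abs(step) <= 2:
            level = 1 if step > 0 else -1   # small step up or down
        else:
            level = 2 if step > 0 else -2   # larger leap up or down
        levels.append(level)
    return levels


melody = [60, 62, 64, 60, 60, 67, 65]
transposed = [n + 5 for n in melody]

melody_intervals = intervals(melody)
melody_contour = contour(melody)
```

Both encodings ignore absolute pitch, so a transposed query still matches; the contour is shorter to store but keeps less detail than the interval sequence.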

High-level audio Description Tools (details)
  General Sound Recognition and Indexing Description Tools:
    SoundModel Description Scheme
    SoundClassificationModel Description Scheme: a set of SoundModel DSs -> multi-way classifier
    SoundModelStatePath Descriptor: indices to the states generated by a SoundModel for a segment
  Immediately applicable to sound effects
  Automatically index and segment sound tracks.
  Low -> mid -> high level analyses
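The "set of models -> multi-way classifier" idea amounts to scoring a segment under every model and keeping the best label. In the sketch below each sound class is a one-dimensional Gaussian, a deliberately simple stand-in chosen for brevity; the class names, parameters and feature values are all made up.

```python
import math
from dataclasses import dataclass


@dataclass
class GaussianSoundModel:
    """Hypothetical stand-in for one sound model: a 1-D Gaussian over a feature."""
    label: str
    mean: float
    std: float

    def log_likelihood(self, features):
        """Sum of Gaussian log-densities over the feature values of a segment."""
        var = self.std ** 2
        return sum(-0.5 * math.log(2 * math.pi * var) - (x - self.mean) ** 2 / (2 * var)
                   for x in features)


def classify(models, features):
    """Multi-way classification: return the label of the best-scoring model."""
    return max(models, key=lambda m: m.log_likelihood(features)).label


models = [
    GaussianSoundModel("speech", mean=0.2, std=0.1),
    GaussianSoundModel("music", mean=0.6, std=0.1),
]
first = classify(models, [0.55, 0.62, 0.58])
second = classify(models, [0.15, 0.25])
```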

High-level audio Description Tools (details)
  Spoken Content Description Tools: detailed description of words spoken within an audio stream.
    Indexing into and retrieval of an audio stream
    Indexing of multimedia objects annotated with speech
  Recall of audio/video data by memorable spoken events
    e.g. a character or person spoke a particular word
  Spoken Document Retrieval
    separate spoken documents
  Annotated Media Retrieval
    e.g. a photograph retrieved using a spoken annotation

Development
  Currently under development:
    MPEG-7 Audio COR.1 (currently at DCOR1)
    MPEG-7 Amendment 1 (currently at FPDAM1)
  New Audio Description Tools specified (MPEG-7 version 2):
    Spoken Content
    Audio Signal Quality
    Audio Tempo
  Currently proposed tools:
    Low Level Descriptor for Audio Intensity
    Low Level Descriptor for Audio Spectrum Envelope Evolution
    Generic mechanism for data representation based on 'modulation decomposition'
    MPEG-7 Audio-specific binary representation of descriptors

MPEG-7 version 1 Schedule
  Call for Proposals                          October 1998
  Evaluation                                  February 1999
  First version of Working Draft (WD)         December 1999
  Committee Draft (CD)                        October 2000
  Final Committee Draft (FCD)                 February 2001
  Final Draft International Standard (FDIS)   July 2001
  International Standard (IS)                 September 2001

MPEG-7 work plan
  See: Annex A of MPEG-7 Overview (version 9)
  http://www.chiariglione.org/mpeg/standards/mpeg-7/mpeg-7.htm

Annotated Link Page / References
  http://www.music.mcgill.ca/~damonli/611/611_w2.htm
  All pictures taken from: P. Salembier and O. Avaro, "MPEG-7: Multimedia Content Description Interface", http://gps-tsc.upc.es/imatge/_Philippe/demo/MPEG21_MPEG7.pdf