MPEG-4 CS Division University of California at Berkeley www.cs.berkeley.edu/~johnw John Lazzaro John Wawrzynek June 18, 2001 Modified by Francois Thibault.


MPEG-4
CS Division, University of California at Berkeley
John Lazzaro, John Wawrzynek
June 18, 2001
Modified by Francois Thibault, January 20, 2003
Further modified by Ichiro Fujinaga, January 20, 2005

MPEG-4 Standard
- Finalized its standardization process in 1999 (Vancouver)
- Designed to integrate visual and audio content
- Includes "natural" (recorded) and "synthetic" (synthesized) coding of audio and video

MPEG-4 Scope
- Provides a set of technologies to satisfy the needs of
  - authors
  - network service providers
  - end users
- Enables the production of content that has far greater reusability in
  - digital television
  - animated graphics
  - web pages

MPEG-4 Features
MPEG-4 provides standardized ways to:
- represent units of aural, visual, or audiovisual content, called “media objects”, which may be of
  - natural origin (recorded with a camera or microphone), or
  - synthetic origin (generated with a computer)
- describe the composition of these objects to create compound media objects that form audiovisual scenes
- multiplex and synchronize the data associated with media objects, so that they can be transported over networks that provide a QoS (Quality of Service)
- interact with the audiovisual scene generated at the receiver’s end

MPEG-4 Standard (audio)
[Diagram: MPEG-4 comprises audio, system, and video. Audio divides into natural coding (AAC, T/F, CELP, Parametric) and synthetic coding (SA, TTS). Label: ISO/IEC sec5.]

MPEG-4 Audio: Natural (recorded)
- AAC: Advanced Audio Coding
  - Originally created as an extension to MPEG-2
  - Provides better quality at 64 kbit/sec/channel than MP3 does at 128 kbit/sec/channel
- CELP: a codebook-excited linear prediction scheme optimized for telephone-quality transmission of speech in the range 8-32 kbps
- Parametric: a novel "harmonic vector + noise" method that allows lossy but extremely low-bitrate coding of wideband sounds, down to 2 kbps/channel

MPEG-4 Audio: Synthetic (synthesized)
- Structured Audio:
  - A downloadable synthesis method that allows producers to describe new synthesis methods as part of the bitstream
  - The receiver implements a reconfigurable synthesis engine and synthesizes the sound on the fly as the instructions are received
- Text-to-Speech:
  - An interface to standalone TTS systems is provided, so that synthetic speech can be synchronized in multimedia presentations
  - No "method" of creating synthetic speech is standardized by MPEG

MPEG-4 Standard - Structured Audio
Structured Audio: one “component” in the MPEG audio standard.
[Diagram: MPEG-4 comprises audio, system, and video. Audio divides into natural coding (AAC, T/F, CELP, Parametric) and synthetic coding (SA, TTS). Label: ISO/IEC sec5.]

Audio Compression Basics
- Traditional technique for music
[Diagram: encoder and decoder blocks, with an input waveform plotted as amp versus time. Encoder block labels: Filter into Critical Bands, Allocate Bits, Format Bitstream, Compute Masking.]

The Kolmogorov alternative:
- Write a computer program that generates the desired audio stream.
- Transmit the computer program.
- To decode, execute the program.
- MPEG-4 Structured Audio (MP4-SA) uses this approach.
- Eric Scheirer, Editor (MIT Media Lab).
- Similar to PostScript!
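As a rough illustration of the three steps above, the following Python sketch treats a short program text as the "bitstream" and decodes it by executing it. The program text, the `render` function name, and the 8 kHz sample rate are assumptions made for illustration; none of this is MP4-SA syntax.

```python
import math
import struct

# Hypothetical "bitstream": a tiny program that generates one second of a 440 Hz tone.
PROGRAM = """
def render(rate):
    return [math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]
"""

def decode(program_text, rate=8000):
    """Decode by executing the transmitted program, then pack 16-bit PCM."""
    namespace = {"math": math}
    exec(program_text, namespace)            # running the program is the decoding step
    samples = namespace["render"](rate)
    return struct.pack("<%dh" % len(samples), *(int(32767 * s) for s in samples))

pcm = decode(PROGRAM)
program_bytes = len(PROGRAM.encode("utf-8"))
print(program_bytes, "bytes transmitted ->", len(pcm), "bytes of audio")
```

The transmitted text stays the same size however long the rendered audio is, which is the property the slide is pointing at.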

MP4-SA Encoding
- may be a creative act: writing a program,
  - directly (emacs), or
  - indirectly (GUI, webpage).
  - In this case, MP4-SA is a lossless compressor.
- may be automatic: given a sound, an encoder writes a program that generates the sound.
  - Automatic encoding is hard in the general case.
MP4-SA Decoders
- are interpreters or compilers.

Key Application: Music Production
- Modern music production is computer-based.
- Musicians enter performances into computers as control information, not audio waveforms.
- Digital synthesizers, effects, and mixers create the final audio, under engineer/producer control.
MP4-SA Maps to Modern Music Production
[Diagram labels: “The Program” (synthesis algorithms, effects “boxes”, mixers); musical performance and mix-down control information; “The Decoder” (sound rendering); Network (premium on low bandwidth).]

Key Application: Music Production (continued)
MP4-SA Maps to Modern Music Production
- Ideal for collaborative productions, remixes, and...
[Diagram labels: “The Program” (synthesis algorithms, effects “boxes”, mixers); musical performance and mix-down control information; “The Decoder” (sound rendering); File System; Standard Framework.]

Key Application: Music Performance
- Music performance requires dynamic control.
- True interactivity requires parameterized sounds.
- Musicians control instruments and effects with interactive controllers.
- Control could be indirect and remote (e.g., games).
MP4-SA Enables Networked Music Performance
[Diagram labels: Network (premium on low bandwidth); “The Decoder” (sound rendering), shown twice.]

MPEG-4 Structured Audio:
- A binary file format that encodes:
  - The programming language SAOL (pronounced "sail").
  - The musical score language SASL.
  - Legacy support for MIDI.
  - Audio sample data.
- The result is normative: an MP4-SA file will sound identical on all compliant decoders. This is different from MIDI files.

Why SAOL and MP4-SA? Why not Java?
- Musical performances have temporal structure that changes over several timescales:
  - Sample-by-sample: 10s of µs
  - Amplitude and timbre envelopes: 10s of ms
  - Note-by-note: 100s of ms
- Writing sound generation code in a conventional language results in code dominated by time-scale management.
- Such code is hard to maintain and hard to optimize.

Time management is built into SAOL.
- A SAOL program executes by moving a simulated clock forward in time, performing calculations along the way in a synchronous fashion.
- Work is scheduled to happen:
  - at the a-rate (the audio sample rate)
  - at the k-rate (envelope control rate)
  - at the i-rate (rate for new notes)
- Language variables are typed as a/k/i-rate.
- A language statement is scheduled based on the rate of the variables it contains.
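To make the three rates concrete, here is a minimal Python sketch of a simulated clock that runs work once per note, once per control period, and once per sample. The rate values, the `run_note` name, and the linear fade-out envelope are assumptions for illustration, not part of SAOL. The nested loops are the kind of time-scale bookkeeping that SAOL takes over from the programmer.

```python
import math

S_RATE = 8000   # assumed audio sample rate (a-rate), in Hz
K_RATE = 100    # assumed control rate (k-rate), in Hz

def run_note(freq, dur):
    """Run one note on a simulated clock; return the samples and the pass counts."""
    counts = {"i": 0, "k": 0, "a": 0}

    # i-rate: executed once, when the note starts
    phase_inc = 2 * math.pi * freq / S_RATE
    counts["i"] += 1

    out = []
    phase = 0.0
    samples_per_k = S_RATE // K_RATE
    for k in range(int(dur * K_RATE)):
        # k-rate: executed once per control period (a linear fade-out envelope)
        env = 1.0 - k / (dur * K_RATE)
        counts["k"] += 1
        for _ in range(samples_per_k):
            # a-rate: executed once per audio sample
            out.append(env * math.sin(phase))
            phase += phase_inc
            counts["a"] += 1
    return out, counts

samples, counts = run_note(freq=440.0, dur=0.5)
print(counts)  # {'i': 1, 'k': 50, 'a': 4000}
```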

SAOL, SASL, and Scheduling:
- Sound creation in MP4-SA can be compared to a musician playing notes on an instrument.
- A SAOL subprogram (called an instr or instrument) serves as the instrument.
- SASL commands (called score lines) act to play notes on SAOL instruments.
- Many instances of a SAOL instr can be active at one time, making sounds corresponding to notes launched by different score lines in a SASL file.

An example:
- SAOL instrument tone, which plays a gated sine wave. (SAOL code on the next slide.)
- This SASL file plays a melody on tone:
  0.5 tone tone tone tone tone tone tone end
- Callouts on the score lines mark: when the instance is launched, how long the instrument runs, and the instance parameters (note number, loudness).
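The score-line fields named in the callouts can be sketched as a small parser. This is an illustration only: it assumes each line reads "launch time, instrument name, duration, parameters" and that the score stops at a line of the form "time end". Every numeric value in `SCORE` is invented, and `parse_score` and `active_at` are hypothetical helper names.

```python
# Hypothetical score; the numeric values are invented for illustration.
SCORE = """
0.5 tone 1.0 60 0.25
1.0 tone 1.0 64 0.25
1.5 tone 1.0 67 0.25
3.0 end
"""

def parse_score(text):
    """Split each score line into (start, instr, dur, params); 'end' stops the score."""
    notes, end_time = [], None
    for line in text.strip().splitlines():
        fields = line.split()
        if fields[1] == "end":
            end_time = float(fields[0])
            break
        start, instr, dur = float(fields[0]), fields[1], float(fields[2])
        notes.append((start, instr, dur, [float(p) for p in fields[3:]]))
    return notes, end_time

def active_at(notes, t):
    """Return the notes whose instrument instances are running at time t."""
    return [n for n in notes if n[0] <= t < n[0] + n[2]]

notes, end_time = parse_score(SCORE)
print(len(active_at(notes, 1.25)), "instances active at t = 1.25")
```

With overlapping durations, more than one instance of the same instrument is running at once, matching the point that many instances of an instr can be active at one time.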

SAOL code for tone

    instr tone (note, loudness)
    {
      ivar a;          // sets osc f
      ksig env;        // env output
      asig x, y;       // osc state
      asig init;

      a = 2*sin(3.14159*cpsmidi(note)/s_rate);   // a = 2 sin(pi f / s_rate)
      env = kline(0, 0.1, 0.5, dur-0.2, 0.5, 0.1, 0);

      if (init == 0)   // first a-pass only
      {
        x = loudness;
        init = 1;
      }

      x = x - a*y;     // the FLOPS happen in
      y = y + a*x;     // these 3 statements
      output(y*env);   // creates audio output
    }                  // end of instr tone
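The x/y recurrence in the instr can be checked outside SAOL. The Python sketch below is a port under stated assumptions: the factor inside sin() is taken to be π (the value for which this recurrence oscillates at exactly cpsmidi(note) Hz), the sample rate is assumed to be 44100 Hz, `cpsmidi` is reimplemented with the usual MIDI tuning (note 69 = 440 Hz), and the envelope is left out.

```python
import math

S_RATE = 44100  # assumed sample rate

def cpsmidi(note):
    """MIDI note number to frequency in Hz (note 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)

def tone(note, loudness, n_samples):
    """Python port of the x/y recurrence in the instr (envelope omitted)."""
    a = 2.0 * math.sin(math.pi * cpsmidi(note) / S_RATE)
    x, y = loudness, 0.0
    out = []
    for _ in range(n_samples):
        x = x - a * y      # uses the previous y
        y = y + a * x      # uses the x just updated
        out.append(y)
    return out

samples = tone(note=69, loudness=0.5, n_samples=S_RATE)  # one second of note 69

# Count upward zero crossings as a rough frequency estimate in Hz.
crossings = sum(1 for p, q in zip(samples, samples[1:]) if p < 0.0 <= q)
print(crossings)
```

The crossing count should land at about 440 for note 69, and the output stays bounded near the loudness value, which is why two multiply-adds per sample are enough to run the oscillator.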

SAOL Features
- Rate semantics: i/k/a-rate execution
- Vector arithmetic: e.g., A = B + C is equivalent to: for i = 1..n, A[i] = B[i] + C[i]
- All arithmetic is floating-point.
- Extensive built-in audio function library: signal generators, table operators, pitch converters, filters, FFT, sample rate conversion, effects, ...

Sfront - a SAOL-to-C translator
- Converts MP4-SA files to an ANSI C program that, when executed, produces audio.
- Runs on UNIX, Windows, MacOS.
- Under Linux, supports real-time MIDI input, real-time audio input and output, and MIDI over RTP (Real-time Transport Protocol).
- Handles SAOL, SASL, MIDI, uncompressed samples.
[Diagram: foo.mp4 (SAOL, SASL, MIDI, uncompressed samples) -> sfront -> sa.c]

Generator Techniques
- Much of the SA standard describes a library:
  - 104 core opcodes (e.g., pow(), allpass(), reverb())
  - 16 wave table generators (e.g., harm, spline, random)
- Sfront optimizes the code produced for each library element instance based on the invocation attributes:
  - rate, width, size, constancy, integral nature of the parameters, number of parameters

Conclusions
- MP4-SA puts emphasis on sound synthesis methods that can be described in a small amount of space.
  - Physical modeling: good
  - Sampling natural instruments: bad
- If models are chosen carefully, compression ratios of 100 to 10,000 are possible.
- MP4-SA specifies that a decoder produces audio that “sounds identical” to computing the program accurately.
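To put the quoted ratios in bitrate terms, the short Python calculation below applies them to an uncompressed PCM baseline. The baseline (44.1 kHz, 16-bit, stereo) is an assumption chosen for illustration and is not stated in the slides.

```python
# Assumed baseline: uncompressed CD-style PCM (not stated in the slides).
SAMPLE_RATE = 44100   # samples per second
BITS = 16             # bits per sample
CHANNELS = 2

pcm_kbps = SAMPLE_RATE * BITS * CHANNELS / 1000.0   # 1411.2 kbit/s

def compressed_kbps(ratio):
    """Bitrate left after compressing the PCM baseline by the given ratio."""
    return pcm_kbps / ratio

for ratio in (100, 10_000):
    print(ratio, round(compressed_kbps(ratio), 3), "kbit/s")
```

Under that assumption, a ratio of 100 corresponds to roughly 14 kbit/s and a ratio of 10,000 to roughly 0.14 kbit/s.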