CS 414 - Spring 2014 CS 414 – Multimedia Systems Design Lecture 15 – MP3 and MP4 Audio Klara Nahrstedt Spring 2014.

Presentation transcript:

CS 414 – Multimedia Systems Design
Lecture 15 – MP3 and MP4 Audio
Klara Nahrstedt, Spring 2014

Administrative
- HW1: posted on February 24 (Monday)
- HW1: deadline on March 3 (Monday)
- Midterm: March 7 (Friday), in class

Outline
- H.264/H.265 – Arithmetic Coding
- MP3 Audio Encoding
- MP4 Audio
- Reading:
  - Media Coding book, Section –
  - Recommended paper on MP3: Davis Pan, “A Tutorial on MPEG/Audio Compression”, IEEE Multimedia, pp. 60-74, 1995
  - Recommended book on JPEG/MPEG audio/video fundamentals: Haskell, Puri, Netravali, “Digital Video: An Introduction to MPEG-2”, Chapman and Hall, 1996

H.264/H.265 Entropy Encoding (Limitations of Huffman Coding)
- Diverges from the lower limit (the entropy) when the probability of a particular symbol becomes high, because it always uses an integral number of bits per symbol
- The code book must be sent with the data, which lowers overall efficiency
- The frequency distribution must be determined, and it must remain stable over the data set
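The first limitation can be illustrated with a small calculation. The sketch below is not taken from the lecture: the function names and the 0.95/0.05 source are assumptions chosen for illustration. It compares the entropy of a highly skewed two-symbol source with the average Huffman code length, which cannot drop below 1 bit per symbol.

```python
import heapq
import math


def entropy_bits(probs):
    """Shannon entropy in bits per symbol (the lower limit for any code)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def huffman_code_lengths(probs):
    """Return the Huffman code length of each symbol (illustrative helper)."""
    if len(probs) == 1:
        return [1]
    # Heap entries: (probability, unique tie-breaker, symbols in this subtree)
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    tie = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        # Merging two subtrees adds one bit to every symbol inside them.
        for s in s1 + s2:
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, tie, s1 + s2))
        tie += 1
    return lengths


def average_length(probs):
    """Average Huffman code length in bits per symbol."""
    return sum(p * n for p, n in zip(probs, huffman_code_lengths(probs)))


# Hypothetical skewed source: one symbol has probability 0.95.
skewed = [0.95, 0.05]
print(f"entropy      = {entropy_bits(skewed):.3f} bits/symbol")
print(f"Huffman mean = {average_length(skewed):.3f} bits/symbol")
```

For this source the Huffman code spends a full bit on every symbol, while the entropy is well under a third of a bit. That gap is what arithmetic coding is meant to close.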

H.264/H.265 Entropy Coding (Arithmetic Coding)
- Each symbol is coded by considering the prior data
- Encoded data must be read from the beginning; no random access is possible
- Each real number (< 1) is represented as a binary fraction:
  - 0.5 = 2^-1 (binary fraction = 0.1)
  - 0.25 = 2^-2 (binary fraction = 0.01)
  - 0.625 = 2^-1 + 2^-3 (binary fraction = 0.101)
  - …
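A minimal sketch of these two ideas follows. It is a textbook-style illustration, not the context-adaptive coder actually used in H.264/H.265; the function names and the two-symbol alphabet are assumptions. The first function writes a number below 1 as a binary fraction. The second narrows the interval [0, 1) symbol by symbol, which shows why each symbol depends on all prior data.

```python
from fractions import Fraction


def binary_fraction(value, max_bits=16):
    """Write 0 <= value < 1 as a binary fraction string, e.g. 0.625 -> '0.101'."""
    digits = []
    for _ in range(max_bits):
        value *= 2
        if value >= 1:
            digits.append("1")
            value -= 1
        else:
            digits.append("0")
        if value == 0:
            break
    return "0." + "".join(digits)


def encode_interval(message, probs):
    """Narrow [0, 1) once per symbol; any number in the result identifies the message.

    probs maps each symbol to its probability (exact Fractions avoid rounding).
    """
    # Assign each symbol a sub-range of [0, 1): (start, width).
    ranges = {}
    start = Fraction(0)
    for symbol, p in probs.items():
        ranges[symbol] = (start, p)
        start += p

    low, width = Fraction(0), Fraction(1)
    for symbol in message:
        sym_start, sym_width = ranges[symbol]
        # The new interval lies inside the old one, so it depends on all prior symbols.
        low += width * sym_start
        width *= sym_width
    return low, low + width


print(binary_fraction(Fraction(5, 8)))  # 0.101

# Hypothetical alphabet with two equally likely symbols.
low, high = encode_interval("ab", {"a": Fraction(1, 2), "b": Fraction(1, 2)})
print(low, high, binary_fraction(low))
```

Exact fractions keep the sketch simple. Practical coders use fixed-precision integer arithmetic with renormalization instead.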

AUDIO COMPRESSION

Why Audio Compression is Needed
- Data rate = sampling rate × quantization bits × channels (+ control information)
- For example (digital audio): 44,100 Hz; 16 bits; 2 channels
  - generates about 1.4 Mbit of data per second; 84 Mbit per minute; 5 Gbit per hour
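The arithmetic can be checked with a few lines of code. This sketch assumes a sampling rate of 44,100 Hz, which is the value consistent with the quoted figure of about 1.4 Mbit per second for 16 bits and 2 channels. The function name is illustrative and control information is ignored.

```python
def pcm_data_rate_bps(sampling_rate_hz, bits_per_sample, channels):
    """Uncompressed PCM data rate in bits per second (no control information)."""
    return sampling_rate_hz * bits_per_sample * channels


# Assumed parameters: 44,100 Hz, 16 bits per sample, 2 channels.
rate = pcm_data_rate_bps(44_100, 16, 2)
print(f"per second: {rate / 1e6:.2f} Mbit")
print(f"per minute: {rate * 60 / 1e6:.1f} Mbit")
print(f"per hour:   {rate * 3600 / 1e9:.2f} Gbit")
```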

MPEG-1 Audio
- Lossy compression of audio
- In the late 1980s, ISO's MPEG group started to standardize audio coding for:
  - TV broadcasting
  - Use of audio on CD-ROM (later DVD)
- MPEG-1 Audio – 1992
- MPEG-2 Audio
- MPEG-1 Audio Layers I, II, III

MPEG-1 Audio Encoding
- Characteristics:
  - Precision: 16 bits
  - Sampling frequency: 32 kHz, 44.1 kHz, 48 kHz
  - 3 compression layers: Layer 1, Layer 2, Layer 3 (MP3)
    - Layer 3: target 64 kbps
    - Layer 2: target 128 kbps
    - Layer 1: target 192 kbps

MPEG-1 Audio Layer II
- Called MP2
- Dominant standard for audio broadcasting
  - DAB digital radio and DVB digital television
- Came out of the MUSICAM codec
  - MUSICAM audio coding is the basis for MPEG-1 and MPEG-2 audio
- Sampling rates: 32, 44.1, 48 kHz
- Bit rates: 32, 48, 56, 64, 80, 96, … 384 kbps
- Format: mono, stereo, dual channel, …
- MP2 is a sub-band audio encoder in the time domain

MPEG-1 Audio Layer III
- MPEG-1 Layer III is called the MP3 format
  - Popular for Internet applications
  - The goal is to compress to 128 kbps, but higher or lower bit rates can be used, with correspondingly higher or lower quality
  - Utilizes psychoacoustics, the scientific study of sound perception

MPEG Audio Encoding Steps

MPEG Audio Filter Bank
- The filter bank divides the input into multiple sub-bands (32 equal-width frequency sub-bands)
- Each sub-band i is defined by its filter output sample at time t, computed from:
  - C[n]: one of 512 coefficients
  - x[n]: an audio input sample from a 512-sample buffer
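The defining equation did not survive in this transcript. In Davis Pan's tutorial, listed in the reading, the sub-band output is given as s_t[i] = sum over k = 0..63 and j = 0..7 of M[i][k] * C[k + 64j] * x[k + 64j], with M[i][k] = cos((2i + 1)(k - 16)π / 64). The sketch below evaluates that formula directly. The window produced by `placeholder_window` is a stand-in invented for this example and is not the coefficient table from the standard, so the outputs are only structurally representative.

```python
import math

NUM_SUBBANDS = 32
BUFFER_SIZE = 512


def analysis_matrix():
    """M[i][k] = cos((2i + 1)(k - 16) * pi / 64) for 32 sub-bands and k = 0..63."""
    return [
        [math.cos((2 * i + 1) * (k - 16) * math.pi / 64) for k in range(64)]
        for i in range(NUM_SUBBANDS)
    ]


def placeholder_window():
    """Hypothetical smooth window; the real C[n] table comes from the standard."""
    return [math.sin(math.pi * (n + 0.5) / BUFFER_SIZE) / BUFFER_SIZE
            for n in range(BUFFER_SIZE)]


def subband_samples(x, C):
    """Compute one output sample for each of the 32 sub-bands.

    x: the 512 most recent input samples; C: 512 window coefficients.
    """
    if len(x) != BUFFER_SIZE or len(C) != BUFFER_SIZE:
        raise ValueError("x and C must each hold 512 values")
    M = analysis_matrix()
    # Window the buffer and fold it into 64 partial sums (the inner sum over j).
    z = [sum(C[k + 64 * j] * x[k + 64 * j] for j in range(8)) for k in range(64)]
    # Modulate the partial sums into 32 sub-band samples (the outer sum over k).
    return [sum(M[i][k] * z[k] for k in range(64)) for i in range(NUM_SUBBANDS)]


# Example input: a 1 kHz tone sampled at 44.1 kHz (illustrative values).
tone = [math.sin(2 * math.pi * 1000 * n / 44_100) for n in range(BUFFER_SIZE)]
samples = subband_samples(tone, placeholder_window())
print(len(samples))
```

In an encoder this computation runs once for every 32 new input samples, so the output rate of each sub-band is 1/32 of the input rate.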

MPEG Audio Psycho-acoustic Model
- Compresses by removing acoustically irrelevant parts of audio signals
- Takes advantage of the human auditory system's inability to hear quantization noise under auditory masking
  - Auditory masking occurs whenever the presence of a strong audio signal makes a temporal or spectral neighborhood of weaker audio signals imperceptible

Loudness and Pitch (Review of Psychoacoustic Effects)
- Humans are more sensitive to loudness at mid frequencies than at other frequencies
  - Intermediate frequencies: 500 Hz to 5,000 Hz
  - Human hearing range: 20 Hz to 20,000 Hz
- The perceived loudness of a sound changes with the frequency of that sound
  - The basilar membrane reacts more to intermediate frequencies than to other frequencies

Masking Effects (Review of Psychoacoustic Effects)
- Frequency masking
- Temporal masking

MPEG/audio divides the audio signal into frequency sub-bands that approximate critical bands. Each sub-band is then quantized according to the audibility of quantization noise within that band.

MPEG Audio Bit Allocation
- This process determines the number of code bits allocated to each sub-band, based on information from the psycho-acoustic model
- Algorithm:
  1. Compute the mask-to-noise ratio: MNR = SNR - SMR (all in dB), where SMR is the signal-to-mask ratio from the psycho-acoustic model. The standard provides tables that give estimates of the SNR resulting from quantizing to a given number of quantizer levels
  2. Get the MNR for each sub-band
  3. Search for the sub-band with the lowest MNR
  4. Allocate code bits to this sub-band. Once the sub-band has been allocated more code bits, look up the new estimate of its SNR and repeat from step 1 until no more code bits can be allocated
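The loop above can be sketched as a greedy allocation. The sketch makes several simplifying assumptions that are not in the lecture: the SNR is estimated as roughly 6.02 dB per allocated bit instead of being read from the standard's tables, each step of the budget buys one extra bit of resolution for one sub-band, and the names `allocate_bits`, `max_bits`, and `snr_per_bit_db` are hypothetical.

```python
def allocate_bits(smr_db, bit_budget, max_bits=15, snr_per_bit_db=6.02):
    """Greedy bit allocation: keep giving a bit to the sub-band with the lowest MNR.

    smr_db: signal-to-mask ratio per sub-band, from the psycho-acoustic model.
    bit_budget: number of one-bit allocation steps available (simplified).
    """
    bits = [0] * len(smr_db)

    def mnr(i):
        # MNR = SNR - SMR, with SNR approximated from the bits allocated so far.
        return bits[i] * snr_per_bit_db - smr_db[i]

    remaining = bit_budget
    while remaining > 0:
        candidates = [i for i in range(len(bits)) if bits[i] < max_bits]
        if not candidates:
            break  # every sub-band is already at its maximum resolution
        worst = min(candidates, key=mnr)  # sub-band where noise is most audible
        bits[worst] += 1
        remaining -= 1
    return bits


# Hypothetical SMR values in dB for three sub-bands.
print(allocate_bits([20.0, 5.0, -10.0], bit_budget=4))
```

The sub-band with SMR of -10 dB receives no bits in this example because its signal already lies below the masking threshold, which is exactly the saving the psycho-acoustic model is meant to provide.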

Audio Quality
- Bit rate
  - With too low a bit rate, we get compression artifacts:
    - Ringing
    - Pre-echo: a sound is heard before it occurs. It is most noticeable in impulsive sounds from percussion instruments such as cymbals
  - These artifacts occur in transform-based audio compression algorithms
- Quality of the encoder and encoding parameters
  - Constant bit rate encoding
  - Variable bit rate encoding

MP3 Audio Format

MPEG Audio Comments
- A precision of 16 bits per sample is needed to get a good SNR
- The noise we get is quantization noise from the digitization process
- For each added bit, we get about 6 dB better SNR
- The masking effect means that we can raise the noise floor around a strong sound, because the noise will be masked away
- Raising the noise floor is the same as using fewer bits, and using fewer bits is the same as compression
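The 6 dB figure follows from the fact that each extra bit doubles the number of quantizer levels, and a doubling of amplitude resolution corresponds to 20·log10(2) ≈ 6.02 dB. A short check, with an illustrative function name:

```python
import math


def quantization_snr_gain_db(bits):
    """SNR improvement in dB from `bits` bits of uniform quantization resolution."""
    return 20 * math.log10(2 ** bits)


print(f"1 bit:   {quantization_snr_gain_db(1):.2f} dB")
print(f"16 bits: {quantization_snr_gain_db(16):.2f} dB")
```

Read in reverse, every bit removed from a sub-band raises its noise floor by about 6 dB, which is acceptable as long as the noise stays below the masking threshold.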

MPEG-4 Audio (AAC's Improvements over MP3)
- Advanced Audio Coding (AAC) in MPEG-4
- More sample frequencies (8-96 kHz)
- Arbitrary bit rates and variable frame length
- Higher efficiency and a simpler filter bank
  - Uses a pure MDCT (modified discrete cosine transform)
  - The MDCT is also used in Windows Media Audio

MPEG-4 Audio
- Variety of applications:
  - General audio signals
  - Speech signals
  - Synthetic audio
  - Synthesized speech (structured audio)

MPEG-4 Audio Part 3
- Includes a variety of audio coding technologies:
  - Lossy speech coding (e.g., CELP, code-excited linear prediction)
  - General audio coding (AAC)
  - Lossless audio coding
  - Text-to-speech interface
  - Structured audio (e.g., MIDI)

MPEG-4 Part 14
- Called MP4, with the extension .mp4
- Multimedia container format
- Stores digital video and audio streams and allows streaming over the Internet
- Container or wrapper format
  - A meta-file format whose specification describes how different data elements and metadata coexist in a computer file

Conclusion
- MPEG Audio is an integral part of the MPEG standard, to be considered together with video
- MPEG-4 Audio represents a major extension of MPEG-1 Audio in terms of capabilities

ADDITIONAL SLIDES

Criteria for a Good Standard
- Achieve the desired outcome
- Be comprehensible
- Allow efficient implementation
- Support competition
- Give benchmark tests
- Be supported by industry
- Be good for end users
- …
- Two models:
  - implement first, then standardize
  - standardize first, then implement

History of MPEG Audio – MP3
- The first psychoacoustic masking codec was proposed in 1979 at AT&T Bell Labs, Murray Hill
- MP3 is based on OCF (optimum coding in the frequency domain) and PXFM (perceptual transform coding)
- MPEG-1 Audio Layer III: public release 1993
- MPEG-2 Audio Layer III: public release 1995

MPEG Audio – MP3
- mp3.com: offered thousands of MP3s created by independent artists for free
- 1999: Napster, MP3 peer-to-peer file sharing
- Problem: copyright infringement
- Authorized services: Amazon.com, Rhapsody, Juno Records, …

Fletcher-Munson Contours
- Each contour represents an equal perceived loudness
- Perception sensitivity (loudness) is not linear across all frequencies and intensities

MPEG-4 Audio (Successor of MP3)
- Advanced Audio Coding (AAC), now part of MPEG-4 Audio
- Supports up to 48 full-bandwidth audio channels
- Default audio format for iPhone, iPad, Nintendo, PlayStation, Nokia, Android, BlackBerry
- Introduced in 1997 as MPEG-2 Part 7
- Updated in 1999 and included in MPEG-4

MPEG-4 Audio
- Bit rates of 2-64 kbps
- Scalable for variable rates
- MPEG-4 defines a set of coders:
  - Parametric coding techniques: low bit rates of 2-6 kbps, 8 kHz sampling frequency
  - Code Excited Linear Prediction: medium bit rates, 8 and 16 kHz sampling rates
  - Time-frequency techniques: high-quality audio at 16 kbps and higher bit rates, sampling rate > 7 kHz