Understanding the Internet Low Bit Rate Coder

Slides:



Advertisements
Similar presentations
Wideband Speech Coding for CDMA2000® Systems
Advertisements

Developement and Implementation of an MPEG1 Layer III Decoder on x86 and TMS320C6711 platforms Braidotti Enrico (Farina Simone)
N Team 15: Final Presentation Peter Nyberg Azadeh Bararsani Adie Tong N N multicodec minisip.
Speech Coders – a VoIP perspective Roar Hagen CTO SIP/
High Performance 32 Channel ADPCM Codec File Number Here ® LogiCORE Products.
Ranko Pinter Simoco Digital Systems
Speech in Multimedia Hao Jiang Computer Science Department Boston College Oct. 9, 2007.
CELLULAR COMMUNICATIONS 5. Speech Coding. Low Bit-rate Voice Coding  Voice is an analogue signal  Needed to be transformed in a digital form (bits)
Yi Liang Department of Electrical Engineering Stanford University April 19, 2000 Loss Recovery and Adaptive Playout Control for Packet Voice Communications.
© 2006 AudioCodes Ltd. All rights reserved. AudioCodes Confidential Proprietary Signal Processing Technologies in Voice over IP Eli Shoval Audiocodes.
1 © NOKIA GPP2 Wideband Codec Presentation Interoperable Wideband Speech Coder for CDMA2000 and WCDMA Systems W-VRM: Wideband Variable-Rate Multi-Mode.
Speech Coding Nicola Orio Dipartimento di Ingegneria dell’Informazione IV Scuola estiva AISV, 8-12 settembre 2008.
A STUDY OF DESIGN COMPROMISES FOR SPEECH CODERS IN PACKET NETWORKS 1.INTRODUCTION In voice over packet networks, the coding gain achieved by prediction-based.
Understanding the Internet Low Bit Rate Coder Jan Linden Vice President of Engineering Global IP Sound Presented by Jan Skoglund Sr. Research Scientist.
Voice over the Internet (the basics) CS 7270 Networked Applications & Services Lecture-2.
Overview of Adaptive Multi-Rate Narrow Band (AMR-NB) Speech Codec
VoIP on the iPhone: Imagine the Possibilities Jan Linden, VP of Engineering.
Digital Voice Communication Link EE 413 – TEAM 2 April 21 st, 2005.
Encoder and Decoder Optimization for Source-Channel Prediction in Error Resilient Video Transmission Hua Yang and Kenneth Rose Signal Compression Lab ECE.
Rate-Distortion Optimized Motion Estimation for Error Resilient Video Coding Hua Yang and Kenneth Rose Signal Compression Lab ECE Department University.
© 2010 Universität Tübingen, WSI-ICS Patrick Schreiner, Christian Hoene Universität Tübingen WSI-ICS 26. July 2010 Rate Adaptation for the IETF IIAC.
Cisco Unified Communications Manager (CUCM)
Secure Steganography in Audio using Inactive Frames of VoIP Streams
Computer Networks: Multimedia Applications Ivan Marsic Rutgers University Chapter 3 – Multimedia & Real-time Applications.
Develop and Implementation of the Speex Vocoder on the TI C64+ DSP
UNIVERSITÉ DE SHERBROOKE - Philippe G OURNAY Senior Research Engineer VoiceAge Corporation University of Sherbrooke François R OUSSEAU, Roch L EFEBVRE.
Page 0 of 23 MELP Vocoders Nima Moghadam SN#: Saeed Nari SN#: Supervisor Dr. Saameti April 2005 Sharif University of Technology.
Audio Henning Schulzrinne Dept. of Computer Science Columbia University Fall 2003.
Speech Coding Submitted To: Dr. Mohab Mangoud Submitted By: Nidal Ismail.
The Way Forward Factors Driving Video Conferencing Dr. Jan Linden, VP of Engineering Global IP Solutions.
1 RTP Multiplexing using Tunnels (TCRTP) Bruce Thompson Tmima Koren Cisco Systems Inc.
Rate-distortion Optimized Mode Selection Based on Multi-channel Realizations Markus Gärtner Davide Bertozzi Classroom Presentation 13 th March 2001.
1 RaptorG Forward Error Correction Scheme for Object Delivery draft-luby-rmt-bb-fec-raptorg-object-00 (update to this to be officially submitted soon)
ECE 5525 Osama Saraireh Fall 2005 Dr. Veton Kepuska
VOCODERS. Vocoders Speech Coding Systems Implemented in the transmitter for analysis of the voice signal Complex than waveform coders High economy in.
1.INTRODUCTION The use of the adaptive codebook (ACB) in CELP-like speech coders allows the achievement of high quality speech, especially for voiced segments.
Fast motion estimation and mode decision for H.264 video coding in packet loss environment Li Liu, Xinhua Zhuang Computer Science Department, University.
Page 1 The department of Information & Communications Engineering Dong-uk, kim A Survey of Packet Loss Recovery Techniques for Streaming.
Comparisons of FEC and Codec Robustness on VoIP Quality and Bandwidth Efficiency Wenyu Jiang Henning Schulzrinne Columbia University ICN 2002, Atlanta,
Present document contains informations proprietary to France Telecom. Accepting this document means for its recipient he or she recognizes the confidential.
Minjie Xie, Dave Lindbergh, and Peter Chu
Chapter 20 Speech Encoding by Parameters 20.1 Linear Predictive Coding (LPC) 20.2 Linear Predictive Vocoder 20.3 Code Excited Linear Prediction (CELP)
Voice Sampling. Sampling Rate Nyquist’s theorem states that a signal can be reconstructed if it is sampled at twice the maximum frequency of the signal.
COMPARATIVE STUDY OF HEVC and H.264 INTRA FRAME CODING AND JPEG2000 BY Under the Guidance of Harshdeep Brahmasury Jain Dr. K. R. RAO ID MS Electrical.
CELP / FS-1016 – 4.8kbps Federal Standard in Voice Coding
Institut für Nachrichtengeräte und Datenverarbeitung Prof. Dr.-Ing. P. Vary On the Use of Artificial Bandwidth Extension Techniques in Wideband Speech.
UNIT V. Linear Predictive coding With the advent of inexpensive digital signal processing circuits, the source simply analyzing the audio waveform to.
2nd Workshop on Wideband Speech Quality - June nd Workshop on Wideband Speech Quality in Terminals and Networks: Assessment and Prediction 22nd.
Presented by: Class Presentation of Custom DSP Implementation Course on: This is a class presentation. All data are copy rights of their respective authors.
Codec 2 ● open source speech codec ● low bit rate (2400 bit/s and below) ● applications include digital speech for HF and VHF radio ● fills gap in open.
Motivation ● The (Ham) world needs an open source, patent free speech codec at bit rates of less than 5000 bit/s ● I know how to build one!
Codec 2 open source speech codec
CSI-447: Multimedia Systems
Scalable Speech Coding for IP Networks
Scalable Speech Coding for IP Networks: Beyond iLBC
Digital Communications Chapter 13. Source Coding
Vocoders.
Wenyu Jiang Henning Schulzrinne Columbia University
Optimal Mode Selection For Robust Video Transmission
Signaling Compression for Push-to-talk over Cellular (PoC)
Context-based Data Compression
Audio Henning Schulzrinne Dept. of Computer Science
Mohamed Chibani, Roch Lefebvre and Philippe Gournay
Speech and Audio Processing
ON THE ARCHITECTURE OF THE CDMA2000® VARIABLE-RATE MULTIMODE WIDEBAND (VMR-WB) SPEECH CODING STANDARD Milan Jelinek†, Redwan Salami‡, Sassan Ahmadi*, Bruno.
Scalable Speech Coding for IP Networks: Beyond iLBC
Packet loss concealment using audio morphing
Standards Presentation ECE 8873 – Data Compression and Modeling
MPEG-1 Overview of MPEG-1 Standard
Presentation transcript:

Understanding the Internet Low Bit Rate Coder Jan Linden Vice President of Engineering Global IP Sound Presented by Jan Skoglund Sr. Research Scientist

iLBC – Background info Development started in Summer 2000 Contributed to IETF as an internet draft in Feb 2002 Accepted as work item in IETF AVT group Mar 2002 Contributed to CableLabs RFP in June 2002 Improved version to IETF, Fall 2002 ECR submitted in May 2003 Support for 20 ms frames spring 2003 Successful interoperability events Past Working Group Last call in IETF Jan 2004 April 2004 added as a mandatory codec in PacketCable 1.1 December 2004 IETF process finalized (became Experimental RFC 3951 and 3952)

Design Principles Free of 3rd party IPR Packet independency extensive experience in speech coding patents by design team patent and research situation monitored since 2000 has been public in IETF since March 2002 and reviewed by independent speech coding researchers Packet independency no coding interdependency between frames increased packet loss robustness suitable for IP networks Linear Predictive Coding well know highly successful coding model novel coding techniques of residual signal

iLBC Features Sampling Rate: 8 kHz Supports 30 ms and 20 ms speech frame modes Bitrate 13.3 kbps (399 bits, packetized in 50 bytes) for 30 ms frames 15.2 kbps (303 bits, packetized in 38 bytes) for 20 ms frames Computational complexity (TI C54x) 30 ms frames: appr. 18 MIPS/channel 20 ms frames: appr. 15 MIPS/channel Memory 400 Words/channel state memory (RAM) less than 4 kWords table memory (ROM) Stack and program memory requirements similar to other low bit rate codecs (e.g. G.729A)

The Core iLBC method Start state encoding Gain-shape waveform matching forward in time Gain-shape waveform matching backward in time Pitch enhancement Packet loss concealment

iLBC Encoding Incoming speech Packets to network

iLBC Decoding Decoded speech Packets from network

20 ms vs 30 ms sub-blocks 20 ms frame size mode - 4 sub-blocks with the total length of 160 samples 30 ms frame size mode - 6 sub-blocks with the total length of 240 samples

20 ms vs 30 ms mode – bit allocation 240 samples encoded to 399 bits = 13.3 kbit/s (50 oct) 160 samples encoded to 303 bits = 15.2 kbit/s (38 oct) Parameter Bits LPC Start state position Start state scale Start state samples Shapes Gains 40 4 6 174 115 60 Total 399 Parameter Bits LPC Start state position Start state scale Start state samples Shapes Gains 20 3 6 171 67 36 Total 303

Advantage over CELP original iLBC g729 g723 State recovery PLC

iLBC Performance vs G.729A & G.723.1 old version from Winter 2002 Source: Dynastat

iLBC Performance Equivalent or slightly lower performance than G.729E in clean. Improved robustness to packet loss compared to G.729E. iLBC showed better than G.728 in other testing.

Implementation Fixed Point Source Floating Point Source DSP Source Significant signal processing skills necessary Quality / efficiency trade-off ~ 6 Months Optimization skills ~ 4 Months

iLBC Specifications Available in floating point , fixed point ANSI C, TIc54x, TIc55x, TIc64x,… Supports 20 and 30 ms speech frames Algorithmic delay: Same as frame size Sampling Rate: 8 kHz Bit rate: 13.333 kpbs for 30ms and 15.2 kpbs for 20ms Product Frame size Complexity (max) Program Memory Data Memory Static Data Memory Dynamic Encoder Decoder Fix Per channel GIPS iLBC TIc54x 20 ms 11.5 MIPS 4.1 MIPS 17.6 2.4 1.4 2.3 30 ms 13.5 MIPS 4.4 MIPS GIPS iLBC TIc55x 7.5 MIPS 3.0 MIPS 15.6 8.8 MIPS 3.1 MIPS Memory in kWord16