Understanding the Internet Low Bit Rate Coder Jan Linden Vice President of Engineering Global IP Sound Presented by Jan Skoglund Sr. Research Scientist.

Presentation transcript:

Understanding the Internet Low Bit Rate Coder
Jan Linden, Vice President of Engineering, Global IP Sound
Presented by Jan Skoglund, Sr. Research Scientist, Global IP Sound

iLBC – Background info
- Development started in summer 2000
- Contributed to IETF as an Internet-Draft in Feb 2002
- Accepted as work item in IETF AVT group, Mar 2002
- Contributed to CableLabs RFP in June 2002
- Improved version to IETF, fall 2002
- ECR submitted in May 2003
- Support for 20 ms frames, spring 2003
- Successful interoperability events
- Passed Working Group Last Call in IETF, Jan 2004
- April 2004: added as a mandatory codec in PacketCable 1.1
- December 2004: IETF process finalized (became Experimental RFC 3951 and 3952)

Design Principles
- Free of 3rd party IPR
  o extensive experience in speech coding patents by design team
  o patent and research situation monitored since 2000
  o has been public in IETF since March 2002 and reviewed by independent speech coding researchers
- Packet independence
  o no coding interdependency between frames
  o increased packet loss robustness
  o suitable for IP networks
- Linear Predictive Coding
  o well-known, highly successful coding model
  o novel coding techniques for the residual signal

iLBC Features
- Sampling rate: 8 kHz
- Supports 30 ms and 20 ms speech frame modes
- Bitrate
  o 13.3 kbps (399 bits, packetized in 50 bytes) for 30 ms frames
  o 15.2 kbps (303 bits, packetized in 38 bytes) for 20 ms frames
- Computational complexity (TI C54x)
  o 30 ms frames: approx. 18 MIPS/channel
  o 20 ms frames: approx. 15 MIPS/channel
- Memory
  o 400 words/channel state memory (RAM)
  o less than 4 kWords table memory (ROM)
  o stack and program memory requirements similar to other low bit rate codecs (e.g. G.729A)
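The bitrate figures on this slide can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are made up for this example, and it assumes the payload is padded up to a whole number of bytes. Under that assumption the quoted 15.2 kbps matches the 38-byte payload (304 bits per 20 ms), while the raw 303 bits would give 15.15 kbps.

```python
import math


def payload_bytes(bits_per_frame: int) -> int:
    """Smallest whole number of bytes that holds one encoded frame."""
    return math.ceil(bits_per_frame / 8)


def payload_bitrate_kbps(bits_per_frame: int, frame_ms: int) -> float:
    """Bitrate of the byte-padded payload; bits per millisecond equals kbit/s."""
    return payload_bytes(bits_per_frame) * 8 / frame_ms


# (frame duration in ms, encoded bits per frame), as listed on the slide
for frame_ms, bits in ((30, 399), (20, 303)):
    size = payload_bytes(bits)
    rate = payload_bitrate_kbps(bits, frame_ms)
    print(f"{frame_ms} ms frame: {bits} bits, {size} bytes, {rate:.2f} kbit/s")
```

These numbers cover the codec payload only; any transport header overhead comes on top and is not modelled here.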

The Core iLBC method
- Start state encoding
- Gain-shape waveform matching forward in time
- Gain-shape waveform matching backward in time
- Pitch enhancement
- Packet loss concealment
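To make "gain-shape waveform matching" more concrete, here is a minimal sketch of a generic gain-shape search: for each candidate shape vector, compute the gain that minimizes the squared error against the target, and keep the best pair. This is an assumption-laden illustration, not the iLBC procedure itself. The codebook, the function names, and the unquantized gain are hypothetical, and the slides do not describe how the real codec builds its codebook or quantizes gains.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def gain_shape_match(target, codebook):
    """Return (index, gain, error) of the shape that best matches target.

    For a shape c, the optimal gain is <target, c> / <c, c>, and the
    remaining squared error is <target, target> - <target, c>^2 / <c, c>.
    """
    target_energy = dot(target, target)
    best = None
    for index, shape in enumerate(codebook):
        energy = dot(shape, shape)
        if energy == 0:
            continue  # an all-zero shape cannot be scaled to match anything
        corr = dot(target, shape)
        error = target_energy - corr * corr / energy
        if best is None or error < best[2]:
            best = (index, corr / energy, error)
    if best is None:
        raise ValueError("codebook contains no usable shape")
    return best


# Hypothetical toy data: the target is exactly 2x the second shape.
toy_codebook = [[1.0, 0.0, 0.0], [1.0, 2.0, -1.0], [0.0, 1.0, 1.0]]
toy_target = [2.0, 4.0, -2.0]
print(gain_shape_match(toy_target, toy_codebook))
```

In a real coder the gain would also be quantized and the search would run per sub-block; both are left out to keep the sketch short.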

iLBC Encoding
[Block diagram: incoming speech in, packets to network out]

iLBC Decoding
[Block diagram: packets from network in, decoded speech out]

20 ms vs 30 ms sub-blocks
- 20 ms frame size mode: 4 sub-blocks with a total length of 160 samples
- 30 ms frame size mode: 6 sub-blocks with a total length of 240 samples
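The sample counts follow directly from the 8 kHz sampling rate, and both modes end up with sub-blocks of the same length. The short sketch below works this out; the function names are illustrative, and it assumes the sub-blocks are equal in length.

```python
SAMPLE_RATE_HZ = 8000  # sampling rate stated in the slides


def frame_layout(frame_ms: int, n_sub_blocks: int):
    """Return (samples per frame, samples per sub-block) for one mode."""
    samples = SAMPLE_RATE_HZ * frame_ms // 1000
    sub_len, remainder = divmod(samples, n_sub_blocks)
    if remainder:
        raise ValueError("frame does not divide evenly into sub-blocks")
    return samples, sub_len


def split_into_sub_blocks(frame, n_sub_blocks: int):
    """Cut one frame of samples into equal-length sub-blocks."""
    sub_len = len(frame) // n_sub_blocks
    return [frame[i * sub_len:(i + 1) * sub_len] for i in range(n_sub_blocks)]


for frame_ms, n in ((20, 4), (30, 6)):
    samples, sub_len = frame_layout(frame_ms, n)
    print(f"{frame_ms} ms: {samples} samples = {n} sub-blocks x {sub_len} samples")
```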

20 ms vs 30 ms mode – bit allocation
(The per-parameter bit counts from the two tables are missing in this transcript; only the totals survive.)

30 ms mode: 240 samples encoded to 399 bits = 13.3 kbit/s (50 octets)
  Parameters: LPC, start state position, start state scale, start state samples, shapes, gains
  Total: 399 bits

20 ms mode: 160 samples encoded to 303 bits = 15.2 kbit/s (38 octets)
  Parameters: LPC, start state position, start state scale, start state samples, shapes, gains
  Total: 303 bits

Advantage over CELP
[Figure; labels: original, iLBC, g729, g723; annotations: PLC, State recovery]

iLBC Performance vs G.729A & G old version from Winter 2002 Source: Dynastat

iLBC Performance
- Equivalent or slightly lower performance than G.729E in clean conditions.
- Improved robustness to packet loss compared to G.729E.
- iLBC performed better than G.728 in other testing.

Implementation
- Floating Point Source
- Fixed Point Source
- DSP Source
- Significant signal processing skills necessary
- Quality / efficiency trade-off
- ~6 months
- Optimization skills
- ~4 months

iLBC Specifications
- Available in floating point, fixed point ANSI C, TIc54x, TIc55x, TIc64x, …
- Supports 20 and 30 ms speech frames
- Algorithmic delay: same as frame size
- Sampling rate: 8 kHz
- Bit rate: 13.3 kbps for 30 ms and 15.2 kbps for 20 ms

Complexity (max):

Product            Frame size   Encoder      Decoder
GIPS iLBC TIc54x   20 ms        11.5 MIPS    4.1 MIPS
GIPS iLBC TIc54x   30 ms        13.5 MIPS    4.4 MIPS
GIPS iLBC TIc55x   20 ms        7.5 MIPS     3.0 MIPS
GIPS iLBC TIc55x   30 ms        8.8 MIPS     3.1 MIPS

The original table also had columns for program memory, static data memory (Fix, Per channel) and dynamic data memory, all in kWord16; those values are missing in this transcript.
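As a rough cross-check, the encoder and decoder figures listed above can be added to get a per-channel load for a full-duplex channel. The sketch below assumes the two maxima simply add, which the slides do not state. The dictionary layout and function name are illustrative only.

```python
# (platform, frame size in ms) -> (encoder MIPS, decoder MIPS), from the slide
COMPLEXITY_MIPS = {
    ("TIc54x", 20): (11.5, 4.1),
    ("TIc54x", 30): (13.5, 4.4),
    ("TIc55x", 20): (7.5, 3.0),
    ("TIc55x", 30): (8.8, 3.1),
}


def total_mips(platform: str, frame_ms: int) -> float:
    """Encoder plus decoder load, rounded to one decimal place."""
    encoder, decoder = COMPLEXITY_MIPS[(platform, frame_ms)]
    return round(encoder + decoder, 1)


for (platform, frame_ms) in COMPLEXITY_MIPS:
    print(f"{platform}, {frame_ms} ms: {total_mips(platform, frame_ms)} MIPS per channel")
```

For the TIc54x the sums come out at 15.6 and 17.9 MIPS, in line with the approximate 15 and 18 MIPS per channel quoted for the C54x earlier in the deck.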