The Opus Codec Jean-Marc Valin, Koen Vos,

Slides:

Advertisements

Similar presentations

Wideband Speech Coding for CDMA2000® Systems

Advertisements

VMR-WB – Operation of the 3GPP2 Wideband Speech Coding Standard M. Jelinek†, R. Salami‡ and S. Ahmadi * †University of Sherbrooke, Canada ‡VoiceAge Corporation,

Guerino Mazzola (Fall 2014 © ): Introduction to Music Technology IIIDigital Audio III.6 (Fr Oct 24) The MP3 algorithm with PAC.

Developement and Implementation of an MPEG1 Layer III Decoder on x86 and TMS320C6711 platforms Braidotti Enrico (Farina Simone)

CS335 Principles of Multimedia Systems Audio Hao Jiang Computer Science Department Boston College Oct. 11, 2007.

MPEG-1 MUMT-614 Jan.23, 2002 Wes Hatch. Purpose of MPEG encoding To decrease data rate How? –two choices: could decrease sample rate, but this would cause.

Audio Coding Team Member: ChungMing Yan, Chun Tong.

CGMB324: Multimedia System Design

Time-Frequency Analysis Analyzing sounds as a sequence of frames

Digital Audio Compression

N Team 15: Final Presentation Peter Nyberg Azadeh Bararsani Adie Tong N N multicodec minisip.

Copyright © by Elliot Eichen. All rights reserved. RTP – Real Time Protocol (and RTCP)

Philippe Gournay, Bruno Bessette, Roch Lefebvre

Technion - IIT Dept. of Electrical Engineering Signal and Image Processing lab Transrating and Transcoding of Coded Video Signals David Malah Ran Bar-Sella.

Digital Representation of Audio Information Kevin D. Donohue Electrical Engineering University of Kentucky.

Codec requirements update Michael Knappe Co-chair, codec WG 1Michael Knappe IETF 77.

RTP Payload for Comfort Noise Robert Zopf Lucent Technologies.

Speech codecs and DCCP with TFRC VoIP mode Magnus Westerlund

© 2006 AudioCodes Ltd. All rights reserved. AudioCodes Confidential Proprietary Signal Processing Technologies in Voice over IP Eli Shoval Audiocodes.

Understanding the Internet Low Bit Rate Coder Jan Linden Vice President of Engineering Global IP Sound Presented by Jan Skoglund Sr. Research Scientist.

Voice over the Internet (the basics) CS 7270 Networked Applications & Services Lecture-2.

1 Audio Compression Techniques MUMT 611, January 2005 Assignment 2 Paul Kolesnik.

MPEG-3 For Audio Presented by: Chun Lui Sunjeev Sikand.

Overview of Adaptive Multi-Rate Narrow Band (AMR-NB) Speech Codec

MPEG Audio Compression by V. Loumos. Introduction Motion Picture Experts Group (MPEG) International Standards Organization (ISO) First High Fidelity Audio.

Dolby AC-3 Audio Encoding & THX Wai Kam (Winnie) Henele Adams Peter Boettcher.

Audio Coding MPEG1 Layers I, II, III MPEG2MPEG4 Sherida Subrati Anthony Caliendo.

Audio CompressiontMyn1 Audio Compression Audio compression has become well entrenched in consumer and professional digital audio products such as the compact.

1 Audio Compression Multimedia Systems (Module 4 Lesson 4) Summary: r Simple Audio Compression: m Lossy: Prediction based r Psychoacoustic Model r MPEG.

© 2010 Universität Tübingen, WSI-ICS Patrick Schreiner, Christian Hoene Universität Tübingen WSI-ICS 26. July 2010 Rate Adaptation for the IETF IIAC.

Cisco Unified Communications Manager (CUCM)

Secure Steganography in Audio using Inactive Frames of VoIP Streams

Audio Compression Usha Sree CMSC 691M 10/12/04. Motivation Efficient Storage Streaming Interactive Multimedia Applications.

1/75 Embedded Audio Coder Jin Li 2/75 Outline Introduction Embedded audio coder - Algorithm MLT with window switching Quantizer Entropy coder Bitstream.

Windows Media Video 9 Tarun Bhatia Multimedia Processing Lab University Of Texas at Arlington 11/05/04.

Audio Henning Schulzrinne Dept. of Computer Science Columbia University Fall 2003.

What’s new in Wideband Audio?

1 Audio Compression. 2 Digital Audio  Human auditory system is much more sensitive to quality degradation then is the human visual system  redundancy.

8. 1 MPEG MPEG is Moving Picture Experts Group On 1992 MPEG-1 was the standard, but was replaced only a year after by MPEG-2. Nowadays, MPEG-2 is gradually.

Aug 25, 2005 page1 Aug 25, 2005 Integration of Advanced Video/Speech Codecs into AccessGrid National Center for High Performance Computing Speaker: Barz.

Media Handling in FreeSWITCH Moisés Silva Software Engineer / Manager

ECE 5525 Osama Saraireh Fall 2005 Dr. Veton Kepuska

1 “ ” Multiservice networking is emerging as a strategically important issue for enterprise and public service provider infrastructures alike.The proposition.

Present document contains informations proprietary to France Telecom. Accepting this document means for its recipient he or she recognizes the confidential.

IntroductiontMyn1 Introduction MPEG, Moving Picture Experts Group was started in 1988 as a working group within ISO/IEC with the aim of defining standards.

Sung-Won Yoon, David ChoiEE368C Project Proposal Bandwidth Extrapolation of Audio Signals Sung-Won Yoon David Choi February 8 th, 2001.

A Very Low Bit Rate Protection Layer to Increase the Robustness of the AMR- WB+ Codec against Bit Errors Philippe Gournay Université de Sherbrooke Département.

Perceptual Audio Coding The AT&T/Bell Labs view James D. Johnston Chief Scientist Neural Audio, Kirkland, Wa.

A UDIO B ANDWIDTH D ETECTION IN THE EVS C ODEC University of Sherbrooke, Canada VoiceAge Corporation, Montréal, Canada Fraunhofer IIS, Erlagen, Germany.

Institut für Nachrichtengeräte und Datenverarbeitung Prof. Dr.-Ing. P. Vary On the Use of Artificial Bandwidth Extension Techniques in Wideband Speech.

2nd Workshop on Wideband Speech Quality - June nd Workshop on Wideband Speech Quality in Terminals and Networks: Assessment and Prediction 22nd.

Opus SW codec RTLAB Ki Eun Seong. What is the Opus Codec? Real-time interactive audio codec Targets interactive audio over the internet Aims to be royalty-free,

Codec 2 ● open source speech codec ● low bit rate (2400 bit/s and below) ● applications include digital speech for HF and VHF radio ● fills gap in open.

Motivation ● The (Ham) world needs an open source, patent free speech codec at bit rates of less than 5000 bit/s ● I know how to build one!

Introduction to H.264 / AVC Video Coding Standard Multimedia Systems Sharif University of Technology November 2008.

Opus, a free, high-quality speech and audio codec

High-Quality, Low-Delay Music Coding in the Opus Codec

Cisco Unity Connection

Scalable Speech Coding for IP Networks

III Digital Audio III.6 (Fr Oct 20) The MP3 algorithm with PAC.

Digital Communications Chapter 13. Source Coding

Thomas Daede October 5, 2017 AV1 Update Thomas Daede October 5, 2017.

Audio Henning Schulzrinne Dept. of Computer Science

1 Vocoders. 2 The Channel Vocoder (analyzer) : The channel vocoder employs a bank of bandpass filters,  Each having a bandwidth between 100 HZ and 300.

ON THE ARCHITECTURE OF THE CDMA2000® VARIABLE-RATE MULTIMODE WIDEBAND (VMR-WB) SPEECH CODING STANDARD Milan Jelinek†, Redwan Salami‡, Sassan Ahmadi*, Bruno.

Understanding the Internet Low Bit Rate Coder

MPEG-1 Overview of MPEG-1 Standard

III Digital Audio III.6 (Mo Oct 22) The MP3 algorithm with PAC.

Govt. Polytechnic Dhangar(Fatehabad)

Presentation transcript:

The Opus Codec Jean-Marc Valin, Koen Vos, Timothy B. Terriberry, Gregory Maxwell CCBE 27 September 2013

What is Opus? New highly-flexible speech and audio codec Works for most audio applications Completely free Royalty-free licensing Open-source implementation IETF RFC 6716 (Sep. 2012)

Why a New Audio Codec? http://xkcd.com/927/ http://imgs.xkcd.com/comics/standards.png

Why Should Broadcasters Care? Ultra-low delay Adaptability to varying network conditions Best-in-class performance within a wide range of bitrates No licensing costs No incompatible flavours

Applications and Standards (2010) Codec VoIP with PSTN AMR-NB Wideband VoIP/videoconference AMR-WB High-quality videoconference G.719 Low-bitrate music streaming HE-AAC High-quality music streaming AAC-LC Low-delay broadcast AAC-ELD Network music performance

Applications and Standards (2013) Codec VoIP with PSTN Opus Wideband VoIP/videoconference High-quality videoconference Low-bitrate music streaming High-quality music streaming Low-delay broadcast Network music performance

Features Highly flexible Bit-rates from 6 kb/s to 510 kb/s Narrowband (8 kHz) to fullband (48 kHz) Frame sizes from 2.5 ms to 60 ms Speech and music support Mono and stereo Flexible rate control Flexible complexity All changeable dynamically, signaled within the bitstream

Rate Control Opus supports true CBR Every packet has the same number of bytes No bit reservoir => no extra delay Quality not as good as VBR Constrained VBR Total variation within 1 frame of CBR (same as bit reservoir) Bounded delay, better transients, etc. True VBR Open loop: calibrated to a large corpus Gets the most benefit from new encoder improvements Bitrate cap possible for both VBR modes

Opus Design SILK: Based on voice codec from Skype CELT: MDCT codec from Xiph.Org Better than sum of its parts (Hybrid mode, seamless mode switching) CELT SILK In ↓ ↑ + Out MUX DEMUX Encoder Decoder 8-16 kHz 48 kHz bit-stream D

SILK Technology Originally used in Skype Based on linear prediction (LPC) Very good at narrowband and wideband speech up to ~32 kb/s Not very good on music Heavily modified to integrate with Opus

SILK Technology Based on Noise Feedback Coding rather than Analysis-by-Synthesis Analysis/synthesis mismatch to de-emphasize spectral valleys Replaces post-filters Variable-rate coding

CELT Technology “Constrained-Energy Lapped Transform” Psychoacoustics built into the format Harder to write a bad encoder Works on speech and music Most efficient on fullband audio (48 kHz) Less efficient on low bitrate speech

CELT Technology MDCT with low-overlap window Code band energy separately from spectrum “details” Preserves the energy in each critical band Implicit masking curve defined by the format No need to code scalefactors

CELT Stereo Coupling Code separate energy for each channel Prevents cross-talk Converts to mid-side after normalization Mid and side coded separately with their relative energy conserved Prevents stereo unmasking Intensity stereo Discards side past a certain frequency

Google Listening Tests Wideband/ Fullband

HydrogenAudio Results 64 kbit/s

Cascading Tests (AES 135) 5 cascadings Bitrate = 128 kbit/s

Adoption Broadcast Tieline, Mayah, Harris Broadcast CBS, ABC, NBC, NPR, Fox, Cumulus, ... Distribution Magnatune music store StreamGuys CDN VoIP and videoconference Jitsi, Meetecho, CounterPath, Mumble, Teamspeak, ... Mandatory-to-implement for WebRTC

Adoption HTTP streaming Firefox 18+ (incl. FFOS), Chrome, Opera Lots of other players: FFMpeg, GStreamer, VLC, Foobar2k, Winamp (with a plugin), Amarok, xmms2, etc. Icecast 2.4-beta1 added Opus support Examples: http://dir.xiph.org/by_format/Opus http://www.absoluteradio.co.uk/listen/labs.html

Roadmap

libopus 1.1 Beta released in July, full release “soon” https://people.xiph.org/~xiphmont/demo/opus/demo3.shtml First release with True VBR Tonality estimation Better dynamic allocation Improves on the built-in psychoacoustics Temporal VBR (discovered by accident!) Automatic speech/music detection Optional delayed decision (better high-latency performance)

libopus 1.1 (cotd.) Better surround encoding Better API (knows which channel is which) Better LFE encoding Inter-channel masking Major ARM performance gains: 40% decoder CPU reduction 27% encoder CPU reduction (33% with Neon)

Standards RTP (draft-ietf-payload-opus) Hopefully WGLC soon Ogg (draft-ietf-codec-oggopus) Maybe WGLC soon? WebM (Matroska) Opus paired with VP9 for next RF video format Used by YouTube Spec’d at https://wiki.xiph.org/MatroskaOpus Implementations underway Minor RFC 6716 revisions (draft-valin-codec-opus- update) 3 minor bug-fixes to the reference implementation Feedback at codec@ietf.org welcomed!

Opus in RTP Very simple: 1 RTP payload == 1 Opus packet From 2.5 ms to 120 ms audio Packets decodable with no OOB signaling No negotiation failure, always opus/48000/2 All SDP parameters are informative Mono/stereo, bitrate, audio bandwidth, frame size, mode, etc., signaled in band Receiver decodes all of these transparently Encoder and decoder can run at different rates

Opus in Ogg Includes surround support, up to 255 channels Similar to RTP mapping Header is informative (except surround)

Questions? Resources Website: http://opus-codec.org Mailing list: opus@xiph.org IRC: #opus on irc.freenode.net Git repository: git://git.opus-codec.org/opus.git Questions?