Motivation ● The (Ham) world needs an open source, patent free speech codec at bit rates of less than 5000 bit/s ● I know how to build one!

Slides:



Advertisements
Similar presentations
ON THE REPRESENTATION OF VOICE SOURCE APERIODICITIES IN THE MBE SPEECH CODING MODEL Preeti Rao and Pushkar Patwardhan Department of Electrical Engineering,
Advertisements

Speech Coding Workshop 2000 Jean-Marc Valin, Roch Lefebvre 1 IEEE Speech Coding Workshop Sept 17–20, 2000 Lake Lawn Resort Delavan, WI Jean-Marc Valin,
Tamara Berg Advanced Multimedia
Audio Compression ADPCM ATRAC (Minidisk) MPEG Audio –3 layers referred to as layers I, II, and III –The third layer is mp3.
LECTURE Copyright  1998, Texas Instruments Incorporated All Rights Reserved Use of Frequency Domain Telecommunication Channel |A| f fcfc Frequency.
Copyright 2001, Agrawal & BushnellVLSI Test: Lecture 181 Lecture 18 DSP-Based Analog Circuit Testing  Definitions  Unit Test Period (UTP)  Correlation.
Page 0 of 34 MBE Vocoder. Page 1 of 34 Outline Introduction to vocoders MBE vocoder –MBE Parameters –Parameter estimation –Analysis and synthesis algorithm.
Ranko Pinter Simoco Digital Systems
Storey: Electrical & Electronic Systems © Pearson Education Limited 2004 OHT 5.1 Signals and Data Transmission  Introduction  Analogue Signals  Digital.
Speech Coding Nicola Orio Dipartimento di Ingegneria dell’Informazione IV Scuola estiva AISV, 8-12 settembre 2008.
Understanding the Internet Low Bit Rate Coder Jan Linden Vice President of Engineering Global IP Sound Presented by Jan Skoglund Sr. Research Scientist.
Speech & Audio Processing
Overview of Adaptive Multi-Rate Narrow Band (AMR-NB) Speech Codec
20101 The Physical Layer Chapter Bandwidth-Limited Signals.
System Microphone Keyboard Output. Cross Synthesis: Two Implementations.
Communications & Multimedia Signal Processing Formant Tracking LP with Harmonic Plus Noise Model of Excitation for Speech Enhancement Qin Yan Communication.
Digital Voice Communication on HF
Basics of Signal Processing. frequency = 1/T  speed of sound × T, where T is a period sine wave period (frequency) amplitude phase.
A Full Frequency Masking Vocoder for Legal Eavesdropping Conversation Recording R. F. B. Sotero Filho, H. M. de Oliveira (qPGOM), R. Campello de Souza.
Digital to Analogue Conversion Natural signals tend to be analogue Need to convert to digital.
DIGITAL VOICE NETWORKS ECE 421E Tuesday, October 02, 2012.
GCT731 Fall 2014 Topics in Music Technology - Music Information Retrieval Overview of MIR Systems Audio and Music Representations (Part 1) 1.
Basics of Signal Processing. SIGNALSOURCE RECEIVER describe waves in terms of their significant features understand the way the waves originate effect.
Digital Speech Transmission and Recovery. Overall System Output (speaker) Channel (coax cable) Receiver Circuit Input (microphone) Transmitter Circuit.
Lecture 18 DSP-Based Analog Circuit Testing
Wireless PHY: Modulation and Demodulation Y. Richard Yang 09/6/2012.
Lecture 71 Today, we are going to talk about: Some bandpass modulation schemes used in DCS for transmitting information over channel M-PAM, M-PSK, M-FSK,
Speech Coding Using LPC. What is Speech Coding  Speech coding is the procedure of transforming speech signal into more compact form for Transmission.
Acoustic Analysis of Speech Robert A. Prosek, Ph.D. CSD 301 Robert A. Prosek, Ph.D. CSD 301.
Concepts of Multimedia Processing and Transmission IT 481, Lecture #4 Dennis McCaughey, Ph.D. 25 September, 2006.
SPEECH CODING Maryam Zebarjad Alessandro Chiumento.
Speech Signal Representations I Seminar Speech Recognition 2002 F.R. Verhage.
CS Spring 2009 CS 414 – Multimedia Systems Design Lecture 3 – Digital Audio Representation Klara Nahrstedt Spring 2009.
˜ SuperHeterodyne Rx ECE 4710: Lecture #18 fc + fLO fc – fLO -fc + fLO
ECE 5525 Osama Saraireh Fall 2005 Dr. Veton Kepuska
VOCODERS. Vocoders Speech Coding Systems Implemented in the transmitter for analysis of the voice signal Complex than waveform coders High economy in.
Outline Transmitters (Chapters 3 and 4, Source Coding and Modulation) (week 1 and 2) Receivers (Chapter 5) (week 3 and 4) Received Signal Synchronization.
Write-Log Click her to open this box. Set the frequency and mode
Present document contains informations proprietary to France Telecom. Accepting this document means for its recipient he or she recognizes the confidential.
DVSI HX-SD™ Selectable Mode Vocoder
Chapter 20 Speech Encoding by Parameters 20.1 Linear Predictive Coding (LPC) 20.2 Linear Predictive Vocoder 20.3 Code Excited Linear Prediction (CELP)
From Error Control to Error Concealment Dr Farokh Marvasti Multimedia Lab King’s College London.
CELP / FS-1016 – 4.8kbps Federal Standard in Voice Coding
Institut für Nachrichtengeräte und Datenverarbeitung Prof. Dr.-Ing. P. Vary On the Use of Artificial Bandwidth Extension Techniques in Wideband Speech.
Speaker Verification System Middle Term Presentation Performed by: Barak Benita & Daniel Adler Instructor: Erez Sabag.
1 Speech Compression (after first coding) By Allam Mousa Department of Telecommunication Engineering An Najah University SP_3_Compression.
2nd Workshop on Wideband Speech Quality - June nd Workshop on Wideband Speech Quality in Terminals and Networks: Assessment and Prediction 22nd.
Codec 2 ● open source speech codec ● low bit rate (2400 bit/s down to 1400 bit/s) ● applications include digital speech for HF and VHF radio ● fills gap.
Codec 2 ● open source speech codec ● low bit rate (2400 bit/s and below) ● applications include digital speech for HF and VHF radio ● fills gap in open.
Codec2: An Open Future for Digital Voice ● David Rowe VK5DGR ● ● Codec developer. ● Today's Presenter ● Bruce Perens.
Open Source Digital Radio David Rowe. Topics ● Digital Radio and VOIP ● Applications ● Open Source Digital Radio – who cares? ● Codec 2 explained ● Digital.
Codec 2 open source speech codec
Codec 2 at 450 bit/s Codec 2 Digital Voice over High Frequency (HF) Radio The Mission Why HF is Hard How Analog Does it Demo.
GSM Speech Coding To send a voice across a radio network, we have to turn our voice into a digital signal. GSM uses a method called RPE-LPC (Regular Pulse.
Voice Manipulator Department of Electrical & Computer Engineering
Scalable Speech Coding for IP Networks
Digital Communications Chapter 13. Source Coding
Vocoders.
Pulse Code Modulation (PCM)
1 Vocoders. 2 The Channel Vocoder (analyzer) : The channel vocoder employs a bank of bandpass filters,  Each having a bandwidth between 100 HZ and 300.
Introduction King Saud University
Speech and Audio Processing
Understanding the Internet Low Bit Rate Coder
Vocoders.
Pulse Code Modulation (PCM)
DIGITAL WATERMARKING OF AUDIO SIGNALS USING A PSYCHOACOUSTIC AUDITORY MODEL AND SPREAD SPECTRUM THEORY By: Ricardo A. Garcia University of Miami School.
Electrical Communications Systems ECE Spring 2019
INTRODUCTION TO ADVANCED DIGITAL SIGNAL PROCESSING
Electrical Communications Systems ECE
Introduction 1st semester King Saud University
Presentation transcript:

Motivation ● The (Ham) world needs an open source, patent free speech codec at bit rates of less than 5000 bit/s ● I know how to build one!

Codec 2 Author - David Rowe ● Adelaide, South Australia ● VK5DGR, first licensed in 1981 at age 13 ● PhD in speech coding (1999) ● Built some of the first real time speech codecs in the late 1980's on early DSP chips ● Now works full time on open software/open hardware for developing world communications ●

Digital Voice Radio System A/D codec2 enc FEC enc mod HF/VHF radio D/A codec2 dec FEC dec demod mic spk r

Speech Coding ● Take speech samples (e.g. 16 bit samples at 8 kHz sampling rate) ● Convert to 2400 bit/s ● What can we throw away? ● Retain intelligible speech ● Retain natural speech ● Use a model of speech, send model parameters

Patents and Codecs ● The authors of proprietary/patented codecs borrowed heavily from the public domain ● Perhaps 5% of the algorithms they use are original and patented ● 95% of the algorithms in these codecs are public domain algorithms ● To build an equivalent codec, we simply need alternatives for the 5% that is patented

Sinusoidal Speech Coding Pitch Period 35 samples or 4.4ms at 8kHz sample rate Time (samples) Amplitude (16 bit samples)

Sinusoidal Speech Coding Amplitud e (dB) Frequency (Hz) Pitch 230Hz or 4.3ms Harmonics of 230Hz

Sinusoidal Speech Model Amplitude 1 Phase 1 Frequency 1 Amplitude 2 Phase 2 Frequency 2 Amplitude L Phase L Frequency L

Amplitude Modelling

Encoder Block Diagram LPC Analysis MBE Voicing est FFT 16 bit, 8kHz samples LPC to LSP Quant Energy Quant Pitch est Pitch Quant LPC Correcti on 2550 bit/s quantised model parameters

Bit Allocation ● Alpha V0.1 codec, subject to rapid change ● 51 bits per 20ms frame, or 2550 bit/s

Decoder Block Diagram Inverse FFT 16 bit, 8kHz samples Recover Harm Amps LSP to LPC Overlap Add Post Filter FFT LSPs Energ y LPC Correctio n Phase Synthesi s Voicin g

Prior Art Summary ● Sinusoidal Coding, Mcaulay & Quatieri, 1984 ● Linear Predictive Coding, Makhoul, 1975 ● Line Spectrum Pairs, Itakura, 1975 ● MBE Voicing, Griffin & Lim, 1988 ● Overlap Add, Tribolet & Crochiere, 1979 ● NLP Pitch Estimation, Rowe, 1999 ● LPC Amplitude Recovery (algorithm used here), Rowe, 1991, 1999, 2009 ● Post Filter, Rowe, 2009

Further Work ● Better phase model and voicing estimator ● Toll quality at 2000 bit/s ● Lower bit rate, 2400, 1200 bit/s ● Better background noise performance ● FEC and non-redundant error correction ● Integration with modem and test over radio channels ● Fixed point and DSP chip implementation