DVSI HX-SD™ Selectable Mode Vocoder


1 DVSI HX-SD™ Selectable Mode Vocoder
3GPP2 - DVSI SMV Presentation
DIGITAL VOICE SYSTEMS, INC.
26 April 2000, Seattle, WA
The Speech Compression Specialists
DVSI HX-SD™ Selectable Mode Vocoder
Digital Voice Systems, Inc., One Van de Graaff Drive, Burlington, MA USA
Phone: (781) Fax: (781) Web:
Presented by: John C. Hardwick, President

The proposals in this submission have been formulated by Digital Voice Systems, Inc. (DVSI) to assist the 3GPP2 Standards Committee. This document is offered to the committee as a basis for discussion and is not binding on DVSI. This submission is subject to change in form and in numerical values after further study, and DVSI specifically reserves the right to add to, or amend, the quantitative statements made herein. Nothing contained herein shall be construed as conferring by implication, estoppel, or otherwise any license or right under any patent, whether or not the use of information herein necessarily employs an invention of any existing or later issued patent.

© Copyright, Digital Voice Systems, Inc. 2000, All Rights Reserved. DVSI hereby gives permission for copying this submission for the legitimate purposes of the 3GPP2 Standards Committee, provided DVSI is credited on all copies. Distribution or reproduction of this document, by any means, electronic, mechanical, or otherwise, in its entirety or any portion thereof, for monetary gain or any non-3GPP2 purpose is expressly prohibited.

GRANT OF LICENSE: DVSI grants a free, irrevocable license to 3GPP2 and its Organizational Partners to incorporate text or other copyrightable material contained in the contribution and any modification thereof in the creation of 3GPP2 publications; to copyright and sell in Organizational Partner’s name any Organizational Partner’s standards publication even though it may include portions of the contribution; and at the Organizational Partner’s sole discretion to permit others to reproduce in whole or in part such contributions or the resulting Organizational Partner’s standards publications. The contributor must also be willing to grant licenses under such contributor copyrights to third parties on reasonable, non-discriminatory terms and conditions as appropriate.

2 Presentation Overview
- DVSI Introduction
- Vocoder Overview
- Vocoder Complexity
- Vocoder Rate Statistics

Good afternoon. I'm sure that everyone in this room is concerned, for one reason or another, about the quality and clarity of the radio signals used in public safety applications. The lives of our police and fire department personnel are constantly put at risk by the nature of their jobs and depend on effective communications equipment. We need to mitigate this risk through the use of state-of-the-art communications equipment. Disaster relief and rescue personnel need to be able to communicate with one another and clearly understand the voice at the other end of the radio. Tonal inflection can convey subtleties within a message that must be perceived and clearly understood. When life-threatening situations arise, the quality of voice communications must not be compromised.

3 DVSI Corporate Information
- Specializing in vocoder development and implementation since 1988
- Experienced technical staff from M.I.T.
- Chairman - Professor Jae S. Lim
- President - Dr. John C. Hardwick
- Director of R&D - Dr. Daniel Griffin
- Developer of proprietary model-based (MBE) and hybrid (HX-SD) vocoders
- Focus on vocoder design for wireless applications
- Developer of the IMBE™ Vocoder, which is part of the ANSI/TIA digital mobile radio standard for APCO Project 25

Some of the parameters of the MOS test are shown on this overhead. Of late, it has become common practice within the industry to refer to a MOS value to illustrate voice quality. Using the absolute value of the MOS score to determine voice quality can be very misleading. In addition, the scores from different MOS tests cannot be compared to one another. In fact, the MOS values for any one vocoder will vary from test to test depending on a number of key variables. These variables include the vocoders within the test. If you were to test one very good vocoder against a number of inferior vocoders, its MOS score would be artificially inflated. Conversely, test a very good vocoder against only unquantized speech and the value will be artificially low. In determining voice quality, it is much more accurate to compare the differences in the overall scores within a test.

4 DVSI HX-SD™ Selectable Mode Vocoder
- Hybrid Excitation - Spectral Decomposition (HX-SD™) Vocoder
- Combined harmonic/waveform vocoder
- Model parameters: pitch, gain, spectral envelope, and mixing state for each 20 ms frame of speech
- Open-loop analysis, quantization and synthesis
- Variable frame size (16, 40, 80 or 170 bits/frame) with perceptually based rate determination
- Noise suppression for improved background noise performance
- Moderate complexity with 40 ms algorithmic delay

In order to demonstrate some of these design considerations, relevant details of the IMBE (Improved Multi-Band Excitation) Vocoder implementation for the APCO Project 25 North American land mobile radio system will be described. The total bit rate is 7200 bps with a frame size of 20 ms. The IMBE Vocoder model parameters (SEE CHART) consist of a pitch or fundamental frequency, a set of Voiced/Unvoiced (V/UV) parameters, and a set of spectral magnitudes. In the IMBE model, the spectrum is divided into a number of frequency bands, and the V/UV parameters indicate whether each band is voiced (contains periodic energy) or unvoiced (contains noise-like energy). This model provides improved performance for speech in background noise or speech with mixed voicing.

The coding gain vs. error persistence trade-off resulted in the selection of a prediction coefficient varying between 0.7 for low-pitched speakers and 0.4 for high-pitched speakers. These values provided good coding gain with low error persistence.

The total of 144 bits per frame is allocated to speech model parameters (87 bits), synchronization (1 bit), and error control codes (56 bits). The speech model parameter bits are divided into 4 groups, ordered from most sensitive to bit errors to least sensitive. The most sensitive group is encoded with a [23,12] Golay code with an additional error detecting code. The next group is encoded with three [23,12] Golay codes. The next group is encoded with three [15,11] Hamming codes, and no error control codes are applied to the least sensitive group.

Intra-frame interleaving is used to spread burst errors over multiple codewords. Adaptive smoothing of speech model parameters, frame repeating, and muting are used to mitigate uncorrected errors.
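
The frame budget quoted in these notes can be cross-checked with a few lines of arithmetic. The sketch below is only a consistency check on the stated numbers (144 bits per 20 ms frame, four [23,12] Golay codewords, three [15,11] Hamming codewords). It assumes the 56 error control bits are exactly the parity bits of those codewords; the notes do not say how the additional error detecting code on the first group is carried, so that part is not modelled. All variable and function names are illustrative.

```python
# Consistency check of the IMBE frame budget described in the notes.
FRAME_MS = 20
BITS_PER_FRAME = 144

GOLAY = (23, 12)    # (codeword length n, information bits k)
HAMMING = (15, 11)


def parity_bits(code, count):
    """Total redundancy added by `count` codewords of an (n, k) block code."""
    n, k = code
    return (n - k) * count


def info_bits(code, count):
    """Total information bits carried by `count` codewords of an (n, k) code."""
    return code[1] * count


# One Golay codeword for the most sensitive group plus three for the next
# group, then three Hamming codewords; the last group is left unprotected.
n_golay = 1 + 3
n_hamming = 3

fec_bits = parity_bits(GOLAY, n_golay) + parity_bits(HAMMING, n_hamming)
protected_info = info_bits(GOLAY, n_golay) + info_bits(HAMMING, n_hamming)
payload_bits = BITS_PER_FRAME - fec_bits          # speech model bits + sync bit
unprotected_info = payload_bits - protected_info  # least sensitive group
bit_rate_bps = BITS_PER_FRAME * 1000 // FRAME_MS

print(fec_bits, payload_bits, unprotected_info, bit_rate_bps)
```

Under that assumption the parity bits come to 56, the remaining 88 bits match the 87 speech model bits plus 1 synchronization bit, and 144 bits every 20 ms gives the stated 7200 bps.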

5 HX-SD™ SMV Block Diagram
[Block diagram. Labels: Input Speech; Output; Parameter Analysis; Quantization; Waveform Coding; Harmonic; Bit Stream & Rate Decisions; Decoding & Error Mitigation; Synthesis; Encoder; Decoder; +; Mixing control]

The excitation in the MBE model consists of periodic energy only in the frequency bands declared voiced, while the remaining bands consist of noise-like energy. This example shows an important feature of the MBE speech model. Namely, the V/UV determination is performed such that frequency bands where the ratio of periodic energy to noise-like energy is high are declared voiced, while frequency bands where this ratio is low are declared unvoiced.
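
The per-band decision described above can be illustrated with a minimal sketch. It assumes the periodic and noise-like energies of each band have already been estimated, and it uses a single fixed threshold on their ratio. The function name `classify_bands`, the threshold value and the example energies are all hypothetical; the notes do not say how the energies are measured or what threshold an actual MBE analyzer uses.

```python
def classify_bands(periodic_energy, noise_energy, threshold=1.0):
    """Declare each band voiced (True) or unvoiced (False).

    A band is voiced when periodic energy exceeds `threshold` times the
    noise-like energy. Comparing against threshold * noise rather than
    dividing avoids a division by zero for bands with no noise energy.
    """
    return [p > threshold * n for p, n in zip(periodic_energy, noise_energy)]


# Hypothetical energies for four bands, low frequency to high frequency.
periodic = [9.0, 6.0, 1.0, 0.5]
noise = [1.0, 2.0, 4.0, 5.0]

voiced = classify_bands(periodic, noise, threshold=2.0)
print(voiced)  # [True, True, False, False]
```

With these made-up numbers the two low bands are declared voiced and the two high bands unvoiced, the kind of mixed voicing that a single voiced/unvoiced flag per frame could not represent.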

6 HX-SD™ SMV Bit Allocation
- the bit allocation at each rate differs between frames that are entirely waveform encoded and frames that are wholly or partly harmonic
- remaining bits are allocated to the waveform and harmonic coders depending on the mixing state

7 HX-SD™ SMV Complexity
- encoder estimated at 21.2 MIPS, decoder estimated at 7.3 MIPS
- total ROM is approximately 16.5 kb Data ROM and 11.5 kb Program ROM
- further reductions in MIPS and memory are envisioned

8 HX-SD™ SMV Rate Statistics
- Average Data Rate for test vector “vambm22.l22” vs 7.4 kbps for EVRC
- Open Loop Rate Determination based on mode, perceptual difficulty and Voice Activity Detection (VAD)
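
As a rough illustration of how an average data rate follows from per-frame rate decisions, the sketch below converts the four frame sizes listed for the HX-SD™ vocoder (16, 40, 80 or 170 bits per 20 ms frame) into bit rates and averages them over a frame-count histogram. The histogram is invented purely for the example and is not DVSI's measured result for any test vector; the function names are likewise assumptions.

```python
FRAME_MS = 20                    # frame duration from the vocoder overview
FRAME_BITS = (16, 40, 80, 170)   # selectable frame sizes in bits


def rate_kbps(bits):
    """Bit rate of a stream made only of frames of this size (bits/ms == kbps)."""
    return bits / FRAME_MS


def average_rate_kbps(frame_counts):
    """Average rate for a mapping of frame size in bits -> number of frames."""
    total_frames = sum(frame_counts.values())
    total_bits = sum(bits * count for bits, count in frame_counts.items())
    return total_bits / (total_frames * FRAME_MS)


# Hypothetical mix of rate decisions over 1000 frames (20 s of audio).
hypothetical_counts = {16: 500, 40: 100, 80: 200, 170: 200}

per_size = {bits: rate_kbps(bits) for bits in FRAME_BITS}
avg = average_rate_kbps(hypothetical_counts)
print(per_size, avg)
```

The per-size rates work out to 0.8, 2.0, 4.0 and 8.5 kbps, so the average depends entirely on how often the rate determination selects each frame size for a given input.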

9 Summary
- DVSI is a leader in the field of low/medium bit rate vocoders for wireless communications
- DVSI has developed a new hybrid vocoder which is being proposed to 3GPP2 for the Selectable Mode Vocoder
- DVSI’s hybrid vocoder was designed for high quality speech at low data rates with moderate complexity

