Evolutionary Fuzzy Volume Tuner for Cellular Phones.

Presentation transcript:

Evolutionary Fuzzy Volume Tuner for Cellular Phones

2 Contents
- Introduction
- Evolutionary Fuzzy Volume Tuner (EFVT)
  - Obtaining Noise Levels
  - Fuzzy Noise Classifier
  - Personalization of FVT
  - Evolutionary Algorithm
- Simulation using FuzzyControl++
- Quality of Speech
- Summary and Conclusions

3 Introduction: Quality of Speech
[Diagram: Speech Quality (QoS) and its contributing factors]
- Sound quality and naturalness
- Intelligibility
- Conversational effort
- Background noise
- Environmental conditions
- Listening and talking effort
- Speech characteristics
- Network conditions

4 Introduction: Psychoacoustics
- The ear has a very complex hearing mechanism.
- Loudness perception
  - Frequency dependent (the ear is most sensitive to sounds at 4 kHz)
  - Duration dependent (a sound shorter than approx. 200 ms is perceived as less loud than a sound of the same intensity lasting longer than 200 ms)
- Masking
  - When two tones are presented simultaneously, the weaker tone may not be heard.
  - Noise signals mask the audio signals.
- Hearing impairments

5 Introduction: Volume Tuner
- Current mobile phones have manual volume settings.
- Background noise affects the Quality of Speech (QoS):
  - Users have to manually change the acoustic volume level of the mobile handset.
  - Users tend to bring the handset very close to their ears.
- Background noise classes: {cars, buses, trains, factory …}
- The Evolutionary Fuzzy Volume Tuner adjusts the volume setting of the mobile handset based on:
  - Noise level derived from the Voice Activity Detector
  - Noise class derived from the fuzzy background classification system
  - Personalization based on the individual's hearing requirements
  - An evolutionary algorithm that tunes the IO scaling factors and membership functions and optimizes the fuzzy rule-base
- We call a mobile phone that uses the Evolutionary Fuzzy Volume Tuner a Smart Cellular Phone (SCP).

6 Example Speech File

7 Speech File Analysis

8 Fuzzy Sets
- Fuzzy sets were introduced by Lotfi Zadeh in 1965.
- A fuzzy set is any point in the unit cube.
- The elements of a fuzzy set have a degree of membership m(x) from 0 to 1.
- A fit vector represents a fuzzy set.
- E.g., A = (0.25, 0.75) = (x1, x2). In this example, element x1 belongs to, or fits in, subset A to degree 0.25.
- We define fuzzy-set intersection fitwise by pairwise minimum, union by pairwise maximum, and complementation by order reversal.
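The fitwise definitions on this slide can be illustrated with a minimal Python sketch. The fit vector A comes from the slide; the second vector B and the function names are hypothetical and chosen only for illustration. Complementation by order reversal is taken here to mean 1 - m(x).

```python
def fuzzy_intersection(a, b):
    """Fitwise intersection: pairwise minimum of two fit vectors."""
    return [min(x, y) for x, y in zip(a, b)]


def fuzzy_union(a, b):
    """Fitwise union: pairwise maximum of two fit vectors."""
    return [max(x, y) for x, y in zip(a, b)]


def fuzzy_complement(a):
    """Complement by order reversal: each fit value m becomes 1 - m."""
    return [1 - x for x in a]


A = [0.25, 0.75]  # fit vector from the slide
B = [0.5, 0.5]    # hypothetical second fuzzy set, for illustration only

print(fuzzy_intersection(A, B))  # [0.25, 0.5]
print(fuzzy_union(A, B))         # [0.5, 0.75]
print(fuzzy_complement(A))       # [0.75, 0.25]
```

Note that, unlike crisp sets, a fuzzy set and its complement can overlap: the intersection of A with its own complement is (0.25, 0.25) here, not the empty set.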

9 Fuzzy Systems Block Diagram

10 Fuzzy System( Mamdani)

11 Fuzzy Volume Tuner

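12 Volume Level
[Figure: membership functions for the Volume variable. Horizontal axis: Volume; vertical axis: Degree of Membership (0 to 1). Visible labels: Very Low, Zero, High, Very High.]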

13 Background Noise Level
- The GSM VAD computes the noise levels during noise-only periods. An adaptive noise-suppressor filter is used to filter the input signal frame. The coefficients of this filter are computed during noise-only periods, which are identified using special measures, including signal stationarity and periodicity.
- Fuzzy set values for Noise Level: {Very Low, Medium Low, Low, Zero, High, Medium High, Very High}.
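As a rough illustration of how a crisp noise level could be mapped onto the seven linguistic values above, here is a small sketch using evenly spaced triangular membership functions. The slides do not give the actual shapes or ranges, so the normalized universe [-1, 1], the triangular shape, and the function name `fuzzify_noise_level` are all assumptions; the label order simply follows the slide.

```python
# Labels in the order listed on the slide.
NOISE_LABELS = [
    "Very Low", "Medium Low", "Low", "Zero",
    "High", "Medium High", "Very High",
]


def fuzzify_noise_level(x, lo=-1.0, hi=1.0):
    """Return the degree of membership of x in each noise-level label.

    Assumption: evenly spaced, overlapping triangles on [lo, hi], so that
    neighbouring labels always sum to 1. Values outside the range are clipped.
    """
    x = min(max(x, lo), hi)
    # Position of x on an index scale 0..6 (one unit per label).
    pos = (x - lo) / (hi - lo) * (len(NOISE_LABELS) - 1)
    return {
        label: max(0.0, 1.0 - abs(pos - i))
        for i, label in enumerate(NOISE_LABELS)
    }


degrees = fuzzify_noise_level(0.5)
# Only the labels with non-zero membership are of interest.
print({label: d for label, d in degrees.items() if d > 0})
```

With this layout any input activates at most two adjacent labels, which keeps the number of rules that fire at once small.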

14 Background Noise Class
- The Fuzzy Noise Classifier (FNC) classifies the background noise into 7 types: Stationary (Car, Train, Bus-Dump) and Non-stationary (Street, Factory, Construction, Babble).
- The volume change to be applied is based on the noise class. E.g., the volume increase may not be the same for car noise and factory noise.
- The fuzzy rule base contains IF/THEN rules such as:
  - If Noise Level is High and Volume Level is Low and Noise Class is Car, then Volume Level Change is LP.
  - If Noise Level is Low and Volume Level is High and Noise Class is Train, then Volume Level Change is MN.
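The two example rules can be evaluated with a short sketch in the Mamdani style, where AND is taken as the minimum of the antecedent memberships and rules sharing a consequent are combined by maximum. The rules are from the slide; the input membership degrees, the variable names, and the function `fire_rules` are hypothetical, and defuzzification is left out because the slides do not specify the output membership functions.

```python
# Each rule: (antecedent {variable: label}, consequent label for volume change).
RULES = [
    ({"noise_level": "High", "volume_level": "Low", "noise_class": "Car"}, "LP"),
    ({"noise_level": "Low", "volume_level": "High", "noise_class": "Train"}, "MN"),
]


def fire_rules(rules, memberships):
    """Compute the firing strength of each consequent label.

    AND is the minimum over the antecedent terms; rules with the same
    consequent are aggregated with the maximum.
    """
    strengths = {}
    for antecedent, consequent in rules:
        strength = min(
            memberships[variable].get(label, 0.0)
            for variable, label in antecedent.items()
        )
        strengths[consequent] = max(strengths.get(consequent, 0.0), strength)
    return strengths


# Hypothetical fuzzified inputs for one speech frame.
memberships = {
    "noise_level": {"High": 0.8, "Low": 0.1},
    "volume_level": {"Low": 0.6, "High": 0.2},
    "noise_class": {"Car": 0.9, "Train": 0.05},
}

print(fire_rules(RULES, memberships))  # {'LP': 0.6, 'MN': 0.05}
```

Here the first rule dominates, so the output leans toward the LP volume change.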

15 Fuzzy Noise Classifier (FNC)
- Based on work by F. Beritelli et al.
- Feature extraction: 15 parameters of the speech input
- The FNC operates in 4 levels, with 6 fuzzy systems trained to match the features.
- 7 classes of background noise
[Diagram: Feature Extraction feeds Matching (Fuzzy System n.1), followed by Fuzzy Systems n.2 to n.6. Branch labels: Stationary, Non-stationary. Output classes: Bus, Car, Train, Street, Constr, Babble, Factory.]

16 Personalization of FVT
Hearing loss is measured with an audiogram. A person's audiogram shows the amount of hearing loss in each frequency band, as shown in the figure. A person with hearing loss perceives different frequencies at different levels.

17 Personalization of FVT
Speech intelligibility and quality of hearing can vary and also depend on the background noise.
- Hearing loss is measured using an audiogram.
- The audiogram data is translated into a fuzzy rule base for personalization of the FVT.
- The Fuzzy Volume Tuner uses this rule base.

18 Evolutionary Algorithm
- The evolutionary algorithm for the fuzzy volume tuner performs 3 functions:
  - Tunes the input-output (IO) scaling factors
  - Tunes the membership functions
  - Optimizes the fuzzy rule-base
[Diagram: blocks labeled Evolutionary Algorithm, Fuzzy Volume Tuner, User/Trainer Input, and Mobile Environment Inputs; signals labeled Fuzzy Sets, Fuzzy Rules, Fitness, and Output.]
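To give a feel for the first of these functions, the sketch below tunes two IO scaling factors with a toy evolutionary loop (Gaussian mutation plus truncation selection with elitism). It is not the authors' algorithm: the stand-in `tuner` function replaces the real fuzzy system, the fitness is simply the negative squared error against hypothetical user/trainer targets, and all names and parameter values are assumptions.

```python
import random


def tuner(noise, in_scale, out_scale):
    """Hypothetical stand-in for the fuzzy volume tuner: a saturating map."""
    x = max(-1.0, min(1.0, in_scale * noise))
    return out_scale * x


def fitness(genome, samples):
    """Negative squared error between tuner output and trainer targets."""
    in_scale, out_scale = genome
    return -sum(
        (tuner(noise, in_scale, out_scale) - target) ** 2
        for noise, target in samples
    )


def evolve(samples, pop_size=20, generations=40, sigma=0.1, seed=0):
    """Evolve (in_scale, out_scale); returns the best genome and fitness history."""
    rng = random.Random(seed)
    pop = [(rng.uniform(0.1, 2.0), rng.uniform(0.1, 2.0)) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda g: fitness(g, samples), reverse=True)
        history.append(fitness(pop[0], samples))
        parents = pop[: pop_size // 2]  # elitism: the best half survives unchanged
        children = [
            tuple(max(0.01, gene + rng.gauss(0.0, sigma)) for gene in rng.choice(parents))
            for _ in range(pop_size - len(parents))
        ]
        pop = parents + children
    best = max(pop, key=lambda g: fitness(g, samples))
    return best, history


# Hypothetical (noise level, preferred volume change) pairs from a user/trainer.
samples = [(0.2, 0.3), (0.5, 0.75), (0.8, 1.2)]
best, history = evolve(samples)
print(best, fitness(best, samples))
```

Because the parents are carried over unchanged, the best fitness never gets worse from one generation to the next. Tuning membership functions or the rule base would extend the genome with breakpoints or rule consequents, at the cost of a larger search space.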

19 FuzzyControl++
- This tool can be used to configure and simulate a fuzzy system.
- The inputs, outputs, and IF/THEN block are easily configured. However, the membership functions are fixed and no choice is available. The rules can be easily edited.
- There is also a provision for editing rules by matrix. The ranges for the linguistic terms can be easily defined.
- It provides a 3D graphics display to view the decision surface.
- A rule activity window and a simulation window enable simulation studies of the fuzzy system. The choices for the waveform are restricted compared to ECANSE. However, this tool can generate code for target systems.

20 FuzzyControl++ Simulation

21 FuzzyControl++ Simulation

22 MIPS and Memory Estimation
- Based on the benchmark results: for the Fuzzy Mobile Phone application (fuzzy rules, 2 inputs, 1 output, 4-6 labels per variable), we may need less than 2 Kbytes for storing the code, and execution is not expected to exceed 15,000 cycles. This is for a conventional 8-bit microcontroller such as the 68HC11.
- Processing delay = 80 ms for a 100 MIPS processor.
[Diagram: processing chain with Speech Frame (160 samples, 20 ms), Speech Codec (delay d_sc), VAD (delay d_vad), Fuzzy Noise Classifier (delay d_fnc), and Fuzzy Volume Tuner (delay d_FVT).]
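The figures on this slide can be sanity-checked with a little arithmetic. The sketch below assumes an 8 kHz sampling rate (which is consistent with 160 samples lasting 20 ms) and, purely for illustration, a 2 MHz clock for the 8-bit microcontroller; neither value is stated on the slide. It also assumes that the 80 ms total is the frame duration plus the delays of the individual blocks.

```python
def frame_duration_ms(samples, sample_rate_hz):
    """Duration of one speech frame in milliseconds."""
    return 1000.0 * samples / sample_rate_hz


def execution_time_ms(cycles, clock_hz):
    """Time needed to execute a given number of cycles, in milliseconds."""
    return 1000.0 * cycles / clock_hz


SAMPLE_RATE_HZ = 8000         # assumption: narrowband telephony sampling rate
MCU_CLOCK_HZ = 2_000_000      # assumption: illustrative 8-bit MCU clock
TOTAL_DELAY_MS = 80.0         # from the slide

frame_ms = frame_duration_ms(160, SAMPLE_RATE_HZ)
tuner_ms = execution_time_ms(15_000, MCU_CLOCK_HZ)
# Budget left for d_sc + d_vad + d_fnc + d_FVT under the additive assumption.
processing_budget_ms = TOTAL_DELAY_MS - frame_ms

print(frame_ms)              # 20.0
print(tuner_ms)              # 7.5
print(processing_budget_ms)  # 60.0
```

Under these assumptions the fuzzy inference itself would take a small share of the delay budget even on a slow microcontroller.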

23 Quality of Speech (QoS)
- Attributes of speech quality for the Fuzzy Mobile Phone:
  - Stress, Intelligibility, Pleasantness, Loudness
- Overall quality evaluation by Mean Opinion Score (MOS):
  - ITU-T Rec. P.800: the most recognized methodology for evaluating the subjective quality of speech.
    - MOS is a five-level scale: {(1, Bad), (2, Poor), (3, Fair), (4, Good), (5, Excellent)}.
    - The listener's task is simply to rate the tested speech on the MOS scale.
    - The simple five-level scale is easy to use and provides instant, explicit information.
- ITU-T Rec. P.861: Perceptual Speech Quality Measure (PSQM), an objective measurement of speech quality (tools such as Opera support PSQM).
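Computing a MOS from listener ratings is just an average over the five-level scale. A minimal sketch follows; the ratings are hypothetical and the function name is chosen for illustration.

```python
MOS_SCALE = {1: "Bad", 2: "Poor", 3: "Fair", 4: "Good", 5: "Excellent"}


def mean_opinion_score(ratings):
    """Average a list of listener ratings given on the 1-5 MOS scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r not in MOS_SCALE for r in ratings):
        raise ValueError("ratings must be integers from 1 to 5")
    return sum(ratings) / len(ratings)


# Hypothetical ratings from five listeners for one test condition.
ratings = [4, 5, 3, 4, 4]
mos = mean_opinion_score(ratings)
print(mos, MOS_SCALE[round(mos)])  # 4.0 Good
```

Comparing the MOS with and without the volume tuner under the same noise condition would be one way to quantify its benefit.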

24 Summary and Conclusion
- Cellular phones with EFVT will have several benefits:
  - Improved QoS for stationary and non-stationary noise in mobile environments, because the fuzzy volume tuner uses information on the background noise level and class to adjust the volume level.
  - Some classes of noise, such as car noise, are low-frequency noise. They do not affect the intelligibility of speech as much as noise classes such as factory noise. Hence the fuzzy volume tuner has to depend on the noise class for effective volume adjustments.
  - The fuzzy volume tuner is easily embedded in the mobile handset because it has very low memory and computational requirements. The computations are carried out by the microcontroller within the baseband chip.
- The fuzzy volume tuner can be personalized based on the audiogram of a hearing-impaired person.
- The evolutionary algorithm tunes the scaling factors and membership functions and optimizes the fuzzy rule-base to improve cellphone performance.