AN ANALOG INTEGRATED-CIRCUIT VOCAL TRACT
PRESENTED BY: NIEL V JOSEPH, S7 AEI, ROLL NO. 46
GUIDED BY: MR. SANTHOSHKUMAR S., ASST. PROFESSOR, E&C DEPARTMENT



CONTENTS
• Introduction
• Human vocal tract
• Concept of speech locked loop
• Circuit model of vocal tract
• Two-port π-section
• Linear and nonlinear resistor modeling
• Driving the vocal tract
• Conclusion

INTRODUCTION
• First experimental integrated-circuit vocal tract
• Built as a 16-stage cascade of two-port π-elements
• Intended for analysis by synthesis
• Used inside a speech locked loop (SLL)

HUMAN VOCAL TRACT

• Its function is to filter the sound produced at the larynx
• Consists of the laryngeal cavity, pharynx, nasal cavity, and oral cavity
• Average length is 16.9 cm in adult males and 14.1 cm in adult females
• The larynx produces sound in mammals
• The lungs act as the power supply
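To give a feel for what these lengths imply, the sketch below treats the vocal tract as a uniform tube closed at the glottis and open at the lips, which resonates at odd multiples of a quarter wavelength. This is only a rough approximation of a neutral vowel; the speed of sound (35000 cm/s) and the function name are assumptions, not values from the slides.

```python
# Assumed speed of sound in warm, moist air (cm/s); not given on the slides.
SPEED_OF_SOUND_CM_PER_S = 35000.0


def uniform_tube_formants(length_cm, count=3, c=SPEED_OF_SOUND_CM_PER_S):
    """Resonances (Hz) of a uniform tube closed at one end and open at the other.

    Quarter-wavelength resonator: F_n = (2n - 1) * c / (4 * length).
    """
    return [(2 * n - 1) * c / (4.0 * length_cm) for n in range(1, count + 1)]


male_formants = uniform_tube_formants(16.9)    # adult male length from the slide
female_formants = uniform_tube_formants(14.1)  # adult female length from the slide

for label, formants in (("male", male_formants), ("female", female_formants)):
    print(label, [round(f) for f in formants])
```

The shorter tract gives higher resonances, which is one reason formant frequencies differ between speakers.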

• Controlled variations in the cross-sectional area of the vocal tract produce speech
• There are two sources of excitation:
  • a periodic source at the glottis
  • a turbulent noise source at some point along the tube
• Vocal fold vibration periodically interrupts the airflow

CONCEPT OF SPEECH LOCKED LOOP
FIG: CONCEPT OF SPEECH LOCKED LOOP

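• The SLL works by analysis by synthesis
• It is analogous to a phase locked loop (PLL)
• An error measure is computed by comparing the synthesized sound with the input
• The SLL locks to the input sound by settling on the optimal vocal tract profile
• The vocal tract plays the role of the voltage-controlled oscillator (VCO) in a PLL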
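The feedback idea can be illustrated with a deliberately tiny toy loop. Here the "vocal tract" is reduced to a single hypothetical parameter (the length of a uniform tube) and the "sound" to a single number (its first resonance); the loop adjusts the parameter until the synthesized value matches the input. The function names, the gain, and the speed of sound are all assumptions made for illustration, and the real SLL operates on full articulatory profiles rather than one scalar.

```python
def synthesize_f1(length_cm, c=35000.0):
    """Toy synthesizer: first resonance (Hz) of a uniform quarter-wave tube."""
    return c / (4.0 * length_cm)


def toy_speech_locked_loop(target_f1, length_cm=17.0, gain=0.5, steps=50):
    """Adjust the tube length until the synthesized F1 locks to the target.

    Each iteration synthesizes, compares against the input, and feeds the
    error back into the parameter (a damped Newton step, since
    dF1/dlength = -F1/length).
    """
    for _ in range(steps):
        f1 = synthesize_f1(length_cm)
        error = target_f1 - f1
        length_cm -= gain * error * length_cm / f1
    return length_cm


target = synthesize_f1(14.1)          # pretend the input came from a 14.1 cm tract
locked_length = toy_speech_locked_loop(target)
print(round(locked_length, 3))
```

As in a PLL, the quantity of interest is the control parameter the loop settles on, not the synthesized output itself.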

CIRCUIT MODEL OF VOCAL TRACT
• The vocal tract is modeled as a non-uniform acoustic tube
• It is terminated by the vocal cords at one end and by the lips/nose at the other
• The cross-sectional area is varied by changing the impedances at different points along the tube
• Wave propagation is approximately one-dimensional

• The wave equation for one-dimensional sound propagation in a uniform tube of circular cross section is written in terms of:
  P = sound pressure
  U = volume velocity
  A = area of cross section
  c = velocity of sound
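For reference, the lossless one-dimensional tube equations are commonly written in the form below, with x the distance along the tube and t time. The air density ρ is not among the symbols listed on the slide and is introduced here as an assumption.

```latex
\begin{aligned}
-\frac{\partial P}{\partial x} &= \frac{\rho}{A}\,\frac{\partial U}{\partial t}, \\
-\frac{\partial U}{\partial x} &= \frac{A}{\rho c^{2}}\,\frac{\partial P}{\partial t}.
\end{aligned}
```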

• Acoustic wave propagation in a tube is analogous to plane-wave propagation along an electrical transmission line
• The equation can therefore be rewritten in terms of:
  V = voltage
  I = current
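A sketch of the usual form of this analogy follows. Pressure P maps to voltage V and volume velocity U maps to current I, giving the lossless telegrapher's equations. The per-unit-length inductance L and capacitance C, and the air density ρ used to define them, are symbols assumed here rather than taken from the slide.

```latex
\begin{aligned}
-\frac{\partial V}{\partial x} &= L\,\frac{\partial I}{\partial t}, &
L &= \frac{\rho}{A}, \\
-\frac{\partial I}{\partial x} &= C\,\frac{\partial V}{\partial t}, &
C &= \frac{A}{\rho c^{2}}.
\end{aligned}
```

With these definitions the propagation speed is 1/\sqrt{LC} = c and the characteristic impedance is \sqrt{L/C} = \rho c / A, so a narrower tube section appears as a higher-impedance line section.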

FIG: SCHEMATIC DIAGRAM OF TRANSMISSION LINE VOCAL TRACT

• Transmission line (TL) model
• The TL comprises a cascade of two-port elements
• Current source model
• Variable impedance model
• Fluid volume velocity is mapped to current
• Fluid pressure is mapped to voltage
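The mapping can be made concrete with a minimal sketch that converts one tube section (area and length) into lumped element values. It assumes the standard acoustic analogy, inertance L = ρ·l/A and compliance C = A·l/(ρc²); the constants, the function name, and the 3 cm² example area are assumptions, not values from the slides.

```python
import math

RHO = 1.14e-3     # assumed air density, g/cm^3
C_SOUND = 3.5e4   # assumed speed of sound, cm/s


def section_elements(area_cm2, length_cm):
    """Lumped series inertance L and shunt compliance C of one tube section.

    In the electrical analogy L plays the role of inductance and C the role
    of capacitance (CGS acoustic units).
    """
    inertance = RHO * length_cm / area_cm2
    compliance = area_cm2 * length_cm / (RHO * C_SOUND ** 2)
    return inertance, compliance


# A 16.9 cm tract split into 16 equal sections, as in the 16-stage cascade.
section_len = 16.9 / 16
L_sec, C_sec = section_elements(3.0, section_len)

delay_s = math.sqrt(L_sec * C_sec)   # propagation delay of one section
z0 = math.sqrt(L_sec / C_sec)        # characteristic impedance, rho*c/A
print(delay_s, z0)
```

Halving the area doubles L and halves C, so the delay per section stays fixed while the impedance doubles. This is why the area profile can be set purely by tuning element values.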

TWO-PORT π-SECTION
FIG: PASSIVE π CIRCUIT MODEL ASSUMING RIGID WALLS
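As an illustration of how a cascade of passive π-sections behaves, the sketch below chains the ABCD (transmission) matrices of lossless sections, each modeled as shunt C/2, series L, shunt C/2. With an ideal open end at the lips (zero pressure), the volume-velocity gain is 1/D, so resonances occur where the D term crosses zero. The constants, the uniform 3 cm² area, and all function names are assumptions. This is a numerical model of the idealized network, not of the chip.

```python
import math

RHO = 1.14e-3     # assumed air density, g/cm^3
C_SOUND = 3.5e4   # assumed speed of sound, cm/s


def matmul2(m, n):
    """Product of two 2x2 matrices given as nested tuples."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def pi_section(area_cm2, length_cm, omega):
    """ABCD matrix of one lossless pi-section: shunt C/2, series L, shunt C/2."""
    inertance = RHO * length_cm / area_cm2
    compliance = area_cm2 * length_cm / (RHO * C_SOUND ** 2)
    z = 1j * omega * inertance
    y = 1j * omega * compliance / 2.0
    shunt = ((1.0, 0.0), (y, 1.0))
    series = ((1.0, z), (0.0, 1.0))
    return matmul2(matmul2(shunt, series), shunt)


def tract_d_term(areas_cm2, section_len_cm, freq_hz):
    """Real D term of the whole cascade, ordered from glottis to lips."""
    omega = 2.0 * math.pi * freq_hz
    total = ((1.0, 0.0), (0.0, 1.0))
    for area in areas_cm2:
        total = matmul2(total, pi_section(area, section_len_cm, omega))
    return total[1][1].real


def first_resonance(areas_cm2, section_len_cm, f_max=1500.0, step=1.0):
    """Lowest frequency (Hz) at which D changes sign, or None if not found."""
    f = step
    prev = tract_d_term(areas_cm2, section_len_cm, f)
    while f + step <= f_max:
        cur = tract_d_term(areas_cm2, section_len_cm, f + step)
        if prev * cur <= 0.0:
            return f + step / 2.0
        prev, f = cur, f + step
    return None


uniform_areas = [3.0] * 16                      # 16 identical sections
f1 = first_resonance(uniform_areas, 16.9 / 16)  # 16.9 cm tract
print(f1)
```

For a uniform area profile the result should sit close to the quarter-wavelength value c/(4·length); replacing `uniform_areas` with a non-uniform profile shifts the resonances.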

FIG: CIRCUIT DIAGRAM OF TUNABLE TWO-PORT π-SECTION

FIG: MEASURED SIGNAL AND NOISE CHARACTERISTICS AS A FUNCTION OF INPUT FREQUENCY
• SNR is 64, 66, and 63 dB at the first three formants (F1, F2, F3) of the vowel /e/

LINEAR AND NONLINEAR RESISTOR MODELING
• Both resistors are implemented with MOS transistors
• The glottal constriction resistance is a series combination of a linear and a nonlinear resistor
• For the linear characteristic, I ∝ V
• For the nonlinear characteristic, I ∝ √V
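The series combination can be worked through numerically. Assume the linear element obeys V₁ = R·I and the nonlinear element obeys I = k·√V₂, so that V₂ = (I/k)². Since the same current flows through both, the total drop is V = R·I + I²/k², a quadratic in I. The symbols R and k and the function name are hypothetical; the slides give only the proportionalities.

```python
import math


def series_current(v_total, r_linear, k_nonlinear):
    """Current through a linear resistor in series with a sqrt-law element.

    Solves v_total = r_linear * i + (i / k_nonlinear) ** 2 for i >= 0,
    using the numerically stable form of the quadratic root.
    """
    if v_total < 0 or r_linear < 0 or k_nonlinear <= 0:
        raise ValueError("expects v_total >= 0, r_linear >= 0, k_nonlinear > 0")
    if v_total == 0:
        return 0.0
    disc = math.sqrt(r_linear ** 2 + 4.0 * v_total / k_nonlinear ** 2)
    return 2.0 * v_total / (r_linear + disc)


i = series_current(v_total=1.0, r_linear=2.0, k_nonlinear=0.5)
# Check: the two voltage drops add back up to the applied voltage.
print(i, 2.0 * i + (i / 0.5) ** 2)
```

At small currents the linear term dominates and I is roughly V/R; at large currents the square-law drop dominates and I approaches k·√V.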

FIG: I-V CURVES FOR A TYPICAL nMOS TRANSISTOR

FIG: MEASURED I-V CHARACTERISTICS OF MOS RESISTOR CONFIGURED AS A 100 GIGAOHM RESISTANCE

DRIVING THE VOCAL TRACT
• In principle the vocal tract circuit can produce all speech sounds
• To do so it must be given the area function, the glottal excitation source, and the turbulent noise source
• The area function has a large number of degrees of freedom
• To reduce the dimensionality, the Maeda articulatory model is used

• The Maeda articulatory model describes the vocal tract profile using seven components:
  1. Jaw height
  2. Tongue body position
  3. Tongue body shape
  4. Tongue tip
  5. Lip height
  6. Lip protrusion
  7. Larynx height

• The articulatory codebook contains mappings from the articulatory domain to the acoustic domain
• A set of vocal tract profiles is generated
• A 'babble' is synthesized using each vocal tract profile


• A DCT is applied to generate a set of 12 cepstral coefficients
• These are compared against the codebook
• The best match is found, and the corresponding articulatory parameters are used to produce the vocal tract area profile
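The lookup described above can be sketched as follows. The sketch assumes that the DCT (type II here) is applied to a log-magnitude spectrum, that the first 12 coefficients are kept, and that "best match" means smallest squared Euclidean distance. None of these details are stated on the slides, and the codebook layout and all names are hypothetical.

```python
import math


def dct_cepstrum(log_spectrum, n_coeffs=12):
    """First n_coeffs DCT-II coefficients of a log-magnitude spectrum."""
    m = len(log_spectrum)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / m)
            for i, v in enumerate(log_spectrum))
        for k in range(n_coeffs)
    ]


def best_match(cepstrum, codebook):
    """Return the articulatory parameters of the closest codebook entry."""
    def squared_distance(entry):
        return sum((a - b) ** 2 for a, b in zip(cepstrum, entry["cepstrum"]))
    return min(codebook, key=squared_distance)["articulatory"]


# Hypothetical three-entry codebook built from made-up log spectra.
spectra = {
    "profile_a": [math.sin(0.2 * i) for i in range(32)],
    "profile_b": [math.cos(0.5 * i) for i in range(32)],
    "profile_c": [0.1 * i for i in range(32)],
}
codebook = [
    {"articulatory": name, "cepstrum": dct_cepstrum(spec)}
    for name, spec in spectra.items()
]

# A slightly perturbed version of profile_b should still map back to it.
query = dct_cepstrum([v + 0.01 for v in spectra["profile_b"]])
print(best_match(query, codebook))
```

In a real system each `articulatory` field would hold a vector of model parameters (for example the seven Maeda components) rather than a label.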

CONCLUSION
• The chip consumes less than 275 µW of power
• It is used in a speech locked loop to generate speech
• The cross-sectional area of the tube can be varied by varying the L and C values
• It can also be used in speech compression and speech recognition

• For many speech synthesis applications, a bandwidth of 5 to 7 kHz is sufficient
FIG: SPECTROGRAM OF A RECORDING OF THE WORD 'MASSACHUSETTS' SYNTHESIZED BY THE ANALOG VOCAL TRACT (AVT) USING A FEMALE VOCAL TRACT


