ISIP: Research Presentation Seungchan Lee Feb.16.2006 Page 0 of 36 Seungchan Lee Intelligent Electronic Systems Human and Systems Engineering Department.


Research Presentation: On a Utility for Speaker Verification
Seungchan Lee
Intelligent Electronic Systems
Human and Systems Engineering
Department of Electrical and Computer Engineering

Set up a standard IES environment
- My first impression of CAVS is good. The first thing to do is to set up the IES environment.
- Create an enlistment.
- Our production system consists of many classes.
- I was surprised by the structure of our software environment. Even though much work has already been done, I need to consolidate our system with the other IFCers.
- GroupWise: a good communication and schedule management tool within our group.
- After that, I could write a program and compile it on my local machine.
- Diagram: client, CVS repository, server.

First IFC program, instructions
The first simple IFC program does the following:
- Reads a 3×3 float matrix from an Sof file.
- Reads a 3×1 float vector from an Sof file.
- Multiplies the vector and matrix using the equation Z = alpha*A*B.
- Writes the result to an Sof file.
- Allows the value of alpha to be set from the command line: foo.exe -alpha 2.0 input.sof output.sof
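
The multiplication step Z = alpha*A*B can be sketched in standard C++ as follows. This is only an illustration under stated assumptions: it uses std::vector rather than the IFC matrix and vector classes, leaves out the Sof reading and writing, and the names Matrix, Vector, and scaled_product are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Plain containers stand in for the IFC matrix and vector classes (assumption).
using Matrix = std::vector<std::vector<float>>;
using Vector = std::vector<float>;

// Computes Z = alpha * A * B, where A is an n x m matrix and B has m entries.
Vector scaled_product(float alpha, const Matrix& a, const Vector& b) {
    Vector z(a.size(), 0.0f);
    for (std::size_t row = 0; row < a.size(); ++row) {
        assert(a[row].size() == b.size());  // shapes must agree
        float sum = 0.0f;
        for (std::size_t col = 0; col < b.size(); ++col) {
            sum += a[row][col] * b[col];
        }
        z[row] = alpha * sum;  // scale the row result by alpha
    }
    return z;
}
```

For the case on the slide, A would be 3×3, B would have three entries, and alpha would come from the -alpha command-line option.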

First IFC program, flow
Command: foo.exe -alpha 2.0 input_file output_file
1. Read the input Sof file: read the 3×3 float matrix and the 3×1 float vector.
2. Multiply the vector and matrix.
3. Write the result to the output Sof file.

First IFC program
- After completing the first IFC program, I am more familiar with our production system.
- When I have questions about our production system, our experienced group members always help me.
- It is good to study alone, but sometimes it is better to ask an expert programmer.
- The more I know about our production system, the more questions I have.

First IFC program
First question: How can we view the contents of a class?
Answer: It is possible through the debug method.
- The contents of an Sof object are hard to inspect in a debugger.
- Instead, I used the debug method that is included in the source code.
- This may sometimes slow down debugging, but it is the best way I know so far.
- With it, I can see which data an Sof object contains. The same applies to the variables of all other classes.

First IFC program
Second question: When using the debug method, why does the string capacity change?
SysString token = "oscar had a heap of apples"
Using the debug method, we can see each value:
value_d = (5 >= 5) oscar
value_d = (5 >= 3) had
value_d = (5 >= 1) a
value_d = (5 >= 4) heap
value_d = (5 >= 2) of
value_d = (6 >= 6) apples
Answer: In the expression (n >= m), n is the total capacity of the data structure and m is the current length. So for the line
value_d = (5 >= 3) had
the capacity of the SysString is 5 and the current length is 3, which matches the string 'had'. The capacity grows to 6 only when the six-character token 'apples' no longer fits.
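
The same capacity-versus-length behavior can be observed with the standard library. The sketch below uses std::wstring as a stand-in for SysString (an assumption; SysString's internals are not shown in the slides), and StorageInfo and assign_and_inspect are hypothetical names. The exact capacity values depend on the library implementation, so only the relation capacity >= length is relied on.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct StorageInfo {
    std::size_t capacity;  // characters the buffer can hold without growing
    std::size_t length;    // characters currently stored
};

// Reuses one buffer for a new token and reports (capacity >= length),
// the same pair of numbers the debug output on the slide prints.
StorageInfo assign_and_inspect(std::wstring& buffer, const std::wstring& token) {
    buffer.assign(token);  // implementations typically keep an allocation that is large enough
    assert(buffer.capacity() >= buffer.size());
    return {buffer.capacity(), buffer.size()};
}
```

Calling this repeatedly with the tokens from the slide shows the length changing on every call while the capacity changes only when a longer token forces the buffer to grow.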

First IFC program
Third question: What does "L" mean?
- In our production system, all of the classes use the "L" character. For example:
SysString file1;
file1.assign(L"/tmp/foo_bin.sof");
- I had not figured out exactly why this "L" is used.
Answer: The "L" is a prefix on the string literal rather than a macro. It tells the compiler that the following literal is a wide-character (wchar_t) string, which is what allows it to hold Unicode text.
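
A small standard C++ sketch of the difference between a narrow literal and an L-prefixed wide literal. It uses std::wstring instead of SysString, and widen_ascii is a hypothetical helper that only handles ASCII input.

```cpp
#include <cassert>
#include <string>

// Converts a plain ASCII string to a wide string, one character at a time.
// "abc" has elements of type char; L"abc" has elements of type wchar_t.
std::wstring widen_ascii(const std::string& narrow) {
    std::wstring wide;
    wide.reserve(narrow.size());
    for (char c : narrow) {
        assert(static_cast<unsigned char>(c) < 128);  // ASCII only in this sketch
        wide.push_back(static_cast<wchar_t>(c));
    }
    return wide;
}
```

With this helper, widen_ascii("/tmp/foo_bin.sof") compares equal to the literal L"/tmp/foo_bin.sof", which is the form the production system classes expect.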

ISIP_VERIFY
Basic work flow:
- Decide what is added and removed in the new version
- Analyze the old version
- Draw class diagrams
- Design the new version
- Coding and compilation
- Testing and fixing bugs

ISIP_VERIFY
Decide what is added and removed in the new version
- Currently, isip_verify performs speaker verification, but only with the HMM algorithm. We want the new isip_verify to perform that function using the HMM, SVM, and RVM algorithms. This makes the new version of isip_verify a more general utility than the old one.
Analyze the old version
- The isip_verify utility uses the SpeakerVerifier, VerifyHMM, and HMM classes, and does both training and testing.
- Unlike the "GMM" case, the "SVM" statistical model has two separate utilities: isip_svm_learn handles training, and isip_svm_classify handles testing.

ISIP_VERIFY
The problem:
1. isip_verify can only process the "GMM" statistical model.
2. We do not have an "RVM" routine that provides the same functions as the "SVM" utilities.
Solution:
1. Add SVM and RVM routines to isip_verify.
2. Add the same functionality to the RVM class.
3. Modify the SpeakerVerifier class.
With these changes we can make one utility that performs all of the functions mentioned above. To begin with, I drew class block diagrams of each utility to confirm the relationships between classes and functions. After that, the utilities were easier to understand. Next, I drew the flow chart of the new utility.

ISIP_VERIFY
Class block diagram of isip_verify. Labels in the diagram:
- Classes: isip_verify (util/speech), SpeakerVerifier (asr), VerifyHMM (pr), HiddenMarkovModel (pr)
- isip_verify: parameter check
- SpeakerVerifier: if algorithm = HMM
- VerifyHMM: if algorithm = VERIFY, Verify(); if implementation = LIKELIHOOD, Verifyl(); LIKELIHOOD RATIO, Verifylr()
- VerifyHMM: if algorithm = TRAIN, train and model creation; set algorithm = TRAIN, set implementation = BAUM WELCH; else linearDecoder()
- HiddenMarkovModel: run()

ISIP_VERIFY: isip_svm_learn
Class block diagram of isip_svm_learn. Labels in the diagram:
- isip_svm_learn (util/speech): parameter check
- SupportVectorMachine (pr): loadFeature() for the positive and negative examples; train(); if algorithm = SEQUENTIAL_MINIMAL_OPTIMIZATION, sequentialMinimalOptimization(), which determines the support vectors; writeModel()
- StatisticalModel (stat), SupportVectorModel type; StatisticalModelBase
- SupportVectorModel (stat): getSupportVectorModel(), getBias(), getKernels(), getAlphas(), getSupportVectors(), write()

ISIP_VERIFY: isip_svm_classify
Class block diagram of isip_svm_classify. Labels in the diagram:
- isip_svm_classify (util/speech): writes the distance to the output file
- StatisticalModel (stat): read(), getSupportVectorModel(), getDistance()
- AudioDatabase (mmedia): open(), getRecord()
- FeatureFile (mmedia): open(), getBufferData()

ISIP_VERIFY: flow chart
Command line: isip_verify -param.sof.... -algo_type [hmm,svm,rvm] -mode [train, test]
Flow chart of isip_verify (new version):
- The algorithm branch selects HMM, SVM, or RVM. SVM and RVM each branch on mode (train or test) into svmTrain(), svmTest(), rvmTrain(), and rvmTest().
- The HMM branch is the old version of isip_verify: the VerifyHMM class processes the parameter file for isip_verify and can do both training and testing (gmmVerify()).
- Checks in the chart: the "algo_type" option, the "mode" option, and statistical_model = "GMM", "SVM", or "RVM". A failed statistical_model check is an error (model incorrect).
- Messages in the chart: "Since no algo_type was specified, HMM algo_type was chosen" and "You must specify mode" (error).
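
The branching in the flow chart can be sketched as a small dispatch function. This is a simplified illustration, not the isip_verify source: select_routine is a hypothetical name, it returns the name of the routine as text instead of calling it, and the order in which the chart performs its checks is an assumption.

```cpp
#include <cassert>
#include <string>

// Returns the name of the routine the flow chart reaches, or an error text.
std::string select_routine(std::string algo_type, const std::string& mode,
                           const std::string& statistical_model) {
    if (algo_type.empty()) {
        algo_type = "hmm";  // chart: "no algo_type was specified, HMM algo_type was chosen"
    }
    if (algo_type == "hmm") {
        if (statistical_model != "GMM") return "error: model incorrect";
        return "gmmVerify";  // the old path handles both training and testing
    }
    if (algo_type != "svm" && algo_type != "rvm") return "error: unknown algo_type";
    assert(algo_type == "svm" || algo_type == "rvm");
    const std::string expected_model = (algo_type == "svm") ? "SVM" : "RVM";
    if (statistical_model != expected_model) return "error: model incorrect";
    if (mode == "train") return algo_type + "Train";  // svmTrain or rvmTrain
    if (mode == "test") return algo_type + "Test";    // svmTest or rvmTest
    return "error: you must specify mode";
}
```

Keeping the selection in one function makes it easy to see that every combination of options ends either in one routine or in one of the error messages from the chart.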

ISIP_VERIFY
Coding and compilation
1. Add and remove parameters, and check the parameters (Won).
2. Combine the three functionalities: the new isip_verify calls its run() method, and run() calls the support vector machine object or the relevance vector machine object, which then performs training. This lets us implement three models in one utility.
3. SpeakerVerifier class:
- include the SVM and RVM classes
- modify the parameter check
- modify the run(sdb) method
- add a run(pos_sdb, neg_sdb) method
4. RVM class: add training and testing modules (Sridhar).

ISIP_VERIFY
Problems during coding and compilation
- How do we verify the SpeakerVerifier class? After modifying an existing class, we need to verify its correctness. In our production system the diagnose method does this. It is implemented in *_02.cc in every class. After compiling the class, we run "make test", which automatically checks every function in that class.
- How can we resolve a segmentation fault? Finding the cause is one of the most difficult tasks. Comment out all new modules, then add one module back, compile the class, and test it. Repeat until every new module has been tested.

ISIP_VERIFY
Problems during coding and compilation
- Compilation and debugging time: when developing a new program, compiling and debugging are among the most time-consuming tasks.
- In our production system, it takes a long time to compile and debug a program, because compiling involves many link steps.
- How can we resolve it? It is faster to work in our local repository.

ISIP_VERIFY
Testing and fixing bugs
- This part is as important as the previous steps.
- We can find faults and omissions during this step.
- Problem: what happens to the sdb object?
- Normally, the sdb object contains every command-line option (except parameters).
- However, the sdb object lost its contents when it was passed to the SpeakerVerifier class.
- How to fix it? Comment out all code except the control code. The cause turned out to be that I had not given the list file option.

Software Release
What do we need to know for a software release?
- Varmint utility: used to track down all problems.
- Production system: to understand our system better, I have worked on, and will continue with, the following: data preparation, feature extraction, recognition, acoustic modeling, and language modeling.
- These are explained in more detail after this topic.

Software Release
ProductionRuleTokenType class
- It uses many if-else statements in its read/write functions.
- Instead, we can use the NameMap class.
- To do that, I declared a NameMap object and modified the related modules.
Problem: I ran into run-time errors.
Solution:
- I made a simple program that includes the diagnose method from prtt_02.cc.
- After tracing the function, I found the cause.
- This was the first class I checked in to our production system.
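
The idea of replacing a chain of if-else statements with a name table can be sketched as follows. NameTable is a hypothetical stand-in written for this illustration; it is not the ISIP NameMap class, whose interface is not shown in the slides, and the token names in the usage note are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// One table serves both directions: name -> index when reading,
// index -> name when writing, so no if-else chain is needed.
class NameTable {
public:
    explicit NameTable(std::vector<std::string> names) : names_(std::move(names)) {}

    // Returns the index stored for a name, or -1 when the name is unknown.
    int index_of(const std::string& name) const {
        for (std::size_t i = 0; i < names_.size(); ++i) {
            if (names_[i] == name) return static_cast<int>(i);
        }
        return -1;
    }

    // Returns the name stored at an index.
    const std::string& name_of(std::size_t index) const {
        assert(index < names_.size());
        return names_[index];
    }

private:
    std::vector<std::string> names_;
};
```

A table built from the names "TERMINAL", "NON_TERMINAL", and "EPSILON" (hypothetical values) would map "NON_TERMINAL" to 1 on read and map 2 back to "EPSILON" on write, and adding a new token type means adding one table entry rather than two new branches.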

Software Release
isip_lm_tester
- This utility randomly generates sentences based on the language model file and tests the language model.
- Problem: currently, generating state transcriptions does not generate anything past the first symbols at the highest level.
- What to do? I need to track down this problem, but that requires an understanding of the language model.
- Plan: read and study our production system tutorial thoroughly, and then take part in fixing bugs in isip_lm_tester.

Production System
How can we better understand our production system?
- Data preparation
- Feature extraction
- Recognition
- Acoustic modeling
- Language modeling
In this part, I will go from data preparation to feature extraction.

Production System, Data Preparation
Data preparation
- Why is it difficult for a beginner?
- In ordinary programming, preparing input data is not hard.
- In our production system, however, it is not easy for a beginner.
- It requires knowledge of speech, including speech file formats, file conversion, and sampling.
- Speech files:
- Header + sampled data: WAV, Sof, SPHERE, AU
- Sampled data only: raw files

Production System, Data Preparation
Sof format
- Stores the location of each object in the file, together with the corresponding object data.
- Supports two basic storage formats:
- text: human-readable files
- binary: sampled data
- Used by all data objects in the ISIP environment to unify and simplify I/O.
- Binary format:
- Handles machine architecture differences with automatic byte transformations.
- Used for large quantities of data, for the obvious efficiency gains.
- The objects are stored in a binary tree, and a symbol table holds the object class names.

Production System, Data Preparation
Text format
- Used for user input parameter files in the ISIP environment.
- A simple format that consists of object names and tags, followed by the object data.
- Example: Float value = VectorFloat value = VectorLong value = 2,3;

Production System, Data Preparation
Converting from an external format (e.g., SPHERE, WAV) to raw format: speech.sph -> speech.raw
1. Convert the SPHERE file's binary data to 16-bit linear samples using w_decode:
w_decode -o pcm speech.sph speech-nb.sph
2. Strip the file's header using h_strip:
h_strip speech-nb.sph speech.raw
3. The result is speech.raw, which is identical to the input except that the first 1024 bytes of header information are missing.
4. One-line command:
w_decode -o pcm speech.sph - | h_strip - - > speech.raw
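
What header stripping does to the bytes of a file can be sketched in a few lines. This is not the h_strip implementation: strip_header is a hypothetical function operating on bytes already held in memory, and the 1024-byte default is taken from the slide. The size is left as a parameter in case a file uses a different header length.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the sample bytes that follow a fixed-size header.
std::vector<unsigned char> strip_header(const std::vector<unsigned char>& file_bytes,
                                        std::size_t header_size = 1024) {
    assert(file_bytes.size() >= header_size);  // the file must contain a full header
    const auto first_sample = file_bytes.begin() + static_cast<std::ptrdiff_t>(header_size);
    return std::vector<unsigned char>(first_sample, file_bytes.end());
}
```

The result is exactly header_size bytes shorter than the input, which is the property the file size comparison in the verification step relies on.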

Production System, Data Preparation
Verification of conversion to raw
- SoX: audio playback
sox -t.sw -r speech.raw -t.au speech.au
audioplay speech.au
- File size comparison using "ls -l":
ls -l speech.*
-rw-rw-r-- 1 may isip Sep 10 15:19 speech.raw
-rw-rw-r-- 1 may isip Sep 10 15:12 speech.sph
The fifth field of the full listing is the file size; speech.raw is 1024 bytes smaller than speech.sph.
- Octal dump (od): listing values
od -t d2 speech.raw
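
The od -t d2 listing prints the raw file as signed 16-bit decimal values. The sketch below performs the same interpretation on bytes in memory. decode_samples is a hypothetical name, and the byte order (low byte first) is an assumption; od itself uses the byte order of the machine it runs on.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Interprets raw bytes as 16-bit signed samples, low byte first (assumed order).
std::vector<std::int16_t> decode_samples(const std::vector<unsigned char>& raw) {
    assert(raw.size() % 2 == 0);  // each sample occupies two bytes
    std::vector<std::int16_t> samples;
    samples.reserve(raw.size() / 2);
    for (std::size_t i = 0; i + 1 < raw.size(); i += 2) {
        // Combine the two bytes, then reinterpret the 16 bits as a signed value.
        const std::uint16_t bits = static_cast<std::uint16_t>(raw[i] | (raw[i + 1] << 8));
        samples.push_back(static_cast<std::int16_t>(bits));
    }
    return samples;
}
```

If the decoded values look like noise with very large magnitudes, the byte order assumption is the first thing to check.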

Production System, Data Preparation
Creating an Sof file: raw file -> Sof file
- Use isip_make_sof. Type the following:
isip_make_sof speech.raw
- This creates a binary file. To create a text version, type the following:
isip_make_sof -type text -suffix _text speech.raw
- Diagram: data -> isip_make_sof -> header + data

Production System, Feature Extraction
What is feature extraction?
- A speech recognizer does not understand the human voice directly.
- Only certain features of the human voice are useful for recognizer decoding.
- These features must be numerically measured and stored as a feature vector.
- The process of taking these measurements is known as feature extraction.
- It includes the following:
- converting the signal to digital form
- measuring some important characteristics of the signal
- augmenting these measurements
- Diagram: human voice -> microphone -> digital signal

Production System, Feature Extraction
Frame
- Typical frame duration in speech recognition is 10 ms.
- Determines how often we produce a feature vector.
Window
- Typical window duration is 25 ms.
- Surrounds the frame, for a smoother representation of the speech data.
- Determines the number of samples.
Sampling rate
- The number of samples per second taken from a continuous signal to make a discrete signal.
- Example: at an 8 kHz sampling rate with a frame duration of 10 ms, measurements are taken over 80 samples to produce one feature vector.
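
The arithmetic behind the example is samples = sampling rate × duration. A minimal sketch, with samples_in as a hypothetical helper name:

```cpp
#include <cassert>
#include <cstddef>

// Number of samples covered by a span of the given duration,
// rounded to the nearest whole sample.
std::size_t samples_in(double sample_rate_hz, double duration_ms) {
    assert(sample_rate_hz > 0.0 && duration_ms >= 0.0);
    return static_cast<std::size_t>(sample_rate_hz * duration_ms / 1000.0 + 0.5);
}
```

At 8 kHz, samples_in(8000.0, 10.0) gives the 80 samples per frame from the slide, and samples_in(8000.0, 25.0) gives 200 samples for the typical 25 ms window.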

Production System, Feature Extraction, Signal Flow Graph
Basic process of extracting a single feature
- Input: speech data stored in digital form on a computer.
- Energy: a computer program or algorithm specifically designed to measure energy values in the speech data.
- Output: a computer file that stores the feature measurements.
- Graph components: input, Energy, output
Including a window
- The window determines the number of samples used to calculate the energy measurements.
- Graph components: input, Wind, Energy, output
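
How the frame and the window interact in an energy measurement can be sketched as follows. This is not the production system's Energy component: frame_energies is a hypothetical function, energy is assumed to be the sum of squared samples, the window is assumed to be rectangular and centered on the frame, and samples outside the signal are treated as zero.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Computes one energy value per frame. Each value sums x[n]^2 over a window of
// window_len samples centered on a frame of frame_len samples.
std::vector<double> frame_energies(const std::vector<double>& signal,
                                   std::size_t frame_len, std::size_t window_len) {
    assert(frame_len > 0 && window_len >= frame_len);
    const long n = static_cast<long>(signal.size());
    const long frame = static_cast<long>(frame_len);
    const long window = static_cast<long>(window_len);
    const long pad = (window - frame) / 2;  // samples the window extends before the frame
    std::vector<double> energies;
    for (long start = 0; start + frame <= n; start += frame) {
        double energy = 0.0;
        for (long i = start - pad; i < start - pad + window; ++i) {
            if (i >= 0 && i < n) {  // positions outside the signal contribute nothing
                const double x = signal[static_cast<std::size_t>(i)];
                energy += x * x;
            }
        }
        energies.push_back(energy);
    }
    return energies;
}
```

The frame length sets how many values are produced, while the window length sets how many samples go into each value, which is the distinction the two slides draw.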

Production System, Feature Extraction, Signal Flow Graph
Process of computing the frequency spectrum for a speech signal
- Energy is measured in the time domain.
- Spec converts signals from the time domain to the frequency domain; it represents the Fourier transform.
- Graph components: input, Wind, Spec, output
Additional methods are needed to fully measure the features a speech recognizer needs.
- Further analysis of the FFT of the speech signal is required.
- MFCC: uses a mathematical transformation called the cepstrum, which computes the inverse Fourier transform of the log-spectrum of the speech signal.
- Graph components: input, Wind, Spec, Ceps, output
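
The cepstrum definition quoted above (inverse Fourier transform of the log-spectrum) can be sketched directly. This is a plain real cepstrum, not a full MFCC front end: there is no mel-frequency warping, the transform is a naive O(N^2) DFT rather than an FFT, and the function names and the small floor added before the logarithm are assumptions made for this illustration.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;

// Naive discrete Fourier transform. sign = -1 gives the forward transform,
// sign = +1 the inverse transform without the 1/N scaling.
std::vector<Complex> dft(const std::vector<Complex>& x, int sign) {
    const double pi = std::acos(-1.0);
    const std::size_t n = x.size();
    std::vector<Complex> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        Complex sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            const double angle =
                sign * 2.0 * pi * static_cast<double>(k * t) / static_cast<double>(n);
            sum += x[t] * std::polar(1.0, angle);
        }
        out[k] = sum;
    }
    return out;
}

// Real cepstrum: inverse transform of the log magnitude spectrum.
std::vector<double> real_cepstrum(const std::vector<double>& frame) {
    assert(!frame.empty());
    const std::vector<Complex> x(frame.begin(), frame.end());
    std::vector<Complex> spectrum = dft(x, -1);
    for (Complex& bin : spectrum) {
        bin = std::log(std::abs(bin) + 1e-12);  // small floor avoids log(0)
    }
    const std::vector<Complex> inverse = dft(spectrum, +1);
    std::vector<double> cepstrum(frame.size());
    for (std::size_t i = 0; i < frame.size(); ++i) {
        cepstrum[i] = inverse[i].real() / static_cast<double>(frame.size());
    }
    return cepstrum;
}
```

A unit impulse has a flat spectrum of magnitude 1, so its log-spectrum and therefore its cepstrum are essentially zero, which makes it a convenient sanity check.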

Production System, Feature Extraction, Signal Flow Graph
Recipe
- The information for each component is stored in a single entity:
- format of the speech input
- algorithms for extracting the features
- format of the output
- Make a recipe using isip_transform.
- Example (Recipe1): a simple signal flow graph for extracting energy, with components inp, Engy, and out, stored in a recipe file.

Production System, Feature Extraction, Signal Flow Graph
More complex recipes
- A single recipe file is produced for the entire graph.
- Example (Recipe2): components inp, Wind, Engy, Ceps, and out, stored in a recipe file.

Q & A
1. Ordinary data types and functions
- In our production system, every data type is provided by one of our classes. Why do we use Float instead of float? This confused me.
- When I tried to build a command-line interface, I used cout and cin as in an ordinary C++ class. However, the situation is different in our system.

Reference
Production System Tutorial: fundamentals/current/