Chapter 12 Search and Speaker Adaptation 12.1 General Search Algorithms 12.2 Search Algorithms for Speech Recognition 12.3 Language Model States 12.4 Speaker Adaptation

12.1 General Search Algorithms (1) General Graph-Searching Procedures: The Graph-Search Algorithm
1. Initialization: put the start node S in the OPEN list and create an initially empty CLOSED list.
2. If the OPEN list is empty, exit and declare failure.
3. Remove the first node N from the OPEN list and put it into the CLOSED list.
4. If node N is a goal node, exit successfully with the solution obtained by tracing the back pointers along the path from N to S.

General Search Algorithms (2)
5. Expand node N by applying the successor operator to generate the successor set SS(N) of node N, being sure to eliminate the ancestors of N from SS(N).
6. For each v ∈ SS(N):
6a. (optional) If v ∈ OPEN and the accumulated distance (cost) of the new path is smaller than that of the one already in the OPEN list: (1) change the back (parent) pointer of v to N and adjust the accumulated distance (cost) for v; (2) go to step 7.

General Search Algorithms (3)
6b. (optional) If v ∈ CLOSED and the accumulated distance (cost) of the new path is smaller than that of the partial path ending at v in the CLOSED list: (1) change the back (parent) pointer of v to N and adjust the accumulated distance (cost) for all paths containing v; (2) go to step 7.
6c. Otherwise, set the back pointer of v to N and push v onto the OPEN list.
7. Reorder the OPEN list according to the search strategy or some heuristic measure.
8. Go to step 2.
A minimal Python sketch of these steps follows.
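The sketch below makes the control flow of steps 1-8 concrete. It is illustrative only: hashable nodes, the `successors` function, and the `reorder` strategy hook are assumptions added for the example, not part of the original algorithm statement, and step 6b is simplified (costs of descendants of a reopened node are not propagated).

```python
def graph_search(start, is_goal, successors, reorder):
    """Minimal sketch of the graph-search algorithm (steps 1-8).

    successors(n) yields (child, edge_cost) pairs; reorder(open_list, g)
    implements step 7 for the chosen search strategy.
    """
    open_list = [start]            # step 1: OPEN holds the start node S
    closed = set()                 # step 1: CLOSED starts empty
    parent = {start: None}         # back pointers for tracing the solution
    g = {start: 0.0}               # accumulated distance (cost) from S

    while open_list:               # step 2: fail when OPEN is empty
        node = open_list.pop(0)    # step 3: remove the first node N from OPEN
        closed.add(node)           # ... and put it into CLOSED
        if is_goal(node):          # step 4: trace back pointers from N to S
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for child, cost in successors(node):   # step 5: expand N
            new_g = g[node] + cost
            if child in g and new_g >= g[child]:
                continue           # steps 6a/6b: keep only a cheaper path
            parent[child] = node   # redirect the back (parent) pointer
            g[child] = new_g       # (simplified: descendants not re-updated)
            if child not in closed and child not in open_list:
                open_list.append(child)        # step 6c: push onto OPEN
        reorder(open_list, g)      # step 7: strategy-specific reordering
    return None                    # step 2: failure
```

With `reorder = lambda open_list, g: None` the OPEN list acts as a FIFO queue and the search is breadth-first; treating it as a stack instead gives depth-first search.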

General Search Algorithms (4)
Depth-First Search and Breadth-First Search: obtained by modifying step 7 accordingly (treating the OPEN list as a stack or as a queue, respectively).
Heuristic Graph Search Algorithm: uses guidance to steer the search in a promising direction. In general, this hill-climbing style of guidance helps us reach the destination much more efficiently. The guidance requires domain-specific knowledge and is called a heuristic. In most practical problems, the choice among heuristics is a tradeoff between the quality of the solution and the cost of finding it.

General Search Algorithms (5)
f(N) = g(N) + h(N) is the estimate of the total distance of a path going through node N, where g(N) is the distance of the partial path already traveled from S to node N, and h(N) is the heuristic estimate of the remaining distance from node N to the goal node G. A heuristic search method uses f to reorder the OPEN list in step 7, so the node with the shortest estimated distance is explored first. Heuristic functions that underestimate the remaining distance (cost) are typically used in search methods that aim to find the optimal solution, as in the sketch below.
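As a concrete example, the step-7 hook of the earlier graph-search sketch can be given an f = g + h ordering; the name `astar_order` and the `(open_list, g)` signature are assumptions carried over from that sketch.

```python
def astar_order(h):
    """Return a step-7 reorder function that sorts OPEN by f(N) = g(N) + h(N).

    h(n) is the heuristic estimate of the remaining distance from n to the
    goal; an underestimating h keeps the search admissible (A*-style).
    """
    return lambda open_list, g: open_list.sort(key=lambda n: g[n] + h(n))
```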

General Search Algorithms (6)
Best-First (A*) Search; Beam Search. Beam search has become one of the most popular methods for complicated speech recognition problems because of the simplicity of its search strategy and its modest requirement for domain-specific heuristic information. It is particularly attractive when different knowledge sources must be integrated in a time-synchronous fashion. It has the advantage of exploring nodes level by level in a consistent way and of requiring only minimal communication between different paths. Its breadth-first nature also makes it very suitable for parallel implementation. A rough skeleton follows.
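The skeleton below sketches time-synchronous beam search; the hypothesis representation and the `extend` function are illustrative assumptions, not a full decoder.

```python
def beam_search(init_hyps, extend, num_frames, beam_width):
    """Time-synchronous beam search sketch.

    init_hyps is a list of (state, log_score) pairs; extend(state, score, t)
    yields successor (state, score) pairs for frame t. All hypotheses advance
    frame by frame (level by level), and only the best beam_width survive.
    """
    hyps = init_hyps
    for t in range(num_frames):
        expanded = {}
        for state, score in hyps:
            for new_state, new_score in extend(state, score, t):
                # sharing: keep only the best-scoring path reaching each state
                if new_state not in expanded or new_score > expanded[new_state]:
                    expanded[new_state] = new_score
        # pruning: keep only the beam_width best hypotheses at this frame
        hyps = sorted(expanded.items(), key=lambda kv: kv[1],
                      reverse=True)[:beam_width]
    return max(hyps, key=lambda kv: kv[1])   # best final hypothesis
```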

12.2 Search Algorithms for Speech Recognition (1)
The basic problem of large-scale speaker-independent continuous speech recognition can be expressed as:

$$\hat{W} = \arg\max_{W} P(W|O) = \arg\max_{W} \frac{P(W)\,P(O|W)}{P(O)} = \arg\max_{W} P(W)\,P(O|W)$$

where O is the sequence of acoustic observations and W ranges over word sequences; P(O) can be dropped because it does not depend on W.

Search Algorithms for Speech Recognition (2)
Almost all search techniques fall into two categories: sharing and pruning. Sharing means intermediate results are kept so that other paths can reuse them without redundant re-computation, while pruning means unpromising subpaths can be discarded reliably without exploring them further. Search strategies based on dynamic programming or the Viterbi algorithm, aided by clever pruning, have been applied successfully to a wide range of speech recognition tasks, from small-vocabulary tasks such as digit recognition to unconstrained large-vocabulary continuous speech recognition.

Search Algorithms for Speech Recognition (3)
With Bayes' formulation, searching for the minimum-cost path (word sequence) is equivalent to finding the path with maximum probability. For the sake of consistency we use the inverse of the Bayes posterior probability as our objective. Taking logarithms turns the multiplications into additions, which makes the speech decoder closely resemble the general graph-search algorithms. The new criterion used to find the optimal word sequence W is:

$$C(W|O) = -\log\left[P(W)\,P(O|W)\right], \qquad \hat{W} = \arg\min_{W} C(W|O)$$
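For instance, the cost of a hypothesis can be computed additively in the log domain; this small helper is illustrative, not from the original text.

```python
import math

def path_cost(log_p_w, log_p_o_given_w):
    """C(W|O) = -[log P(W) + log P(O|W)]: multiplication becomes addition."""
    return -(log_p_w + log_p_o_given_w)

# e.g. P(W) = 1e-4 and P(O|W) = 1e-120 still give a finite additive cost,
# avoiding the numerical underflow of multiplying raw probabilities
cost = path_cost(math.log(1e-4), math.log(1e-120))
```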

12.3 Language Model States (1)
This section deals with the search space (language model states) induced by various grammars for continuous speech recognition.
Search space with a unigram: in the grammar network, the unigram probability is attached as the transition probability from the starting state S to the first state of each word HMM.
Search space with a bigram: when a bigram is used, the probability of a word depends only on the immediately preceding word, so the bigram expansion of the grammar network grows as |V|^2. A small sketch of both networks follows.
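The sketch below builds both grammar networks as plain transition lists; the `unigram` and `bigram` dictionaries are hypothetical data structures used only to show how the bigram expansion reaches |V|^2 arcs.

```python
def build_lm_arcs(vocab, unigram, bigram):
    """Illustrative grammar-network arcs for unigram and bigram LMs.

    With a unigram LM, each word w gets one arc from the start state S
    carrying P(w). With a bigram LM there is one arc for every ordered
    word pair (v, w) carrying P(w|v), so the network has |V|**2 arcs.
    """
    unigram_arcs = [("S", w, unigram[w]) for w in vocab]
    bigram_arcs = [(v, w, bigram[(v, w)]) for v in vocab for w in vocab]
    assert len(bigram_arcs) == len(vocab) ** 2
    return unigram_arcs, bigram_arcs
```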

Language Model States (2)
Another option is a rule-based language model. The grammar is defined first, then the sentence network is created by a compiler, and the search space is the entire network. At the acoustic level, every node of the network is an HMM.

12.4 Speaker Adaptation (1)
Adaptation means adjusting the model parameters according to new training data. It should cover a wide range of changes, for example the speaking environment, channel characteristics, speaker characteristics, task characteristics, and the application domain. Here we discuss only speaker adaptation; the methods are also applicable to the other cases. In general, a speaker-dependent (SD) system performs better than a speaker-independent (SI) system under otherwise comparable conditions: the error rate of an SD system is only 1/3 to 1/4 that of an SI system. Adaptation means using a limited amount of a new speaker's training data to modify the parameters of an existing model so that the new model is adapted to the new speaker.

Speaker Adaptation (2)
There are four ways to do speaker adaptation:
SI data & SI model* → adaptation clustering → SI model
SD data & SD model* → speaker transformation → SD model
SD data & SI model → speaker adaptation → SA model
SD data & SD model* → serial adaptation → SA model
(* marks an optional input; SA means speaker-adapted.)
Adaptation clustering divides the data into several types such that the acoustic characteristics of the speakers within one type are close to each other. A set of SI templates is created for every type. During recognition, a small amount of data is used to decide which type the speaker belongs to, and then the corresponding templates are used to do the recognition.

Speaker Adaptation (3)
The basic idea of speaker transformation is that the differences between two speakers arise mostly from differences in the short-time spectrum when they utter the same utterance; these differences stem from differences in their vocal organs. So it is possible to find a linear transformation of the short-time spectrum that maps one speaker to another.

Speaker Adaptation (4) Serial adaptation gradually adjusts the parameters to get the optimal state. 1. Speaker clustering The number of clusters K is the key factor. It should not be too large or too small (2-10 are suitable). There are two clustering approaches : (1) Supervised and based on HMMS similarity At first, K types of speakers needs to be created. It could be done by some merging procedure.

Speaker Adaptation (5)
Then the combined training data are used to create the VQ codebook and HMM parameters for every type. At test time, the speaker utters some prescribed sentences; the probability that each type generates these sentences is calculated and compared, and the type with the maximum probability is selected. The accuracy of such a system is about the same as that of the original SI system.

Speaker Adaptation (6)
(2) Unsupervised, based on GMMs:

$$P(O|\lambda_i) = \sum_{m=1}^{c} p_m\,P(O|\lambda_i^m), \qquad \sum_{m=1}^{c} p_m = 1, \qquad \lambda_i = \{\lambda_i^1, \lambda_i^2, \ldots, \lambda_i^c\}$$

where the mixture weights $p_m$ must be preset. How do we determine $\lambda_i$ from the N feature vectors $O_1, \ldots, O_N$? The idea is that $\lambda_i$ should maximize $L = \sum_{n} \log P(O_n|\lambda_i)$.
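A rough sketch of this unsupervised scheme, assuming scikit-learn's `GaussianMixture` as the GMM implementation (feature extraction and the grouping of training speakers into clusters are taken as given; note that, unlike the slide's formulation, `GaussianMixture` also re-estimates the mixture weights rather than keeping them preset):

```python
from sklearn.mixture import GaussianMixture

def train_cluster_gmms(cluster_features, c=8):
    """Fit one c-component GMM (lambda_i) per speaker cluster.

    cluster_features maps a cluster id to an (N, d) array of feature
    vectors pooled from that cluster's speakers.
    """
    return {i: GaussianMixture(n_components=c).fit(X)
            for i, X in cluster_features.items()}

def assign_cluster(gmms, O):
    """Pick the cluster maximizing L = sum_n log P(O_n | lambda_i).

    GaussianMixture.score returns the mean per-frame log-likelihood,
    so multiplying by len(O) recovers the total L.
    """
    return max(gmms, key=lambda i: gmms[i].score(O) * len(O))
```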

Speaker Adaptation (7)

Speaker Adaptation (8)
2. Spectrum transformation. Suitable for VQ-based systems. The idea is to use a small amount of data from the new speaker to estimate a mapping between the old and new speakers, as sketched below.
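A minimal sketch of estimating such a linear transform by least squares, assuming the two speakers' spectra have already been time-aligned frame by frame (e.g., by DTW); all names here are illustrative.

```python
import numpy as np

def estimate_spectral_map(X_old, X_new):
    """Least-squares linear map between two speakers' short-time spectra.

    X_old and X_new are (N, d) arrays of time-aligned spectral frames for
    the same utterances. We solve min_W ||X_old @ W - X_new||^2, so the
    old speaker's frames are mapped to the new speaker via X_old @ W.
    """
    W, *_ = np.linalg.lstsq(X_old, X_new, rcond=None)
    return W
```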

Speaker Adaptation (9)
3. Bayesian adaptation of CDHMMs. The parameters are adjusted according to Bayesian estimation (Bayesian learning): take the SI parameter as the prior for the mean (the mean is now treated as a random variable); once SD data are provided, the mean acquires new posterior parameters, and the resulting model can serve as an SD model.
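One common concrete form of this idea is MAP re-estimation of a Gaussian mean. The sketch below uses hard frame assignments and a scalar prior weight `tau`, both simplifying assumptions; a full CDHMM version would weight each frame by its state/mixture occupancy probability.

```python
import numpy as np

def map_adapt_mean(mu_si, frames, tau=10.0):
    """MAP (Bayesian) adaptation of one Gaussian mean.

    mu_si is the SI mean, acting as the prior; frames is an (N, d) array
    of adaptation data assigned to this Gaussian. With no data the SI
    mean is kept; as N grows the estimate moves toward the SD sample mean.
    """
    n = len(frames)
    if n == 0:
        return mu_si
    return (tau * mu_si + n * frames.mean(axis=0)) / (tau + n)
```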
