Confidence Measure Using Word Posteriors
Sridhar Raghavan, Dept. of Electrical and Computer Engineering, Mississippi State University

Presentation transcript:

This series of slides provides a detailed description of the algorithm used to determine the confidence measure of the words in a hypothesis using word graphs. The core computation is the forward-backward algorithm, which is used to determine the link posterior probabilities. Let us consider a sample word graph as described below. [Word graph figure: nodes connected by links carrying the words "Sil", "This", "this", "is", "a", "the", "test", "guest", "quest", "sentence", and "sense", with likelihoods such as 1/6, 2/6, 3/6, 4/6, and 5/6 on the links. The values on the links are the likelihoods.]

Using the forward-backward algorithm to determine the link probability. The equations used to compute the alphas and betas are as follows. Computing alphas. Step 1: Initialization. In a conventional HMM forward-backward algorithm we would perform the standard initialization; we need a slightly modified version of that equation for processing a word graph. The emission probability becomes the language model probability, and the initial probability in this case has been taken as 0.01 (assuming we have 100 words in a loop grammar, so all the words are equally probable with probability 1/100).
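As a point of reference, the conventional initialization and the word-graph variant described above can be written as follows. This is a sketch: the first equation is the textbook HMM form, while the word-graph form is reconstructed from the description, with p_LM denoting the language model probability (this write-up's notation, not the slides').

```latex
% Conventional HMM forward initialization (textbook form):
\alpha_1(i) = \pi_i \, b_i(o_1), \qquad 1 \le i \le N
% Word-graph variant (reconstructed from the text): the emission
% probability b is replaced by the language model probability, and
% the initial probability is \pi = 1/100 = 0.01:
\alpha(1) = \pi \cdot p_{LM}(w_1) = 0.01 \times 0.01 = 10^{-4}
```

With a uniform loop grammar, p_LM(w) = 0.01 as well, which is consistent with the value α = 1E-04 shown for the first node on the annotated graph later in these slides.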

The α for the first node in the word graph is computed as follows. Step 2: Induction. This step is the main reason we use the forward-backward algorithm for computing such probabilities: the alpha values computed in the previous step are used to compute the alphas for the succeeding nodes. Note: unlike in HMMs, where we move from left to right at fixed intervals of time, here we move from the start time of one word to the start time of the next closest word.
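The induction step can be sketched in the same spirit. The conventional recursion is standard; the word-graph form below is a reconstruction in which the sum runs over all links l = (n', n) entering node n, a(l) is the likelihood on the link, and w(l) is the word it carries (these link-level symbols are this write-up's notation):

```latex
% Conventional HMM forward induction:
\alpha_t(j) = \Big[ \sum_{i=1}^{N} \alpha_{t-1}(i)\, a_{ij} \Big] b_j(o_t)
% Word-graph variant (reconstruction): sum over links entering node n:
\alpha(n) = \sum_{l = (n', n)} \alpha(n')\, a(l)\, p_{LM}(w(l))
```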

Let us see the computation of the alphas from node 2 onward; the alpha for node 1 was computed in the previous step, during initialization. The same computation is carried out for node 2, node 3, and node 4, and the alpha calculation continues in this manner for all the remaining nodes.
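As a worked illustration of the pattern (a reconstruction; the 3/6 link likelihood is read off the sample graph): if node 2 is reached from node 1 over a link with likelihood 3/6, then

```latex
\alpha(2) = \alpha(1) \cdot \tfrac{3}{6} \cdot p_{LM}
          = 10^{-4} \times 0.5 \times 0.01
          = 5 \times 10^{-7}
```

which agrees with the value α = 5E-07 shown on the annotated word graph below.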

Once we have computed the alphas using the forward algorithm, we begin the beta computation using the backward algorithm. The backward algorithm is similar to the forward algorithm, but we start from the last node and proceed from right to left. Step 1: Initialization. Step 2: Induction.
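In equations, the backward pass mirrors the forward pass. The initialization gives the final node β = 1 (the annotated graph below shows this value), while the induction form is a reconstruction using the same link notation as before:

```latex
% Initialization: the final node of the word graph:
\beta(N_{\mathrm{final}}) = 1
% Induction (reconstruction): sum over links l = (n, n') leaving node n:
\beta(n) = \sum_{l = (n, n')} \beta(n')\, a(l)\, p_{LM}(w(l))
```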

Let us see the computation of the beta values from node 14 backward.

Node 11: in a similar manner we obtain the beta values for all the nodes down to node 1. We can then compute the probability on each link (between two nodes) as follows. Let us call this link probability Γ; then Γ(t-1, t) is computed as the product α(t-1)*β(t). These values give the un-normalized posterior probability of the word on the link, considering all possible paths through the link.
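In symbols, the link posterior as defined above, together with one common normalization from the word-posterior literature (the slides stop short of normalizing, so the second line is an assumption about the intended next step):

```latex
% Un-normalized link posterior, as defined in the slides:
\Gamma(t-1, t) = \alpha(t-1)\,\beta(t)
% One common normalization (assumption): divide by the sum of
% \Gamma over all competing links spanning the same time region:
p(w \mid \text{graph}) = \frac{\Gamma(t-1, t)}{\sum_{\text{competing links}} \Gamma}
```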

This is the word graph with each node annotated with its corresponding alpha and beta value. [Word graph figure: the same graph as before, with α and β values attached to the nodes, among them α = 1E-04, β = 2.8843E-16; α = 5E-07, β = 2.87E-16; α = 3.35E-9, β = 8.534E-12; α = 1.117E-11, β = 2.514E-9; α = 7.446E-14, β = 3.703E-7; α = 1.861E-14, β = 2.776E-8; α = 4.964E-16, β = 5.555E-5; α = 3.438E-18, β = 8.33E-3; and α = 2.886E-20, β = 1 at the final node.] The assumption here is that the probability of occurrence of any word is 0.01, i.e. we have 100 words in a loop grammar.

The following word graph shows the links with their corresponding link posterior probabilities (not yet normalized). [Word graph figure: the same graph with Γ values on the links, among them Γ = 1.292E-19, Γ = 4.649E-19, Γ = 7.749E-20, Γ = 5.74E-18, Γ = 6.46E-19, Γ = 1.549E-19, Γ = 3.1E-19, Γ = 3.438E-18, Γ = 2.87E-20, Γ = 4.288E-18, Γ = 4.136E-18, and Γ = 8.421E-18.]
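To make the whole procedure concrete, here is a minimal Python sketch of the forward-backward computation over a word graph. The graph, its words, and its likelihoods are hypothetical (a tiny 5-node example, not the 15-node graph from the slides); the language model and initial probabilities are fixed at 0.01 as in the slides, and the link posterior follows the product form Γ = α(start)·β(end) used above.

```python
# Minimal forward-backward sketch over a word graph (hypothetical example).
from collections import defaultdict

P_LM = 0.01    # language model probability per word (uniform loop grammar, 1/100)
P_INIT = 0.01  # initial probability at the start node

# Hypothetical 5-node graph: each link is (start, end, word, likelihood).
links = [
    (1, 2, "sil",  3/6),
    (2, 3, "this", 2/6),
    (2, 4, "the",  1/6),
    (3, 5, "is",   4/6),
    (4, 5, "his",  1/6),
]
first_node, last_node = 1, 5

# Forward pass: alpha(n) accumulates over all links entering n.
# Links are assumed to be listed in topological order.
alpha = defaultdict(float)
alpha[first_node] = P_INIT * P_LM
for start, end, word, a in links:
    alpha[end] += alpha[start] * a * P_LM

# Backward pass: beta(n) accumulates over all links leaving n.
beta = defaultdict(float)
beta[last_node] = 1.0
for start, end, word, a in reversed(links):
    beta[start] += beta[end] * a * P_LM

# Un-normalized link posteriors, following the product form in the slides:
# gamma(link) = alpha(start) * beta(end).
for start, end, word, a in links:
    gamma = alpha[start] * beta[end]
    print(f"{word:>5}: gamma = {gamma:.3e}")
```

Processing the links in topological order is what lets a single pass over the list suffice for each direction; word graphs emitted by a recognizer are usually already sorted by word start time, which is the traversal order the slides describe.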