Comparing Distributions II: Bayes Rule and Acceptance Sampling
By Peter Woolf, University of Michigan, Michigan Chemical Process Dynamics and Controls
Open Textbook version 1.0, Creative Commons

From the last lecture we found that variations in product yield were significantly related to runny feed. One solution is to find a way to identify runny feed before it is fed into the process, and avoid it.

Runnyfeedometer™
You develop an offline tool to detect runny feed using a cone and plate viscometer. The test is inexpensive, but not always accurate due to inhomogeneous feed. You have a more accurate way of measuring runny feed, but it is slow and expensive, so maybe you can get away with multiple reads on the Runnyfeedometer™?

Experimental data: 100 known runny and 100 known normal samples tested in the Runnyfeedometer™:
P(+ test | runny) = 98:100 (true positive)
P(- test | runny) = 2:100 (false negative)
P(+ test | normal) = 3:100 (false positive)
P(- test | normal) = 97:100 (true negative)

What are the odds that 9 in 10 tests on a runny sample would come back positive?

P(+ test | runny) = 98:100
P(- test | runny) = 2:100

Question: What are the odds that 9 in 10 tests on a runny sample would come back positive?

Probability of a particular outcome (nine positives, one negative):
(0.98)(0.98)(0.98)(0.98)(0.98)(0.98)(0.98)(0.98)(0.98)(0.02) = (0.98)^9 (0.02)^1

Possible results (10 combinations, one for each position the negative test can occupy):
{+,+,+,+,+,+,+,+,+,-} {+,+,+,+,+,+,+,+,-,+} {+,+,+,+,+,+,+,-,+,+} {+,+,+,+,+,+,-,+,+,+} {+,+,+,+,+,-,+,+,+,+} {+,+,+,+,-,+,+,+,+,+} {+,+,+,-,+,+,+,+,+,+} {+,+,-,+,+,+,+,+,+,+} {+,-,+,+,+,+,+,+,+,+} {-,+,+,+,+,+,+,+,+,+}

Overall probability = (probability of a particular outcome) × (number of combinations) = 10 × (0.98)^9 (0.02)^1 ≈ 0.167

Note: the combinations become hard to list by hand if 2 or more tests can fail.
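The counting argument above can be checked numerically. A minimal sketch using only Python's standard library (the slides themselves use Mathematica; variable names here are illustrative):

```python
from math import comb

p_pos = 0.98  # P(+ test | runny)
n, k = 10, 9  # 10 tests, 9 positives

# Probability of one particular ordering: nine positives, one negative
single_outcome = p_pos**9 * (1 - p_pos)**1

# Number of orderings: choose which of the 10 tests is the negative one
n_orderings = comb(n, k)  # = 10

overall = n_orderings * single_outcome
print(n_orderings, round(overall, 3))  # prints: 10 0.167
```

`comb(n, k)` does the counting that becomes impractical by hand once 2 or more tests can fail.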

Binomial Distribution
Describes the probability of obtaining k events from N independent samples of a binary outcome with known probability p:

P(k) = C(N, k) p^k (1 - p)^(N - k)

In our case: P(+ test | runny) = 98:100 = p, and P(- test | runny) = 2:100 = (1 - p).

Examples:
Odds of getting 20 heads from 30 coin tosses
Odds of finding 3 broken bolts in a box of 100
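The two examples on the slide can be evaluated directly from the binomial formula. A sketch in Python; the per-bolt breakage probability of 0.02 is an assumption of mine, since the slide does not give one:

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials,
    each succeeding with probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Odds of getting 20 heads from 30 fair coin tosses
print(round(binomial_pmf(20, 30, 0.5), 4))   # ≈ 0.028

# Odds of finding exactly 3 broken bolts in a box of 100,
# assuming (hypothetically) each bolt is broken with probability 0.02
print(round(binomial_pmf(3, 100, 0.02), 4))  # ≈ 0.1823
```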

In Mathematica (commands shown on the slide):
probability of exactly 5 heads out of 10 tosses
probability of 0-5 heads out of 10 tosses

Probability test: What are the odds of getting exactly 5 heads out of 10 coin tosses?
(a) 25% (b) 50% (c) 62%
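The slide's Mathematica screenshots are not reproduced in this transcript; they would correspond to `PDF[BinomialDistribution[10, 0.5], 5]` and `CDF[BinomialDistribution[10, 0.5], 5]`. The same two quantities in Python:

```python
from math import comb

n, p = 10, 0.5

# Probability of exactly 5 heads out of 10 tosses (the PMF at 5)
exact_5 = comb(n, 5) * p**5 * (1 - p)**5
print(round(exact_5, 3))   # ≈ 0.246, about 25%

# Probability of 0-5 heads out of 10 tosses (the CDF at 5)
up_to_5 = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(6))
print(round(up_to_5, 3))   # ≈ 0.623, about 62%
```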

Answer: (a) 25%.
The probability of exactly 5 heads out of 10 tosses (= 5) is about 25%; the probability of 0-5 heads out of 10 tosses (≤ 5) is the cumulative value, about 62%. Picking ≤5 instead of =5 gives answer (c), which is not what was asked. (Note from the slide: the plot axes are off by 1.)

Runnyfeedometer™
P(+ test | runny) = 98:100
P(- test | runny) = 2:100
P(+ test | normal) = 3:100
P(- test | normal) = 97:100

Given these data, what acceptance sampling criterion would be required to correctly identify a normal sample with 99.99% confidence?

Example acceptance sampling criterion: accept the sample if, out of 10 tests, 3 or fewer are positive.
Translation: we want P(normal | 3 or fewer positive results from 10 tests).
Using our binomial distribution we can calculate a related quantity. (Intuition: 0 in 10 positive means very likely normal; 10 in 10 means very likely runny.)

Using our binomial distribution we can calculate a related quantity:

P(3 or fewer positive results from 10 tests | normal) = sum over i = 0 to 3 of C(10, i) p^i (1 - p)^(10 - i)

where i = number of positive results and p = probability of a positive result given a normal feed = 0.03.

If the feed is normal, we will get ≤3 positive tests with probability ≈ 0.9998.

But this is not the same as what we want: P(normal | 3 or fewer positive results from 10 tests).
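The sum above is a short computation; a sketch with the slide's numbers:

```python
from math import comb

p = 0.03  # P(+ test | normal feed)
n = 10

# P(3 or fewer positive results from 10 tests | normal)
p_le3_normal = sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(4))
print(round(p_le3_normal, 4))  # ≈ 0.9999
```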

Three Probability Definitions
1. Joint probability
2. Conditional probability
3. Marginalization

1. Joint Probability
What is the probability of drawing an ace first and then a jack from a deck of 52 cards?
What is the probability of a protein being highly expressed and phosphorylated?
  P(expressed, phosphorylated) = (# highly expressed and phosphorylated proteins) / (total proteins)
What is the probability that valves A and B both fail?
  P(A fails, B fails) = (# times A & B fail) / (total observations)
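The card question can be answered by the chain rule and verified by brute-force enumeration; a sketch (the rank-only deck encoding is my own simplification):

```python
from itertools import permutations

# Chain rule: P(ace first AND jack second) = P(ace first) * P(jack second | ace first)
p_ace_then_jack = (4 / 52) * (4 / 51)

# Brute-force check over all ordered two-card draws without replacement,
# with the deck encoded by rank only ('A' = ace, 'J' = jack, 'x' = other)
deck = ['A'] * 4 + ['J'] * 4 + ['x'] * 44
pairs = list(permutations(range(52), 2))
hits = sum(1 for i, j in pairs if deck[i] == 'A' and deck[j] == 'J')
print(round(p_ace_then_jack, 5), hits, len(pairs))  # prints: 0.00603 16 2652
```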

2. Conditional Probability
What is the probability of drawing an ace given that you just drew a jack from a deck of 52 cards?
What is the probability of a protein being highly expressed given that it is phosphorylated?
  P(expressed | phosphorylated) = (# highly expressed phosphorylated proteins) / (total phosphorylated proteins)
What is the probability that valve A fails given that B has failed?
  P(A fails | B fails) = (# times A & B fail) / (total observations where B fails)
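The valve ratio can be computed from observation counts; a sketch with hypothetical numbers of my own (the slide gives the formula but no data):

```python
# Hypothetical log: of 1000 observations, valve B failed 50 times,
# and in 20 of those observations valve A also failed.
n_obs = 1000
n_B_fails = 50
n_A_and_B_fail = 20

# P(A fails | B fails) = (# times A & B fail) / (# observations where B fails)
p_A_given_B = n_A_and_B_fail / n_B_fails
print(p_A_given_B)  # prints: 0.4

# Equivalent definition via the joint: P(A|B) = P(A,B) / P(B)
p_joint = n_A_and_B_fail / n_obs
p_B = n_B_fails / n_obs
```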

3. Marginalization
What is the probability of drawing an ace given that you just drew one other (unseen) card from a deck of 52 cards?
Marginalization sums a joint probability over all values of the variable you do not care about: P(A) = sum over B of P(A, B) = sum over B of P(A | B) P(B).
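The card question is a nice sanity check of marginalization: averaging over whether the unseen first card was an ace gives the same answer as an unconditioned draw. A sketch:

```python
# P(2nd card is ace)
#   = P(2nd ace | 1st ace) P(1st ace) + P(2nd ace | 1st not ace) P(1st not ace)
p_second_ace = (3 / 51) * (4 / 52) + (4 / 51) * (48 / 52)

# Marginalizing over the unseen card recovers the plain 4/52
print(abs(p_second_ace - 4 / 52) < 1e-12)  # prints: True
```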

Probability Algebra
Joint and conditional probabilities are related by P(A, B) = P(A | B) P(B). In general P(A, B) ≠ P(A) P(B), but if A and B are independent then P(A, B) = P(A) P(B).

Bayes' Rule
Rearranging the joint probability gives Bayes' rule: P(A | B) = P(B | A) P(A) / P(B).
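A quick numerical check of the algebra, on hypothetical numbers of my own, confirms that both factorizations of the joint agree:

```python
# Hypothetical values for P(B|A), P(A), and P(B)
p_B_given_A, p_A, p_B = 0.9, 0.2, 0.3

# Bayes' rule: P(A|B) = P(B|A) P(A) / P(B)
p_A_given_B = p_B_given_A * p_A / p_B

# Both factorizations must give the same joint P(A, B)
assert abs(p_A_given_B * p_B - p_B_given_A * p_A) < 1e-12
print(round(p_A_given_B, 10))  # prints: 0.6
```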

We want P(normal | 3 or fewer positive results from 10 tests). By Bayes' rule:

P(normal | ≤3 positive of 10) = P(≤3 positive of 10 | normal) × P(normal) / P(≤3 positive of 10)

The likelihood P(≤3 positive of 10 | normal) comes from the binomial distribution, P(normal) is the prior, and the denominator is found by marginalization.

P(3 or fewer positive results from 10 tests | normal) ≈ 0.9998, from the binomial calculation above.

P(normal): from prior observations, what are the odds of getting a batch of normal feed? Previous data found normal feed in 19 of 25 samples, so a first approximation is P(normal) = 0.76.

P(3 or fewer positive results from 10 tests): found by marginalizing over runny and normal:

P(≤3 of 10 positive) = P(≤3 of 10 positive | runny) P(runny) + P(≤3 of 10 positive | normal) P(normal)

Since P(+ test | runny) = 98:100, essentially 0% of the time will a runny sample yield ≤3 positives, so P(≤3 of 10 positive | runny) ≈ 0. And P(runny) = 1 - P(normal) = 0.24.

Putting in the numbers:

P(≤3 of 10 positive) = (0)(0.24) + (0.9998)(0.76) = 0.75985

P(normal | ≤3 of 10 positive) = (0.9998)(0.76) / 0.75985 ≈ 1.0000

This acceptance sampling criterion will identify runny feeds essentially 100% of the time, but it may be too strict.
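The whole Bayes' rule calculation can be reproduced end to end; a sketch with the slide's numbers:

```python
from math import comb

def binom_cdf(k_max, n, p):
    """P(at most k_max successes in n trials, success probability p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_max + 1))

p_normal = 0.76                        # prior: 19 of 25 past batches were normal
like_normal = binom_cdf(3, 10, 0.03)   # P(<=3 positives of 10 | normal), ~0.9998
like_runny = binom_cdf(3, 10, 0.98)    # P(<=3 positives of 10 | runny), ~1.5e-10

# Marginal and posterior via Bayes' rule
marginal = like_normal * p_normal + like_runny * (1 - p_normal)
posterior = like_normal * p_normal / marginal   # P(normal | <=3 positives of 10)
print(posterior)  # essentially 1.0
```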

Test different acceptance sampling criteria. Remember:
0 in 10 positive: very likely normal
10 in 10 positive: very likely runny
0 to 10 positive: no information

Loosening the criterion from "0 to 3 positive" shows that a "0 to 6 positive" criterion still identifies normal feeds with >99.99% confidence.
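The criterion-testing step can be automated by sweeping the acceptance threshold k and computing the posterior for each; a sketch:

```python
from math import comb

def binom_cdf(k_max, n, p):
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_max + 1))

p_normal = 0.76  # prior probability of normal feed

def posterior_normal(k):
    """P(normal | k or fewer of 10 tests positive)."""
    like_n = binom_cdf(k, 10, 0.03)
    like_r = binom_cdf(k, 10, 0.98)
    return like_n * p_normal / (like_n * p_normal + like_r * (1 - p_normal))

# Loosest criterion that still gives >99.99% confidence in normal feed
best = max(k for k in range(11) if posterior_normal(k) > 0.9999)
print(best)  # prints: 6 -> accept feed when 6 or fewer of 10 tests are positive
```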

Runnyfeedometer™
Analysis result: if ≤6 of 10 samples report positive, then I am >99.99% sure the feed is normal.
Acceptance criterion: if ≤6 of 10 tests are positive, use the feed; otherwise reject the feed.

Q: What are the odds of rejecting normal feed? By Bayes' rule,

P(normal | ≥7 positive of 10) = P(≥7 positive of 10 | normal) × P(normal) / P(≥7 positive of 10)

and the likelihood P(≥7 positive of 10 | normal) is tiny, so normal feed is rejected very rarely.
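The rejection rate for good feed, P(7 or more of 10 positive | normal), is a one-line binomial tail sum; a sketch:

```python
from math import comb

# P(reject a normal feed) = P(7 or more of 10 tests positive | normal)
p = 0.03  # P(+ test | normal)
p_reject_normal = sum(comb(10, k) * p**k * (1 - p)**(10 - k) for k in range(7, 11))
print(p_reject_normal)  # on the order of 1e-9: very rarely
```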

Take Home Messages
Acceptance sampling provides an easy-to-implement way to eliminate a source of variation.
Basic probability rules like Bayes' rule help you rearrange your expressions into quantities you can actually calculate.