Matlab: Statistics Probability distributions Hypothesis tests

Matlab: Statistics
- Probability distributions
- Hypothesis tests
- Response surface modeling
- Design of experiments

Statistics Toolbox Capabilities
- Descriptive statistics
- Statistical visualization
- Probability distributions
- Hypothesis tests
- Linear models
- Nonlinear models
- Multivariate statistics
- Statistical process control
- Design of experiments
- Hidden Markov models

Probability Distributions
- 21 continuous distributions for data analysis, including the normal distribution
- 6 continuous distributions for statistics, including the chi-square and t distributions
- 8 discrete distributions, including the binomial and Poisson distributions
Each distribution has functions for (illustrated in the sketch below):
- pdf — probability density function
- cdf — cumulative distribution function
- inv — inverse cumulative distribution function
- stat — distribution statistics function
- fit — distribution fitting function
- like — negative log-likelihood function
- rnd — random number generator
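
A hedged illustration of this naming convention, using the binomial distribution (all parameter values here are made up for the example):

>> p10 = binopdf(10, 20, 0.4);      % P(X = 10) for a Binomial(20, 0.4) variable
>> F10 = binocdf(10, 20, 0.4);      % P(X <= 10)
>> q = binoinv(0.95, 20, 0.4);      % 95th percentile
>> [m, v] = binostat(20, 0.4);      % mean and variance
>> phat = binofit(7, 20);           % estimate p from 7 successes in 20 trials
>> r = binornd(20, 0.4, 100, 1);    % 100 random draws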

Normal Distribution Functions
- normpdf – probability density function
- normcdf – cumulative distribution function
- norminv – inverse cumulative distribution function
- normstat – mean and variance
- normfit – parameter estimates and confidence intervals for normally distributed data
- normlike – negative log-likelihood for maximum likelihood estimation
- normrnd – random numbers from the normal distribution
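
A minimal sketch tying these functions together on simulated data (the numbers are hypothetical):

>> data = normrnd(10, 2, 50, 1);                           % 50 simulated measurements from N(10, 2^2)
>> [muHat, sigmaHat, muCI, sigmaCI] = normfit(data, 0.05); % estimates with 95% confidence intervals
>> p = normcdf(12, muHat, sigmaHat);                       % P(X <= 12) under the fitted distribution
>> nll = normlike([muHat sigmaHat], data);                 % negative log-likelihood of the fit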

Hypothesis Tests
17 hypothesis tests available, including:
- chi2gof – chi-square goodness-of-fit test. Tests if a sample comes from a specified distribution, against the alternative that it does not come from that distribution.
- ttest – one-sample or paired-sample t-test. Tests if a sample comes from a normal distribution with unknown variance and a specified mean, against the alternative that it does not have that mean.
- vartest – one-sample chi-square variance test. Tests if a sample comes from a normal distribution with specified variance, against the alternative that it comes from a normal distribution with a different variance.

Mean Hypothesis Test Example

>> h = ttest(data,m,alpha,tail)

data: vector or matrix of data
m: hypothesized mean
alpha: significance level
tail: 'left' (left-tailed alternative), 'right' (right-tailed alternative) or 'both' (two-sided alternative)
h = 1 (reject the null hypothesis) or 0 (fail to reject the null hypothesis)

Example: measurements of polymer molecular weight
Null hypothesis H0: µ = 1.3, tested against the alternative H1: µ < 1.3

>> h = ttest(x,1.3,0.1,'left')
h = 1
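
The data vector x in the call above is not shown on the slide; a self-contained sketch with simulated data (all values hypothetical):

>> x = normrnd(1.25, 0.07, 20, 1);          % 20 simulated molecular-weight measurements
>> [h, p, ci] = ttest(x, 1.3, 0.1, 'left')  % H0: µ = 1.3 vs. H1: µ < 1.3 at alpha = 0.1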

Variance Hypothesis Test Example

>> h = vartest(data,v,alpha,tail)

data: vector or matrix of data
v: hypothesized variance
alpha: significance level
tail: 'left' (left-tailed alternative), 'right' (right-tailed alternative) or 'both' (two-sided alternative)
h = 1 (reject the null hypothesis) or 0 (fail to reject the null hypothesis)

Null hypothesis H0: σ² = 0.0049, tested against the two-sided alternative of a different variance

>> h = vartest(x,0.0049,0.1,'both')
h = 0
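
Using the same (hypothetical) data vector x, the p-value and a confidence interval for the variance can also be retrieved:

>> [h, p, ci] = vartest(x, 0.0049, 0.1, 'both')  % ci is a 90% confidence interval for σ²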

Goodness of Fit

Perform a hypothesis test to determine whether data come from a normal distribution

Usage: [h,p,stats] = chi2gof(x,'edges',edges)

x: data vector
edges: the data are divided into intervals with the specified edges
h = 1: reject the hypothesis at 5% significance
h = 0: fail to reject the hypothesis at 5% significance
p: p-value, the probability of observing a test statistic at least as extreme under the null hypothesis
stats: includes the chi-square statistic and degrees of freedom

Goodness of Fit Example

Find maximum likelihood estimates for µ and σ of a normal distribution:

>> data = [320 … 360];
>> phat = mle(data)
phat = 364.7 26.7

Test if the data come from a normal distribution:

>> [h,p,stats] = chi2gof(data,'edges',[-inf,325:10:405,inf])
h = 0
p = 0.8990
stats.chi2stat = 2.8440
stats.df = 7

Response Surface Modeling

Develop linear and quadratic regression models from data; commonly termed response surface modeling

Usage: rstool(x,y,model)

x: vector or matrix of input values
y: vector or matrix of output values
model: 'linear' (constant and linear terms), 'interaction' (linear model plus interaction terms), 'quadratic' (interaction model plus quadratic terms), 'purequadratic' (quadratic model minus interaction terms)

Creates a graphical user interface for model analysis

Response Surface Model Example

VLE data – liquid composition held constant

>> x = [300 1; 275 1; 250 1; 300 0.75; 275 0.75; 250 0.75; 300 1.25; 275 1.25; 250 1.25];
>> y = [0.75; 0.77; 0.73; 0.81; 0.80; 0.76; 0.72; 0.74; 0.71];

Experiment   Temperature   Pressure   Vapor composition
1            300           1.00       0.75
2            275           1.00       0.77
3            250           1.00       0.73
4            300           0.75       0.81
5            275           0.75       0.80
6            250           0.75       0.76
7            300           1.25       0.72
8            275           1.25       0.74
9            250           1.25       0.71

Response Surface Model Example cont.

>> rstool(x,y,'linear')
beta =
   0.7411  (bias)
   0.0005  (T)
  -0.1333  (P)

>> rstool(x,y,'interaction')
beta2 =
   0.3011  (bias)
   0.0021  (T)
   0.3067  (P)
  -0.0016  (T*P)

>> rstool(x,y,'quadratic')
beta3 =
  -2.4044  (bias)
   0.0227  (T)
   0.0933  (P)
  -0.0016  (T*P)
  -0.0000  (T*T)
   0.1067  (P*P)
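
The coefficients exported from the rstool GUI can also be reproduced at the command line; a hedged sketch using x2fx and regress (the x2fx column order is assumed to match the export above):

>> X = x2fx(x, 'interaction');  % design matrix with constant, T, P, and T*P columns
>> b = regress(y, X)            % least-squares coefficients; should match beta2 above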

Design of Experiments
- Full factorial designs
- Fractional factorial designs
- Response surface designs
  - Central composite designs
  - Box-Behnken designs
- D-optimal designs – minimize the volume of the confidence ellipsoid of the regression estimates of the linear model parameters (see the sketch below)
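
D-optimal designs are generated with cordexch (or rowexch); a minimal sketch, assuming 2 factors, a 6-run budget and a quadratic model:

>> dDO = cordexch(2, 6, 'quadratic')  % coordinate-exchange search for a D-optimal design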

Full Factorial Designs

>> d = fullfact([L1,…,Lk])

[L1,…,Lk]: vector giving the number of levels for each of the k factors (a mixed-level sketch follows below)
d: design matrix

>> d = ff2n(k)

k: number of factors
d: design matrix for two levels

>> d = ff2n(3)
d =
   0 0 0
   0 0 1
   0 1 0
   0 1 1
   1 0 0
   1 0 1
   1 1 0
   1 1 1
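
Returning to fullfact, a hedged sketch of a mixed-level design (factor A with 2 levels, factor B with 3):

>> d = fullfact([2 3]);  % 6-by-2 design matrix listing every combination of levels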

Fractional Factorial Designs

>> [d,conf] = fracfact(gen)

gen: generator string for the design
d: design matrix
conf: cell array that describes the confounding pattern

>> [x,conf] = fracfact('a b c abc')
x =
  -1 -1 -1 -1
  -1 -1  1  1
  -1  1 -1  1
  -1  1  1 -1
   1 -1 -1  1
   1 -1  1 -1
   1  1 -1 -1
   1  1  1  1

Fractional Factorial Designs cont.

>> gen = fracfactgen(model,K,res)

model: string containing the terms that must be estimable in the design
K: the design will have 2^K total experiments (runs)
res: resolution of the design
gen: generator strings for use in fracfact

>> gen = fracfactgen('a b c d e f g',4,4)
gen =
    'a'    'b'    'c'    'd'    'bcd'    'acd'    'abd'
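
The generators can be passed straight back to fracfact to obtain the design itself; a hedged sketch continuing the example above:

>> [d, confounding] = fracfact(gen);  % 16-run, resolution IV design for 7 factors
>> size(d)                            % expected to be 16-by-7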