Distribution functions


Distribution functions
There is some core terminology that we need to know in order to use distribution functions.
- Probability function (pf) / probability density function (pdf). The terms are used interchangeably, although the traditional convention is to use pf for discrete distributions and pdf for continuous ones.
- In the discrete case, the pf gives the probability that a random experiment produces the outcome x.
- The pdf of a continuous random variable does not have the same interpretation: Pr(X = x) = 0 for every single point x, and probabilities come from integrating the pdf over an interval.
Statistical Data Analysis - Lecture08 21/03/03

Cumulative distribution functions
- The cumulative distribution function (cdf) gives us Pr(X ≤ x), "the probability that our random variable X is less than or equal to x".
- For a discrete random variable this is the sum of the probabilities of all outcomes less than or equal to x.
- For a continuous random variable we replace the sum by an integral.
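For the discrete case, the "sum of the probabilities" idea can be sketched directly. A minimal Python example, using a Binomial(n, p) variable (the helper names are ours, not from the lecture):

```python
import math

def binom_pf(k, n, p):
    """pf: Pr(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_cdf(x, n, p):
    """cdf: Pr(X <= x) is the running sum of the pf."""
    return sum(binom_pf(k, n, p) for k in range(0, x + 1))

# Fair coin, 10 tosses: Pr(at most 4 heads)
print(binom_cdf(4, 10, 0.5))
```

The same sum-to-integral switch is exactly what the continuous distributions on the later slides do behind the scenes.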

Inverse cumulative distribution functions
- Given a probability p, the inverse cdf gives us the value x such that Pr(X ≤ x) = p.
- These three functions are very important: they help us do many different things in statistics.
- Some calculations can be done by hand or with tables, but in general we need to use the computer to answer our questions.
- To use tables or the computer, many distributions require extra parameters.

Distribution functions we should know about
- Discrete: Binomial, Hypergeometric, Poisson, Uniform
- Continuous: Chi-square, Exponential, F, Normal, Student t, Uniform

Working with continuous pdfs
- To use the Normal distribution we don't need anything extra, as long as the statistic has been standardised.
- To use Student's t or the Chi-square distribution, we need the degrees of freedom.
- To use the F distribution, we need both numerator and denominator degrees of freedom.

Some calculations in R
Let X ~ N(5, 3). Find:
- Pr(X > 8)
- Pr(X < -8)
- Pr(2 < X < 7)
- If Pr(X > x) = 0.14159, what is x?
- What value does the Normal pdf take at the point x = 5?
Let T ~ t with 4 df:
- Pr(T > 0)
- Pr(|T| > 2)
- If Pr(T > t) = 0.025, what is t?
Let X² ~ Chi-square with 5 df:
- If X² = 1.944, what is the P-value?
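These questions map directly onto R's built-in distribution functions (the d-prefix gives the pdf, p- the cdf, q- the inverse cdf). A sketch, assuming the "3" in N(5, 3) is the standard deviation, which is the parameterisation pnorm/qnorm/dnorm use:

```r
# X ~ N(mean = 5, sd = 3)
pnorm(8, 5, 3, lower.tail = FALSE)        # Pr(X > 8), roughly 0.159
pnorm(-8, 5, 3)                           # Pr(X < -8)
pnorm(7, 5, 3) - pnorm(2, 5, 3)           # Pr(2 < X < 7)
qnorm(0.14159, 5, 3, lower.tail = FALSE)  # x such that Pr(X > x) = 0.14159
dnorm(5, 5, 3)                            # pdf at x = 5

# T ~ t with 4 df (the extra parameter is the df)
pt(0, 4, lower.tail = FALSE)              # Pr(T > 0) = 0.5 by symmetry
2 * pt(2, 4, lower.tail = FALSE)          # Pr(|T| > 2), two tails
qt(0.025, 4, lower.tail = FALSE)          # t such that Pr(T > t) = 0.025

# X2 ~ chi-square with 5 df
pchisq(1.944, 5, lower.tail = FALSE)      # upper-tail P-value for X2 = 1.944
```

Using `lower.tail = FALSE` is preferable to `1 - pnorm(...)` because it avoids losing precision for probabilities far out in the tail.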

Some key facts about the distribution of test statistics
- If we know the population mean, μ, and std. deviation, σ, then any statistic of the form
    z = (x̄ − μ) / (σ/√n)
  will have a N(0, 1) distribution.
- If we don't know σ and substitute the sample standard deviation s, then the resulting test statistic
    t = (x̄ − μ) / (s/√n)
  is distributed Student t.

Some key facts about the distribution of test statistics
- If we have a std. normal random variable (rv) and we square it, the resulting rv is chi-square with 1 df.
- If we sum n squared independent std. normal rv's, the result is a chi-square rv with n degrees of freedom.
- If we have two independent chi-square rv's, X1 and X2, with df equal to n1 and n2 respectively, the ratio of each divided by its df,
    F = (X1/n1) / (X2/n2),
  has an F distribution with n1 numerator degrees of freedom and n2 denominator degrees of freedom.
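The first fact is easy to check numerically. A quick Monte Carlo sketch in Python (the simulation size and seed are arbitrary choices of ours): a chi-square rv with k df has mean k, so squared standard normal draws should average close to 1.

```python
import random

random.seed(1)

# Squaring standard normal draws gives chi-square(1) draws;
# their sample mean should be close to 1, the df.
n = 200_000
zsq = [random.gauss(0, 1) ** 2 for _ in range(n)]
print(sum(zsq) / n)
```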

Degrees of freedom
- Some of what is on the previous slides looks contradictory to what we've seen already, or perhaps know.
- For this course, let the df be dictated by the standard formulae (which I will tell you) for each problem.
- If you want to know why the formulae are what they are, take 655.322!

Some common hypothesis tests: one-sample t
1. Test H0: μ = μ0, where μ0 is some hypothesised value.
2. Alternatives:
   One-sided: H1: μ > μ0 or H1: μ < μ0
   Two-sided: H1: μ ≠ μ0
3. Test statistic: t = (x̄ − μ0) / (s/√n)
4. P-value from the Student t distribution with df = n − 1 (e.g. 2 Pr(T ≥ |t|) for the two-sided test).
5. P-value < 0.05 => significance at the 5% level.
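Steps 1–3 can be sketched in a few lines of Python (the data here are made up for illustration):

```python
import math
import statistics

def one_sample_t(xs, mu0):
    """One-sample t statistic and its df:
    t = (sample mean - mu0) / (s / sqrt(n)), df = n - 1."""
    n = len(xs)
    xbar = statistics.mean(xs)
    s = statistics.stdev(xs)  # sample sd, divisor n - 1
    return (xbar - mu0) / (s / math.sqrt(n)), n - 1

# Hypothetical data: test H0: mu = 10
t, df = one_sample_t([12.1, 9.8, 11.4, 10.9, 12.6], 10)
print(t, df)
```

The resulting t is then compared against the Student t distribution with the returned df, exactly as in step 4.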

Paired t-test
- Really just a one-sample t test in disguise.
- Have two measurements for each of n subjects and examine differences, e.g. before-diet and after-diet weight.
- Null hypothesis: H0: μdiff = 0 (hypothesis of no difference).
- Alternative hypothesis is usually two-tailed: H1: μdiff ≠ 0.
- Test statistic: t = d̄ / (sd/√n) on n − 1 df, where d̄ and sd are the mean and standard deviation of the differences.
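The "in disguise" point is clearest in code: form the differences, then run an ordinary one-sample t on them. A Python sketch with made-up before/after weights:

```python
import math
import statistics

# Hypothetical before/after-diet weights for 5 subjects
before = [82.0, 95.5, 77.3, 88.1, 91.0]
after = [80.1, 92.8, 77.9, 85.6, 89.4]

# Paired t-test = one-sample t-test on the differences
d = [b - a for b, a in zip(before, after)]
n = len(d)
t = statistics.mean(d) / (statistics.stdev(d) / math.sqrt(n))
print(t, n - 1)
```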

Two sample t-test
1. Two independent samples of sizes nx, ny.
2. Null hypothesis: H0: μx − μy = Δ0 (usually Δ0 = 0, the hypothesis of no difference).
3. Alternative hypothesis: H1: μx − μy ≠ Δ0 (usually two-tailed, but can be one-tailed as well).
4. Test statistic (this assumes unequal variances):
   t = (x̄ − ȳ − Δ0) / √(sx²/nx + sy²/ny)
5. P-value from the Student t distribution (on nx + ny − 2 df only if the variances are pooled; the unequal-variance case needs the Welch-adjusted df, covered on a later slide).
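The unequal-variance statistic in step 4 is straightforward to compute. A Python sketch with hypothetical samples (Δ0 = 0):

```python
import math
import statistics

def welch_t(xs, ys):
    """Two-sample t statistic, not assuming equal variances:
    (xbar - ybar) / sqrt(sx^2/nx + sy^2/ny)."""
    nx, ny = len(xs), len(ys)
    vx, vy = statistics.variance(xs), statistics.variance(ys)
    se = math.sqrt(vx / nx + vy / ny)
    return (statistics.mean(xs) - statistics.mean(ys)) / se

# Hypothetical samples
t = welch_t([5.1, 4.8, 6.0, 5.5], [4.2, 3.9, 4.6, 4.4, 4.0])
print(t)
```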

A common theme?
We've seen three different hypothesis tests, and all three have the same form for their test statistic, namely: "the estimate minus the hypothesised value, divided by the standard error of the estimate, has a Student t distribution".

Chi-square / Goodness of Fit (GOF) tests
- Usually for one- or two- (or n-) way tables.
- Null hypothesis: the observed counts follow the hypothesised proportions (one-way), or the row and column classifications are independent (two-way).
- Test statistic: X² = Σ (O − E)² / E, summed over all cells, where O and E are the observed and expected counts.
- P-value from the chi-square distribution (on k − 1 df for one-way and (nr − 1)(nc − 1) for two-way tables).
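The GOF statistic is a one-liner once the expected counts are written down. A Python sketch for a one-way table (the die-roll counts are made up):

```python
# One-way GOF: 60 rolls of a supposedly fair die,
# so each face has expected count 60 / 6 = 10.
observed = [8, 12, 9, 11, 6, 14]
expected = [10] * 6

# X^2 = sum over cells of (O - E)^2 / E
x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(x2)  # compare to chi-square with k - 1 = 5 df
```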

F tests
- Most useful to us in ANOVA.
- Hypotheses vary, but a general one might be H0: σ1² = σ2² against H1: σ1² ≠ σ2², with test statistic F = s1²/s2² compared to the F distribution with n1 − 1 numerator and n2 − 1 denominator df.
- Note this will look slightly different from what we see in ANOVA. There are reasons.


Welch’s modification to the two-sample t-test continued When we assume the variances are not equal, the degrees of freedom value is not nx + ny – 2. In fact they are given by You do not want to calculate this by hand! Statistical Data Analysis - Lecture08 21/03/03

Pooled two-sample t-test
- When we assume the variances are equal, the pooled test is performed instead. This uses a pooled estimate of the standard error, namely
    se = sp √(1/nx + 1/ny), where sp² = [(nx − 1)sx² + (ny − 1)sy²] / (nx + ny − 2).
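The pooled standard error can be sketched the same way (again the function name and inputs are illustrative, not from the lecture):

```python
import math

def pooled_se(vx, nx, vy, ny):
    """Pooled SE of the difference in means, assuming equal
    population variances: sp * sqrt(1/nx + 1/ny)."""
    sp2 = ((nx - 1) * vx + (ny - 1) * vy) / (nx + ny - 2)
    return math.sqrt(sp2) * math.sqrt(1 / nx + 1 / ny)

# Hypothetical: sample variances 4 and 9, sample sizes 10 and 15
print(pooled_se(4.0, 10, 9.0, 15))
```

Note how the pooled variance sp² weights each sample variance by its df, so the larger sample has more influence on the estimate.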

Differences between the pooled test and Welch's test
- Most introductory textbooks recommend the routine use of Welch's modification. Why? The assumption of equal variances is often hard to justify.
- Are there costs associated with this? Yes, but first we need to learn a couple of terms.