Statistics Introduction to Statistic [stuh-tis-tik] noun. A numerical fact or datum, especially one computed from a sample.


How long does the ball take to fall? Measured values: See Board How do we decide which of these measured values is correct? How do we discuss the variation in our measurements?

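Mean. Also known as the “average”: add all results and divide by the number of measurements. Equation form: $\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$, where the $x_i$ are the individual measurements and $N$ is the number of measurements.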

Propagation of Uncertainty. Accuracy: sources of inaccuracy include a broken measurement device and parallax. Random error? Precision: sources of imprecision include multiple measurement methods. Systematic error? Compare: low bias with high variability; high bias with low variability.

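Variance and Standard Deviation. Squared deviation: how much variation is there from the mean? Variance: the average squared distance of the observations from the mean. Standard deviation: the square root of the variance, which has the same units as the measurements.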

Error Error is the difference between the measured and expected value Error is how we make sense of differences between two measurements that should be the same Error is NOT mistakes! If you made a mistake, do it again.

Types of Error Descriptions. For a true mean, µ, and standard deviation, σ, the sample mean has an uncertainty equal to the standard deviation divided by the square root of the number of samples: $\sigma_{\bar{x}} = \sigma / \sqrt{N}$. This sample standard error gives a measure of the reliability of the mean: it tells you how close your sample mean should be to the true mean.

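Using the Standard Error. If the expected value lies inside the range of the mean plus or minus the standard error, the hypothesis is confirmed; if it lies outside that range, it is not confirmed. This is the simplest way of using data to confirm or refute a hypothesis. This range is also what is used to create the error bars.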

Example with data. Set of values: 2, 4, 4, 4, 5, 5, 7, 9. Mean: 5. Standard deviation: 2 (dividing by N), or about 2.14 (dividing by N − 1).
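The quantities from the preceding slides can be checked on this data set with a short Python sketch. It uses only the standard library; the variable names are illustrative, and both the divide-by-N and divide-by-(N − 1) conventions are shown because the slide does not say which one it intends.

```python
import math
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(values)

mean = statistics.fmean(values)           # sum of the values divided by n
pop_var = statistics.pvariance(values)    # mean squared deviation (divide by n)
pop_sd = statistics.pstdev(values)        # square root of pop_var
samp_var = statistics.variance(values)    # squared deviations divided by n - 1
samp_sd = statistics.stdev(values)        # square root of samp_var
std_err = samp_sd / math.sqrt(n)          # standard error of the mean

print(f"mean = {mean}")
print(f"population variance = {pop_var}, population SD = {pop_sd}")
print(f"sample variance = {samp_var:.3f}, sample SD = {samp_sd:.3f}")
print(f"standard error of the mean = {std_err:.3f}")
```

The standard error here is computed from the sample standard deviation, which is the usual choice when the true σ is unknown.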

Error Types. False positive: say things are different when they are the same. False negative: say things are the same when they are different.

Decision               | No effect (null hypothesis true) | Effect exists (null hypothesis false)
Reject null hypothesis | Type I error (false positive)    | Correct
Accept null hypothesis | Correct                          | Type II error (false negative)

Group Discussion: What happens to the standard deviation as sample size increases? What does that imply about the standard error? Define standard deviation and standard error in your own words.

Summary Mean Standard Deviation Variance Sample Standard Error

Summary: Often used alternative Mean Standard Deviation Variance Sample Standard Error

Types of Graphs: Continuous vs. Categorical. Examples? Options: times of a ball rolling down a ramp with increasing steepness; sales of coffee, tea, and soft drinks at a restaurant; time it takes students to commute to UCI; SAT scores of different ethnic groups.

Density Curve. Low values of the standard deviation indicate a small spread (values close to the mean); high values indicate a large spread (values far from the mean).

Normal Distribution Particularly important class of density curve Symmetric, unimodal, bell-shaped Mean, μ, is at the center of the curve Probabilities are the area under the curve Total area = 1

The Empirical Rule. In a normal distribution with mean μ and standard deviation σ: 68% of observations fall within 1σ of the mean, 95% of observations fall within 2σ of the mean, and 99.7% of observations fall within 3σ of the mean.
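The three percentages can be reproduced from the normal distribution itself. The sketch below uses the identity P(|Z| < k) = erf(k/√2) for a standard normal variable Z; the function name `prob_within` is illustrative only.

```python
import math

def prob_within(k: float) -> float:
    """Probability that a normal observation lies within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} sigma: {prob_within(k):.4f}")
```

The results (about 0.6827, 0.9545, and 0.9973) round to the 68%, 95%, and 99.7% quoted in the rule.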

Example with data. Set of values: 2, 4, 4, 4, 5, 5, 7, 9. Mean: 5. Standard deviation: 2 (dividing by N), or about 2.14 (dividing by N − 1).

Data Distribution

Confidence Interval

Central Limit Theorem. If X follows a normal distribution with mean μ and standard deviation σ, then x̄ is also normally distributed, with mean μ and standard deviation $\sigma/\sqrt{n}$. What if X is not normally distributed? When sampling from any population with mean μ and standard deviation σ, when n is large, the sampling distribution of x̄ is approximately normal, with mean μ and standard deviation $\sigma/\sqrt{n}$. As the number of measurements increases, the distribution of the sample mean approaches a normal (Gaussian) distribution. Visit this webpage to play with the numbers: s/CLAppClasses/CentLimApplet.htm

Applications. Simulated examples: dice rolling, coin flipping, etc. Exit polling.
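A dice-rolling simulation of the kind mentioned here might look like the following sketch. The function name, sample sizes, and seed are arbitrary choices for illustration, not values taken from the slides. A single fair die has mean 3.5 and standard deviation of about 1.708, so the means of 30 rolls should cluster around 3.5 with a spread near 1.708/√30 ≈ 0.312.

```python
import math
import random
import statistics

def sample_means(n_rolls, n_samples, seed=0):
    """Return n_samples sample means, each the average of n_rolls fair die rolls."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean(rng.randint(1, 6) for _ in range(n_rolls))
        for _ in range(n_samples)
    ]

means = sample_means(n_rolls=30, n_samples=5000)

die_sd = math.sqrt(35 / 12)            # standard deviation of one fair die
predicted_sd = die_sd / math.sqrt(30)  # CLT prediction for the spread of the means

print(f"mean of sample means: {statistics.fmean(means):.3f} (expected 3.5)")
print(f"SD of sample means:   {statistics.stdev(means):.3f} (predicted {predicted_sd:.3f})")
```

A histogram of `means` would look bell-shaped even though a single die roll is uniformly distributed.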

Non-normal Distributions

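Central Limit Theorem Summary. For samples of large size N, the distribution of the sample means will be approximately $N(\mu, \sigma^2/N)$, which is a normal distribution with mean μ and standard deviation $\sigma/\sqrt{N}$. The normal distribution given by the CLT is independent of the type of distribution of the data.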

Where else would this become problematic? Where can it still be used, but issues should be considered?

Questions?

Effective Statistics You might have strong association, but how do you prove causation? (that x causes y?) Good evidence for causation: a well designed experiment where all other variables that cause changes in the response variable are controlled

The Scientific/Statistical Process
1. Formulate a scientific question
2. Decide on the population you are interested in
3. Select a sample
4. Observational study or experiment?
5. Collect data
6. Analyze data
7. State your conclusion

Ways to collect information from sample Anecdotal evidence Available data Observational study Experiment

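Sampling and Inference. [Diagram: sampling goes from the population (parameters μ, σ) to the sample (statistics x̄, s); inference goes from the sample back to the population.]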

Some Cautions. Statistics cannot account for poor experimental design. There is no sharp border between “significant” and “non-significant” correlation, only increasing and decreasing evidence. Lack of significance may be due to a poorly designed experiment.

Fit Tests: t-test, z-test, and χ² test

z-Test

z-test. All normal distributions are the same if we standardize our data: units of size σ, with the mean μ as center. If x is an observation from a normal distribution, the standardized value of x is called the z-score: $z = (x - \mu)/\sigma$. Z-scores tell how many standard deviations away from the mean an observation is.

z-test procedure. To use: find the mean, standard deviation, and standard error. Use these statistics along with the observed value to find the z value. Consult the z-score table to find P(Z) for the determined z. Equation for hypothesis testing: $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$.
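The procedure can be sketched in Python as follows. The function name `z_test` and the numbers in the call are hypothetical and chosen only for illustration; `statistics.NormalDist` stands in for the printed z-score table.

```python
import math
from statistics import NormalDist

def z_test(sample_mean, pop_mean, pop_sd, n):
    """Return the z value and two-sided probability for a sample mean.

    Assumes the population standard deviation pop_sd is known.
    """
    std_err = pop_sd / math.sqrt(n)            # standard error of the mean
    z = (sample_mean - pop_mean) / std_err     # distance from pop_mean in standard errors
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided

# Hypothetical example: 100 measurements averaging 52.0, tested against a
# population mean of 50.0 with known standard deviation 10.0.
z, p = z_test(sample_mean=52.0, pop_mean=50.0, pop_sd=10.0, n=100)
print(f"z = {z:.2f}, two-sided p = {p:.4f}")
```

The two-sided probability is the chance of seeing a sample mean at least this far from the population mean, in either direction, if the null hypothesis were true.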

Example Jacob scores 16 on the ACT. Emily scores 670 on the SAT. Assuming that both tests measure scholastic aptitude, who has the higher score? The SAT scores for 1.4 million students in a recent graduating class were roughly normal with a mean of 1026 and standard deviation of 209. The ACT scores for more than 1 million students in the same class were roughly normal with mean of 20.8 and standard deviation of 4.8.

Example Continued Jacob – ACT Score: 16 Mean: 20.8 Standard Dev.: 4.8 Emily - SAT Score: 670 Mean: 1026 Standard Dev.: 209
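The comparison can be worked through by standardizing each score with its own test's mean and standard deviation. This is a sketch using the numbers given in the example; the helper name `z_score` is illustrative.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

z_jacob = z_score(16, 20.8, 4.8)     # ACT score
z_emily = z_score(670, 1026, 209)    # SAT score

# Fraction of test takers scoring below each student, under the normal model.
pct_jacob = NormalDist().cdf(z_jacob)
pct_emily = NormalDist().cdf(z_emily)

print(f"Jacob: z = {z_jacob:.2f}, percentile = {pct_jacob:.1%}")
print(f"Emily: z = {z_emily:.2f}, percentile = {pct_emily:.1%}")
```

Jacob's z-score is −1.0 and Emily's is about −1.70, so relative to the other test takers Jacob has the higher score, although both are below average.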

Interpreting Results

“Backwards” z-test. What if we are given a probability, P(Z), and we are interested in finding the observed value corresponding to that probability? Find the z-score: set up the probability (it could be two-sided), $P(-z_0 < Z < z_0)$, equal to the given value. Then convert the score to x by $x = \mu + z_0\sigma$.
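As a sketch of the backwards calculation, the code below finds the range that contains the central 95% of SAT scores, reusing the mean (1026) and standard deviation (209) from the earlier example. The 95% figure and the function name are assumptions made for illustration; `NormalDist().inv_cdf` plays the role of reading the z table in reverse.

```python
from statistics import NormalDist

def central_interval(prob, mean, sd):
    """Return (z0, low, high) such that P(low < X < high) = prob for a normal X."""
    # A central probability leaves (1 - prob) / 2 in each tail.
    z0 = NormalDist().inv_cdf(1 - (1 - prob) / 2)
    return z0, mean - z0 * sd, mean + z0 * sd

z0, low, high = central_interval(0.95, mean=1026, sd=209)
print(f"z0 = {z0:.3f}; central 95% of scores lie between {low:.0f} and {high:.0f}")
```

For a one-sided question, such as the score that 90% of students fall below, `inv_cdf(0.90)` would be used directly instead of splitting the remaining probability between two tails.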

t Tests

Necessary assumptions for t-test: 1. The population is normally distributed. 2. The sample is randomly selected from the unknown population. 3. The standard deviation of the unknown population is the same as that of the known population, so we can take the sample standard deviation as an estimate of the population standard deviation.

This is typical of the kind of data many of you may generate. Let’s take a quick look at how the t-test is calculated from the data, using Excel.

z versus t procedures Use z procedures if you know the population standard deviation Use t procedure if you don’t know the population standard deviation Usually we don’t know the population standard deviation, unless told otherwise Central Limit Theorem
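When the population standard deviation is unknown, the t statistic replaces σ with the sample standard deviation. A minimal sketch follows, reusing the eight values from the earlier data example; the hypothesized mean of 4 is an assumption made purely for illustration, and the critical value comes from a standard t table.

```python
import math
import statistics

def one_sample_t(data, hypothesized_mean):
    """Return the t statistic and degrees of freedom for a one-sample t-test."""
    n = len(data)
    sample_mean = statistics.fmean(data)
    sample_sd = statistics.stdev(data)       # divides by n - 1
    std_err = sample_sd / math.sqrt(n)       # estimated standard error of the mean
    t = (sample_mean - hypothesized_mean) / std_err
    return t, n - 1

data = [2, 4, 4, 4, 5, 5, 7, 9]
t, dof = one_sample_t(data, hypothesized_mean=4)

# Two-sided critical value at the 5% level for 7 degrees of freedom,
# read from a standard t table.
T_CRITICAL = 2.365
print(f"t = {t:.3f} with {dof} degrees of freedom")
print("reject null hypothesis" if abs(t) > T_CRITICAL else "no reason to reject null hypothesis")
```

The only change from the z procedure is that the spread is estimated from the sample itself, which is why the result is compared against the t table for n − 1 degrees of freedom rather than the z table.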

χ² test (pronounced “kai”)

χ² test (Goodness-of-fit) User's Guide. The χ² test tells us whether distributions of categorical variables differ from one another. It can be used to determine whether your data conform to a functional fit. It compares multiple means to multiple expected values. Use it only when you have multiple data sets that cannot be combined into one mean and you are comparing means to expected values.

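χ² test: $\chi^2 = \sum_{i=1}^{d} \left( \frac{X_i - \mu_i}{\Delta X_i} \right)^2$, where $X_i$ is each individual mean, $\mu_i$ is each expected value, $\Delta X_i$ is the uncertainty in $X_i$, and $d$ is the number of mean values. The χ²/d table gives the probability that the data match the expected values. In χ²/d, d is the count of independent measurements.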

χ² (Goodness-of-fit) Test Procedure. Find the averages and the uncertainty for each average. Calculate χ² using the averages, uncertainties, and expected values. Count the number of independent variables. Use the table to find the probability of fit accuracy based on χ²/d and the number of independent variables (d).
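The calculation step of this procedure can be sketched as below. The function name and the four data points are hypothetical, made up only to show the arithmetic; the final probability still has to be looked up in a χ²/d table, which this sketch does not reproduce.

```python
def chi_squared(means, uncertainties, expected):
    """Sum of squared differences between each mean and its expected value,
    each measured in units of that mean's uncertainty."""
    return sum(
        ((x - mu) / dx) ** 2
        for x, dx, mu in zip(means, uncertainties, expected)
    )

# Hypothetical averages, their uncertainties, and the values predicted by a fit.
means = [10.2, 19.5, 30.9, 39.6]
uncertainties = [0.5, 0.5, 1.0, 0.8]
expected = [10.0, 20.0, 30.0, 40.0]

chi2 = chi_squared(means, uncertainties, expected)
d = len(means)
print(f"chi^2 = {chi2:.2f}, d = {d}, chi^2/d = {chi2 / d:.3f}")
```

A χ²/d value near 1 means the averages differ from the expected values by roughly one uncertainty each, which is what a good fit with honest uncertainties would produce.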

Example Launch a bottle rocket with several different volumes of water. Measure height of flight multiple times for each volume. You decide you have a fit of: Plot of fit with data on left.

Example: 7 degrees of freedom. Probability of fit ≈ 50%: 50% of the time, chance alone could produce a larger χ² value. No reason to reject the fit. This does not mean that other fits might not match the data better, so try other fits and see which one is closest.

Interpreting Results. The probability indicates how similar the data are to the expected values. A large P means the data are similar to the expected values. A small P means the data are different from the expected values.

Summary: propagation of uncertainty, mean, accuracy vs. precision, error, standard deviation, Central Limit Theorem, fit tests (z-test, t-test, χ² test).