StatCrunch Workshop Hector Facundo.



Resources
Math Lab Website: http://www.cos.edu/Academics/MathEngineering/Pages/Math-Lab.aspx
Small Group Math Tutoring: http://www.cos.edu/Library/Services/TutorialCenter/Pages/Small-Group-Math-Tutorial-Hours.aspx

Basic Summary Stats
Stat -> Summary Stats -> Column
Calculates statistics (mean, median, mode, Q1, Q3, standard deviation, variance, etc.) from a column variable.
Let’s play around with the “Test Scores” column. Compute:
- Mean
- Min, Q1, Median, Q3, Max
- Standard Deviation and Unadjusted Standard Deviation
Note: Standard Deviation is for sample data (it divides by n - 1), while Unadjusted Standard Deviation is for population data (it divides by n).
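For readers who want to check the difference between the two standard deviations outside StatCrunch, here is a minimal Python sketch. The scores are hypothetical stand-ins, since the workshop's “Test Scores” column is not reproduced in this transcript.

```python
import statistics

# Hypothetical scores; NOT the workshop's actual "Test Scores" column.
test_scores = [62, 75, 81, 88, 94]

mean = statistics.mean(test_scores)
median = statistics.median(test_scores)
sample_sd = statistics.stdev(test_scores)  # sample SD: divides by n - 1
pop_sd = statistics.pstdev(test_scores)    # unadjusted SD: divides by n

print(f"mean = {mean}, median = {median}")
print(f"sample SD = {sample_sd:.4f}, unadjusted SD = {pop_sd:.4f}")
```

The sample value is always the larger of the two, and the gap shrinks as the number of observations grows.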

Simple Graphs
Let’s create a histogram of the “Test Scores” data with a starting value of 50 and a class width of 10.
Graph -> Histogram
Types: Frequency Histogram, Relative Frequency Histogram

Simple Graphs
Let’s make a split bar plot of the “Education” data with the salaries for men and women.
Graph -> Chart -> Column

Data is Your Friend!
Manipulate values, columns, rows, etc.
Data -> Arrange -> Stack
Allows you to stack observations from multiple columns into one column.
Let’s stack the “Height” data for men and women into one column.

Data is Your Friend!
Data -> Compute -> Expression
Allows you to do arithmetic operations (+, -, *, /).
Allows you to do operations with more than one column.
Has built-in functions for easier “equation” building. Some built-in functions:
- Mean -> mean()
- Sum -> sum()
- Cumulative Sum -> cumsum() (good for cumulative frequencies)
- Standard Deviation -> std()
- Unadjusted Standard Deviation -> ustd()

Data is Your Friend!
Some simple computations:
- Add 2 to every score in “Test Scores”.
- Subtract the height of women from the height of men (i.e. Height (Men) - Height (Women)).
- Subtract the mean of “Test Scores” from every value in “Test Scores”.
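The three computations can also be sketched in plain Python to show what each expression does to a column. The column values below are hypothetical and only illustrate the arithmetic.

```python
import statistics

# Hypothetical columns standing in for the workshop data set.
test_scores = [70, 80, 90]
height_men = [70.0, 68.5, 72.0]
height_women = [64.0, 65.5, 63.0]

# Add 2 to every score.
plus_two = [score + 2 for score in test_scores]

# Height (Men) - Height (Women), row by row.
height_diff = [m - w for m, w in zip(height_men, height_women)]

# Subtract the column mean from every score (centering).
score_mean = statistics.mean(test_scores)
centered = [score - score_mean for score in test_scores]

print(plus_two, height_diff, centered)
```

Each result has the same number of rows as the input column, which is also how a computed column behaves in a spreadsheet-style data table.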

Graph Revisited
Create a cumulative frequency bar graph for the “Frequency” column.
Step 1: Get cumulative frequency counts from the “Frequency” column using Data -> Compute -> Expression.
Step 2: Graph the cumulative frequencies using Graph -> Chart -> Column.
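Step 1 is a running total. As a rough illustration, this Python sketch computes cumulative frequencies from a hypothetical “Frequency” column (the workshop's actual counts are not in the transcript).

```python
from itertools import accumulate

# Hypothetical class frequencies, one entry per class.
frequency = [3, 5, 8, 4]

# Running total: each entry is the sum of all frequencies up to that class.
cumulative = list(accumulate(frequency))

print(cumulative)
```

The last cumulative value always equals the total number of observations, which is a quick check that the column was built correctly.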

Probability with Stat
Discrete Random Variable Example:
X (Outcome)    P(x) “Probability”
0              0.1
1              0.15
2              0.3
3              0.25
4              0.2
Mean: $\mu_X = \sum_{i=1}^{n} X_i \cdot P(X_i) = 0(0.1) + 1(0.15) + 2(0.3) + 3(0.25) + 4(0.2) = 2.3$
Variance: $\sigma_X^2 = \sum_{i=1}^{n} (X_i - \mu_X)^2 \cdot P(X_i) = (0 - 2.3)^2(0.1) + \dots + (4 - 2.3)^2(0.2) = 1.51$
Standard Deviation: $\sigma_X = \sqrt{\sigma_X^2} = \sqrt{1.51} \approx 1.2288$
Stat -> Calculators -> Custom
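The same mean, variance, and standard deviation can be checked with a few lines of Python. This is only a sketch of the arithmetic, using the outcomes and probabilities from the example.

```python
import math

outcomes = [0, 1, 2, 3, 4]
probs = [0.1, 0.15, 0.3, 0.25, 0.2]

# A valid probability distribution must sum to 1.
assert abs(sum(probs) - 1.0) < 1e-9

# Mean: sum of x * P(x).
mean = sum(x * p for x, p in zip(outcomes, probs))

# Variance: sum of (x - mean)^2 * P(x).
variance = sum((x - mean) ** 2 * p for x, p in zip(outcomes, probs))

sd = math.sqrt(variance)

print(f"mean = {mean:.2f}, variance = {variance:.2f}, sd = {sd:.4f}")
```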

Probability with Stat
Normal Distribution
Stat -> Calculators -> Normal
Say X comes from a normal distribution with a mean of 0 and a standard deviation of 1 (Standard Normal). How would we find the following probabilities?
P(X ≥ 0.5)
P(X ≤ -1)
P(-3 ≤ X ≤ -2) (Hint: Use “Between”)
Answers:
P(X ≥ 0.5) = 0.3085
P(X ≤ -1) = 0.1587
P(-3 ≤ X ≤ -2) = 0.0214
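These three probabilities can be reproduced with `NormalDist` from Python's standard `statistics` module. This is a sketch for checking the calculator's answers, not a replacement for the StatCrunch steps.

```python
from statistics import NormalDist

z = NormalDist(mu=0, sigma=1)  # standard normal

p_upper = 1 - z.cdf(0.5)          # P(X >= 0.5)
p_lower = z.cdf(-1)               # P(X <= -1)
p_between = z.cdf(-2) - z.cdf(-3) # P(-3 <= X <= -2)

print(f"{p_upper:.4f} {p_lower:.4f} {p_between:.4f}")
```

The “Between” probability is just the difference of two cumulative probabilities, which is what the last line computes.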

Probability with Stat
Normal Distribution
Stat -> Calculators -> Normal
Conversely, what if X has a normal distribution with a mean of 0 and a standard deviation of 1, and we want the value(s) that give us the upper 5%? The lower 1%? The middle 90%?
P(X ≥ ?) = 0.05
P(X ≤ ?) = 0.01
P(? ≤ X ≤ ?) = 0.90
Answers:
P(X ≥ 1.645) = 0.05
P(X ≤ -2.326) = 0.01
P(-1.645 ≤ X ≤ 1.645) = 0.90
This will be very helpful when finding “critical values”.
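Going from a probability back to a cutoff value uses the inverse of the cumulative distribution. A minimal Python sketch, assuming the standard normal distribution (mean 0, standard deviation 1) that the answers 1.645 and -2.326 correspond to:

```python
from statistics import NormalDist

z = NormalDist(mu=0, sigma=1)  # standard normal

upper_5 = z.inv_cdf(1 - 0.05)  # value with 5% of the area above it
lower_1 = z.inv_cdf(0.01)      # value with 1% of the area below it

# Middle 90% leaves 5% in each tail.
mid_low = z.inv_cdf(0.05)
mid_high = z.inv_cdf(0.95)

print(f"{upper_5:.3f} {lower_1:.3f} ({mid_low:.3f}, {mid_high:.3f})")
```

Note that the upper-5% cutoff and the upper end of the middle 90% are the same number, because both leave 5% of the area in the right tail.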

Confidence Intervals
100(1 - α)% confidence intervals for μ (Mean)
Say we want a 95% Confidence Interval for the mean of “Test Scores”.
Note: I’m assuming we are using the t-distribution for this problem.
By hand: $\bar{X} \pm t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}}$ (lots of work!)
Stat -> T-Stats -> One Sample -> With Data

Confidence Intervals
100(1 - α)% confidence intervals for μ (Mean)
Say we want a 90% Confidence Interval for the mean and we are given the following data: Sample Mean = 34.5, Sample Standard Deviation = 2.3, Sample Size = 20.
Note: I’m still assuming we are using the t-distribution for this problem.
Stat -> T-Stats -> One Sample -> With Summary
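To see what the “With Summary” calculation does, here is a standard-library Python sketch of a t-based interval for the summary values above. Python's standard library has no t-distribution, so the helper functions `t_pdf`, `t_cdf`, and `t_quantile` are illustrative assumptions that approximate the critical value numerically; in practice a statistics package's t quantile function would normally be used instead.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x), via Simpson's rule on [0, |x|] plus symmetry about 0."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_quantile(p, df):
    """Value t with P(T <= t) = p, found by bisection."""
    lo, hi = -100.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def t_interval(mean, sd, n, confidence):
    """Two-sided confidence interval: mean +/- t * sd / sqrt(n)."""
    alpha = 1 - confidence
    t_crit = t_quantile(1 - alpha / 2, n - 1)
    margin = t_crit * sd / math.sqrt(n)
    return mean - margin, mean + margin

low, high = t_interval(34.5, 2.3, 20, 0.90)
print(f"90% CI: ({low:.3f}, {high:.3f})")
```

With 19 degrees of freedom the critical value is about 1.729, so the interval comes out to roughly 34.5 ± 0.89.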

Confidence Intervals
100(1 - α)% confidence intervals for p (Proportion)
We have political “Party” data with the political affiliation of 50 people (Rep, Dem, Ind). We want a 92% Confidence Interval for the true proportion of people who are Republican.
Note: I’m using the Normal distribution for this problem.
By hand: $\hat{p} \pm Z_{1-\alpha/2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$ (more work!)
Stat -> Proportion Stats -> One Sample -> With Data

Confidence Intervals
100(1 - α)% confidence intervals for p (Proportion)
Say we did a survey in which we asked 2500 people whether they eat tofu, and 768 people responded yes. We want a 98% Confidence Interval for the true proportion of people who eat tofu.
“Successes” = 768, Observations = 2500
Stat -> Proportion Stats -> One Sample -> With Summary
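The tofu example can be checked with a short Python sketch of the normal-approximation interval, p-hat plus or minus z times the standard error. The function name `proportion_interval` is illustrative, and StatCrunch may offer other interval methods whose results would differ slightly.

```python
import math
from statistics import NormalDist

def proportion_interval(successes, n, confidence):
    """Normal-approximation confidence interval for a proportion."""
    p_hat = successes / n
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z_crit * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

low, high = proportion_interval(768, 2500, 0.98)
print(f"98% CI: ({low:.4f}, {high:.4f})")
```

Here the sample proportion is 768 / 2500 = 0.3072, and the interval extends about 0.021 on either side of it.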

Hypothesis Testing Tips
$H_0: \mu = 80$
$H_A: \mu \neq 80$ (this is referred to as the alternative hypothesis)
Hints to set up the Alternative Hypothesis:
Use < for: “Less Than”, “Smaller”, “Lower”
Use > for: “Greater Than”, “More”, “Higher”
Use ≠ for: “Different”, “Difference”, “Change”, “If they are the same”
Conclusions Guide:
P-Value > α: Fail to Reject Null Hypothesis
P-Value < α: Reject Null Hypothesis

Hypothesis Testing – One Sample Mean Test
Say we believe that the average of all test scores in math classes that took a particular test is 80. However, others believe it is not 80. We set up a hypothesis test to test this claim at the α = 0.05 level using the “Test Scores” column as our random sample.
Note: I’m assuming a t-distribution for this problem.
$H_0: \mu = 80$
$H_A: \mu \neq 80$
Stat -> T-Stats -> One Sample -> With Data

Hypothesis Testing – One Sample Proportion Test
A previous study suggested that 48% of all voters in the United States identify as Republican. However, researchers believe the true value to be higher. They take a sample of people’s voting preferences, recorded in the “Party” column. We set up a hypothesis test to test this claim at the α = 0.10 level.
$H_0: p = 0.48$
$H_A: p > 0.48$
Stat -> Proportion Stats -> One Sample -> With Data
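The mechanics of this right-tailed test can be sketched in Python. The “Party” data is not included in the transcript, so the count of 28 Republicans out of 50 below is purely hypothetical, and the function name `proportion_z_test_upper` is illustrative.

```python
import math
from statistics import NormalDist

def proportion_z_test_upper(successes, n, p0):
    """Right-tailed z-test for H0: p = p0 against HA: p > p0."""
    p_hat = successes / n
    se = math.sqrt(p0 * (1 - p0) / n)  # standard error under H0
    z = (p_hat - p0) / se
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value

# HYPOTHETICAL counts, not the workshop's data.
z, p_value = proportion_z_test_upper(successes=28, n=50, p0=0.48)
alpha = 0.10
decision = "Reject H0" if p_value < alpha else "Fail to reject H0"

print(f"z = {z:.4f}, p-value = {p_value:.4f}, {decision}")
```

With these made-up counts the p-value is larger than α = 0.10, so the sketch fails to reject the null hypothesis; the real data may lead to a different conclusion.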

Hypothesis Testing – Paired Difference Test
We want to know if a structured tutoring session will increase test scores for students. We give them a test before the tutoring session, and then we test the students after the tutoring session with a similar but slightly different test. We set up a hypothesis test to test this claim at the α = 0.05 level.
$d_i = \text{After}_i - \text{Before}_i$
$H_0: \mu_d = 0$
$H_A: \mu_d > 0$
Stat -> T-Stats -> Paired

Hypothesis Testing – Two Sample Mean Test
A study was conducted to compare the average body temperature of males and females. For 2355 males, the average body temperature was 98.105 degrees F with a standard deviation of 0.699 F. For 1985 females, the average body temperature was 98.342 degrees F with a standard deviation of 0.743 F. We set up a hypothesis test to test the claim that males have lower body temperatures than females at the α = 0.01 level.
$H_0: \mu_{\text{Males}} - \mu_{\text{Females}} = 0$
$H_A: \mu_{\text{Males}} - \mu_{\text{Females}} < 0$
Stat -> T-Stats -> 2 Sample -> With Summary
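As a rough check on this example, the sketch below computes the unpooled two-sample test statistic from the summary values. It uses a normal approximation for the p-value, which is an assumption made here because both samples are very large; the T-Stats menu uses a t-distribution, so its p-value would differ very slightly. The function name `two_sample_lower_tail` is illustrative.

```python
import math

def two_sample_lower_tail(mean1, sd1, n1, mean2, sd2, n2):
    """Left-tailed test of H0: mu1 - mu2 = 0 using unpooled standard error."""
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    stat = (mean1 - mean2) / se
    # Normal approximation to the lower-tail probability (large samples).
    p_value = 0.5 * math.erfc(-stat / math.sqrt(2))
    return stat, p_value

stat, p_value = two_sample_lower_tail(98.105, 0.699, 2355, 98.342, 0.743, 1985)
alpha = 0.01
decision = "Reject H0" if p_value < alpha else "Fail to reject H0"

print(f"test statistic = {stat:.3f}, p-value = {p_value:.3g}, {decision}")
```

The test statistic is far below zero, so the p-value is essentially 0 and the null hypothesis is rejected at the 0.01 level.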

Hypothesis Testing – Two Sample Proportion Test
Time magazine reported the result of a telephone poll of 800 adult Americans. The question posed to the Americans who were surveyed was: "Should the federal tax on cigarettes be raised to pay for health care reform?" The results of the survey were:
Is there sufficient evidence at the α = 0.05 level, say, to conclude that the two populations (smokers and non-smokers) differ significantly with respect to their opinions?
$H_0: p_{\text{Smokers}} - p_{\text{Non-smokers}} = 0$
$H_A: p_{\text{Smokers}} - p_{\text{Non-smokers}} \neq 0$
Stat -> Proportion Stats -> 2 Sample -> With Summary

Thank You For Coming! If you have any suggestions on how we can improve the workshop, send an email to hector@cos.edu Don’t forget, you can get extra math help in the Math Lab in the Learning Resource Center.