The usual course of events for conducting scientific work: "The Scientific Method"

Presentation transcript:

The usual course of events for conducting scientific work: "The Scientific Method"
- Observation: In the intertidal zone, algae seem to be confined to specific areas
- Develop a working hypothesis: There will be a positive correlation of algal abundance and tide height
- Conduct an experiment or a series of controlled systematic observations: Measure tide heights and count number of algae at each
- Appropriate statistical tests: Product-moment correlation
- Confirm or reject hypothesis: There is a positive correlation of tide height and algal abundance
- Reformulate or extend hypothesis: Algae will grow higher on the shore in areas of high wave action

Imagine that you are collecting samples (i.e. individuals) from a population of little ball creatures - Critterus sphericales. Little ball creatures come in 3 sizes: small, medium and large. [Figure: one ball creature of each size]

You take a total of five samples:
- sample 1
- sample 2
- sample 3
- sample 4
- sample 5

[Figure: the real population (all the little ball creatures that exist) shown next to your samples]

Each sample is a representation of the population, BUT no single sample can be expected to accurately represent the whole population. So...

To be statistically valid, each sample must be:
1) Random: A thrown quadrat? Guppies netted from an aquarium?

To be truly random: Choose numbers randomly from 1 to 300

Assign numbers from a random number table
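The random-number step above can be sketched in a few lines of Python. This is only an illustration: the function name `choose_random_quadrats`, the default of five samples, and the fixed seed are assumptions made for the example, not part of the slides. The 1 to 300 range comes from the slide.

```python
import random


def choose_random_quadrats(n_positions=300, n_samples=5, seed=None):
    """Pick n_samples distinct position numbers from 1..n_positions.

    Hypothetical helper: the slides only say to choose numbers
    randomly from 1 to 300.
    """
    rng = random.Random(seed)
    # sample() draws without replacement, so no position is picked twice
    return sorted(rng.sample(range(1, n_positions + 1), n_samples))


# A fixed seed makes the draw repeatable; omit it for a fresh draw each time.
quadrats = choose_random_quadrats(seed=42)
print(quadrats)
```

Drawing without replacement stands in for reading distinct entries off a random number table.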

To be statistically valid, each sample must be: 2) Replicated:

Bark Samples for levels of cadmium

- 10 samples from the same tree: pseudoreplicated, sample size (n) = 1
- 10 samples from 10 different trees: not pseudoreplicated, sample size (n) = 10

IF YOUR DATA ARE:
1. Continuous data
2. Ratio or interval
3. Approximately normal distribution
4. Equal variance (F-test)
5. Conclusions about population based on sample (inductive): sample → population
6. Sample size > 10

CHARACTERIZING DATA

Variables
- dependent: in any experiment, the dependent variable is the one being measured by the experimenter; also known as a response or test variable
- independent: in any experiment, the independent variable is the one being changed by the experimenter; also known as a factor

Nominal data (nominal scales, nominal variables)
- data are in categories
- examples: Drosophila genetic traits, species, sex

Look at the distribution of lizards in the forests. [Figure: Species A, Species B, Species C and Species D on tree branches, tree trunks and the ground]

- Both the dependent and independent variables are nominal/categorical

Lizard species   Ground   Tree trunk   Tree branch   Species totals
Species A
Species B
Species C        9        5            0             14
Species D
Totals

Ordinal data (ordinal scales, ordinal variables)
- data are in categories
- categories are ranked
- examples: grades, surveys, behavioural responses

Interval data (interval scales, interval variables)
- constant size interval
- no true zero point: the zero point depends on the scale used, e.g. temperature
- values can be treated arithmetically (only +, -) to give a meaningful result
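A small sketch of why only + and - are meaningful on an interval scale, using the temperature example. The two readings (10 and 20 degrees Celsius) are made-up values for illustration. Kelvin is brought in only as a temperature scale that does have a true zero.

```python
def celsius_to_kelvin(t_c):
    """Convert degrees Celsius to kelvin (zero kelvin is a true zero point)."""
    return t_c + 273.15


a_c, b_c = 10.0, 20.0  # hypothetical readings in degrees Celsius

# Differences survive a change of scale: 10 degrees apart either way.
diff_c = b_c - a_c
diff_k = celsius_to_kelvin(b_c) - celsius_to_kelvin(a_c)

# Ratios do not: "twice as hot" in Celsius is an artefact of where zero sits.
ratio_c = b_c / a_c
ratio_k = celsius_to_kelvin(b_c) / celsius_to_kelvin(a_c)

print(f"difference: {diff_c:.2f} C, {diff_k:.2f} K")
print(f"ratio: {ratio_c:.3f} in Celsius, {ratio_k:.3f} in kelvin")
```

The difference is the same on both scales, while the ratio changes with the choice of zero, which is why × and ÷ are reserved for ratio data.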

Ratio data (or ratio scales or ratio variables)
- constant size interval
- a zero point with some reality, e.g. height, weight, time
- values can be treated arithmetically (+, -, x, ÷) to give a meaningful result
- can be continuous, or discrete (counts, "number of ...")

Kinds of Variables
Assignment as a discrete (= categorical) or continuous variable can depend on the method of measurement. [Figure: the same measurement recorded either as a continuous variable or as discrete (= categorical) classes: Dappled, Full, Open]

The kind of data you are dealing with is one determining factor in the kind of statistical test you will use.

Two ways of arriving at a conclusion
1. Deductive inference: population → sample
2. Inductive inference: sample → population

Imagine the following experiment: 2 groups of crickets
- Group 1: fed a diet with extra supplements. Mean weight = 12.8
- Group 2: fed a diet with no supplements. Mean weight = 9.49

What you're doing here is comparing two samples that, because you've not violated any of the assumptions we saw before, should represent populations that look like this: [Figure: two frequency distributions; axes: Frequency against Weight]
Are the means of these populations different?

Are the means of these populations different?
To answer this question, use a statistical test. A statistical test is just a method of determining mathematically whether you can say 'yes' or 'no' to this question at a chosen level of confidence.
What test should I use?

IF YOU HAVEN'T VIOLATED ANY OF THE ASSUMPTIONS WE MENTIONED BEFORE...
Number of groups compared:
- 2: t-test (Are the means of two populations the same?)
  - Direction of difference specified? Yes: one-tailed. No: two-tailed.
  - Does each data point in one data set (population) have a corresponding one in the other data set? Yes: paired t-test. No: unpaired t-test.
- other than 2: ANOVA (Are the means of more than two populations the same?)
  - Number of factors being tested:
    - 1: Does each data point in one data set (population) have a corresponding one in the other data sets? Yes: repeated measures ANOVA. No: one way ANOVA.
    - 2: two way ANOVA
    - >2: other tests
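The flow chart can be read as a small decision function. The sketch below is one reading of it, not a definitive rule: the function name and parameter names are invented for the example, and the Yes/No assignment on the ANOVA branch (corresponding data points lead to repeated measures ANOVA) is an assumption, since the chart's arrows are not visible in the transcript.

```python
def choose_parametric_test(n_groups, paired, n_factors=1, directional=False):
    """Return the test suggested by the flow chart (hypothetical helper).

    n_groups    -- number of groups being compared
    paired      -- True if each data point has a corresponding one
                   in the other data set(s)
    n_factors   -- number of factors being tested (ANOVA branch only)
    directional -- True if the direction of the difference is specified
    """
    if n_groups < 2:
        raise ValueError("need at least two groups to compare means")
    if n_groups == 2:
        kind = "paired t-test" if paired else "unpaired t-test"
        tails = "one-tailed" if directional else "two-tailed"
        return f"{kind} ({tails})"
    # More than two groups: the ANOVA family
    if n_factors == 1:
        return "repeated measures ANOVA" if paired else "one way ANOVA"
    if n_factors == 2:
        return "two way ANOVA"
    return "other tests"


# The cricket experiment: two independent groups, no direction specified
print(choose_parametric_test(n_groups=2, paired=False))
```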

A simple t-test
1. State hypotheses
H0: there is no difference between the means of the two populations of crickets (i.e. the extra nutrients had no effect on weight)
H1: there is a difference between the means of the two populations of crickets (i.e. the extra nutrients had an effect on weight)

A simple t-test
2. Calculate a t-value (any stats program does this for you)
3. Use a probability table for the test you used to determine the probability that corresponds to the t-value that was calculated (for the truly masochistic)
Data → Test statistic → Probability
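For readers who want to see step 2 done by hand, here is a minimal sketch of the unpaired (pooled-variance) t statistic. The weights below are invented for illustration and are not the cricket data from the slides. The function name `unpaired_t` is also an assumption. The critical value 2.306 is the standard two-tailed table value for 8 degrees of freedom at the .05 level.

```python
import math
import statistics


def unpaired_t(sample_a, sample_b):
    """Pooled-variance t statistic and degrees of freedom for two samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    df = n_a + n_b - 2
    # Pool the two sample variances (this assumes equal population variances)
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err, df


# Hypothetical weights, NOT the data behind the slides
supplemented = [12.1, 13.0, 12.5, 13.4, 12.9]
unsupplemented = [9.2, 9.8, 9.5, 10.1, 9.0]

t_value, df = unpaired_t(supplemented, unsupplemented)

T_CRIT_DF8 = 2.306  # two-tailed critical t for 8 df at the .05 level
reject_h0 = abs(t_value) > T_CRIT_DF8
print(f"t = {t_value:.2f} with {df} degrees of freedom; reject H0: {reject_h0}")
```

Comparing |t| with a tabulated critical value is the table lookup of step 3. A stats program reports the exact probability instead.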

Unpaired t test
Do the means of Nutrient fed and No nutrient differ significantly?

P value
The two-tailed P value is < , considered extremely significant.
t = with 38 degrees of freedom.

95% confidence interval
Mean difference = (Mean of No nutrient minus mean of Nutrient fed)
The 95% confidence interval of the difference: to

Assumption test: Are the standard deviations equal?
The t test assumes that the columns come from populations with equal SDs. The following calculations test that assumption.
F =
The P value is
This test suggests that the difference between the two SDs is not significant.

Assumption test: Are the data sampled from Gaussian distributions?
The t test assumes that the data are sampled from populations that follow Gaussian distributions. This assumption is tested using the method Kolmogorov and Smirnov:

Group           KS      P Value   Passed normality test?
=============== ======  ========  =======================
Nutrient fed            >0.10     Yes
No nutrient             >0.10     Yes

Interpretation of p < .0001?
This means that, if the two samples came from the same population, there would be less than 1 chance in 10,000 of seeing a difference between the means this large. In the world of statistics, that is too small a chance to have happened randomly, and so H0 is rejected and H1 accepted.

For all statistical tests that you'll use, the conventional cutoff is 5% (p = .05): two samples are treated as coming from the same population unless the probability of a difference this large arising by chance is below .05.

What happens if you violate any of the assumptions?
Step 1 - Panic
Step 2 - It depends on what assumptions have been violated.

Assumption               Other tests     Another solution?
1. Continuous data       Yes
2. Ratio/interval        Yes
3. Normal distribution   Yes             Transform the data
4. Equal variance        Yes - Welch's
5. Sample → Population   Yes
6. N < 10                Yes             Take more samples

Nonparametric Tests
These tests are used when the assumptions of t-tests and ANOVA have been violated.
They are called "nonparametric" because there is no estimation of parameters (means, standard deviations or variances) involved.
Several kinds:
1) Goodness-of-Fit tests - when you calculate an expected value
2) Non-parametric equivalents of parametric tests

SUMMARY
Problem - trying to determine the expected frequencies of any result in a particular experiment
Type of data: Discrete
- 2 categories & Bernoulli process: use a Binomial model to calculate expected frequencies
- > 2 categories: use a Poisson distribution to calculate expected frequencies

Consider the following problem: sampling earthworms in 25 plots. [Table: quadrat number against # of worms]
N = 25
Mean (X̄) = 2.24 worms/quadrat

What is the expected number of worms/quadrat? OR What is the probability of x worms being in a particular quadrat?

Use a Poisson distribution
- >2 mutually exclusive categories
- N is relatively large and p is relatively small
The distribution of worms in space is expected to be random

Formula for a Poisson distribution

$$P_X = \frac{e^{-\mu}\,\mu^{X}}{X!}$$

- P_X: probability of observing X individuals in a category (here, X worms in a quadrat)
- e: base of natural logarithms (= 2.71828...)
- µ: true mean of the population, approximated by the sample mean (here, µ = X̄ = 2.24)
- X: an integer, the number of individuals (here, the number of worms)

# of worms (X)   Probability of finding X worms in a quadrat   Calculation
0                P_0 = e^(-µ) (µ^0 / 0!)                        = e^(-2.24)
1                P_1 = e^(-µ) (µ^1 / 1!)                        = e^(-2.24) (2.24 / 1)
2                P_2 = e^(-µ) (µ^2 / 2!)                        = e^(-2.24) (2.24^2 / 2)
3                P_3 = e^(-µ) (µ^3 / 3!)                        = e^(-2.24) (2.24^3 / 6)
4                P_4 = e^(-µ) (µ^4 / 4!)                        = e^(-2.24) (2.24^4 / 24)
5                P_5 = e^(-µ) (µ^5 / 5!)                        = .05
6                P_6 = e^(-µ) (µ^6 / 6!)
7                P_7 = e^(-µ) (µ^7 / 7!)                        = .006
Could go on forever or to ∞ - whichever comes first!

Practically...
P_0 + P_1 + P_2 + P_3 + P_4 + P_5 + P_6 + P_7 = .998
and P_8 + P_9 + ... = .002
For convenience, lump everything from 8 worms upward into one class: P_8 = .002
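The worm probabilities can be checked with a few lines of Python. This is a sketch using only the two numbers given in the slides (µ = 2.24 and N = 25). The names `poisson_pmf`, `MU` and `N_QUADRATS` are chosen for the example.

```python
import math


def poisson_pmf(x, mu):
    """Probability of observing exactly x individuals when the mean is mu."""
    return math.exp(-mu) * mu ** x / math.factorial(x)


MU = 2.24        # sample mean, worms per quadrat (from the slides)
N_QUADRATS = 25  # number of quadrats sampled (from the slides)

# P_0 .. P_7, then everything from 8 upward lumped into one tail class
probs = [poisson_pmf(x, MU) for x in range(8)]
tail = 1.0 - sum(probs)

for x, p in enumerate(probs):
    # Expected frequency = probability x number of quadrats
    print(f"{x:>2}   P = {p:.3f}   expected quadrats = {N_QUADRATS * p:.2f}")
print(f"8+   P = {tail:.3f}   expected quadrats = {N_QUADRATS * tail:.2f}")
```

Multiplying each probability by the 25 quadrats gives the expected frequencies that a goodness-of-fit test would compare with the observed counts.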

Other kinds of Poisson problems
1. Cell counts in a hemocytometer
2. Number of parasitic mites per fly in a population
3. Number of fish per seine
4. Number of animals in a particular subdivision of the habitat
Poisson distributions are very common in biological work!

Goodness-of-Fit Tests
Use with nominal scale data, e.g. results of genetic crosses.
Also, you're using the population to deduce what the sample should look like.

Classic example - genetic crosses
Do they conform to an "expected" Mendelian ratio?
Back to our little ball creatures - Critterus sphericales
Phenotypes: A_B_, A_bb, aaB_, aabb
Mendelian inheritance - predict a 9:3:3:1 ratio

-sampled 320 animals

                  A_B_    A_bb    aaB_    aabb
Observed (o)
Expected (e)      180     60      60      20
o - e
(o - e)^2
(o - e)^2 / e

$$\chi^2 = \sum \frac{(o - e)^2}{e}$$

df = number of classes - 1 = 3

χ² = 12.52
Critical value for 3 degrees of freedom at the .05 level is 7.82 (χ² table)
Conclusion: Probability of these data fitting the expected distribution is < .05, therefore they are not from a Mendelian population
The actual probability of χ² = 12.52 and df = 3 is .01 > p > .001
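A sketch of the goodness-of-fit calculation in Python. The observed counts in the transcript are lost, so the counts below are hypothetical and will not reproduce the 12.52 on the slide. Only the sample size (320), the 9:3:3:1 ratio and the critical value 7.82 come from the slides. The function name `chi_square_gof` is an assumption.

```python
def chi_square_gof(observed, ratio):
    """Chi-square goodness of fit of observed counts to an expected ratio."""
    total = sum(observed)
    # Expected count per class = total x that class's share of the ratio
    expected = [total * r / sum(ratio) for r in ratio]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1  # number of classes - 1
    return chi2, df, expected


# Hypothetical counts for A_B_, A_bb, aaB_, aabb (they sum to 320)
observed = [150, 70, 70, 30]
chi2, df, expected = chi_square_gof(observed, ratio=[9, 3, 3, 1])

CRITICAL_05_DF3 = 7.82  # critical value for 3 df at the .05 level
print(f"expected = {expected}")
print(f"chi-square = {chi2:.2f}, df = {df}")
print("reject the 9:3:3:1 ratio" if chi2 > CRITICAL_05_DF3 else "consistent with 9:3:3:1")
```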

A little χ² wrinkle - the Yates correction
Formula is

$$\chi^2 = \sum \frac{(o - e)^2}{e}$$

Except if df = 1 (i.e. you're using two categories of data). Then the formula becomes

$$\chi^2 = \sum \frac{(|o - e| - 0.5)^2}{e}$$
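The corrected formula can be sketched as follows. The two-class counts and the 3:1 expectation are invented for illustration, and `chi_square_yates` is a hypothetical name.

```python
def chi_square_yates(observed, expected):
    """Chi-square with the Yates correction, for two categories (df = 1)."""
    if len(observed) != 2 or len(expected) != 2:
        raise ValueError("the Yates correction applies when df = 1")
    # Shrink each absolute deviation by 0.5 before squaring
    return sum((abs(o - e) - 0.5) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical example: 100 offspring tested against a 3:1 ratio
observed_two = [84, 16]
expected_two = [75.0, 25.0]

corrected = chi_square_yates(observed_two, expected_two)
uncorrected = sum((o - e) ** 2 / e for o, e in zip(observed_two, expected_two))
print(f"uncorrected = {uncorrected:.2f}, with Yates correction = {corrected:.2f}")
```

The correction always lowers the statistic, which makes the test more conservative when there are only two classes.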

Summary!

Type of data   Number of samples   Are data related?   Test to use
Nominal        2                   Yes                 McNemar
Nominal        2                   No                  Fisher's Exact
Nominal        >2                  Yes                 Cochran's Q
Ordinal        1                   No                  Kolmogorov-Smirnov
Ordinal+       2                   Yes                 Wilcoxon (paired t-test analogue)
Ordinal+       2                   No                  Mann Whitney U (unpaired t-test analogue)
Ordinal+       >2                  No                  Kruskal Wallis (analogue of one-way ANOVA)
Ordinal        >2                  Yes                 Friedman two-way ANOVA

All of the parametric tests (remember the big flow chart!) have non-parametric equivalents (or analogues)