BioStatistics

Why Statistics?
- You want to make the strongest conclusions based on limited data
- Differences in biological systems sometimes cannot be easily observed
  - Random variation?
  - Real difference?

Statistics are sometimes unnecessary
- Large differences in observed events
- And small scatter within groups
- In most instances, though, the use of statistics can provide you with mathematically based conclusions
  - Clinical research
  - Field research

Statistics extrapolate from sample to population
- The only way to draw absolute conclusions about a population is to measure the trait(s) of interest in every individual in that population
- In reality, this is almost always impossible to do
- Thus, randomly sampling some of the individuals can provide information about the entire population
- Random sampling can sometimes be difficult to define
- If your sample is not random, then conclusions drawn from it are not reliable

Samples and Populations
- Quality control
  - A company manufactures 20,000 vials (population) of a vaccine from a single production run
  - About 50 vials (sample) are taken from this production run and analyzed for a variety of characteristics
  - The results on the 50 vials are then extrapolated to the remaining vials
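The quality-control example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the potency readings, their spread, and the variable names are made up, and only the 20,000 and 50 come from the slide.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: one potency reading per vial (made-up values).
population = [random.gauss(100.0, 2.0) for _ in range(20_000)]

# Draw 50 vials at random, without replacement.
sample = random.sample(population, 50)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"population mean: {population_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f}")
```

In a real production run the population mean is unknown. It is computed here only to show that a random sample of 50 tends to land close to it.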

Samples and Populations
- Political polls
  - The number of eligible U.S. voters is about 125,000,000 (population)
  - A few hundred or a few thousand (sample) are asked to respond to political questions

Samples and Populations
- Clinical studies
  - Patients in a clinical study (sample) have a clinical condition (e.g., disease)
  - They rarely reflect the entire population
  - However, they often reflect the population with the condition
  - Sampling humans can be particularly difficult

Samples and Populations
- Field experiments
  - Local variations
  - Impact of weather
  - Environmental conditions/changes
  - Human impact
  - Sampling bias

Samples and Populations
- Laboratory experiments
  - Usually not necessary
  - Highly controlled experiments
  - Single variable
  - Genetically defined organisms
  - Very little variation

What statistical calculations can do
- Statistical estimation
  - The mean calculated from a sample is a precise number
  - However, that number is only an estimate of the mean of the whole population
- Statistical hypothesis testing
  - Helps determine whether an observed difference is due simply to random chance
  - Provides a P value; if P is small, the difference is unlikely to be due to random chance alone and the result is called statistically significant
- Statistical modeling
  - Tests how well experimental data fit a mathematical model
  - The most common form of statistical modeling is linear regression (LR)
  - LR usually determines the best straight line through a set of data points
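As a rough illustration of "the best straight line through a set of data points", the sketch below fits a line by ordinary least squares. The function name `fit_line` and the dose/response numbers are hypothetical and not taken from the slides.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations in x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up example data: response measured at five doses.
doses = [1.0, 2.0, 3.0, 4.0, 5.0]
responses = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(doses, responses)
print(f"response = {slope:.2f} * dose + {intercept:.2f}")
```

The fit says nothing by itself about whether a straight line is the right model. That question is what the modeling step is meant to test.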

What statistical calculations cannot do
- Analysis of a simple experiment
  - Define a population you are interested in
  - Randomly select a sample of subjects to study
  - Randomly split the sample subjects into two groups
  - One group gets one treatment
  - The other group gets another treatment
  - Measure a single variable trait in each subject
  - Use statistical tests to determine if there’s a difference between the groups
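The step "randomly split the sample subjects into two groups" might look like the sketch below. The helper `split_into_two_groups`, the subject labels, and the seed are illustrative assumptions.

```python
import random


def split_into_two_groups(subjects, seed=None):
    """Shuffle a copy of the subjects and cut it in half."""
    rng = random.Random(seed)   # private generator; a seed makes the split repeatable
    shuffled = list(subjects)   # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical sample of 20 subjects.
subjects = [f"subject_{i:02d}" for i in range(1, 21)]
group_a, group_b = split_into_two_groups(subjects, seed=7)

print("treatment A:", group_a)
print("treatment B:", group_b)
```

Shuffling before splitting keeps the assignment independent of the order in which subjects were enrolled, which is the point of randomizing.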

What statistical calculations cannot do
- The problems with real experiments
  - Populations can be more diverse than your samples
  - Samples are often collected for convenience rather than randomly
  - The measured value is a proxy for what you’re really interested in
  - Errors in data collection
    - Data may be recorded incorrectly
    - Assays may not report what you think they report
  - You need to combine different types of measurements to reach an overall conclusion (multiple variables)

Why statistics are difficult to learn
- Deceptive terminology (significant, error, hypothesis)
- Statistical conclusions are never absolute (statistically significant)
- Statistics uses abstract concepts (populations, probabilities)
- Statistics are at the interface of math and science
- Many statistical calculations require complex math

Variables
- Independent variable: the variable scientists manipulate to evaluate a response
- Dependent variable: the variable (i.e., trait) resulting from a treatment with an independent variable

Variables
- Types of variables in biology
  - Measurement variables
    - Continuous
    - Discontinuous
  - Ranked variables
  - Attributes

Variables
- Measurement variables: those whose differing states can be expressed in a numerically ordered fashion
  - Continuous
    - Can assume any value between two distinct points
    - For example, there are infinitely many values between 1.5 and 1.6
    - Include: lengths, areas, volumes, weights, angles, temperatures, periods of time, percentages, rates
  - Discontinuous
    - Discrete variables that can take only fixed numerical values
    - The number of segments in an insect’s appendage may be 4, 5, or 6, but not 4.3

Variables
- Ranked variables
  - Variables that cannot be measured, but can be placed in order
  - For example, order of emergence of pupae without regard to time
- Attribute variables
  - Variables that cannot be measured, but must be expressed qualitatively
  - For example: black/white; pregnant/nonpregnant; male/female; live/dead

Appropriate tests
- 1 variable, 1 sample
  - Measurement variables: computing median and frequencies; computing means; computing standard deviations
  - Attribute variables: confidence limits for percentages; runs test for randomness
- 1 variable, 2 samples
  - Measurement variables: t-tests; test of equality; paired comparisons test
  - Ranked variables: Mann-Whitney U-test; Kolmogorov-Smirnov two-sample test
  - Attribute variables: testing differences between two percentages
- 1 variable, 2+ samples
  - Measurement variables: ANOVA; Tukey-Kramer test
  - Ranked variables: Kruskal-Wallis test; Friedman’s randomized block test
  - Attribute variables: G-test for percentages
- 2 variables, 1 sample
  - Measurement variables: regression analysis; polynomial regression
  - Ranked variables: Olmstead and Tukey’s corner test; ordering test; Spearman’s rank test
  - Attribute variables: chi-square test; Fisher’s exact test

Means and Standard Deviations
- The mean is the average of a measured trait in a population
- In biology, we usually compare two or more populations, which we call groups
- The standard deviation measures how widely the values are spread around the mean (it is the square root of the variance)
- Many statistical tests use means and standard deviations to determine if there are significant differences between groups
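A minimal sketch of these quantities using Python's standard `statistics` module. The five measurements are made up. The sample variance divides the summed squared deviations by n - 1, and the standard deviation is its square root.

```python
import statistics

# Hypothetical measurements of one trait in one group.
group = [4.0, 5.0, 6.0, 7.0, 8.0]

mean = statistics.mean(group)          # arithmetic average
variance = statistics.variance(group)  # sample variance (divides by n - 1)
sd = statistics.stdev(group)           # sample standard deviation = sqrt(variance)

print(f"mean = {mean}, variance = {variance}, sd = {sd:.3f}")
```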

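Null hypothesis
- A default assumption that is taken to be true until the data indicate otherwise (typically, that there is no difference between groups)
- Statistics can be used to reject the null hypothesis
- Rejecting it lends support to an alternative hypothesis
- Nearly every experiment that uses statistics should define null and alternative hypotheses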

Student’s T-test
- Determines if there is a significant difference between the means of two groups of measured data
- Paired: compares matched values between members of a group
- Unpaired: assumes values between members are not related
- Assumes the values fit a normal (also known as Gaussian) distribution (“bell curve”)
  - If they do not, use nonparametric testing
- One-tailed vs. two-tailed
  - One-tailed: you must specify which group will have the larger mean in advance of data collection
  - Two-tailed: you do not know which group will have the larger mean in advance of data collection

Student’s T-test
- P value: is there a significant difference between the means of the two groups?
  - Generally, if the P value is less than or equal to 0.05, then the difference is considered significant
- t-value: positive if the first mean is larger than the second and negative if it is smaller
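To make the sign of the t-value concrete, here is a sketch of the unpaired t statistic with a pooled variance, which assumes the two groups have similar variances. The function name `unpaired_t` and the data are hypothetical. Turning t into a P value needs the t distribution, which the standard library does not provide; a statistics package such as SciPy (`scipy.stats.ttest_ind`) is the usual route.

```python
import math
import statistics


def unpaired_t(first, second):
    """Return (t, degrees_of_freedom) for two independent groups, pooled variance."""
    n1, n2 = len(first), len(second)
    m1, m2 = statistics.mean(first), statistics.mean(second)
    v1, v2 = statistics.variance(first), statistics.variance(second)
    # Pooled variance: each group's variance weighted by its degrees of freedom.
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled * (1 / n1 + 1 / n2))
    return (m1 - m2) / standard_error, n1 + n2 - 2


# Made-up measurements for two groups.
treated = [5.1, 4.9, 5.6, 5.8, 6.0]
control = [4.1, 4.4, 4.0, 4.8, 4.2]

t_value, df = unpaired_t(treated, control)
print(f"t = {t_value:.3f} with {df} degrees of freedom")
```

Swapping the argument order flips the sign of t but not its size, which matches the description of the t-value above.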

Student’s T-test
- Confidence interval (CI)
  - The calculated mean is unlikely to be exactly the same as the mean of the entire population
  - Assumes your samples are randomly collected and fit a normal distribution
  - If your sample is large with a small standard deviation, then your calculated mean is likely close to the actual mean
  - The CI is a range around the calculated mean, based upon sample size and standard deviation
  - If the CI is 95%, then the range calculated around your mean probably (95%) includes the actual mean of the population under study
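A rough sketch of how such an interval can be computed. As a simplifying assumption it uses the normal distribution's critical value (about 1.96 for 95%) instead of the t distribution, so it is only reasonable for fairly large samples. The function name and data are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def approx_confidence_interval(values, confidence=0.95):
    """Return (low, high) around the sample mean, using a normal approximation."""
    n = len(values)
    sample_mean = mean(values)
    # Standard error of the mean shrinks as the sample grows.
    standard_error = stdev(values) / math.sqrt(n)
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return sample_mean - z * standard_error, sample_mean + z * standard_error


# Made-up measurements.
values = [10.0, 12.0, 11.0, 13.0, 9.0, 11.0, 12.0, 10.0]

low, high = approx_confidence_interval(values)
print(f"mean = {mean(values):.2f}, approximate 95% CI = ({low:.2f}, {high:.2f})")
```

With a sample this small, the t distribution would give a somewhat wider interval. The sketch is meant only to show how sample size and standard deviation enter the calculation.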