STATISTICAL ANALYSIS.


Reasons for using statistics: Since we can't measure the whole population, we need to take a sample so that we have data representing the population.

Types of data. Quantitative data: anything that can be expressed as a number, or quantified. Examples of quantitative data are: scores on achievement tests; number of hours of study; weight of a subject. These data can be statistically manipulated.

Qualitative data: information that relates to characteristics or description (observable qualities). Information is often grouped by descriptive category. Examples of qualitative data are: species of plant; type of insect; shades of color; rank of flavor in taste testing. Remember: qualitative data can be "scored" and evaluated numerically.

UNCERTAINTY (measurement reading error): When you do a simple measurement using a micrometer, voltmeter, thermometer, graduated cylinder or stopwatch, the uncertainty is +/- half of the smallest division. When you do a measurement using a ruler or digital display, the uncertainty is +/- the smallest division (or as otherwise stated on the instrument).

MEAN: The "mean" is the "average" you're used to, where you add up all the numbers and then divide by the number of numbers. Mean = (600 + 470 + 170 + 430 + 300) / 5 = 1970 / 5 = 394 mm
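The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: the variable names are illustrative, and the five values are the heights (in mm) used in the calculation above.

```python
# Heights in mm, taken from the worked example on the slide.
heights_mm = [600, 470, 170, 430, 300]

# Mean = sum of the values divided by the number of values.
mean_mm = sum(heights_mm) / len(heights_mm)

print(mean_mm)  # 394.0
```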

STANDARD DEVIATION is used to summarize the spread of values around the mean. Means do not tell us everything about a sample. Samples can be very uniform, with the data all bunched around the mean (Figure 1), or they can be spread out a long way from the mean (Figure 2). The statistic that measures this spread is called the standard deviation. The wider the spread of scores, the larger the standard deviation.

The good thing about the standard deviation is that it is useful. Now we can show which dog heights are within +/- 1 standard deviation (147 mm) of the mean. For data that have a normal distribution, about 68% of the data lie within +/- one standard deviation of the mean. Using the standard deviation we have a "standard" way of knowing what is normal, and what is extra large or extra small. Rottweilers are tall dogs. And Dachshunds are a bit short ... but don't tell them!
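As a rough illustration, the "within one standard deviation" check can be sketched in Python using the five heights from the mean example (600, 470, 170, 430 and 300 mm). The sketch assumes the population standard deviation (`statistics.pstdev`), because that is what reproduces the 147 mm quoted on this slide; the sample standard deviation (`statistics.stdev`) would give a larger value of about 165 mm.

```python
import statistics

# Heights in mm from the mean example.
heights_mm = [600, 470, 170, 430, 300]

mean_mm = statistics.mean(heights_mm)    # 394
sd_mm = statistics.pstdev(heights_mm)    # population SD, about 147.3

# Range covered by the mean plus or minus one standard deviation.
lower, upper = mean_mm - sd_mm, mean_mm + sd_mm
within_one_sd = [h for h in heights_mm if lower <= h <= upper]

print(round(sd_mm))     # 147
print(within_one_sd)    # [470, 430, 300]
```

The two heights that fall outside the range, 600 mm and 170 mm, are the "extra large" and "extra small" values.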

How standard deviation is calculated
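One common way to carry out the calculation is sketched below in Python, again using the five heights from the mean example. The variable names are illustrative. Both the population form (divide by n) and the sample form (divide by n - 1) are shown, since which one applies depends on whether the data are the whole population or a sample from it.

```python
import math

# Heights in mm from the mean example.
heights_mm = [600, 470, 170, 430, 300]
n = len(heights_mm)

# Step 1: calculate the mean.
mean_mm = sum(heights_mm) / n

# Step 2: square each value's deviation from the mean.
squared_devs = [(h - mean_mm) ** 2 for h in heights_mm]

# Step 3: average the squared deviations to get the variance.
population_variance = sum(squared_devs) / n        # divide by n
sample_variance = sum(squared_devs) / (n - 1)      # divide by n - 1

# Step 4: the standard deviation is the square root of the variance.
population_sd = math.sqrt(population_variance)
sample_sd = math.sqrt(sample_variance)

print(round(population_sd, 1))  # 147.3
print(round(sample_sd, 1))      # 164.7
```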

ERROR BARS An error bar is a line through a point on a graph, parallel to one of the axes, which represents the uncertainty or variation of the corresponding coordinate of the point.  In IB Biology, the error bars most often represent the standard deviation of a data set. 

What do error bars suggest? If the bars show extensive overlap, it is likely that there is not a significant difference between those values.

Graphing mean with standard deviation as error bars. Figure 1: Mean length of mollusc shell in the different types of water. (Error bars represent one standard deviation.) For normally distributed data, error bars of one standard deviation span about 68% of each data set. The very small overlap between the error bars begins to suggest that the two sets of data are likely different.

How do we determine if the two samples are different and not just due to chance or sampling error?

INDEPENDENT T-TEST: The t-test is a statistical test that compares the mean and standard deviation of two samples to see if there is a significant difference between them.

In an experiment, a t-test might be used to calculate whether differences seen between the control and the experimental group are a result of the manipulated (independent) variable or simply the result of chance or sampling. The unpaired t-test would be used to determine if there is a significant difference between the control and treated enzyme activities. The paired t-test would be used to determine if there is a significant difference between the pre- and post-treatment blood pressures.

In any significance test, there are two possible hypotheses. Null hypothesis (H0): "There is not a significant difference between the two groups; any observed differences may be due to chance and sampling error." Alternative hypothesis (H1): "There is a significant difference between the two groups; the observed differences are most likely not due to chance or sampling error."

Null hypothesis (H0): There is no significant difference between the control and treatment group enzyme activity; the difference we see in the means of the two groups may be due to chance and sampling error. Alternative hypothesis (H1): There is a significant difference between the control and treatment group enzyme activity; the difference we see in the means of the two groups is most likely not due to chance or sampling error.

In biology the critical probability left to chance or sampling error is usually taken as 0.05 (or 5%). This may seem very low, but it reflects the fact that biology experiments are expected to produce quite similar results.

T-test conclusion: If the calculated P (probability) value from your t-test is GREATER than 0.05, accept (strictly, fail to reject) the NULL HYPOTHESIS: there is no significant difference, and the observed difference may be due to chance. If the calculated P value from your t-test is LESS than 0.05, reject the NULL and accept the ALTERNATIVE HYPOTHESIS: there is a significant difference, and your independent variable is most likely affecting it.
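A minimal Python sketch of an unpaired t-test follows. It assumes equal variances in the two groups (the pooled, Student form) and uses made-up numbers, not data from the presentation. Python's standard library has no t-distribution, so instead of a P value the sketch compares the t statistic with the tabulated two-tailed critical value for P = 0.05; the two decisions are equivalent. The function name `pooled_t_statistic` and the sample values are hypothetical.

```python
import math
import statistics


def pooled_t_statistic(sample_a, sample_b):
    """Return (t, degrees of freedom) for an unpaired, equal-variance t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # statistics.variance uses the n - 1 (sample) denominator.
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df


# Hypothetical measurements, invented purely for illustration.
control = [1, 2, 3, 4, 5]
treated = [3, 4, 5, 6, 7]

t, df = pooled_t_statistic(control, treated)

# Two-tailed critical value of t for P = 0.05 with 8 degrees of freedom,
# taken from a standard t table.
T_CRITICAL_DF8 = 2.306

# |t| above the critical value corresponds to P < 0.05: reject the null.
significant = abs(t) > T_CRITICAL_DF8
print(round(t, 3), df, significant)
```

With these invented values the means differ by 2, but |t| falls just short of the critical value, so the null hypothesis is not rejected at the 0.05 level.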

Analysis of Variance (ANOVA): The ANOVA test is a single statistical test that determines the significance of the difference between the means of three or more groups, and can be done in place of multiple t-tests. The same rules as for the t-test apply (null hypothesis, alternative hypothesis, P value). For example, the ANOVA test would be used to determine if there is a significant difference in the mean number of bird species in the seven locations.
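The idea behind a one-way ANOVA can be sketched in Python as the ratio of the variation between group means to the variation within groups (the F statistic). This is a simplified illustration with three small invented groups, not the seven-location bird data mentioned above; the function name `one_way_anova_f` is hypothetical, and the result is compared with a tabulated critical value because the standard library has no F distribution.

```python
import statistics


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
    )
    # Variation of each value around its own group mean.
    ss_within = sum(
        (v - statistics.mean(g)) ** 2 for g in groups for v in g
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Hypothetical counts for three groups, invented for illustration.
groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_stat, df_b, df_w = one_way_anova_f(groups)

# Critical value of F for P = 0.05 with (2, 6) degrees of freedom,
# taken from a standard F table.
F_CRITICAL_2_6 = 5.14

significant = f_stat > F_CRITICAL_2_6
print(f_stat, df_b, df_w, significant)
```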

Correlation: When analyzing an experiment you are very often looking for an association between variables. This can be a correlation, to see if two variables vary together, or a relationship, to see how one variable affects another. One test is the Pearson correlation coefficient (r), which ranges from +1 (perfect positive correlation) through 0 (no correlation) to -1 (perfect negative correlation).

If your correlation coefficient is a positive number, then you know that you have a direct, positive relationship. This means that as one variable increases (or decreases), the values of the other variable tend to go in the same direction in a predictable manner. If one increases, so does the other. If one decreases, so does the other.

If your correlation coefficient is a negative number you can tell, just by looking at it, that there is an inverse, negative relationship between the two variables. A negative relationship means that as values of one variable increase (go up), the values of the other variable tend to decrease (go down) in a predictable manner.
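The Pearson coefficient can be computed directly from its definition, as in the Python sketch below. The function name `pearson_r` and the three small data sets are invented for illustration; they are chosen so that one pair rises together perfectly and the other moves in exactly opposite directions.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values, invented for illustration.
x_values = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]     # increases as x increases
falling = [10, 8, 6, 4, 2]    # decreases as x increases

r_positive = pearson_r(x_values, rising)    # perfect positive correlation
r_negative = pearson_r(x_values, falling)   # perfect negative correlation
print(r_positive, r_negative)
```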

Correlation does not imply causation; further testing is needed to establish it.

“Why is this Biology?” Variation in populations. The key methodology in Biology is hypothesis testing through experimentation. Carefully designed and controlled experiments and surveys give us quantitative (numeric) data that can be compared. We can use the data collected to test our hypothesis and form explanations of the processes involved… but only if we can be confident in our results. We therefore need to be able to evaluate the reliability of a set of data and the significance of any differences we have found in the data. Variation in populations leads to variability in results, which affects confidence in conclusions. Image: 'Transverse section of part of a stem of a Dead-nettle (Lamium sp.) showing a vascular bundle and part of the cortex' http://www.flickr.com/photos/71183136@N08/6959590092 Found on flickrcc.net

Hummingbirds are nectarivores (herbivores that feed on the nectar of some species of flower). In return for food, they pollinate the flower. This is an example of mutualism: benefit for all. As a result of natural selection, hummingbird bills have evolved. Birds with a bill best suited to their preferred food source have the greater chance of survival. Photo: Archilochus colubris, from wikimedia commons, by Dick Daniels.

Researchers studying comparative anatomy collect data on bill length in two species of hummingbirds: Archilochus colubris (ruby-throated hummingbird) and Cynanthus latirostris (broad-billed hummingbird). To do this, they need to collect sufficient relevant, reliable data so they can test the null hypothesis (H0) that: “there is no significant difference in bill length between the two species.” Photo: Archilochus colubris (male), wikimedia commons, by Joe Schneid

The sample size must be large enough to provide sufficient reliable data and for us to carry out relevant statistical tests for significance. We must also be mindful of uncertainty in our measuring tools and error in our results. Photo: Broadbilled hummingbird (wikimedia commons).

The mean is a measure of the central tendency of a set of data. Calculate the mean using your calculator (sum of values / n) or Excel: =AVERAGE(highlight raw data). n = sample size; the bigger the better. In this case n = 10 for each group. All values should be centred in the cell, with decimal places consistent with the measuring tool uncertainty.
Table 1: Raw measurements of bill length in A. colubris and C. latirostris. Bill length (±0.1 mm). Columns: n, A. colubris, C. latirostris. 1 13.0 17.0 2 14.0 18.0 3 15.0 4 5 19.0 6 16.0 7 8 20.0 9 10 Mean s

The mean is a measure of the central tendency of a set of data.
Table 1: Raw measurements of bill length in A. colubris and C. latirostris. Bill length (±0.1 mm). Columns: n, A. colubris, C. latirostris. 1 13.0 17.0 2 14.0 18.0 3 15.0 4 5 19.0 6 16.0 7 8 20.0 9 10 Mean: 15.9 (A. colubris), 18.8 (C. latirostris). s
Notes on the table: give a descriptive table title and number. Uncertainties must be included. Raw data and the mean need to have consistent decimal places (in line with the uncertainty of the measuring tool).