Topic 1: Statistical Analysis

Slides:



Advertisements
Similar presentations
Using the T-test IB Biology Topic 1.
Advertisements

Are our results reliable enough to support a conclusion?
Review Feb Adapted from: Taylor, S. (2009). Statistical Analysis. Taken from:
RESEARCH METHODOLOGY & STATISTICS LECTURE 6: THE NORMAL DISTRIBUTION AND CONFIDENCE INTERVALS MSc(Addictions) Addictions Department.
Standard Normal Table Area Under the Curve
Psy B07 Chapter 1Slide 1 ANALYSIS OF VARIANCE. Psy B07 Chapter 1Slide 2 t-test refresher  In chapter 7 we talked about analyses that could be conducted.
Answering questions about life with statistics ! The results of many investigations in biology are collected as numbers known as _____________________.
TOPIC 1 STATISTICAL ANALYSIS
Statistical Analysis Statistical Analysis
STATISTICS For Research. 1. Quantitatively describe and summarize data A Researcher Can:
STATISTICS For Research. Why Statistics? 1. Quantitatively describe and summarize data A Researcher Can:
t-Test: Statistical Analysis
APPENDIX B Data Preparation and Univariate Statistics How are computer used in data collection and analysis? How are collected data prepared for statistical.
Statistical Analysis A Quick Overview. The Scientific Method Establishing a hypothesis (idea) Collecting evidence (often in the form of numerical data)
Topics: Statistics & Experimental Design The Human Visual System Color Science Light Sources: Radiometry/Photometry Geometric Optics Tone-transfer Function.
Statistics The POWER of Data. Statistics: Definition Statistics is the mathematics of the collection, organization, and interpretation of numerical data.
Statistical Analysis Mean, Standard deviation, Standard deviation of the sample means, t-test.
6.1 Statistical Analysis.
Understanding Numerical Data
Measures of Dispersion CUMULATIVE FREQUENCIES INTER-QUARTILE RANGE RANGE MEAN DEVIATION VARIANCE and STANDARD DEVIATION STATISTICS: DESCRIBING VARIABILITY.
NOTES The Normal Distribution. In earlier courses, you have explored data in the following ways: By plotting data (histogram, stemplot, bar graph, etc.)
Statistical Analysis Topic – Math skills requirements.
Statistical analysis Outline that error bars are a graphical representation of the variability of data. The knowledge that any individual measurement.
Measures of central tendency are statistics that express the most typical or average scores in a distribution These measures are: The Mode The Median.
PCB 3043L - General Ecology Data Analysis. OUTLINE Organizing an ecological study Basic sampling terminology Statistical analysis of data –Why use statistics?
Statistics - methodology for collecting, analyzing, interpreting and drawing conclusions from collected data Anastasia Kadina GM presentation 6/15/2015.
Statistics in Biology. Histogram Shows continuous data – Data within a particular range.
Section 10.1 Confidence Intervals
Statistical Analysis IB Topic 1. Why study statistics?  Scientists use the scientific method when designing experiments  Observations and experiments.
1.1 Statistical Analysis. Learning Goals: Basic Statistics Data is best demonstrated visually in a graph form with clearly labeled axes and a concise.
Statistical analysis. Types of Analysis Mean Range Standard Deviation Error Bars.
More statistics notes!. Syllabus notes! (The number corresponds to the actual IB numbered syllabus.) Put the number down from the syllabus and then paraphrase.
Statistical Analysis Topic – Math skills requirements.
Statistics in IB Biology Error bars, standard deviation, t-test and more.
Statistical Analysis. Null hypothesis: observed differences are due to chance (no causal relationship) Ex. If light intensity increases, then the rate.
PCB 3043L - General Ecology Data Analysis. PCB 3043L - General Ecology Data Analysis.
Data Analysis.
PCB 3043L - General Ecology Data Analysis.
Statistical analysis Why?? (besides making your life difficult …)  Scientists must collect data AND analyze it  Does your data support your hypothesis?
Are our results reliable enough to support a conclusion?
DO NOW1/29/14 Use the gas price data to answer the following questions. {$4.79, $4.60, $4.75, $4.66, $4.60} 1.Find the mean of the data (hint: average)
Measurements and Their Analysis. Introduction Note that in this chapter, we are talking about multiple measurements of the same quantity Numerical analysis.
The T-Test Are our results reliable enough to support a conclusion?
Data Analysis. Qualitative vs. Quantitative Data collection methods can be roughly divided into two groups. It is essential to understand the difference.
STATISICAL ANALYSIS HLIB BIOLOGY TOPIC 1:. Why statistics? __________________ “Statistics refers to methods and rules for organizing and interpreting.
Statistical Analysis IB Topic 1. IB assessment statements:  By the end of this topic, I can …: 1. State that error bars are a graphical representation.
PCB 3043L - General Ecology Data Analysis Organizing an ecological study What is the aim of the study? What is the main question being asked? What are.
Data analysis and basic statistics KSU Fellowship in Clinical Pathology Clinical Biochemistry Unit
Cell Diameters and Normal Distribution. Frequency Distributions a frequency distribution is an arrangement of the values that one or more variables take.
Using the T-test Topic 1.
Statistical analysis.
AP Biology Intro to Statistics
Statistical analysis.
PCB 3043L - General Ecology Data Analysis.
Success Criteria: I will be able to analyze data about my classmates.
STATISTICS For Research
AP Biology Intro to Statistics
Module 8 Statistical Reasoning in Everyday Life
TOPIC 1: STATISTICAL ANALYSIS
Are our results reliable enough to support a conclusion?
Statistical Inference about Regression
Statistical Analysis IB Topic 1.
Are our results reliable enough to support a conclusion?
Are our results reliable enough to support a conclusion?
Are our results reliable enough to support a conclusion?
Statistical analysis.
1.1 Statistical Analysis.
Are our results reliable enough to support a conclusion?
Topic 1 Statistical Analysis.
Are our results reliable enough to support a conclusion?
Presentation transcript:

Topic 1: Statistical Analysis Assessment Statements 1.1.1 State that error bars are a graphical representation of the variability of data 1.1.2 Calculate the mean and standard deviation of a set of values. 1.1.3 State that the term standard deviation is used to summarize the spread of values around the mean, and that 68% of the values fall within one standard deviation of the mean. 1.1.4 Explain how the standard deviation is useful for comparing the means and the spread of data between 2 or more samples 1.1.5 Deduce the significance of the difference between two sets of data using calculated values for t and the appropriate tables. 1.1.6 Explain that the existence of a correlation does not establish that there is a causal relationship between two variables.

Why use statistics? In science, we are always making observations about the world around us. Many times these observations result in the collection of measurable, quantitative data. Using this data, we can ask questions based upon some of these observations and try to answer those questions.

Essential Statistics Terms Mean: the average of the data points Range: the difference between the largest and smallest observed values in a data set (the measure of the spread of the data) Standard Deviation: a measure of how the individual observations of a data set are dispersed or spread out around the mean Error bars: Graphical representation of the variability of data

Standard Deviation Standard deviation is used to summarize the spread of values around the mean and to compare the means of data between two or more samples. In a normal distribution, about 68% of all values lie within ±1 standard deviation of the mean. 95% will lie within ±2 standard deviations of the mean. This normal distribution will form a bell curve The shape of the curve tells us how close all of the data points are to the mean.

Calculating Standard Deviation We can use the formula to solve for standard deviation… Σ(xi - (Σx)2 Sx = n n - 1 Example data: Ward pg.5 For calculator help, go to http://www.heinemann.co.uk/hotlinks ,enter code 4242p and click weblink 1.4a or 1.4b HOWEVER the easiest way to calculate…find an app online!!!!

Are our results reliable enough to support a conclusion?

Imagine we chose two children at random from two class rooms… … and compare their height …

… we find that one pupil is taller than the other C1 … we find that one pupil is taller than the other WHY?

REASON 1: There is a significant difference between the two groups, so pupils in C1 are taller than pupils in D8 D8 C1 YEAR 7 YEAR 11

REASON 2: By chance, we picked a short pupil from D8 and a tall one from C1 D8 C1 Sammy (Year 9) HAGRID (Year 9)

How do we decide which reason is most likely? MEASURE MORE STUDENTS!!!

… the average or mean height of the two groups should be very… If there is a significant difference between the two groups… D8 C1 … the average or mean height of the two groups should be very… … DIFFERENT

… the average or mean height of the two groups should be very… If there is no significant difference between the two groups… D8 C1 … the average or mean height of the two groups should be very… … SIMILAR

Living things normally show a lot of variation, so… Remember:

It is VERY unlikely that the mean height of our two samples will be exactly the same C1 Sample Average height = 162 cm D8 Sample Average height = 168 cm Is the difference in average height of the samples large enough to be significant?

Here, the ranges of the two samples have a small overlap, so… We can analyse the spread of the heights of the students in the samples by drawing histograms 2 4 6 8 10 12 14 16 Frequency 140-149 150-159 160-169 170-179 180-189 Height (cm) C1 Sample D8 Sample Here, the ranges of the two samples have a small overlap, so… … the difference between the means of the two samples IS probably significant.

Here, the ranges of the two samples have a large overlap, so… 2 4 6 8 10 12 14 16 Frequency 140-149 150-159 160-169 170-179 180-189 Height (cm) C1 Sample Here, the ranges of the two samples have a large overlap, so… … the difference between the two samples may NOT be significant. 2 4 6 8 10 12 14 16 Frequency 140-149 150-159 160-169 170-179 180-189 Height (cm) D8 Sample The difference in means is possibly due to random sampling error

… and the spread of heights in each sample. To decide if there is a significant difference between two samples we must compare the mean height for each sample… … and the spread of heights in each sample. Statisticians calculate the standard deviation of a sample as a measure of the spread of a sample Sx = Σ(xi - (Σxi)2 n n - 1 Where: Sx is the standard deviation of sample Σ stands for ‘sum of’ xi stands for the individual measurements in the sample n is the number of individuals in the sample You can calculate standard deviation using the formula:

It is much easier to use the statistics functions on a scientific calculator! e.g. for data 25, 34, 13 Set calculator on statistics mode MODE 2 (CASIO fx-85MS) Clear statistics memory SHIFT CLR 1 (Scl) = Enter data 2 5 DT (M+ Button) 3 4 1

Calculate the standard deviation Calculate the mean 24 AC SHIFT S-VAR (2 Button) 1 ( x ) = Calculate the standard deviation 10.5357 AC SHIFT S-VAR 3 = (xσn-1)

We start by calculating a number, t Student’s t-test The Student’s t-test compares the averages and standard deviations of two samples to see if there is a significant difference between them. We start by calculating a number, t t can be calculated using the equation: ( x1 – x2 ) (s1)2 n1 (s2)2 n2 + t = Where: x1 is the mean of sample 1 s1 is the standard deviation of sample 1 n1 is the number of individuals in sample 1 x2 is the mean of sample 2 s2 is the standard deviation of sample 2 n2 is the number of individuals in sample 2

Worked Example: Random samples were taken of pupils in C1 and D8 Their recorded heights are shown below… Students in C1 Students in D8 Student Height (cm) 145 149 152 153 154 148 157 161 162 158 160 166 163 167 172 175 177 182 183 185 187 Step 1: Work out the mean height for each sample C1: x1 = 161.60 D8: x2 = 168.27 Step 2: Work out the difference in means x2 – x1 = 168.27 – 161.60 = 6.67

Step 3: Work out the standard deviation for each sample C1: s1 = 10.86 D8: s2 = 11.74 Step 4: Calculate s2/n for each sample (s1)2 n1 = C1: 10.862 ÷ 15 = 7.86 (s2)2 n2 = D8: 11.742 ÷ 15 = 9.19

Step 5: Calculate (s1)2 n1 + (s2)2 n2 (s1)2 n1 + (s2)2 n2 = (7.86 + 9.19) = 4.13 Step 6: Calculate t (Step 2 divided by Step 5) t = (s1)2 n1 + (s2)2 n2 = x2 – x1 6.67 4.13 = 1.62

Step 7: Work out the number of degrees of freedom d.f. = n1 + n2 – 2 = 15 + 15 – 2 = 28 Step 8: Find the critical value of t for the relevant number of degrees of freedom Use the 95% (p=0.05) confidence limit Critical value = 2.048 Our calculated value of t is below the critical value for 28d.f., therefore, there is no significant difference between the height of students in samples from C1 and D8

Follow the instructions CAREFULLY Do not worry if you do not understand how or why the test works Follow the instructions CAREFULLY You will NOT need to remember how to do this for your exam