Statistics for science fairs


Statistics for science fairs
How to make sure your students have the best chance of success at the SSEF of Florida

My personal involvement
- Judging Captain of Senior Animal Sciences
- How we judge:
  - First, divide projects into orbit and focus groups
  - Use scores to group projects of similar level within the focus group
  - Discuss the top-scoring projects to determine the overall category winner
- Last year, there was much debate over one specific project
  - Its scores ranked from 1st to 10th, depending on the judge
  - Upon further discussion, the deciding factor came down to the use of statistics (incorrect statistics for the experimental design)
  - It likely would have placed higher, possibly 1st, with correct statistics

Judge’s perspective
- Are there statistics?
- Are they the right statistics?
- Does the student understand the statistics?
  - Students don’t need to know the underlying calculations, just how the tests work and what the results mean

Common Pitfalls
- Experimental designs with small sample sizes
- Using averages to draw conclusions without looking at the variability in the data
- Correlation vs. causation
- Extrapolation of data
- Off-the-shelf programs (e.g., Excel) make statistics easy to use, and therefore easy to use incorrectly
- No statistics at all

Role of teachers/advisors
- Students are often unfamiliar with statistics; the project may be their initial exposure
- Hold a preliminary discussion before the project starts
  - Good statistical analysis is planned during experimental design
  - Make sure the sample size is adequate
  - Understand which tests will help determine significance/validity
- Reassess the statistics when data collection first begins
  - Modifications to the design may be necessary
- Find a mentor to assist with statistics, if needed

Experimental design
Keys to designing a project with statistics in mind:
- Accurately and clearly define variables and sample space
- Accurately define factors and levels of factors
- Identify the type of experiment, making sure to use appropriate controls
- Perform enough replications
- Understand the likely distribution of the data
- Be aware of the types of exploratory and inferential analyses used in your field of science. Look to journal articles.

Variables
- Quantitative variables: differ in magnitude, can be measured
- Qualitative variables: categorical, observations differ in kind (nominal)
- Ranked qualitative variables: ordinal variables (ranking allows for mathematical analysis)
- Pre-planning allows you to choose what kind of data you will collect and what statistics you can use.

Reducing noise in results
- Experimental observations are a combination of:
  - Signal: true effects of the variable/outcome
  - Noise: random error introduced by the experimental design
- Increase the signal-to-noise ratio (decrease noise) by:
  - Making repeated measurements of one item
  - Increasing sample size
  - Randomizing samples
  - Randomizing experiments
  - Repeating experiments
  - Including covariates (other variables that might impact results)
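
Randomizing samples can be as simple as shuffling the subjects before splitting them into groups. The sketch below is one possible way to do it; the function name `assign_groups`, the plant labels, and the group names are hypothetical and chosen only for illustration.

```python
import random

def assign_groups(subjects, group_names, seed=None):
    """Shuffle the subjects, then deal them out to the groups in turn."""
    rng = random.Random(seed)      # a fixed seed makes the assignment reproducible
    shuffled = list(subjects)      # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    groups = {name: [] for name in group_names}
    for i, subject in enumerate(shuffled):
        groups[group_names[i % len(group_names)]].append(subject)
    return groups

# Hypothetical example: 30 plants split across three watering groups.
plants = [f"plant_{i:02d}" for i in range(1, 31)]
assignment = assign_groups(plants, ["control", "treatment_a", "treatment_b"], seed=42)
for name, members in assignment.items():
    print(name, len(members), members[:3])
```

Recording the seed in the project notebook lets a judge or mentor reproduce the exact assignment.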

Exploratory Data Analysis
- Categorical variables: bar graphs, pie charts, two-way tables
- Quantitative variables: stem plots, histograms, relative cumulative frequency plots, time plots, scatterplots
- Calculate and compare mean, median, and standard deviation (report a measure of center and a measure of spread)
- What is the distribution? Are there outliers?
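
A minimal sketch of the center, spread, and outlier checks, using only Python's standard library. The measurements are hypothetical, and the 1.5 × IQR rule for flagging outliers is an assumption added here (a common convention, not something the slides specify).

```python
import statistics

# Hypothetical plant heights; the last value is deliberately unusual.
heights = [5.5, 6.0, 6.0, 6.5, 6.5, 6.5, 7.0, 7.0, 7.5, 12.0]

mean = statistics.mean(heights)        # measure of center
median = statistics.median(heights)    # center that resists outliers
spread = statistics.stdev(heights)     # sample standard deviation (n - 1)

# Quartiles, then the 1.5 * IQR fences (assumed outlier convention).
q1, _, q3 = statistics.quantiles(heights, n=4)
iqr = q3 - q1
low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [h for h in heights if h < low_fence or h > high_fence]

print(f"mean={mean:.2f} median={median:.2f} sd={spread:.2f}")
print("possible outliers:", outliers)
```

A mean well above the median, as here, is a hint that a few large values are pulling the average up and that the raw data deserve a closer look.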

Statistical inference
- P-value: the probability of obtaining a result at least as extreme as the one observed, assuming the null hypothesis is true
  - It is not the probability that the null hypothesis is correct
- Relationship between 2 quantitative variables: scatter plot and regression
  1. Plot the independent variable on the x-axis; discuss the pattern
  2. If linear, calculate the correlation coefficient to measure the strength and direction of the relationship
  3. Use least squares regression to determine a model of the relationship between the two variables
  4. Conduct a t-test
  5. If not linear, more advanced methods are needed

Statistical inference
- Compare data from two different groups: box plots and t-tests
  1. Plot the data using side-by-side box plots
  2. Determine the alternative hypothesis
  3. Define the level of significance; use a two-sample t-test
  4. Evaluate the practical significance of the difference between groups
  5. Report the results as inferential analysis
  6. When testing against a set value (not between groups), use a one-sample t-test
  7. If comparing across more than 2 groups, consider ANOVA

Statistical inference
- Inference for categorical variables: testing results against an expected distribution
  1. Start with a two-way table. Calculate marginal distributions and the differences between the marginal distributions of the experimental and control groups
  2. Use a bar graph or pie chart to show differences in distributions among groups
  3. Use the chi-square test for goodness of fit. Determine the test statistic and p-value
  4. If the chi-square test finds significant results, examine them to find the largest components

Presenting results
- State the statistical hypothesis along with your scientific hypothesis
- Use a flowchart to show the experimental design
- Show how replication, control and randomization are used
- Show both exploratory data analysis and inferential analysis
- Discuss the meaning of graphs and measures
- State the level of significance used in tests and the resulting p-values
- State the conclusions of the tests
- State the statistical and practical significance of the results
- Tell whether the null hypothesis was rejected or not rejected
- Statistics are meaningless if they are misrepresented or misunderstood!

Confidence intervals
- Standard deviation: a measure of the spread of the data
  - How far away from the mean are the data points?
  - Variance: $s^2 = \frac{\sum (\bar{x} - x)^2}{n - 1}$
  - Standard deviation: $s = \sqrt{\frac{\sum (\bar{x} - x)^2}{n - 1}}$
- How do we express it?
  - For approximately normal data, about 68% of the data falls within the average ± 1 standard deviation
  - A significance level of 0.05 corresponds to a 95% confidence interval; about 95% of approximately normal data falls within the average ± 2 standard deviations
- 95% CI: if this experiment were repeated on multiple samples, the calculated confidence interval would encompass the true value of the population parameter 95% of the time.
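
As a rough sketch of the calculations on this slide, the snippet below computes the sample variance, the standard deviation, and a 95% confidence interval for the mean. The data are hypothetical. Note two assumptions that go beyond the slide: the interval for the mean uses the standard error (standard deviation divided by the square root of n), and the multiplier 2.262 is the two-sided 95% critical value of the t distribution for 9 degrees of freedom, which only applies to samples of 10.

```python
import math
import statistics

# Hypothetical heights for one group of 10 plants.
heights = [5.4, 6.1, 5.9, 6.3, 5.8, 6.0, 6.2, 5.7, 6.4, 6.2]

n = len(heights)
mean = statistics.mean(heights)
variance = statistics.variance(heights)   # sum of squared deviations / (n - 1)
std_dev = statistics.stdev(heights)       # square root of the variance

# Standard error of the mean: how much the average itself is expected to vary.
std_err = std_dev / math.sqrt(n)

# Assumed t critical value for 95% confidence with n - 1 = 9 degrees of freedom.
T_CRIT_DF9 = 2.262
ci_low = mean - T_CRIT_DF9 * std_err
ci_high = mean + T_CRIT_DF9 * std_err

print(f"mean={mean:.2f} variance={variance:.4f} sd={std_dev:.4f}")
print(f"95% CI for the mean: {ci_low:.2f} to {ci_high:.2f}")
```

For a different sample size, the critical value has to be looked up again in a t table or taken from statistics software.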

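Regression analysis
- Estimates the relationship between independent and dependent variables
- Calculates the line of best fit through the data points
- Correlation measures how closely the line fits the data
  - Correlation coefficient (r): the closer to 1 (positive relationship) or -1 (negative relationship), the better the fit
  - R2, the square of r, ranges from 0 to 1 and gives the share of the variation in the dependent variable explained by the line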
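
A minimal sketch of least squares regression and the correlation coefficient, written out by hand so the arithmetic is visible. The function name `linear_fit` and the water/height values are hypothetical.

```python
import math

def linear_fit(xs, ys):
    """Return (slope, intercept, r) for the least squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)     # correlation coefficient, between -1 and 1
    return slope, intercept, r

# Hypothetical data: water given (mL) and plant height.
water_ml = [15, 30, 45, 60, 75]
height = [5.8, 6.4, 6.9, 7.1, 7.8]

slope, intercept, r = linear_fit(water_ml, height)
print(f"height = {slope:.4f} * water + {intercept:.4f}")
print(f"r = {r:.3f}, R^2 = {r * r:.3f}")
```

Always plot the points first: a high r on data that curve or cluster can still describe a poor model.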

Is there a statistical difference?
- Plant height experiment:
  - Control = 15 mL water
  - Treatment A = 30 mL water
  - Treatment B = 45 mL water
- Does the amount of water affect the height of the plant?
- On the surface, we might say yes! The averages are higher for Treatments A and B. But is this really the case?

[Table: plant height for plants N = 1 to 10 in each group; most individual values were not preserved]

Group        Avg.
Control      6.05
Treatment A  6.75
Treatment B  6.7

T-test
- Determines whether the difference between the averages of two groups is likely caused by the experimental treatment or can be explained by random variation
- Requirements:
  - Two comparison groups at a time (control and one treatment, or two treatments); with three or more groups, compare them pairwise or consider ANOVA
  - A sample size of 10 or more for each experimental group
  - Numerically measured data (no categories, even if labeled with numbers)
- Easy to calculate:
  - T-test calculator: http://graphpad.com/quickcalcs/ttest1.cfm
  - Excel
  - Other statistics software
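
To show what the calculators are doing, here is a sketch of a two-sample t-test with pooled variance (the classic Student's version, which assumes the two groups have similar spread). The function name `pooled_t_statistic` and the data are hypothetical and are not the values from the slides. The critical value 2.101 is the two-sided 0.05 cutoff for 18 degrees of freedom, so it applies only to two groups of 10.

```python
import math
import statistics

def pooled_t_statistic(group_a, group_b):
    """Return (t, degrees_of_freedom) for a pooled-variance two-sample t-test."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_b) - statistics.mean(group_a)) / std_err
    return t, df

# Hypothetical plant heights.
control = [5.4, 6.1, 5.9, 6.3, 5.8, 6.0, 6.2, 5.7, 6.4, 6.2]
treatment = [6.5, 7.0, 6.8, 6.4, 7.2, 6.9, 6.6, 7.1, 6.7, 6.8]

t, df = pooled_t_statistic(control, treatment)

# Assumed two-sided critical value for alpha = 0.05 and df = 18.
T_CRIT_DF18 = 2.101
significant = abs(t) > T_CRIT_DF18
print(f"t = {t:.3f}, df = {df}, significant at 0.05: {significant}")
```

Statistics software reports an exact p-value instead of a yes/no comparison against a table value; the conclusion at the 0.05 level is the same.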

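Control against Treatment A
- A significance threshold of 0.05 is generally accepted; a p-value > 0.05 means no statistically significant difference between the means
- P-value of 0.0033: if watering made no difference, a difference this large would arise by chance only about 0.33% of the time
- The data support the conclusion that watering plants with 30 mL instead of 15 mL increases height under these same conditions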

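Control against Treatment B
- P-value of 0.0117: if watering made no difference, a difference this large would arise by chance only about 1.17% of the time
- The data support the conclusion that watering plants with 45 mL instead of 15 mL increases height under these same conditions
- But what about between 30 mL and 45 mL?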

Treatment A against Treatment B
- P-value of 0.8513: if the two treatments had the same effect, a difference between their averages at least this large would arise by chance about 85.13% of the time
- No significant difference between watering with 30 mL or 45 mL

Chi-Square test
- Compares observed data with the data expected according to a hypothesis
- $\chi^2 = \sum \frac{(O - E)^2}{E}$
- Convert $\chi^2$ to a probability using the degrees of freedom (n - 1, where n is the number of groups)
- P-value = probability of a difference between observed and expected at least this large if the null hypothesis (observed and expected are not different) is true
- Simple example: flip a coin 100 times
  - Expected: 50 heads, 50 tails. Observed: 41 heads, 59 tails.
  - $\chi^2 = \frac{(41 - 50)^2}{50} + \frac{(59 - 50)^2}{50} = 3.24$
  - 2 groups = 1 degree of freedom
  - P-value of about 0.07: a fair coin would give a result at least this far from expected about 7% of the time
  - 0.07 > 0.05, so we do not reject the null hypothesis
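
The coin-flip example (41 heads and 59 tails against an expected 50 and 50) can be checked with a few lines of Python. This is a sketch with hypothetical function names; the p-value shortcut using `math.erfc` is valid only for 1 degree of freedom, so tables or statistics software are needed when there are more than two groups.

```python
import math

def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all groups."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def p_value_one_df(chi2):
    """Upper-tail probability of the chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))

observed = [41, 59]   # heads, tails
expected = [50, 50]   # what a fair coin predicts for 100 flips

chi2 = chi_square(observed, expected)
p_value = p_value_one_df(chi2)

print(f"chi-square = {chi2:.2f}, p-value = {p_value:.3f}")
print("reject null at 0.05:", p_value < 0.05)
```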

ANOVA
- Analysis of Variance test
- Determines whether there is a difference between the means of 3 or more independent, unrelated groups
- Tests the null hypothesis that no difference between the groups exists
- Easy to perform with software
- If you determine there is a difference between the groups, additional (post hoc) testing is needed to find which groups differ:
  - Tukey HSD
  - Scheffe post hoc test
  - Games Howell
  - Dunnett’s C post hoc test
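
Software hides the arithmetic, so here is a sketch of the one-way ANOVA F statistic computed by hand. The function name `one_way_anova` and the three groups of heights are hypothetical. The cutoff 3.35 is the approximate 0.05 critical value of the F distribution for 2 and 27 degrees of freedom, which fits only three groups of 10.

```python
import statistics

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the individual values around their own group mean.
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical plant heights for three watering levels.
control = [5.4, 6.1, 5.9, 6.3, 5.8, 6.0, 6.2, 5.7, 6.4, 6.2]
treatment_a = [6.5, 7.0, 6.8, 6.4, 7.2, 6.9, 6.6, 7.1, 6.7, 6.8]
treatment_b = [6.6, 6.9, 6.7, 7.0, 6.5, 6.8, 7.1, 6.4, 6.9, 6.7]

f_stat, df_b, df_w = one_way_anova([control, treatment_a, treatment_b])

# Assumed approximate critical value for alpha = 0.05 with (2, 27) degrees of freedom.
F_CRIT_2_27 = 3.35
print(f"F = {f_stat:.2f}, df = ({df_b}, {df_w}), significant: {f_stat > F_CRIT_2_27}")
```

A significant F only says that at least one group differs; a post hoc test is still needed to say which.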

Final thoughts
- Proper use and understanding of statistics are often the deciding factor between the top projects
- Give your students the upper hand by discussing statistics at the start of the project
- Review data and industry standards to determine the correct statistical analysis based on the experimental design
- Properly quantify statistical results in the context of the project
- If possible, have a knowledgeable mentor review the statistical procedures and results with students before the science fair

References
- Online statistics calculator: http://graphpad.com/quickcalcs/
- Judge’s perspective: http://www.nsta.org/publications/news/story.aspx?id=53713
- Statistics for science projects: https://slvsef.org/documents/teachers/SLVSEF_statistics_for_science_fair_students.pdf
- Data analysis for science projects: http://www.sciencebuddies.org/science-fair-projects/top_research-project_data-analysis.shtml
- http://static.nsta.org/files/PB343Xweb.pdf
- Biological Statistics: http://www.biostathandbook.com/
- Engineering Statistics: http://www.itl.nist.gov/div898/handbook/

Contact: Kim Unger, kunger@databrains.com