Welcome back! May 12th, 2014… TASIS Exam hall 3vI 3vI.

Presentation transcript:

Welcome back!

May 12th, 2014… TASIS Exam hall 3vI 3vI

Official IB Schedule. IA SUBMISSION: MARCH 28, 2014. IB BIOLOGY FINAL EXAM: MAY 12 (MONDAY), 33 weeks from beginning of term.

TOPICS FOR FALL TERM. STANDARD AND HIGHER LEVEL: Statistics (2h); Genetics (15h); Respiration (2h); Photosynthesis (3h); Further Ecology (6h); (SL only – Topic A: Human Nutrition and Health) (2h). HIGHER LEVEL ONLY: Further Genetics (6h); Further Respiration (7h); Further Photosynthesis (5h); Further Ecology (5h).

TOPICS FOR SPRING TERM. STANDARD AND HIGHER LEVEL: (SL only – Topic A: Human Nutrition and Health) (2h); Human physiology (9h). HIGHER LEVEL ONLY: Plant science (11h); HL Human physiology (17h); Topic H: Further human physiology (15h).

Remaining IA assignments: Genetics IA (DCP and CE) [due October 21]; Photosynthesis IA (Design) [due December 18]; Plant Science IA (Design, DCP and CE); Human Health and Nutrition (Design, DCP and IA). IA SUBMISSION: MARCH 28, 2014.

Big questions in Science… What do I need to know about statistics to succeed in IB Biology?

Statistics: How can we know that scientific information is reliable and valid? Why does Biology need statistical methods? Ben Goldacre...

Can statistics help us? Chocolate gives you spots. Late nights sap young people’s brain power. Coffee can make you see dead people. Mobile phones cause cancer!

Statisticians… ‘...people who like figures, but don’t have the personality skills to become accountants…’ Do uncertainty, randomness and chance have a place in science? How should we react to them?

What do we do with biological data? 1. EYEBALL the data: measure the ‘central value’ (mean, median, mode) and the ‘spread’ or variability (range, standard deviation, interquartile range). 2. Compare data sets (STATISTICAL TESTS). 3. Look for relationships (often called correlations) between data sets.

What do we need to know about statistics? ‘Average’: mean, median, mode. ‘Error bars’: variance, standard deviation, standard error of the mean, (interquartile range). Significance and probability. T-tests (1- and 2-tailed, paired and independent). Chi-squared test (genetics IA). The relationship of causation and correlation. Classic graphs.

How do we make sense of data? Descriptive statistics: look for patterns and outliers in different groups, using graphs, tables, means and variance. You can’t use the results to generalise about the population beyond the data. Inferential statistics: apply tests to see if the differences we see are of predictive value (reliable). T-tests, chi-squared tests, ANOVA and regression analysis allow us to make inferences (generalisations) about the population beyond our data.

Inferential statistics use probability (p) values. The p value helps us judge whether the difference we observed is real and repeatable. Specifically, the p value is the probability that a difference at least as large as the one observed would be produced by random variation (chance) alone. If p = 0.10, there is a 10% chance. If p = 0.05, there is a 5% chance. If p = 0.01, there is a 1% chance. By convention, scientists accept p < 0.05 as ‘significantly different’.
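To make the idea of "produced by chance" concrete, here is a minimal sketch of an exact permutation test. This is not one of the tests named in the slides (it is swapped in because it shows the logic of a p value directly), and the function name and data are hypothetical. It counts how many ways of randomly relabelling the pooled data give a difference in means at least as large as the one observed.

```python
from itertools import combinations


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)

    extreme = 0
    splits = 0
    # Try every way of relabelling the pooled values into two groups.
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        difference = abs(sum_a / n_a - (total - sum_a) / n_b)
        if difference >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
        splits += 1
    return extreme / splits


# Hypothetical measurements, chosen only for illustration.
treated = [1, 2, 3, 4]
control = [5, 6, 7, 8]
p_value = permutation_p_value(treated, control)
print(f"p = {p_value:.3f}")
```

With these values only 2 of the 70 possible relabellings are as extreme as the real grouping, so p is about 0.029, which is below the 0.05 threshold.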

Sample size matters: bigger samples make it easier to detect differences. A good guideline is to aim for 20 – 30 data points in each test group.

Looking at data

Biological data are often normally distributed: height, blood pressure, heart rate, marks on an exam, errors in machine-made products.

If NOT normally distributed, data can be skewed (or just jumbled!)

An example: Researchers have developed a new drug (tetesterol) to lower serum cholesterol levels. They treat 2 groups for a month with either tetesterol or placebo. After that month, the researchers measure cholesterol in both groups.

MEAN Cholesterol concentration after 1 month… (i.e., does the drug really make a difference?)

First, ‘eyeball’ the data: ‘Descriptive statistics’

Measure the central tendency (mean, median, mode)
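As a small illustration, the three measures of central tendency can be computed with Python's standard `statistics` module. The readings below are hypothetical values invented for this sketch, not data from the drug trial.

```python
import statistics

# Hypothetical cholesterol readings (mmol/L); illustrative values only.
readings = [4.8, 5.1, 5.1, 5.4, 6.0, 5.1, 4.9]

mean_value = statistics.mean(readings)      # arithmetic average
median_value = statistics.median(readings)  # middle value when sorted
mode_value = statistics.mode(readings)      # most frequent value

print(f"mean = {mean_value:.2f}, median = {median_value}, mode = {mode_value}")
```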

Why not just look at the means (central tendency)? The means (/medians/modes) may show you a difference, but we can’t be sure that it’s a reliable difference. Which of these data sets shows the greatest variation?

Is this difference reliable? (i.e., does the drug really make a difference?) Cholesterol concentration after 1 month

In order to compare test samples, we also need to look at the spread of results

Measurement of ‘spread’ (variability): range, variance, standard deviation, (standard error), (interquartile range)

Range – and its limitations

Standard deviation (σ): a measure of spread. It is, simply, the square root of the variance. It gives us an idea of the spread of most of the data and is much more reliable than range (less affected by anomalous data). You just need to press a button. You don’t need to know the formula. (There are links on the Blog if you WANT to know the formula…)

Variance. Officially: the average of the squared differences from the mean in a sample. You calculate it using a calculator or EXCEL.
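For anyone who does want to see what the button does, here is a minimal sketch with made-up numbers. One assumption worth flagging: the definition above divides by n (the "population" form), whereas many calculators and spreadsheet functions divide by n – 1 for a sample, so both are shown.

```python
import math
import statistics

# Hypothetical data set, chosen so the arithmetic is easy to follow.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(data) / len(data)
squared_diffs = [(x - mean) ** 2 for x in data]

# Average of the squared differences (divide by n).
population_variance = sum(squared_diffs) / len(data)
population_sd = math.sqrt(population_variance)

# Sample version (divide by n - 1), as many calculators report it.
sample_variance = sum(squared_diffs) / (len(data) - 1)
sample_sd = math.sqrt(sample_variance)

print(population_variance, population_sd)
print(round(sample_variance, 3), round(sample_sd, 3))
```

The standard deviation is the square root of whichever variance you use, so it is worth checking which version your calculator or spreadsheet reports.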

Standard deviation: the following rule applies only to normal distributions. About 68% of values are within 1 standard deviation of the mean. About 95% of values are within 2 SDs of the mean.
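These percentages can be checked against the normal curve itself. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `fraction_within` is made up for this example.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def fraction_within(k):
    """Fraction of a normal distribution lying within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


within_1_sd = fraction_within(1)
within_2_sd = fraction_within(2)

print(f"within 1 SD: {within_1_sd:.1%}")
print(f"within 2 SD: {within_2_sd:.1%}")
```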

Error bars

Error bars on graphs: they are graphical representations of the spread (variability) of the data. They may represent: range, standard deviation, standard error, confidence intervals, or interquartile range.
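As one example of how the ends of an error bar might be calculated, the sketch below uses the standard error of the mean (sample standard deviation divided by the square root of n). The data and the function name are hypothetical.

```python
import math
import statistics


def standard_error(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


# Hypothetical measurements for one test group.
group = [10, 12, 14, 16, 18]

group_mean = statistics.mean(group)
sem = standard_error(group)

# An error bar of +/- 1 standard error around the mean.
lower, upper = group_mean - sem, group_mean + sem
print(f"mean = {group_mean}, error bar from {lower:.2f} to {upper:.2f}")
```

Because the same bar could show range, standard deviation or standard error, a graph should always state which one is plotted.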

There are various types of error bar

Question check: Which data set has the highest mean? Which data set has the highest variability? What do the error bars represent?

Question check:

Statistical tests for comparing two normally distributed data sets: the T-test

Comparing data

Drug trial data

Large overlap: lots of shared data… Results are not likely to be significantly different (more likely due to chance). Small or no overlap: very little shared data… Results are likely to be significantly different (‘real’).

Inferential Statistics. Comparing two data sets: the T-test… Used to compare two normally distributed data sets (ideally with similar variances). A t-test is a statistical test that checks if the means of 2 groups are reliably different. Just looking at the means may show you that they are different, but doesn’t show if the difference is reliable. We always test the NULL hypothesis (H0). T-test…the movie…

Two main types of T-test. Independent (unpaired) samples (most common): e.g. testing the quality of two types of fruit smoothie… Dependent (paired) samples: one group measured at 2 different times, e.g. heart rate before and after exercise.
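For the paired case, a minimal sketch of the calculation is shown below. The heart rates are invented for illustration, and the critical value 2.776 is the standard two-tailed table value for p = 0.05 with 4 degrees of freedom. The test works on the differences within each pair rather than on the two groups separately.

```python
import math
import statistics

# Hypothetical heart rates (bpm) for the same five people.
before = [60, 62, 65, 70, 68]
after = [72, 75, 74, 85, 80]

# A paired test analyses the change within each person.
differences = [a - b for a, b in zip(after, before)]
n = len(differences)
df = n - 1

mean_difference = statistics.mean(differences)
standard_error = statistics.stdev(differences) / math.sqrt(n)
t_value = mean_difference / standard_error

CRITICAL_T = 2.776  # two-tailed table value, p = 0.05, df = 4
print(f"t({df}) = {t_value:.2f}, significant: {abs(t_value) > CRITICAL_T}")
```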

So what is the T-value? It’s just a number!

Calculating the T test

Drawing conclusions
1. State the null hypothesis and the alternative hypothesis. Null hypothesis: no significant difference between the two groups. Alternative hypothesis: there IS a significant difference between the two groups.
2. Set the significance level at p < 0.05.
3. Calculate the degrees of freedom. For unmatched (independent) observations, df = (n1 + n2) – 2.
4. Identify the critical t value from your table.
5. If the calculated t value is greater than the critical t value (or if p < 0.05), then the null hypothesis is REJECTED (i.e. the data sets are significantly different).
6. Write a statistical summary statement based on the decision.
7. Write a statement in CLEAR ENGLISH based on the statement.
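The steps above can be sketched for an independent-samples t-test as follows. This is an illustration under stated assumptions: the cholesterol values are made up, the two groups are assumed to have similar variances (so a pooled variance is used), and 2.306 is the standard two-tailed table value for p = 0.05 with 8 degrees of freedom.

```python
import math
import statistics

# Hypothetical cholesterol values (mg/dL) after one month.
placebo = [240, 250, 245, 255, 260]
drug = [220, 225, 230, 235, 240]

# Step 3: degrees of freedom for independent groups.
n1, n2 = len(placebo), len(drug)
df = (n1 + n2) - 2

# Calculate t using a pooled variance (assumes similar variances).
pooled_variance = (
    (n1 - 1) * statistics.variance(placebo)
    + (n2 - 1) * statistics.variance(drug)
) / df
standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
t_value = (statistics.mean(placebo) - statistics.mean(drug)) / standard_error

# Steps 4 and 5: compare with the critical value from the table.
CRITICAL_T = 2.306  # two-tailed, p = 0.05, df = 8
reject_null = abs(t_value) > CRITICAL_T

print(f"t({df}) = {t_value:.2f}; reject null hypothesis: {reject_null}")
```

Here the calculated t exceeds the critical value, so the null hypothesis would be rejected and, in clear English, the two groups' mean cholesterol levels differ significantly.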

Reading, writing and understanding t-tests. (99) = degrees of freedom: how many samples were there in this case? p = probability of results happening by chance: are these results significant? M = mean values.

So what are degrees of freedom? Degrees of freedom reflect sample size. For only one group, df = n – 1, where n = number of samples [dependent/paired samples t-test]. Usually we are looking at 2 groups, so df = (n1 + n2) – 2.

Question check:

Let’s try some…examples from the worksheet 6. In a t-test comparing Group A and Group B, the P value was calculated as What does this P value tell us about these two sets of data? Explain your answer. 8. (b.) A student measures 15 snail shells on the north side of an island and 16 on the south. H 0 = Confidence = DF = Critical value = t is calculated as So we reject/accept H o. Conclusion:

Correlations and coincidences

Statistics Ben Goldacre...

Correlation doesn’t mean causation. Biologists frequently look for correlations (associations) between two variables (e.g. body weight and sugar consumption; drug consumption and death; hours of sleep and exam performance). Data are typically plotted as a scatter plot. Mathematically derived correlations do NOT provide evidence of a cause. Rather, we must develop experiments to identify the mechanism which is the cause of the observed correlation. Observations lacking a controlled experiment can only suggest a correlation.

How do we calculate correlation? We use statistical tests (you don’t need to know their names!): 1. Pearson’s correlation coefficient (r). 2. Spearman’s rank-order correlation coefficient (rs). For both, the value ranges from +1 (completely positive correlation) to –1 (completely negative correlation).
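A minimal sketch of both coefficients in plain Python is given below. The function names and data are hypothetical, and the ranking helper assumes there are no tied values. Spearman's coefficient is computed here as Pearson's r applied to the ranks of the data.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(ss_x * ss_y)


def ranks(values):
    """Rank from 1 (smallest) upwards; assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_rs(xs, ys):
    """Spearman's rank correlation: Pearson's r on the ranks."""
    return pearson_r(ranks(xs), ranks(ys))


# Hypothetical data: y always rises with x, but not in a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

r = pearson_r(x, y)
rs = spearman_rs(x, y)
print(f"Pearson r = {r:.3f}, Spearman rs = {rs:.3f}")
```

In this example Spearman's coefficient is exactly 1 because the ordering is perfectly preserved, while Pearson's r is slightly below 1 because the relationship is curved rather than linear.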

An example….

Calculation of correlation coefficients

Calculation of correlation… Having identified correlation, the cause must be determined. ‘Correlation’ and r values simply give us clues where to look. Some weird correlations…. ‘Ice cream sales and the number of shark attacks’; ‘skirt lengths and stock prices are highly correlated’; ‘the number of dental cavities in elementary school children and vocabulary size’.

Positive correlation: the two variables measured change in the same direction. E.g. as temperature increases, the number of ice creams sold in Sara-Li’s increases.

Lines of best fit: a line of best fit aims to go through the middle of all of the points on a scatter plot; the better the fit, the stronger the correlation. We typically use software tools (EXCEL and Logger Pro) to draw lines and calculate correlation.
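As a rough idea of what such software does behind the scenes, here is a sketch of an ordinary least-squares line of best fit. The temperature and sales figures are invented, and least squares is assumed as the fitting method (the slides do not say which method the tools use).

```python
def best_fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    slope = covariation / ss_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: temperature (deg C) and ice creams sold.
temperature = [15, 20, 25, 30, 35]
ice_creams = [32, 38, 52, 58, 70]

slope, intercept = best_fit_line(temperature, ice_creams)
print(f"ice creams = {slope:.2f} x temperature + {intercept:.2f}")
```

A positive slope corresponds to a positive correlation and a negative slope to a negative one.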

Negative correlation: e.g. as the number of weeks in the charts increases, the number of records sold falls.

No correlation

Question check: