STATISTICS!!! The science of data. What is data? Information, in the form of facts or figures obtained from experiments or surveys, used as a basis for.

1 STATISTICS!!! The science of data

2 What is data? Information, in the form of facts or figures obtained from experiments or surveys, used as a basis for making calculations or drawing conclusions (Encarta dictionary).

3 Statistics in Science: Data can be collected about a population (surveys): for example, how many individuals are going to vote for Trump based on his awesome hair. Data can be collected about a process (experimentation): for example, fMRI tests that determine the brain activity of those willing to vote for Trump.

4 Qualitative Data: Information that relates to characteristics or description (observable qualities). What is the flaw in this type of science? Information is often grouped by descriptive category. Examples: species of plant, type of insect, shades of color, rank of flavor in taste testing. Remember: qualitative data can be “scored” and evaluated numerically.

5 Qualitative data, manipulated numerically: survey results on teens and the need for environmental action.

6 Quantitative data: Quantitative data are measured using a naturally occurring numerical scale. Examples: chemical concentration, temperature, length, weight, etc.

7 Quantitation: Measurements are often displayed graphically.

8 Measurement: In data collection for Biology, data must be measured carefully, using laboratory equipment (e.g. timers, meter sticks, pH meters, balances, pipettes). The limits of the equipment used add some uncertainty to the data collected. All equipment has a certain magnitude of uncertainty. For example, is a mass-produced ruler a good measure of 1 cm? 1 mm? 0.1 mm? For quantitative testing, you must indicate the level of uncertainty of the tool that you are using for measurement!

9 How to determine uncertainty? Usually the instrument manufacturer will indicate this; read what is provided by the manufacturer. The IB acceptable uncertainty is ½ the smallest division on the instrument you are using. You will be marked down if you do not include a valid uncertainty! Be sure that the number of significant digits in the data table or graph reflects the precision of the instrument used (for example, if the manufacturer states that a balance is accurate to 0.1 g and your average mass is 2.06 g, be sure to round the average to 2.1 g). Your data must be consistent with your measurement tool regarding significant figures.

10 Finding the limits: If the room temperature is read as 25 degrees C, with a thermometer that is scored at 1 degree intervals, what is the range of possible temperatures for the room? (±0.5 degrees Celsius: if you read 25 °C, it may in fact be anywhere between 24.5 and 25.5 degrees.)
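The half-division rule can be sketched in a few lines of Python. This is only an illustration; the function name `reading_range` is made up for this example and is not part of the slides.

```python
def reading_range(reading, smallest_division):
    """Return (low, high) for a reading, taking half the smallest
    division of the instrument as the uncertainty."""
    uncertainty = smallest_division / 2
    return reading - uncertainty, reading + uncertainty

# Thermometer scored at 1-degree intervals, reading 25 degrees C
low, high = reading_range(25, 1)
print(f"The room could be anywhere from {low} to {high} degrees C")
```

The same function works for any instrument, for example `reading_range(12.3, 0.1)` for a ruler marked in millimetres and read in centimetres.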

11 Basic Math Review: 3 measures of “Central Tendency”. Mode: the value that appears most frequently. Median: when all data are listed from least to greatest, the value at which half of the observations are greater and half are lesser. Mean: the arithmetic average (sum of data points divided by the number of points); this is the most commonly used measure of central tendency.

12 Measures of Average: Mean: average of the data set. Steps: add all the numbers, then divide by how many numbers you added together. Example: 3, 4, 5, 6, 7. 3 + 4 + 5 + 6 + 7 = 25, and 25 divided by 5 = 5, so the mean is 5. When would you be expected to do this in biology?

13 Measures of Average: Median: the middle number in a range of data points. Steps: arrange the data points in numerical order; the middle number is the median. If there is an even number of data points, average the two middle numbers. Mode: the value that appears most often. Example: 1, 6, 4, 13, 9, 10, 6, 3, 19, which sorted is 1, 3, 4, 6, 6, 9, 10, 13, 19, so Median = 6 and Mode = 6.
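As a quick check of the three measures, here is a short Python sketch using the standard library's `statistics` module on the two example data sets from the slides. The variable names are chosen for this illustration only.

```python
import statistics

# Example from the "mean" slide
mean_example = statistics.mean([3, 4, 5, 6, 7])

# Example from the "median and mode" slide
data = [1, 6, 4, 13, 9, 10, 6, 3, 19]
median_example = statistics.median(data)  # sorts the data internally
mode_example = statistics.mode(data)      # most frequent value

print("mean:", mean_example)
print("median:", median_example)
print("mode:", mode_example)
```

Note that `statistics.median` averages the two middle values automatically when the number of data points is even.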

14 Looking at Data: How accurate is the data? (How close are the data to the “real” results?) A consistent lack of accuracy is also considered BIAS. How precise is the data? (All test systems have some uncertainty, due to limits of measurement.) Estimation of the limits of the experimental uncertainty is essential.

17 Comparing Averages: Once the average is calculated for each of the 2 sets of data, the average values can be plotted together on a graph to visualize the relationship between the 2 data sets.

20 Drawing error bars: The simplest way to draw an error bar is to use the mean as the central point, and to use the distance from the mean to the measurement that is furthest from it as the length of the bar in each direction.

21 [Figure labels: average value; value farthest from average; calculated distance]
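A minimal Python sketch of this simplest error bar, reusing the 3, 4, 5, 6, 7 example from the mean slide. The function name `farthest_value_error_bar` is invented for this illustration.

```python
def farthest_value_error_bar(values):
    """Return (mean, half_length), where half_length is the distance
    from the mean to the measurement farthest from it."""
    mean = sum(values) / len(values)
    half_length = max(abs(v - mean) for v in values)
    return mean, half_length

mean, half_length = farthest_value_error_bar([3, 4, 5, 6, 7])
print(f"Plot the point at {mean} with a bar from "
      f"{mean - half_length} to {mean + half_length}")
```

A single unusual measurement stretches this kind of bar a great deal, which is one reason to prefer a measure based on all of the data.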

22 What do error bars suggest? If the bars show extensive overlap, it is likely that there is not a significant difference between those values

24 Measures of Variability: A better way to do error bars is the Standard Deviation. It shows how much variation there is from the "average" (mean). In a normal distribution, about 68% of values are within one standard deviation of the mean. Data are often reported in terms of +/- standard deviation. If data points are close together, the standard deviation will be small. If data points are spread out, the standard deviation will be larger.

25 Standard Deviation: In a normal distribution, 1 standard deviation from the mean in either direction on the horizontal axis includes ~68% of the data, 2 standard deviations from the mean include ~95% of the data, and 3 standard deviations from the mean include ~99.7% of the data. Bozeman video: Standard Deviation
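These percentages come from the normal distribution itself. As a sketch, the fraction of a normal distribution within k standard deviations of the mean equals erf(k/√2), which Python's `math.erf` can evaluate; the function name `fraction_within` is made up for this example.

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution lying within k standard
    deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {fraction_within(k):.1%}")
```

These figures only apply when the data are approximately normally distributed; a small or skewed sample can differ noticeably.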

26 Calculating Standard Deviation

27 Grades from recent quiz in AP Biology: 96, 96, 92, 90, 88, 86, 86, 84, 80, 70. 1st Step: find the mean (X).

Measure number   Measured value x   (x - X)   (x - X)^2
1                96                   9        81
2                96                   9        81
3                92                   5        25
4                90                   3         9
5                88                   1         1
6                86                  -1         1
7                86                  -1         1
8                84                  -3         9
9                80                  -7        49
10               70                 -17       289
TOTAL            868                          546
Mean, X          87 (868/10 = 86.8, rounded)
Std Dev          (to be calculated)

28 Calculating Standard Deviation: 2nd Step: determine the deviation from the mean for each grade, (x - X), then square it, (x - X)^2. The squared deviations total 546.

29 Calculating Standard Deviation: Step 3: calculate the degrees of freedom (n - 1), where n = number of data values. So, 10 - 1 = 9.

30 Calculating Standard Deviation: Step 4: put it all together to calculate S. S = √(546/9) = 7.79 ≈ 8. Std Dev = 8.
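The four steps can be followed in a short Python sketch. It uses the grades as they appear in the table (the third grade is taken as 92, which is what the table's totals of 868 and 546 imply) and rounds the mean to 87 as the slides do. The variable names are chosen for this illustration only.

```python
import math
import statistics

grades = [96, 96, 92, 90, 88, 86, 86, 84, 80, 70]

# Step 1: find the mean (868 / 10 = 86.8, rounded to 87 as on the slide)
n = len(grades)
mean = round(sum(grades) / n)

# Step 2: deviation from the mean for each grade, then square it
squared_deviations = [(x - mean) ** 2 for x in grades]
sum_of_squares = sum(squared_deviations)

# Step 3: degrees of freedom
dof = n - 1

# Step 4: put it all together
s = math.sqrt(sum_of_squares / dof)
print("sum of squares:", sum_of_squares)
print("standard deviation:", round(s, 2))

# Library check: statistics.stdev also divides by n - 1, but it uses
# the unrounded mean (86.8), so the result differs slightly.
print("statistics.stdev:", round(statistics.stdev(grades), 2))
```

Both routes round to 7.79 for this data set, so rounding the mean to 87 makes no practical difference here.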

31 Calculating Standard Deviation: So for the class data: Mean = 87, Standard deviation = 8. 1 standard deviation would be (87 - 8) through (87 + 8), or 79-95, so if the grades are normally distributed, 68.3% of the data should fall between 79 and 95. 2 standard deviations would be (87 - 16) through (87 + 16), or 71-103, so 95.4% of the data should fall between 71 and 103. 3 standard deviations would be (87 - 24) through (87 + 24), or 63-111, so 99.7% of the data should fall between 63 and 111.
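As a rough check, the following Python sketch computes each range from Mean = 87 and Standard deviation = 8 (so one standard deviation spans 87 - 8 = 79 through 87 + 8 = 95) and counts how many of the ten quiz grades actually land inside it. The grades are taken from the table (third grade 92); the dictionary name `intervals` is made up for this example.

```python
grades = [96, 96, 92, 90, 88, 86, 86, 84, 80, 70]
mean, sd = 87, 8

intervals = {}
for k in (1, 2, 3):
    low, high = mean - k * sd, mean + k * sd
    inside = sum(low <= g <= high for g in grades)  # grades within the range
    intervals[k] = (low, high, inside)
    print(f"{k} SD: {low} to {high} contains {inside} of {len(grades)} grades")
```

With only ten grades the observed proportions (7, 9, and 10 out of 10) are close to, but not exactly, the 68.3%, 95.4%, and 99.7% expected for a normal distribution.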

32 Measures of Variability: Standard Error of the Mean (SEM). Accounts for both sample size and variability. Used to represent uncertainty in an estimate of a mean. As SEM grows smaller, the likelihood that the sample mean is an accurate estimate of the population mean increases.

33 Calculating Standard Error: Using the same data from our Standard Deviation calculation: Mean = 87, S (standard deviation) = 8, n = 10. SE X = 8/√10 = 2.53 ≈ 2.5. This means the sample mean of 87 estimates the population mean with a standard error of ± 2.5. Bozeman video: Standard Error
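The same calculation as a small Python sketch, using the rounded values from the slides (S = 8, n = 10). The function name `standard_error` is chosen for this illustration.

```python
import math

def standard_error(s, n):
    """Standard error of the mean: standard deviation divided by the
    square root of the sample size."""
    return s / math.sqrt(n)

sem = standard_error(8, 10)
print("SEM:", round(sem, 2))

# Quadrupling the sample size halves the standard error
print("SEM with n = 40:", round(standard_error(8, 40), 2))
```

Because n sits under a square root, collecting more data shrinks the standard error slowly: four times as many measurements are needed to halve it.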

34 Graphing Standard Error: It is common practice to add standard error bars to graphs, marking one standard error above and below the sample mean (see figure below). These give an impression of the precision of estimation of the mean in each sample. Which sample mean is a better estimate of its population mean, B or C? Identify the two populations that are most likely to have statistically significant differences.

35 The Good News: This can be calculated on a scientific calculator OR... In Microsoft Excel, type the following formula into the cell where you want the Standard Deviation result, using the "unbiased," or "n-1" method: =STDEV(A1:A30) (substitute the cell name of the first value in your dataset for A1, and the cell name of the last value for A30). OR... Try this! http://www.pages.drexel.edu/~jdf37/mean.htm

36 You DO need to know the concept! Standard deviation is a statistic that tells how tightly all the various data points are clustered around the mean in a set of data. When the data points are tightly bunched together and the bell-shaped curve is steep, the standard deviation is small (precise results, smaller standard deviation). When the data points are spread apart and the bell curve is relatively flat, a large standard deviation value suggests less precise results.

37 Correlation: Defined as an association between two variables. Do the variables vary together? Does one variable affect the other?

38 Correlation cont.: It is important to remember that correlation does not necessarily mean causation. Here are some true examples: Ice cream sales and the number of shark attacks on swimmers are correlated. Skirt lengths and stock prices are highly correlated (as stock prices go up, skirt lengths get shorter). The number of cavities in elementary school children and vocabulary size have a strong positive correlation. Air temperature and the number of earthquakes are correlated (shake and bake). If there is a correlation, more research would be necessary to determine whether it is coincidence or a real relationship between the data.
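The slides do not name a specific measure of correlation. One common choice, assumed here, is Pearson's correlation coefficient r, which runs from -1 (perfect negative association) through 0 (no linear association) to +1 (perfect positive association). The sketch below computes it by hand; the numbers are invented purely for illustration and are not real measurements.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(spread_x * spread_y)

# Invented illustration data (not real measurements)
xs = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # goes up whenever xs goes up
falling = [10, 8, 6, 4, 2]   # goes down whenever xs goes up

print("rising:", pearson_r(xs, rising))
print("falling:", pearson_r(xs, falling))
```

A value of r near +1 or -1 shows only that the two variables vary together; it says nothing about whether one causes the other.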

39 THE END For today……….

