Quantitative Skills: Data Analysis


Presentation on theme: "Quantitative Skills: Data Analysis"— Presentation transcript:

1 Quantitative Skills: Data Analysis

2 Data analysis is one of the first steps toward determining whether an observed pattern has validity. Data analysis also helps distinguish among multiple working hypotheses. AP Biology Quantitative Skills Manual

3 Descriptive statistics serves to summarize the data. It helps show the variation in the data, standard errors, best-fit functions, and confidence that sufficient data have been collected.

4 Inferential statistics involves inferring parameters in the natural population from a sample.

5 Most of the data you will collect will fit into one of two categories: measurement data or count data.

6 Most measurements are continuous, meaning there is an infinite number of potential measurements over a given range.

7 Count data are recordings of qualitative, or discrete, data. Examples: number of leaf stomata, number of white-eyed individuals.

8 How much is good enough? How much data should a researcher collect to make a claim with confidence? How big should the size of the sample be? Is it possible the results were due to chance instead of the manipulation of the variable being tested?

9 Conducting Data Analysis

10 When an investigation involves measurement data, one of the first steps is to construct a histogram, or frequency diagram, to represent the data’s distribution.
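
Before plotting, a frequency diagram can be tabulated by counting how many measurements fall into each bin. The following is a minimal sketch, assuming made-up leaf widths and an arbitrarily chosen bin width of 1 cm; the helper name `frequency_table` is hypothetical and not from the manual.

```python
import math
from collections import Counter

def frequency_table(values, bin_width):
    """Count how many measurements fall into each bin of width bin_width.

    Returns a dict mapping each bin's lower edge to its count, sorted by edge.
    """
    counts = Counter(math.floor(v / bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

# Hypothetical leaf widths in cm (made up for illustration only).
widths_cm = [3.2, 4.1, 4.8, 5.0, 5.3, 5.9, 6.1, 6.4, 7.7]

table = frequency_table(widths_cm, bin_width=1)
for lower_edge, count in table.items():
    # Text histogram: one '#' per measurement in the bin.
    print(f"{lower_edge}-{lower_edge + 1} cm | {'#' * count}")
```

The row of `#` marks rises toward the middle bin and falls off on both sides, which is the rough bell shape to look for when judging whether data are approximately normal.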

11 If the data show an approximate normal distribution on a histogram, then they are parametric data.

12 If the data do not show an approximate normal distribution on a histogram, then they are nonparametric data. Different descriptive statistics and tests need to be applied to those data.

13 Sometimes, due to sampling bias, data might not fit a normal distribution even when the actual population could be normally distributed. In this case, a larger sample size might be needed.

14 For parametric data (a normal distribution), the appropriate descriptive statistics include the mean (average), sample size, variance, standard deviation, and standard error.

15 The mean (x̄) of the sample is the average. The mean summarizes the entire sample and might provide an estimate of the entire population’s true mean.

16 The sample size (n) refers to how many members of the population are included in the study. Sample size is important when estimating how well the sample set represents the entire population.

17 Variance (s²) and standard deviation (s) measure how far a data set is spread out. A variance of zero indicates that all the values in a data set are identical.

18 Because the differences from the mean are squared to calculate variance, the units of variance are not the same units as in the original data set. The standard deviation is the square root of the variance. The standard deviation is expressed in the same units as the original data set, which makes it generally more useful than the variance.
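
The relationship between the mean, the variance, and the standard deviation can be worked through by hand. This is a sketch under assumptions: the measurements are hypothetical, and the sample variance is taken with the usual n − 1 denominator, which the slides do not spell out.

```python
import math
import statistics

# Hypothetical measurements in cm (made up for illustration only).
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mean = statistics.mean(data)

# Sample variance: sum of squared differences from the mean, divided by n - 1.
# Squaring is why the units become cm squared rather than cm.
variance = sum((x - mean) ** 2 for x in data) / (len(data) - 1)

# Standard deviation: square root of the variance, back in cm.
std_dev = math.sqrt(variance)

print(f"mean = {mean:.2f} cm")
print(f"variance = {variance:.2f} cm^2")
print(f"standard deviation = {std_dev:.2f} cm")
```

The hand calculation agrees with the standard library's `statistics.variance` and `statistics.stdev`, which use the same n − 1 convention.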

19 A small standard deviation indicates that the data tend to be very close to the mean. A large standard deviation indicates that the data are very spread out away from the mean.

20 In an approximately normal distribution, a little more than two-thirds of the data points will fall within ±1 standard deviation of the sample mean. More than 95% of the data points will fall within ±2 standard deviations of the sample mean.


22 68–95–99.7 Rule: In a normal distribution, 68.27% of all values lie within one standard deviation of the mean, 95.45% of the values lie within two standard deviations of the mean, and 99.73% of the values lie within three standard deviations of the mean.
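
The percentages in the rule can be checked numerically. The sketch below assumes an ideal normal distribution and uses the standard identity that the fraction of values within k standard deviations of the mean is erf(k / √2); the function name `fraction_within` is hypothetical.

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within ±{k} SD: {fraction_within(k) * 100:.2f}%")
# within ±1 SD: 68.27%
# within ±2 SD: 95.45%
# within ±3 SD: 99.73%
```

Real samples only approximate these figures, and the match generally improves as the sample size grows.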

23 Sample standard error (SE) is a statistic used to make an inference about how well the sample mean matches up to the true population mean.

24 Standard error should be represented by including error bars on graphs when appropriate. Error bars are used on graphs to indicate the uncertainty of a reported measurement.
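
The slides do not give a formula for standard error. A common definition, assumed here, is the sample standard deviation divided by the square root of the sample size. The sketch below uses hypothetical measurements and computes the endpoints of a ±1 SE error bar around the mean.

```python
import math
import statistics

def standard_error(sample):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

# Hypothetical measurements in cm (made up for illustration only).
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mean = statistics.mean(sample)
se = standard_error(sample)

# Endpoints of an error bar drawn one standard error above and below the mean.
error_bar = (mean - se, mean + se)
print(f"mean = {mean:.2f} cm, SE = {se:.2f} cm")
print(f"error bar from {error_bar[0]:.2f} to {error_bar[1]:.2f} cm")
```

Because n sits under a square root, quadrupling the sample size only halves the standard error for the same standard deviation.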

25 Different statistical tools are used for data that do not resemble a normal distribution (nonparametric data, or data that are skewed or include large outliers): median, mode, quartiles, and box-and-whisker plots.
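
A box-and-whisker plot is drawn from five numbers: the minimum, the three quartiles, and the maximum. The sketch below is one way to compute them, assuming Python 3.8 or later and the "inclusive" quartile method of `statistics.quantiles`. Textbooks and software differ in how they interpolate quartiles, so other tools may give slightly different values. The helper name `five_number_summary` is hypothetical.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    ordered = sorted(values)
    # n=4 splits the data into quarters and returns the three cut points.
    q1, q2, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, q2, q3, ordered[-1]

# Small illustrative data set (made up for illustration only).
summary = five_number_summary([5, 1, 3, 7, 2])
print(summary)  # (1, 2.0, 3.0, 5.0, 7)
```

The box spans Q1 to Q3, the line inside the box marks the median, and in the simplest form of the plot the whiskers extend to the minimum and maximum.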

26 The median is the value separating the higher half of a data sample from the lower half. To find the median of a data set, first arrange the data in order from lowest to highest value and then select the value in the middle. For example, 5, 1, 3, 7, 2 is ordered as 1, 2, 3, 5, 7, so the median is 3.

27 If there are two values in the middle of an ordered data set, the median is found by averaging those two values. For example, 5, 1, 3, 7, 4, 2 is ordered as 1, 2, 3, 4, 5, 7, so the median is 3.5 (the average of 3 and 4).
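
Both median cases can be reproduced with Python's standard library. This short sketch reuses the two data sets from the slides; `statistics.median` sorts the values internally, so they can be passed unordered.

```python
import statistics

odd_count = [5, 1, 3, 7, 2]       # five values: one middle value
even_count = [5, 1, 3, 7, 4, 2]   # six values: two middle values

print(sorted(odd_count), statistics.median(odd_count))    # [1, 2, 3, 5, 7] 3
print(sorted(even_count), statistics.median(even_count))  # [1, 2, 3, 4, 5, 7] 3.5
```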

28 The mode is the value that appears most frequently in a data set. In the data set 3, 5, 1, 3, 7, 2, the mode is 3 because it appears more frequently than any other number.
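
The mode can be found by tallying how often each value occurs. The sketch below uses the data set from the slide, plus a second, made-up data set in which two values tie for most frequent. `statistics.multimode` requires Python 3.8 or later.

```python
import statistics
from collections import Counter

data = [3, 5, 1, 3, 7, 2]

# Tally of occurrences per value: 3 appears twice, every other value once.
tally = Counter(data)
print(tally[3], statistics.mode(data))  # 2 3

# Hypothetical data set with two equally common values (made up for illustration).
two_peaks = [1, 2, 2, 3, 6, 7, 7, 8]
print(statistics.multimode(two_peaks))  # [2, 7]
```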

29 A bimodal distribution

30 Data Analysis Flowchart:
Type of Data
  Measurement Data (Continuous): make a histogram
    Parametric (normal distribution): mean, standard deviation, standard error
    Nonparametric (not a normal distribution): median, mode, quartiles
  Count Data (Discrete)
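
The branches of the flowchart can be written as a small decision function. This is only a sketch: the function name and argument values are hypothetical, the judgment of whether a histogram looks approximately normal is left to the person reading it, and the function returns an empty list for count data because the flowchart lists no descriptive statistics for that branch.

```python
def recommended_statistics(data_type, looks_normal=None):
    """Suggest descriptive statistics by following the flowchart's branches.

    data_type: "measurement" (continuous) or "count" (discrete).
    looks_normal: for measurement data, True if the histogram is approximately
        normal, False if it is not.
    """
    if data_type == "count":
        # The flowchart names no descriptive statistics for count data.
        return []
    if data_type != "measurement":
        raise ValueError("data_type must be 'measurement' or 'count'")
    if looks_normal is None:
        raise ValueError("make a histogram first, then pass looks_normal")
    if looks_normal:
        # Parametric branch.
        return ["mean", "standard deviation", "standard error"]
    # Nonparametric branch.
    return ["median", "mode", "quartiles"]

print(recommended_statistics("measurement", looks_normal=True))
print(recommended_statistics("measurement", looks_normal=False))
```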

31 Example of Data Analysis: Do shady English ivy leaves have a larger surface area than sunny English ivy leaves?

32 Since the data collected are in centimeters, they are measurement data, not count data. So the first step is to make a histogram.

33 Does the data resemble a normal curve?
(Close enough, with possible differences due to sampling error)

34 Next, the appropriate statistical tools are applied:

35 A bar graph can then be produced to compare the means:

36 Do the error bars for the shady leaf mean overlap with the error bars for the sunny leaf mean?

37 A more rigorous statistical test will need to be performed, but because the error bars do not overlap, there is a high probability that the two populations are indeed different from each other.
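
The overlap check can be expressed as a comparison of two mean ± 1 SE intervals. The sketch below uses made-up leaf measurements, not the data from the presentation, and assumes standard error is the sample standard deviation divided by the square root of n. The helper names are hypothetical.

```python
import math
import statistics

def se_interval(sample):
    """Return (mean - SE, mean + SE) for a sample."""
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - se, mean + se

def intervals_overlap(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

# Hypothetical leaf measurements in cm (made up for illustration only).
shady = [9.0, 10.0, 11.0, 10.5, 9.5]
sunny = [6.0, 7.0, 6.5, 7.5, 6.0]

shady_bar = se_interval(shady)
sunny_bar = se_interval(sunny)
print("error bars overlap:", intervals_overlap(shady_bar, sunny_bar))
```

Non-overlapping bars are only a first visual screen. A formal test is still needed to support a claim that the two populations differ.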

38 Example of Data Analysis: Is 98.6°F actually the average body temperature for humans? The data are from a sample data set prepared by Allen Shoemaker (Shoemaker, 1996). This data set was modified from the results of a study published in the Journal of the American Medical Association (Mackowiak, Wasserman, and Levine, 1992).

39 Since the data collected are in degrees Fahrenheit, they are measurement data, not count data. So the first step is to make a histogram.

40 Does the data resemble a normal curve?
(Close enough)

41 Next, the appropriate statistical tools are applied:
*Note that by convention, descriptive statistics are rounded to one more decimal place than the original data points.

42 According to the 68–95–99.7 Rule, about 68% of all values lie within one standard deviation of the mean. This means that around 68% of the temperatures should fall between one standard deviation below and one standard deviation above the sample mean.

43 Including the standard error, we can say with about 68% confidence that the true mean human body temperature lies within ±0.06°F of our sample mean.
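
Both ranges follow directly from the summary statistics. The sketch below uses placeholder values for the mean, standard deviation, and sample size; they are assumptions for illustration, not the figures from the Shoemaker data set. It also assumes standard error is the standard deviation divided by the square root of n. The function names are hypothetical.

```python
import math

def one_sd_range(mean, sd):
    """Range expected to hold about 68% of individual values."""
    return mean - sd, mean + sd

def one_se_range(mean, sd, n):
    """Range of mean ± 1 standard error, where SE = sd / sqrt(n)."""
    se = sd / math.sqrt(n)
    return mean - se, mean + se

# Placeholder summary statistics (assumed, not taken from the data set).
mean_f, sd_f, n = 98.2, 0.7, 100

low, high = one_sd_range(mean_f, sd_f)
print(f"about 68% of temperatures between {low:.2f} and {high:.2f} °F")

se_low, se_high = one_se_range(mean_f, sd_f, n)
print(f"mean ± 1 SE spans {se_low:.2f} to {se_high:.2f} °F")
```

The second range is much narrower than the first because it describes uncertainty in the mean rather than the spread of individual temperatures.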

