AP Biology Resources Statistical Analysis and Graphing.



Introduction to Data Analysis
• Data is shorthand for information!
• Data can be precise, accurate, or both …
• Precision describes the reproducibility of a result. For example, if you measure a quantity several times and the values agree closely with one another, your measurement is precise.
• Accuracy describes how close a measured value is to the true or known value. The closer a measured value is to the true value, the more accurate it is.

Data Measurement & Sampling
• Guides the final experimental analysis
• Influenced by
  • Sampling
  • Use of controls
  • Experimental error
  • Measuring precision
• Instruments and methods used to collect data must be validated for accuracy
• Sampling is the main technique employed for data selection

Statistics
“The mathematical study of the likelihood and probability of events occurring based on known information and inferred by taking a limited number of samples.”
• Descriptive statistics describe the population or sample from which the data were derived. Examples:
  • Range
  • Min/Max
  • Average(s)
  • Median
  • Mode
  • Variance
  • Standard Deviation
  • Histograms and Normal Distributions

Averages: Measures of Central Tendency
• Commonly called “averages,” measures of central tendency are important in statistics because they summarize an entire set of data with a single number. There are many types of averages, but the best known are the mean, median, and mode of a set.
• For the following set
  {73, 66, 69, 67, 49, 60, 81, 71, 78, 62, 53, 87, 74, 65, 74, 50, 85, 45, 63, 100}
  the three named averages are Mean = 68.6, Median = 68, and Mode = 74.

Mean: The Center of Mass
• The arithmetic mean is the value computed by dividing the sum of a set of terms by the number of terms. It is sometimes called “the average,” but “the mean” is more specific.
• For the following set
  {73, 66, 69, 67, 49, 60, 81, 71, 78, 62, 53, 87, 74, 65, 74, 50, 85, 45, 63, 100}
  1. Sum all terms: 1372.
  2. Then, divide by the number of terms (20 in this case).
  Mean = 1372 ÷ 20 = 68.6
Excel Function: AVERAGE()
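
The two steps above can be sketched in a few lines of Python. This is only an illustration of the arithmetic on the slide; the variable names are assumptions, not part of the original presentation.

```python
# Test-score set from the slide.
scores = [73, 66, 69, 67, 49, 60, 81, 71, 78, 62,
          53, 87, 74, 65, 74, 50, 85, 45, 63, 100]

total = sum(scores)    # step 1: sum all terms
count = len(scores)    # number of terms
mean = total / count   # step 2: divide the sum by the number of terms

print(total, count, mean)
```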

Median: The Middle Number(s)
• The median is the “middle” value when the list of numbers is placed in order. For an even number of terms, the median is usually the mean of the two middle terms.
• For the following set
  {73, 66, 69, 67, 49, 60, 81, 71, 78, 62, 53, 87, 74, 65, 74, 50, 85, 45, 63, 100}
  Reordered in sequence…
  {45, 49, 50, 53, 60, 62, 63, 65, 66, 67, 69, 71, 73, 74, 74, 78, 81, 85, 87, 100}
  Median = (67 + 69) ÷ 2 = 68
• In a set with an even number of terms, it is occasionally appropriate to simply choose one of the middle values; whether and how to choose depends on context.
Excel Function: MEDIAN()
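
As a rough sketch of the procedure described above (sort, then take the middle term or the mean of the two middle terms), one might write the following. The function name `median` is chosen here for illustration only.

```python
def median(values):
    """Return the middle value, or the mean of the two middle values."""
    ordered = sorted(values)   # place the terms in order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]    # odd count: a single middle term
    # even count: mean of the two middle terms
    return (ordered[mid - 1] + ordered[mid]) / 2


scores = [73, 66, 69, 67, 49, 60, 81, 71, 78, 62,
          53, 87, 74, 65, 74, 50, 85, 45, 63, 100]
print(median(scores))
```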

Mode: The Most Frequent
• The mode is the value that occurs most frequently in a data set. Sets can be unimodal, bimodal, trimodal, or multimodal.
• A dot plot is useful for quickly identifying the mode(s) of a set. For the following set
  {73, 66, 69, 67, 49, 60, 81, 71, 78, 62, 53, 87, 74, 65, 74, 50, 85, 45, 63, 100}
  Mode = 74
Excel Function: MODE() – returns the smallest mode if there are multiples
Excel Function (2010): MODE.MULTI() – this must be entered as an array function to work properly
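
A minimal sketch of finding every mode of a set, assuming we simply count occurrences and keep the values tied for the highest count. The helper name `modes` is hypothetical, and this is not a description of how Excel computes its result.

```python
from collections import Counter


def modes(values):
    """Return all values that share the highest frequency, in ascending order."""
    counts = Counter(values)        # value -> number of occurrences
    highest = max(counts.values())  # the largest frequency in the set
    return sorted(v for v, c in counts.items() if c == highest)


scores = [73, 66, 69, 67, 49, 60, 81, 71, 78, 62,
          53, 87, 74, 65, 74, 50, 85, 45, 63, 100]
print(modes(scores))            # unimodal set
print(modes([1, 1, 2, 2, 3]))   # bimodal set
```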

Comparing Mean, Median, and Mode
• The median may or may not be close to the mean.
• The data may or may not be symmetrical around the mean value. The mode, although it is the most frequent value, may not be close to either the mean or the median.
Mean = 68.6, Median = 68, Mode = 74

The Mean and Balance
• We can think of the mean as the place where a set of identical weights put at different locations on a number line would balance.

However … Values in a Sample May Not Be Equally Important
• In a set of test scores, the score on the final exam may count more (carry more weight) than the scores on quizzes and chapter tests.
• In calculating the quality of water in an area, nine different parameters may be considered, but two of the parameters may be more critical for human safety.

Weighted Mean: Regain Your Balance
• A weighted mean uses the “heaviness” or “importance” of each element to find a new balance point…
Weighted Mean = Σ(weight × value) ÷ Σ(weights)

Weighted Mean: Mean, Corrected for Weight
• The weighted mean is similar to the mean of a frequency distribution, where “weight” and “frequency” are interchangeable.
• For the following set
  {73, 66, 69, 67, 49, 60, 81, 71, 78, 62, 53, 87, 74, 65, 74, 50, 85, 45, 63, 100}
• There must be a corresponding set of weights:
  {1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 4}
• Follow these steps to find the weighted mean.
  1. Find the product of each value with its weight.
  2. Sum the products.
  3. Divide by the sum of weights.
  Ex. 73(1) + 66(1) + 69(1) + …
  Weighted Mean ≅ 72.7
Excel Function: SUMPRODUCT() ÷ SUM()
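
The three steps above might be sketched as follows, using the slide's values and weights. The variable names are illustrative assumptions.

```python
scores = [73, 66, 69, 67, 49, 60, 81, 71, 78, 62,
          53, 87, 74, 65, 74, 50, 85, 45, 63, 100]
weights = [1] * 19 + [4]   # every score counts once, except the last (weight 4)

# Step 1: multiply each value by its weight.
products = [s * w for s, w in zip(scores, weights)]
# Step 2: sum the products.  Step 3: divide by the sum of the weights.
weighted_mean = sum(products) / sum(weights)

print(sum(products), sum(weights), round(weighted_mean, 1))
```

With all weights equal to 1 the same code reduces to the ordinary mean, which is one way to see that “weight” and “frequency” play the same role.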

Water Quality Index (WQI)
• WQI is an example of a weighted average (i.e., weighted mean). Because the weights sum to one, the step of dividing by the sum of weights can be skipped, which makes it somewhat easier to calculate.
Worksheet columns: Test Results (Column A), Q-Value (Column B), Weighting Factor (Column C), TOTAL (Column D)
Test (units):
  1. DO (% sat)
  2. Fecal Coliform (colonies/100 ml)
  3. pH (units)
  4. BOD (mg/L)
  5. Temperature (Δ°C)
  6. Total Phosphate (mg/L)
  7. Nitrates (mg/L)
  8. Turbidity (NTU or Ft)
  9. Total Solids (mg/L)

Special Note
• The term profile graph is occasionally used to refer to a line graph that represents the change in a variable like WQI over a geographical area (e.g., by milepost).

Deviant Data: Measures of Dispersion
• Measures of dispersion or “spread” describe how much the data vary, either overall or around the mean, median, or mode. Measures include
  • Range
  • Variance
  • Standard Deviation
  • And more!

Range: A Measure of Dispersion
• The range is the difference between the minimum and maximum values in a data set.
• A large range usually (but not always) indicates wide dispersion of the values.
  {73, 66, 69, 67, 49, 60, 81, 71, 78, 62, 53, 87, 74, 65, 74, 50, 85, 45, 63, 100}
  Range = 100 – 45 = 55
Excel Functions: MAX(), MIN()
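
For illustration, the range of the test-score set can be checked with Python's built-in `max()` and `min()`, which play the same role as the Excel functions named above. The variable name `value_range` is an assumption made here.

```python
scores = [73, 66, 69, 67, 49, 60, 81, 71, 78, 62,
          53, 87, 74, 65, 74, 50, 85, 45, 63, 100]

# Range = maximum value minus minimum value.
value_range = max(scores) - min(scores)

print(max(scores), min(scores), value_range)
```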

Variance (s²): Another Measure of Dispersion
• The variance of a set describes how far the numbers lie from the mean (or expected value). To calculate the variance
  1. Determine the mean of the set.
  2. Find each value’s deviation from the mean.
  3. Square each of the deviations.
  4. Sum the squared deviations and divide by N – 1.
• N is the number of values in the set. Dividing by N – 1 is the correction used for the variance of a sample. Population variance requires division by N.
Data set: {30, 23, 22}
Mean: 75 ÷ 3 = 25
  Value     Deviation   Squared deviation
  30          5           25
  23         –2            4
  22         –3            9
  Totals: 75  0           38
Variance: 38 ÷ 2 = 19
Excel Functions: VARP() (population), VAR() (sample)
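
The four steps can be traced on the slide's small data set with a short sketch like the one below. It computes both the sample variance (divide by N – 1) and the population variance (divide by N); the variable names are illustrative only.

```python
data = [30, 23, 22]
n = len(data)

mean = sum(data) / n                       # step 1: mean of the set
deviations = [x - mean for x in data]      # step 2: deviation of each value
squared = [d ** 2 for d in deviations]     # step 3: squared deviations
sample_variance = sum(squared) / (n - 1)   # step 4: divide by N - 1 (sample)
population_variance = sum(squared) / n     # divide by N instead (population)

print(mean, sum(squared), sample_variance, population_variance)
```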

Variance (s²): Another Measure of Dispersion
• The variance is calculated as shown below. You may recall that the mean of the set below is 68.6, as calculated previously.
  {73, 66, 69, 67, 49, 60, 81, 71, 78, 62, 53, 87, 74, 65, 74, 50, 85, 45, 63, 100}
• Variance s² = [(73 – 68.6)² + (66 – 68.6)² + (69 – 68.6)² + (67 – 68.6)² + … ] ÷ (20 – 1) ≈ 190.57

Standard Deviation (s): You Guessed It … One More Measure of Dispersion
• Standard deviation is the square root of the variance. It can be thought of as the typical distance of the values from the mean of the data set.
  {73, 66, 69, 67, 49, 60, 81, 71, 78, 62, 53, 87, 74, 65, 74, 50, 85, 45, 63, 100}
  Standard Deviation = √190.57 ≈ 13.80
Excel Functions: STDEVP() (population), STDEV() (sample)
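
As a cross-check on the hand calculation, Python's standard `statistics` module offers sample and population versions of both measures, loosely analogous to the Excel pairs named on these slides. This is a sketch for comparison, not part of the original presentation.

```python
import statistics

scores = [73, 66, 69, 67, 49, 60, 81, 71, 78, 62,
          53, 87, 74, 65, 74, 50, 85, 45, 63, 100]

sample_var = statistics.variance(scores)       # divides by N - 1
population_var = statistics.pvariance(scores)  # divides by N
sample_sd = statistics.stdev(scores)           # square root of sample variance
population_sd = statistics.pstdev(scores)      # square root of population variance

print(round(sample_var, 2), round(sample_sd, 2))
print(round(population_var, 2), round(population_sd, 2))
```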

Standard Deviation: You Guessed It … One More Measure of Dispersion
• Find the standard deviation of this sample set: {1, 2, 3, 4, 5}
  Step 1: Calculate the mean (x̄) of the set.
  Step 2: Find each value’s deviation from the mean.
  Step 3: Square each of the deviations.
  Step 4: Sum the squared deviations and divide by N – 1.
  Step 5: Take the square root.
  s = √[ Σ(x – x̄)² ÷ (N – 1) ]
Where:
  Σ is the sum
  x is an element of the sample
  x̄ is the mean of the set
  N is the sample size (number of values)
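
The five steps can be followed one at a time for the sample {1, 2, 3, 4, 5}. The sketch below uses only `math.sqrt` and assumed variable names.

```python
import math

sample = [1, 2, 3, 4, 5]
n = len(sample)

mean = sum(sample) / n                     # step 1: mean of the set
deviations = [x - mean for x in sample]    # step 2: each value's deviation
squared = [d ** 2 for d in deviations]     # step 3: square the deviations
variance = sum(squared) / (n - 1)          # step 4: sum and divide by N - 1
std_dev = math.sqrt(variance)              # step 5: take the square root

print(mean, variance, round(std_dev, 2))
```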

Standard Deviation Tells a Different Story than the Mean
[Figure: two data sets, each with Mean = 15.5, compared by their standard deviation s]
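
To make the point concrete, here is a small sketch with two hypothetical data sets, invented for this illustration and not taken from the slide. Both have a mean of 15.5, yet their standard deviations differ widely.

```python
import statistics

# Hypothetical sets chosen only so that both means equal 15.5.
tight = [15, 15, 16, 16]   # values clustered near the mean
wide = [1, 10, 21, 30]     # values spread far from the mean

for name, values in (("tight", tight), ("wide", wide)):
    mean = statistics.mean(values)
    s = statistics.stdev(values)   # sample standard deviation
    print(name, mean, round(s, 2))
```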

Frequency Table
• A frequency table is a table that lists items in a set and records the number of occurrences.
• Choose categories and group the data appropriately.
  {73, 66, 69, 67, 49, 60, 81, 71, 78, 62, 53, 87, 74, 65, 74, 50, 85, 45, 63, 100}
  Category Labels   Frequency
  …
  >90               1
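
One way to build such a table is sketched below. The bin boundaries (widths of 10 points, with a final open-ended “>90” category) are an assumption made for this example; only the “>90” category is taken from the slide.

```python
scores = [73, 66, 69, 67, 49, 60, 81, 71, 78, 62,
          53, 87, 74, 65, 74, 50, 85, 45, 63, 100]

# Assumed categories: (label, lower bound exclusive, upper bound inclusive).
bins = [
    ("41-50", 40, 50),
    ("51-60", 50, 60),
    ("61-70", 60, 70),
    ("71-80", 70, 80),
    ("81-90", 80, 90),
    (">90", 90, float("inf")),
]

# Count how many scores fall into each category.
table = {
    label: sum(1 for s in scores if low < s <= high)
    for label, low, high in bins
}

for label, frequency in table.items():
    print(label, frequency)
```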

Histogram
• A histogram is a graphical representation, similar to a bar chart in structure, that organizes a group of data points into user-specified ranges. The histogram condenses a data series into an easily interpreted visual by taking many data points and grouping them into logical ranges or bins.

Histogram
• A histogram is simply a bar* chart of a frequency table.
*A histogram in Excel is called a “column” chart.
  Category Labels   Frequency
  …
  >90               1
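
Because a histogram is a bar chart of a frequency table, a rough text version can be drawn by printing one bar per category. The frequencies below are the counts for the test-score set under assumed 10-point bins (only “>90” appears on the slide), and `text_histogram` is a hypothetical helper.

```python
# Frequencies of the test scores under assumed 10-point categories.
frequency = {"41-50": 3, "51-60": 2, "61-70": 6,
             "71-80": 5, "81-90": 3, ">90": 1}


def text_histogram(table):
    """Return one line per category: a right-aligned label and a bar of '#'."""
    width = max(len(label) for label in table)
    return [f"{label:>{width}} | {'#' * count}" for label, count in table.items()]


for line in text_histogram(frequency):
    print(line)
```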

Histogram Analysis
• Histogram Data Set Analysis – Test Scores
  {73, 66, 69, 67, 49, 60, 81, 71, 78, 62, 53, 87, 74, 65, 74, 50, 85, 45, 63, 100}
• Mean: 68.6
• Median: 68
• Mode: 74
[Histogram annotated with Mean (68.6) and Median (68), Mode (74), and the –1 SD and +1 SD positions]
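
The –1 SD and +1 SD markers on the histogram can be located numerically. The sketch below assumes the sample standard deviation is intended and counts how many scores fall between the two markers.

```python
import statistics

scores = [73, 66, 69, 67, 49, 60, 81, 71, 78, 62,
          53, 87, 74, 65, 74, 50, 85, 45, 63, 100]

mean = statistics.mean(scores)
sd = statistics.stdev(scores)       # sample standard deviation (assumed)
lower, upper = mean - sd, mean + sd  # positions of -1 SD and +1 SD

# Scores lying within one standard deviation of the mean.
within = [s for s in scores if lower <= s <= upper]

print(round(lower, 1), round(upper, 1), len(within), len(scores))
```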

Distributions
• Descriptive statistics can be easier to interpret when graphically illustrated.
• Charting each data element can lead to very busy and confusing charts that do not help interpret the data.
• Dot plots and histograms provide a graphical illustration of how the data is distributed throughout its range.

Normal Distributions
• Normal distribution is considered the most prominent probability distribution in statistics.
  • Bell Curve Shape
  • Symmetrical
  • Mean = Median = Mode
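
These properties can be checked with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The standard normal parameters (mean 0, standard deviation 1) are an arbitrary choice for this sketch, and the share of values within one standard deviation is a well-known property added here for illustration.

```python
from statistics import NormalDist

bell = NormalDist(mu=0.0, sigma=1.0)   # standard normal, chosen for illustration

# For a normal distribution the three centers coincide.
centers = (bell.mean, bell.median, bell.mode)

# Symmetry: half of the probability lies below the mean.
half_below_mean = bell.cdf(bell.mean)

# Share of values within one standard deviation of the mean (about 68%).
within_one_sd = bell.cdf(1.0) - bell.cdf(-1.0)

print(centers, half_below_mean, round(within_one_sd, 4))
```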

Neo/SCI®, PO Box 3000, Nashua, NH, USA

Credits Content: Kenneth G. Rainis, Eddie Marroon Powered by: Pixel KNOWLEDGE © Delta Education, LLC. A member of the School Specialty family. All rights reserved.