

Analyzing Measurement Data ENGR 1181 Class 8

Analyzing Measurement Data in the Real World
As previously mentioned, data is collected all the time: when you use your “loyalty card” at the grocery store, or when the Google Maps SUV is taking videos of your street. The challenge is analyzing the vast amount of data for some useful purpose.

Today's Learning Objectives
After today’s class, students will be able to:
• Define the terms mean, median, mode, central tendency, and standard deviation.
• Analyze data using mean, median, and mode.
• Determine the cause of variation in a given data set.
• Identify whether variation in a given data set is systematic or random.
• Identify outliers in a given data set.
• Create a histogram for a given data set by determining an appropriate bin size and range.

Important Takeaways from Preparation Reading
• Histograms: they depict the data distribution; you must know the number of data points and determine the number of bins and the bin size.
• Measures of Central Tendency
• Measures of Variation

Analyzing Measurement Data
When collecting data, there will always be variation. We can use statistical tools to help us determine:
• Is the variation systematic or random?
• What is the cause of the variation?
• Is the variation in an acceptable range?
• What is an acceptable range of variation for this data?

Example: Slingshot Experiment!
An engineer is performing a data collection experiment using a slingshot and a softball. She predicts that if the slingshot is pulled back by 1 meter before launching the ball, the softball will land 17 meters downrange. Data is collected from 20 trials. Let’s analyze the data and see how the experiment went…

Example: The Data
• Most of this data falls in the range of meters
• Do you see any data that appears well outside this range? These rogue data points are called outliers.

Example: Histogram
• After we have the data, we can create a histogram to depict it graphically.
• What information do we need to make a histogram? The number of data points, the bin size, and the number of bins.

Example: Determining # of Bins
• We have a reference chart to determine the number of bins.
• How many bins should we use? With n = 20 data points (less than 50), the chart suggests 5 to 7 bins.
• We choose 7 bins, so the histogram will display as much information as possible.
If you have this many data points [n]    Use this number of bins [h]
Less than 50                             5 to 7
50 to 99                                 6 to 10
100 to 250                               7 to 12
More than 250                            10 to 20

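Example: Calculating Bin Size
• There are several ways to calculate bin size (k); we will use the most common formula, the range of the data divided by the number of bins: k = (max − min) / h
• k = (max − min) / 7
• k = 4.43 ≈ 5 (rounded up to a convenient whole number)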
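
The bin-size arithmetic can be sketched in a few lines of Python. This is only an illustration: the slide's actual slingshot measurements are not in this transcript, so the `distances` list is hypothetical, chosen so that the range works out to the same 4.43 before rounding. The helper name `bin_size` and the choice to round up are assumptions based on the slide's "4.43 ≈ 5".

```python
import math


def bin_size(values, num_bins):
    """Return (raw, rounded) bin size: the data range divided by the number of bins."""
    raw = (max(values) - min(values)) / num_bins
    # Assumption: round up to the next whole number, as in the slide's 4.43 -> 5.
    return raw, math.ceil(raw)


# Hypothetical distances in meters (NOT the slide's data); the range is 31 m.
distances = [12.0, 15.8, 16.9, 17.4, 18.2, 43.0]

raw_k, k = bin_size(distances, num_bins=7)
print(f"raw k = {raw_k:.2f}, rounded k = {k}")
```

Rounding up rather than to the nearest integer means the 7 bins are guaranteed to span the whole data range.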

Example: The Resulting Histogram
• Bin size: k = 5
• # Bins: h = 7
• # Data points: n = 20
• How well does this histogram represent the data?
• What aspect could we change to improve the representation?

Alternate Histograms
Which is more informative?

Histograms: A Summary
• It is important that histograms accurately and thoroughly depict the data set.
• Sometimes the suggested number of bins or bin size will not fit the data set. Use your judgment to make manual adjustments.
• Consider several options to ensure your histogram is as descriptive as possible.
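
One way to "consider several options" is to bin the same data with different numbers of bins and compare the results. The sketch below is a rough illustration, not the course's method: `trials` is a hypothetical 20-point data set with one high outlier, the function names are made up for this example, and the bin width is simply the range divided by the number of bins (no rounding).

```python
def histogram_counts(values, num_bins):
    """Count how many values fall into each of num_bins equal-width bins."""
    low, high = min(values), max(values)
    if high == low:
        raise ValueError("all values are identical; cannot form bins")
    width = (high - low) / num_bins
    counts = [0] * num_bins
    for v in values:
        # The maximum value would index one past the end, so clamp it into the last bin.
        index = min(int((v - low) / width), num_bins - 1)
        counts[index] += 1
    return low, width, counts


def print_histogram(values, num_bins):
    """Print a text histogram, one row of '#' marks per bin."""
    low, width, counts = histogram_counts(values, num_bins)
    for i, count in enumerate(counts):
        start = low + i * width
        print(f"{start:6.1f} to {start + width:6.1f} | {'#' * count}")


# Hypothetical 20 trials in meters (NOT the slide's data), with one high outlier.
trials = [15.9, 16.2, 16.5, 16.8, 17.0, 17.1, 17.2, 17.3, 17.4, 17.4,
          17.4, 17.5, 17.6, 17.8, 18.0, 18.1, 18.3, 18.6, 19.0, 43.0]

for bins in (5, 7, 12):
    print(f"\n{bins} bins:")
    print_histogram(trials, bins)
```

With this made-up data, the single outlier stretches the range so much that almost every point lands in the first bin, which is the kind of uninformative histogram the summary warns about.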

Outliers: Deal With It.
• Outliers will happen even in good data sets. Good engineers know how to deal with them!
• Engineers must determine whether an outlier is a valid data point, or if it is an error and thus invalid.
• Invalid data points can be the result of measurement errors or of incorrectly recording the data.
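
The slides do not prescribe a numerical test for spotting outliers. As one possible sketch, the widely used 1.5 × IQR rule (an assumption here, not taken from the slides) flags points that lie far outside the middle half of the data. The `trials` list is hypothetical, and a flagged point is only a candidate: deciding whether it is valid or a recording error is still an engineering judgment.

```python
import statistics


def flag_outliers(values, k=1.5):
    """Return values lying more than k * IQR below Q1 or above Q3."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low_fence or v > high_fence]


# Hypothetical 20 trials in meters (NOT the slide's data).
trials = [15.9, 16.2, 16.5, 16.8, 17.0, 17.1, 17.2, 17.3, 17.4, 17.4,
          17.4, 17.5, 17.6, 17.8, 18.0, 18.1, 18.3, 18.6, 19.0, 43.0]

print("candidate outliers:", flag_outliers(trials))
```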

Characterizing The Data
• Statistics allows us to characterize the data numerically as well as graphically.
• We characterize data numerically in two ways: central tendency and variation.

Central Tendency
• This is a single value that best represents the data. This value could be determined by the mean, the median, or the mode.
• For many engineering applications, the mean and median are most relevant.

Central Tendency: Mean
• Is the mean, 18.47, a good depiction of our slingshot data?
• What about the outlier? How does that affect our mean?

Central Tendency: Mean
• Outliers may decrease the usefulness of the mean as a central value.
• Observe how outliers affect the calculation of the mean of a data set. Here the set has no outliers:

Central Tendency: Mean
• What happens to the mean if we create an outlier on the low end of the data set?
• How does this new value describe the data?

Central Tendency: Mean
• What happens to the mean if we create an outlier on the high end of the data set?
• How does this mean describe the data set?

What Does It All Mean?!
• When outliers are present, sometimes the mean is not the best characterization of the data.
• What is another value we could use? The median!

Central Tendency: Median
• Let’s find the median for the slingshot data:
• With an even number of data points, we take the average of the two middle values.
• Here the two middle values are the same, so in this case the median is 17.4 m.

Central Tendency: Comparison
Which value is a better representation of the slingshot data?
Mean = 18.47 m
Median = 17.4 m

Central Tendency: Median
Observe how the median reduces the impact of outliers on the central tendency:
Median = 21
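
The effect can be reproduced with Python's standard `statistics` module. The three data sets below are hypothetical (the slide's own example data is not in this transcript); they differ only in one value, which is turned into a low or a high outlier. Each has an even number of points, so the median is the average of the two middle values.

```python
import statistics

# Hypothetical data sets (NOT from the slides); only one value differs between them.
data_sets = {
    "no outlier": [19, 20, 21, 21, 22, 23],
    "low outlier": [2, 20, 21, 21, 22, 23],
    "high outlier": [19, 20, 21, 21, 22, 60],
}

for name, data in data_sets.items():
    mean = statistics.mean(data)
    median = statistics.median(data)  # averages the two middle values when n is even
    print(f"{name:12s} mean = {mean:6.2f}   median = {median:5.1f}")
```

In this sketch the mean moves from 21.00 to 18.17 or 27.17 when the outlier is introduced, while the median stays at 21.0 in all three cases.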

Characterizing the Data
• We can select a value of central tendency to represent the data, but is just one number enough?
• No! It is also important to know how much variation is present in the data.
• Variation describes how the data is distributed around the central tendency value.

Representing Variation
As with central tendency, there are multiple ways to represent variation in a set of data:
• ± (“Plus/Minus”) gives the range of values.
• Standard deviation provides a more sophisticated look at how the data is distributed around the central value.

Standard Deviation
Definition: How closely the values cluster around the mean; how much variation there is in the data.
Equation: (not captured in this transcript)

Calculating Standard Deviation
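
The slide's equation and worked calculation did not survive in this transcript, so it is not clear whether the course divides by n (population form) or by n − 1 (sample form). The sketch below computes both by hand and checks them against Python's `statistics.pstdev` and `statistics.stdev`. The `readings` list and the function names are hypothetical, chosen only so the arithmetic is easy to follow.

```python
import math
import statistics


def population_std(values):
    """Square root of the mean squared deviation from the mean (divide by n)."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((x - mean) ** 2 for x in values) / len(values))


def sample_std(values):
    """Same as above but dividing by n - 1, the usual choice for a sample."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (len(values) - 1))


# Hypothetical readings (NOT from the slides): mean is 5, squared deviations sum to 32.
readings = [2, 4, 4, 4, 5, 5, 7, 9]

print(f"population std = {population_std(readings):.3f}")
print(f"sample std     = {sample_std(readings):.3f}")

# Cross-check against the standard library implementations.
assert math.isclose(population_std(readings), statistics.pstdev(readings))
assert math.isclose(sample_std(readings), statistics.stdev(readings))
```

For a small number of measurements the two forms differ noticeably; the difference shrinks as n grows.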

Interpreting Standard Deviation
• Curve A has a small σ. Data points are clustered close to the mean.
• Curve B has a large σ. Data points are spread farther from the mean.
[Figure: Curve A and Curve B]

What do you think?
Say these curves describe the distribution of grades from an exam, with an average score of 83%. Which class would you rather be in?
[Figure: Curve A and Curve B]

Normal Distribution
Data that is normally distributed occurs with greatest frequency around the mean. Normal distributions are also known as Gaussian distributions or bell curves.

Normal Distribution
• Mean = Median = Mode
• About 68% of values fall within 1σ of the mean
• About 95% of values fall within 2σ of the mean
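
These percentages can be checked from the normal distribution's cumulative distribution function. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `fraction_within` is made up for this example, and the 3σ line is added only for comparison.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def fraction_within(num_sigmas):
    """Fraction of a normal distribution lying within num_sigmas of the mean."""
    return standard_normal.cdf(num_sigmas) - standard_normal.cdf(-num_sigmas)


for s in (1, 2, 3):
    print(f"within {s} sigma: {fraction_within(s):.1%}")
```

Because the result depends only on the number of standard deviations from the mean, the same fractions hold for any normal distribution, whatever its mean and σ.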

Other Distributions
Skewed Distribution
Bimodal Distribution

Important Takeaways
• We have learned about some basic statistical tools that engineers use to analyze data.
• Histograms are used to graphically represent data, but must be created thoughtfully.
• Engineers use both central tendency and variation to numerically describe data.

Preview of Next Class
• Technical Communication 2: Expand on written technical communication, with a focus on writing lab memos and lab reports. Discuss good and poor quality presentation material and verbal delivery of technical information.