

SCATTER PLOTS AND SMOOTHING

An Example – Car Stopping Distances

An experiment was conducted to measure how the stopping distance of a car depends on its speed. The experiment used a random selection of cars and a variety of speeds. The measurements are contained in the R data set “cars”, which can be loaded with the command:

> data(cars)
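Once loaded, the data set can be inspected before any plotting. A minimal sketch follows; the names n_obs and first_rows are illustrative and do not come from the slides.

```r
# Load the built-in data set and take a first look at it.
data(cars)

n_obs      <- nrow(cars)   # number of (speed, dist) pairs
first_rows <- head(cars)   # first six observations

print(n_obs)
print(first_rows)
summary(cars)              # summary statistics for speed and dist
```

In this data set, speed is recorded in miles per hour and dist in feet.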

Graphical Investigation

We are going to use these measurements to investigate the relationship between speed and stopping distance. The simplest way to investigate the relationship between two related variables is to plot the pairs of values as a scatter plot. The basic plot is produced with plot:

> data(cars)
> attach(cars)
> plot(speed, dist)

The default axis labels are fine for exploratory work, but not for publication.

Car Stopping Distances – Imperial Units

Car Stopping Distances – Metric Units
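Plots like the two titled above could be produced along the following lines. This is a sketch, not the original code from the slides: the axis labels and the conversion factors (1 mile = 1.609344 km, 1 foot = 0.3048 m) are supplied here, and speed_kmh and dist_m are illustrative names.

```r
data(cars)

# Imperial units: speed in miles per hour, distance in feet.
plot(cars$speed, cars$dist,
     main = "Car Stopping Distances",
     xlab = "Speed (MPH)",
     ylab = "Stopping Distance (Feet)")

# Metric units: convert before plotting.
speed_kmh <- cars$speed * 1.609344   # miles/hour -> kilometres/hour
dist_m    <- cars$dist * 0.3048      # feet -> metres

plot(speed_kmh, dist_m,
     main = "Car Stopping Distances",
     xlab = "Speed (km/h)",
     ylab = "Stopping Distance (m)")
```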

Comments

There is a general trend for stopping distance to increase with speed. There is evidence that the variability in the stopping distances also increases with speed. It is difficult to be more precise about the form of the relationship by just looking at the scatter of points.
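One rough way to examine the comment about variability numerically is to compare the spread of the points about a trend line at low and at high speeds. The sketch below is an illustration rather than part of the original analysis. Using lowess for the trend and splitting the data at the median speed are both arbitrary choices made here.

```r
data(cars)

# Sort by speed so the data line up with the output of lowess(),
# which returns its points in increasing order of x.
o     <- order(cars$speed)
speed <- cars$speed[o]
dist  <- cars$dist[o]

trend <- lowess(speed, dist)
res   <- dist - trend$y          # vertical deviations from the trend

# Standard deviation of the deviations in each half of the speed range.
half   <- ifelse(speed <= median(speed), "slower", "faster")
spread <- tapply(res, half, sd)
print(spread)
```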

Scatterplot Smoothing

One way to try to uncover the nature of the relationship is to add a line which conveys the basic trend in the plot. This can be done using a technique known as scatterplot smoothing. R has a smoothing procedure called LOWESS, available as the function lowess, which can be used to add the trend line. LOWESS is a relatively complicated procedure, but it is easy to use:

> plot(speed, dist)
> lines(lowess(speed, dist))
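lowess does not draw anything itself. It returns the coordinates of the smoothed curve, which lines then joins up. A short sketch, in which sm is an illustrative name:

```r
data(cars)

sm <- lowess(cars$speed, cars$dist)

str(sm)                       # a list with components x and y
plot(cars$speed, cars$dist)
lines(sm)                     # lines() accepts the list directly
```

Keeping the result in a variable makes it possible to reuse the smoothed values later instead of only drawing them.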

Mathematical Modelling

While the curve obtained by LOWESS lets us read off the kind of stopping distance we can expect for a given speed, it does not help us understand why the relationship is the way it is. It is possible to use the data to try to fit a well-defined mathematical curve to the data points, but this suffers from the same difficulty. It is much better to try to understand the mechanism which produced the data.
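As an illustration of fitting a well-defined curve, the sketch below fits a quadratic in speed by least squares. The choice of a quadratic is an assumption made for this example; the slides do not say which curve they have in mind. The names quad_fit and grid are illustrative.

```r
data(cars)

# Least-squares fit of dist = a + b*speed + c*speed^2.
quad_fit <- lm(dist ~ speed + I(speed^2), data = cars)
print(coef(quad_fit))

# Draw the fitted curve over the scatter plot.
grid <- data.frame(speed = seq(min(cars$speed), max(cars$speed),
                               length.out = 100))
plot(cars$speed, cars$dist)
lines(grid$speed, predict(quad_fit, newdata = grid))
```

Like the smooth, a fit of this kind describes the data without explaining them.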

Conclusions

The “smooth” confirms that stopping distance increases with speed, and it gives us more detail. The relationship is not of the form y = a + bx but has an unknown mathematical form. If we are just interested in determining the stopping distance we can expect for a given speed, this doesn’t matter. We can just read the answer off the graph.
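Reading the answer off the graph can also be done numerically, by interpolating along the smoothed curve. This is a sketch that assumes linear interpolation between the lowess points is acceptable. smooth_at is a hypothetical helper defined here, not an R function.

```r
data(cars)

sm <- lowess(cars$speed, cars$dist)

# Interpolate the smoothed curve at the requested speeds.
# ties = mean handles speeds that occur more than once in the data.
smooth_at <- function(speed_mph) {
  approx(sm$x, sm$y, xout = speed_mph, ties = mean)$y
}

smooth_at(15)          # smoothed stopping distance (feet) at 15 MPH
smooth_at(c(10, 20))   # several speeds at once
```

Speeds outside the observed range return NA, since the smooth says nothing about them.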

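Using Plots

We can check with scatterplots whether stopping distance really has a straight-line relationship with speed-squared. Either plot distance against speed-squared, or plot the square-root of distance against speed. If the relationship holds, the points should fall close to a straight line.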

Producing the Plots

> plot(speed^2, dist, main = "Car Stopping Distances",
+      xlab = "Speed-squared (MPH^2)",
+      ylab = "Stopping Distance (Feet)")
> lines(lowess(speed^2, dist))

> plot(speed, sqrt(dist), main = "Car Stopping Distances",
+      xlab = "Speed (MPH)",
+      ylab = "Square Root Stopping Distance (Feet)")
> lines(lowess(speed, sqrt(dist)))

Conclusions

Both plots indicate that there is close to a straight-line relationship between speed-squared and distance. From a statistical point of view, the second plot is preferable because the scatter of points about the line does not depend on speed (i.e. it is possible to compare apples with apples).
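If the square-root plot is taken at face value, a straight line can be fitted to it and the scatter about the line inspected directly. This sketch goes a step beyond the slides, which only use lowess. root_fit is an illustrative name.

```r
data(cars)

# Straight-line fit on the square-root scale.
root_fit <- lm(sqrt(dist) ~ speed, data = cars)
print(coef(root_fit))

# Residuals against speed: a band of roughly constant width
# would support the claim about the scatter.
plot(cars$speed, resid(root_fit),
     xlab = "Speed (MPH)", ylab = "Residual")
abline(h = 0, lty = 2)
```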

Controlling The Amount of Smoothing

The amount of smoothing in lowess is controlled by an optional argument called “f”. This gives the fraction of the data which will be used as “neighbours” of a given point when computing the smoothed value at that point. The default value of f is 2/3. The following examples, which use R's built-in nhtemp series, show the effect of varying the value of f:

> plot(nhtemp)
> lines(lowess(nhtemp, f = 2/3))

> plot(nhtemp)
> lines(lowess(nhtemp, f = 1/4))
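To compare the two settings on a single plot, the curves can be overlaid using different line types. This is a sketch. The line types, line widths and legend are choices made here, and sm_default and sm_quarter are illustrative names.

```r
data(nhtemp)

sm_default <- lowess(nhtemp, f = 2/3)   # larger f: smoother curve
sm_quarter <- lowess(nhtemp, f = 1/4)   # smaller f: follows local variation

plot(nhtemp)
lines(sm_default, lty = 1, lwd = 2)
lines(sm_quarter, lty = 2, lwd = 2)
legend("topleft", legend = c("f = 2/3", "f = 1/4"),
       lty = c(1, 2), lwd = 2)
```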