LSP 120: Quantitative Reasoning and Technological Literacy Topic 1: Introduction to Quantitative Reasoning and Linear Models Lecture Notes 1.2 Prepared.

Presentation transcript:

LSP 120: Quantitative Reasoning and Technological Literacy
Topic 1: Introduction to Quantitative Reasoning and Linear Models
Lecture Notes 1.2
Prepared by Ozlem Elgun

What is a linear function?
Most people would say it is a straight line or that it fits the equation y = mx + b. They are correct, but what is true about a function that, when graphed, yields a straight line? What is the relationship between the variables in a linear function? A linear function describes a relationship between x and y that has a fixed, or constant, rate of change.

Is the relationship between x and y linear?
The first thing we want to be able to do is determine whether a table of values for two variables represents a linear function. To do that, we use the rate of change formula:
rate of change = (change in y) / (change in x) = (y2 - y1) / (x2 - x1)

To determine whether a relationship is linear in Excel, add a column in which you calculate the rate of change. You must translate the definition of "change in y over change in x" into a formula using cell references. Entering a formula using cell references allows you to repeat a calculation down a column or across a row. Once you enter the formula, you can drag it down to apply it to subsequent cells.
In the example spreadsheet, row 1 holds the headers: column A is x, column B is y, and column C is Rate of Change. The formula entered in cell C3 is:
=(B3-B2)/(A3-A2)
Each term in the formula, such as B3, is a cell reference.

Note that we entered the formula for rate of change not next to the first set of values but next to the second. This is because we are finding the change from the first set of values to the second. Then fill the column and check whether the values are constant. To fill a column, either put the cursor on the corner of the cell with the formula and double-click or (if the column has gaps) put the cursor on the corner and click and drag down.
If the rate of change values are constant, then the relationship is a linear function. So this example does represent a linear function. The rate of change is 2.5 and it is constant. This means that when the x value increases by 1, the y value increases by 2.5.
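As a rough cross-check outside Excel, the same fill-down calculation can be sketched in Python. The table values here are illustrative assumptions, since the slide's own table is not reproduced in this transcript; they were chosen to give the constant rate of change of 2.5 described above. The function names `rates_of_change` and `is_linear` are hypothetical.

```python
from math import isclose

def rates_of_change(xs, ys):
    """Return (y2 - y1) / (x2 - x1) for each consecutive pair of rows."""
    return [(ys[i] - ys[i - 1]) / (xs[i] - xs[i - 1]) for i in range(1, len(xs))]

def is_linear(xs, ys):
    """A table is linear when every consecutive rate of change is the same."""
    rates = rates_of_change(xs, ys)
    return all(isclose(r, rates[0]) for r in rates)

# Illustrative values only (assumed, not the slide's table).
xs = [3, 5, 7, 9]
ys = [11, 16, 21, 26]

# Each entry mirrors the spreadsheet formula =(B3-B2)/(A3-A2) filled down.
print(rates_of_change(xs, ys))
print(is_linear(xs, ys))
```

Like the spreadsheet column, the list has one fewer entry than the table has rows, because the first row has no earlier row to compare against.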

How to Write a Linear Equation
The next step is to write the equation for this function, in the form y = mx + b:
- y and x are the variables
- m is the slope (rate of change)
- b is the y-intercept (the initial value, when x = 0)
We know x, y, and m; we need to calculate b. Using the first set of values (x = 3 and y = 11) and 2.5 for m (the slope):
11 = 2.5*3 + b
11 = 7.5 + b
3.5 = b
The equation for this function is: y = 2.5x + 3.5
Another way to find the equation is to use Excel's intercept function.
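The arithmetic for b can be sketched in a few lines of Python. This is only an illustration of the substitution step, using the point (x = 3, y = 11) and slope 2.5 from the notes; the names `intercept_from_point` and `predict` are hypothetical.

```python
def intercept_from_point(x, y, m):
    """Solve y = m*x + b for b, given one point (x, y) and the slope m."""
    return y - m * x

m = 2.5                              # slope (rate of change) from the notes
b = intercept_from_point(3, 11, m)   # first set of values: x = 3, y = 11

def predict(x):
    """Evaluate the linear model y = m*x + b."""
    return m * x + b

print(f"y = {m}x + {b}")
print(predict(3))  # should reproduce the y value of the point we started from
```

Plugging the original point back into the finished equation, as the last line does, is a quick way to catch an arithmetic slip in b.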

Practice
For the following, determine whether the function is linear and, if so, write the equation for the function.
(Three tables of x and y values; the values are not preserved in this transcript.)

Warning: Not all graphs that look like lines represent linear functions
The graph of a linear function is a line. However, the graph of a function can look like a line even though the function is not linear. Graph the following data, where t is years and P is the population of Mexico (in millions). What does the graph look like? Now calculate the rate of change for each pair of consecutive data points (as we did when checking whether a table represents a linear function). Is it constant?
(Table of t and P values; the values are not preserved in this transcript.)

What if you were given the population for every ten years? Would the graph no longer appear to be linear? Graph the following data. Does this data (derived from the same equation as the table above) appear to be linear?
(Table of t and P values; the values are not preserved in this transcript.)
Both of these tables represent an exponential model (which we will be discussing shortly). The important thing to note is that exponential data can appear to be linear depending on how many data points are graphed. The only way to determine whether a data set is linear is to calculate the rate of change (slope) and verify that it is constant.
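The effect can be sketched in Python with a made-up exponential model. The starting value 67.38 and yearly growth factor 1.026 below are assumptions chosen for illustration, not the data from the slides, and the function names are hypothetical. The point is only that the rate of change drifts slowly over a few yearly steps but changes visibly over ten-year steps.

```python
P0 = 67.38      # assumed starting population in millions (hypothetical)
GROWTH = 1.026  # assumed yearly growth factor (hypothetical)

def population(t):
    """Exponential model P = P0 * GROWTH**t."""
    return P0 * GROWTH ** t

def rates_of_change(ts):
    """Rate of change of P between consecutive values of t."""
    ps = [population(t) for t in ts]
    return [(ps[i] - ps[i - 1]) / (ts[i] - ts[i - 1]) for i in range(1, len(ts))]

yearly = rates_of_change(list(range(0, 7)))          # t = 0, 1, ..., 6
by_decade = rates_of_change(list(range(0, 60, 10)))  # t = 0, 10, ..., 50

print([round(r, 3) for r in yearly])
print([round(r, 3) for r in by_decade])
```

In both lists the rate of change keeps growing, so neither table is linear, even though the yearly values would plot as something very close to a straight line.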

"Real world" example of a linear function
Studies of the metabolism of alcohol consistently show that blood alcohol content (BAC), after rising rapidly after ingesting alcohol, declines linearly. For example, in one study, BAC in a fasting person rose to about 0.018% after a single drink. After an hour the level had dropped to 0.010%. Assuming that BAC continues to decline linearly (meaning at a constant rate of change), approximately when will BAC drop to 0.002%?
To answer the question, you must express the relationship as an equation and then use the equation. First, define the variables in the function and create a table in Excel. The two variables are time and BAC. Then calculate the rate of change.
Time (hours)    BAC
0               0.018%
1               0.010%

Time (hours)    BAC       Rate of change
0               0.018%
1               0.010%    -0.008%
This rate of change means that when the time increases by 1, the BAC decreases (since the rate of change is negative) by 0.008. In other words, the BAC % is decreasing by 0.008 every hour. Since we are told that BAC declines linearly, we can assume that figure stays constant.
Now write the equation, with Y representing BAC and X the time in hours:
Y = -0.008x + 0.018
This equation can be used to make predictions. The question is "when will the BAC reach 0.002%?" Plug in 0.002 for Y and solve for X:
0.002 = -0.008x + 0.018
-0.016 = -0.008x
x = 2
Therefore the BAC will reach 0.002% after 2 hours.
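The same prediction can be sketched in Python. The slope and the target level come from the notes; the intercept of 0.018 is an assumption in the sense that it is the value implied by the slope of -0.008 and the stated answer of 2 hours. The function name `hours_until` is hypothetical.

```python
from math import isclose

slope = -0.008     # change in BAC (percentage points) per hour, from the notes
intercept = 0.018  # assumed initial BAC, implied by the slope and the 2-hour answer
target = 0.002     # BAC level asked about in the notes

def hours_until(bac):
    """Solve bac = slope * x + intercept for x (time in hours)."""
    return (bac - intercept) / slope

hours = hours_until(target)
print(round(hours, 3))
```

Rounding before printing hides the tiny floating-point error that decimal values such as 0.008 can introduce; `isclose` is the safer way to compare such results.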

Linear Modeling-Trendlines
The Problem - To date, we have studied linear equations (models) where the data is perfectly linear. By using the slope-intercept formula, we derived linear equations (models). In the "real world" most data is not perfectly linear. How do we handle this type of data?
The Solution - We use trendlines (also known as the line of best fit or the least squares line).
Why - If we find a trendline that is a good fit, we can use its equation to make predictions. Generally we predict into the future (and occasionally into the past), which is called extrapolation. Constructing points between existing points is referred to as interpolation.
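For readers curious about what a spreadsheet trendline computes, here is a minimal sketch of a least squares line in plain Python. The data points are made up for illustration, and `fit_trendline` is a hypothetical helper, not an Excel function. It uses the standard least squares formulas for the slope and intercept.

```python
def fit_trendline(xs, ys):
    """Return (m, b) for the least squares line y = m*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b

# Made-up, nearly linear data (7 points), for illustration only.
xs = [1, 2, 3, 4, 5, 6, 7]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8]

m, b = fit_trendline(xs, ys)
print(f"y = {m:.3f}x + {b:.3f}")
print(m * 3.5 + b)  # interpolation: x lies between existing points
print(m * 9 + b)    # extrapolation: x lies beyond the last data point
```

When the data is perfectly linear, the fitted line is the same equation that the slope-intercept method gives.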

Is the trendline a good fit for the data?
There are five guidelines to answer this question:
1. Guideline 1: Do you have at least 7 data points?
2. Guideline 2: Does the R-squared value indicate a relationship?
3. Guideline 3: Verify that your trendline fits the shape of your graph.
4. Guideline 4: Look for outliers.
5. Guideline 5: Practical Knowledge, Common Sense

Guideline 1: Do you have at least 7 data points?
For the datasets that we use in this class, you should use at least 7 of the most recent data points available. If there are more data points, you will also want to include them (unless your data fails one of the guidelines below).

Guideline 2: Does the R-squared value indicate a relationship?
R² is a standard measure of how well the line fits the data. (It tells us how linear the relationship between x and y is.) In statistical terms, R² is the proportion of the variance of y that is explained by our trendline. It is more useful in the negative sense: if R² is very low, it tells us the model is not very good and probably shouldn't be used. If R² is high, we should still look at the other guidelines to determine whether our trendline is a good fit for the data, and whether we can have confidence in our predictions.

More on R-squared...
- If R² = 1, then there is a perfect match between the line and the data points.
- If R² = 0, then there is no linear relationship between the x and y values.
- If the R² value is between 0.7 and 1.0, there is a strong linear relationship, and if the data meets all the other guidelines, you can use it to make predictions.
- If the R² value is between 0.4 and 0.7, there is a moderate linear relationship, and the data can most likely be used to make predictions.
- If the R² value is below 0.4, the relationship is weak and you should not use this data to make predictions.
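These cutoffs can be written as a small lookup, sketched here in Python. The function name `describe_r_squared` is hypothetical, and treating exactly 0.7 as "strong" and exactly 0.4 as "moderate" is an assumption, since the notes do not say which side the boundary values belong to.

```python
def describe_r_squared(r2):
    """Classify an R-squared value using the cutoffs from these notes."""
    if not 0 <= r2 <= 1:
        raise ValueError("R-squared must be between 0 and 1")
    if r2 >= 0.7:       # boundary handling is an assumption
        return "strong"
    if r2 >= 0.4:       # boundary handling is an assumption
        return "moderate"
    return "weak"

for value in (0.85, 0.55, 0.2):
    print(value, describe_r_squared(value))
```

A "strong" label only covers Guideline 2; the other guidelines still have to be checked before the trendline is used for predictions.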

Even more on R-squared...
- The coefficient of determination, r², is useful because it gives the proportion of the variance (fluctuation) of one variable that is predictable from the other variable.
- It is a measure that allows us to determine how certain one can be in making predictions from a certain model or graph.
- The coefficient of determination is the ratio of the explained variation to the total variation.
- The coefficient of determination is such that 0 ≤ r² ≤ 1, and it denotes the strength of the linear association between x and y.
- The coefficient of determination represents the proportion of the variation in y that is accounted for by the line of best fit. For example, if r = 0.922, then r² = 0.850, which means that 85% of the total variation in y can be explained by the linear relationship between x and y (as described by the regression equation). The other 15% of the total variation in y remains unexplained.
- The coefficient of determination is a measure of how well the regression line represents the data. If the regression line passes exactly through every point on the scatter plot, it is able to explain all of the variation. The further the line is from the points, the less it is able to explain.
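The "explained variation over total variation" definition can be sketched directly in Python. The data set is made up for illustration and the function name `r_squared` is hypothetical. The sketch fits the least squares line first, then compares the variation of the fitted values with the variation of the actual y values, both measured around the mean of y.

```python
def r_squared(xs, ys):
    """Explained variation / total variation for the least squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    b = mean_y - m * mean_x
    # Variation of the fitted values around the mean of y.
    explained = sum((m * x + b - mean_y) ** 2 for x in xs)
    # Variation of the actual values around the mean of y.
    total = sum((y - mean_y) ** 2 for y in ys)
    return explained / total

# Made-up, nearly linear data, for illustration only.
xs = [1, 2, 3, 4, 5, 6, 7]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8]
print(round(r_squared(xs, ys), 4))

# The worked example from the notes: r = 0.922 gives r squared of about 0.850.
print(round(0.922 ** 2, 3))
```

The sketch assumes the y values are not all identical; if they were, the total variation would be zero and the ratio would be undefined.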

NOW BACK TO OUR GUIDELINES FOR DETERMINING WHETHER A TRENDLINE IS A GOOD FIT FOR THE DATA...

Guideline 3: Verify that your trendline fits the shape of your graph.
For example, if your trendline continues upward, but the data makes a downward turn during the last few years, verify that the "higher" prediction makes sense (see practical knowledge). In some cases it is obvious that you have a localized trend. Localized trends will be discussed at a later date.

Guideline 4: Look for outliers
Outliers should be investigated carefully. Often they contain valuable information about the process under investigation or about the data gathering and recording process. Before considering the possible elimination of these points from the data, try to understand why they appeared and whether it is likely that similar values will continue to appear. Of course, outliers are often bad data points. If the data was entered incorrectly, it is important to find the right information and update it. In some cases, the data is correct and an anomaly occurred that particular year. An outlier can be removed if the removal is justified, and the removal must be documented.

Guideline 5: Practical Knowledge, Common Sense
How many years out can we predict? Based on what you know about the topic, does it make sense to go ahead with the prediction? Use your subject knowledge, not your mathematical knowledge, to address this guideline.