Statistics: Analyzing 2 Quantitative Variables MIDDLE SCHOOL LEVEL  Session #2  Presented by: Dr. Del Ferster.

1 Statistics: Analyzing 2 Quantitative Variables MIDDLE SCHOOL LEVEL  Session #2  Presented by: Dr. Del Ferster

2  Are 2 quantitative variables always related?  If there is a strong trend, can we assume a cause and effect status?  Why is linear regression important?  What does it let us do?

3  We’re going to spend time today on QUANTITATIVE STATISTICS.  We’ll examine scatter plots and look for patterns and strength of relationships.

4  We’ll interpret regression lines in the context of the problem.  We’ll look at correlation—a measure of the linear trend of the data.  I’ve also included a “spiffy” activity that I think you can use with your students.

7  A graphical display of two quantitative variables  We plot the explanatory (independent) variable on the x-axis and the response (dependent) variable on the y-axis  Each dot represents a single observation and its ordered pair (x,y)

8  When we consider scatterplots, we focus on 4 things: ◦ Direction ◦ Form ◦ Scatter ◦ Unusual elements

9  Positive: as values of the explanatory variable increase, values of the response variable tend to increase. As x gets larger, y tends to get larger.

10  Negative: as values of the explanatory variable increase, values of the response variable tend to decrease. As x gets larger, y tends to get smaller.

11  Null: no discernible pattern of change in the response variable.

12  Linear: The shape has the appearance of a linear relationship.  There doesn’t have to be a perfect fit.

13  Curved  In some cases, we can use logarithms to transform curved data into a linear form.
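
A minimal sketch of the logarithm idea, assuming made-up data in which y doubles every time x goes up by 1. The data and variable names are illustrative and not from the session. On a straight line, equal steps in x give equal steps in y, so the sketch compares the step sizes before and after taking a base-10 logarithm.

```python
import math

# Hypothetical data (not from the slides): y doubles each time x increases by 1.
xs = [0, 1, 2, 3, 4, 5]
ys = [3 * 2 ** x for x in xs]

# Transform the response variable with a base-10 logarithm.
log_ys = [math.log10(y) for y in ys]

# On a straight line, consecutive differences are constant.
raw_steps = [ys[i + 1] - ys[i] for i in range(len(ys) - 1)]
log_steps = [log_ys[i + 1] - log_ys[i] for i in range(len(log_ys) - 1)]

print("steps in y:    ", raw_steps)                        # keep growing: curved
print("steps in log y:", [round(s, 4) for s in log_steps])  # all equal: linear
```

The steps in y grow at every point, while the steps in log y stay the same, so a scatterplot of x against log y would look linear.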

14  None  No discernible form

15  Strong association: very little scatter

16  Moderate strength: a moderate amount of scatter

17  Weak strength: lots of scatter

18  Outliers—They just don’t fit the trend

19  Look for changes in the scatter, such as a horn shape (the scatter widens or narrows as x increases).

20 How would you describe the following plots?

21  The scatterplot shows a moderately strong, negative association.  There is a bit of a curve.

22 The scatterplot shows a weak, positive linear association. The scatter tends to decrease as the scores in Exam 1 increase.

23 The scatterplot shows a moderately strong, positive linear association. There appears to be an outlier around (9, 35).

24 The scatterplot shows no apparent association. There is great scatter and an outlier around (60,8).

25 The scatterplot shows a curved form in which the scatter increases as the explanatory variable increases

26 Determining the LINE that best fits our data.

27  A regression line is a straight line that describes how a response variable y changes as an explanatory variable x changes.  A regression line summarizes the relationship between two variables, but only in a specific setting: when one of the variables helps explain or predict the other.

28  We often use a regression line to predict the value of y for a given value of x.  Regression, unlike correlation, requires that we have an explanatory variable and a response variable

29  Fitting a line to data means drawing a line that comes as close as possible to the points.  Extrapolation: the use of a regression line for prediction far outside the range of values of the explanatory variable x that you used to obtain the line. ◦ Such predictions are often not accurate.

30  Regression analysis finds the equation of the line that best describes the relationship between the two variables.  In other words, what line best fits the data represented on our scatterplot?  While there are formulas to calculate this line, most of the time we’d use a graphing calculator or an iPad app.

31  The equation of the least-squares regression line of y on x is $\hat{y} = a + bx$, or more simply, the regression line.

32  The slope, b, is the amount by which y changes when x increases by one unit.  The intercept, a, is the value of y when x = 0.
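
For reference, these are the standard least-squares formulas for the slope b and the intercept a. The summation notation and the symbols $\bar{x}$ and $\bar{y}$ (the means of the x and y values) are not from the original slides, which leave the calculation to a calculator or app.

```latex
b = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
a = \bar{y} - b\,\bar{x},
\qquad
\hat{y} = a + bx
```

The second formula says that the regression line always passes through the point $(\bar{x}, \bar{y})$.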

33 Let’s look at an example

34  It seems that people are living longer these days (I hope so!), so I’ve done some research to study this trend. According to US Government statistics, the following data represent the life expectancy for an infant born in the given year.

Year of birth    Life expectancy (yrs)
2001             77.9
2002             78.2
2003             78.5
2004             79.0
2005             79.2
2006             79.7
2007             80.1
2008             80.2
2009             80.6

37
1. Does there appear to be a linear relationship between year of birth and life expectancy?
2. Based on the context of the problem, interpret the y-intercept of the line.
3. Based on the context of the problem, interpret the slope of the line.

38
4. According to your trend line, what is the predicted life expectancy for a baby born in 2012?
5. According to your trend line, what is the predicted life expectancy for a baby born in 2050?
6. Why might you be a bit skeptical about your response to the last question?

39
1. Yes, there appears to be a strong linear trend. The regression line has a positive slope, so as x (the year of birth) increases, so does y (the life expectancy).
2. The regression line is $\hat{y} = -612.46 + 0.345x$. The y-intercept means that at year 0 an infant’s life expectancy is -612.46 years. NOTE: in the context of this problem this is MEANINGLESS!

40
3. The regression line is $\hat{y} = -612.46 + 0.345x$. The slope means that for each one-year increase in the year of birth, an infant’s predicted life expectancy increases by 0.345 years.
4. According to the regression line, an infant born in 2012 has a life expectancy of 81.68 years.
5. According to the regression line, an infant born in 2050 has a life expectancy of 94.79 years.

41 6. The year 2050 falls well outside the set of x values (from 2001 to 2009) upon which the regression line is based, so this is most likely EXTRAPOLATION. We’re seeking to use our regression line to predict for an x value that is WELL OUTSIDE the set of data that was used to generate the equation of the line. This isn’t a good statistical practice, so this prediction would be met with a great deal of skepticism.
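
The worked example above can be checked with a short script. This is a minimal sketch that applies the standard least-squares formulas to the life expectancy table; the function names `least_squares` and `predict` are made up for illustration, and in the session itself the line would come from a graphing calculator or app.

```python
# Life expectancy data from the example above.
years = [2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009]
life_exp = [77.9, 78.2, 78.5, 79.0, 79.2, 79.7, 80.1, 80.2, 80.6]


def least_squares(xs, ys):
    """Return (intercept a, slope b) of the least-squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = y_bar - b * x_bar    # intercept
    return a, b


a, b = least_squares(years, life_exp)


def predict(year):
    """Predicted life expectancy for a given year of birth."""
    return a + b * year


print(f"slope b     = {b:.3f}")
print(f"intercept a = {a:.2f}")
print(f"2012: {predict(2012):.2f} years")
print(f"2050: {predict(2050):.2f} years (extrapolation, treat with skepticism)")
```

Both 2012 and 2050 lie outside the 2001 to 2009 data, but 2050 is more than 40 years beyond it, which is why that prediction deserves the most doubt.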

42 A way to measure the strength of a LINEAR trend.

43  CORRELATION, denoted by r, measures the direction and strength of the linear relationship between two quantitative variables.  General Properties  It must be between -1 and 1, or (-1 ≤ r ≤ 1).  If r is negative, the relationship is negative.  If r = -1, there is a perfect negative linear relationship (extreme case).  If r is positive, the relationship is positive.

44  General Properties  If r = 1, there is a perfect positive linear relationship (extreme case).  If r is 0, there is no linear relationship.  r measures the strength of the linear relationship.  If explanatory and response are switched, r remains the same.  r has no units of measurement associated with it  Scale changes do not affect r
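
Several of these properties can be checked numerically. The sketch below assumes a small made-up data set (hours studied and quiz score, not from the session) and a hand-written `correlation` helper based on the usual formula for r.

```python
import math


def correlation(xs, ys):
    """Correlation r between two equal-length lists of numbers."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: hours studied and quiz score for six students.
hours = [1, 2, 3, 4, 5, 6]
score = [52, 60, 57, 71, 74, 80]

r = correlation(hours, score)
r_swapped = correlation(score, hours)      # explanatory and response switched
minutes = [60 * h for h in hours]          # scale change: hours to minutes
r_rescaled = correlation(minutes, score)
r_perfect = correlation([1, 2, 3], [2, 4, 6])  # points exactly on a line

print(f"r          = {r:.3f}")
print(f"r swapped  = {r_swapped:.3f}")
print(f"r rescaled = {r_rescaled:.3f}")
print(f"r perfect  = {r_perfect:.3f}")
```

Switching the variables or changing hours to minutes leaves r unchanged, even though the slope of the regression line would change in both cases.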

45  Examples of extreme cases: r = 1, r = 0, r = -1

47 r = 0.07 r = -0.768 r = -0.944 r = 0.936 r = 0.496 r = 1

53 It is possible for there to be a strong relationship between two variables and still have r ≈ 0. EXAMPLE
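
One way to see this, as a minimal sketch with made-up data: take x values placed symmetrically around 0 and let y = x². The relationship is perfectly predictable, but it is a curve, so the linear measure r comes out to 0. The `correlation` helper is hand-written for illustration.

```python
import math


def correlation(xs, ys):
    """Correlation r between two equal-length lists of numbers."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: y is completely determined by x, but along a U-shaped curve.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r_curve = correlation(xs, ys)
print(f"r = {r_curve:.3f}")
```

The falling left half of the U and the rising right half cancel each other out, which is why a scatterplot should always be checked before trusting r.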

54  What would you guess for the value of r for this data?

55  r=0.996

56  Association does not imply causation  Correlation does not imply causation  Slope is not correlation  A scale change does not change the correlation.  Correlation doesn’t measure the strength of a non-linear relationship.

57 Now, let’s see if we can apply some of those things that we learned today.

58  If you want a blank copy for future use, or if you want a copy of my answers, just let me know.  You’re more than welcome to have one!!

59  Questions or comments  Remember, you make a difference in kids’ lives everyday!!  Challenge your students, support them, and share their successes!!

60  Thanks for your attention, participation, and energy. (I know it’s a long time to sit!)  Head out and hit the links!  If I can be of help during the school year, please don’t hesitate to let me know.  EMAIL: ◦ Here at Immaculata: dferster@immaculata.edu ◦ My home email: delferst@gmail.com  PHONE: (610) 369-7344 (HOME)  (610) 698-7615 (CELL)

