A little VOCAB
Causation: Causation is the "causal relationship between conduct and result". That is to say, causation provides a means of connecting conduct with a resulting effect, typically an injury. Causation indicates that one event is the result of the occurrence of the other event; i.e., there is a causal relationship between the two events. This is also referred to as cause and effect.
Confounding: In statistics, a confounding variable (also confounding factor, confound, or confounder) is an extraneous variable in a statistical model that correlates (directly or inversely) with both the dependent variable and the independent variable.
Non-linear Data: Transforming Data to Perform Linear Regression
What to do if the data is not linear…
Transform the data, then calculate the LSRL on the transformed data.
Is the residual plot scattered?
NO: transform the data again.
YES: appropriate model.
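The loop above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data is made up (exponential growth, not from any data set in these slides), `fit_lsrl` is a hypothetical helper name, and taking logs of y is just one possible transformation.

```python
import numpy as np

def fit_lsrl(x, y):
    """Fit a least-squares regression line; return slope, intercept, r^2, residuals."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)       # degree-1 fit = LSRL
    residuals = y - (intercept + slope * x)      # observed minus predicted
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared, residuals

# HYPOTHETICAL data (assumption): exact exponential growth, chosen so the curve is obvious.
x = np.arange(1, 11)
y = 2.0 * np.exp(0.3 * x)

# LSRL on the raw data: the residuals are positive, then negative, then positive (a pattern).
_, _, r2_raw, res_raw = fit_lsrl(x, y)

# LSRL on the transformed data (x, ln y): the residuals are essentially zero.
_, _, r2_log, res_log = fit_lsrl(x, np.log(y))

print("raw residual signs:", np.sign(res_raw))
print("r^2 raw:", round(r2_raw, 4), " r^2 after log transform:", round(r2_log, 4))
```

A curved residual plot on the raw data means "NO, transform again"; scattered (here, near-zero) residuals after the transformation mean "YES, appropriate model".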
Let’s examine this data set. This shows the monthly premium for Jackson National’s 10-year Term Life Insurance Policy of $100,000 for males and females (smoker & non-smoker) at a given age.
Looking at just the premium for males, we see that the data is not linear.
Cool! It’s a piecewise function! Separate LSRLs are fitted to different age ranges that have been transformed using logs.
Example 1: Consider the average length and weight at different ages for Atlantic rockfish.
Use your calculator to draw a scatterplot of the data, with length (x) in L1 and weight (y) in L2. Is it linear? ____ Is there a pattern? _____ Since there is a pattern, let’s try to “straighten” the data.
Since length is __ dimensional and weight (which depends on volume) is __ dimensional, let’s graph length^3 (x), in L3, vs. weight (y), in L2. Is the scatterplot linear? ____ To fill L3: Highlight L3 ENTER L1 ^ 3 ENTER
Calculate the LSRL on the transformed points (length^3, weight) and determine r^2.
Predict the weight of an Atlantic rockfish that is 31.5 cm long.
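The same steps (cube the lengths, fit the LSRL on the transformed points, then predict) can be sketched off the calculator in Python. The numbers below are HYPOTHETICAL and are NOT the rockfish table: the weights are generated from an assumed constant 0.015 times length cubed, just to show the mechanics. Plug in the real L1 and L2 values to get the actual answer.

```python
import numpy as np

# HYPOTHETICAL data (assumption), standing in for L1 (length) and L2 (weight).
length = np.array([10.0, 15.0, 20.0, 25.0, 30.0, 35.0, 40.0])
weight = 0.015 * length ** 3          # made-up cubic relationship

# Same as "L1 ^ 3" stored in L3 on the calculator.
length_cubed = length ** 3

# LSRL on the transformed points (length^3, weight).
slope, intercept = np.polyfit(length_cubed, weight, 1)
r = np.corrcoef(length_cubed, weight)[0, 1]
r_squared = r ** 2

# To predict, cube the new length FIRST, then plug it into the LSRL.
new_length = 31.5
predicted_weight = intercept + slope * new_length ** 3

print(f"weight-hat = {intercept:.4f} + {slope:.4f} * length^3")
print(f"r^2 = {r_squared:.4f}")
print(f"predicted weight at {new_length} cm: {predicted_weight:.1f}")
```

The step that is easy to forget is cubing 31.5 before substituting it, since the LSRL was fitted on length^3, not on length.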
The residual plot clearly has a pattern, so we must transform the data!
COMPUTER OUTPUT
Computer-generated regression analysis of knee surgery data:

Predictor   Coef     Stdev    T      P
Constant    107.58   11.12    9.67   0.000
Age         0.8710   0.4146   2.10   0.062

s = 10.42   R-sq = 30.6%   R-sq(adj) = 23.7%

Be sure to convert r^2 to a decimal before taking the square root! NEVER use adjusted r^2!
What is the equation of the LSRL? Find the slope & y-intercept.
What are the correlation coefficient and the coefficient of determination?
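As a quick check on reading the output, here is a small Python sketch that uses only the numbers printed above. The variable names are illustrative. The response variable is not named in the output, so it is written as y-hat. The sign of r is taken from the sign of the slope.

```python
import math

# Values read from the computer output above.
intercept = 107.58     # "Constant" row, Coef column -> y-intercept
slope = 0.8710         # "Age" row, Coef column -> slope
r_sq_percent = 30.6    # R-sq, NOT R-sq(adj)

# Convert the percent to a decimal BEFORE taking the square root.
r_squared = r_sq_percent / 100

# r has the same sign as the slope.
r = math.copysign(math.sqrt(r_squared), slope)

print(f"y-hat = {intercept} + {slope} * Age")
print(f"coefficient of determination r^2 = {r_squared:.3f}")
print(f"correlation coefficient r = {r:.3f}")
```

Taking the square root of 30.6 instead of 0.306 would give about 5.53, which cannot be a correlation since r must lie between -1 and 1.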