
Transformations

Transformation (re-expression) of a Variable
A very useful transformation is the natural log transformation. Transforming a variable can change its distribution from a skewed distribution to an approximately normal distribution (bell-shaped, symmetric about its centre). For any positive value of x, ln(x) can be looked up in tables or calculated by most calculators and most statistical packages.

Graph of ln(x)

The effect of the transformation

The effect of the ln transformation
It spreads out values that are close to zero and compacts values that are large.
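
The spreading and compacting effect can be checked numerically. The following is a minimal sketch using made-up values (they are not from the slides): three values close to zero and three large values, each doubling from one to the next.

```python
import math

# Illustrative values only (not from the presentation):
# three values close to zero and three large values, each doubling.
small = [0.1, 0.2, 0.4]
large = [100.0, 200.0, 400.0]

def gaps(values):
    """Differences between consecutive values."""
    return [b - a for a, b in zip(values, values[1:])]

# On the raw scale the small values are tightly packed and the
# large values are far apart.
raw_small_gaps = gaps(small)
raw_large_gaps = gaps(large)

# On the ln scale every doubling becomes the same step, ln(2).
log_small_gaps = gaps([math.log(v) for v in small])
log_large_gaps = gaps([math.log(v) for v in large])

print("raw gaps:", raw_small_gaps, raw_large_gaps)
print("ln gaps: ", log_small_gaps, log_large_gaps)
```

After the transformation the gaps among the small values have grown and the gaps among the large values have shrunk, so both groups end up equally spaced.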

Transforming data to a normal distribution allows one to use powerful statistical procedures (discussed later on) that assume the data are normally distributed.
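
As a rough illustration of a skewed distribution becoming close to normal, the sketch below simulates right-skewed data and compares the sample skewness before and after taking ln. The simulated log-normal sample, the seed, and the `skewness` helper are assumptions made for this example; they do not come from the presentation.

```python
import math
import random

def skewness(values):
    """Sample skewness: third central moment over the 1.5 power of the second."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Simulated right-skewed data (log-normal), with a fixed seed for repeatability.
rng = random.Random(0)
skewed = [rng.lognormvariate(0.0, 1.0) for _ in range(5000)]

# The ln transformation of log-normal data is normal by construction.
logged = [math.log(v) for v in skewed]

skew_before = skewness(skewed)
skew_after = skewness(logged)
print("skewness before ln:", round(skew_before, 2))
print("skewness after ln: ", round(skew_after, 2))
```

A skewness near zero after the transformation is consistent with a symmetric, bell-shaped distribution. Real data will rarely behave this cleanly.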

Transformations to Linearity
Many non-linear curves can be put into a linear form by appropriate transformations of either the dependent variable Y, the independent variable X, or both. This leads to the wide utility of the linear model. Another use of trans

Intrinsically Linear (Linearizable) Curves
1. Hyperbolas: $y = x/(ax - b)$
Linear form: $1/y = a - b(1/x)$, or $Y = \beta_0 + \beta_1 X$
Transformations: $Y = 1/y$, $X = 1/x$, $\beta_0 = a$, $\beta_1 = -b$
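
The hyperbola case can be checked with a short sketch. It generates exact points from a hyperbola with hypothetical parameters a = 2 and b = 1 (chosen only for illustration), applies the reciprocal transformations, and recovers a and b from an ordinary least-squares line. The `fit_line` helper is written for this example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical parameters, for illustration only.
a, b = 2.0, 1.0
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [xi / (a * xi - b) for xi in x]      # points on y = x / (a x - b)

# Transform both variables: Y = 1/y, X = 1/x.
X = [1.0 / xi for xi in x]
Y = [1.0 / yi for yi in y]

beta0, beta1 = fit_line(X, Y)            # expect beta0 = a, beta1 = -b
a_hat, b_hat = beta0, -beta1
print("a =", a_hat, "b =", b_hat)
```

Because the points lie exactly on the curve, the fitted line reproduces a and b up to rounding. With noisy data the estimates would only be approximate.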

2. Exponential: $y = \alpha e^{\beta x} = \alpha B^{x}$
Linear form: $\ln y = \ln \alpha + \beta x = \ln \alpha + (\ln B)\,x$, or $Y = \beta_0 + \beta_1 X$
Transformations: $Y = \ln y$, $X = x$, $\beta_0 = \ln \alpha$, $\beta_1 = \beta = \ln B$
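
The linear form of the exponential curve follows from taking natural logs of both sides. The steps are written out below as a short derivation. The symbols $\alpha$, $\beta$ and $B$ are assumed here to be the multiplier, the rate and the base of the curve, with $B = e^{\beta}$.

```latex
\begin{align*}
y &= \alpha e^{\beta x} \\
\ln y &= \ln \alpha + \ln\!\left(e^{\beta x}\right) \\
      &= \ln \alpha + \beta x \\
\intertext{Writing the same curve with base $B = e^{\beta}$, so that $\beta = \ln B$:}
y &= \alpha B^{x} \\
\ln y &= \ln \alpha + (\ln B)\, x
\end{align*}
```

Both versions have the form $Y = \beta_0 + \beta_1 X$ with $Y = \ln y$ and $X = x$, so only the dependent variable needs to be transformed.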

3. Power Functions: $y = a x^{b}$
Linear form: $\ln y = \ln a + b \ln x$, or $Y = \beta_0 + \beta_1 X$
Transformations: $Y = \ln y$, $X = \ln x$, $\beta_0 = \ln a$, $\beta_1 = b$
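
For the power function, both variables are logged. A minimal sketch follows, assuming hypothetical parameters a = 3 and b = 0.5 and exact (noise-free) points. The `fit_line` helper is written for this example.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical parameters, for illustration only.
a, b = 3.0, 0.5
x = [1.0, 4.0, 9.0, 16.0]
y = [a * xi ** b for xi in x]            # points on y = a x^b

# Transform both variables: Y = ln y, X = ln x.
X = [math.log(xi) for xi in x]
Y = [math.log(yi) for yi in y]

beta0, beta1 = fit_line(X, Y)            # expect beta0 = ln a, beta1 = b
a_hat = math.exp(beta0)                  # undo the log to recover a
b_hat = beta1
print("a =", a_hat, "b =", b_hat)
```

The intercept is on the log scale, so it has to be exponentiated to get back to a. The slope is b directly.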

Summary
Transformations can be useful for:
1. Changing data from a skewed distribution to a Normal (bell-shaped) distribution
2. Straightening out non-linear data
A common transformation is the natural log transformation, ln(x).

Example: Motor Vehicle Data
The data is in an Excel file, MtrVeh.xls.
Dependent variable: mpg
Independent variables: engine size, horsepower and weight

The data in an SPSS file

We will try to fit a model predicting mpg from Engine (engine size). First, a scatter plot. The dialog box for selecting the variables:

The scatter-plot

Similar to:
2. Exponential: $y = \alpha e^{\beta x} = \alpha B^{x}$
Linear form: $\ln y = \ln \alpha + \beta x = \ln \alpha + (\ln B)\,x$, or $Y = \beta_0 + \beta_1 X$
Transformations: $Y = \ln y$, $X = x$, $\beta_0 = \ln \alpha$, $\beta_1 = \beta = \ln B$

To perform an ln transformation in SPSS, go to the menu Transform -> Compute.

In this dialogue box you define the transformation. Press OK and the transformation will be performed.

The new variable has been added to the SPSS spreadsheet

The scatterplot showing a better fit to a straight line using the new variable lnmpg.
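
The same idea can be sketched outside SPSS. The example below does not use the MtrVeh.xls data. It builds synthetic engine sizes and mpg values from an exactly exponential curve (the constants 40 and -0.004 are arbitrary assumptions), creates a new variable named lnmpg as in the slides, and compares how close each version is to a straight line using the correlation coefficient.

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)

# Synthetic data, NOT the MtrVeh.xls values: mpg decays exponentially
# with engine size, using arbitrary illustrative constants.
engine = [100.0, 150.0, 200.0, 250.0, 300.0, 350.0, 400.0, 450.0]
mpg = [40.0 * math.exp(-0.004 * e) for e in engine]

# The new variable, analogous to the one computed in SPSS.
lnmpg = [math.log(m) for m in mpg]

r_raw = correlation(engine, mpg)
r_log = correlation(engine, lnmpg)
print("r(engine, mpg)   =", round(r_raw, 4))
print("r(engine, lnmpg) =", round(r_log, 4))
```

Because the synthetic curve is exactly exponential, lnmpg is perfectly linear in engine size. With the real data the improvement would be visible but not perfect.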

Transformations summary
Transformations can be used to convert non-normal data to normally distributed (bell-shaped) data, allowing for the use of the more powerful techniques that assume normality. Transformations can also be used to convert non-linear data to linear (straight-line) data.

Next topic: Probability