Occupational Factors Affecting the Income of Canada's Residents in the 1970s. Group 5: Ben Wright, Bin Ren, Hong Wang, Jake Stamper, James Rogers, Yuejing Wu.


Occupational Factors Affecting the Income of Canada's Residents in the 1970s
Group 5: Ben Wright, Bin Ren, Hong Wang, Jake Stamper, James Rogers, Yuejing Wu

Data Source: Census of Canada
- Collected by the Canadian Government in 1971
- 102 different occupational categories
- 4 occupational categories had incomplete data
- Categories represent data aggregated over thousands of employees
Definition of variables:
- Gender: % of women in the occupation
- Years of Education: average number of years of education per worker
- Job prestige: rating assigned based on a social survey conducted in the mid-1960s
- Job types: blue collar (e.g. janitor), professional (e.g. lawyer), white collar (e.g. insurance agent)

What factors affected the occupational income of Canada's residents in 1971?
Step 1: Data preparation
- Removal of incomplete observations (4 types of employment were not classified into a job type: baby sitters, athletes, newsboys, and farmers)
- Removal of non-descriptive statistics (Census code)
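
The two preparation steps can be sketched in a few lines of Python. This is only an illustration: the field names (`census_code`, `type`) and every number in `records` are invented, not taken from the census data.

```python
# Toy occupation records. Every number here is invented for illustration;
# none of it is taken from the 1971 census data.
records = [
    {"occupation": "lawyer", "census_code": 1, "income": 100, "type": "prof"},
    {"occupation": "janitor", "census_code": 2, "income": 30, "type": "bc"},
    {"occupation": "insurance agent", "census_code": 3, "income": 50, "type": "wc"},
    {"occupation": "farmer", "census_code": 4, "income": 40, "type": None},
]


def prepare(rows, drop_fields=("census_code",)):
    """Drop rows with a missing job type, then drop identifier-only fields."""
    complete = [r for r in rows if r["type"] is not None]
    return [{k: v for k, v in r.items() if k not in drop_fields} for r in complete]


cleaned = prepare(records)
print(len(records), "->", len(cleaned), "rows")
```

Dropping the unclassified rows first keeps the job-type variable usable as a categorical predictor in the later regression steps.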

Step 2: Exploratory data analysis
1. Professional occupations have higher average income, prestige scores, and years of education than blue and white collar jobs
2. White collar jobs (on average) employ a larger percentage of women

Step 3: Pairwise scatter plots to see the relationships between variables

Step 4: Linear regression
Data output: R² = …, F-stat: 120, P-value: < …
Table columns: Variable, Coefficient, StdDev, T-value, P-stat
Table rows: Education, Women, Prestige, Type (b.c.), Type (prof.), Type (w.c.)
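
As a rough sketch of what such a fit involves, the Python below builds a design matrix with one indicator column per job type and solves the least-squares normal equations. The no-intercept coding is an assumption, inferred from the table listing all three job types as rows. The helper names (`design_row`, `ols`, `solve`) and all data values are hypothetical.

```python
TYPES = ("bc", "prof", "wc")


def design_row(education, women, prestige, job_type):
    """Three numeric predictors plus one 0/1 indicator per job type."""
    return [education, women, prestige] + [1.0 if job_type == t else 0.0 for t in TYPES]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


# Invented observations: (education, % women, prestige, job type).
obs = [
    (8, 10, 30, "bc"), (10, 50, 40, "bc"), (9, 20, 25, "bc"),
    (16, 5, 80, "prof"), (14, 30, 60, "prof"), (15, 60, 70, "prof"),
    (11, 70, 45, "wc"), (12, 90, 50, "wc"), (13, 40, 35, "wc"),
]
true_beta = [500.0, -20.0, 100.0, 1000.0, 3000.0, 2000.0]  # made-up coefficients
X = [design_row(*o) for o in obs]
y = [sum(b * v for b, v in zip(true_beta, row)) for row in X]  # noise-free income
beta = ols(X, y)
```

Because the toy income is generated without noise, `beta` should recover `true_beta` up to rounding. Real data would also need standard errors and t-values, which this sketch does not compute.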

Step 5: Test the validity of the linear regression: Normality?
The data is skewed towards higher incomes

Step 5: Test the validity of the linear regression: Heteroskedasticity?
The data is heteroskedastic (the variance is not constant), so a data transformation is needed
R² = .90
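
One crude way to see non-constant variance numerically is to compare the residual spread in the lower and upper halves of the fitted values. The sketch below is loosely in the spirit of a Goldfeld-Quandt comparison but is not a formal test, and it is not necessarily how the slide's conclusion was reached. The function name and the toy numbers are invented.

```python
def spread_ratio(fitted, residuals):
    """Mean squared residual in the upper half of fitted values over the lower half."""
    pairs = sorted(zip(fitted, residuals))  # order observations by fitted value
    half = len(pairs) // 2
    low = [r * r for _, r in pairs[:half]]
    high = [r * r for _, r in pairs[-half:]]
    return (sum(high) / len(high)) / (sum(low) / len(low))


# Invented residuals whose spread grows with the fitted value.
fitted = [1, 2, 3, 4, 5, 6]
growing = [0.1, -0.1, 0.2, -1.0, 1.5, -2.0]
steady = [1, -1, 1, -1, 1, -1]

ratio_growing = spread_ratio(fitted, growing)
ratio_steady = spread_ratio(fitted, steady)
```

A ratio far above 1 suggests the residual variance grows with the fitted value. A ratio near 1 is consistent with constant variance.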

Step 6: Log transformation (log income)
The log of income approximates a normal distribution
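
The effect of the transformation on skewness can be illustrated with a small Python sketch. The income values are invented, and the moment-based skewness coefficient is one common choice among several.

```python
import math


def skewness(values):
    """Moment-based sample skewness: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Invented right-skewed incomes: most are modest, one is very high.
incomes = [3000, 4000, 5000, 6000, 8000, 25000]
log_incomes = [math.log(v) for v in incomes]

raw_skew = skewness(incomes)
log_skew = skewness(log_incomes)
```

On data like this the log pulls the long right tail in, so `log_skew` is smaller than `raw_skew`. Whether the result is close enough to normal still has to be judged from a histogram or a normal quantile plot.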

Results of linear regression on the log transformation
Education is not a significant variable and can be removed from the model
Table columns: Variable, Coef., StdDev, T-value, P-stat
Surviving P-stat entries:
- Education: …
- Women: …e-15
- Prestige: …e-09
- Type (b.c.): <2e-16
- Type (prof.): <2e-16
- Type (w.c.): <2e-16

Are different models needed for different ranges of variables?
The linear model explains the entire range of observations (linear relationship)
Variables: Women, Prestige, Type

Outliers affecting the model
Possible outliers: the model may not account for a variable which explains these data points

Model disregarding outliers
The total sum of squared residuals is further reduced by removing outliers
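
A minimal sketch of this comparison, using a one-predictor line fit and invented points (the presentation's model has several predictors):

```python
def fit_line(xs, ys):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def ssr(xs, ys):
    """Sum of squared residuals of the line fitted to (xs, ys)."""
    a, b = fit_line(xs, ys)
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Invented data: five points on y = 2x plus one outlier at the end.
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 4, 6, 8, 10, 30]

ssr_all = ssr(xs, ys)
ssr_trimmed = ssr(xs[:-1], ys[:-1])  # refit without the outlier
```

Refitting on a subset of the observations can never increase the sum of squared residuals. What matters is therefore the size of the drop, together with a substantive reason for treating the points as outliers.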

Final Model
This means that, regardless of your job type, if you switched from one job to another with the same level of prestige (e.g. 62) but a lower percentage of women (e.g. from 57% to 10%), you could increase your income substantially (about $3,500)
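
The fitted equation itself is not preserved in the transcript. The worked form below assumes that the final model is the natural-log income regression on percentage of women $W$, prestige $P$, and job type. The symbols $\beta_W$, $\beta_P$ and $\alpha_{\text{type}}$ are placeholders, not the reported estimates.

```latex
\log(\text{income}) = \beta_W W + \beta_P P + \alpha_{\text{type}}

% Holding prestige and job type fixed, moving from W_1 to W_2 gives
\frac{\text{income}_2}{\text{income}_1} = \exp\bigl(\beta_W (W_2 - W_1)\bigr)

% For the example on the slide, W_1 = 57 and W_2 = 10:
\Delta\text{income} = \text{income}_1 \Bigl(\exp\bigl(\beta_W (10 - 57)\bigr) - 1\Bigr)
```

With $\beta_W < 0$, as the conclusions imply, the exponent is positive and the change in income is a gain. Under this form the dollar amount depends on the starting income, which is set by the prestige level and job type.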

Conclusions
- The level of prestige (more than education) associated with a particular occupation best describes the income it will earn
- Occupations which employ a higher percentage of women will offer a lower income
- Job type (i.e. b.c., w.c., or prof.) can be used to explain income differences between occupations