1
1 Part IC. Descriptive Statistics
Multivariate Statistics
Focus: Multiple Regression
Spring 2007
2
2 Multiple Regression
Multiple regression contains two or more independent variables and studies their effect on the dependent variable.
– Why? Remember that human behavior and social phenomena are complicated and multivariate.
– Explanation and prediction done by multiple regression are more “accurate”.
In terms of analysis, multiple regression can be seen as an extension of simple regression.
3
3 Multiple Regression Model: start from two independent variables
Equation: $y_i = a + b_1 x_1 + b_2 x_2 + \varepsilon_i$
4
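4 Some Assumptions of Multiple Regression
Here we discuss only some of the important assumptions:
– $V(\varepsilon_i) = \sigma^2$, $i = 1, \dots, n$ (homoscedasticity: the error variance is constant)
– $\mathrm{Cov}(\varepsilon_i, \varepsilon_j) = 0$, $i \neq j$, $i, j = 1, \dots, n$ (any two error terms are uncorrelated)
– $\mathrm{Cov}(x_j, \varepsilon_i) = 0$, $j = 1, \dots, k$, $i = 1, \dots, n$ (the error terms are uncorrelated with the $k$ independent variables)
– $r_{X_i, X_j} \neq \pm 1$ (no perfect linear relationship among the independent variables)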
5
5 Multiple regression estimation
Follow the same idea as simple regression: minimize SSE, the sum of squared differences between the estimated and observed values.
But here we do not usually use the ordinary least squares (OLS) method. Instead, we use the maximum likelihood estimator (MLE).
MLE is widely used in regression estimation but is difficult to compute by hand, so we usually obtain it by computer.
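A minimal sketch of the "minimize SSE" idea, in plain Python. It solves the normal equations directly, which is the least-squares route and not a general maximum likelihood routine; when the errors are normally distributed, both routes give the same coefficient estimates. The helper names `solve_linear_system` and `fit_least_squares` and the data are made up for illustration and do not come from the slides.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [A | b] so row operations touch both sides.
    M = [[float(v) for v in A[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        # Use the row with the largest pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors are perfectly collinear")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_least_squares(rows, y):
    """Return [a, b1, ..., bk] minimizing SSE for y = a + b1*x1 + ... + bk*xk."""
    X = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    p = len(X[0])
    # Normal equations: (X'X) beta = X'y
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve_linear_system(XtX, Xty)


# Made-up, noise-free data generated from y = 1 + 2*x1 + 3*x2.
rows = [(1, 2), (2, 1), (3, 5), (4, 3), (5, 8)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]
a, b1, b2 = fit_least_squares(rows, y)
print(a, b1, b2)
```

Because the data contain no noise, the fitted values of `a`, `b1`, and `b2` should recover the generating coefficients up to rounding. The `ValueError` branch corresponds to a violation of the assumption that no independent variables are perfectly linearly related.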
6
6 To estimate multiple regression:
7
7 An example: education, work tenure, and income
8
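8 Inference for multiple regression
$R^2$ = (explained variation) / (total variation), i.e., the share of the total variation that the regression can explain.
Note: a problem with $R^2$ is that it rises as more independent variables are added to the multiple regression equation, even when some of them are irrelevant to the model. This can give a misleading picture of the equation's explanatory power.
Solution: use adjusted $R^2$, which is adjusted for degrees of freedom and so corrects for arbitrarily adding irrelevant independent variables.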
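The degrees-of-freedom adjustment can be sketched in a few lines of Python. The function names are hypothetical, `n` is the number of observations, and `k` is the number of independent variables; the formula used is the standard one, adjusted $R^2 = 1 - (1 - R^2)(n - 1)/(n - k - 1)$.

```python
def r_squared(y, y_hat):
    """R^2 = 1 - SSE/SST; equals explained/total variation for a least-squares fit with an intercept."""
    mean_y = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # unexplained variation
    sst = sum((yi - mean_y) ** 2 for yi in y)              # total variation
    return 1.0 - sse / sst


def adjusted_r_squared(r2, n, k):
    """Penalize R^2 for the k independent variables used with n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Same R^2 and sample size, but more independent variables: the adjusted value drops.
few = adjusted_r_squared(0.5, n=21, k=2)
many = adjusted_r_squared(0.5, n=21, k=5)
print(few, many)
```

Holding $R^2$ fixed, the version with five independent variables gets a lower adjusted value than the version with two, which is the penalty the slide describes.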
9
9 F test: global test
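The slide text gives only the title, so the following is a sketch under the assumption that the standard global F test is meant: the null hypothesis is that all slope coefficients are zero, and the statistic is $F = (R^2 / k) / ((1 - R^2)/(n - k - 1))$ with $k$ and $n - k - 1$ degrees of freedom. The function name is hypothetical.

```python
def f_statistic(r2, n, k):
    """Global F statistic for H0: b1 = ... = bk = 0, computed from R^2."""
    explained_per_df = r2 / k                        # explained share per independent variable
    unexplained_per_df = (1.0 - r2) / (n - k - 1)    # unexplained share per residual degree of freedom
    return explained_per_df / unexplained_per_df


# Made-up inputs: R^2 = 0.5 from n = 23 observations and k = 2 independent variables.
f_value = f_statistic(0.5, n=23, k=2)
print(f_value)
```

The resulting value would then be compared with the critical value of the F distribution with `k` and `n - k - 1` degrees of freedom, which statistical software such as SPSS reports as a p-value.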
10
10 Individual regression coefficients: variance, tests, and confidence intervals
Follow the same idea as simple regression.
We usually rely on computer computation, e.g., the SPSS output.
Interpretation of regression coefficients: remember that there are now two or more independent variables, so each interpretation must add “holding the other variable(s) constant, ...”.
11
11 Question: Which variable in the multiple regression model can explain more? That is, how do we compare the relative importance of the independent variables in explaining the dependent variable?
12
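12 Compare standardized regression coefficients
The issue is that the independent variables have different units, so their coefficients cannot be compared directly when judging their relative importance in explaining the dependent variable.
Solution: use standardized regression coefficients.
Standardized regression coefficient = $\hat{\beta}_i \, (S_{x_i} / S_y)$, where $S_{x_i}$ and $S_y$ are the sample standard deviations of $x_i$ and $y$.
Standardizing removes the effect of the different units.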
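A small Python sketch of the standardization formula, using only the standard library. The function name and the numbers are invented for illustration; `b` stands for an already estimated unstandardized coefficient.

```python
from statistics import stdev


def standardized_coefficient(b, x, y):
    """Rescale an unstandardized coefficient b by S_x / S_y (sample standard deviations)."""
    return b * stdev(x) / stdev(y)


# Made-up data in which y = 2 * x exactly, so the unstandardized slope is 2.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
beta_std = standardized_coefficient(2.0, x, y)
print(beta_std)
```

Since the result is unit-free, standardized coefficients computed this way for, say, years of education and years of tenure can be compared with each other even though the raw coefficients cannot.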
13
13 SPSS Example: Education, Work tenure, and Salary
14
14
15
15 Dummy Variables
Question: Can I include qualitative (or categorical) variables in the regression model?
Answer: Yes, you can include qualitative variables in the regression model, but you have to transform them into dummy variables first.
16
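16 What is a dummy variable?
Dummy variables use (0, 1) to code the values of a categorical variable.
Examples:
1. Gender: male = 0, female = 1
2. Education: below high school, high school, college and above
Solution: an independent variable with n categories needs (n − 1) dummy variables, and all of these dummy variables must enter the regression model together. The education example therefore needs two dummy variables.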
17
17 Example of a dummy variable, education:
Education   Below high school   High school   College and above
educ_1      0                   1             0
educ_2      0                   0             1
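The coding in the education example can be sketched in Python as follows. The variable names `educ_1` and `educ_2` come from the slide; the function name and the category labels used as strings are assumptions made for this illustration.

```python
# Three categories need two dummy variables; "below high school" is the reference category.
LEVELS = ("below high school", "high school", "college and above")


def encode_education(level):
    """Return the (educ_1, educ_2) dummy coding for one education level."""
    if level not in LEVELS:
        raise ValueError(f"unknown education level: {level!r}")
    return {
        "educ_1": int(level == "high school"),
        "educ_2": int(level == "college and above"),
    }


for level in LEVELS:
    print(level, encode_education(level))
```

The reference category is the one coded 0 on both dummies, so the coefficients on `educ_1` and `educ_2` are read as differences from the "below high school" group.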
18
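18 Multiple regression with dummy variables
Example: two independent variables, one of which is a dummy variable.
Regression equation: $\hat{y} = a + b_1 x + b_2 z$ ($z$ = gender, male = 0, female = 1)
This gives:
– $(\hat{y} \mid x, z = 0) = a + b_1 x$
– $(\hat{y} \mid x, z = 1) = (a + b_2) + b_1 x$
Note: there are two regression lines with different intercepts. Sometimes the two regression lines also have different slopes, which indicates an interaction effect between the two independent variables.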
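A short Python sketch of the two regression lines implied by the dummy variable. The coefficient values are hypothetical and chosen only to make the arithmetic easy to follow; they are not estimates from the course data.

```python
def predict(a, b1, b2, x, z):
    """Fitted value from y_hat = a + b1*x + b2*z, with z coded 0 (male) or 1 (female)."""
    return a + b1 * x + b2 * z


# Hypothetical coefficients: intercept 10, slope 2 on x, shift of 5 for z = 1.
a, b1, b2 = 10.0, 2.0, 5.0

for x in (0, 3, 6):
    y_male = predict(a, b1, b2, x, z=0)    # line with intercept a
    y_female = predict(a, b1, b2, x, z=1)  # line with intercept a + b2
    print(x, y_male, y_female, y_female - y_male)
```

The gap between the two fitted values is `b2` at every value of `x`, which is what "same slope, different intercept" means. Allowing different slopes would require adding an interaction term such as `x * z` to the equation.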
19
19 Two regression lines
20
20 The estimation and inference of regression models with dummy variables: Follow the same analyses we discussed before.
21
21 Revisit Education, Work Tenure, Gender, and Salary
22
22