REGRESSION

What is regression?
Given n data points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ modeled by an equation $y = f(x)$, the best-fit model is in general the one that minimizes the sum of the squares of the residuals,
$$S_r = \sum_{i=1}^{n} \left(y_i - f(x_i)\right)^2.$$
Figure. Basic model for regression.

LINEAR REGRESSION

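Linear Regression (Criterion 1)
Given n data points $(x_1, y_1), \ldots, (x_n, y_n)$, the best fit is modeled as a straight line $y = a_0 + a_1 x$. Criterion 1 tries to minimize the sum of the residuals, $\sum_{i=1}^{n} \varepsilon_i$, where $\varepsilon_i = y_i - (a_0 + a_1 x_i)$.
Figure. Linear regression of y vs. x data showing the residual at a typical point, $x_i$.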

Example for Criterion 1
Given the points (2, 4), (3, 6), (2, 6) and (3, 8), the best fit is modeled as a straight line.
Table. Data points.
  x      y
  2.0    4.0
  3.0    6.0
  2.0    6.0
  3.0    8.0
Figure. Data points for y vs. x data.

Using the equation $y = 4x - 4$ gives the following regression curve and residuals.
Table. Residuals at each point for the regression model y = 4x - 4.
  x      y      y_predicted    ε = y - y_predicted
  2.0    4.0    4.0             0.0
  3.0    6.0    8.0            -2.0
  2.0    6.0    4.0             2.0
  3.0    8.0    8.0             0.0
Figure. Regression curve for y = 4x - 4, y vs. x data.

Using the equation $y = 6$ gives the following regression curve and residuals.
Table. Residuals at each point for y = 6.
  x      y      y_predicted    ε = y - y_predicted
  2.0    4.0    6.0            -2.0
  3.0    6.0    6.0             0.0
  2.0    6.0    6.0             0.0
  3.0    8.0    6.0             2.0
Figure. Regression curve for y = 6, y vs. x data.

Both equations, y = 4x - 4 and y = 6, give the same sum of residuals, zero, so the regression model is not unique. Criterion 1 is therefore a bad criterion.

Linear Regression (Criterion 2)
Minimize the sum of the absolute values of the residuals, $\sum_{i=1}^{n} |\varepsilon_i| = \sum_{i=1}^{n} |y_i - a_0 - a_1 x_i|$.
Figure. Linear regression of y vs. x data showing the residual at a typical point, $x_i$.

Using the equation $y = 4x - 4$:
Table. Absolute residuals for the y = 4x - 4 regression model.
  x      y      y_predicted    |ε| = |y - y_predicted|
  2.0    4.0    4.0            0.0
  3.0    6.0    8.0            2.0
  2.0    6.0    4.0            2.0
  3.0    8.0    8.0            0.0
Figure. Regression curve for y = 4x - 4, y vs. x data.

Using the equation $y = 6$:
Table. Absolute residuals for the y = 6 model.
  x      y      y_predicted    |ε| = |y - y_predicted|
  2.0    4.0    6.0            2.0
  3.0    6.0    6.0            0.0
  2.0    6.0    6.0            0.0
  3.0    8.0    6.0            2.0
Figure. Regression curve for y = 6, y vs. x data.

$\sum_{i=1}^{4} |\varepsilon_i| = 4$ for both regression models, y = 4x - 4 and y = 6. The sum of the absolute residuals has been made as small as possible, that is 4, but the regression model is not unique. Hence minimizing the sum of the absolute values of the residuals is also a bad criterion. Can you find a regression line for which $\sum |\varepsilon_i| < 4$ and which has unique regression coefficients?
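The two failed criteria can be checked numerically. The sketch below is only an illustration (the helper name `residuals` and the dictionary layout are assumptions made here, not part of the slides); it evaluates the sum of the residuals and the sum of their absolute values for the two candidate lines.

```python
# Data points from the example: (2, 4), (3, 6), (2, 6), (3, 8)
xs = [2.0, 3.0, 2.0, 3.0]
ys = [4.0, 6.0, 6.0, 8.0]


def residuals(a0, a1):
    """Residuals y_i - (a0 + a1 * x_i) for the line y = a0 + a1 * x."""
    return [y - (a0 + a1 * x) for x, y in zip(xs, ys)]


# Candidate lines as (a0, a1): y = 4x - 4 and y = 6
candidates = {"y = 4x - 4": (-4.0, 4.0), "y = 6": (6.0, 0.0)}

results = {}
for name, (a0, a1) in candidates.items():
    eps = residuals(a0, a1)
    sum_eps = sum(eps)                    # Criterion 1
    sum_abs = sum(abs(e) for e in eps)    # Criterion 2
    results[name] = (sum_eps, sum_abs)
    print(f"{name}: sum of residuals = {sum_eps}, sum of |residuals| = {sum_abs}")
```

Both lines give identical values under both criteria, which is exactly why neither criterion singles out one line.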

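Least Squares Criterion
The least-squares criterion minimizes the sum of the squares of the residuals of the model,
$$S_r = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} \left(y_i - a_0 - a_1 x_i\right)^2.$$
Figure. Linear regression of y vs. x data showing the residual at a typical point, $x_i$.

Finding Constants of Linear Model
Minimize the sum of the squares of the residuals, $S_r = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2$. To find $a_0$ and $a_1$, we minimize $S_r$ with respect to $a_0$ and $a_1$:
$$\frac{\partial S_r}{\partial a_0} = -2\sum_{i=1}^{n} (y_i - a_0 - a_1 x_i) = 0, \qquad \frac{\partial S_r}{\partial a_1} = -2\sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)\,x_i = 0,$$
giving
$$n a_0 + a_1 \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i, \qquad a_0 \sum_{i=1}^{n} x_i + a_1 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i.$$

Finding Constants of Linear Model (cont.)
Solving for $a_0$ and $a_1$ directly yields
$$a_1 = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2}, \qquad a_0 = \bar{y} - a_1 \bar{x}.$$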
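As a minimal sketch of the closed-form least-squares solution, $a_1 = (n\sum x_i y_i - \sum x_i \sum y_i)/(n\sum x_i^2 - (\sum x_i)^2)$ and $a_0 = \bar{y} - a_1\bar{x}$, the code below fits the four points used in the criterion examples. The function names `linear_fit` and `sum_sq_residuals` are illustrative choices made here, not something taken from the slides.

```python
def linear_fit(xs, ys):
    """Return (a0, a1) of the least-squares line y = a0 + a1 * x."""
    n = len(xs)
    s_x = sum(xs)
    s_y = sum(ys)
    s_xx = sum(x * x for x in xs)
    s_xy = sum(x * y for x, y in zip(xs, ys))
    a1 = (n * s_xy - s_x * s_y) / (n * s_xx - s_x ** 2)
    a0 = s_y / n - a1 * s_x / n  # a0 = ybar - a1 * xbar
    return a0, a1


def sum_sq_residuals(xs, ys, a0, a1):
    """Sum of squared residuals S_r for the line y = a0 + a1 * x."""
    return sum((y - a0 - a1 * x) ** 2 for x, y in zip(xs, ys))


xs = [2.0, 3.0, 2.0, 3.0]
ys = [4.0, 6.0, 6.0, 8.0]

a0, a1 = linear_fit(xs, ys)
s_r_best = sum_sq_residuals(xs, ys, a0, a1)
s_r_line1 = sum_sq_residuals(xs, ys, -4.0, 4.0)  # y = 4x - 4
s_r_line2 = sum_sq_residuals(xs, ys, 6.0, 0.0)   # y = 6
print(a0, a1, s_r_best, s_r_line1, s_r_line2)
```

Unlike the first two criteria, the least-squares fit returns a single line for these points, and its sum of squared residuals is smaller than that of either y = 4x - 4 or y = 6.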

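Example 1
The torque T needed to turn the torsion spring of a mousetrap through an angle θ is given below. Find the constants of the model $T = k_1 + k_2\theta$.
Table. Torque vs. angle for a torsional spring.
  Angle θ (rad)    Torque T (N-m)
  0.698132         0.188224
  0.959931         0.209138
  1.134464         0.230052
  1.570796         0.250965
  1.919862         0.313707
Figure. Data points for torque vs. angle data.

Example 1 cont.
The following table shows the summations needed to calculate the constants in the regression model.
Table. Tabulation of data for the calculation of the needed summations.
  θ (rad)     T (N-m)     θ² (rad²)    Tθ (N-m-rad)
  0.698132    0.188224    0.487388     0.131405
  0.959931    0.209138    0.921468     0.200758
  1.134464    0.230052    1.2870       0.260986
  1.570796    0.250965    2.4674       0.394215
  1.919862    0.313707    3.6859       0.602274
  Sum:
  6.2831      1.1921      8.8491       1.5896
Using the least-squares equations with n = 5,
$$k_2 = \frac{n\sum \theta_i T_i - \sum \theta_i \sum T_i}{n\sum \theta_i^2 - \left(\sum \theta_i\right)^2} = \frac{5(1.5896) - (6.2831)(1.1921)}{5(8.8491) - (6.2831)^2} = 9.6091 \times 10^{-2}\ \text{N-m/rad}$$
(evaluated with the unrounded sums).

Example 1 cont.
Use the average torque and average angle to calculate $k_1$:
$$\bar{T} = \frac{\sum T_i}{n} = \frac{1.1921}{5} = 2.3842 \times 10^{-1}\ \text{N-m}, \qquad \bar{\theta} = \frac{\sum \theta_i}{n} = \frac{6.2831}{5} = 1.2566\ \text{rad}.$$
Using $k_1 = \bar{T} - k_2\bar{\theta}$,
$$k_1 = 2.3842 \times 10^{-1} - (9.6091 \times 10^{-2})(1.2566) = 1.1767 \times 10^{-1}\ \text{N-m}.$$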
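The hand calculation for the torsion-spring data can be reproduced with a few lines of code. This is a sketch under the assumption that the model is the straight line $T = k_1 + k_2\theta$; the variable names are chosen here for illustration.

```python
# Torsion-spring data from the Example 1 table
theta = [0.698132, 0.959931, 1.134464, 1.570796, 1.919862]   # rad
torque = [0.188224, 0.209138, 0.230052, 0.250965, 0.313707]  # N-m

n = len(theta)
s_theta = sum(theta)
s_torque = sum(torque)
s_theta2 = sum(th * th for th in theta)
s_cross = sum(th * tq for th, tq in zip(theta, torque))

# Slope from the least-squares formula, intercept from the averages
k2 = (n * s_cross - s_theta * s_torque) / (n * s_theta2 - s_theta ** 2)
k1 = s_torque / n - k2 * s_theta / n

print(f"k1 = {k1:.4e} N-m, k2 = {k2:.4e} N-m/rad")
```

Working with the unrounded sums matters here: the numerator is a small difference of two nearly equal numbers, so rounding the sums to five digits before subtracting visibly changes the last digits of $k_2$.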

Example 1 Results
Using linear regression, a trend line is found from the data.
Figure. Linear regression of torque versus angle data.
Can you find the energy in the spring if it is twisted from 0 to 180 degrees?

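Example 2
To find the longitudinal modulus of a composite, the following data is collected. Find the longitudinal modulus E using the regression model $\sigma = E\varepsilon$ and the sum of the squares of the residuals.
Table. Stress vs. strain data.
  Strain (%)    Stress (MPa)
  0             0
  0.183         306
  0.36          612
  0.5324        917
  0.702         1223
  0.867         1529
  1.0244        1835
  1.1774        2140
  1.329         2446
  1.479         2752
  1.5           2767
  1.56          2896
Figure. Data points for stress vs. strain data.

Example 2 cont.
The residual at each point is given by $\sigma_i - E\varepsilon_i$. The sum of the squares of the residuals then is
$$S_r = \sum_{i=1}^{n} \left(\sigma_i - E\varepsilon_i\right)^2.$$
Differentiate with respect to E:
$$\frac{dS_r}{dE} = \sum_{i=1}^{n} 2\left(\sigma_i - E\varepsilon_i\right)(-\varepsilon_i) = 0.$$
Therefore
$$E = \frac{\sum_{i=1}^{n} \sigma_i \varepsilon_i}{\sum_{i=1}^{n} \varepsilon_i^2}.$$

Example 2 cont.
Table. Summation data for regression model (strain as a fraction, stress in Pa).
  i     ε              σ              ε²             εσ
  1     0.0000         0.0000         0.0000         0.0000
  2     1.8300×10^-3   3.0600×10^8    3.3489×10^-6   5.5998×10^5
  3     3.6000×10^-3   6.1200×10^8    1.2960×10^-5   2.2032×10^6
  4     5.3240×10^-3   9.1700×10^8    2.8345×10^-5   4.8821×10^6
  5     7.0200×10^-3   1.2230×10^9    4.9280×10^-5   8.5855×10^6
  6     8.6700×10^-3   1.5290×10^9    7.5169×10^-5   1.3256×10^7
  7     1.0244×10^-2   1.8350×10^9    1.0494×10^-4   1.8798×10^7
  8     1.1774×10^-2   2.1400×10^9    1.3863×10^-4   2.5196×10^7
  9     1.3290×10^-2   2.4460×10^9    1.7662×10^-4   3.2507×10^7
  10    1.4790×10^-2   2.7520×10^9    2.1874×10^-4   4.0702×10^7
  11    1.5000×10^-2   2.7670×10^9    2.2500×10^-4   4.1505×10^7
  12    1.5600×10^-2   2.8960×10^9    2.4336×10^-4   4.5178×10^7
  Sum                                 1.2764×10^-3   2.3337×10^8
With $\sum \varepsilon_i^2 = 1.2764 \times 10^{-3}$ and $\sum \varepsilon_i \sigma_i = 2.3337 \times 10^{8}$, using $E = \sum \sigma_i \varepsilon_i / \sum \varepsilon_i^2$ gives
$$E = \frac{2.3337 \times 10^{8}}{1.2764 \times 10^{-3}} = 1.8284 \times 10^{11}\ \text{Pa} = 182.84\ \text{GPa}.$$

Example 2 Results
The equation $\sigma = 182.84\,\varepsilon$ GPa describes the data.
Figure. Linear regression for stress vs. strain data.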
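For a line through the origin, $\sigma = E\varepsilon$, the least-squares slope is $E = \sum \sigma_i\varepsilon_i / \sum \varepsilon_i^2$. The sketch below applies that formula to the stress vs. strain table, converting strain from percent to a fraction and stress from MPa to Pa first; the variable names are illustrative.

```python
# Stress vs. strain data from the Example 2 table
strain_percent = [0, 0.183, 0.36, 0.5324, 0.702, 0.867,
                  1.0244, 1.1774, 1.329, 1.479, 1.5, 1.56]
stress_mpa = [0, 306, 612, 917, 1223, 1529,
              1835, 2140, 2446, 2752, 2767, 2896]

# Convert to consistent units: strain as a fraction, stress in Pa
strain = [e / 100.0 for e in strain_percent]
stress = [s * 1.0e6 for s in stress_mpa]

# Least-squares slope of a line forced through the origin
sum_strain_stress = sum(e * s for e, s in zip(strain, stress))
sum_strain_sq = sum(e * e for e in strain)
E = sum_strain_stress / sum_strain_sq

print(f"E = {E / 1e9:.2f} GPa")
```

Note that this model has no intercept, so the general two-constant formulas for $a_0$ and $a_1$ do not apply; the single normal equation above is what minimizes $S_r$.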

NONLINEAR REGRESSION

Nonlinear Regression
Some popular nonlinear regression models:
1. Exponential model: $y = ae^{bx}$
2. Power model: $y = ax^{b}$
3. Saturation growth model: $y = \dfrac{ax}{b + x}$
4. Polynomial model: $y = a_0 + a_1 x + \ldots + a_m x^m$

Nonlinear Regression
Given n data points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, best fit $y = f(x)$ to the data, where $f(x)$ is a nonlinear function of $x$.
Figure. Nonlinear regression model for discrete y vs. x data.

Regression
Exponential Model

Exponential Model
Given $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, best fit $y = ae^{bx}$ to the data.
Figure. Exponential model of nonlinear regression for y vs. x data.

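Finding Constants of Exponential Model
The sum of the squares of the residuals is defined as
$$S_r = \sum_{i=1}^{n} \left(y_i - ae^{bx_i}\right)^2.$$
Differentiate with respect to a and b:
$$\frac{\partial S_r}{\partial a} = \sum_{i=1}^{n} 2\left(y_i - ae^{bx_i}\right)\left(-e^{bx_i}\right) = 0,$$
$$\frac{\partial S_r}{\partial b} = \sum_{i=1}^{n} 2\left(y_i - ae^{bx_i}\right)\left(-a x_i e^{bx_i}\right) = 0.$$

Finding Constants of Exponential Model
Rewriting the equations, we obtain
$$-\sum_{i=1}^{n} y_i e^{bx_i} + a\sum_{i=1}^{n} e^{2bx_i} = 0,$$
$$\sum_{i=1}^{n} y_i x_i e^{bx_i} - a\sum_{i=1}^{n} x_i e^{2bx_i} = 0.$$

Finding Constants of Exponential Model
Solving the first equation for a yields
$$a = \frac{\sum_{i=1}^{n} y_i e^{bx_i}}{\sum_{i=1}^{n} e^{2bx_i}}.$$
Substituting a back into the second equation gives
$$\sum_{i=1}^{n} y_i x_i e^{bx_i} - \frac{\sum_{i=1}^{n} y_i e^{bx_i}}{\sum_{i=1}^{n} e^{2bx_i}} \sum_{i=1}^{n} x_i e^{2bx_i} = 0.$$
The constant b can be found through numerical methods such as the bisection method.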

Example 1 - Exponential Model
Many patients get concerned when a test involves injection of a radioactive material. For example, for scanning a gallbladder, a few drops of technetium-99m isotope are used. Half of the technetium-99m would be gone in about 6 hours. It, however, takes about 24 hours for the radiation levels to reach what we are exposed to in day-to-day activities. Below is given the relative intensity of radiation as a function of time.
Table. Relative intensity of radiation as a function of time.
  t (hrs)    0        1        3        5        7        9
  γ          1.000    0.891    0.708    0.562    0.447    0.355

Example 1 - Exponential Model cont.
The relative intensity is related to time by the equation $\gamma = Ae^{\lambda t}$. Find:
a) The value of the regression constants A and λ
b) The half-life of technetium-99m
c) The radiation intensity after 24 hours

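Plot of data

Constants of the Model
The value of λ is found by solving the nonlinear equation
$$f(\lambda) = \sum_{i=1}^{n} \gamma_i t_i e^{\lambda t_i} - \frac{\sum_{i=1}^{n} \gamma_i e^{\lambda t_i}}{\sum_{i=1}^{n} e^{2\lambda t_i}} \sum_{i=1}^{n} t_i e^{2\lambda t_i} = 0.$$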

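Setting up the Equation in MATLAB
  t (hrs)    0        1        3        5        7        9
  γ          1.000    0.891    0.708    0.562    0.447    0.355

Setting up the Equation in MATLAB
  t = [0 1 3 5 7 9];
  gamma = [1 0.891 0.708 0.562 0.447 0.355];
  syms lamda                              % symbolic unknown (Symbolic Math Toolbox)
  sum1 = sum(gamma.*t.*exp(lamda*t));     % sum of gamma_i*t_i*exp(lamda*t_i)
  sum2 = sum(gamma.*exp(lamda*t));        % sum of gamma_i*exp(lamda*t_i)
  sum3 = sum(exp(2*lamda*t));             % sum of exp(2*lamda*t_i)
  sum4 = sum(t.*exp(2*lamda*t));          % sum of t_i*exp(2*lamda*t_i)
  f = sum1 - sum2/sum3*sum4;              % f(lamda) = 0 is the equation to solve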

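Calculating the Other Constant
With λ = -0.11508 from solving the nonlinear equation, the value of A can now be calculated:
$$A = \frac{\sum_{i=1}^{6} \gamma_i e^{\lambda t_i}}{\sum_{i=1}^{6} e^{2\lambda t_i}} = 0.99983.$$
The exponential regression model then is
$$\gamma = 0.99983\,e^{-0.11508\,t}.$$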
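The same calculation can be sketched outside MATLAB. The code below solves $f(\lambda) = \sum \gamma_i t_i e^{\lambda t_i} - \left(\sum \gamma_i e^{\lambda t_i} / \sum e^{2\lambda t_i}\right)\sum t_i e^{2\lambda t_i} = 0$ by bisection and then evaluates $A = \sum \gamma_i e^{\lambda t_i} / \sum e^{2\lambda t_i}$. The bracket [-0.2, -0.05] is an assumption made here (it was chosen because f changes sign across it), and the function names are illustrative.

```python
import math

t = [0, 1, 3, 5, 7, 9]                              # hours
gamma = [1.000, 0.891, 0.708, 0.562, 0.447, 0.355]  # relative intensity


def f(lam):
    """Left-hand side of the nonlinear equation for lambda."""
    s1 = sum(g * ti * math.exp(lam * ti) for g, ti in zip(gamma, t))
    s2 = sum(g * math.exp(lam * ti) for g, ti in zip(gamma, t))
    s3 = sum(math.exp(2 * lam * ti) for ti in t)
    s4 = sum(ti * math.exp(2 * lam * ti) for ti in t)
    return s1 - s2 / s3 * s4


def bisect(func, lo, hi, tol=1e-10, max_iter=200):
    """Bisection: func(lo) and func(hi) must have opposite signs."""
    f_lo = func(lo)
    if f_lo * func(hi) > 0:
        raise ValueError("root is not bracketed")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_lo * f_mid <= 0:
            hi = mid                 # root lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid    # root lies in [mid, hi]
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


lam = bisect(f, -0.2, -0.05)  # assumed bracket
A = (sum(g * math.exp(lam * ti) for g, ti in zip(gamma, t))
     / sum(math.exp(2 * lam * ti) for ti in t))

print(f"lambda = {lam:.5f}, A = {A:.5f}")
```

Bisection is slow but robust: it needs only a sign change, not a derivative, which suits an f built from several exponential sums.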

Plot of data and regression curve

Relative Intensity After 24 hrs
The relative intensity of radiation after 24 hours is
$$\gamma = 0.99983\,e^{-0.11508(24)} = 6.3160 \times 10^{-2}.$$
This result implies that only
$$\frac{6.3160 \times 10^{-2}}{0.99983} \times 100 = 6.317\%$$
of the initial radioactive intensity is left after 24 hours.

Homework
What is the half-life of the technetium-99m isotope?
Compare the constants of this regression model with those of the model where the data is transformed.
Write a program in the language of your choice to find the constants of the model.

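Polynomial Model
Given $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, best fit $y = a_0 + a_1 x + \ldots + a_m x^m$ to a given data set.
Figure. Polynomial model for nonlinear regression of y vs. x data.

Polynomial Model cont.
The residual at each data point is given by
$$E_i = y_i - a_0 - a_1 x_i - \ldots - a_m x_i^m.$$
The sum of the squares of the residuals then is
$$S_r = \sum_{i=1}^{n} E_i^2 = \sum_{i=1}^{n} \left(y_i - a_0 - a_1 x_i - \ldots - a_m x_i^m\right)^2.$$

Polynomial Model cont.
To find the constants of the polynomial model, we set the derivatives of $S_r$ with respect to $a_k$, where $k = 0, 1, \ldots, m$, equal to zero:
$$\frac{\partial S_r}{\partial a_k} = \sum_{i=1}^{n} 2\left(y_i - a_0 - a_1 x_i - \ldots - a_m x_i^m\right)\left(-x_i^k\right) = 0, \qquad k = 0, 1, \ldots, m.$$

Polynomial Model cont.
These equations in matrix form are given by
$$\begin{bmatrix} n & \sum x_i & \cdots & \sum x_i^m \\ \sum x_i & \sum x_i^2 & \cdots & \sum x_i^{m+1} \\ \vdots & \vdots & \ddots & \vdots \\ \sum x_i^m & \sum x_i^{m+1} & \cdots & \sum x_i^{2m} \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_m \end{bmatrix} = \begin{bmatrix} \sum y_i \\ \sum x_i y_i \\ \vdots \\ \sum x_i^m y_i \end{bmatrix}.$$
The above equations are then solved for $a_0, a_1, \ldots, a_m$.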

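Example 2 - Polynomial Model
Regress the thermal expansion coefficient vs. temperature data to a second-order polynomial.
Table. Data points for temperature vs. coefficient of thermal expansion.
  Temperature, T (°F)    Coefficient of thermal expansion, α (in/in/°F)
  80                     6.47×10^-6
  40                     6.24×10^-6
  -40                    5.72×10^-6
  -120                   5.09×10^-6
  -200                   4.30×10^-6
  -280                   3.33×10^-6
  -340                   2.45×10^-6
Figure. Data points for thermal expansion coefficient vs. temperature.

Example 2 - Polynomial Model cont.
We are to fit the data to the polynomial regression model $\alpha = a_0 + a_1 T + a_2 T^2$. The coefficients $a_0, a_1, a_2$ are found by differentiating the sum of the squares of the residuals with respect to each coefficient and setting the results equal to zero to obtain
$$\begin{bmatrix} n & \sum T_i & \sum T_i^2 \\ \sum T_i & \sum T_i^2 & \sum T_i^3 \\ \sum T_i^2 & \sum T_i^3 & \sum T_i^4 \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} \sum \alpha_i \\ \sum T_i \alpha_i \\ \sum T_i^2 \alpha_i \end{bmatrix}.$$

Example 2 - Polynomial Model cont.
The necessary summations for the seven data points in the table above (n = 7) are as follows:
$$\sum T_i = -8.6000 \times 10^{2}, \quad \sum T_i^2 = 2.5800 \times 10^{5}, \quad \sum T_i^3 = -7.0472 \times 10^{7}, \quad \sum T_i^4 = 2.1363 \times 10^{10},$$
$$\sum \alpha_i = 3.3600 \times 10^{-5}, \quad \sum T_i \alpha_i = -2.6978 \times 10^{-3}, \quad \sum T_i^2 \alpha_i = 8.5013 \times 10^{-1}.$$

Example 2 - Polynomial Model cont.
Using these summations, we can now calculate $a_0, a_1, a_2$ from
$$\begin{bmatrix} 7.0000 & -8.6000 \times 10^{2} & 2.5800 \times 10^{5} \\ -8.6000 \times 10^{2} & 2.5800 \times 10^{5} & -7.0472 \times 10^{7} \\ 2.5800 \times 10^{5} & -7.0472 \times 10^{7} & 2.1363 \times 10^{10} \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} 3.3600 \times 10^{-5} \\ -2.6978 \times 10^{-3} \\ 8.5013 \times 10^{-1} \end{bmatrix}.$$
Solving the above system of simultaneous linear equations, we have
$$a_0 = 6.0217 \times 10^{-6}, \quad a_1 = 6.2782 \times 10^{-9}, \quad a_2 = -1.2218 \times 10^{-11}.$$
The polynomial regression model is then
$$\alpha = 6.0217 \times 10^{-6} + 6.2782 \times 10^{-9}\,T - 1.2218 \times 10^{-11}\,T^2.$$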
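A possible way to carry out the polynomial regression in code is sketched below: build the normal equations (entry (j, k) of the matrix is $\sum T_i^{j+k}$, entry j of the right-hand side is $\sum T_i^j \alpha_i$) and solve them by Gaussian elimination with partial pivoting. The helper names `solve_linear` and `polyfit_normal` are assumptions for this illustration; the slides do not say which solver was used.

```python
def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def polyfit_normal(xs, ys, order):
    """Coefficients [a0, ..., a_order] from the least-squares normal equations."""
    size = order + 1
    matrix = [[sum(x ** (j + k) for x in xs) for k in range(size)]
              for j in range(size)]
    rhs = [sum(y * x ** j for x, y in zip(xs, ys)) for j in range(size)]
    return solve_linear(matrix, rhs)


# Thermal expansion data from the Example 2 table
temps = [80.0, 40.0, -40.0, -120.0, -200.0, -280.0, -340.0]        # deg F
alphas = [6.47e-6, 6.24e-6, 5.72e-6, 5.09e-6, 4.30e-6, 3.33e-6, 2.45e-6]

a0, a1, a2 = polyfit_normal(temps, alphas, 2)
print(f"a0 = {a0:.4e}, a1 = {a1:.4e}, a2 = {a2:.4e}")
```

The matrix entries span many orders of magnitude (from 7 to about 2×10^10), so pivoting is a sensible precaution; for higher-order polynomials the normal equations become increasingly ill-conditioned.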

Linearization of Data
Finding the constants of many nonlinear models requires solving simultaneous nonlinear equations. For mathematical convenience, the data for some of these models can be linearized. For example, the data for an exponential model can be linearized. As shown in the previous example, many chemical and physical processes are governed by the equation
$$y = ae^{bx}.$$
Taking the natural log of both sides yields
$$\ln y = \ln a + bx.$$
Let $z = \ln y$ and $a_0 = \ln a$ (implying $a = e^{a_0}$), with $a_1 = b$. We now have a linear regression model where
$$z = a_0 + a_1 x.$$

Linearization of data cont.
Using linear model regression methods,
$$a_1 = \frac{n\sum x_i z_i - \sum x_i \sum z_i}{n\sum x_i^2 - \left(\sum x_i\right)^2}, \qquad a_0 = \bar{z} - a_1\bar{x}.$$
Once $a_0$ and $a_1$ are found, the original constants of the model are found as
$$b = a_1, \qquad a = e^{a_0}.$$

Example 3 - Linearization of data
Many patients get concerned when a test involves injection of a radioactive material. For example, for scanning a gallbladder, a few drops of technetium-99m isotope are used. Half of the technetium-99m would be gone in about 6 hours. It, however, takes about 24 hours for the radiation levels to reach what we are exposed to in day-to-day activities. Below is given the relative intensity of radiation as a function of time.
Table. Relative intensity of radiation as a function of time.
  t (hrs)    0        1        3        5        7        9
  γ          1.000    0.891    0.708    0.562    0.447    0.355
Figure. Data points of relative radiation intensity vs. time.

Example 3 - Linearization of data cont.
The relative intensity is related to time by the equation $\gamma = Ae^{\lambda t}$. Find:
a) The value of the regression constants A and λ
b) The half-life of technetium-99m
c) The radiation intensity after 24 hours

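Example 3 - Linearization of data cont.
The exponential model is given as $\gamma = Ae^{\lambda t}$, so that $\ln \gamma = \ln A + \lambda t$. Assuming $z = \ln \gamma$, $a_0 = \ln A$ and $a_1 = \lambda$, we obtain
$$z = a_0 + a_1 t.$$
This is a linear relationship between z and t.

Example 3 - Linearization of data cont.
Using this linear relationship, we can calculate $a_0$ and $a_1$,
$$a_1 = \frac{n\sum t_i z_i - \sum t_i \sum z_i}{n\sum t_i^2 - \left(\sum t_i\right)^2}, \qquad a_0 = \bar{z} - a_1\bar{t},$$
where $a_1 = \lambda$ and $a_0 = \ln A$.

Example 3 - Linearization of Data cont.
Summations for data linearization are as follows.
Table. Summation data for linearization of data model.
  i      t_i       γ_i      z_i = ln γ_i    t_i z_i     t_i²
  1      0         1         0.00000         0.0000      0.0000
  2      1         0.891    -0.11541        -0.11541     1.0000
  3      3         0.708    -0.34531        -1.0359      9.0000
  4      5         0.562    -0.57625        -2.8813      25.000
  5      7         0.447    -0.80520        -5.6364      49.000
  6      9         0.355    -1.0356         -9.3207      81.000
  Sum    25.000             -2.8778         -18.990      165.00
With n = 6: $\sum t_i = 25.000$, $\sum z_i = -2.8778$, $\sum t_i z_i = -18.990$, $\sum t_i^2 = 165.00$.

Example 3 - Linearization of Data cont.
Calculating $a_0$ and $a_1$:
$$a_1 = \frac{6(-18.990) - (25.000)(-2.8778)}{6(165.00) - (25.000)^2} = -0.11505,$$
$$a_0 = \frac{-2.8778}{6} - (-0.11505)\frac{25.000}{6} \approx -2.6 \times 10^{-4}.$$
Since $a_0 = \ln A$,
$$A = e^{a_0} = 0.99974,$$
also $\lambda = a_1 = -0.11505$.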
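The linearized fit can be reproduced in a few lines. The sketch below takes $z_i = \ln \gamma_i$, fits the straight line $z = a_0 + a_1 t$ by least squares and maps back with $\lambda = a_1$, $A = e^{a_0}$; the variable names are illustrative assumptions.

```python
import math

t = [0.0, 1.0, 3.0, 5.0, 7.0, 9.0]                  # hours
gamma = [1.000, 0.891, 0.708, 0.562, 0.447, 0.355]  # relative intensity

# Transform the data: z = ln(gamma)
z = [math.log(g) for g in gamma]

n = len(t)
s_t = sum(t)
s_z = sum(z)
s_tt = sum(ti * ti for ti in t)
s_tz = sum(ti * zi for ti, zi in zip(t, z))

# Straight-line least squares on (t, z)
a1 = (n * s_tz - s_t * s_z) / (n * s_tt - s_t ** 2)
a0 = s_z / n - a1 * s_t / n

# Map back to the constants of gamma = A * exp(lam * t)
lam = a1
A = math.exp(a0)

print(f"lambda = {lam:.5f}, A = {A:.5f}")
```

Keep in mind that this minimizes the squared residuals of $\ln \gamma$, not of $\gamma$ itself, so the constants generally differ slightly from those of the untransformed nonlinear fit.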

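Example 3 - Linearization of Data cont.
The resulting model is
$$\gamma = 0.99974\,e^{-0.11505\,t}.$$
Figure. Relative intensity of radiation as a function of time using linearization of data model.

Example 3 - Linearization of Data cont.
The regression formula is then $\gamma = 0.99974\,e^{-0.11505\,t}$.
b) The half-life of technetium-99m is the time at which $\gamma = \frac{1}{2}\gamma\big|_{t=0}$:
$$0.99974\,e^{-0.11505\,t} = \frac{1}{2}(0.99974)\,e^{-0.11505(0)},$$
$$e^{-0.11505\,t} = 0.5,$$
$$-0.11505\,t = \ln(0.5),$$
$$t = 6.0248\ \text{hours}.$$

Example 3 - Linearization of Data cont.
c) The relative intensity of radiation after 24 hours is then
$$\gamma = 0.99974\,e^{-0.11505(24)} = 6.3200 \times 10^{-2}.$$
This implies that only
$$\frac{6.3200 \times 10^{-2}}{0.99974} \times 100 = 6.3216\%$$
of the radioactive material is left after 24 hours.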
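Parts b) and c) follow directly from the fitted constants. The sketch below assumes the linearized-fit values A = 0.99974 and λ = -0.11505 quoted in the comparison table of this deck and evaluates the half-life $t_{1/2} = \ln(0.5)/\lambda$ and the intensity at 24 hours; the variable names are illustrative.

```python
import math

# Constants of gamma = A * exp(lam * t) from the linearized fit
A = 0.99974
lam = -0.11505

# b) Half-life: exp(lam * t) = 0.5  ->  t = ln(0.5) / lam
half_life = math.log(0.5) / lam

# c) Relative intensity after 24 hours and the fraction of the initial value
gamma_24 = A * math.exp(lam * 24.0)
percent_left = gamma_24 / A * 100.0

print(f"half-life = {half_life:.4f} h")
print(f"gamma(24 h) = {gamma_24:.4e}, {percent_left:.4f}% of the initial intensity")
```

The half-life does not depend on A, which is why a small difference in A between the two fitting approaches barely affects it.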

Comparison
Comparison of exponential model with and without data linearization:
Table. Comparison for exponential model with and without data linearization.
                                      With data linearization    Without data linearization
                                      (Example 3)                (Example 1)
  A                                   0.99974                    0.99983
  λ                                   -0.11505                   -0.11508
  Half-life (hrs)                     6.0248                     6.0232
  Relative intensity after 24 hrs     6.3200×10^-2               6.3160×10^-2
The values are very similar, so data linearization was suitable for finding the constants of the nonlinear exponential model in this case.

ADEQUACY OF REGRESSION MODELS

Is this adequate? Straight Line Model

Quality of Fitted Data
Does the model describe the data adequately?
How well does the model predict the response variable?

Linear Regression Models
We limit our discussion to the adequacy of straight-line regression models.

Four checks
1. Plot the data and the model.
2. Find the standard error of estimate.
3. Calculate the coefficient of determination.
4. Check if the model meets the assumption of random errors.

Example: Check the adequacy of the straight-line model for the given data.
  T (°F)    α (μin/in/°F)
  -340      2.45
  -260      3.58
  -180      4.52
  -100      5.28
  -20       5.86
  60        6.36

1. Plot the data and the model

Data and model
The straight-line model fitted to the data above is $\alpha(T) = a_0 + a_1 T$ with $a_0 = 6.0325$ and $a_1 = 0.0096964$ (α in μin/in/°F, T in °F).

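2. Find the standard error of estimate

Standard error of estimate
$$s_{\alpha/T} = \sqrt{\frac{S_r}{n - 2}}, \qquad S_r = \sum_{i=1}^{n} \left(\alpha_i - a_0 - a_1 T_i\right)^2.$$

Standard Error of Estimate
  T_i     α_i     a_0 + a_1 T_i    α_i - a_0 - a_1 T_i
  -340    2.45    2.7357           -0.28571
  -260    3.58    3.5114            0.068571
  -180    4.52    4.2871            0.23286
  -100    5.28    5.0629            0.21714
  -20     5.86    5.8386            0.021429
  60      6.36    6.6143           -0.25429

Standard Error of Estimate
$$S_r = 0.25283, \qquad s_{\alpha/T} = \sqrt{\frac{0.25283}{6 - 2}} = 0.25141.$$

Scaled Residuals
$$\text{Scaled residual} = \frac{\alpha_i - a_0 - a_1 T_i}{s_{\alpha/T}}$$
95% of the scaled residuals need to be in [-2, 2].

Scaled Residuals
  T_i     α_i     Residual     Scaled residual
  -340    2.45    -0.28571     -1.1364
  -260    3.58     0.068571     0.27275
  -180    4.52     0.23286      0.92622
  -100    5.28     0.21714      0.86369
  -20     5.86     0.021429     0.085235
  60      6.36    -0.25429     -1.0115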
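The second check can be sketched in code as follows: fit the straight line, compute the residuals, the standard error of estimate $s = \sqrt{S_r/(n-2)}$ and the scaled residuals (residual divided by s). The helper name `linear_fit` is an illustrative assumption.

```python
import math

# Thermal expansion data for the adequacy example
temps = [-340.0, -260.0, -180.0, -100.0, -20.0, 60.0]  # deg F
alphas = [2.45, 3.58, 4.52, 5.28, 5.86, 6.36]          # micro-in/in/deg F


def linear_fit(xs, ys):
    """Return (a0, a1) of the least-squares line y = a0 + a1 * x."""
    n = len(xs)
    s_x, s_y = sum(xs), sum(ys)
    s_xx = sum(x * x for x in xs)
    s_xy = sum(x * y for x, y in zip(xs, ys))
    a1 = (n * s_xy - s_x * s_y) / (n * s_xx - s_x ** 2)
    return s_y / n - a1 * s_x / n, a1


a0, a1 = linear_fit(temps, alphas)
resid = [a - (a0 + a1 * t) for t, a in zip(temps, alphas)]

# Standard error of estimate uses n - 2 (two fitted constants)
s_r = sum(e * e for e in resid)
std_err = math.sqrt(s_r / (len(temps) - 2))

scaled = [e / std_err for e in resid]
within = sum(1 for v in scaled if -2.0 <= v <= 2.0)

print(f"a0 = {a0:.4f}, a1 = {a1:.7f}, s = {std_err:.5f}")
print(f"{within} of {len(scaled)} scaled residuals lie in [-2, 2]")
```

With only six points, the "95% within [-2, 2]" rule is a coarse screen; it flags gross outliers rather than proving the model adequate.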

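3. Find the coefficient of determination

Coefficient of determination
$$r^2 = \frac{S_t - S_r}{S_t}, \qquad S_t = \sum_{i=1}^{n} \left(y_i - \bar{y}\right)^2, \qquad S_r = \sum_{i=1}^{n} \left(y_i - a_0 - a_1 x_i\right)^2.$$

Sum of the squares of the residuals between the data and the mean
$$S_t = \sum_{i=1}^{n} \left(y_i - \bar{y}\right)^2$$
Figure. y vs. x data and the mean value of y.

Sum of the squares of the residuals between the observed and predicted values
$$S_r = \sum_{i=1}^{n} \left(y_i - a_0 - a_1 x_i\right)^2$$
Figure. y vs. x data and the regression line.

Limits of Coefficient of Determination
$$0 \le r^2 \le 1: \qquad S_r = 0 \Rightarrow r^2 = 1, \qquad S_r = S_t \Rightarrow r^2 = 0.$$

Calculation of S_t
  T_i     α_i     α_i - ᾱ
  -340    2.45    -2.2250
  -260    3.58    -1.0950
  -180    4.52    -0.15500
  -100    5.28     0.60500
  -20     5.86     1.1850
  60      6.36     1.6850
$$\bar{\alpha} = 4.6750, \qquad S_t = 10.783.$$

Calculation of S_r
  T_i     α_i     a_0 + a_1 T_i    α_i - a_0 - a_1 T_i
  -340    2.45    2.7357           -0.28571
  -260    3.58    3.5114            0.068571
  -180    4.52    4.2871            0.23286
  -100    5.28    5.0629            0.21714
  -20     5.86    5.8386            0.021429
  60      6.36    6.6143           -0.25429
$$S_r = 0.25283.$$

Coefficient of determination
$$r^2 = \frac{S_t - S_r}{S_t} = \frac{10.783 - 0.25283}{10.783} = 0.97655.$$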
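The third check, $r^2 = (S_t - S_r)/S_t$, can be sketched for the same six data points. Here $S_t$ is the sum of squared deviations from the mean and $S_r$ the sum of squared residuals about the fitted line; the line is refitted inside the snippet so it runs on its own, and the variable names are illustrative.

```python
temps = [-340.0, -260.0, -180.0, -100.0, -20.0, 60.0]  # deg F
alphas = [2.45, 3.58, 4.52, 5.28, 5.86, 6.36]          # micro-in/in/deg F

n = len(temps)
t_mean = sum(temps) / n
a_mean = sum(alphas) / n

# Least-squares straight line alpha = a0 + a1 * T (deviation form)
s_tt = sum((t - t_mean) ** 2 for t in temps)
s_ta = sum((t - t_mean) * (a - a_mean) for t, a in zip(temps, alphas))
a1 = s_ta / s_tt
a0 = a_mean - a1 * t_mean

# S_t: spread about the mean; S_r: spread about the regression line
S_t = sum((a - a_mean) ** 2 for a in alphas)
S_r = sum((a - a0 - a1 * t) ** 2 for t, a in zip(temps, alphas))

r_squared = (S_t - S_r) / S_t
print(f"S_t = {S_t:.3f}, S_r = {S_r:.5f}, r^2 = {r_squared:.5f}")
```

A value this close to 1 says the line explains most of the spread in α, but it does not by itself show that a straight line is the right model.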

Caution in use of $r^2$
Increase in spread of the regressor variable (x) in y vs. x increases $r^2$.
Large regression slope artificially yields high $r^2$.
Large $r^2$ does not measure appropriateness of the linear model.
Large $r^2$ does not imply the regression model will predict accurately.

Final Exam Grade

Final Exam Grade vs Pre-Req GPA

4. Model meets assumption of random errors

Model meets assumption of random errors
Residuals are negative as well as positive.
Variation of residuals as a function of the independent variable is random.
Residuals follow a normal distribution.
There is no autocorrelation between the data points.

Thermal expansion coefficient vs. temperature
  T       α
  60      6.36
  40      6.24
  20      6.12
  0       6.00
  -20     5.86
  -40     5.72
  -60     5.58
  -80     5.43
  -100    5.28
  -120    5.09
  -140    4.91
  -160    4.72
  -180    4.52
  -200    4.30
  -220    4.08
  -240    3.83
  -280    3.33
  -300    3.07
  -320    2.76
  -340    2.45

Data and model

Plot of Residuals

Histograms of Residuals

Check for Autocorrelation
Find the number of times, q, the sign of the residual changes for the n data points. If
$$\frac{n-1}{2} - \sqrt{n-1} \le q \le \frac{n-1}{2} + \sqrt{n-1},$$
you most likely do not have autocorrelation.

Is there autocorrelation?

y vs. x fit and residuals
n = 40, q = 21. Is 13.3 ≤ 21 ≤ 25.7? Yes!

y vs. x fit and residuals
n = 40, q = 2. Is 13.3 ≤ 2 ≤ 25.7? No!
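The sign-change test can be sketched as below. The function names are illustrative, and treating a residual that is exactly zero as carrying no sign is an assumption made here; the slides do not say how to handle that case.

```python
import math


def count_sign_changes(residuals):
    """Number of times consecutive residuals change sign (zeros are skipped)."""
    signs = [r > 0 for r in residuals if r != 0]
    return sum(1 for prev, cur in zip(signs, signs[1:]) if prev != cur)


def autocorrelation_bounds(n):
    """Lower and upper limits (n-1)/2 -/+ sqrt(n-1) for the sign-change count."""
    half = (n - 1) / 2.0
    spread = math.sqrt(n - 1)
    return half - spread, half + spread


def passes_check(q, n):
    """True if q sign changes among n points is within the expected band."""
    lower, upper = autocorrelation_bounds(n)
    return lower <= q <= upper


lower, upper = autocorrelation_bounds(40)
print(f"n = 40: {lower:.1f} <= q <= {upper:.1f}")
print("q = 21:", passes_check(21, 40))
print("q = 2:", passes_check(2, 40))

# Residuals of the six-point straight-line example
example_residuals = [-0.28571, 0.068571, 0.23286, 0.21714, 0.021429, -0.25429]
q_example = count_sign_changes(example_residuals)
```

Too few sign changes suggests the residuals drift systematically (the model misses curvature); too many suggests an alternating pattern.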

What polynomial model to choose if one needs to be chosen?

First Order of Polynomial

Second Order Polynomial

Which model to choose?

Optimum Polynomial

Effect of an Outlier

Effect of Outlier
