Samantha Bellah Adv. Stats Final Project Real Estate Forecasting Regression Model Market: Highland Park Neighborhood Data Sources: Zillow.com E:\PuebloRESales2014Q1Q2.xlsx.


1 Samantha Bellah Adv. Stats Final Project Real Estate Forecasting Regression Model Market: Highland Park Neighborhood Data Sources: Zillow.com E:\PuebloRESales2014Q1Q2.xlsx

2 Sample Data Set

3 Regression Analysis (Model Runs, Variable Selection)

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.651399
R Square             0.42432
Adjusted R Square    0.005644
Standard Error       14752.05
Observations         20

ANOVA
              df    SS          MS          F           Significance F
Regression    8     1.76E+09    2.21E+08    1.013481    0.478104
Residual      11    2.39E+09    2.18E+08
Total         19    4.16E+09

                 Coefficients    Standard Error    t Stat      P-value     Lower 95%
Intercept        1898368         2309130           0.822114    0.428473    -3183993
Total SqFt       49.30321        24.89031          1.980819    0.073167    -5.47999
Garage sq ft     36.41012        74.53567          0.488493    0.634794    -127.642
Number floors    -16845          17836.57          -0.94441    0.365244    -56103
Detached         -24669.9        51212.19          -0.48172    0.639446    -137387
Attached         -12285          24717.6           -0.49701    0.628964    -66688
Year built       -957.546        1196.24           -0.80046    0.440388    -3590.45
Bedrooms         2296.094        9728.698          0.236012    0.81776     -19116.6
Lot SqFt         3.047115        3.116469          0.977746    0.349214    -3.81219

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.610366
R Square             0.372546
Adjusted R Square    0.254899
Standard Error       12769.95
Observations         20

ANOVA
              df    SS          MS          F          Significance F
Regression    3     1.55E+09    5.16E+08    3.16663    0.053216
Residual      16    2.61E+09    1.63E+08
Total         19    4.16E+09

                 Coefficients    Standard Error    t Stat      P-value     Lower 95%
Intercept        38110.05        25772.98          1.478682    0.158642    -16526.2
Total SqFt       50.42877        18.28529          2.757887    0.014005    11.66568
Number floors    -27994.5        10230.5           -2.73638    0.014638    -49682.2
Lot SqFt         3.690097        2.348773          1.571074    0.135729    -1.28908

4 Presentation/Description of Final Model

The second regression output on slide 3, with an R Square of .37, represents my final model. It is not a great fit according to R Square: only 37% of the variation in Selling Price is explained by the variables in my model (total square feet, number of floors, and lot square feet). The Significance F is fairly low (.053, just above the usual .05 cutoff), so the results are borderline statistically significant, but there are probably other predictors that are more reliable. The P-values show that the worst predictor in my model is lot square feet, with a P-value of .14, which is too high for it to be significant on its own. But when I took it out of the model, the Significance F rose to .06 and the R Square dropped to .28, so the overall model seemed a better fit with lot square feet included.
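As a rough check on the comparison above, the Adjusted R Square and F values reported on slide 3 can be recomputed from R Square, the number of observations, and the number of predictors using the standard textbook formulas. The sketch below is only an illustration; the function names adjusted_r2 and f_statistic are made up for this example and are not part of the original Excel output.

```python
# Recompute Adjusted R Square and F from the figures reported on slide 3.
# adjusted_r2 and f_statistic are hypothetical helper names for this sketch.

def adjusted_r2(r2, n, p):
    """Adjusted R Square for n observations and p predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


def f_statistic(r2, n, p):
    """Overall F statistic: (R2 / p) divided by ((1 - R2) / (n - p - 1))."""
    return (r2 / p) / ((1 - r2) / (n - p - 1))


N = 20  # observations in both model runs

# First run: all eight predictors, R Square = 0.42432
full_adj = adjusted_r2(0.42432, N, 8)
full_f = f_statistic(0.42432, N, 8)

# Final model: three predictors, R Square = 0.372546
final_adj = adjusted_r2(0.372546, N, 3)
final_f = f_statistic(0.372546, N, 3)

print(f"Eight predictors: adjusted R2 = {full_adj:.6f}, F = {full_f:.4f}")
print(f"Three predictors: adjusted R2 = {final_adj:.6f}, F = {final_f:.4f}")
```

The eight-predictor run has the higher R Square, but its Adjusted R Square is close to zero because the adjustment penalizes the extra variables. This is consistent with preferring the three-variable model.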

5 Residual Analysis There don’t seem to be any outliers that clearly stand out. The formula used to find the predicted selling price was as follows: Y^= 38110.05+50.4*(total sq ft)-27994.5*(number floors)+3.7*(lot sq ft)

6 Model Application

Address                 Bedrooms    Total SqFt    Selling Price    # floors    Garage      Garage sq ft    Bath    Lot SqFt    House Age
1) 2936 Azalea St       3           1020          $88,000          1           none        0               2       6534        54
2) 3909 Sheffield Ln    4           1544          $120,000         3           detached    288             2       6011        43

Y^= 38110.05+50.4*(total sq ft)-27994.5*(number floors)+3.7*(lot sq ft)

1) 38110.05+50.4*(1020)-27994.5*(1)+3.7*(6534) = $85,699
2) 38110.05+50.4*(1544)-27994.5*(3)+3.7*(6011) = $54,185

My model's prediction for House 1 was about $2,300 below the actual selling price, while the predicted price for House 2 was less than half of the actual selling price. I knew my model was not a great predictor of selling price based on the results from the regression analysis. I tried using the different variables I had to run a better model, but the best model I could find with the data I had originally collected never gave me great results. If I were to do the project again, I would probably try running the model with a different variable, such as distance from the schools. The best model I could make based on my data obviously wasn't great, because important factors such as number of bedrooms and bathrooms weren't taken into consideration when determining the selling price.
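The two predictions on this slide can be reproduced with a short script. This is a minimal sketch using the rounded coefficients from the formula on the slide; the function name predict_price and the layout of the houses list are assumptions made for illustration only.

```python
# Apply the final model's rounded coefficients to the two test houses.
# predict_price and the tuple layout below are hypothetical, for illustration.

INTERCEPT = 38110.05
COEF_TOTAL_SQFT = 50.4
COEF_FLOORS = -27994.5
COEF_LOT_SQFT = 3.7


def predict_price(total_sqft, floors, lot_sqft):
    """Predicted selling price from the three-variable regression formula."""
    return (INTERCEPT
            + COEF_TOTAL_SQFT * total_sqft
            + COEF_FLOORS * floors
            + COEF_LOT_SQFT * lot_sqft)


# (address, total sq ft, number of floors, lot sq ft, actual selling price)
houses = [
    ("2936 Azalea St", 1020, 1, 6534, 88000),
    ("3909 Sheffield Ln", 1544, 3, 6011, 120000),
]

results = []
for address, sqft, floors, lot, actual in houses:
    predicted = predict_price(sqft, floors, lot)
    error = predicted - actual  # negative means the model under-predicted
    results.append((address, predicted, error))
    print(f"{address}: predicted ${predicted:,.0f}, actual ${actual:,}, "
          f"error ${error:,.0f} ({error / actual:.1%})")
```

Most of the miss on House 2 comes from the number-of-floors term: three floors subtract about $84,000 from the prediction, more than the total square footage term adds.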

