Published by Ulla Andersson. Modified over 5 years ago.
Analysis for Predicting the Selling Price of Apartments
Pratik Nikte
Abstract
We used the Ames, Iowa housing dataset from Kaggle to analyze and predict house sale prices. We applied three machine learning techniques: Linear Regression, Random Forest, and XGBoost.

Introduction
The problem we address is building models that predict house sale prices. The dataset lets us see which variables have a direct impact on sale price. We used built-in machine learning libraries in the R programming language.

Project Plan
Figure 1: Project Plan

Exploratory Data Analysis
Missing values: replaced NAs with 0.
Data formatting: converted categorical variables to numeric.
Target transformation: transformed Sale Price to log(Sale Price + 1), because our results are evaluated by RMSE on the log scale and the transformation reduces the effect of high-end outliers.
Figure 2: Right Skewed Distribution
Figure 3: Normalized Distribution

Feature Selection
The dataset has 79 variables in total: 46 categorical and 33 continuous.
Figure 4: Multi-Correlation Matrix
Parameters' Impact on Sale Price

Model Selection & Comparison
Linear Regression: the model assumes that the relationship between the dependent variable (sale price) and the independent variables (features) is linear.
Random Forest: the model performs its own feature selection and builds multiple decision trees, whose combined predictions we use to estimate the sale price.
XGBoost: we use the training data to predict the target variable. With each boosting iteration, XGBoost adds a tree that reduces the error left by the earlier trees.

Parameters chosen for the XGBoost model:
param <- list(colsample_bytree = .7, subsample = .7, booster = "gbtree", max_depth = 10, eta = 0.02, eval_metric = "rmse", objective = "reg:linear")

Results
XGBoost had the lowest RMSE and was more accurate than the other models. We evaluated the XGBoost model with cross-validation, then applied it to our test dataset to predict Sale Price.
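The preprocessing and evaluation steps described above can be sketched in base R. This is a minimal illustration, not the authors' code: the toy data frame, its values, the `rmse` helper, and the hand-picked prediction offsets are all assumptions made for the example.

```r
# Toy data standing in for the housing data (values are made up for illustration).
houses <- data.frame(
  LotFrontage  = c(65, NA, 80, 60),
  Neighborhood = c("CollgCr", "Veenker", "CollgCr", "NoRidge"),
  SalePrice    = c(208500, 181500, 223500, 140000),
  stringsAsFactors = FALSE
)

# Replace missing numeric values with 0.
houses$LotFrontage[is.na(houses$LotFrontage)] <- 0

# Convert a categorical column to integer codes (one code per distinct level).
houses$Neighborhood <- as.integer(factor(houses$Neighborhood))

# Transform the target: log1p(x) computes log(x + 1).
houses$LogSalePrice <- log1p(houses$SalePrice)

# Hypothetical helper: root mean squared error between two numeric vectors.
rmse <- function(actual, predicted) {
  sqrt(mean((actual - predicted)^2))
}

# Hypothetical predictions on the log scale (offsets chosen by hand, not model output).
predicted_log <- houses$LogSalePrice + c(0.05, -0.02, 0.01, -0.04)

# RMSE computed on the log scale, as in the evaluation described above.
log_rmse <- rmse(houses$LogSalePrice, predicted_log)

# Map log-scale predictions back to prices: expm1(x) computes exp(x) - 1.
predicted_price <- expm1(predicted_log)
```

Because the error is measured on log(Sale Price + 1), a given percentage miss counts about the same for a cheap house as for an expensive one, which is one way the transformation limits the influence of high-end outliers.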