Analysis for Predicting the Selling Price of Apartments Pratik Nikte


Abstract

In our analysis we used the Ames, Iowa Housing Dataset from the Kaggle website to analyze and predict the sale price of houses. We used three machine learning techniques: Linear Regression, Random Forest, and XGBoost.

Introduction

The problem we are trying to solve with this dataset is to build models that predict house sale prices. The dataset lets us see how some variables have a direct impact on the sale price of a house. We used built-in machine learning libraries in the R programming language while working on this problem.

Project Plan

Figure 1: Project Plan

Exploratory Data Analysis

Fixing NAs: NA -> 0 value
Data formatting: categorical -> numeric variables
Transformed Sale Price to log(Sale Price + 1), because our result analysis is based on the RMSE of the log error and because the transformation reduces the effect of high-end outliers

Figure 2: Right Skewed Distribution
Figure 3: Normalized Distribution

Feature Selection

Figure 4: Multi-Correlation Matrix
Feature Selection Parameters Impact on Sale Price

46 Categorical Variables
33 Continuous Variables
79 Total Variables

Model Selection & Comparison

Linear Regression: The linear regression model assumes that the relationship between the dependent variable (sale price) and the independent variables (features) is linear.

Random Forest: Random Forest does its own feature selection and creates multiple decision trees, which help us determine how accurately the sale price is predicted.

XGBoost: We use the training data to predict the target variable. With every iteration, XGBoost reduces the error rate compared to the first tree.

Choose the parameters for the model:

    param <- list(colsample_bytree = .7, subsample = .7, booster = "gbtree", max_depth = 10, eta = 0.02, eval_metric = "rmse", objective = "reg:linear")

Results

Our results showed that XGBoost had the lowest RMSE score and was more accurate than the other models. We evaluated the XGBoost model by performing cross-validation. We then applied the XGBoost model to our test dataset and predicted the Sale Price.
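The log(Sale Price + 1) transformation and the RMSE-on-log-scale evaluation mentioned in the poster can be sketched in base R as follows. This is only an illustration under stated assumptions: the prices and prediction offsets are made-up values, not Ames data, and the names `sale_price`, `pred_log`, and `rmse` are hypothetical rather than taken from the original analysis.

```r
# Hypothetical sale prices (illustrative only, not taken from the Ames data).
sale_price <- c(120000, 155000, 210000, 755000)

# log(SalePrice + 1): compresses the long right tail of the price distribution.
log_price <- log1p(sale_price)

# Root mean squared error between two numeric vectors.
rmse <- function(actual, predicted) {
  sqrt(mean((actual - predicted)^2))
}

# Hypothetical model predictions on the log scale (assumed offsets).
pred_log <- log_price + c(0.05, -0.02, 0.01, -0.10)

# RMSE of the log error, the metric the poster reports.
rmse_log <- rmse(log_price, pred_log)

# Back-transform predictions to the original price scale with exp(x) - 1.
pred_price <- expm1(pred_log)

print(rmse_log)
print(round(pred_price))
```

Because the error is measured on the log scale, a given relative miss on an expensive house counts about the same as the same relative miss on a cheap one, which is one way to read the poster's remark about reducing the effect of high-end outliers.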

