Predict House Sales Price

Predict House Sales Price. Prof: Meiliu Lu. Team: Sindhura Kilaru, Mrunal Makwana

Problem Buying a house is a major, one-of-a-kind decision for each buyer. The main objective is to find the best price at which a client can sell their house.

Dataset Each house in the dataset is described by 79 different features reflecting clients' needs, such as lot area, pool area, utilities, and neighborhood. Satisfying all of those needs and finding the right choice for a client by hand is tedious.

Data Preprocessing The dataset has a large number of parameters and many irrelevant (null) values, so we keep only the numeric values, use 10-fold cross-validation, and normalize each feature using its minimum, maximum, and scale. Even then, finding the best features among them is a challenge; a sketch of these steps is shown below.
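A minimal sketch of these preprocessing steps in R, assuming the Kaggle training file train.csv with the target column SalePrice (the file name, the simple imputation, and the object names are assumptions, not the original code):

```r
# Sketch of the preprocessing described above (assumed column names
# follow the Kaggle house-prices CSV, e.g. "SalePrice").
raw <- read.csv("train.csv", stringsAsFactors = FALSE)

# Keep only numeric columns; drop mostly-null and constant columns.
num <- raw[sapply(raw, is.numeric)]
num <- num[, colMeans(is.na(num)) < 0.5]
num <- num[, sapply(num, function(x) length(unique(na.omit(x))) > 1)]
num[is.na(num)] <- 0                      # simple imputation (assumption)

# Min-max normalization ("maximum, minimum and scale").
minmax <- function(x) (x - min(x)) / (max(x) - min(x))
scaled <- as.data.frame(lapply(num, minmax))

# Fold labels for 10-fold cross-validation.
set.seed(1)
fold <- sample(rep(1:10, length.out = nrow(scaled)))
```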

Feature Selection Feature selection is the process of choosing a subset of relevant features (variables, predictors) for use in model construction. We use the Boruta package from R, which finds relevant features by comparing each original attribute's importance with the importance achievable at random, estimated from permuted (shadow) copies of the attributes. A sketch of this step follows.
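A possible Boruta call, continuing from the preprocessing sketch above (the object name scaled is an assumption carried over from that sketch):

```r
# Boruta all-relevant feature selection: each attribute's importance is
# compared against "shadow" attributes built from permuted copies.
library(Boruta)

set.seed(1)
boruta_out <- Boruta(SalePrice ~ ., data = scaled, doTrace = 1)

# Attributes confirmed to beat their shadow copies.
selected <- getSelectedAttributes(boruta_out, withTentative = FALSE)
print(selected)
plot(boruta_out, las = 2, cex.axis = 0.6)   # importance boxplots
```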

Feature Selection

Machine Learning Techniques To find the best results, we chose to implement the following techniques: Neural Network, Decision Tree, and Multiple Linear Regression.

Multiple Linear Regression Multiple linear regression attempts to model the relationship between two or more explanatory variables and a response variable by fitting a linear equation to the observed data. A linear relationship is assumed between the dependent variable and the independent variables. A sketch of the fit is shown below.
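A minimal sketch in R, using only the Boruta-confirmed features and holding out one of the 10 folds (scaled, selected, and fold come from the earlier sketches and are assumed names, not the authors' code):

```r
# Multiple linear regression on the Boruta-selected features.
dat   <- scaled[, c(selected, "SalePrice")]
test  <- dat[fold == 1, ]                 # hold out one fold
train <- dat[fold != 1, ]

mlr <- lm(SalePrice ~ ., data = train)
summary(mlr)                              # coefficients, R-squared

pred <- predict(mlr, newdata = test)
sqrt(mean((pred - test$SalePrice)^2))     # RMSE on the held-out fold
```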

Decision Tree A decision tree is built on the relevant values only. A decision tree depicts rules for dividing data into groups: the first rule splits the entire data set into some number of pieces, and then further rules may be applied to each piece, with different rules for different pieces, forming a second generation of pieces. A sketch of such a tree follows.
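A sketch of a regression tree with the rpart package (a common choice in R; the tutorial cited in the references may use different settings). It reuses the train and test objects from the regression sketch:

```r
# Regression tree: each split is a rule that divides the data into
# pieces, and further rules split the pieces again.
library(rpart)

tree <- rpart(SalePrice ~ ., data = train, method = "anova")
printcp(tree)                             # where each split reduces error
plot(tree); text(tree, cex = 0.7)         # the rules as a tree diagram

pred_tree <- predict(tree, newdata = test)
sqrt(mean((pred_tree - test$SalePrice)^2))
```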

Continued.

Neural Network Results with neural networks depend on the number of hidden layers chosen. Since the inputs are huge in number, feeding the network only the relevant, numeric values was the best option. A neural network with 15 hidden layers and 40 inputs took 2-3 minutes in our experiment; a sketch follows.
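A minimal sketch with the neuralnet package, reusing train, test, and selected from the earlier sketches. The slide mentions 15 hidden layers and 40 inputs; the sketch below assumes the more common configuration of a single hidden layer with 15 units, which is an assumption rather than the authors' exact setup:

```r
# Feed-forward network on the min-max-scaled, Boruta-selected inputs.
library(neuralnet)

f  <- as.formula(paste("SalePrice ~", paste(selected, collapse = " + ")))
nn <- neuralnet(f, data = train, hidden = 15, linear.output = TRUE,
                stepmax = 1e6)

pred_nn <- compute(nn, test[, selected])$net.result
sqrt(mean((pred_nn - test$SalePrice)^2))  # RMSE on the held-out fold
plot(nn)                                  # network diagram with weights
```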

Neural Network With only the relevant values used for the train and test data. For Train: steps: 74, error: 3.74238, time: 0.21 secs.

Continued. For Test: steps: 21, error: 0.478, time: 0.01 secs.

Continued. steps: 253, error: 0.53267, time: 2.91 secs.

References
Dataset: https://www.kaggle.com/c/house-prices-advanced-regression-techniques
Boruta package: http://www.cybaea.net/journal/2010/11/15/Feature-selection-All-relevant-selection-with-the-Boruta-package/
Neural network: https://www.r-bloggers.com/fitting-a-neural-network-in-r-neuralnet-package/
Multiple linear regression: professor's sample code
Decision tree: http://rstatistics.net/decision-trees-with-r/