Bike Sharing Demand Prediction PRESENTED BY: AKSHAY PATIL 14MCB1031 RESEARCH FACILITATOR: PROF. BVANSS PRABHAKAR RAO M.TECH 1ST YEAR RBL.


Bike Sharing Demand Prediction PRESENTED BY: AKSHAY PATIL 14MCB1031 RESEARCH FACILITATOR: PROF. BVANSS PRABHAKAR RAO M.TECH 1ST YEAR RBL FIRST REVIEW PRESENTATION VIT-CHENNAI.

Objective  Primary Objective: To build a superior statistical model that predicts the number of bicycles rented, given the available data.  Secondary Objectives: 1) To learn how real-time data is represented in datasets. 2) To understand how to pre-process such data. 3) To compare the results achieved by various machine learning techniques such as regression, decision trees, random forests and SVMs.

Research Scope  Introduction to Bike Sharing Systems.  Use of Data Analysis in such Systems.

Literature Survey  Regression: function used: lm (stats package)  Decision Trees: functions used: rpart (rpart package), ctree (party/partykit packages)  Random Forests: package used: randomForest  SVM: package used: e1071
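
The modelling functions listed on this slide share R's formula interface. The following is a minimal sketch of that interface using lm and rpart on synthetic data; the column names temp, hour and count and all values are assumptions made for illustration, not taken from the slides.

```r
library(rpart)  # recommended package shipped with standard R installations

set.seed(1)

# Synthetic stand-in data; temp, hour and count are hypothetical columns.
n <- 200
toy <- data.frame(
  temp = runif(n, 0, 35),
  hour = sample(0:23, n, replace = TRUE)
)
toy$count <- round(5 + 4 * toy$temp + 10 * (toy$hour %in% 7:19) + rnorm(n, sd = 5))

# Linear regression and a regression tree fitted with the same formula.
fit_lm   <- lm(count ~ temp + hour, data = toy)
fit_tree <- rpart(count ~ temp + hour, data = toy, method = "anova")

# Predict rentals for two hypothetical new observations.
new_obs   <- data.frame(temp = c(10, 30), hour = c(8, 22))
pred_lm   <- predict(fit_lm, newdata = new_obs)
pred_tree <- predict(fit_tree, newdata = new_obs)
```

randomForest::randomForest and e1071::svm accept the same formula and data arguments, so they could be swapped in for comparison once those packages are installed.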

Proposed Methodology  1) Fetch & Analyze Data  2) Pre-processing: Clean Data, Partition Data, Remove Missing Data, Create New Factors  3) Build a Prediction Model  4) Validate the Model  5) Predict Values for Test Data
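
The steps on this slide can be sketched end to end in base R. This is an illustrative outline only: the data are synthetic, the 70/30 partition and the RMSE metric are assumptions (the slides name neither), and lm stands in for whichever model is eventually chosen.

```r
set.seed(42)

# Fetch: synthetic stand-in data with hypothetical columns temp, humidity, count.
n <- 300
dat <- data.frame(temp = runif(n, 0, 35), humidity = runif(n, 20, 100))
dat$count <- 20 + 5 * dat$temp - 0.2 * dat$humidity + rnorm(n, sd = 8)
dat$temp[sample(n, 10)] <- NA  # inject 10 missing values for the cleaning step

# Clean: remove rows with missing data.
clean <- na.omit(dat)

# Partition: hold out 30% of the cleaned rows for validation (assumed ratio).
idx     <- sample(nrow(clean), size = floor(0.7 * nrow(clean)))
fit_set <- clean[idx, ]
val_set <- clean[-idx, ]

# Build the prediction model on the fitting partition.
model <- lm(count ~ temp + humidity, data = fit_set)

# Validate: root mean squared error on the held-out partition.
val_pred <- predict(model, newdata = val_set)
rmse     <- sqrt(mean((val_set$count - val_pred)^2))
```

The final step, predicting values for the test data, would reuse predict() with the test set passed as newdata.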

About Data:  The training set comprises the first 19 days of each month, while the test set covers the 20th to the end of each month, of year 2011 and  Training Data: observations of 12 variables.  Test Data: 6493 observations of 9 variables.
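
The day-of-month rule described on this slide can be illustrated as follows. The dataset is presumably supplied already split, so this sketch only demonstrates the rule on hypothetical daily timestamps for a single month; the data frame and column name are assumptions.

```r
# Hypothetical observations: one timestamp per day of January 2011.
stamps <- seq(
  as.POSIXct("2011-01-01 00:00:00", tz = "UTC"),
  as.POSIXct("2011-01-31 00:00:00", tz = "UTC"),
  by = "day"
)
obs <- data.frame(datetime = stamps)

# Extract the day of the month (1 to 31) from each timestamp.
day_of_month <- as.POSIXlt(obs$datetime)$mday

# Days 1 to 19 form the training set; day 20 onwards forms the test set.
train <- obs[day_of_month <= 19, , drop = FALSE]
test  <- obs[day_of_month >= 20, , drop = FALSE]
```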

Dataset Description

Implementation Tools  R  Weka

Work Done:  Understanding Data  Factorize training set and test set  Create time column by stripping out timestamp  Create new timestamp column  Create day of week column  Create and factorize Sunday variable
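
A minimal sketch of these feature-engineering steps in base R is shown below. The toy data frame and its column names (datetime, season, weather, count) are assumptions for illustration; the actual script used for the project is not shown in the slides.

```r
# Toy stand-in for the training set; column names and values are hypothetical.
train <- data.frame(
  datetime = c("2011-01-01 00:00:00", "2011-01-02 13:00:00", "2011-01-03 18:00:00"),
  season   = c(1, 1, 1),
  weather  = c(1, 2, 1),
  count    = c(16, 93, 157),
  stringsAsFactors = FALSE
)

# Factorize categorical columns that are stored as numbers.
train$season  <- factor(train$season)
train$weather <- factor(train$weather)

# Create a time column by stripping the time of day out of the timestamp string.
train$time <- factor(substring(train$datetime, 12, 19))

# Create a new, parsed timestamp column.
train$timestamp <- as.POSIXct(train$datetime, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Create a day-of-week column (wday: 0 = Sunday, ..., 6 = Saturday).
wday <- as.POSIXlt(train$timestamp)$wday
train$day <- factor(
  wday,
  levels = 0:6,
  labels = c("Sunday", "Monday", "Tuesday", "Wednesday",
             "Thursday", "Friday", "Saturday")
)

# Create and factorize a Sunday indicator (1 = Sunday, 0 = any other day).
train$sunday <- factor(as.integer(wday == 0), levels = c(0, 1))
```

The same transformations would be applied to the test set so that both data frames carry identical factor levels.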

Factorized Data:

Timeline  Till 20th January: Finalizing RBL topic  20th January – 5th February: Understanding dataset and gaining domain knowledge  6th February – 20th February: Literature survey and methods  21st February – 20th March: Implementation  21st March – 10th April: Testing and improving model  11th April – 30th April: Writing paper

Stats:  “In the world of data analysis, Analysts require only 20% of the total project time in building the actual models, about 60% of the period is spent in understanding and pre-processing the data” - Mat McHogan, Data Scientist, SVDS.com

References  [1] Bike Sharing Demand:  [2] Fanaee-T, Hadi, and Gama, Joao, Event labeling combining ensemble detectors and background knowledge, Progress in Artificial Intelligence (2013): pp. 1-15, Springer Berlin Heidelberg.  [3] Decision Tree Learning: 20/www/mlbook/ch3.pdf  [4] A Tour of Machine Learning Algorithms: algorithms/

Any Suggestions?

Thank You