Occupancy data analytics and prediction: A case study


Occupancy data analytics and prediction: A case study
Xin Liang, Tianzhen Hong, Geoffrey Qiping Shen
Presented by: Debarun Das

Outline
- Background
- Applications
- Methodology
- Machine Learning Algorithms
- Case Study
- Conclusions
- Further Study

Background
Occupant presence has been modelled by:
- Fixed schedules
- Occupants categorized into groups (Early Bird, Timetable Complier, Flexible Worker), with each group assigned to a specific schedule
- Occupant presence satisfying a probability distribution (Binomial distribution, Poisson distribution, etc.)
- Analysis of practical observation data, limited to a single office or a few offices
References:
- T. Zhang, P.-O. Siebers, U. Aickelin, Modelling electricity consumption in office
- D. Wang, C.C. Federspiel, F. Rubinstein, Modeling occupancy in single person offices
- K. Sun, D. Yan, T. Hong, S. Guo, Stochastic modeling of overtime occupancy and its application in building energy simulation and calibration

Applications
- Train occupancy prediction
  - iRail Spitsgids from Belgium (https://irail.be/spitsgids)
  - Uses crowdsourced data for occupancy prediction
- Hospital occupancy prediction
  - Time series data
  - Statistical models of prediction
- Control of HVAC systems
  - Logistic regression model
  - Reduces electricity consumption
  - Improves user comfort
References:
- Steven J. Littig, Mark W. Isken, "Short term hospital occupancy prediction"
- Jie Shi, Nanpeng Yu, Weixin Yao, "Energy efficient building HVAC control algorithm with real-time occupancy prediction"

Methodology

Machine Learning Algorithms
- Unsupervised: clustering algorithm, k-means
- Supervised: decision tree learning, algorithm C4.5

k-means
- Discovers patterns of occupancy schedule
- Needs a predefined value of k
- Needs a predefined distance definition, chosen among:
  - Euclidean Distance
  - Correlation Similarity
  - Dynamic Time Warping
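The clustering step above can be sketched in a few lines. This is a minimal illustration rather than the setup used in the case study: the daily profiles, the initial centroids, and the function names `euclidean` and `kmeans` are made up for the example, and only the Euclidean option is shown. The mean-based centroid update is the natural choice for Euclidean distance; other distance definitions would need a different update rule.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(profiles, init_centroids, distance=euclidean, max_iter=100):
    """Plain k-means: assign each profile to its nearest centroid,
    then move each centroid to the mean of its members."""
    centroids = [list(c) for c in init_centroids]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda j: distance(p, centroids[j]))
            for p in profiles
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(len(centroids)):
            members = [p for p, l in zip(profiles, labels) if l == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical daily occupancy-rate profiles with three time slots each.
profiles = [[0.1, 0.8, 0.2], [0.2, 0.9, 0.1], [0.0, 0.3, 0.0], [0.1, 0.2, 0.1]]
labels, centroids = kmeans(profiles, init_centroids=[profiles[0], profiles[2]])
```

Passing the initial centroids explicitly keeps the sketch deterministic; in practice they are usually drawn at random and the run is repeated several times.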

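k-means - Choosing k and Distance Metric
Davies-Bouldin Index:
$$DBI = \frac{1}{n} \sum_{i=1}^{n} \max_{j \neq i} \frac{\sigma_i + \sigma_j}{d(c_i, c_j)}$$
where n is the number of clusters, c_i is the centroid of cluster i, σ_i is the average distance of the members of cluster i to c_i, and d(c_i, c_j) is the distance between centroids. A lower DBI indicates better-separated clusters.
Optimal parameters:
- k = 4
- Distance metric = Euclidean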
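To make the index concrete, here is a small sketch of the Davies-Bouldin computation with Euclidean distance. The function names and the one-dimensional toy points are assumptions for illustration only. A lower value means tighter, better-separated clusters, which is how candidate values of k and distance metrics can be compared.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def davies_bouldin(points, labels):
    """Davies-Bouldin index: mean over clusters of the worst-case
    (scatter_i + scatter_j) / distance(centroid_i, centroid_j)."""
    clusters = sorted(set(labels))
    centroids, scatter = {}, {}
    for c in clusters:
        members = [p for p, l in zip(points, labels) if l == c]
        centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        # Scatter: average distance of members to their centroid.
        scatter[c] = sum(euclidean(p, centroids[c]) for p in members) / len(members)
    total = 0.0
    for i in clusters:
        total += max(
            (scatter[i] + scatter[j]) / euclidean(centroids[i], centroids[j])
            for j in clusters if j != i
        )
    return total / len(clusters)

# Toy data: two obvious groups on a line.
points = [[0.0], [2.0], [10.0], [12.0]]
dbi_good = davies_bouldin(points, [0, 0, 1, 1])  # natural grouping
dbi_bad = davies_bouldin(points, [0, 1, 0, 1])   # interleaved grouping
```

The natural grouping scores far lower than the interleaved one, so a search over k and distance metric would prefer it.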

Decision Tree
- Summarizes rules within the patterns
- The attribute to split on is chosen by information gain, Gain(S, A):
$$Gain(S, A) = H(S) - \sum_{v \in Values(A)} \frac{|S_v|}{|S|} H(S_v)$$
where
$$H(S) = -\sum_{i} p_i \log_2 p_i$$
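The information gain formula above can be checked on a toy example. The rows below are made up for illustration (they are not the case-study data), and the attribute names `season`, `dst`, and `pattern` are hypothetical labels chosen to echo the factors discussed later in the deck.

```python
import math
from collections import Counter

def entropy(labels):
    """H(S) = -sum(p_i * log2(p_i)) over the class proportions in S."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, attribute, target):
    """Gain(S, A) = H(S) - sum over values v of A of |S_v|/|S| * H(S_v)."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in set(r[attribute] for r in rows):
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

# Made-up days: the pattern follows the season exactly and ignores DST.
rows = [
    {"season": "summer", "dst": "yes", "pattern": "P1"},
    {"season": "summer", "dst": "no", "pattern": "P1"},
    {"season": "winter", "dst": "yes", "pattern": "P2"},
    {"season": "winter", "dst": "no", "pattern": "P2"},
]
gain_season = information_gain(rows, "season", "pattern")
gain_dst = information_gain(rows, "dst", "pattern")
```

In this toy set, splitting on `season` removes all uncertainty (gain of 1 bit) while `dst` removes none, so the tree would split on `season` first.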

Decision Tree - Example http://www.inf.ed.ac.uk/teaching/courses/iaml/2011/slides/dt.pdf

Case Study
- Building 101 in the Navy Yard, Philadelphia
- Four sensors installed at the gates of the building
- Sensors record the number of occupants entering and exiting the building:
$$N_{total} = \sum_{1}^{i} (N_{i1} - N_{i2} + N_{i3} - N_{i4} + N_{i5} - N_{i6} + N_{i7} - N_{i8})$$
- Uses Matlab and RapidMiner
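One way to read the counting formula above is as a running total: at each time step, sum the entries minus the exits over the four gates, and accumulate over time. The sketch below follows that reading, which is an assumption about the subscripts (odd ones as entering counts, even ones as exiting counts); the function name and the sample counts are invented.

```python
def occupancy_series(counts):
    """counts: one item per time step, each a list of four
    (entered, exited) tuples, one tuple per gate sensor.
    Returns the number of occupants in the building after each step."""
    total = 0
    series = []
    for step in counts:
        # Net flow at this step: entries minus exits over all gates.
        total += sum(entered - exited for entered, exited in step)
        series.append(total)
    return series

# Two hypothetical time steps for four gates.
counts = [
    [(5, 0), (3, 0), (0, 0), (2, 1)],  # net +9
    [(1, 2), (0, 0), (0, 3), (0, 0)],  # net -4
]
series = occupancy_series(counts)
```

Because the total is cumulative, any miscount at a gate carries forward to all later steps, which is one reason the accuracy of the detection procedure matters.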

General Characteristics of Occupant Presence
- Low occupant presence on weekends and holidays
  - Weekend and holiday data are excluded
- High variance of data from 7am to 4pm
  - Stochastic and highly variable

General Characteristics of Occupant Presence
- Night (7pm - 6am)
- Going-to-work (7am - 9am)
- Morning (10am - 12pm)
- Noon-break (12pm - 1pm)
- Afternoon (2pm - 3pm)
- Going-home (4pm - 6pm)

Patterns of Occupant Presence
Columns: Occupancy Rate | Working Time | Going To Work | Going To Home | Noon Break
Pattern 1: Lowest | Shortest | Latest | Earliest | NA
Pattern 2: Highest, Longest, Later, 12 pm
Pattern 3: Medium, 2 pm
Pattern 4: Earlier, 1 pm

Rules of Patterns
- 3 influencing factors:
  - Seasons (temperatures)
  - Weekdays
  - Daylight Saving Time (DST)
- DST does not contribute enough information gain

Prediction of Occupancy Schedule
- Method 1:
$$Prediction_{t} = M_{day,t}$$
- Method 2:
$$Prediction_{weekday,t} = M_{weekday,t}$$
- Method 3:
$$Prediction_{day,t} = M_{p1} \cdot P_{p1} + M_{p2} \cdot P_{p2} + M_{p3} \cdot P_{p3} + M_{p4} \cdot P_{p4}$$
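The three prediction methods above can be sketched as follows. The reading of the symbols is an assumption: M is taken to be a mean occupancy profile (over all days, over days sharing a weekday, or over days in a pattern) and P the probability of each pattern for the day being predicted. All names and numbers are illustrative.

```python
def mean_profile(profiles):
    """Element-wise mean of equal-length occupancy profiles."""
    return [sum(col) / len(col) for col in zip(*profiles)]

def predict_method1(history):
    """Method 1 (assumed): mean over all observed days.
    history is a list of (weekday, profile) pairs."""
    return mean_profile([p for _, p in history])

def predict_method2(history, weekday):
    """Method 2 (assumed): mean over observed days with the same weekday."""
    return mean_profile([p for w, p in history if w == weekday])

def predict_method3(pattern_means, pattern_probs):
    """Method 3 (assumed): pattern mean profiles weighted by the
    probability of each pattern."""
    n_slots = len(next(iter(pattern_means.values())))
    return [
        sum(pattern_probs[k] * pattern_means[k][t] for k in pattern_means)
        for t in range(n_slots)
    ]

# Hypothetical history with two time slots per day.
history = [("Mon", [10, 50]), ("Mon", [20, 70]), ("Fri", [0, 30])]
m1 = predict_method1(history)
m2 = predict_method2(history, "Mon")
m3 = predict_method3(
    {"p1": [0, 20], "p2": [20, 80]},  # made-up pattern mean profiles
    {"p1": 0.25, "p2": 0.75},         # made-up pattern probabilities
)
```

In Method 3 the probabilities would come from the decision tree rules for the target day; here they are simply supplied by hand.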

Validation
$$RMSE = \sqrt{\frac{\sum_{i=1}^{n} (E_i - \hat{E}_i)^2}{n}}$$
$$MAE = \frac{\sum_{i=1}^{n} |E_i - \hat{E}_i|}{n}$$
$$medE = median(E_i - \hat{E}_i)$$
where E_i is the observed value and \hat{E}_i the predicted value.
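The three validation metrics above are straightforward to compute. The sketch below assumes each error is observed minus predicted and, following the slide, takes the median of the signed errors for medE; the sample values are invented.

```python
import math
import statistics

def rmse(observed, predicted):
    """Root mean squared error."""
    return math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )

def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

def med_e(observed, predicted):
    """Median of the signed errors (observed minus predicted)."""
    return statistics.median([o - p for o, p in zip(observed, predicted)])

# Hypothetical observed and predicted occupant counts.
observed = [10, 20, 30, 40]
predicted = [12, 18, 33, 40]
scores = (rmse(observed, predicted), mae(observed, predicted), med_e(observed, predicted))
```

RMSE penalizes large misses more heavily than MAE, while a signed median far from zero hints at a systematic over- or under-prediction.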

Conclusions
- High accuracy in prediction of occupant behavior
- Simple input data
- Relatively simple algorithms
  - Lower complexity compared to learning algorithms such as neural networks
  - Simple prediction method: weighted mean
- Can be applied to control of energy consumption
- Accuracy of the occupancy detection procedure is not discussed in detail
- Does not compare against other learning algorithms

Further Readings
- Usman Habib, Gerhard Zucker, "Automatic occupancy prediction using unsupervised learning in buildings data"
- James Scott, A.J. Bernheim Brush, John Krumm, Brian Meyers, Mike Hazas, Steve Hodges, Nicolas Villar, "PreHeat: Controlling Home Heating Using Occupancy Prediction"

Thank you!