Transportation Mode Recognition using Smartphone Sensor Data


Transportation Mode Recognition using Smartphone Sensor Data
Arash Jahangiri & Hesham Rakha
Presented by Hesham Rakha, Samuel Reynolds Professor of Engineering, CEE
Director of the Center for Sustainable Mobility, VTTI
11/10/2018

Outline
- Introduction
- Literature Review
- Data Collection
- Methodology
- Results
- Conclusions

Introduction
- Objective: develop classifiers to identify transportation modes
- Modes: Car, Bus, Bike, Run, Walk
- Methods: supervised machine learning techniques
- Data: obtained from smartphone sensors, via a custom data acquisition system we developed

Introduction – Applications
- Transportation planning: the traditional approach relies on questionnaires, travel diaries, and telephone interviews
- Environmental applications: carbon footprint and health monitoring
- Safety applications: incorporating mode information into crash prediction models

Literature Review – Methods Applied
- Artificial Intelligence (AI) tools: fuzzy expert systems, decision trees, Bayesian networks, Support Vector Machines (SVM), etc.
- Statistical methods
- Supporting techniques: GIS maps, discrete Hidden Markov Models, bootstrap aggregating

Literature Review – Prior Studies
Studies differ in the number of classes, use of accelerometer / GPS / GIS data, whether different motorized modes are distinguished, device positioning, window size, and accuracy (cells lost in extraction are left blank):

Study | Classes | Accelerometer | GPS | GIS | Positioning     | Window size (s) | Accuracy (%)
[4]   | 6       | no            | yes |     | no restrictions | 30              | 93.5
[8]   | 4       |               |     |     |                 | 1               | 93.6
[7]   | 8       |               |     |     |                 | 10.24           | 82.1
[14]  |         |               |     |     | in pocket       | 5               | 93.9
[15]  | 3       |               |     |     |                 |                 | 96.9
[18]  | 8/6     |               |     |     |                 | >20             | 61.8/78.8

Literature Review – Summary
- Number of classes (3-8)
- Sensor data: accelerometer / GPS
- GIS maps
- Distinguishing different motorized modes
- Device positioning
- Window size (1-30 seconds)
These are essentially the factors that can affect model performance.

Unique Contributions
- Considered both motorized and non-motorized modes.
- Did not depend on device positioning (in other studies, travelers had to attach the smartphone to their bodies in a fixed position).
- Did not use information from GPS, due to GPS sensor limitations.
- Used data from the gyroscope, accelerometer, and rotation vector sensors.
- Had travelers collect car and bus data on different road types with different speed limits.

Unique Contributions (continued)
- Had travelers collect data in situations similar to traffic jam conditions (at or near intersections, moving with the queue).
- Applied all common machine learning procedures:
  - Complete model selection: consideration of the tuning parameters, which depend on the algorithm
  - Regularization: to deal with overfitting
  - Feature selection: to find and use the most relevant variables
  - Feature scaling: normalizing variables to a specified range; we used (-1, 1)
- Created and assessed a large number of features.
- Created the features from statistical measures of dispersion as well as derivatives, to incorporate feature time dependency.

Data Collection
- Developed a smartphone app
- Ten individuals / two different Android phones
- Modes: car, bicycle, bus, walk, and run
- About 25 hours of data
- Sensors: accelerometer, gyroscope, and rotation vector
Preprocessing:
- Data synchronization
- Interpolation & resampling
Data obtained from the different sensors were not synchronized, so we first interpolated to obtain a continuous data stream, then resampled at the desired rate.
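A rough sketch of this preprocessing step in Python with pandas, assuming per-sensor CSV logs with a timestamp column and a 50 Hz target rate (neither the file layout nor the rate is stated in the deck):

```python
import pandas as pd

# Hypothetical raw streams: each sensor logs (timestamp, x, y, z) at its own rate.
accel = pd.read_csv("accelerometer.csv", index_col="timestamp", parse_dates=True)
gyro = pd.read_csv("gyroscope.csv", index_col="timestamp", parse_dates=True)

def synchronize(stream, rate="20ms"):
    """Put a stream on a regular clock (20 ms = 50 Hz, an assumed rate)."""
    return (stream
            .resample(rate)               # common, regular time grid
            .mean()                       # collapse duplicate samples in a bin
            .interpolate(method="time"))  # fill gaps between raw samples

accel_sync = synchronize(accel)
gyro_sync = synchronize(gyro)

# Align the two sensors on the shared index so each row is one time step.
data = accel_sync.join(gyro_sync, lsuffix="_a", rsuffix="_g", how="inner")
```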

Methodology
Methods:
- Support Vector Machine (SVM)
- Tree-based methods
- K-Nearest Neighbor (K-NN)
Feature selection: Minimum Redundancy Maximum Relevance (mRMR)
Model selection:
- Five-fold cross-validation
- Out-of-bag error for the bagging and random forest methods

In 5-fold cross-validation the data are divided into five parts; one part is set aside for validation and the rest is used to train the model. This is repeated five times, each time with a different part serving as the validation set. The out-of-bag error is an alternative for bagging and random forests: when each tree is created, a portion of the data is left out of its bootstrap sample, and the error computed only on that unused portion is the out-of-bag error.
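A minimal scikit-learn sketch of these two model-selection estimates, using synthetic stand-in data (in the study, each row would be one windowed feature vector and each label one of the five modes; the number of trees is a placeholder, not the tuned value):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Stand-in data: one row per one-second window of features, labels = modes.
X, y = make_classification(n_samples=500, n_features=20, n_informative=8,
                           n_classes=5, random_state=0)

# (1) Five-fold cross-validation: average accuracy over the five held-out folds.
rf = RandomForestClassifier(n_estimators=500, random_state=0)
cv_accuracy = cross_val_score(rf, X, y, cv=5).mean()

# (2) Out-of-bag error: each tree is scored only on the samples left out of
# its bootstrap sample, so no separate validation split is needed.
rf_oob = RandomForestClassifier(n_estimators=500, oob_score=True, random_state=0)
rf_oob.fit(X, y)
oob_error = 1.0 - rf_oob.oob_score_
print(cv_accuracy, oob_error)
```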

Methodology – SVM
- Large margin classifier
- Single SVM model
- Ensemble of SVM models

\min_{w,b,\xi} \; \frac{1}{2} w^T w + C \sum_{n=1}^{N} \xi_n    (Equation 1)

subject to

y_n \left( w^T \phi(x_n) + b \right) \ge 1 - \xi_n, \quad n = 1, \dots, N    (Equation 2)
\xi_n \ge 0, \quad n = 1, \dots, N    (Equation 3)

where
- w: parameters that define the decision boundary between classes
- C: regularization (penalty) parameter
- \xi_n: error (slack) variable denoting a margin violation
- b: intercept associated with the decision boundaries
- \phi(x_n): function that transforms the data from the X space into some Z space

Objective function: minimizing the first term is equivalent to maximizing the margin between classes; the second term is an error term multiplied by the regularization (penalty) parameter C. "Ensemble" here means developing a series of SVM models on subsets of the data and averaging their results (the idea is similar to bagging or random forests).

More details on the SVM: Equation (2) ensures that a margin of at least 1 exists, while allowing some violations; the value 1 results from normalizing w. Equation (3) restricts the slack variables to non-negative values. The SVM applies \phi(\cdot) to transform the data from the current n-dimensional X space into a higher-dimensional Z space in which the decision boundaries between classes are easier to identify. This transformation can be computationally very expensive; however, the SVM only needs vector inner products in the space of interest, so it takes advantage of kernel functions that return the inner product in the desired Z space directly. We used the Gaussian kernel.
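A sketch of the single and ensemble SVM setups in scikit-learn, with the Gaussian (RBF) kernel and the (-1, 1) feature scaling mentioned above; the C and gamma values and the ensemble size are placeholders, not the parameters tuned in the study:

```python
from sklearn.ensemble import BaggingClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Single SVM: C is the regularization parameter from Equation 1, gamma the
# Gaussian kernel width; features are scaled to (-1, 1) as in the deck.
svm = make_pipeline(MinMaxScaler(feature_range=(-1, 1)),
                    SVC(kernel="rbf", C=10.0, gamma=0.1))

# Ensemble of SVMs: train several SVMs on bootstrap subsets of the data and
# combine their votes, analogous to bagging.
ensemble_svm = BaggingClassifier(estimator=svm, n_estimators=10, random_state=0)
```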

Methodology – Tree-Based Models
- Single tree
- Bagging: ensemble of trees / all variables are candidates at each split
- Random forest: ensemble of trees / a restricted, random subset of variables is considered at each split

Criterion used to choose the best split at each node: cross-entropy,

- \sum_{k=1}^{K} P_{km} \log P_{km}

where P_{km} is the proportion of class k observations in node m. Cross-entropy is 0 for a perfectly pure node and largest when the classes are evenly mixed (for two classes with base-2 logarithms the range is [0, 1]). When splitting the data, the algorithm favors splits that result in purer nodes. For example, with two classes (a and b):
- Split 1: results in a node with 70% of the data in class a and 30% in class b
- Split 2: results in a node with 90% of the data in class a and 10% in class b
The cross-entropy for split 2 is lower than that of split 1, so its node is purer, as the short numeric check below confirms. Other criteria, such as the Gini index, can also be used.
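A quick numeric check of this example (base-2 logarithm assumed):

```python
import numpy as np

def cross_entropy(proportions):
    """Node impurity -sum(p * log2(p)) for the class proportions in a node."""
    p = np.asarray(proportions, dtype=float)
    p = p[p > 0]  # 0 * log(0) is taken as 0
    return float(-(p * np.log2(p)).sum())

# The two candidate splits from the slide (classes a and b):
print(cross_entropy([0.7, 0.3]))  # split 1 -> ~0.881
print(cross_entropy([0.9, 0.1]))  # split 2 -> ~0.469 (purer node)
```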

Methodology – K-NN
Classifies a test observation based on the classes of its K nearest neighbors:

\hat{y}_j^{test} = \frac{1}{K} \sum_{X_j^{train} \in N_K} y_j^{train}

where
- X_j^{train}, X_j^{test}: observation vectors for the training and test sets
- y_j^{train}, y_j^{test}: response (or target) values corresponding to the observations X_j^{train} and X_j^{test}
- K: number of neighbors
- N_K: the set of the K nearest training observations

As a note, the formulation shows averaging, but for classification problems (our case) the responses are not literally averaged; a majority vote is taken instead.
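In scikit-learn the majority vote comes for free; a sketch on synthetic stand-in data, with K = 5 as a placeholder rather than the cross-validated value from the study:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Stand-in data: in the study each row would be one one-second window of
# sensor features and each label one of the five modes.
X, y = make_classification(n_samples=400, n_features=20, n_informative=8,
                           n_classes=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Majority vote among the K nearest training windows.
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
predicted_modes = knn.predict(X_test)
```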

Methodology – Feature Creation
Measures used to create features, applied to each sensor signal x_i(t) and to its derivative \dot{x}_i(t) to incorporate time dependency:

No. | Measure              | No. | Measure
1   | mean(x_i(t))         | 11  | mean(\dot{x}_i(t))
2   | max(x_i(t))          | 12  | max(\dot{x}_i(t))
3   | min(x_i(t))          | 13  | min(\dot{x}_i(t))
4   | var(x_i(t))          | 14  | var(\dot{x}_i(t))
5   | std(x_i(t))          | 15  | std(\dot{x}_i(t))
6   | range(x_i(t))        | 16  | range(\dot{x}_i(t))
7   | iqr(x_i(t))          | 17  | iqr(\dot{x}_i(t))
8   | signChange(x_i(t))   | 18  | signChange(\dot{x}_i(t))
9   | energy(x_i(t))       | 19  | spectralEntropy(x_i(t))
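A sketch of extracting these measures from a single window of one sensor axis, assuming a NumPy array x and a 50 Hz sampling rate (the deck does not state the rate); the derivative features (Nos. 11-18) would apply the same function to np.diff(x) * FS:

```python
import numpy as np
from scipy.signal import periodogram
from scipy.stats import iqr

FS = 50.0  # assumed sampling rate in Hz

def spectral_entropy(x, fs=FS):
    """Shannon entropy of the normalized power spectrum of the window."""
    _, psd = periodogram(x, fs=fs)
    p = psd / psd.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def window_features(x, fs=FS):
    """Dispersion measures from the table for one window of one sensor axis."""
    return {
        "mean": x.mean(), "max": x.max(), "min": x.min(),
        "var": x.var(), "std": x.std(),
        "range": x.max() - x.min(), "iqr": iqr(x),
        "signChange": int(np.sum(np.diff(np.sign(x)) != 0)),
        "energy": float(np.sum(x ** 2)),  # sum of x^2 over the window
        "spectralEntropy": spectral_entropy(x, fs),
    }

# One one-second window at the assumed rate:
x = np.random.randn(int(FS))
print(window_features(x))
```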

Methodology – Feature Selection (mRMR)
mRMR is used for feature selection. It seeks to:
- Maximize the relevance between a candidate feature and the target class
- Minimize the redundancy between that feature and the already selected features

\max_{x_i \in M - F} MI(x_i, c) \quad \text{and} \quad \min_{x_i \in M - F} \frac{1}{|F|} \sum_{x_j \in F} MI(x_i, x_j)

where
- MI(x, y): mutual information of x and y
- x_i: the feature being examined
- x_j: a previously selected feature
- M / F: the set of all features / the set of selected features
- c: the target class

MI is a measure of the variables' mutual dependence and determines how similar the joint distribution p(X, Y) is to the product of the factored marginal distributions p(X)p(Y).
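A greedy mRMR sketch using scikit-learn's mutual-information estimators; combining the two criteria as the difference relevance − redundancy is an assumption here, since the deck does not say how they were combined:

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif, mutual_info_regression

def mrmr_select(X, y, n_features):
    """Greedy mRMR sketch: at each step, pick the remaining feature with the
    best relevance-minus-redundancy score."""
    relevance = mutual_info_classif(X, y)  # MI(x_i, c) for every feature
    selected, remaining = [], list(range(X.shape[1]))
    while len(selected) < n_features:
        best, best_score = None, -np.inf
        for i in remaining:
            if selected:
                # Mean MI between candidate i and the already selected set F.
                redundancy = np.mean([
                    mutual_info_regression(X[:, [i]], X[:, j])[0]
                    for j in selected])
            else:
                redundancy = 0.0
            score = relevance[i] - redundancy
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
        remaining.remove(best)
    return selected
```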

Results – Model Selection
Tuning parameters for the different models:

Model | Tuning parameters
K-NN  | Number of neighbors
SVM   | Regularization parameter, Gaussian kernel parameter
RF    | Number of features, number of trees
Bag   | Number of trees
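A sketch of one such search, tuning the SVM's two parameters with 5-fold cross-validation via GridSearchCV on synthetic stand-in data; the grid values are illustrative, not the grid searched in the study:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Stand-in data for the windowed sensor features and mode labels.
X, y = make_classification(n_samples=400, n_features=20, n_informative=8,
                           n_classes=5, random_state=0)

# Candidate values for the SVM's regularization and Gaussian kernel parameters.
param_grid = {"C": [0.1, 1, 10, 100], "gamma": [0.01, 0.1, 1]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```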

Results – Comparison Measures
- Overall accuracy: the total number of correct detections divided by the total number of test observations.
- F-score: a combined measure of recall and precision.
- Youden's index: a measure of a model's ability to avoid failure.
- Discriminant power: shows how well a model discriminates between the different classes.

Recall and precision are calculated from true positives and true negatives: recall is the total number of true positives divided by the total number of actual positives; precision is the total number of true positives divided by the total number of predicted positives.
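A sketch of computing these measures for multi-class predictions with scikit-learn, plus a small one-vs-rest Youden's index helper (youden_index is a hypothetical helper, not a library function); macro averaging is an assumption, as the deck does not state how per-class scores were combined:

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

# Toy true/predicted mode labels standing in for the real test-set outputs.
y_true = np.array(["car", "bus", "walk", "run", "bike", "car", "bus", "walk"])
y_pred = np.array(["car", "car", "walk", "run", "bike", "car", "bus", "run"])

accuracy = accuracy_score(y_true, y_pred)
precision = precision_score(y_true, y_pred, average="macro", zero_division=0)
recall = recall_score(y_true, y_pred, average="macro", zero_division=0)
f_score = f1_score(y_true, y_pred, average="macro", zero_division=0)

def youden_index(y_t, y_p, cls):
    """One-vs-rest Youden's index: sensitivity + specificity - 1."""
    pos, pred_pos = (y_t == cls), (y_p == cls)
    sensitivity = (pos & pred_pos).sum() / pos.sum()
    specificity = (~pos & ~pred_pos).sum() / (~pos).sum()
    return sensitivity + specificity - 1

print(accuracy, f_score, youden_index(y_true, y_pred, "car"))
```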

Results – Comparison
Overall accuracy:

Model  | Accuracy
KNN    | 91.2%
DT     | 87.27%
DT.P   | 86.3%
RF     | 95.1%
Bag    | 94.4%
SVM    | 94.62%
E.SVMs | 94.41%

KNN: K-nearest neighbor. DT: decision tree. DT.P: pruned decision tree (a large tree pruned to a smaller one; accuracy drops slightly). RF: random forest, the best overall performance. Bag: bagging. SVM: support vector machine, the best for specific modes (walk and run). E.SVMs: ensemble of SVMs.

Results – Feature Importance
Important features identified, based on Mean Decrease Accuracy and Mean Decrease Gini (a: accelerometer axes, g: gyroscope axes):

No. | Feature              | No. | Feature
1   | spectralEntropy(a_x) | 11  | mean(a_z)
2   | range(a_y)           | 12  | iqr(a_x)
3   | max(a_y)             | 13  | var(g_x)
4   | max(g_y)             | 14  | min(a_y)
5   | min(g_y)             | 15  | range(a_x)
6   | range(g_x)           | 16  | energy(a_x)
7   | spectralEntropy(a_y) | 17  | range(g_x)
8   | max(a_z)             | 18  | mean(g_z)
9   | mean(g_x)            | 19  | std(a_y)
10  | min(a_z)             | 20  | spectralEntropy(g_x)

Mean Decrease Accuracy shows how much detection accuracy decreases when a feature is excluded, averaged over all trees and normalized by the standard deviation of the accuracy differences. Mean Decrease Gini shows how much a single feature contributes to decreasing the Gini index (a measure similar to cross-entropy) over all the trees.
Spectral entropy: treating the data as a distribution, this measure highlights the peaky spots of the distribution. Energy: treating the data as a signal, the energy is the sum of x^2 over the time window of interest.
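A sketch of obtaining the two importance measures from a fitted random forest in scikit-learn; permutation importance is used here as a stand-in for Mean Decrease Accuracy, since scikit-learn does not expose the exact normalized per-tree variant described above:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Stand-in data for the windowed sensor features and mode labels.
X, y = make_classification(n_samples=400, n_features=20, n_informative=8,
                           n_classes=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

rf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X_train, y_train)

# Mean Decrease Gini: impurity-based importances of the fitted forest.
gini_importance = rf.feature_importances_

# Accuracy-based importance: mean accuracy drop when a feature's values are
# shuffled in the test set.
perm = permutation_importance(rf, X_test, y_test, n_repeats=10, random_state=0)
accuracy_importance = perm.importances_mean
```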

Conclusions
- Developed a smartphone app to obtain sensor data
- Used the accelerometer, gyroscope, and rotation vector sensors
- Transportation modes: bike, car, walk, run, and bus
- Used a time window of one second
- Applied machine learning to develop detection models
- Most difficult modes to distinguish: car and bus (the motorized modes)
- Best overall performance: random forest
- SVM outperformed the RF for certain modes (walk and run)
- Selected 80 features using mRMR, of which the top 20 were identified

Future Recommendations
- Adding more data
- Applying approaches that examine the data as a sequence
- Considering other transportation modes (e.g., metro)
- Conducting error analysis and incorporating that knowledge into the models to enhance detection performance

Thank you! Questions/Comments?