Midwestern State University, Wichita Falls TX
Computerized Trip Classification of GPS Data: A Proposed Framework
Terry Griffin, Yan Huang, Ranette Halverson
Midwestern State University, Wichita Falls; University of North Texas, Denton

Introduction and Motivation
Why derive trip purpose?
- Many transportation departments conduct studies that require travel diaries (TDs) or origin-destination (OD) matrices.
- TDs and OD matrices require a great deal of user interaction.
- In this paper we propose a framework that could remove the human factor from the creation of TDs and OD matrices by passively collecting GPS data.

Overview of the Presentation
- Some Background
- Trip Purpose Classification: Data Collection, Data Preparation, Data Aggregation, Clustering
- Generating Random Data
- Results
- Conclusions

Background
To create a trip classification model, we first need to know:
- What is a trip? (GPS streams)
- How do we classify that trip? (clustering, decision trees)

Background: GPS Streams
What is a GPS stream?
- The logged GPS data can be described as a collection of points.
- Each point is defined by a latitude (Lat) and longitude (Lon) pair, accompanied by the time of day (ToD).
- The entire set becomes:
  $$(P_1, P_2, \ldots, P_n) = \bigl(P[\mathrm{Lat},\mathrm{Lon},\mathrm{ToD}]_1,\; P[\mathrm{Lat},\mathrm{Lon},\mathrm{ToD}]_2,\; \ldots,\; P[\mathrm{Lat},\mathrm{Lon},\mathrm{ToD}]_n\bigr)$$

Background: GPS Streams (continued)
- Each stream is typically recorded either continuously at a user-defined interval or only while the device is moving.
- Each stream creates points of interest (POI).

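Background: Clustering
DBSCAN: density-based clustering
- Eps: the neighborhood radius around a point
- MinPts: the minimum number of points within Eps for a point to count as a core point
- Density reachability: a point is reachable from a core point through a chain of core points, each within Eps of the next
- Density connectivity: two points are density-connected if both are density-reachable from a common core point
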
Background: Clustering
DBSCAN: density-based clustering [figure not reproduced]

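To make the roles of Eps and MinPts concrete, here is a minimal pure-Python sketch of DBSCAN. It assumes points are already projected to planar coordinates in feet (real latitude/longitude data would need a projection first), and the names `dbscan`, `region_query`, and `NOISE` are illustrative, not taken from the authors' implementation.

```python
from math import hypot

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    xi, yi = points[i]
    return [j for j, (xj, yj) in enumerate(points)
            if hypot(xi - xj, yi - yj) <= eps]


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE marks unclustered points."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1  # points[i] is a core point: start a new cluster
        labels[i] = cluster
        seeds = [j for j in neighbors if j != i]
        k = 0
        while k < len(seeds):
            j = seeds[k]
            k += 1
            if labels[j] == NOISE:
                labels[j] = cluster  # border point, density-reachable
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                seeds.extend(j_neighbors)  # j is also core: keep expanding
    return labels


# Two dense groups and one isolated point (coordinates in feet, made up).
POINTS = [(0, 0), (1, 0), (0, 1), (1, 1),
          (50, 50), (51, 50), (50, 51),
          (200, 200)]
LABELS = dbscan(POINTS, eps=2, min_pts=3)
print(LABELS)
```

Each resulting cluster would correspond to one candidate POI; the isolated point is left as noise.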
Background: Decision Trees
What is a decision tree?
1. A tool for classification and prediction.
2. A tree-like structure that represents rules.
3. Each node is either:
   - a leaf node, which indicates the value of the target attribute (class) of examples, or
   - a decision node, which specifies a test to be carried out on a single attribute value, with one branch and subtree for each possible outcome of the test.

Background: Decision Trees
Example decision tree
Given the attributes

ATTRIBUTE   | POSSIBLE VALUES
============+=======================
outlook     | sunny, overcast, rain
temperature | continuous
humidity    | continuous
windy       | true, false

and the training examples

OUTLOOK  | TEMPERATURE | HUMIDITY | WINDY | PLAY
=========+=============+==========+=======+===========
sunny    | 85          | 85       | false | Don't Play
sunny    | 80          | 90       | true  | Don't Play
overcast | 83          | 78       | false | Play
rain     | 70          | 96       | false | Play
rain     | 68          | 80       | false | Play
rain     | 65          | 70       | true  | Don't Play
overcast | 64          | 65       | true  | Play
….

you get a decision tree.

Background: Decision Trees
Example decision tree (golf) [figure not reproduced]

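As a sketch of how leaf and decision nodes work together, the Python below hand-builds a small tree over the golf attributes and walks it to classify one example. The tree shape and the humidity threshold of 75 are illustrative assumptions that happen to agree with the training rows listed earlier; they are not a reproduction of the slide's figure. `Leaf`, `Decision`, and `classify` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Union


@dataclass
class Leaf:
    label: str  # value of the target attribute (the class)


@dataclass
class Decision:
    attribute: str                  # the single attribute tested at this node
    test: Callable[[object], str]   # maps the attribute value to an outcome
    branches: Dict[str, "Node"]     # one subtree per possible outcome


Node = Union[Leaf, Decision]


def classify(node: Node, example: dict) -> str:
    """Follow decision nodes until a leaf is reached, then return its class."""
    while isinstance(node, Decision):
        outcome = node.test(example[node.attribute])
        node = node.branches[outcome]
    return node.label


# Hand-built illustrative tree (assumed shape and threshold).
GOLF_TREE = Decision(
    "outlook", lambda v: v,
    {
        "overcast": Leaf("Play"),
        "sunny": Decision(
            "humidity", lambda h: "high" if h > 75 else "normal",
            {"high": Leaf("Don't Play"), "normal": Leaf("Play")}),
        "rain": Decision(
            "windy", lambda w: "true" if w else "false",
            {"true": Leaf("Don't Play"), "false": Leaf("Play")}),
    },
)

print(classify(GOLF_TREE, {"outlook": "rain", "humidity": 80, "windy": False}))
```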
Background: Decision Trees
1. Entropy: measures the impurity (lack of homogeneity) of an arbitrary collection of examples.
2. Information gain: measures how well a given attribute separates the training examples according to their target classification.

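The two measures can be sketched in a few lines of Python. The example uses only the seven golf rows listed in the training table (the elided rows are not guessed), handles categorical attributes only, and uses the standard definitions: entropy $H(S) = -\sum_c p_c \log_2 p_c$, and gain as $H(S)$ minus the size-weighted entropy of the subsets produced by splitting on an attribute. Function and variable names are illustrative.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Entropy in bits of a list of class labels (0 when all labels agree)."""
    total = len(labels)
    return -sum((n / total) * log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in entropy of `target` after splitting rows on `attribute`."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


# The seven rows shown in the golf table (outlook, windy, play only).
ROWS = [
    {"outlook": "sunny", "windy": False, "play": "Don't Play"},
    {"outlook": "sunny", "windy": True, "play": "Don't Play"},
    {"outlook": "overcast", "windy": False, "play": "Play"},
    {"outlook": "rain", "windy": False, "play": "Play"},
    {"outlook": "rain", "windy": False, "play": "Play"},
    {"outlook": "rain", "windy": True, "play": "Don't Play"},
    {"outlook": "overcast", "windy": True, "play": "Play"},
]

print(round(entropy([r["play"] for r in ROWS]), 3))
print(round(information_gain(ROWS, "outlook", "play"), 3))
print(round(information_gain(ROWS, "windy", "play"), 3))
```

On these rows, splitting on outlook yields a larger gain than splitting on windy, which is why a tree builder would prefer outlook near the root. Continuous attributes such as temperature and humidity would additionally need a threshold search, which this sketch omits.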
Trip Purpose Classification
To find and classify trip purposes for a given GPS stream, we follow a series of steps:
1. Data collection
2. Data preparation
3. Data aggregation
4. Actual classification

Trip Purpose Detection: Data Collection
Tools used
- Palm m515 (hardware)
- Magellan GPS Companion (hardware)
- Cetus GPS 1.1 (software)
Method
- Logging modes: continuous; movement only (caused problems)
- Collected 6 weeks of continuous data for 1 individual
- Randomly generated a data set

Trip Purpose Detection: Data Preparation
- Data cleansing
- Compute trip stop lengths from the given raw GPS data:
  - Continuous
  - Movement only

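For movement-only logging, one plausible way to compute stop lengths is to treat a long time gap between consecutive points as a stop at the earlier point. The sketch below assumes timestamps in seconds and a hypothetical 300-second threshold; `GpsPoint` and `stop_lengths` are illustrative names, and continuously logged data would need a distance-based test instead.

```python
from dataclasses import dataclass


@dataclass
class GpsPoint:
    lat: float
    lon: float
    t: float  # seconds since the start of logging (assumed representation)


def stop_lengths(stream, min_gap_s=300):
    """Return (point, stay_seconds) for every time gap of at least min_gap_s.

    In movement-only mode the logger is silent while the device is still,
    so a gap between consecutive points approximates the length of a stop.
    """
    stops = []
    for prev, cur in zip(stream, stream[1:]):
        gap = cur.t - prev.t
        if gap >= min_gap_s:
            stops.append((prev, gap))
    return stops


# Made-up stream: steady movement, a 15-minute silence, then movement again.
STREAM = [
    GpsPoint(33.870, -98.520, 0),
    GpsPoint(33.871, -98.521, 10),
    GpsPoint(33.872, -98.522, 20),
    GpsPoint(33.872, -98.522, 920),
    GpsPoint(33.873, -98.523, 930),
]
STOPS = stop_lengths(STREAM)
for point, seconds in STOPS:
    print(point.lat, point.lon, seconds)
```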
Trip Purpose Detection: Data Aggregation
- Single points are not meaningful.
- Only after many points are "clustered" together can we really gain information.
- Each balloon is a POI (cluster).
- Each balloon gives us:
  - Average time of day
  - Average length of stay
  - Longest length of stay
  - Earliest arrival time
  - Etc.

Trip Purpose Detection: Data Aggregation
It is from these aggregate values that we can build and train our decision tree.

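A minimal sketch of turning the visits in one cluster into the aggregate values named above. It assumes each visit is an (arrival hour, stay in minutes) pair, a representation chosen here for illustration; `aggregate_cluster` and the feature keys are hypothetical names.

```python
from statistics import mean


def aggregate_cluster(visits):
    """Summarize one POI from a list of (arrival_hour, stay_minutes) visits."""
    arrivals = [arrival for arrival, _ in visits]
    stays = [stay for _, stay in visits]
    return {
        "avg_time_of_day": mean(arrivals),
        "avg_length_of_stay": mean(stays),
        "longest_stay": max(stays),
        "earliest_arrival": min(arrivals),
    }


# Made-up visits to a single cluster: mornings, roughly eight-hour stays.
VISITS = [(8.0, 480), (9.0, 510), (8.5, 450)]
FEATURES = aggregate_cluster(VISITS)
print(FEATURES)
```

Each cluster's feature dictionary would then be one training or test example for the decision tree. Note that a plain mean of arrival hours misbehaves for visits that straddle midnight; a circular mean would be needed there.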
Trip Purpose Detection: Classifying Points of Interest
Identified clusters: [figure not reproduced]

Trip Purpose Detection: Classifying Points of Interest
Example tree created by C4.5: [figure not reproduced]

Trip Purpose Detection: Classifying Points of Interest
Identified clusters (continued): [figure not reproduced]

Random Data
- $d = (d_1, d_2) \mid d \in \{(0,1), (-1,0), (-1,1)\}$
- $x$: current time of day
- $\mu$: specified time for a location at which the probability of going there should be high
- $\sigma$: time window (standard deviation) around $\mu$
- $d$: control parameter

Results: Random Data
- 50 generations
- For each generation we varied Eps and MinPts:
  - Eps: 15x15 feet to 200x200 feet (5 distinct sizes)
  - MinPts: 2 to 10
- As each cluster was found, it was classified using a classification tree built from the data generated for that test.
- Each cluster was assigned a level of correctness (1 = all points in the cluster correctly identified).
- We used 20% of the generated data to train the tree.

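One way to read the "level of correctness" is as the fraction of a cluster's points whose true purpose matches the label the tree assigned to the cluster, so that a fully correct cluster scores 1. The slide only fixes the value for the fully correct case, so the fractional scoring below is an assumption, and `cluster_correctness` is a hypothetical name.

```python
def cluster_correctness(true_labels, predicted_label):
    """Fraction of points in a cluster whose true label matches the prediction.

    Returns 1.0 when every point is correctly identified (the case the
    slide defines); partial scores are an assumed interpretation.
    """
    if not true_labels:
        raise ValueError("cluster has no points")
    matches = sum(1 for label in true_labels if label == predicted_label)
    return matches / len(true_labels)


# Made-up cluster: three points generated as "home", one as "work".
SCORE = cluster_correctness(["home", "home", "work", "home"], "home")
print(SCORE)
```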
Results [figures not reproduced]

Future Work

Future Plans
- Create a GPS database
  - $5000 grant for GPS devices (fall 2006)
  - Additional university funds
- Fill a gap in GPS research

Conclusions
- This classification tool has potential but needs real validation.
- It would be valuable to obtain a large data set.
- Future: possibly predict the next trip stop using Markov chains.
Questions?