

Computerized Trip Classification of GPS Data: A Proposed Framework
Terry Griffin, Yan Huang, Ranette Halverson
Midwestern State University, Wichita Falls TX
University of North Texas, Denton

Slide 2 – Introduction and Motivation
Why derive trip purpose? Many transportation departments conduct studies that require Travel Diaries (TDs) or Origin-Destination (OD) matrices. TDs and OD matrices require a great deal of user interaction. In this paper we propose a framework that could eliminate the human factor from the creation of TDs and OD matrices, by passively collecting GPS data.

Slide 3 – Overview of the Presentation
- Some Background
- Trip Purpose Classification
  - Data Collection
  - Data Preparation
  - Data Aggregation
  - Clustering
- Generating Random Data
- Results
- Conclusions

Slide 4 – Background
To create a trip classification model, we first need to know:
- What is a trip? (GPS streams)
- How do we classify that trip? (Clustering, Decision Trees)

Slide 5 – Background: GPS Streams
What is a GPS stream? The logged GPS data can be described as a collection of points. Each point is defined by a Latitude (Lat) and Longitude (Lon) pair, accompanied by the Time of Day (ToD). The entire set becomes:
(P1, P2, ..., Pn) = (P[Lat, Lon, ToD]1, P[Lat, Lon, ToD]2, ..., P[Lat, Lon, ToD]n)
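As a concrete sketch of this representation, a logged point and a stream could be modeled as below. The class name, field names, and sample coordinates are illustrative assumptions, not taken from the paper:

```python
from dataclasses import dataclass

# Hypothetical representation of one logged GPS point, following the
# (Lat, Lon, ToD) triple described on the slide. ToD is stored here as
# seconds since midnight to make time arithmetic easy.
@dataclass
class GPSPoint:
    lat: float   # latitude in decimal degrees
    lon: float   # longitude in decimal degrees
    tod: int     # time of day, seconds since midnight

# A GPS stream is simply an ordered sequence of such points.
stream = [
    GPSPoint(33.8703, -98.5301, 8 * 3600),       # 08:00:00
    GPSPoint(33.8711, -98.5290, 8 * 3600 + 30),  # 08:00:30
    GPSPoint(33.8725, -98.5275, 8 * 3600 + 60),  # 08:01:00
]

def duration_seconds(points):
    """Elapsed time covered by a stream (last ToD minus first ToD)."""
    return points[-1].tod - points[0].tod
```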

Slide 6 – Background: GPS Streams
Each stream is typically recorded either continuously at a user-defined interval, or by movement only. Each stream yields Points of Interest (POIs).

Slide 7 – Background: Clustering
DBSCAN – Density-Based Clustering:
- Eps (neighborhood radius)
- MinPts (minimum points per neighborhood)
- Density reachability
- Density connectivity
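To make these four concepts concrete, here is a minimal, illustrative DBSCAN sketch (not the authors' implementation) showing how Eps and MinPts drive cluster growth through density-reachable points:

```python
import math

def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (Euclidean)."""
    return [j for j, q in enumerate(points)
            if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: returns one label per point; -1 marks noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:        # not a core point
            labels[i] = NOISE
            continue
        labels[i] = cluster                 # start a new cluster
        seeds = list(neighbors)
        k = 0
        while k < len(seeds):
            j = seeds[k]
            if labels[j] == NOISE:          # border point: density-reachable
                labels[j] = cluster
            if labels[j] is UNVISITED:
                labels[j] = cluster
                jn = region_query(points, j, eps)
                if len(jn) >= min_pts:      # j is also core: keep expanding
                    seeds.extend(jn)
            k += 1
        cluster += 1
    return labels
```

Two dense groups end up density-connected into two clusters, while an isolated point (no MinPts neighbors within Eps) is labeled noise.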

Slide 8 – Background: Clustering
DBSCAN – Density-Based Clustering (illustration)

Slide 9 – Background: Decision Trees
What is a decision tree?
1. Used as a tool for classification and prediction.
2. A tree-like structure that represents rules.
3. A leaf node indicates the value of the target attribute (class) of examples.
4. A decision node specifies a test to be carried out on a single attribute value, with one branch and subtree for each possible outcome of the test.
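The leaf-node/decision-node structure just described can be sketched with a hypothetical nested-dict encoding (the attribute names reuse the golf example that appears later in the deck; this encoding is ours, not the paper's):

```python
# A decision node is a dict with one attribute test and a branch per
# outcome; a leaf node is simply the class value (a plain string).
tree = {
    "attr": "outlook",
    "branches": {
        "sunny": {"attr": "windy",
                  "branches": {True: "don't play", False: "play"}},
        "overcast": "play",
        "rain": "play",
    },
}

def classify(tree, example):
    """Follow decision nodes until a leaf (a class label) is reached."""
    while isinstance(tree, dict):
        tree = tree["branches"][example[tree["attr"]]]
    return tree
```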

Slide 10 – Background: Decision Trees
Example decision tree input. Given:

ATTRIBUTE   | POSSIBLE VALUES
============+=======================
outlook     | sunny, overcast, rain
temperature | continuous
humidity    | continuous
windy       | true, false

and

OUTLOOK  | TEMPERATURE | HUMIDITY | WINDY | PLAY
=========+=============+==========+=======+===========
sunny    | 85          | 85       | false | Don't Play
sunny    | 80          | 90       | true  | Don't Play
overcast | 83          | 78       | false | Play
rain     | 70          | 96       | false | Play
rain     | 68          | 80       | false | Play
rain     | 65          | 70       | true  | Don't Play
overcast | 64          | 65       | true  | Play
...

you get a decision tree.

Slide 11 – Background: Decision Trees
Example decision tree (golf).

Slide 12 – Background: Decision Trees
1. Entropy measures the purity (homogeneity) of an arbitrary collection of examples.
2. Information gain measures how well a given attribute separates the training examples according to their target classification.
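Both measures can be computed directly. The sketch below uses a trimmed version of the deck's golf data (outlook and windy only) and assumes base-2 entropy; it is illustrative, not the paper's code:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n)
                for c in Counter(labels).values())

def information_gain(rows, labels, attr_index):
    """Entropy reduction from splitting on the attribute at attr_index."""
    n = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr_index], []).append(label)
    remainder = sum(len(part) / n * entropy(part)
                    for part in partitions.values())
    return entropy(labels) - remainder

# Trimmed golf data: (outlook, windy) -> play?
rows = [("sunny", True), ("sunny", True), ("overcast", False),
        ("rain", False), ("rain", False), ("rain", True),
        ("overcast", True)]
labels = ["no", "no", "yes", "yes", "yes", "no", "yes"]
```

On this sample, splitting on outlook yields a higher information gain than splitting on windy, so an ID3/C4.5-style learner would test outlook first.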

Slide 13 – Trip Purpose Classification
To find and classify trip purposes for a given GPS stream, we follow a series of steps:
1. Data Collection
2. Data Preparation
3. Data Aggregation
4. Actual Classification

Slide 14 – Trip Purpose Detection: Data Collection
Tools used:
- Palm m515 (hardware)
- Magellan GPS Companion (hardware)
- Cetus GPS 1.1 (software)
Method: continuous, or movement only (caused problems).
We collected 6 weeks of continuous data for 1 individual, and randomly generated a data set.

Slide 15 – Trip Purpose Detection: Data Preparation
- Data cleansing
- Compute trip stop lengths from the given raw GPS data
- Handle continuous and movement-only recordings
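One plausible way to compute trip stop lengths from raw GPS points is to find runs of points that stay near one spot and measure their dwell time. This is a sketch of that idea under assumed thresholds; the 50 m radius and 2-minute minimum are illustrative, not values from the paper:

```python
import math

def haversine_m(p, q):
    """Great-circle distance in meters between (lat, lon) pairs."""
    R = 6_371_000
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2)
         * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * R * math.asin(math.sqrt(a))

def stop_lengths(points, radius_m=50, min_stop_s=120):
    """Durations (seconds) of stops: maximal runs of (lat, lon, tod)
    points that stay within radius_m of the run's first point and
    last at least min_stop_s."""
    stops, i = [], 0
    while i < len(points):
        j = i
        while (j + 1 < len(points)
               and haversine_m(points[i][:2], points[j + 1][:2]) <= radius_m):
            j += 1
        dwell = points[j][2] - points[i][2]   # ToD difference
        if dwell >= min_stop_s:
            stops.append(dwell)
        i = j + 1
    return stops
```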

Slide 16 – Trip Purpose Detection: Data Aggregation
Single points are not meaningful; only after many points are "clustered" together can we really gain information. Each balloon (on the map) is a POI (cluster). Each balloon gives us:
- Average time of day
- Average length of stay
- Longest length of stay
- Earliest arrival time
- Etc.

Slide 17 – Trip Purpose Detection: Data Aggregation
It is from these aggregate values that we can build and train our decision tree.
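The per-cluster aggregation step can be sketched as below. The visit tuples and feature names are assumptions chosen to mirror the features listed on the slide, not the paper's exact schema:

```python
from statistics import mean

# Each visit to a candidate POI: (cluster_id, arrival_tod_s, stay_length_s).
visits = [
    (0, 8 * 3600, 30 * 60), (0, 9 * 3600, 45 * 60),
    (1, 18 * 3600, 8 * 3600),
]

def aggregate_pois(visits):
    """Per-cluster features like those on the slide: average time of
    day, average and longest stay, and earliest arrival."""
    by_cluster = {}
    for cid, arrival, stay in visits:
        by_cluster.setdefault(cid, []).append((arrival, stay))
    features = {}
    for cid, vs in by_cluster.items():
        arrivals = [a for a, _ in vs]
        stays = [s for _, s in vs]
        features[cid] = {
            "avg_tod": mean(arrivals),
            "avg_stay": mean(stays),
            "max_stay": max(stays),
            "earliest_arrival": min(arrivals),
        }
    return features
```

Each cluster's feature dict then becomes one training row for the decision tree.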

Slide 18 – Trip Purpose Detection: Classifying Points of Interest
Identified clusters:

Slide 19 – Trip Purpose Detection: Classifying Points of Interest
Example tree created by C4.5:

Slide 20 – Trip Purpose Detection: Classifying Points of Interest
Identified clusters:

Slide 21 – Random Data
d = (d1, d2), where d ∈ {(0,1), (−1,0), (−1,1)}
- x – current time of day
- µ – specified time for a location at which the probability of going there should be high
- σ – time window (standard deviation) around µ
- d – control parameter
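The µ and σ parameters suggest a Gaussian-style weighting of visit likelihood by time of day. The sketch below is our reading of that idea, not the authors' exact generator; the function name and normalization are assumptions:

```python
import math

def visit_weight(x, mu, sigma):
    """Unnormalized Gaussian weight: high when the current time of day x
    is near the location's specified time mu, falling off over the
    window sigma (hours)."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))
```

A synthetic-data generator could draw the next destination with probability proportional to each candidate location's weight at the current time.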

Slide 22 – Results: Random Data
- 50 generations; for each generation we modified Eps and MinPts:
  - Eps from 15×15 feet to 200×200 feet (5 distinct sizes)
  - MinPts from 2 to 10
- As each cluster was found, it was classified using a classification tree built from the data generated for that test.
- Each cluster was assigned a level of correctness (1 = all points in the cluster correctly identified).
- We used 20% of the generated data to train the tree.
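The per-cluster correctness score described above can be sketched as a simple fraction of matching labels (our formulation of the metric, not the paper's code):

```python
def cluster_correctness(predicted, actual):
    """Fraction of a cluster's points whose predicted purpose matches
    the known purpose; 1.0 means every point in the cluster was
    correctly identified."""
    assert predicted and len(predicted) == len(actual)
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(predicted)
```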

Slide 23 – Results

Slide 24 – Results

Slide 25 – Future Work

Slide 26 – Future Plans
- Create a GPS database
  - $5000 grant for GPS devices (fall 2006)
  - Additional university funds
- Fill a needed gap in GPS research

Slide 27 – Conclusions
This classification tool has potential, but needs real validation. It would be helpful to obtain a large data set. In the future, we may be able to predict the next trip stop based on Markov chains. Questions?