Published by Annabel Strickland. Modified over 9 years ago.
1 Knowledge Discovery from Transportation Network Data: Paper Review. Jiang, W., Vaidya, J., Balaporia, Z., Clifton, C., and Banich, B. Knowledge Discovery from Transportation Network Data. In ICDE, 2005.
2 Outline ● Background. ● Experiments. -- Structurally Similar Routes -- Temporally Repeated Routes ● Experiment results. ● Conventional techniques. ● New challenges.
3 A natural application area for Data Mining ● Transportation and logistics are an important sector of the economy. -- Transportation consumes 60% of oil worldwide. ● Data mining has led to significant gains in other areas. ● Computer use is widespread in transportation and logistics. -- Inventory management, parcel tracking, and even on-truck location sensors.
4 Existing Applications Data Mining ● Mining with transactional characteristics of freight and events. -- e.g., classification on safety/accident records might find that trucks are prone to accidents at 7:00 AM on east-west roads. -- The geometry of the network is NOT used. Network Structure ● Optimization -- Finds a solution (e.g., one that minimizes cost).
5 Transportation Networks ● Graph problems ● Graph mining, e.g., finding the frequent sub-graphs. Algorithms * WARMR * AGM * SUBDUE * FSG
6 Dataset ● Six months of origin-destination (OD) data from a large third-party logistics company: 98,292 transactions. ● Represented as a directed graph by mapping locations to vertices. ● Each transaction can then be represented as the edge of an OD pair. ● The edges are labeled with the other attributes of the transaction: pickup date, delivery date, distance, hours, weight, and mode (using a binning strategy for the label values).
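The graph representation described on this slide can be sketched in a few lines of Python. This is only an illustration: the `Shipment` field names, the toy records, and the bin edges are assumptions made for the example, not the schema or binning strategy used in the paper.

```python
from collections import namedtuple

# Hypothetical record layout; the field names are illustrative only.
Shipment = namedtuple(
    "Shipment", "origin dest pickup delivery distance hours weight mode"
)

def bin_value(value, upper_edges):
    """Return the index of the first bin whose upper edge is >= value."""
    for i, upper in enumerate(upper_edges):
        if value <= upper:
            return i
    return len(upper_edges)  # overflow bin

def build_graph(shipments, distance_bins):
    """Map locations to vertices and each transaction to a labeled directed edge."""
    vertices = set()
    edges = []  # multigraph: repeated OD pairs stay as separate edges
    for s in shipments:
        vertices.update((s.origin, s.dest))
        label = {"distance_bin": bin_value(s.distance, distance_bins), "mode": s.mode}
        edges.append((s.origin, s.dest, label))
    return vertices, edges

# Made-up toy records, not data from the paper.
shipments = [
    Shipment("A", "B", "day1", "day2", 120.0, 5.0, 18000, "truck"),
    Shipment("B", "C", "day2", "day4", 640.0, 20.0, 9000, "rail"),
    Shipment("A", "B", "day8", "day9", 120.0, 5.5, 21000, "truck"),
]
vertices, edges = build_graph(shipments, distance_bins=[100, 500, 1000])
```

Binning the continuous attributes keeps the number of distinct edge labels small, which matters because the graph miners compare labels for exact equality.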
8 Mining Interests ● Structurally Similar Routes --Identify structurally similar patterns that occur in many locations. Methods * SUBDUE * FSG ● Temporally Repeated Routes --Find patterns of routes repeated in time, rather than space. Method * FSG
9 Structurally Similar Routes ● We assign all vertices the same label. ● Three variants for edge labels: weight, distance, and time. -- OD_TD : TOTAL-DISTANCE -- OD_GW : GROSS-WEIGHT -- OD_TH : MOVE-TRANSIT-HOURS
10 Experiments with SUBDUE (MDL principle) SUBDUE: A substructure discovery system. Results: ● Took about 3.25 hours on a graph of 100 vertices and 561 edges to find the best 3 patterns with a beam size of 4. ● Would need 6 months on the complete graph. ● Results were trivial.
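For readers unfamiliar with the MDL principle mentioned in the slide title, the heuristic is commonly written as a compression ratio. The notation below is an assumption chosen for this note; the slides themselves give no formula.

```latex
\[
\mathrm{value}(S, G) \;=\; \frac{DL(G)}{DL(S) + DL(G \mid S)}
\]
```

Here $DL(\cdot)$ is a description length in bits, $S$ is a candidate substructure, and $G \mid S$ is the graph $G$ after every instance of $S$ has been replaced by a single vertex. A substructure that occurs often and is reasonably large shrinks the denominator and so scores higher; the beam size limits how many candidate substructures are kept for expansion at each step of the search.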
11 ● Significant traffic from node 2 to node 4 via node 3, but not much return traffic (deadheading)
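One simple way to surface this kind of one-way traffic is to compare the number of shipments in each direction of every vertex pair. The sketch below is an illustration on made-up counts, not the procedure used in the paper; the function name `directional_imbalance` is hypothetical.

```python
from collections import Counter

def directional_imbalance(edges):
    """For each directed pair (u, v), return (forward count, return count)."""
    flow = Counter(edges)
    return {(u, v): (n, flow.get((v, u), 0)) for (u, v), n in flow.items()}

# Toy data: heavy 2 -> 3 -> 4 traffic and a single return trip from 4 to 3.
edges = [(2, 3)] * 5 + [(3, 4)] * 5 + [(4, 3)]
imbalance = directional_imbalance(edges)
```

Pairs with a large forward count and a small return count are candidates for deadheading, since the vehicles presumably have to come back without a load.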
12 Experiments with FSG ● FSG mines patterns across a set of graph transactions. ● Divides the single graph into multiple distinct sub-graphs, and treats each sub-graph as a separate transaction. ✔ Breadth first partitioning ✔ Depth first partitioning ✔ Both may result in patterns being broken across partitions
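The slides do not spell out the partitioning procedure, so the following is only one plausible reading of breadth-first partitioning: walk the graph breadth-first and close a partition whenever it reaches a fixed number of edges. The function name, the `max_edges` parameter, and the assumption that edges are distinct are all choices made for this sketch.

```python
from collections import deque

def breadth_first_partitions(edges, max_edges):
    """Split a list of distinct directed edges into edge-disjoint chunks grown breadth-first."""
    out = {}
    for u, v in edges:
        out.setdefault(u, []).append((u, v))
    used = set()
    partitions, current = [], []
    for start in sorted(out):  # deterministic choice of seed vertices
        queue, seen = deque([start]), {start}
        while queue:
            u = queue.popleft()
            for edge in out.get(u, ()):
                if edge in used:
                    continue
                used.add(edge)
                current.append(edge)
                if len(current) == max_edges:  # partition is full, start a new one
                    partitions.append(current)
                    current = []
                if edge[1] not in seen:
                    seen.add(edge[1])
                    queue.append(edge[1])
    if current:
        partitions.append(current)
    return partitions

edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "E")]
parts = breadth_first_partitions(edges, max_edges=2)
```

In this toy run the path A, B, D ends up split between the first and second partition, which is exactly the kind of broken pattern the slide warns about.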
13 Results ● Partition sizes: 400, 800, 1200, and 1600. ● Depth-first partitioning: 200 frequent patterns were found with a minimum support of 120. ● Breadth-first partitioning: 667 frequent patterns were found with a minimum support of 240. ● FSG had runtime and memory problems with lower supports on the breadth-first partitions. ● FSG is not an appropriate tool for mining recurring patterns in a large single graph.
15 Temporally Repeated Routes ● Method: FSG ● Exploits the temporal nature of the transportation graph. ● Partitions the graph into a set of graph transactions based on date.
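A minimal sketch of date-based partitioning follows. It assumes each record carries a single date key (the slides do not say whether pickup or delivery date was used) and counts support only for single edges, which is the simplest possible pattern; FSG itself grows larger connected subgraphs from such frequent edges.

```python
from collections import defaultdict

def partition_by_date(records):
    """Group (origin, dest, date) records into one edge list per date."""
    by_date = defaultdict(list)
    for origin, dest, date in records:
        by_date[date].append((origin, dest))
    return dict(by_date)

def edge_support(transactions):
    """Count in how many graph transactions each directed edge occurs."""
    support = defaultdict(int)
    for edges in transactions.values():
        for edge in set(edges):  # an edge counts once per transaction
            support[edge] += 1
    return dict(support)

# Made-up records; "d1" and "d2" stand in for calendar dates.
records = [("A", "B", "d1"), ("B", "C", "d1"), ("A", "B", "d2"), ("A", "B", "d2")]
transactions = partition_by_date(records)
support = edge_support(transactions)
```

An edge whose support is at least the chosen minimum support is a route that repeats over time, regardless of how many shipments it carried on any single day.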
16 Results ● Unable to run FSG on the entire data set due to insufficient memory / swap space. ● Most were small patterns. (The following is the biggest one)
17 Patterns Discovered by Using Conventional Mining Algorithms ● Mapped the dataset into a standard “transactional” representation. ● Used traditional data mining approaches. ● Used Weka for association rule mining, instance (tuple) classification and cluster analysis on the transportation data.
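Weka reads tabular data in its ARFF text format, so a "transactional" representation amounts to one row per shipment. The helper below is a hypothetical sketch of such an export; the attribute names and values are invented for the example and are not the paper's schema.

```python
def to_arff(relation, attributes, rows):
    """Render rows as ARFF text.

    `attributes` maps each name to "numeric" or to a list of nominal values.
    """
    lines = [f"@relation {relation}", ""]
    for name, kind in attributes.items():
        if kind == "numeric":
            lines.append(f"@attribute {name} numeric")
        else:
            lines.append(f"@attribute {name} {{{','.join(kind)}}}")
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join(str(row[name]) for name in attributes))
    return "\n".join(lines) + "\n"

# Illustrative attributes only.
attributes = {"origin": ["A", "B"], "mode": ["truck", "rail"], "distance": "numeric"}
rows = [
    {"origin": "A", "mode": "truck", "distance": 120},
    {"origin": "B", "mode": "rail", "distance": 640},
]
arff_text = to_arff("shipments", attributes, rows)
```

Flattening each shipment into an independent row is what makes standard association rule, classification, and clustering tools applicable, and it is also why the connections between routes are no longer visible to them.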
18 Evaluations of Conventional Algorithms ● Traditional data mining techniques have produced interesting and meaningful results that summarize our data. ● Further experimentation is required to explore the potential and limitations of these techniques on temporal transportation network data. ● These techniques lose some insights from the structural characteristics of the data.
19 Challenges for Data Mining Research ● Handling the temporal aspects of graphs (dynamic graphs). ● Incorporating the notion of events into a graph. ● Expanding graph mining techniques beyond data similar to molecular structures. ● Determining what makes a graph pattern interesting.