VLDB 2008. 2008-08-28 2  Motivation  TraClass: Trajectory Feature Generation  Trajectory Partitioning  Region-Based Clustering  Trajectory-Based.


1 VLDB 2008

2
• Motivation
• TraClass: Trajectory Feature Generation
• Trajectory Partitioning
• Region-Based Clustering
• Trajectory-Based Clustering
• Classification Strategy
• Performance Evaluation
• Related Work
• Conclusions

3
[Figure: classification workflow. Labels: Training data; Feature Generation; Features; Classifier; Unseen data (Jeff, Professor, 4, ?); Prediction; Class label (Tenured = Yes); Scope of this paper]

4
• A trajectory is a sequence of locations and timestamps of a moving object
[Figure: example trajectories of hurricanes, turtles, vessels, and vehicles]

5
• Definition: The process of predicting the class labels of moving objects based on their trajectories and other features
• Applications: Homeland security, weather forecasting, law enforcement, etc.
• Example: Detection of vessel types (e.g., container ships, tankers, and fishing boats) from satellite images

6
• Several trajectory classification methods have been proposed, mainly in the fields of pattern recognition, bioengineering, and video surveillance
• A common characteristic of earlier methods is that they classify using the shapes of whole trajectories, e.g., with the hidden Markov model (HMM)
• Note: Although a few methods partition trajectories, they do so only to approximate or smooth the trajectories

7
• Problem Statement: Given a set of labeled trajectories, generate discriminative trajectory features that make a specific class distinguishable from other classes
• Observations:
  (1) Discriminative features are likely to appear in parts of trajectories, not in whole trajectories
  (2) Discriminative features appear not only as common movement patterns but also as regions

8
• Observation 1: The parts of trajectories near the container port and near the refinery enable us to distinguish between container ships and tankers, even if they share common long paths
• Observation 2: The parts of trajectories in the fishery enable us to recognize fishing boats, even if they have no common path there
[Figure labels: Region; Sub-trajectory]

9
• The classification accuracy of earlier methods might not be high, since the overall shapes of whole trajectories are similar to each other
• Our framework, TraClass, aims at discovering both region features and sub-trajectory features
[Figure label: Overall shape]

10
• Extract features in a top-down fashion, first by region-based clustering and then by trajectory-based clustering
[Figure: framework overview. Labels: Trajectory partitions; Region-based and Trajectory-based clusters; Trajectory partitions in non-homogeneous regions; Recursively quantize non-homogeneous regions; Repeatedly find finer-granularity clusters]

11
• Achieve high classification accuracy owing to the collaboration between the two types of clustering
• Region features ← Region-based clustering
• Sub-trajectory features ← Trajectory partitioning and trajectory-based clustering

12
[Figure: framework overview, same diagram as on slide 10]

13
1. Trajectories are partitioned based on their shapes, as in the partition-and-group framework [12]
2. Trajectory partitions are further partitioned by the class labels
• The goal is to guarantee that trajectory partitions do not span class boundaries
[Figure labels: Additional partitioning points; Non-discriminative; Discriminative; Class A; Class B]

14
• If the most prevalent class around one endpoint of a trajectory partition differs from that around the other endpoint, partition it further
• Example:
[Figure: a trajectory partition with prevalent class = Class A around one endpoint and prevalent class = Class B around the other; it needs to be further partitioned]
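The endpoint test on this slide can be sketched roughly as follows. This is only an illustration under assumptions that the slide does not state: "around an endpoint" is taken to mean a fixed-radius neighborhood of labeled points, and the names `prevalent_class`, `needs_further_partitioning`, and `radius` are hypothetical.

```python
from collections import Counter
from math import dist


def prevalent_class(point, labeled_points, radius):
    """Most common class among labeled points within `radius` of `point`.

    `labeled_points` is a list of ((x, y), label) pairs. The fixed-radius
    neighborhood is an assumption made for this sketch.
    Returns None when no labeled point is close enough.
    """
    nearby = [label for p, label in labeled_points if dist(point, p) <= radius]
    if not nearby:
        return None
    return Counter(nearby).most_common(1)[0][0]


def needs_further_partitioning(partition, labeled_points, radius):
    """True if the prevalent classes around the two endpoints differ."""
    start, end = partition
    class_at_start = prevalent_class(start, labeled_points, radius)
    class_at_end = prevalent_class(end, labeled_points, radius)
    if class_at_start is None or class_at_end is None:
        return False
    return class_at_start != class_at_end


labeled = [((0, 0), "A"), ((0, 1), "A"), ((10, 0), "B"), ((10, 1), "B")]
crossing = ((0, 0.5), (10, 0.5))   # runs from a Class A area to a Class B area
inside_a = ((0, 0), (0, 1))        # stays inside the Class A area
print(needs_further_partitioning(crossing, labeled, radius=2))
print(needs_further_partitioning(inside_a, labeled, radius=2))
```

A partition that crosses from an area dominated by Class A into one dominated by Class B is flagged for another split, while a partition that stays within one class is left alone.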

15
[Figure: framework overview, same diagram as on slide 10]

16
• Discover regions that have trajectories mostly of one class regardless of their movement patterns
• A region-based cluster is a set of trajectory partitions of the same class within a rectangular region, regardless of their movement patterns
[Figure labels: (1); (2)]

17
• Homogeneity: The class distribution in each region should be as homogeneous as possible
• Conciseness: The number of regions should be as small as possible
• Note: The two properties contradict each other
• Need to find a good tradeoff between the two properties
[Figure labels: One large region; Many small regions; homogeneity; conciseness]

18
• The minimum description length (MDL) cost consists of the description cost and the code cost
• The former measures conciseness, and the latter measures homogeneity
• The best hypothesis is the one that minimizes the sum of the description cost and the code cost
• Finding a good quantization translates to finding the best hypothesis using the MDL principle

19
• Progressively find a better partitioning, alternating between the X axis and the Y axis, as long as the MDL cost decreases
• Select the partition that has the maximum code cost and divide it into two parts in order to decrease the MDL cost
[Figure: panels (1), (2), (3), (4)]
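The greedy quantization can be illustrated with a simplified sketch. The slides do not give the cost formulas, so everything here is an assumption: the code cost of a region is taken to be the number of bits needed to encode its class labels (count times entropy), the description cost is a fixed `bits_per_region` per region, the split is always at the midpoint, and only one axis is handled instead of alternating between X and Y. The function names are hypothetical.

```python
from collections import Counter
from math import log2


def code_cost(labels):
    """Assumed code cost: bits to encode the labels of one region (n * entropy).

    It is zero when the region is perfectly homogeneous.
    """
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum(c * log2(c / n) for c in Counter(labels).values())


def mdl_cost(regions_labels, bits_per_region=8.0):
    """Assumed MDL cost: description cost (conciseness) + code cost (homogeneity)."""
    description = bits_per_region * len(regions_labels)
    code = sum(code_cost(labels) for labels in regions_labels)
    return description + code


def quantize_1d(points, lo, hi, bits_per_region=8.0):
    """Greedily split [lo, hi) along one axis while the MDL cost decreases.

    `points` is a list of (x, label) pairs.
    """
    def labels_in(region):
        return [label for x, label in points if region[0] <= x < region[1]]

    regions = [(lo, hi)]
    while True:
        current = mdl_cost([labels_in(r) for r in regions], bits_per_region)
        # Pick the region with the maximum code cost and try splitting it in two.
        worst = max(regions, key=lambda r: code_cost(labels_in(r)))
        mid = (worst[0] + worst[1]) / 2
        candidate = [r for r in regions if r != worst]
        candidate += [(worst[0], mid), (mid, worst[1])]
        if mdl_cost([labels_in(r) for r in candidate], bits_per_region) < current:
            regions = sorted(candidate)
        else:
            return regions


# Class A occupies [0, 5) and class B occupies [5, 10).
sample = [(i * 0.25, "A") for i in range(20)] + [(5 + i * 0.25, "B") for i in range(20)]
print(quantize_1d(sample, 0, 10))
```

In this toy case the first split removes all class impurity, so the code cost drops to zero; a second split would only add description cost, and the loop stops with two regions.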

20
[Figure: framework overview, same diagram as on slide 10]

21
• Discover sub-trajectories that indicate common movement patterns of each class
• A trajectory-based cluster is a set of trajectory partitions of the same class that share a common movement pattern
[Figure labels: (3); (4)]

22
• Similar to our trajectory clustering algorithm [12], but incorporates the class labels into clustering
• The algorithm is based on DBSCAN [5]
• If an ε-neighborhood contains trajectory partitions mostly of the same class, it is used for clustering; otherwise, it is discarded immediately
[Figure: ε-neighborhoods of L1 and L2. Labels: Non-homogeneous; Homogeneous; X; O]
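The class-aware neighborhood check can be sketched as below. This is not the full DBSCAN-based algorithm, only the homogeneity filter. The threshold `min_purity` is a hypothetical stand-in for "mostly of the same class", and the distance between trajectory partitions is passed in as a function because the slide does not define it; the example uses plain 1-D points purely for illustration.

```python
from collections import Counter


def eps_neighborhood(i, items, eps, distance):
    """Indices of all items within `eps` of item i (item i itself included)."""
    return [j for j in range(len(items)) if distance(items[i], items[j]) <= eps]


def is_homogeneous(neighbor_ids, labels, min_purity=0.9):
    """True if the majority class covers at least `min_purity` of the neighborhood.

    `min_purity` is an assumed threshold, not a value from the slides.
    """
    counts = Counter(labels[j] for j in neighbor_ids)
    return max(counts.values()) / len(neighbor_ids) >= min_purity


# Toy data: 1-D positions standing in for trajectory partitions.
items = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
labels = ["A", "A", "A", "A", "B", "B"]


def gap(a, b):
    return abs(a - b)


near_first = eps_neighborhood(0, items, eps=0.5, distance=gap)
near_fourth = eps_neighborhood(3, items, eps=0.5, distance=gap)
print(is_homogeneous(near_first, labels))    # all Class A, kept for clustering
print(is_homogeneous(near_fourth, labels))   # mixed classes, discarded
```

Only neighborhoods that pass the filter would go on to seed or expand a cluster in the density-based step.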

23
• After trajectory-based clusters are found, discriminative clusters are selected for effective classification
• If the average distance from a cluster to the clusters of different classes is high, the discriminative power of the cluster is high
• Example: C1 is more discriminative than C2
[Figure: clusters C1 and C2 among trajectories of Class A and Class B]
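A minimal sketch of this selection criterion follows. It assumes, for illustration only, that each cluster is summarized by a single representative point (`center`) and that Euclidean distance between centers stands in for the cluster distance, which the slide does not specify. The dictionary layout and the function name are hypothetical.

```python
from math import dist


def discriminative_power(cluster, clusters):
    """Average distance from `cluster` to all clusters of a different class.

    Each cluster is a dict with a "label" and a "center" (an assumed summary).
    Returns infinity when there is no cluster of another class.
    """
    others = [c for c in clusters if c["label"] != cluster["label"]]
    if not others:
        return float("inf")
    return sum(dist(cluster["center"], c["center"]) for c in others) / len(others)


c1 = {"label": "A", "center": (0.0, 0.0)}
c2 = {"label": "A", "center": (9.0, 0.0)}
b1 = {"label": "B", "center": (10.0, 0.0)}
b2 = {"label": "B", "center": (10.0, 2.0)}
all_clusters = [c1, c2, b1, b2]

# c1 lies far from the Class B clusters, c2 lies close to them.
print(discriminative_power(c1, all_clusters) > discriminative_power(c2, all_clusters))
```

Clusters could then be ranked by this score and only the higher-scoring ones kept as features.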

24
• A cluster link is a sequence of connectable (i.e., consecutive) trajectory-based clusters
• Two clusters are connectable if they share enough trajectories (more formally, the ratio of common trajectories is higher than χ)
• Cluster links make it possible to derive whole-trajectory features as well
• Cluster links are added to the set of trajectory-based clusters for use in classification
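The connectability test can be sketched as follows. The slide does not say how the ratio of common trajectories is normalized, so this sketch assumes it is the number of shared trajectory ids divided by the size of the smaller cluster. The check that two clusters are consecutive along the trajectories is left out, and all function names are hypothetical.

```python
def common_ratio(a, b):
    """Assumed ratio: shared trajectory ids over the size of the smaller cluster."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def connectable(a, b, chi):
    """Two clusters are connectable if their common ratio is higher than chi."""
    return common_ratio(a, b) > chi


def connectable_pairs(clusters, chi):
    """Index pairs (i, j), i < j, of connectable clusters."""
    return [
        (i, j)
        for i in range(len(clusters))
        for j in range(i + 1, len(clusters))
        if connectable(clusters[i], clusters[j], chi)
    ]


# Each cluster is represented by the set of ids of trajectories passing through it.
cluster_a = {1, 2, 3, 4}
cluster_b = {3, 4, 5}
cluster_c = {9}
print(connectable_pairs([cluster_a, cluster_b, cluster_c], chi=0.5))
```

Chains of connectable pairs would then be assembled into longer cluster links and added to the feature set.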

25
1. Partition trajectories by considering the class labels
2. Perform region-based clustering
3. Perform trajectory-based clustering
4. Select discriminative trajectory-based clusters
5. Find cluster links from trajectory-based clusters
6. Convert each trajectory into a feature vector
   • Each feature is either a region-based cluster or a trajectory-based cluster
   • The i-th entry of a feature vector is the frequency with which the i-th feature occurs in the trajectory
7. Feed the feature vectors to the SVM
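Step 6 can be sketched in a few lines. The sketch assumes that each trajectory has already been reduced to a list of feature indices, one per trajectory partition, with `None` for partitions that fall into no cluster; that encoding and the function name are hypothetical. Training the SVM in step 7 is not shown.

```python
def to_feature_vector(partition_features, num_features):
    """Count how often each feature (cluster) occurs in one trajectory.

    `partition_features` holds, for each partition of the trajectory, the index
    of the cluster it belongs to, or None if it belongs to no cluster.
    """
    vector = [0] * num_features
    for feature in partition_features:
        if feature is not None:
            vector[feature] += 1
    return vector


# A trajectory whose partitions fall into clusters 0, 2, 2, none, and 1.
print(to_feature_vector([0, 2, 2, None, 1], num_features=4))
```

The resulting vectors, one per trajectory, form the training matrix that would be passed to an SVM implementation together with the class labels.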

26
• Use three real trajectory data sets
• Animal movement data set
  • Movements of elk, deer, and cattle for the years 1993 through 1996
  • Three classes: Elk, Deer, and Cattle
  • Number of trajectories (points): 38 (7117), 30 (4333), and 34 (3540)
• Vessel navigation data set
  • Navigation paths of two vessels in August 2000
  • Two classes: Point Lobos and Point Sur
  • Number of trajectories (points): 600 (65500) and 550 (125750)
• Hurricane track data set
  • Atlantic hurricanes for the years 1950 through 2006
  • Two classes: Category 2 and Category 3
  • Number of trajectories (points): 61 (2459) and 72 (3126)
• Randomly select 20% of trajectories for the test set

27
• Measure classification accuracy, training time, and prediction time for the three data sets
• Compare two versions of the algorithm
  • TB-ONLY: Perform trajectory-based clustering only
  • RB-TB: Perform both types of clustering
• TB-ONLY is expected to be no worse than earlier methods, since it also discovers whole-trajectory features through cluster-link generation
• Classification accuracy = (# of test trajectories correctly classified) / (total # of test trajectories)
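The evaluation protocol, a random 20% test split followed by the accuracy measure above, can be sketched as follows. The function names, the fixed `seed`, and the rounding of the test-set size are assumptions made for this illustration; the slides do not describe how the split was implemented.

```python
import random


def split_train_test(items, test_fraction=0.2, seed=0):
    """Randomly hold out `test_fraction` of the items as the test set."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def classification_accuracy(predicted, actual):
    """Fraction of test trajectories whose predicted label matches the true one."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and equally long")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


train_ids, test_ids = split_train_test(range(100))
print(len(train_ids), len(test_ids))
print(classification_accuracy(["A", "B", "A"], ["A", "B", "B"]))
```

Multiplying the returned fraction by 100 gives the percentage form used for reporting accuracy.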

28
Data Set        Animal              Vessel              Hurricane
Version         TB-ONLY   RB-TB     TB-ONLY   RB-TB     TB-ONLY   RB-TB
Accuracy (%)    50.0      83.3      84.4      98.2      65.4      73.1
Training Time (ms) 354224064468322902331317
Prediction Time (ms) 104987226084846
• The classification accuracy of RB-TB is much higher than that of TB-ONLY
• The training time of RB-TB is much shorter than that of TB-ONLY

29
• Data: Three classes (Red: Elk; Blue: Deer; Black: Cattle)
• Features: 10 region-based clusters, 37 trajectory-based clusters
• Accuracy = 83.3%

30
[Figure: hurricane tracks. Labels: Gulf of Mexico; Red: Category 2; Blue: Category 3]
• 1 region-based cluster, 15 trajectory-based clusters
• Stronger hurricanes tend to go further than weaker ones
• These hurricanes entered the Gulf of Mexico and thus stayed longer at sea before landfall than others; they are likely to become strong because hurricanes gain energy from the evaporation of warm ocean water

31
• Effect of region-based clustering
• Effect of the data size (scalability test)

32
• Pattern recognition [1], e.g., speech, handwriting, signature, and gesture recognition
  • Classifying human motion trajectories
  • Employing the hidden Markov model (HMM)
• Bioengineering [16]
  • Classifying biological motion trajectories
• Video surveillance [15]
  • Detecting suspicious behaviors of pedestrians
• Time-series classification [20,21]
• Moving-object anomaly detection [14]

33
• A novel and comprehensive feature generation framework for trajectories has been proposed
• The primary advantage is the high classification accuracy owing to the collaboration between the two types of clustering
• Various real-world applications, e.g., vessel classification, can benefit from our framework
