Published by Victor Marshall. Modified over 9 years ago.
1
Trajectory Outlier Detection: A Partition-and-Detect Framework
Jae-Gil Lee, Jiawei Han, and Xiaolei Li
Department of Computer Science, University of Illinois at Urbana-Champaign
ICDE 2008, April 8, 2008
2
Table of Contents
Motivation
Partition-and-Detect Framework
Outlier Detection Algorithm: TRAOD
− Partitioning Phase (Simple)
− Detection Phase
− Partitioning Phase (Enhanced)
Performance Evaluation
Related Work
Conclusions
3
Outlier Detection
Definition: the process of detecting a data object that is grossly different from or inconsistent with the remaining set of data
Applications: detection of credit card fraud, monitoring of criminal activities in electronic commerce, etc.
Algorithms: distribution-based, distance-based, density-based, and deviation-based
Target data: previous research has mainly dealt with outlier detection for point data
4
Analysis on Trajectory Data
Tremendous amounts of trajectory data of moving objects are being collected
− Examples: vehicle positioning data, hurricane tracking data, animal movement data, etc.
Trajectory outlier detection has many important real-world applications
− Detection of suspicious persons in video surveillance
− Analysis of unusual air-mass trajectories in meteorology
− …
A powerful outlier detection algorithm for trajectories is urgently needed
5
Limitations of Existing Algorithms
Knorr et al. [5] have presented one of the very few attempts
− Define the distance between two whole trajectories using summary information (e.g., the coordinates of the starting and ending points)
− Apply a distance-based approach to the detection of trajectory outliers
Existing algorithms might not be able to detect outlying portions of trajectories
− Example: TR_3 is not detected as an outlier since its overall behavior is similar to that of its neighboring trajectories
[Figure: trajectories TR_1 to TR_5; TR_3 contains an outlying sub-trajectory]
6
Discovery of Outlying Sub-Trajectories
Discovery of outlying sub-trajectories is very useful in the real world
− Example: sudden changes in a hurricane's path [10]
We propose the partition-and-detect framework
7
The Partition-and-Detect Framework
Consists of two phases: partitioning and detection
[Figure: a set of trajectories TR_1 to TR_5 is (1) partitioned into a set of trajectory partitions; (2) outlying trajectory partitions are then detected, and TR_3 is reported as an outlier]
Note: a set of outlying trajectory partitions indicates an outlying sub-trajectory
8
The Problem Statement
Given a set of trajectories I = {TR_1, …, TR_n}, our algorithm generates a set of outliers O = {O_1, …, O_m} with outlying trajectory partitions for each O_i
Necessary definitions:
− A trajectory is a sequence of multi-dimensional points, denoted TR_i = p_1 p_2 p_3 … p_j … p_{len_i}
− A trajectory partition (t-partition for short) is a line segment p_i p_j (i < j), where p_i and p_j are points chosen from the same trajectory
− A t-partition is outlying if it does not have a sufficient number of similar neighbors
− A trajectory is an outlier if it contains a non-negligible amount of outlying t-partitions
9
The Outlier Detection Algorithm: TRAOD
Based on the partition-and-detect framework
Algorithm TRAOD (TRAjectory Outlier Detection)
Input: A set of trajectories I = {TR_1, …, TR_n}
Output: A set of outliers O = {O_1, …, O_m} with outlying t-partitions for each O_i
Algorithm:
  /* Partitioning Phase */
  for each TR ∈ I do
    Partition TR into a set L of line segments;
    Accumulate L into a set D;
  /* Detection Phase */
  for each P ∈ D do
    Mark P if it is an outlying t-partition;
  for each TR ∈ I do
    Output TR if it is an outlier;
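The two phases of the pseudocode can be sketched in Python as follows. This is only an illustrative skeleton, not the authors' implementation: the function name `traod` and the three callbacks `partition`, `is_outlying`, and `is_outlier` are hypothetical names introduced here, and the actual partitioning strategy and outlier tests have to be supplied by the caller.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]  # a t-partition: (start point, end point)


def traod(
    trajectories: Sequence[Sequence[Point]],
    partition: Callable[[Sequence[Point]], List[Segment]],
    is_outlying: Callable[[Segment, int, List[Tuple[int, Segment]]], bool],
    is_outlier: Callable[[List[Segment], List[Segment]], bool],
) -> List[Tuple[int, List[Segment]]]:
    """Return (trajectory index, outlying t-partitions) for each outlier."""
    # Partitioning phase: split every trajectory and pool the segments,
    # remembering which trajectory each segment came from.
    pool: List[Tuple[int, Segment]] = []
    for tid, tr in enumerate(trajectories):
        pool.extend((tid, seg) for seg in partition(tr))

    # Detection phase, step 1: mark the outlying t-partitions.
    marked = [(tid, seg) for tid, seg in pool if is_outlying(seg, tid, pool)]

    # Detection phase, step 2: output the trajectories that are outliers.
    result = []
    for tid in range(len(trajectories)):
        all_segs = [s for t, s in pool if t == tid]
        out_segs = [s for t, s in marked if t == tid]
        if all_segs and is_outlier(all_segs, out_segs):
            result.append((tid, out_segs))
    return result
```

Keeping the three decisions as callbacks mirrors the slide's separation of concerns: the partitioning strategy can be swapped without touching the detection phase.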
10
Where We Are Now
  /* Partitioning Phase */
  for each TR ∈ I do
    Partition TR into a set L of line segments (by a simple strategy; later, by a two-level partitioning strategy);
    Accumulate L into a set D;
  /* Detection Phase */
  for each P ∈ D do
    Mark P if it is an outlying t-partition;
  for each TR ∈ I do
    Output TR if it is an outlier;
11
A Simple Partitioning Strategy (1/2)
Careless partitioning (especially into long t-partitions) could miss possible outliers
− Example: even though TR_out behaves differently from its neighboring trajectories, these differences are averaged out by careless partitioning
[Figure: a trajectory TR_out, one long t-partition of it, and its neighboring trajectories]
12
A Simple Partitioning Strategy (2/2)
A trajectory is partitioned at a base unit: the smallest meaningful unit of a trajectory in a given application
− Example: the base unit can be every single point
Pros: high detection quality in general
Cons: poor performance due to a large number of t-partitions; remedied by a two-level partitioning strategy
[Figure: the trajectory TR_out partitioned at the base unit, next to its neighboring trajectories; one t-partition is marked as outlying]
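A minimal sketch of the simple strategy, assuming the base unit is every single point as in the example above. The function name `partition_at_base_unit` is a hypothetical name chosen for illustration.

```python
def partition_at_base_unit(points):
    """Split a trajectory into t-partitions that join consecutive points.

    Assumes the base unit is a single point, so a trajectory with n points
    yields n - 1 t-partitions.
    """
    return [(points[i], points[i + 1]) for i in range(len(points) - 1)]
```

Because every pair of consecutive points becomes its own t-partition, the number of segments grows with the number of points, which is the performance cost the slide mentions.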
14
Distance between T-Partitions
The weighted sum of three components: the perpendicular distance, the parallel distance, and the angle distance
Adapted from similarity measures used in the domain of pattern recognition [13]
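The slide's formulas for the three components are not preserved in this transcript, so the sketch below is an assumption rather than the authors' exact definition. It uses one formulation that is common in line-segment similarity work: the shorter segment is projected onto the longer one, the perpendicular distance combines the two endpoint offsets, the parallel distance is the smaller gap between a projection and the nearer endpoint, and the angle distance is the shorter length times sin θ (or the full length when the segments point in opposing directions). The function name, the 2-D restriction, and the default weights of 1.0 are all illustrative choices.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


def _norm(a):
    return math.hypot(a[0], a[1])


def segment_distance(seg_a, seg_b, w_perp=1.0, w_par=1.0, w_angle=1.0):
    """Weighted sum of perpendicular, parallel, and angle distances (2-D)."""
    # Use the longer segment as the reference (si, ei), the shorter as (sj, ej).
    (si, ei), (sj, ej) = sorted(
        (seg_a, seg_b), key=lambda s: _norm(_sub(s[1], s[0])), reverse=True
    )
    vi, vj = _sub(ei, si), _sub(ej, sj)
    len_i, len_j = _norm(vi), _norm(vj)
    if len_i == 0.0:
        raise ValueError("at least one segment must have non-zero length")

    def project(p):
        # Orthogonal projection of p onto the line through si and ei.
        t = _dot(_sub(p, si), vi) / (len_i * len_i)
        return (si[0] + t * vi[0], si[1] + t * vi[1])

    ps, pe = project(sj), project(ej)

    # Perpendicular distance: combine the offsets of the two endpoints.
    l1, l2 = _norm(_sub(sj, ps)), _norm(_sub(ej, pe))
    d_perp = (l1 * l1 + l2 * l2) / (l1 + l2) if l1 + l2 > 0 else 0.0

    # Parallel distance: smallest gap between a projection and an endpoint.
    d_par = min(
        min(_norm(_sub(ps, si)), _norm(_sub(ps, ei))),
        min(_norm(_sub(pe, si)), _norm(_sub(pe, ei))),
    )

    # Angle distance: len_j * sin(theta) below 90 degrees, len_j otherwise.
    if len_j == 0.0:
        d_angle = 0.0
    else:
        cos_t = _dot(vi, vj) / (len_i * len_j)
        sin_t = math.sqrt(max(0.0, 1.0 - cos_t * cos_t))
        d_angle = len_j * sin_t if cos_t > 0 else len_j

    return w_perp * d_perp + w_par * d_par + w_angle * d_angle
```

Sorting by length first makes the function symmetric in its two arguments, which is convenient when the result is compared against a single threshold.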
15
Trajectory Outliers Based on Distance (1/2)
Def. (a close trajectory): [formula not preserved in the transcript]
Def. (an outlying t-partition): [formula not preserved in the transcript]
[Figure: a trajectory TR_j that is close to the t-partition L_i and one that is not close; an L_i that is an outlying t-partition and one that is not; threshold labels "≤ 1 − p" and "> 1 − p"]
16
Trajectory Outliers Based on Distance (2/2)
Def. (an outlier): A trajectory TR_i is an outlier if
  (the sum of the lengths of outlying t-partitions in TR_i) / (the sum of the lengths of all t-partitions in TR_i) ≥ F
[Figure: TR_i is an outlier; TR_j is not an outlier]
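The length-ratio test can be written down directly. The sketch below assumes 2-D segments given as pairs of points and assumes the outlying t-partitions have already been identified; the function names are hypothetical.

```python
import math


def seg_length(seg):
    """Euclidean length of a 2-D segment ((x1, y1), (x2, y2))."""
    (x1, y1), (x2, y2) = seg
    return math.hypot(x2 - x1, y2 - y1)


def outlying_fraction(all_segs, outlying_segs):
    """Share of a trajectory's total length covered by outlying t-partitions."""
    total = sum(seg_length(s) for s in all_segs)
    if total == 0:
        return 0.0
    return sum(seg_length(s) for s in outlying_segs) / total


def is_outlier(all_segs, outlying_segs, F):
    """A trajectory is an outlier when the outlying share reaches F."""
    return outlying_fraction(all_segs, outlying_segs) >= F
```

For example, a trajectory whose t-partitions have lengths 3, 4, and 3, with only the middle one outlying, has an outlying share of 0.4, so it is reported for F = 0.2 but not for F = 0.5.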
17
Incorporation of Density (1/2)
The previous definition, as it is, has the local density problem
− A t-partition in a dense region tends to have a relatively larger number of close trajectories than one in a sparse region
− T-partitions in dense regions are favored!
18
Incorporation of Density (2/2)
Def. (the density of a t-partition): The density of a t-partition L_i is the number of t-partitions within the distance σ from L_i, where σ is the standard deviation of pairwise distances between t-partitions
Def. (the adjusting coefficient of a t-partition):
  adj(L_i) = (the average density of all t-partitions) / (the density of the t-partition L_i)
Adjustment by the density
− The number of close trajectories is multiplied by the adjusting coefficient adj(L_i)
− adj(L_i) < 1.0 in a dense region
− adj(L_i) > 1.0 in a sparse region
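A small sketch of the density adjustment, working from a precomputed matrix of pairwise distances between t-partitions. Two details are assumptions made here because the slide does not settle them: σ is taken as the population standard deviation, and each t-partition counts itself in its own neighborhood (which also keeps the density non-zero). The function name is hypothetical.

```python
from statistics import pstdev


def adjusting_coefficients(dist):
    """Return adj(L_i) for every t-partition, given a symmetric distance matrix.

    dist[i][j] is the distance between t-partitions i and j; at least two
    t-partitions are expected.
    """
    n = len(dist)
    # Sigma: standard deviation of the pairwise distances (each pair once).
    pairwise = [dist[i][j] for i in range(n) for j in range(i + 1, n)]
    sigma = pstdev(pairwise)
    # Density: number of t-partitions within sigma, the t-partition included.
    density = [sum(1 for j in range(n) if dist[i][j] <= sigma) for i in range(n)]
    average = sum(density) / n
    # Dense regions get a coefficient below 1.0, sparse regions above 1.0.
    return [average / d for d in density]
```

With three t-partitions clustered together and one far away, the clustered ones receive a coefficient below 1.0 and the isolated one a coefficient above 1.0, matching the two cases listed on the slide.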
19
Guidelines for Parameter Values
Three parameters: D corresponds to "similar", p to "sufficient", and F to "non-negligible"
Remark: there is no universally correct parameter value, even for the same data set and application
Our guideline: relies on user feedback
[Figure: guideline chart with the questions "Want many outliers?", "Have many trajectories?", and "Are trajectories short?"; it shows a smaller or larger D, the p values 0.90 and 0.99, and the F values 0.20 and 0.10]
21
Two-Level Trajectory Partitioning
Objective
− Achieves much higher performance than the simple strategy
− Obtains the same result as the simple strategy; i.e., does not lose the quality of the result
Basic idea
1. Partition a trajectory in coarse granularity first
2. Partition a coarse t-partition in fine granularity only when necessary
Main benefit
− Narrows the search space that needs to be inspected in fine granularity
− Many portions of trajectories can be pruned early on
22
Intuition behind Two-Level Trajectory Partitioning
If the distance between two coarse t-partitions is very large (or small), the distances between their fine t-partitions are also very large (or small)
[Figure: TR_i and TR_j under coarse-granularity partitioning and under fine-granularity partitioning]
Given two coarse t-partitions, can we know whether the distance between any two of their fine t-partitions is greater than (or less than) D?
23
Coarse-Granularity Partitioning*
Try to optimize two competing measures
Preciseness: the difference between a trajectory and the set of its coarse t-partitions should be as small as possible
− Required for making the bounds tight
Conciseness: the number of coarse t-partitions should be as small as possible
− Required for reducing the number of comparisons
Formulate this problem using the minimum description length (MDL) principle
− A good tradeoff between the two measures is found based on information theory
* Coarse-granularity partitioning is identical to that in our earlier work on trajectory clustering [15]
24
Fine-Granularity Partitioning
Identify outlying coarse t-partitions by deriving the distance bounds between two coarse t-partitions L_i and L_j
Suppose l_i is a fine t-partition in L_i and l_j is one in L_j; for a distance function f:
− lb(L_i, L_j, f): the lower bound of f(l_i, l_j)
− ub(L_i, L_j, f): the upper bound of f(l_i, l_j)
Derive the above bounds separately for each of the three distance components (Lemmas 1~3) and combine them (Lemma 4)
[Figure: coarse t-partitions L_i of TR_i and L_j of TR_j]
25
Derivation of the Distance Bounds
Lemmas 1~3. Bounds for the three distance components, one lemma per component
Lemma 4. Bounds for dist(L_i, L_j), obtained by combining Lemmas 1~3
26
Pruning Rules for Fine-Granularity Partitioning
Rule 1: If lb(L_i, L_j, dist) > D, fine-granularity partitioning is not required when comparing L_i and L_j
Rule 2: If ub(L_i, L_j, dist) ≤ D, fine-granularity partitioning is required, but the distances between the fine t-partitions in L_i and L_j need not be computed
[Figure: left, L_i and L_j with lb(L_i, L_j, dist) > D; right, L_i and L_j with ub(L_i, L_j, dist) ≤ D]
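The two rules amount to a three-way decision per pair of coarse t-partitions. The sketch below assumes the bounds `lb` and `ub` have already been computed by some other routine (the lemmas themselves are not reproduced in this transcript); the function name and the returned labels are hypothetical.

```python
def compare_coarse_pair(lb, ub, D):
    """Decide how much work a pair of coarse t-partitions needs.

    lb and ub are assumed to bound the distance between every pair of fine
    t-partitions drawn from the two coarse t-partitions, with lb <= ub.
    """
    if lb > D:
        # Rule 1: every fine pair is farther apart than D, so skip the pair.
        return "prune"
    if ub <= D:
        # Rule 2: every fine pair is within D, so no distance is computed.
        return "all_close"
    # Neither rule applies: fine-level distances must be computed.
    return "compute"
```

Only the last case pays for fine-granularity distance computations, which is where the pruning power reported later in the deck comes from.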
27
Performance Evaluation
Use two real trajectory data sets
Hurricane track data set
− Records the Atlantic hurricanes for the years 1950 through 2006
− The entire set: 608 trajectories and 18,951 points; a small set (1990~2006): 221 trajectories and 7,270 points
Animal movement data set
− Records the locations of elk, deer, and cattle for the years 1993 through 1996 (the Starkey Project)
− Elk1993: 33 trajectories and 15,422 points; Deer1995: 32 trajectories and 20,065 points; Cattle1993: 41 trajectories and 19,556 points
Validate the quality of outlier detection
Evaluate the effectiveness of the two-level partitioning strategy
28
Trajectory Outliers for Hurricane Data (Small)
D = 85, p = 0.95, F = 0.2 → # of outliers = 13
29
Trajectory Outliers for Elk1993
D = 55, p = 0.95, F = 0.1 → # of outliers = 3
30
Trajectory Outliers for Deer1995
D = 80, p = 0.95, F = 0.1 → # of outliers = 3
31
Effects of Parameter Values
(a) D = 83, p = 0.95, F = 0.2: 19 outliers
(b) D = 87, p = 0.95, F = 0.2: 10 outliers
32
Pruning Power of Two-Level Partitioning
2L-Total: the ratio of the number of pairs pruned by Rule 1 to the total number of pairs of coarse t-partitions
2L-False: the proportion of pairs pruned incorrectly
Optimal: the maximum ratio of pairs that can be pruned
Achieves high pruning power (64~88%)
33
Speedup Ratio of Two-Level Partitioning
Speedup Ratio = (the elapsed time of the algorithm using the simple partitioning strategy) / (the elapsed time of the algorithm using the two-level partitioning strategy)
Shows significant performance improvement
34
Related Work
Outlier detection algorithms for points
− Distribution-based [2], distance-based [3, 4, 5, 6], density-based [7, 8], deviation-based [9]
Trajectory outlier detection technique using a distance-based approach [5]
− Not clear whether this technique can detect outlying sub-trajectories from very complicated trajectories
Trajectory outlier detection algorithms based on classification [12]
− Require a good training set and depend on training
35
Conclusions
Proposed a novel framework, the partition-and-detect framework, for detecting trajectory outliers
For the 1st phase, proposed a two-level trajectory partitioning strategy
− Ensures both high quality and high efficiency
For the 2nd phase, proposed a hybrid of the distance-based and density-based approaches
− Very intuitive, but does not have the local density problem
Demonstrated the effectiveness of TRAOD using various real trajectory data
36
Thank You!