
1 Trajectory Outlier Detection: A Partition-and-Detect Framework (04/08/08)
April 8, 2007
Jae-Gil Lee, Jiawei Han, and Xiaolei Li
Department of Computer Science, University of Illinois at Urbana-Champaign
ICDE 2008

2 Table of Contents
• Motivation
• Partition-and-Detect Framework
• Outlier Detection Algorithm: TRAOD
  - Partitioning Phase (Simple)
  - Detection Phase
  - Partitioning Phase (Enhanced)
• Performance Evaluation
• Related Work
• Conclusions

3 Outlier Detection
• Definition: the process of detecting a data object that is grossly different from or inconsistent with the remaining set of data
• Applications: detection of credit card fraud, monitoring of criminal activities in electronic commerce, etc.
• Algorithms: distribution-based, distance-based, density-based, and deviation-based
• Target data: previous research has mainly dealt with outlier detection of point data

4 Analysis on Trajectory Data
• Tremendous amounts of trajectory data of moving objects are being collected
  - Examples: vehicle positioning data, hurricane tracking data, animal movement data, etc.
• Trajectory outlier detection has many important real-world applications
  - Detection of suspicious persons in video surveillance
  - Analysis of unusual air-mass trajectories in meteorology
  - …
• A powerful outlier detection algorithm for trajectories is urgently needed

5 Limitations of Existing Algorithms
• Knorr et al. [5] have presented one of the very few attempts
  - Define the distance between two whole trajectories using summary information (e.g., the coordinates of the starting and ending points)
  - Apply a distance-based approach to the detection of trajectory outliers
• Existing algorithms might not be able to detect outlying portions of trajectories
  - Example: TR_3 is not detected as an outlier since its overall behavior is similar to that of neighboring trajectories
[Figure: trajectories TR_1 to TR_5, with an outlying sub-trajectory marked]

6 Discovery of Outlying Sub-Trajectories
• Discovery of outlying sub-trajectories is very useful in the real world
  - Example: sudden changes in a hurricane's path [10]
• We propose the partition-and-detect framework

7 The Partition-and-Detect Framework
• Consists of two phases: partitioning and detection
[Figure: a set of trajectories (TR_1 to TR_5) → (1) Partition → a set of trajectory partitions → (2) Detect → outlying trajectory partitions; TR_3 is reported as an outlier]
Note: a set of outlying trajectory partitions indicates an outlying sub-trajectory

8 The Problem Statement
• Given a set of trajectories I = {TR_1, …, TR_n}, our algorithm generates a set of outliers O = {O_1, …, O_m} with outlying trajectory partitions for each O_i
• Necessary definitions:
  - A trajectory is a sequence of multi-dimensional points, denoted as TR_i = p_1 p_2 p_3 … p_j … p_{len_i}; a trajectory partition (t-partition for short) is a line segment p_i p_j (i < j), where p_i and p_j are points chosen from the same trajectory
  - A t-partition is outlying if it does not have a sufficient number of similar neighbors
  - A trajectory is an outlier if it contains a non-negligible amount of outlying t-partitions

9 The Outlier Detection Algorithm: TRAOD
• Based on the partition-and-detect framework
Algorithm TRAOD (TRAjectory Outlier Detection)
Input: A set of trajectories I = {TR_1, …, TR_n}
Output: A set of outliers O = {O_1, …, O_m} with outlying t-partitions for each O_i
Algorithm:
    /* Partitioning Phase */
    for each TR ∈ I do
        Partition TR into a set L of line segments;
        Accumulate L into a set D;
    /* Detection Phase */
    for each P ∈ D do
        Mark P if it is an outlying t-partition;
    for each TR ∈ I do
        Output TR if it is an outlier;
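
The two phases of TRAOD can be sketched as a small driver function. This is only an illustrative skeleton, not the authors' implementation: the function name `traod` and the three callbacks (`partition`, `is_outlying`, `is_outlier`) are hypothetical names introduced here, and the actual partitioning and detection criteria are left to the caller.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Point = Tuple[float, ...]
Segment = Tuple[Point, Point]  # a t-partition: (start point, end point)


def traod(
    trajectories: Dict[str, Sequence[Point]],
    partition: Callable[[Sequence[Point]], List[Segment]],
    is_outlying: Callable[[Segment, str, Dict[str, List[Segment]]], bool],
    is_outlier: Callable[[List[Segment], List[bool]], bool],
) -> Dict[str, List[Segment]]:
    """Return {trajectory id: its outlying t-partitions} for every outlier.

    All three callbacks are hypothetical plug-in points (assumptions of this
    sketch); the slides define what they should decide, not how to code them.
    """
    # Partitioning phase: split every trajectory into t-partitions.
    partitions = {tid: partition(points) for tid, points in trajectories.items()}

    # Detection phase, step 1: mark each t-partition as outlying or not.
    marks = {
        tid: [is_outlying(seg, tid, partitions) for seg in segs]
        for tid, segs in partitions.items()
    }

    # Detection phase, step 2: report trajectories that qualify as outliers,
    # together with their outlying t-partitions.
    outliers: Dict[str, List[Segment]] = {}
    for tid, segs in partitions.items():
        if is_outlier(segs, marks[tid]):
            outliers[tid] = [seg for seg, flag in zip(segs, marks[tid]) if flag]
    return outliers
```

Passing the criteria in as callbacks keeps the skeleton independent of any particular distance function or threshold.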

10 Where We Are Now
    /* Partitioning Phase */
    for each TR ∈ I do
        Partition TR into a set L of line segments (by a simple strategy, or by a two-level partitioning strategy);
        Accumulate L into a set D;
    /* Detection Phase */
    for each P ∈ D do
        Mark P if it is an outlying t-partition;
    for each TR ∈ I do
        Output TR if it is an outlier;

11 A Simple Partitioning Strategy (1/2)
• Careless partitioning (especially into long t-partitions) could miss possible outliers
  - Example: even though TR_out behaves differently from its neighboring trajectories, these differences are averaged out due to careless partitioning
[Figure: a trajectory TR_out, one of its t-partitions, and its neighboring trajectories]

12 A Simple Partitioning Strategy (2/2)
• A trajectory is partitioned at a base unit: the smallest meaningful unit of a trajectory in a given application
  - Example: the base unit can be every single point
  - Pros: high detection quality in general
  - Cons: poor performance due to a large number of t-partitions → remedied by a two-level partitioning strategy
[Figure: a trajectory TR_out partitioned at the base unit, with an outlying t-partition marked, and its neighboring trajectories]
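
A minimal sketch of the simple strategy, assuming the base unit is every single point as in the slide's example. The function names are hypothetical. The second function shows the careless extreme from the previous slide, where a whole trajectory collapses into one long t-partition and a local detour disappears.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]


def partition_at_every_point(trajectory: Sequence[Point]) -> List[Segment]:
    """Simple strategy: one t-partition between each pair of consecutive points."""
    return [(trajectory[i], trajectory[i + 1]) for i in range(len(trajectory) - 1)]


def partition_as_one_segment(trajectory: Sequence[Point]) -> List[Segment]:
    """Careless extreme: a single t-partition from the first to the last point."""
    return [(trajectory[0], trajectory[-1])] if len(trajectory) > 1 else []
```

For a trajectory such as `[(0, 0), (1, 0), (2, 3), (3, 0), (4, 0)]`, the first function keeps the detour through `(2, 3)` as two separate t-partitions, while the second returns only the straight segment from `(0, 0)` to `(4, 0)`. A trajectory of n points yields n - 1 t-partitions under the simple strategy, which is where the performance cost comes from.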

13 Where We Are Now
    /* Partitioning Phase */
    for each TR ∈ I do
        Partition TR into a set L of line segments (by a simple strategy, or by a two-level partitioning strategy);
        Accumulate L into a set D;
    /* Detection Phase */
    for each P ∈ D do
        Mark P if it is an outlying t-partition;
    for each TR ∈ I do
        Output TR if it is an outlier;

14 Distance between T-Partitions
• The weighted sum of three components: the perpendicular distance ($d_\perp$), the parallel distance ($d_\parallel$), and the angle distance ($d_\theta$)
  - Adapted from similarity measures used in the domain of pattern recognition [13]
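
The component formulas are not preserved in this transcript. The sketch below therefore rests on an assumption: it uses the line-segment distance from the authors' earlier trajectory clustering work, where the longer segment plays the role of L_i, the perpendicular distance is a Lehmer mean of the two projection distances, the parallel distance is the smaller projection offset from an endpoint of L_i, and the angle distance is the length of L_j times sin θ (or the full length of L_j when θ is 90 degrees or more). The 2-D restriction, the function names, and the default weights of 1.0 are also assumptions.

```python
import math
from typing import Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]


def _dist(p: Point, q: Point) -> float:
    return math.hypot(p[0] - q[0], p[1] - q[1])


def _project(p: Point, s: Point, e: Point) -> Point:
    """Project p onto the infinite line through s and e (returns s if s == e)."""
    dx, dy = e[0] - s[0], e[1] - s[1]
    denom = dx * dx + dy * dy
    if denom == 0.0:
        return s
    u = ((p[0] - s[0]) * dx + (p[1] - s[1]) * dy) / denom
    return (s[0] + u * dx, s[1] + u * dy)


def segment_distance(a: Segment, b: Segment,
                     w_perp: float = 1.0, w_par: float = 1.0,
                     w_theta: float = 1.0) -> float:
    """Weighted sum of perpendicular, parallel, and angle distances (assumed forms)."""
    # The longer segment is treated as L_i, the shorter one as L_j.
    li, lj = (a, b) if _dist(*a) >= _dist(*b) else (b, a)
    (si, ei), (sj, ej) = li, lj
    ps, pe = _project(sj, si, ei), _project(ej, si, ei)

    # Perpendicular distance: Lehmer mean of order 2 of the projection distances.
    l1, l2 = _dist(sj, ps), _dist(ej, pe)
    d_perp = 0.0 if l1 + l2 == 0.0 else (l1 * l1 + l2 * l2) / (l1 + l2)

    # Parallel distance: smallest offset of a projection point from an endpoint of L_i.
    d_par = min(_dist(ps, si), _dist(ps, ei), _dist(pe, si), _dist(pe, ei))

    # Angle distance: |L_j| * sin(theta) below 90 degrees, |L_j| otherwise.
    len_i, len_j = _dist(si, ei), _dist(sj, ej)
    if len_i == 0.0 or len_j == 0.0:
        d_theta = 0.0
    else:
        dot = (ei[0] - si[0]) * (ej[0] - sj[0]) + (ei[1] - si[1]) * (ej[1] - sj[1])
        cos_t = max(-1.0, min(1.0, dot / (len_i * len_j)))
        d_theta = len_j * math.sqrt(1.0 - cos_t * cos_t) if cos_t > 0.0 else len_j

    return w_perp * d_perp + w_par * d_par + w_theta * d_theta
```

The weights let an application emphasize one component, for example raising `w_theta` when direction changes matter more than positional offsets.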

15 Trajectory Outliers Based on Distance (1/2)
• Def. (a close trajectory): [formula not preserved in this transcript]
• Def. (an outlying t-partition): [formula not preserved in this transcript]
[Figure: left, TR_j is close to L_i versus TR_j is not close to L_i; right, L_i is an outlying t-partition when the proportion of close trajectories is ≤ 1 - p, and is not an outlying t-partition when it is > 1 - p]
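
The exact definitions did not survive in this transcript, so the following is a hedged sketch rather than the authors' formula. It assumes, based on the figure labels "≤ 1 - p" and "> 1 - p", that a t-partition is outlying when the number of trajectories close to it is at most a fraction 1 - p of all trajectories. The function name and the small tolerance `eps` (added only to absorb floating-point rounding) are assumptions.

```python
def is_outlying_t_partition(num_close: float, num_trajectories: int,
                            p: float, eps: float = 1e-9) -> bool:
    """Assumed test: outlying if close trajectories <= (1 - p) * all trajectories.

    num_close may be fractional, e.g. after a density adjustment.
    """
    return num_close <= (1.0 - p) * num_trajectories + eps
```

With p = 0.95 and 100 trajectories, a t-partition with at most 5 close trajectories would be marked as outlying under this assumption.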

16 Trajectory Outliers Based on Distance (2/2)
• Def. (an outlier): A trajectory TR_i is an outlier if
  $\frac{\text{the sum of the lengths of outlying t-partitions in } TR_i}{\text{the sum of the lengths of all t-partitions in } TR_i} \geq F$
[Figure: TR_i is an outlier; TR_j is not an outlier]
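
The outlier definition is a length-weighted fraction, which can be sketched directly. The function names are hypothetical, and the inputs (per-partition lengths and outlying flags) are assumed to come from the earlier marking step.

```python
from typing import Sequence


def outlying_fraction(lengths: Sequence[float], outlying: Sequence[bool]) -> float:
    """Length of outlying t-partitions divided by the length of all t-partitions."""
    total = sum(lengths)
    if total == 0.0:
        return 0.0  # degenerate trajectory: treated as having no outlying portion
    return sum(l for l, flag in zip(lengths, outlying) if flag) / total


def is_outlier(lengths: Sequence[float], outlying: Sequence[bool], F: float) -> bool:
    """A trajectory is an outlier if its outlying fraction is at least F."""
    return outlying_fraction(lengths, outlying) >= F
```

Weighting by length rather than counting t-partitions means that one long outlying t-partition contributes more than several very short ones.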

17 Incorporation of Density (1/2)
• The previous definition, as it is, has the local density problem
  - A t-partition in a dense region tends to have a relatively larger number of close trajectories than one in a sparse region
  - T-partitions in dense regions are favored!

18 Incorporation of Density (2/2)
• Def. (the density of a t-partition): the density of a t-partition L_i is the number of t-partitions within the distance σ from L_i, where σ is the standard deviation of pairwise distances between t-partitions
• Def. (the adjusting coefficient of a t-partition):
  $adj(L_i) = \frac{\text{the average density of all t-partitions}}{\text{the density of the t-partition } L_i}$
• Adjustment by the density
  - The number of close trajectories is multiplied by the adjusting coefficient adj(L_i)
  - adj(L_i) < 1.0 in a dense region
  - adj(L_i) > 1.0 in a sparse region
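
A small sketch of the density adjustment, working from a precomputed symmetric distance matrix between t-partitions. Two details are assumptions not stated on the slide: σ is taken as the population standard deviation, and each t-partition counts itself in its own density, which keeps every density at least 1 and avoids division by zero. Function names are hypothetical.

```python
import statistics
from typing import List, Sequence


def densities(dist: Sequence[Sequence[float]]) -> List[int]:
    """Number of t-partitions within sigma of each t-partition (itself included).

    dist is a symmetric n x n matrix with zeros on the diagonal, n >= 2.
    """
    n = len(dist)
    pairwise = [dist[i][j] for i in range(n) for j in range(i + 1, n)]
    sigma = statistics.pstdev(pairwise)  # population std. dev. (an assumption)
    return [sum(1 for j in range(n) if dist[i][j] <= sigma) for i in range(n)]


def adjusting_coefficients(density: Sequence[int]) -> List[float]:
    """adj(L_i) = average density of all t-partitions / density of L_i."""
    average = sum(density) / len(density)
    return [average / d for d in density]
```

As a toy check, four t-partitions whose mutual distances behave like points at 0, 1, 2, and 10 on a line give densities of 3, 3, 3, and 1. The isolated one receives a coefficient of 2.5, while the three clustered ones receive a coefficient below 1.0, matching the dense and sparse cases on the slide.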

19 Guidelines for Parameter Values
• Three parameters: D corresponds to "similar", p to "sufficient", and F to "non-negligible"
• Remark: there is no universally correct parameter value, even for the same data set and application
• Our guideline: rely on user feedback
[Figure: guideline chart with three questions (Want many outliers? Have many trajectories? Are trajectories short?) and the ranges D: smaller to larger, p: 0.90 to 0.99, F: 0.20 to 0.10]

20 Where We Are Now
    /* Partitioning Phase */
    for each TR ∈ I do
        Partition TR into a set L of line segments (by a simple strategy, or by a two-level partitioning strategy);
        Accumulate L into a set D;
    /* Detection Phase */
    for each P ∈ D do
        Mark P if it is an outlying t-partition;
    for each TR ∈ I do
        Output TR if it is an outlier;

21 Two-Level Trajectory Partitioning
• Objective
  - Achieves much higher performance than the simple strategy
  - Obtains the same result as the simple strategy; i.e., does not lose the quality of the result
• Basic idea
  1. Partition a trajectory in coarse granularity first
  2. Partition a coarse t-partition in fine granularity only when necessary
• Main benefit
  - Narrows the search space that needs to be inspected in fine granularity → many portions of trajectories can be pruned early on

22 Intuition behind Two-Level Trajectory Partitioning
• If the distance between coarse t-partitions is very large (or small), the distances between their fine t-partitions are also very large (or small)
• Given two coarse t-partitions, can we know if the distance between any two fine t-partitions is greater than (or less than) D?
[Figure: TR_i and TR_j under coarse-granularity partitioning and under fine-granularity partitioning]

23 Coarse-Granularity Partitioning*
• Try to optimize two competing measures
  - Preciseness: the difference between a trajectory and a set of its coarse t-partitions should be as small as possible (required for making the bounds tight)
  - Conciseness: the number of coarse t-partitions should be as small as possible (required for reducing the number of comparisons)
• Formulate this problem using the minimum description length (MDL) principle
  - A good tradeoff between the two measures is found based on information theory
* Coarse-granularity partitioning is identical to that in our earlier work on trajectory clustering [15]

24 Fine-Granularity Partitioning
• Identify outlying coarse t-partitions by deriving the distance bounds between two coarse t-partitions L_i and L_j
  - Suppose l_i is a fine t-partition in L_i and l_j is one in L_j
  - lb(L_i, L_j, f): the lower bound of f(l_i, l_j)
  - ub(L_i, L_j, f): the upper bound of f(l_i, l_j)
  - Derive the above bounds separately for each of the three distance components (Lemmas 1~3) and combine them (Lemma 4)
[Figure: coarse t-partitions L_i of TR_i and L_j of TR_j]

25 Derivation of the Distance Bounds
• Lemmas 1 to 3: bounds for the individual components of the distance, one lemma per component
• Lemma 4: bounds for dist(L_i, L_j), obtained by combining Lemmas 1 to 3

26 Pruning Rules for Fine-Granularity Partitioning
• Rule 1: If lb(L_i, L_j, dist) > D, fine-granularity partitioning is not required when comparing L_i and L_j
• Rule 2: If ub(L_i, L_j, dist) ≤ D, fine-granularity partitioning is required, but the distances between the fine t-partitions in L_i and L_j need not be computed
[Figure: left, L_i and L_j with lb(L_i, L_j, dist) > D; right, L_i and L_j with ub(L_i, L_j, dist) ≤ D]
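
The two pruning rules amount to a three-way decision for each pair of coarse t-partitions. The sketch below assumes the bounds `lb` and `ub` have already been computed by the lemmas, which are not reproduced in this transcript. The enum and function names are hypothetical.

```python
from enum import Enum


class PairDecision(Enum):
    ALL_FAR = "rule 1: every fine pair is farther than D, skip the pair"
    ALL_CLOSE = "rule 2: every fine pair is within D, no distances needed"
    COMPARE_FINE = "bounds inconclusive: compute fine-level distances"


def decide_pair(lb: float, ub: float, D: float) -> PairDecision:
    """Apply the pruning rules to one pair of coarse t-partitions.

    lb and ub are assumed to be valid lower and upper bounds on the distance
    between any fine t-partition of L_i and any fine t-partition of L_j.
    """
    if lb > D:
        return PairDecision.ALL_FAR       # Rule 1
    if ub <= D:
        return PairDecision.ALL_CLOSE     # Rule 2
    return PairDecision.COMPARE_FINE      # neither rule applies
```

Only pairs that fall into the last case pay the cost of fine-level distance computations, which is how the search space is narrowed.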

27 Performance Evaluation
• Use two real trajectory data sets
  - Hurricane track data set
    - Records the Atlantic hurricanes for the years 1950 through 2006
    - The entire set: 608 trajectories and 18,951 points; a small set (1990~2006): 221 trajectories and 7,270 points
  - Animal movement data set
    - Records the locations of elk, deer, and cattle for the years 1993 through 1996 (the Starkey Project)
    - Elk1993: 33 trajectories and 15,422 points; Deer1995: 32 trajectories and 20,065 points; Cattle1993: 41 trajectories and 19,556 points
• Validate the quality of outlier detection
• Evaluate the effectiveness of the two-level partitioning strategy

28 Trajectory Outliers for Hurricane Data (Small)
D = 85, p = 0.95, F = 0.2 → # of outliers = 13

29 Trajectory Outliers for Elk1993
D = 55, p = 0.95, F = 0.1 → # of outliers = 3

30 Trajectory Outliers for Deer1995
D = 80, p = 0.95, F = 0.1 → # of outliers = 3

31 Effects of Parameter Values
(a) D = 83, p = 0.95, F = 0.2 → 19 outliers
(b) D = 87, p = 0.95, F = 0.2 → 10 outliers

32 Pruning Power of Two-Level Partitioning
• 2L-Total: the ratio of the number of pairs pruned by Rule 1 to the total number of pairs of coarse t-partitions
• 2L-False: the proportion of pairs pruned incorrectly
• Optimal: the maximum ratio of pairs that can be pruned
Achieves high pruning power (64~88%)

33 Speedup Ratio of Two-Level Partitioning
$\text{Speedup Ratio} = \frac{\text{the elapsed time of the algorithm using the simple partitioning strategy}}{\text{the elapsed time of the algorithm using the two-level partitioning strategy}}$
Shows significant performance improvement

34 Related Work
• Outlier detection algorithms for points
  - Distribution-based [2], distance-based [3, 4, 5, 6], density-based [7, 8], deviation-based [9]
• Trajectory outlier detection technique using a distance-based approach [5]
  - Not clear whether this technique can detect outlying sub-trajectories from very complicated trajectories
• Trajectory outlier detection algorithms based on classification [12]
  - Require a good training set and depend on training

35 Conclusions
• Proposed a novel framework, the partition-and-detect framework, for detecting trajectory outliers
• For the 1st phase, proposed a two-level trajectory partitioning strategy
  - Ensures both high quality and high efficiency
• For the 2nd phase, proposed a hybrid of the distance-based and density-based approaches
  - Very intuitive, but does not have the local density problem
• Demonstrated the effectiveness of TRAOD using various real trajectory data

36 Thank You!

