2 General problem: Retrieval of time-series similar to a given pattern.
6 Example: Stock charts. Database of time-series; pattern; retrieval results: 0.92, 0.87, 0.86, 0.84.
9 Example: Electrocardiogram. Database of time-series; pattern; retrieval results: 0.91, 0.87, 0.98, 1.0.
11 Outline: Previous work; Important points; Indexing and retrieval; Empirical results; Conclusions. A brace on the slide marks part of the outline as "Contributions".
12 Criteria for retrieval methods (Gunopulos [2000]): work for erratic time-series; accept any pattern; find inexact matches; work when some points are missing; work on streaming data.
13 Outline: Previous work; Important points; Indexing and retrieval; Empirical results; Conclusions.
14 Previous work: feature choice; similarity metrics; indexing and retrieval.
15 Previous work, feature choice: discrete Fourier transforms; alphabets; statistical features; subsets of points.
16 Previous work, similarity metrics: Euclidean distance; bounding rectangles; envelope count; aggregate similarity.
17 Previous work, indexing and retrieval. Advanced techniques: B-trees, R-trees, KD-trees, VP-trees, grids. Applied techniques: linear search with compression.
18 Outline: Previous work; Important points; Indexing and retrieval; Empirical results; Conclusions.
22 Important points: Choose "important" maxima and minima, and discard the other points. Example (figure): original series; compressed series.
26 Definition of important points. Important minimum: $a_m$ is the minimum among $a_i, \ldots, a_j$, with $a_i / a_m \ge R$ and $a_j / a_m \ge R$, for some indices $i \le m \le j$. $R$ is a knob that determines the compression rate.
27 Definition of important points. Important maximum: $a_m$ is the maximum among $a_i, \ldots, a_j$, with $a_m / a_i \ge R$ and $a_m / a_j \ge R$, for some indices $i \le m \le j$. $R$ is a knob that determines the compression rate.
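The two definitions can be checked directly, point by point. The sketch below is a brute-force reading of the definition, not the linear-time procedure described in the slides; the function name is hypothetical, and it assumes all values are positive and R > 1.

```python
def important_points_by_definition(a, R):
    """Return [(index, kind)] for every point of `a` that meets the definition.

    Assumes positive values and R > 1. Quadratic in the worst case.
    """
    def reaches(m, step, is_min):
        # Walk away from m in one direction. Stop with False as soon as
        # a[m] would no longer be the extremum of the interval.
        k = m + step
        while 0 <= k < len(a):
            if (is_min and a[k] < a[m]) or (not is_min and a[k] > a[m]):
                return False
            ratio = a[k] / a[m] if is_min else a[m] / a[k]
            if ratio >= R:
                return True  # found an end point a_i or a_j with ratio >= R
            k += step
        return False

    found = []
    for m in range(len(a)):
        for kind, is_min in (("min", True), ("max", False)):
            if reaches(m, -1, is_min) and reaches(m, +1, is_min):
                found.append((m, kind))
    return found


series = [10, 11, 10.5, 20, 19, 21, 9, 9.5, 8, 16]
print(important_points_by_definition(series, 1.5))
```

With R = 1.5 only the peak at 21 and the trough at 8 qualify; the small wiggles around them are discarded. A larger R discards more points, which is why R acts as the compression knob.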
31 Compression example (figure): original series; compressed series.
32 Compression algorithm: linear time; constant memory; accepts streaming data. For a series with n values, compression time is 0.0133 n milliseconds (300 MHz PC, Visual Basic 6.0).
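One way to obtain linear time, constant memory, and streaming input is a single pass that keeps only the current candidate extremum. The following is a minimal sketch under stated assumptions, not the authors' implementation: the generator name is hypothetical, values are assumed positive, R must exceed 1, ties keep the first occurrence, and the extremum that precedes the first large swing is not emitted because nothing to its left can confirm it.

```python
def stream_important_points(stream, R):
    """Yield (index, value, kind) for confirmed important extrema in one pass."""
    if R <= 1:
        raise ValueError("R must exceed 1")
    mode = None      # None until the first swing with ratio >= R is seen
    lo = hi = None   # running candidate minimum / maximum as (index, value)
    for i, x in enumerate(stream):
        if lo is None:
            lo = hi = (i, x)
        elif mode is None:
            if x / lo[1] >= R:        # first large rise: now look for a maximum
                mode, hi = "max", (i, x)
            elif hi[1] / x >= R:      # first large fall: now look for a minimum
                mode, lo = "min", (i, x)
            else:
                if x < lo[1]:
                    lo = (i, x)
                if x > hi[1]:
                    hi = (i, x)
        elif mode == "max":
            if x > hi[1]:
                hi = (i, x)           # better candidate maximum
            elif hi[1] / x >= R:
                yield hi[0], hi[1], "max"   # fall of ratio R confirms it
                mode, lo = "min", (i, x)
        else:
            if x < lo[1]:
                lo = (i, x)           # better candidate minimum
            elif x / lo[1] >= R:
                yield lo[0], lo[1], "min"   # rise of ratio R confirms it
                mode, hi = "max", (i, x)


series = [10, 11, 10.5, 20, 19, 21, 9, 9.5, 8, 16]
print(list(stream_important_points(series, 1.5)))
```

Each value is inspected once and only two candidates are stored, so the memory use does not grow with the length of the series. The last candidate is still pending when the stream ends and is left unreported in this sketch.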
33 Outline: Previous work; Important points; Indexing and retrieval; Empirical results; Conclusions.
34 Retrieval: Retrieval of time-series similar to a given pattern. Intuition: find a prominent feature in the pattern; find candidate segments with a similar feature; compare similarity of candidates to the pattern.
40 Example: Stock charts. Database of time-series; pattern; retrieval results: 0.92, 0.87, 0.86, 0.84.
41 Algorithm: (1) Identify the prominent leg in the pattern. (2) Retrieve similar legs from the database. (3) Identify corresponding candidate segments. (4) For each candidate segment, compute its similarity to the pattern. (5) Output the candidates whose similarity is above the threshold.
42 Important details: Use the compressed pattern and compressed sequences in the retrieval process. The prominent feature is the leg having the greatest ratio of right end to left end. All legs in the database are indexed by their prominence, using a binary search tree.
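The five steps and the indexing detail can be sketched as follows. Several parts are assumptions rather than anything stated in the slides: a sorted list searched with `bisect` stands in for the binary search tree, legs count as "similar" when their prominence lies within a factor `c` of the pattern's prominent leg, and the similarity function is a placeholder metric. All function names are hypothetical, and the series are taken to be already compressed, with positive values.

```python
import bisect


def legs(points):
    """Return (prominence, position) per leg; prominence = right end / left end."""
    return [(right / left, pos)
            for pos, (left, right) in enumerate(zip(points, points[1:]))]


def build_index(database):
    """Sort every leg of every compressed series by prominence."""
    entries = sorted((prom, name, pos)
                     for name, points in database.items()
                     for prom, pos in legs(points))
    keys = [prom for prom, _, _ in entries]
    return keys, entries


def similarity(xs, ys):
    """Placeholder metric (an assumption): mean of 1 - |x - y| / (2 max(|x|, |y|))."""
    return sum(1 - abs(x - y) / (2 * max(abs(x), abs(y)))
               for x, y in zip(xs, ys)) / len(xs)


def retrieve(database, index, pattern, c, threshold):
    keys, entries = index
    prom, ppos = max(legs(pattern))            # 1: prominent leg of the pattern
    lo = bisect.bisect_left(keys, prom / c)    # 2: legs of similar prominence
    hi = bisect.bisect_right(keys, prom * c)
    results = []
    for _, name, pos in entries[lo:hi]:
        start = pos - ppos                     # 3: align the two legs
        if start < 0:
            continue
        segment = database[name][start:start + len(pattern)]
        if len(segment) < len(pattern):
            continue
        score = similarity(pattern, segment)   # 4: compare (no rescaling here)
        if score >= threshold:                 # 5: keep matches above threshold
            results.append((score, name, start))
    return sorted(results, reverse=True)


database = {"A": [10, 20, 15, 30, 12], "B": [5, 6, 5.5, 6.5, 5]}
index = build_index(database)
print(retrieve(database, index, [11, 21, 14], c=1.2, threshold=0.9))
```

Widening `c` admits more candidate legs, which costs time but lowers the chance of missing a good match.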
43 Alternative versions: different prominence definitions; different similarity metrics. The end-point ratio prominence usually gives the best empirical results.
44 Extended legs (figure): similar sequence.
45 Indexing on extended legs. Advantage: more accurate retrieval. Disadvantage: larger index, more memory. If a compressed sequence has $n$ legs: worst case, $n^2/2$ extended legs; average case, $\Theta(n \lg n)$ extended legs.
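To get a feel for the memory cost, the two growth rates can be compared for a few sizes. This is plain arithmetic on the expressions from the slide; the function name is hypothetical, the sample values of n are arbitrary, and the average-case figure is only meaningful up to a constant factor.

```python
import math


def extended_leg_bounds(n):
    """Return (n^2 / 2, n lg n) for a compressed sequence with n legs."""
    return n * n / 2, n * math.log2(n)


for n in (100, 1000, 10000):
    worst, average = extended_leg_bounds(n)
    print(f"n = {n:>6}: worst case {worst:>12,.0f}   n lg n {average:>10,.0f}")
```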
46 Outline: Previous work; Important points; Indexing and retrieval; Empirical results; Conclusions.
48 Data sets: stock charts, 60,000 points; air and sea temperatures, 445,000 points; wind speeds, 79,000 points; electroencephalograms, 17,000 points; electrocardiograms, 2,000 points.
49 Patterns: compressed patterns with 4 to 27 legs. Examples shown as figures.
50 Retrieval time: 0.07 m k milliseconds, where m is the number of legs in the pattern and k is the number of candidates.
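As a worked example of this estimate, the snippet below plugs in the largest pattern size from the slides (27 legs). The value of k is an assumption: it treats "5% candidates" as 5% of the 5,400 stock-chart legs, a reading the slides do not spell out. The function name is hypothetical.

```python
def retrieval_time_ms(m, k, ms_per_leg_and_candidate=0.07):
    """Estimated retrieval time for a pattern with m legs and k candidates."""
    return ms_per_leg_and_candidate * m * k


k = round(0.05 * 5400)                    # assumed: 5% of 5,400 legs = 270
print(f"{retrieval_time_ms(27, k):.1f} ms")   # roughly half a second
```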
51 Retrieval accuracy, stock charts (plots): 20% candidates, C = 3; 10%, C = 2; 5%, C = 1.5; 1%, C = 1.1.
52 Retrieval accuracy, wind speeds (plots): 20% candidates, C = 1.5; 10%, C = 1.2; 5%, C = 1.1.
53 Retrieval candidate quality. Found matches among the ten best, for 5%, 10%, and 20% candidates:
Stock charts (5,400 legs): 4, 4, 7.
Air and sea temperatures (5,500 legs): 4, 5, 6.
Wind speeds (10,500 legs): 3, 7, 9.
54 Outline: Previous work; Important points; Indexing and retrieval; Empirical results; Conclusions.
60 Criteria for retrieval methods (Gunopulos [2000]): work for erratic time-series; accept any pattern; find inexact matches; work when some points are missing; work on streaming data. Two of the criteria are marked "~" on the slide.
61 Main results. Compression: fast compression procedure; preserves similarity. Retrieval: works with compressed data; controlled trade-off between speed and accuracy.