Time Series and Dynamic Time Warping
CSE 4309 - Machine Learning. Vassilis Athitsos. Computer Science and Engineering Department, University of Texas at Arlington.
Sequential Data
Sequential data, as the name implies, are sequences. What is the difference between a sequence and a set? A sequence X is a set of elements, together with a total order imposed on those elements. A total order describes, for any two elements o_i and o_j, which of them comes before and which comes after. Examples of sequential data: strings (sequences of characters) and time series (sequences of vectors).
Time Series
A time series is a sequence of observations made over time. Examples:
Stock market prices (for a single stock, or for multiple stocks).
Heart rate of a patient over time.
Position of one or multiple people/cars/airplanes over time.
Speech, represented as a sequence of audio measurements at discrete time steps.
A musical melody, represented as a sequence of (note, duration) pairs.
Applications of Time Series Classification
Predicting future prices (stock market, oil, currencies...).
Heart rate of a patient over time: is it indicative of a healthy heart, or of some disease?
Position of one or multiple people/cars/airplanes over time: predict congestion along a route suggested by the GPS.
Speech recognition.
Music recognition: sing a tune to your phone and have it recognize the song, or recognize the genre of a song based on how it sounds.
Time Series Example: Signs
There are 0.5 to 2 million users of American Sign Language (ASL) in the US. Different regions of the world use different sign languages; for example, British Sign Language (BSL) is different from American Sign Language. These languages have vocabularies of thousands of signs. We will use sign recognition as our example application as we introduce methods for time series classification.
Example: The ASL Sign for "again"
Example: The ASL Sign for "bicycle"
Representing Signs as Time Series
A sign is a video (or part of a video). A video is a sequence of frames: a sequence of images which, when displayed rapidly in succession, create the illusion of motion. At every frame i, we extract a feature vector v_i. How should we define the feature vector? This is a (challenging) computer vision problem, and the methods we will discuss work with any choice we make for the feature vector. The entire sign is then represented as a time series X = (v_1, v_2, ..., v_D), where D is the length of the sign video, measured in frames.
Feature Vectors
Finding good feature vectors for signs is an active research problem in computer vision.
Feature Vectors
We choose a simple approach: the feature vector at each frame is the (x, y) pixel location of the right hand at that frame.
Frame 1: feature vector v_1 = (192, 205).
Frame 2: feature vector v_2 = (190, 201).
Frame 3: feature vector v_3 = (189, 197).
Time Series for a Sign
Using the previous representation, for the sign "AGAIN" we end up with this time series:
((192, 205), (190, 201), (190, 194), (191, 188), (194, 182), (205, 183), (211, 178), (217, 171), (225, 168), (232, 167), (233, 167), (238, 168), (241, 169), (243, 176), (254, 177), (256, 179), (258, 186), (269, 194), (271, 202), (274, 206), (276, 207), (277, 208)).
It is a sequence of 22 2D vectors.
Time Series Classification
Training sign for "AGAIN".
Training sign for "BICYCLE".
Time Series Classification
Suppose our training set only contains those two signs: "AGAIN" and "BICYCLE". We get a test sign. Does it mean "AGAIN", or "BICYCLE"?
Time Series Classification
Does this test sign mean "AGAIN", or "BICYCLE"? We can use nearest-neighbor classification. Big question: what distance measure should we choose?
Visualizing a Time Series
Here is a visualization of one time series X. Each feature vector (as before) is a point in 2D. The time series has eight elements: X = (x_1, x_2, ..., x_8). Each label in the figure indicates the position of the corresponding 2D point, such as x_4.
Comparing Time Series
Here is a visualization of two time series, X = (x_1, ..., x_8) and Y = (y_1, ..., y_9). How do we measure the distance between two time series? We could possibly use the Euclidean distance: the first time series is a 16-dimensional vector, and the second time series is an 18-dimensional vector. However, there are issues...
Problems with the Euclidean Distance
The two vectors have different numbers of dimensions. We could fix that by "stretching" the first time series to also have 9 elements.
Problems with the Euclidean Distance
Here, the time series on the left has been converted to have nine elements, using interpolation. We will not explore that option any further, because...
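As a side note, the stretching idea above can be sketched in a few lines: resample each series to a common length by linear interpolation, then take the Euclidean distance of the resulting equal-length vectors. The function names below are illustrative, not from the slides, and the example shows that interpolation alone does not neutralize variable speed.

```python
def resample(series, length):
    """Stretch/shrink a 2D time series to `length` points by linear
    interpolation along the time axis (hypothetical helper)."""
    d = len(series)
    if d == 1:
        return [series[0]] * length
    out = []
    for k in range(length):
        # Position of output sample k on the original time axis [0, d-1].
        t = k * (d - 1) / (length - 1) if length > 1 else 0.0
        i = min(int(t), d - 2)
        frac = t - i
        (x0, y0), (x1, y1) = series[i], series[i + 1]
        out.append((x0 + frac * (x1 - x0), y0 + frac * (y1 - y0)))
    return out

def euclidean_distance(a, b):
    """Euclidean distance after resampling both series to a common length."""
    n = max(len(a), len(b))
    ar, br = resample(a, n), resample(b, n)
    return sum((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
               for p, q in zip(ar, br)) ** 0.5

# Same straight-line trajectory, traversed at different speeds:
x = [(0, 0), (1, 1), (2, 2), (4, 4)]
y = [(0, 0), (2, 2), (4, 4)]
print(euclidean_distance(x, y))  # nonzero, despite identical trajectories
```

The nonzero result motivates the alignment-based approach developed in the rest of the slides.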
Problems with the Euclidean Distance
Bigger problem: the two time series correspond to a similar pattern being performed at variable speed. The first five elements of the first time series visually match the first seven elements of the second time series.
Problems with the Euclidean Distance
So, initially, the first time series moved faster than the second one: it took five time ticks for the first time series, and seven time ticks for the second, to move through approximately the same trajectory.
Problems with the Euclidean Distance
Similarly, the last four elements of the first time series visually match the last three elements of the second time series. So, in the last part, the first time series moved slower than the second one: four time ticks for the first time series, three time ticks for the second one.
Time Series Alignment
An alignment is a correspondence between elements of two time series. To reliably measure the similarity between two time series, we need to figure out, for each element of the first time series, which elements of the second time series it corresponds to.
Time Series Alignment
An alignment A between X and Y is a sequence of pairs of indices:
A = ((p_{1,1}, p_{1,2}), (p_{2,1}, p_{2,2}), ..., (p_{|A|,1}, p_{|A|,2})).
Element (p_{k,1}, p_{k,2}) of alignment A specifies that element x_{p_{k,1}} of X corresponds to element y_{p_{k,2}} of the other time series.
Time Series Alignment
For example, here is a "good" alignment between the two series:
((1, 1), (2, 2), (2, 3), (3, 4), (4, 5), (4, 6), (5, 7), (6, 7), (7, 8), (8, 9)).
We will discuss later how to find such an alignment automatically.
Time Series Alignment
Alignment: A = ((1, 1), (2, 2), (2, 3), (3, 4), (4, 5), (4, 6), (5, 7), (6, 7), (7, 8), (8, 9)). According to this alignment:
x_1 corresponds to y_1.
x_2 corresponds to y_2.
x_2 also corresponds to y_3. One element of one time series can correspond to multiple consecutive elements of the other time series. This captures cases where one series moves slower than the other; for this segment, the first series moves faster than the second one.
x_3 corresponds to y_4.
x_4 corresponds to y_5.
x_4 also corresponds to y_6. As before, for this segment the first time series is moving faster than the second one.
x_5 corresponds to y_7.
x_6 also corresponds to y_7. Here, the first time series is moving slower than the second one, and thus x_5 and x_6 are both matched to y_7.
x_7 corresponds to y_8.
x_8 corresponds to y_9.
The Cost of an Alignment
An alignment A is defined as a sequence of pairs:
A = ((p_{1,1}, p_{1,2}), (p_{2,1}, p_{2,2}), ..., (p_{|A|,1}, p_{|A|,2})).
If we are given an alignment A between two time series X and Y, we can compute the cost C_{X,Y}(A) of that alignment as:
C_{X,Y}(A) = Σ_{k=1}^{|A|} Cost(x_{p_{k,1}}, y_{p_{k,2}}).
The Cost of an Alignment
In the formula C_{X,Y}(A) = Σ_{k=1}^{|A|} Cost(x_{p_{k,1}}, y_{p_{k,2}}), the function Cost(x_{p_{k,1}}, y_{p_{k,2}}) is a black box: you can define it any way you like. For example, if x_{p_{k,1}} and y_{p_{k,2}} are vectors, Cost(x_{p_{k,1}}, y_{p_{k,2}}) can be the Euclidean distance between x_{p_{k,1}} and y_{p_{k,2}}.
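As a sketch of the formula above, with the Euclidean distance standing in for the black-box Cost, the cost of a given alignment can be computed as follows (the function names are illustrative):

```python
def cost(u, v):
    """Black-box element cost; here, the Euclidean distance between 2D points."""
    return ((u[0] - v[0]) ** 2 + (u[1] - v[1]) ** 2) ** 0.5

def alignment_cost(x, y, alignment):
    """C_{X,Y}(A): sum of Cost(x_{p_{k,1}}, y_{p_{k,2}}) over the pairs of A.
    `alignment` holds 1-based index pairs, as in the slides."""
    return sum(cost(x[i - 1], y[j - 1]) for i, j in alignment)

x = [(0, 0), (1, 0), (2, 0)]
y = [(0, 0), (1, 0), (1, 0), (2, 0)]
a = [(1, 1), (2, 2), (2, 3), (3, 4)]
print(alignment_cost(x, y, a))  # 0.0: every matched pair of points coincides
```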
The Cost of an Alignment
The notation C_{X,Y}(A) indicates that the cost of alignment A depends on the specific pair of time series that we are aligning: the cost depends on the alignment itself, as well as on the two time series X and Y.
Rules of Alignment
Should alignment ((1, 5), (2, 3), (6, 7), (7, 1)) be legal? It always depends on what makes sense for your data. Typically, for time series, alignments have to obey certain rules, which this alignment violates.
Rules of Alignment
We will require that legal alignments obey three rules:
Rule 1: Boundary conditions.
Rule 2: Monotonicity.
Rule 3: Continuity.
The next slides define these rules.
Rule 1: Boundary Conditions
Illegal alignment (violating boundary conditions): ((1, 5), ..., (7, 1)).
Writing an alignment as ((s_1, t_1), (s_2, t_2), ..., (s_{|A|}, t_{|A|})), alignment rule 1 (boundary conditions) requires:
s_1 = 1 and t_1 = 1 (the first elements match);
s_{|A|} = M, the length of the first time series, and t_{|A|} = N, the length of the second time series (the last elements match).
Rule 2: Monotonicity
Illegal alignment (violating monotonicity): (..., (3, 5), (4, 3), ...).
Alignment rule 2: the sequences s_1, ..., s_{|A|} and t_1, ..., t_{|A|} must be monotonically increasing:
s_{k+1} - s_k >= 0 and t_{k+1} - t_k >= 0.
The alignment cannot go backwards.
Rule 3: Continuity
Illegal alignment (violating continuity): (..., (3, 5), (6, 7), ...).
Alignment rule 3: the sequences s_1, ..., s_{|A|} and t_1, ..., t_{|A|} cannot increase by more than one at each step:
s_{k+1} - s_k <= 1 and t_{k+1} - t_k <= 1.
The alignment cannot skip elements.
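The three rules above can be sketched as a small legality check; this helper is illustrative, not part of the slides:

```python
def is_legal_alignment(alignment, m, n):
    """Check the three alignment rules for series of lengths m and n.
    `alignment` holds 1-based (s_k, t_k) index pairs."""
    if not alignment:
        return False
    # Rule 1: boundary conditions -- first and last elements must match.
    if alignment[0] != (1, 1) or alignment[-1] != (m, n):
        return False
    for (s0, t0), (s1, t1) in zip(alignment, alignment[1:]):
        # Rule 2: monotonicity -- indices never decrease.
        if s1 - s0 < 0 or t1 - t0 < 0:
            return False
        # Rule 3: continuity -- indices never advance by more than one.
        if s1 - s0 > 1 or t1 - t0 > 1:
            return False
    return True

good = [(1, 1), (2, 2), (2, 3), (3, 4), (4, 5), (4, 6),
        (5, 7), (6, 7), (7, 8), (8, 9)]
print(is_legal_alignment(good, 8, 9))                              # True
print(is_legal_alignment([(1, 5), (2, 3), (6, 7), (7, 1)], 8, 9))  # False
```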
Visualizing a Warping Path
"Warping path" is an alternative term for an alignment.
Visualizing a Warping Path
A = ((1, 1), (2, 2), (2, 3), (3, 4), (4, 5), (4, 6), (5, 7), (6, 7), (7, 8), (8, 9)).
We can visualize a warping path A in 2D as follows: each element of A corresponds to a colored cell in a 2D table whose rows are indexed by x_1, ..., x_8 and whose columns are indexed by y_1, ..., y_9.
Dynamic Time Warping
Dynamic Time Warping (DTW) is a distance measure between time series. The DTW distance is the cost of the optimal legal alignment between the two time series. The alignment must be legal, obeying the three rules of alignments (boundary conditions, monotonicity, continuity), and it must minimize
C_{X,Y}(A) = Σ_{k=1}^{|A|} Cost(x_{p_{k,1}}, y_{p_{k,2}}).
So: DTW(X, Y) = min_A C_{X,Y}(A), where A ranges over all legal alignments between X and Y.
Finding the Optimal Alignment
X = (x_1, x_2, ..., x_M): X is a time series of length M.
Y = (y_1, y_2, ..., y_N): Y is a time series of length N.
Each x_i and each y_j is a feature vector.
Finding the Optimal Alignment
We want to find the optimal alignment between X and Y. Dynamic programming strategy: break the problem up into a 2D array of smaller, interrelated problems (i, j), where 1 <= i <= M and 1 <= j <= N. Problem(i, j): find the optimal alignment between (x_1, ..., x_i) and (y_1, ..., y_j). We need to solve problem(i, j) for every i and j; then, the optimal alignment between X and Y is the solution to problem(M, N).
Finding the Optimal Alignment
How do we start computing the optimal alignment? Solve problem(1, 1): find the optimal alignment between (x_1) and (y_1). The optimal alignment is the only legal alignment: ((1, 1)).
Alignment for Problem (1, 1)
A = ((1, 1)).
Finding the Optimal Alignment
Solve problem(1, j): find the optimal alignment between (x_1) and (y_1, ..., y_j). The optimal alignment is ((1, 1), (1, 2), ..., (1, j)): here, x_1 is matched with each of the first j elements of Y.
Alignment for Problem (1, 5)
A = ((1, 1), (1, 2), (1, 3), (1, 4), (1, 5)).
Finding the Optimal Alignment
Solve problem(i, 1): find the optimal alignment between (x_1, ..., x_i) and (y_1). The optimal alignment is ((1, 1), (2, 1), ..., (i, 1)): here, y_1 is matched with each of the first i elements of X.
Alignment for Problem (5, 1)
A = ((1, 1), (2, 1), (3, 1), (4, 1), (5, 1)).
Finding the Optimal Alignment
Solve problem(i, j) for i >= 2 and j >= 2: here is where dynamic programming comes into play. The solution can be obtained from the solutions to problem(i, j-1), problem(i-1, j), and problem(i-1, j-1).
Solving problem(i, j): look at the following three alignments:
Let A_{i,j-1} be the solution to problem(i, j-1).
Let A_{i-1,j} be the solution to problem(i-1, j).
Let A_{i-1,j-1} be the solution to problem(i-1, j-1).
Let A*_{i,j} be the alignment among A_{i,j-1}, A_{i-1,j}, and A_{i-1,j-1} with the smallest cost. Then, the solution to problem(i, j) is obtained by appending pair (i, j) to the end of A*_{i,j}.
Computing DTW(X, Y)
Input: X = (x_1, x_2, ..., x_M), Y = (y_1, y_2, ..., y_N).
Initialization:
C = zeros(M, N). % Zero matrix, of size M×N.
C(1, 1) = Cost(x_1, y_1).
For i = 2 to M: C(i, 1) = C(i-1, 1) + Cost(x_i, y_1).
For j = 2 to N: C(1, j) = C(1, j-1) + Cost(x_1, y_j).
Main loop:
For i = 2 to M, for j = 2 to N:
C(i, j) = min{C(i-1, j), C(i, j-1), C(i-1, j-1)} + Cost(x_i, y_j).
Return C(M, N).
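The pseudocode above can be sketched as a short Python function; the Euclidean element cost and the function name are illustrative choices:

```python
def dtw(x, y, cost=lambda u, v: ((u[0]-v[0])**2 + (u[1]-v[1])**2) ** 0.5):
    """Cost of the optimal legal alignment between time series x and y,
    following the initialization and main loop of the pseudocode."""
    m, n = len(x), len(y)
    c = [[0.0] * n for _ in range(m)]
    c[0][0] = cost(x[0], y[0])
    for i in range(1, m):                       # first column
        c[i][0] = c[i-1][0] + cost(x[i], y[0])
    for j in range(1, n):                       # first row
        c[0][j] = c[0][j-1] + cost(x[0], y[j])
    for i in range(1, m):                       # main loop
        for j in range(1, n):
            c[i][j] = min(c[i-1][j], c[i][j-1], c[i-1][j-1]) + cost(x[i], y[j])
    return c[m-1][n-1]

# Same trajectory traversed at different speeds -> DTW distance 0.
x = [(0, 0), (1, 0), (2, 0)]
y = [(0, 0), (1, 0), (1, 0), (2, 0)]
print(dtw(x, y))  # 0.0
```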
Cost vs. Alignment
Note: there are two related but different concepts here, the optimal alignment and the optimal cost. We may want to find the optimal alignment between two time series, to visualize how elements of the two time series correspond to each other. We may want to compute the DTW distance between two time series X and Y, which means finding the cost of the optimal alignment between X and Y; we can use such DTW distances, for example, for nearest-neighbor classification. The pseudocode on the previous slide computes the cost of the optimal alignment, without explicitly outputting the optimal alignment itself.
Cost vs. Alignment
If you want to compute both the cost of the optimal alignment and the optimal alignment itself, there are two ways you can modify the pseudocode to do that:
Option 1: Maintain an alignment array A of size M×N. Every time we save a cost at C(i, j), we also save the corresponding optimal alignment at A(i, j). This alignment is obtained by appending pair (i, j) to the end of the optimal alignment stored at one of A(i-1, j), A(i, j-1), A(i-1, j-1).
Option 2: Maintain a backtrack array B of size M×N. Every time we save a cost at C(i, j), we record at B(i, j) the choice, among (i-1, j), (i, j-1), and (i-1, j-1), that we made. When we are done computing C(M, N), we use array B to backtrack and recover the optimal alignment between X and Y.
DTW Complexity
For each (i, j), we compute the optimal alignment between (x_1, ..., x_i) and (y_1, ..., y_j). The solution is computed, in constant time, from the solutions for (i-1, j), (i, j-1), and (i-1, j-1). Time complexity? Linear in the size of the C array, which is M×N.
Assume that both time series have approximately the same length D. Then, the time complexity of DTW is O(D^2). Improvements and variations have been proposed, with lower time complexity.
DTW Finds the Optimal Alignment
Proof: by induction.
Base cases: i = 1 OR j = 1. Proof of claim for the base cases: for any problem(1, j) and problem(i, 1), only one legal warping path exists. (Unless we consider warping paths with repetitions, like ((1, 1), (1, 1), (1, 2)), which are legal, but can never have lower cost than their non-repeating versions, like ((1, 1), (1, 2)).) Therefore, DTW finds the optimal path for problem(1, j) and problem(i, 1): it is optimal since it is the only one.
DTW Finds the Optimal Alignment
General case: (i, j), for i >= 2, j >= 2.
Inductive hypothesis: what we want to prove for (i, j) is true for (i-1, j), (i, j-1), and (i-1, j-1). That is, DTW has computed optimal solutions for problems (i-1, j), (i, j-1), and (i-1, j-1).
Proof by contradiction. Summary: if the solution for (i, j) is not optimal, then one of the solutions for (i-1, j), (i, j-1), or (i-1, j-1) was not optimal, which violates the inductive hypothesis.
DTW Optimality Proof
Let A_{i,j} be the DTW solution to problem(i, j). We want to prove that A_{i,j} is optimal, using the inductive hypothesis. Let A*_{i,j} be the alignment among A_{i,j-1}, A_{i-1,j}, and A_{i-1,j-1} with the smallest cost. Then, A_{i,j} = A*_{i,j} ⊕ (i, j), where E ⊕ p denotes the result of appending item p to the end of list E. Let B_{i,j} be the optimal solution to problem(i, j); B_{i,j} = B*_{i,j} ⊕ (i, j) for some B*_{i,j}. Assume, for contradiction, that B_{i,j} ≠ A_{i,j} and A_{i,j} is not optimal.
DTW Optimality Proof
DTW solution for problem(i, j): A_{i,j} = ((p_{1,1}, p_{1,2}), (p_{2,1}, p_{2,2}), ..., (p_{|A|,1}, p_{|A|,2})).
Optimal solution for problem(i, j): B_{i,j} = ((q_{1,1}, q_{1,2}), (q_{2,1}, q_{2,2}), ..., (q_{|B|,1}, q_{|B|,2})).
Assume, for contradiction, that A_{i,j} is not optimal. Then:
C_{X,Y}(A_{i,j}) > C_{X,Y}(B_{i,j}).
Why? Because B_{i,j} is optimal and A_{i,j} is not.
DTW Optimality Proof
C_{X,Y}(A_{i,j}) > C_{X,Y}(B_{i,j}) ⇒ Σ_{k=1}^{|A|} Cost(x_{p_{k,1}}, y_{p_{k,2}}) > Σ_{k=1}^{|B|} Cost(x_{q_{k,1}}, y_{q_{k,2}}).
Why? We are just using the definition of the cost of an alignment.
DTW Optimality Proof
Σ_{k=1}^{|A|} Cost(x_{p_{k,1}}, y_{p_{k,2}}) > Σ_{k=1}^{|B|} Cost(x_{q_{k,1}}, y_{q_{k,2}}) ⇒ Σ_{k=1}^{|A|-1} Cost(x_{p_{k,1}}, y_{p_{k,2}}) > Σ_{k=1}^{|B|-1} Cost(x_{q_{k,1}}, y_{q_{k,2}}).
Why? The last element of both A_{i,j} and B_{i,j} is (i, j), and both A_{i,j} and B_{i,j} align (x_1, ..., x_i) with (y_1, ..., y_j); subtracting the same term Cost(x_i, y_j) from both sides preserves the inequality.
DTW Optimality Proof
Σ_{k=1}^{|A|-1} Cost(x_{p_{k,1}}, y_{p_{k,2}}) > Σ_{k=1}^{|B|-1} Cost(x_{q_{k,1}}, y_{q_{k,2}}) ⇒ C_{X,Y}(A*_{i,j}) > C_{X,Y}(B*_{i,j}).
Why? We are just using the definitions of A*_{i,j} and B*_{i,j}:
A_{i,j} = A*_{i,j} ⊕ (i, j) and B_{i,j} = B*_{i,j} ⊕ (i, j).
DTW Optimality Proof
C_{X,Y}(A*_{i,j}) > C_{X,Y}(B*_{i,j}) ⇒ min{C_{X,Y}(A_{i,j-1}), C_{X,Y}(A_{i-1,j}), C_{X,Y}(A_{i-1,j-1})} > C_{X,Y}(B*_{i,j}).
Why? We are just using the definition of A*_{i,j}:
A*_{i,j} = argmin{C_{X,Y}(A_{i,j-1}), C_{X,Y}(A_{i-1,j}), C_{X,Y}(A_{i-1,j-1})}.
DTW Optimality Proof
min{C_{X,Y}(A_{i,j-1}), C_{X,Y}(A_{i-1,j}), C_{X,Y}(A_{i-1,j-1})} > C_{X,Y}(B*_{i,j}). Contradiction! The inductive hypothesis was that alignments A_{i,j-1}, A_{i-1,j}, A_{i-1,j-1} were the optimal alignments for problems (i, j-1), (i-1, j), (i-1, j-1). However, B*_{i,j} is also an alignment for one of those three problems (otherwise, B_{i,j} would break one of the three rules of legal alignments). Moreover, B*_{i,j} has a lower cost than each of A_{i,j-1}, A_{i-1,j}, A_{i-1,j-1}, so one of those three alignments is not optimal. This contradicts the inductive hypothesis.
Visualizing B*_{i,j} for i = 4, j = 5
B_{i,j} aligns (x_1, ..., x_i) with (y_1, ..., y_j). B*_{i,j} is the result of removing the last element from B_{i,j}. Based on the rules of continuity and monotonicity, the last element of B*_{i,j} must be (i-1, j), (i, j-1), or (i-1, j-1).
DTW Optimality Proof - Recap
Proof: by induction. The base cases are easy to prove. The tricky part is to prove the inductive case, i.e., that A_{i,j}, the DTW solution to problem(i, j), is indeed the optimal alignment for that problem. We prove the inductive case by contradiction. Inductive hypothesis: alignments A_{i,j-1}, A_{i-1,j}, A_{i-1,j-1} are the optimal alignments for problems (i, j-1), (i-1, j), (i-1, j-1). If A_{i,j} is not optimal, then we show that one of the alignments A_{i,j-1}, A_{i-1,j}, A_{i-1,j-1} must not be optimal, which contradicts the inductive hypothesis.
Dynamic Time Warping, Recap
It is oftentimes useful to use nearest-neighbor classification to classify time series, especially when we have very few training examples per class or, in the extreme, only one training example per class. However, to use nearest-neighbor classification, we need to define a meaningful distance measure between time series. The Euclidean distance works poorly because even slight misalignments can lead to large distance values. Dynamic Time Warping is based on computing an optimal alignment between two time series, and thus it can tolerate even large misalignments. The complexity of Dynamic Time Warping is quadratic in the length of the time series, but more efficient variants can also be used.
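Putting the pieces together, nearest-neighbor classification under the DTW distance can be sketched as below; the compact dtw helper and the toy training data are illustrative, not from the slides:

```python
def dtw(x, y, cost=lambda u, v: ((u[0]-v[0])**2 + (u[1]-v[1])**2) ** 0.5):
    """DTW distance between two 2D time series (cost of optimal alignment).
    Uses an inf-padded border as a compact form of the initialization."""
    m, n = len(x), len(y)
    inf = float("inf")
    c = [[inf] * (n + 1) for _ in range(m + 1)]
    c[0][0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            c[i][j] = min(c[i-1][j], c[i][j-1], c[i-1][j-1]) + cost(x[i-1], y[j-1])
    return c[m][n]

def nearest_neighbor_class(test, training_set):
    """training_set: list of (label, time series) pairs. Return the label of
    the training series with the smallest DTW distance to `test`."""
    return min(training_set, key=lambda item: dtw(test, item[1]))[0]

training = [("AGAIN",   [(0, 0), (1, 1), (2, 2), (3, 3)]),
            ("BICYCLE", [(0, 0), (1, -1), (2, -2), (3, -3)])]
test = [(0, 0), (1, 1), (1, 1), (2, 2), (3, 3)]   # "AGAIN" at variable speed
print(nearest_neighbor_class(test, training))     # AGAIN
```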
Other Distance Measures
There are more distance measures that can be used on time series data. For example:
Edit Distance with Real Penalty (ERP). Lei Chen and Raymond Ng. "On the marriage of Lp-norms and edit distance." International Conference on Very Large Data Bases (VLDB), 2004.
Time Warp Edit Distance (TWED). Pierre-François Marteau. "Time warp edit distance with stiffness adjustment for time series matching." IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 31(2), 2009.
Move-Split-Merge (MSM). Alexandra Stefan, Vassilis Athitsos, and Gautam Das. "The move-split-merge metric for time series." IEEE Transactions on Knowledge and Data Engineering (TKDE), 25(6), 2013.