Visualizing and discovering non-trivial patterns in large time series databases Jessica Lin Eamonn Keogh Stefano Lonardi Presented by Thomas Lotze
Purpose: Motifs and Anomalies Timeseries visualization Motifs (frequently occurring patterns) Anomalies
Semantic Representation: SAX Equiprobable a c d c b d b a
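As a rough illustration of how a series becomes a word like the one on this slide, here is a minimal SAX sketch: z-normalize, average over equal-width segments, then map each segment mean to a symbol using breakpoints that split the standard normal into equiprobable regions. The function name `sax_word`, its parameters, and the assumption that the series length is divisible by the word length are illustrative choices, not taken from the slides.

```python
from bisect import bisect_left
from statistics import NormalDist, fmean, pstdev


def sax_word(series, word_length, alphabet="abcd"):
    """Convert a numeric series to a SAX word (illustrative sketch)."""
    # Z-normalize so the values are roughly standard normal.
    mu, sigma = fmean(series), pstdev(series)
    if sigma == 0:
        normalized = [0.0] * len(series)
    else:
        normalized = [(x - mu) / sigma for x in series]

    # Piecewise aggregate approximation: mean of each equal-width segment.
    # Assumption: len(series) is divisible by word_length.
    seg = len(series) // word_length
    paa = [fmean(normalized[i * seg:(i + 1) * seg]) for i in range(word_length)]

    # Breakpoints cut the standard normal into len(alphabet) equiprobable regions.
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(i / k) for i in range(1, k)]

    # Map each segment mean to the symbol of the region it falls in.
    return "".join(alphabet[bisect_left(breakpoints, v)] for v in paa)


word = sax_word([0, 0, 1, 1, 2, 2, 3, 3], word_length=4)
```

With a four-letter alphabet the breakpoints are roughly -0.67, 0 and 0.67, so each symbol would be equally likely for normally distributed data.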
Sequence Trees Set 1: Set 2: 000 100 010 101 001 101 010 101 000 100 010 101 001 101 010 101 010 110 010 101 011 111 010 101
Subsequence Trees
Subsequence Trees on Timeseries Sliding window Discretized using SAX Store the subsequence pattern Display as tree Thickness: frequency of pattern Color: grey if none exist
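The steps on this slide can be sketched as a counting trie: each discretized window is inserted symbol by symbol, and every node remembers how many windows passed through it. This is a hedged sketch, not VizTree's implementation; `build_tree`, `branch_frequency`, and the dictionary layout are hypothetical names chosen for illustration. A branch's frequency would drive its drawn thickness, and a frequency of zero corresponds to a grey branch.

```python
def build_tree(words):
    """Build a trie in which each node counts the words passing through it."""
    root = {"count": 0, "children": {}}
    for word in words:
        node = root
        node["count"] += 1
        for symbol in word:
            node = node["children"].setdefault(
                symbol, {"count": 0, "children": {}}
            )
            node["count"] += 1
    return root


def branch_frequency(root, prefix):
    """Fraction of stored words that start with prefix (0.0 for an empty branch)."""
    node = root
    for symbol in prefix:
        node = node["children"].get(symbol)
        if node is None:
            return 0.0
    return node["count"] / root["count"] if root["count"] else 0.0


# Words as they might come out of a sliding window discretized with SAX.
tree = build_tree(["abc", "abd", "acd", "abc"])
thickness = branch_frequency(tree, "ab")
```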
Numerosity Reduction Avoiding trivial matches (neighbors): Same as previous; MINDIST; No overlap; None; Non-monotonic. Can make a big difference!
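Two of the options named here are simple enough to sketch. The readings below are assumptions about what the option names mean: "same as previous" is taken to record a window's word only when it differs from the word of the window just before it, and "no overlap" is taken to advance the window by its full length. The function names are hypothetical.

```python
def same_as_previous(words):
    """Keep (position, word) only when the word differs from the one before it."""
    kept = []
    previous = None
    for position, word in enumerate(words):
        if word != previous:
            kept.append((position, word))
        previous = word
    return kept


def no_overlap(words, window):
    """Keep every window-th word so that the recorded windows do not overlap."""
    return [(pos, words[pos]) for pos in range(0, len(words), window)]


# One word per sliding-window position; neighbors often repeat.
words = ["aab", "aab", "aab", "abb", "abb", "aab"]
reduced = same_as_previous(words)
```

Under the first reading, a run of identical neighboring words counts once, so a slowly changing stretch of the series does not inflate the frequency of its pattern.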
VizTree Demo Remember to note: - pixels required depend only on number of characters, depth…
Comparing Series Support: difference in frequency of pattern. Confidence: average of how significant this pattern is. Surprisingness: Support x Confidence. Mention Bayesian method if the support is 0 (i.e., if the pattern does not occur in the time series). Maybe just talk about this during the demo (or do contrast portion of demo after this slide)
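A small sketch of the support and surprisingness arithmetic, under stated assumptions: "frequency" is taken to be relative frequency within each series' list of words, and confidence is supplied by the caller because the slide does not give a formula for it. All function names are hypothetical.

```python
from collections import Counter


def relative_frequencies(words):
    """Map each word to its share of the list."""
    counts = Counter(words)
    total = len(words)
    return {word: count / total for word, count in counts.items()}


def support(words_a, words_b):
    """Difference in relative frequency of each pattern (series A minus series B)."""
    freq_a = relative_frequencies(words_a)
    freq_b = relative_frequencies(words_b)
    patterns = set(freq_a) | set(freq_b)
    return {p: freq_a.get(p, 0.0) - freq_b.get(p, 0.0) for p in patterns}


def surprisingness(support_by_pattern, confidence_by_pattern):
    """Support x confidence; confidence is an input here (assumption, see above)."""
    return {
        p: s * confidence_by_pattern.get(p, 0.0)
        for p, s in support_by_pattern.items()
    }


sup = support(["ab", "ab", "ba", "bb"], ["ab", "ba", "ba", "ba"])
scores = surprisingness(sup, {"ab": 1.0, "ba": 0.5})
```

Patterns missing from the confidence mapping get a score of zero in this sketch, which is a simplification.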
Likes Tree structure visualization Mapping using color and thickness Automatic Pattern Identification Speed of computation Generality of patterns Simultaneous view of subsequences matching a pattern/motif Can zoom to different tree levels/sections Remember to mention the generality is especially nice because it
Wishlist
- Dynamic parameter response
- Sliders for window size/character parameters
- Automatic suggestions for parameters
- Select individual subsequences in simultaneous time-zoom panel
- HCIL-style selectors for tree focus
- Maybe a human-assisting time series clusterer?
- Allowing for different lengths of patterns
- Allow user to “kick a subsequence out” from a cluster
- Allow user to recluster the unclustered
- More possible patterns
- Functional Data Analysis?
- Allow “no pattern overlap” or fuzzy overlap…fuzzy length?...somehow?
- Okay, now I’m just dreaming…
- …when lots of neighbors (relative to window size) have the same pattern, perhaps that indicates that the window should be longer?
Calendar Clusters van Wijk, van Selow Office hours are followed strictly. Most people arrive between 8:30 and 9:00 am, and leave between 4:00 and 5:00 pm. Furthermore, in the morning the number of employees present is slightly higher than in the afternoon. On Fridays and in the summer fewer people are present (cluster 722); On Fridays in the summer even fewer people are present (cluster 718); In the weekend and at holidays only very few people are working (cluster 710): security and fire brigade; Holidays in the Netherlands in 1997 were January 1st, March 28th, March 31st, April 30th, May 5th, May 8th, May 19th, December 25th and 26th. School vacations are visible in Spring (May 3rd to May 11th), in Autumn (October 11th to October 19th), and in Winter (December 21st to December 31st); Many people take a day off after a holiday (cluster 721); On December 5th many people left at 4:00 PM. Dutch people will immediately know the explanation: On this day we celebrate Santa Claus and are allowed to leave earlier!
Spirals Marc Weber Marc Alexa Wolfgang Müller
Periodicity/Pattern Length Now: known in advance Find good candidates? spectral analysis? non-constant period? Visual insight scroll/slide through different window lengths see when patterns start to coalesce Note that even VizTree requires knowing the length of the subsequence motif…but it does allow for varying space between the subsequence occurrences
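The "spectral analysis?" question above could be explored with a periodogram: strong frequencies suggest candidate periods, which in turn suggest window lengths to try. The sketch below uses a naive O(n^2) discrete Fourier transform from the standard library. It is only one possible approach, is not part of VizTree, and assumes a roughly constant period; `candidate_periods` is a hypothetical name.

```python
import cmath
import math


def candidate_periods(series, top=3):
    """Rank candidate periods (in samples) by naive DFT power."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]  # remove the constant offset
    powers = []
    for k in range(1, n // 2 + 1):
        # DFT coefficient at frequency index k (k cycles over the whole series).
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        powers.append((abs(coeff) ** 2, n / k))
    powers.sort(reverse=True)
    return [period for _, period in powers[:top]]


# A pure sine with a 16-sample period, sampled over 128 points.
series = [math.sin(2 * math.pi * t / 16) for t in range(128)]
best = candidate_periods(series, top=1)
```

For long series a fast Fourier transform would be the practical choice; the quadratic loop is kept here only so the sketch needs no third-party packages.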