Discovery of Meaningful Rules in Time Series

Name: Discovery of Meaningful Rules in Time Series
Uploaded: 2017-07-05T09:28:37+00:00
Duration: PTM5S1
Channel: Benjamin Porter
Description: Discovery of Meaningful Rules in Time Series

Discovery of Meaningful Rules in Time Series
SIGKDD2015

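What is a rule?
Example: 明 → 月 (in the two Li Bai poems below, the character 明 is followed by 月 each time it appears)
静夜思 (Quiet Night Thought), Li Bai, Tang dynasty: 床前明月光，疑是地上霜。举头望明月，低头思故乡。
秋浦歌 (Song of Qiupu), Li Bai, Tang dynasty: 炉火照天地，红星乱紫烟。赧郎明月夜，歌曲动寒川。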

The Raven, a poem by Edgar Allan Poe
“Once upon a midnight dreary, while I pondered weak and …”
Rule: chamber → door
chamber: antecedent
door: consequent
The antecedent is the term that comes first in the rule.

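Rule in Time Series
A major difference between text and time series is that the latter does not have a natural segmentation:
onceuponamidnightdrearywhileIponderedweak....
Time series are also inexact, which is analogous to spelling errors in the string:
qncexauponwamidmightmtdreerydwgileuIppondered iweek...
dist(“chamber”, substring) ≤ t → door
For example, with t = 2: chanbet → door
Speaker note: the rules so far were defined over words. What changes when the problem moves to time series? First, a time series has no natural segmentation the way words do, so it resembles the unsegmented string above. Second, a time series may not be exact, which corresponds to spelling errors in the string. We can therefore define a threshold: as long as the distance between a substring and the target string is within this threshold, we still consider the rule to hold.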
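The thresholded match above can be sketched in a few lines. This is only an illustration: the slide does not say which string distance is meant, so the sketch assumes Hamming distance over equal-length windows, and the names hamming and find_matches are hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def find_matches(text: str, word: str, t: int) -> list[int]:
    """Start index of every window of len(word) within distance t of word.

    Assumption: dist is Hamming distance; the slide leaves it unspecified.
    """
    n = len(word)
    return [i for i in range(len(text) - n + 1)
            if hamming(text[i:i + n], word) <= t]


# "chanbet" differs from "chamber" in two positions (n/m and t/r),
# so it matches with t = 2 but not with t = 1.
print(find_matches("xxchanbetyy", "chamber", 2))  # [2]
print(find_matches("xxchanbetyy", "chamber", 1))  # []
```

Sliding a fixed-length window over the unsegmented string stands in for the missing word boundaries.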

About lag
The consequent may not immediately follow the antecedent: chamberdoor, chamberzdoor, chamberxydoor.
So we need to define a parameter, maxlag, which is the maximum number of characters allowed between the antecedent and the consequent.
Example: if maxlag = 2, all of the above predictions are valid.
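A minimal sketch of the maxlag check on strings, assuming exact matching of both words for simplicity (the function name rule_holds is hypothetical):

```python
def rule_holds(text: str, antecedent: str, consequent: str, maxlag: int) -> bool:
    """True if some occurrence of antecedent is followed by consequent
    with at most maxlag characters in between."""
    start = text.find(antecedent)
    while start != -1:
        end = start + len(antecedent)
        # Try every allowed gap size, from 0 up to maxlag characters.
        for lag in range(maxlag + 1):
            if text.startswith(consequent, end + lag):
                return True
        start = text.find(antecedent, start + 1)
    return False


for s in ["chamberdoor", "chamberzdoor", "chamberxydoor", "chamberxyzdoor"]:
    print(s, rule_holds(s, "chamber", "door", maxlag=2))
```

The first three strings satisfy the rule with maxlag = 2; the last one has three characters between the words and does not.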

The formal definition of a rule in time series
“If we see a substring of length ρ that is within distance t₁ of the word chamber, then we fire the rule and expect to see a similar substring to word door, within a learned distance t₂, in the next maxlag time steps.”
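Carried over to numeric data, the definition can be sketched as follows. The sketch makes several assumptions that the slides do not confirm: plain Euclidean distance without normalization, a consequent that must start within maxlag steps after the antecedent ends, and hypothetical names (fire_rule and its parameters).

```python
import math
from typing import List, Optional, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fire_rule(series: Sequence[float],
              antecedent: Sequence[float],
              consequent: Sequence[float],
              t1: float, t2: float,
              maxlag: int) -> List[Tuple[int, Optional[int]]]:
    """For every window within t1 of the antecedent, report where (if anywhere)
    a window within t2 of the consequent starts in the next maxlag steps.

    Returns (fire_index, consequent_index or None) pairs.
    """
    p, q = len(antecedent), len(consequent)
    results = []
    for i in range(len(series) - p + 1):
        if euclidean(series[i:i + p], antecedent) > t1:
            continue  # the rule does not fire here
        hit = None
        for lag in range(maxlag + 1):
            j = i + p + lag
            if j + q > len(series):
                break  # ran off the end of the series
            if euclidean(series[j:j + q], consequent) <= t2:
                hit = j
                break
        results.append((i, hit))
    return results


series = [0, 0, 1, 2, 1, 0, 5, 5, 0, 0]
print(fire_rule(series, [1, 2, 1], [5, 5], t1=0.5, t2=0.5, maxlag=1))  # [(2, 6)]
```

A firing whose second element is None is a false prediction: the antecedent was seen but the consequent did not follow in time.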

A rule looks like this

Time Series Motif
The method is based on time series motifs, which have been studied extensively in the literature.

Definitions

DATA DISCRETIZATION
Find the minimum and maximum values of the series, then set bin boundaries that are uniformly spaced between min and max. The resulting bin width is (max - min) / cardinality, where cardinality is the number of bins.
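The uniform binning can be sketched as below. How the maximum value and a constant series are handled is an assumption of this sketch, not something the slide specifies, and the name discretize is hypothetical.

```python
from typing import List, Sequence


def discretize(series: Sequence[float], cardinality: int) -> List[int]:
    """Map each value to a bin index in 0 .. cardinality - 1 using
    equal-width bins between the series minimum and maximum."""
    lo, hi = min(series), max(series)
    if hi == lo:
        # Assumption: a constant series falls entirely into bin 0.
        return [0] * len(series)
    width = (hi - lo) / cardinality
    # Assumption: the maximum value is clamped into the last bin.
    return [min(int((x - lo) / width), cardinality - 1) for x in series]


print(discretize([0.0, 1.0, 2.0, 3.0, 4.0], 4))  # [0, 1, 2, 3, 3]
```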

MDL
MDL is used as the scoring function, which is a novel contribution of this paper.
Why MDL? Why not ED (Euclidean distance)? The Euclidean distance does not allow us to compare the quality of consequents with different lengths. The Euclidean distance between two subsequences of length ρ can actually decrease when we expand to length ρ + 1, due to the (re)normalization of the data. So not only is the effect of length not linear, it is not even monotonic.

What is MDL
MDL, or Minimum Description Length, is used to score a rule by how many bits it saves.
DL(m | H): the number of bits it costs to store the sequence m after encoding it with hypothesis H.
bitsave(m, H) = DL(m) − DL(m | H)
A hypothesis (green/bold) can be used to score subsequences by subtracting it from them (producing the small integers shown at the top) and encoding the difference vector with Huffman encoding. Here the left sequence requires 57 bits, whereas the right sequence requires 84.
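A rough sketch of the bit-saving computation on already discretized (integer) sequences. It assumes that both DL(m) and DL(m | H) are measured as the Huffman-coded length of the symbols, built from each sequence's own symbol frequencies and ignoring the cost of storing the code table; the paper's exact accounting may differ. The names huffman_bits and bitsave are hypothetical.

```python
import heapq
from collections import Counter
from typing import Sequence


def huffman_bits(symbols: Sequence[int]) -> int:
    """Total bits needed to Huffman-encode symbols (code table cost ignored)."""
    counts = list(Counter(symbols).values())
    if not counts:
        return 0
    if len(counts) == 1:
        # Assumption: a one-symbol alphabet still costs one bit per symbol.
        return counts[0]
    heapq.heapify(counts)
    total = 0
    while len(counts) > 1:
        a = heapq.heappop(counts)
        b = heapq.heappop(counts)
        # Each merge adds one bit to the code of every symbol beneath it.
        total += a + b
        heapq.heappush(counts, a + b)
    return total


def bitsave(m: Sequence[int], h: Sequence[int]) -> int:
    """DL(m) - DL(m | H), with DL(m | H) the cost of encoding m - H."""
    if len(m) != len(h):
        raise ValueError("m and h must have equal length")
    diff = [x - y for x, y in zip(m, h)]
    return huffman_bits(m) - huffman_bits(diff)


m = [1, 2, 3, 4, 5, 6, 7, 8]
h = [1, 2, 3, 4, 5, 6, 7, 7]
print(bitsave(m, h))  # 24 - 8 = 16
```

When the hypothesis is close to the sequence, the difference vector is dominated by a few small integers, which Huffman coding compresses well, so more bits are saved.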

RULE DISCOVERY ALGORITHM
The algorithm has two components:
A scoring function
A search algorithm that repeatedly invokes this scoring function while searching for high-quality rules

Rule Scoring
For clarity, we begin by considering the case where maxlag is 0.

Motif-Based Rule Searching
Efficient algorithms for discovering the top K motifs in a time series are well known. In this paper, we use the MK algorithm.
Speaker note: since we want to find the best rule, we can first …
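To make the idea of a motif concrete, here is a brute-force sketch that finds the closest pair of non-overlapping subsequences. It is not the MK algorithm named above, which reaches the same answer far more efficiently; it also assumes plain Euclidean distance without normalization, and the name best_motif_pair is hypothetical.

```python
import math
from typing import Sequence, Tuple


def best_motif_pair(series: Sequence[float], length: int) -> Tuple[float, int, int]:
    """Return (distance, i, j) for the closest pair of non-overlapping
    subsequences of the given length, by exhaustive O(n^2) comparison."""
    best = (math.inf, -1, -1)
    last = len(series) - length
    for i in range(last + 1):
        # Start j at i + length so the two windows never overlap,
        # which excludes trivial matches of a window with itself.
        for j in range(i + length, last + 1):
            d = math.dist(series[i:i + length], series[j:j + length])
            if d < best[0]:
                best = (d, i, j)
    return best


series = [0, 1, 2, 1, 0, 9, 9, 0, 1, 2, 1, 0]
print(best_motif_pair(series, 5))  # (0.0, 0, 7)
```

The two occurrences of a motif found this way are natural candidates for an antecedent and a consequent, which the scoring function can then evaluate.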

EXPERIMENT-Zebra finch
Speaker note: I think the experiments in this paper mainly serve to show what the method can be used for.

EXPERIMENT-Energy Disaggregation
Clothes Washer Clothes Dryer

Conclusion
Applied MDL to score time series rules
The rule representation is expressive enough to allow rules with different antecedent lengths, consequent lengths, lags, and firing thresholds

Future work
On some datasets, Dynamic Time Warping, in single or multi-dimensional cases, may be more robust than the Euclidean distance, but applying it to massive datasets remains an issue.
It may be possible to generalize the rule representation to allow more expressive logical connectives.
There are currently no standard benchmarks for time series rule discovery.
