Discovery of Meaningful Rules in Time Series

Presentation transcript:

Discovery of Meaningful Rules in Time Series (SIGKDD 2015)

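What is a rule? Example: 明 (bright) → 月 (moon). 静夜思 (Quiet Night Thought), Li Bai, Tang dynasty: 床前明月光,疑是地上霜。举头望明月,低头思故乡。 (Bright moonlight before my bed; I took it for frost on the ground. I raise my head to gaze at the bright moon, then lower it and think of home.) 秋浦歌 (Song of Qiupu), Li Bai, Tang dynasty: 炉火照天地,红星乱紫烟。赧郎明月夜,歌曲动寒川。 (The furnace fire lights up heaven and earth; red sparks swirl through the purple smoke. On a bright moonlit night the flushed smelters sing, and their song stirs the cold river.) In both poems, every occurrence of 明 is immediately followed by 月.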

The Raven A poem by Edgar Allan Poe “Once upon a midnight dreary, while I pondered weak and …” chamber (room) → door. chamber: antecedent. door: consequent. The antecedent is the preceding term, i.e. the first part of the rule.

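Rule in Time Series A major difference between text and time series is that the latter does not have a natural segmentation: onceuponamidnightdrearywhileIponderedweak.... Time series are also noisy: qncexauponwamidmightmtdreerydwgileuIppondered iweek... So the rule becomes: dist(“chamber”, substring) ≤ t → door. For example, with t = 2: chanbet → door. Speaker note: The earlier rules were defined over words. What changes when we move to time series? First, a time series has no natural segmentation the way text has words, as in the first string above. Second, a time series may not be exact, which is analogous to spelling errors in a string. So we define a threshold: as long as the distance between a substring and the antecedent is within the threshold, we still consider the rule to hold.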

About lag The consequent may not immediately follow the antecedent: chamberdoor, chamberzdoor, chamberxydoor. So we need to define a parameter, maxlag, which is the maximum number of characters allowed between the antecedent and the consequent. Example: if maxlag = 2, all of the above predictions are valid.
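
The string version of the rule, with a distance threshold and a lag, can be sketched as follows. This is only an illustration under stated assumptions: the slides do not say which string distance is meant, so the Hamming distance is assumed here, the consequent is matched exactly for simplicity, and the names `hamming` and `rule_fires` are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def rule_fires(text, antecedent, consequent, t, maxlag):
    """Return (start, lag) pairs where antecedent -> consequent holds.

    The antecedent matches any substring within Hamming distance t
    (an assumed distance). The consequent must then appear, matched
    exactly here for simplicity, after a gap of at most maxlag characters.
    """
    hits = []
    n, m = len(antecedent), len(consequent)
    for i in range(len(text) - n + 1):
        if hamming(text[i:i + n], antecedent) > t:
            continue  # antecedent not close enough, rule does not fire
        for lag in range(maxlag + 1):
            start = i + n + lag
            if text[start:start + m] == consequent:
                hits.append((i, lag))
                break  # keep only the smallest valid lag
    return hits


if __name__ == "__main__":
    for s in ("chamberdoor", "chamberzdoor", "chamberxydoor"):
        print(s, rule_fires(s, "chamber", "door", t=0, maxlag=2))
    print(rule_fires("chanbetdoor", "chamber", "door", t=2, maxlag=0))
```

With maxlag = 2 all three example strings satisfy the rule, and with t = 2 the misspelled "chanbet" still fires it.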

The formal definition of a rule in Time Series “If we see a substring of length ρ that is within distance t1 of the word chamber, then we fire the rule and expect to see a similar substring to word door, within a learned distance t2, in the next maxlag time steps.”
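
Carried over to real-valued data, the definition above might be sketched like this. It is a minimal illustration, not the paper's implementation: it assumes the distance is the z-normalized Euclidean distance, and the names `znorm`, `dist` and `fire_rule` are hypothetical.

```python
import math


def znorm(x):
    """Z-normalize a sequence (constant sequences map to all zeros)."""
    mean = sum(x) / len(x)
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / len(x))
    if std == 0:
        return [0.0] * len(x)
    return [(v - mean) / std for v in x]


def dist(a, b):
    """Euclidean distance between the z-normalized sequences (assumed)."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(znorm(a), znorm(b))))


def fire_rule(series, antecedent, consequent, t1, t2, maxlag):
    """Return (fired, confirmed).

    fired: start positions where a window is within t1 of the antecedent.
    confirmed: (start, lag) pairs where, in addition, a window within t2
    of the consequent begins at most maxlag steps after the antecedent ends.
    """
    n, m = len(antecedent), len(consequent)
    fired, confirmed = [], []
    for i in range(len(series) - n + 1):
        if dist(series[i:i + n], antecedent) > t1:
            continue
        fired.append(i)
        for lag in range(maxlag + 1):
            start = i + n + lag
            if start + m > len(series):
                break  # not enough data left to check the consequent
            if dist(series[start:start + m], consequent) <= t2:
                confirmed.append((i, lag))
                break
    return fired, confirmed


if __name__ == "__main__":
    series = [5, 5, 5, 0, 1, 2, 1, 5, 3, 0, 3, 0, 5, 5]
    print(fire_rule(series, [0, 1, 2, 1], [3, 0, 3, 0], 1e-6, 1e-6, 2))
```

In the toy series the antecedent occurs at position 3 and the consequent follows after a lag of one step, so the rule is confirmed only when maxlag is at least 1.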

What a rule looks like

Time Series Motif The method is based on time series motifs, which have been studied extensively in the literature.

Definitions

Definitions

DATA DISCRETIZATION Find the minimum and maximum values of the time series, then set bin boundaries at uniform intervals between min and max. The resulting bin width is (max - min) / cardinality.
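
A minimal sketch of this uniform binning is shown below. The function name `discretize`, the 0-based bin labels, and the choice to place the maximum value in the last bin are assumptions made for illustration; the slide only specifies the bin width.

```python
def discretize(series, cardinality):
    """Map each value to an integer bin in 0 .. cardinality - 1.

    Bins are uniformly sized between min and max, so the bin width is
    (max - min) / cardinality.
    """
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0] * len(series)  # constant series: a single bin
    width = (hi - lo) / cardinality
    # The maximum would land in bin `cardinality`, so clamp it to the last bin.
    return [min(int((v - lo) / width), cardinality - 1) for v in series]


if __name__ == "__main__":
    print(discretize([0.0, 2.5, 5.0, 7.5, 10.0], cardinality=4))
```

Working on small integers rather than real values is what makes it possible to count the bits needed to store a subsequence.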

MDL MDL is used as the scoring function, which is a novel aspect of this paper. Why MDL? Why not ED? The Euclidean distance (ED) does not allow us to compare the quality of consequents with different lengths. The Euclidean distance between two subsequences of length ρ can actually decrease when we expand to length ρ + 1, due to the (re)normalization of the data. So not only is the effect of length not linear, it is not even monotonic.

What is MDL MDL, or Minimum Description Length, is used to score a rule based on how many bits can be saved. A hypothesis H (green/bold) can be used to score subsequences by subtracting it from them (producing the small integers shown at the top) and encoding the difference vector with Huffman encoding. $DL(m \mid H)$ is the number of bits needed to store the sequence $m$ after this encoding, and the saving is $\mathrm{bitsave}(m, H) = DL(m) - DL(m \mid H)$. Here the left sequence requires 57 bits, whereas the right sequence requires 84.
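
The bit-saving score can be illustrated with a small sketch. It rests on assumptions not stated in the slides: the description length of any integer sequence is taken to be the total length of its Huffman encoding, the cost of storing the code table is ignored, and the names `huffman_bits` and `bitsave` are hypothetical.

```python
import heapq
from collections import Counter


def huffman_bits(values):
    """Total number of bits needed to Huffman-encode a sequence of symbols.

    The cost of the code table itself is ignored (a simplifying assumption).
    """
    counts = Counter(values)
    if len(counts) == 1:
        return len(values)  # convention: one bit per symbol
    heap = list(counts.values())
    heapq.heapify(heap)
    total = 0
    # Each merge adds one bit to every symbol below the merged node,
    # so the encoded length is the sum of all merged weights.
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        total += a + b
        heapq.heappush(heap, a + b)
    return total


def bitsave(m, h):
    """bitsave(m, H) = DL(m) - DL(m | H), with H subtracted elementwise."""
    diff = [a - b for a, b in zip(m, h)]
    return huffman_bits(m) - huffman_bits(diff)


if __name__ == "__main__":
    m = [1, 5, 9, 13, 9, 5, 1, 5]
    h = [1, 5, 9, 12, 9, 5, 2, 5]
    print(huffman_bits(m), bitsave(m, h))
```

A hypothesis that resembles the subsequence leaves a difference vector dominated by zeros, which compresses well and therefore yields a positive saving.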

RULE DISCOVERY ALGORITHM The algorithm has two components: (1) a scoring function; (2) a search algorithm that repeatedly invokes this scoring function while searching for high-quality rules.

Rule Scoring For clarity, we begin by considering the case where maxlag is 0.

Motif-Based Rule Searching Efficient algorithms for discovering the top K motifs in a time series are well known. In this paper, we use the MK algorithm. Speaker note: Because we want to find the best rule, we can first ...
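
To make the idea of a motif concrete, here is a brute-force sketch that finds the single closest pair of non-overlapping subsequences. It is not the MK algorithm named on the slide, which reaches the same answer far more efficiently; the z-normalized Euclidean distance and the names `znorm` and `top_motif` are assumptions for illustration.

```python
import math


def znorm(x):
    """Z-normalize a sequence (constant sequences map to all zeros)."""
    mean = sum(x) / len(x)
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / len(x))
    if std == 0:
        return [0.0] * len(x)
    return [(v - mean) / std for v in x]


def top_motif(series, length):
    """Return (distance, i, j) for the closest pair of subsequences.

    Brute force over all pairs, quadratic in the number of windows.
    Overlapping windows are skipped so that a subsequence is never
    matched with a shifted copy of itself (a trivial match).
    """
    windows = [znorm(series[i:i + length])
               for i in range(len(series) - length + 1)]
    best = (math.inf, -1, -1)
    for i in range(len(windows)):
        for j in range(i + length, len(windows)):
            d = math.sqrt(sum((p - q) ** 2
                              for p, q in zip(windows[i], windows[j])))
            if d < best[0]:
                best = (d, i, j)
    return best


if __name__ == "__main__":
    series = [0, 1, 3, 1, 0, 7, 2, 9, 4, 0, 1, 3, 1, 0, 8, 3]
    print(top_motif(series, length=5))
```

The two occurrences of a motif are natural candidates for an antecedent, which is why the rule search can start from motifs instead of examining every subsequence.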

EXPERIMENT-Zebra finch Speaker note: As for the experiments in this paper, I think they mainly serve to show what the method can be used for.

EXPERIMENT-Energy Disaggregation Clothes Washer Clothes Dryer

Conclusion Applied MDL to score time series rules. The rule representation is expressive enough to allow rules with antecedents, consequents, lags and firing thresholds of different lengths and values.

Future work On some datasets, Dynamic Time Warping, in single or multi-dimensional cases, may be more robust than the Euclidean distance, but scaling it to massive datasets remains an issue. It may be possible to generalize the rule representation to allow more expressive logical connectives. There are currently no standard benchmarks for time series rule discovery.