Indexing of Time Series by Major Minima and Maxima
Eugene Fink, Kevin B. Pratt, Harith S. Gandhi


Time series
A time series is a sequence of real values measured at equal intervals.
Example: 0, 3, 1, 2, 0, 1, 1, 3, 0, 2, 1, 4, 0, 1

Results
Compression of a time series by extracting its major minima and maxima
Indexing of compressed time series
Retrieval of series similar to a given pattern
Experiments with stock and weather series

Outline
Compression
Indexing
Retrieval
Experiments

Compression
We select the major minima and maxima, along with the start point and end point, and discard the other points. A positive parameter R controls the compression rate: a larger R keeps fewer points.

Major minima
A point a[m] in a[1..n] is a major minimum if there are i and j, where i < m < j, such that:
a[m] is a minimum among a[i..j], and
a[i] - a[m] ≥ R and a[j] - a[m] ≥ R.

Major maxima
A point a[m] in a[1..n] is a major maximum if there are i and j, where i < m < j, such that:
a[m] is a maximum among a[i..j], and
a[m] - a[i] ≥ R and a[m] - a[j] ≥ R.
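The two definitions can be checked directly, point by point. The sketch below is a brute-force transcription of them, not the compression procedure from the slides. The function names are illustrative, and indices are 0-based rather than 1-based.

```python
def is_major_minimum(a, m, R):
    """Return True if a[m] is a major minimum of the list a for threshold R.

    Looks for i < m < j such that a[m] is a minimum among a[i..j] and
    both a[i] and a[j] are at least R above a[m] (0-based indices).
    """
    def reaches(indices):
        # Walk away from m. Stop with failure as soon as a value drops
        # below a[m], because a[m] would no longer be a minimum of the
        # window; stop with success at the first value R or more above.
        for k in indices:
            if a[k] < a[m]:
                return False
            if a[k] - a[m] >= R:
                return True
        return False

    left = reaches(range(m - 1, -1, -1))
    right = reaches(range(m + 1, len(a)))
    return left and right


def is_major_maximum(a, m, R):
    # A major maximum of a is a major minimum of the negated series.
    return is_major_minimum([-x for x in a], m, R)


series = [0, 3, 1, 2, 0, 1, 1, 3, 0, 2, 1, 4, 0, 1]
minima = [m for m in range(len(series)) if is_major_minimum(series, m, 3)]
maxima = [m for m in range(len(series)) if is_major_maximum(series, m, 3)]
print(minima)  # [4, 8]
print(maxima)  # [1, 7, 11]
```

With R = 3, the example series from the first slide has major minima at positions 4 and 8 and major maxima at positions 1, 7, and 11 (counting from 0). The first and last points can never qualify, because the definition requires a neighbour on each side.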

Compression procedure
The procedure performs one pass through a given series, so it can compress a live series without storing it in memory. It takes linear time and constant memory.
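A one-pass procedure with these properties could look like the sketch below. It is a reconstruction from the definitions of major minima and maxima, not the authors' code. The name compress, the (index, value) output format, and the handling of ties are assumptions: when several equal values form the same extremum, only the first is reported.

```python
def compress(series, R):
    """Yield (index, value) for the start point, the major extrema, and
    the end point of an iterable of numbers, using constant memory."""
    points = enumerate(series)
    try:
        first = next(points)
    except StopIteration:
        return                      # empty series: nothing to output
    yield first
    lo = hi = last = first          # running minimum and maximum so far
    cand = None                     # current candidate extremum
    looking_for = None              # "max" or "min" once the series moved by R
    for point in points:
        last = point
        v = point[1]
        if looking_for is None:
            # Initial phase: wait until the series rises R above its
            # running minimum or falls R below its running maximum.
            if v < lo[1]:
                lo = point
            if v > hi[1]:
                hi = point
            if v - lo[1] >= R:
                looking_for, cand = "max", point
            elif hi[1] - v >= R:
                looking_for, cand = "min", point
        elif looking_for == "max":
            if v > cand[1]:
                cand = point        # higher candidate maximum
            elif cand[1] - v >= R:
                yield cand          # dropped by R: candidate is confirmed
                looking_for, cand = "min", point
        else:
            if v < cand[1]:
                cand = point        # lower candidate minimum
            elif v - cand[1] >= R:
                yield cand          # rose by R: candidate is confirmed
                looking_for, cand = "max", point
    if last[0] != first[0]:
        yield last                  # end point


series = [0, 3, 1, 2, 0, 1, 1, 3, 0, 2, 1, 4, 0, 1]
print(list(compress(series, 3)))
# [(0, 0), (1, 3), (4, 0), (7, 3), (8, 0), (11, 4), (13, 1)]
```

Because the function is a generator that reads any iterable, it can consume a live stream and keeps only a few points in memory at a time.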

Indexing of series
We index series in a database by their major inclines, which are upward and downward segments of the series.

Major inclines
A segment a[i..j] is a major upward incline if:
a[i] is a major minimum;
a[j] is a major maximum;
for every m with i < m < j, a[i] < a[m] < a[j].
The definition of a major downward incline is symmetric.

Identification of inclines
The procedure performs two passes through a list of major minima and maxima. Its running time is linear in the number of inclines.
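The slides do not spell out the two-pass procedure, so the sketch below uses a simpler direct scan instead, which can take quadratic time in the worst case. It assumes a hypothetical input format, a time-ordered list of (index, value, kind) tuples with kind equal to "min" or "max". It also assumes that the points discarded by compression lie between their neighbouring extrema, so only the extrema need to be checked.

```python
def upward_inclines(extrema):
    """Return (start_index, end_index) pairs of major upward inclines.

    extrema: time-ordered list of (index, value, kind) tuples, where kind
    is "min" or "max" (an assumed format, not taken from the slides).
    """
    result = []
    for p, (i, low, kind) in enumerate(extrema):
        if kind != "min":
            continue
        best = low                  # highest value seen after the minimum
        for j, value, kind_q in extrema[p + 1:]:
            if value <= low:
                break               # series is back at or below the start
            if kind_q == "max" and value > best:
                result.append((i, j))   # everything between lies strictly inside
            best = max(best, value)
    return result


def downward_inclines(extrema):
    # Symmetric case: negate the values and swap the kinds.
    flipped = [(i, -v, "min" if k == "max" else "max") for i, v, k in extrema]
    return upward_inclines(flipped)


extrema = [(1, 3, "max"), (4, 0, "min"), (7, 3, "max"),
           (8, 0, "min"), (11, 4, "max")]
print(upward_inclines(extrema))    # [(4, 7), (8, 11)]
print(downward_inclines(extrema))  # [(1, 4), (7, 8)]
```

Inclines can be nested: a minimum may start several inclines if later maxima keep rising before the series falls back to the starting value.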

Indexing of inclines
We index major inclines of series in a database by their lengths and heights. We use a range tree, which supports indexing of points by two coordinates.
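To illustrate the kind of query such an index has to answer, here is a deliberately simplified stand-in for the range tree. It sorts inclines by length, finds the length range by binary search, and filters the heights linearly. A real range tree answers the same rectangle query in polylogarithmic time plus the output size. The class name InclineIndex and the (length, height, payload) tuple format are assumptions.

```python
import bisect


class InclineIndex:
    """Simplified two-coordinate index over inclines (not a range tree)."""

    def __init__(self, inclines):
        # inclines: iterable of (length, height, payload) tuples; the
        # payload is any reference back to the series and position.
        self._items = sorted(inclines, key=lambda item: item[0])
        self._lengths = [item[0] for item in self._items]

    def query(self, min_length, max_length, min_height, max_height):
        """Return all inclines inside the given length-height rectangle."""
        lo = bisect.bisect_left(self._lengths, min_length)
        hi = bisect.bisect_right(self._lengths, max_length)
        return [item for item in self._items[lo:hi]
                if min_height <= item[1] <= max_height]


index = InclineIndex([(3, 3, "a"), (3, 4, "b"), (10, 2, "c")])
print(index.query(2, 5, 3, 3.5))  # [(3, 3, 'a')]
```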

Retrieval
The procedure inputs a pattern series and searches for similar segments in a database.
Main steps:
1. Find the pattern's inclines with the greatest height.
2. Retrieve all segments that have similar inclines.
3. Compare each of these segments with the pattern.

Highest inclines
First, the retrieval procedure identifies the major inclines in the pattern and selects the highest ones.

Candidate segments
Second, the procedure retrieves segments with similar inclines from the database. An incline is considered similar to a pattern incline of a given height and length if:
its height is between height / C and height · C;
its length is between length / D and length · D.
We use the range tree to retrieve similar inclines.
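The similarity condition translates into a small predicate. The sketch below assumes that inclines are represented as (length, height) pairs with positive heights, and that C and D are at least 1, as in the experiments. The function name is illustrative.

```python
def is_similar_incline(incline, pattern_incline, C, D):
    """Check the height and length bounds for a candidate incline.

    incline, pattern_incline: (length, height) pairs (assumed format).
    C bounds the height ratio and D bounds the length ratio.
    """
    length, height = incline
    p_length, p_height = pattern_incline
    return (p_height / C <= height <= p_height * C
            and p_length / D <= length <= p_length * D)


pattern_incline = (3, 3)
print(is_similar_incline((3, 4), pattern_incline, C=2, D=2))   # True
print(is_similar_incline((10, 3), pattern_incline, C=2, D=2))  # False
```

The bounds describe a rectangle in the length-height plane, which is why a two-coordinate range query is enough to collect the candidates. Larger C and D widen the rectangle and return more candidates.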

Similarity test
Third, the procedure compares the retrieved segments with the pattern, using a given similarity test.

We have tested a Visual Basic implementation on a 2.4-GHz Pentium computer.
Data sets:
Stock prices: 98 series, 60,000 points
Air and sea temperatures: 136 series, 450,000 points

Stock prices (60,000 points)
Search for 100-point patterns
[Three plots. The x-axes (fast ranking) show the ranks of matches retrieved by the developed procedure, and the y-axes (perfect ranking) show the ranks assigned by a slow exhaustive search.]
C = D = 5: time 0.05 sec; axis tick 200
C = D = 2: time 0.02 sec; axis tick 200
C = D = 1.5: time 0.01 sec; axis tick 151

Stock prices (60,000 points)
Search for 500-point patterns
[Three plots. The x-axes (fast ranking) show the ranks of matches retrieved by the developed procedure, and the y-axes (perfect ranking) show the ranks assigned by a slow exhaustive search.]
C = D = 5: time 0.31 sec; axis tick 200
C = D = 2: time 0.12 sec; axis tick 200
C = D = 1.5: time 0.09 sec; axis tick 167

Temperatures (450,000 points)
Search for 200-point patterns
[Three plots. The x-axes (fast ranking) show the ranks of matches retrieved by the developed procedure, and the y-axes (perfect ranking) show the ranks assigned by a slow exhaustive search.]
C = D = 5: time 1.18 sec; axis tick 200
C = D = 2: time 0.27 sec; axis tick 151
C = D = 1.5: time 0.14 sec; axis tick 82

Conclusions
Main results: compression and indexing of time series by major minima and maxima.
Current work: hierarchical indexing by importance levels of minima and maxima.