Mining Time-Series Databases


1 Mining Time-Series Databases

2 Data Mining
I – Introduction
The extraction of nontrivial, implicit, and useful knowledge from data. Diagram labels: Data, Data Mining, Knowledge; Artificial Intelligence, Computer Science, Statistics, Information Retrieval.

3 Data Mining goals
To find "structure" in the large amount of information available from different sources. To organize, or make sense of, the data. To identify patterns that translate into new understandings and viable predictions. To discover relationships between data and phenomena that ordinary operations and routine analysis would otherwise overlook.

4 Time Series
People measure things: oil price, Sócrates' popularity, blood pressure, etc. These things change over time, creating a time series.

5 Introduction
A time-series database is a database that contains data for each point in time. Examples: weather data, stock prices.

6 What to Mine?
Full periodic patterns: every point in time contributes to the cyclic behavior of the time series for each period, e.g., describing the weekly stock price pattern considering all the days of the week.
Partial periodic patterns: describing the behavior of the time series at some but not all points in time, e.g., discovering that stock prices are high every Saturday and low every Tuesday.
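The difference between full and partial periodic patterns can be illustrated with a small sketch. This is not a mining algorithm taken from the slides; it is a simple heuristic under stated assumptions: the period is known in advance (7 days), the data are hypothetical, and the function names `period_profile` and `partial_pattern` as well as the `threshold` parameter are made up for illustration.

```python
from statistics import mean, pstdev

def period_profile(series, period):
    """Full periodic pattern: the mean value at every position in the period."""
    # series[i::period] collects all observations that fall on position i.
    return [mean(series[i::period]) for i in range(period)]

def partial_pattern(series, period, threshold=1.0):
    """Partial periodic pattern: only the positions that stand out.

    A position is labeled "high" or "low" when its mean lies more than
    `threshold` population standard deviations from the overall mean.
    (Assumed rule of thumb, chosen only for this illustration.)
    """
    overall = mean(series)
    spread = pstdev(series)
    labels = {}
    if spread == 0:
        return labels  # constant series: nothing stands out
    for position, value in enumerate(period_profile(series, period)):
        z = (value - overall) / spread
        if z > threshold:
            labels[position] = "high"
        elif z < -threshold:
            labels[position] = "low"
    return labels

# Hypothetical data: three identical weeks, index 0 = Monday.
week = [10, 4, 10, 10, 10, 16, 10]
prices = week * 3

full = period_profile(prices, 7)      # one value per weekday
partial = partial_pattern(prices, 7)  # only the weekdays that stand out
```

Here `full` describes all seven weekdays, while `partial` keeps only positions 1 (Tuesday, low) and 5 (Saturday, high), mirroring the example on the slide.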

7 Time Series definition
A (numeric) time series is a sequence of observations of a numeric property over time, for example: -1.25, -1.00, 0.01, 0.05, 5.45, 0.00.

8 Motivation to Work in Time Series
Time series are ubiquitous. Most of the data produced in a variety of areas is time series data; e.g., about 50% of all newspaper graphics are time series. Other types of data can be converted to time series.
Image from E. J. Keogh. A decade of progress in indexing and mining large time series databases. In VLDB, page 1268, 2006.

9 Time Series Examples
Electroencephalogram, physiology (muscle activation), sensors, historical archives, motion data, ECG.
Images from a variety of papers by E. J. Keogh.

10 Time Series Examples (cont.)
Stocks data, sales, goods consumption, animal ECG, images, motion capture, handwritten character recognition, DNA sequences.
Image from E. J. Keogh. A decade of progress in indexing and mining large time series databases. In VLDB, page 1268, 2006.

11 Time Series data characteristics
Analysis is hard, as we are typically dealing with massive data sets. One hour of EEG: 1 GB of data. Typical weblog: 5 GB per week. MACHO database: 5 TB (growing by 3 GB a day). Stanford Linear Accelerator database: 500 TB.
Algorithms with quadratic complexity do not scale to data sets of this size.
The data also present some distortions (noise, scaling effects, etc.) that make the analysis more difficult.
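To make the scaling distortion concrete, the sketch below compares two series with the same shape but a different offset and amplitude. The slides do not say how such distortions should be handled; z-normalization is used here only as one common preprocessing step, and the data and the helper names `z_normalize` and `euclidean` are assumptions made for illustration.

```python
import math

def z_normalize(series):
    """Rescale a series to mean 0 and (population) standard deviation 1."""
    n = len(series)
    mu = sum(series) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in series) / n)
    if sigma == 0:
        return [0.0] * n  # constant series: no shape left to compare
    return [(x - mu) / sigma for x in series]

def euclidean(a, b):
    """Euclidean distance between two equal-length series."""
    if len(a) != len(b):
        raise ValueError("series must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical data: the second series is the first one shifted and stretched.
shape = [0.0, 1.0, 2.0, 1.0, 0.0]
scaled = [10.0 + 5.0 * x for x in shape]

raw_distance = euclidean(shape, scaled)
normalized_distance = euclidean(z_normalize(shape), z_normalize(scaled))
```

On the raw values the two series look far apart, although they have the same shape; after normalization the distance is close to zero. Note that this addresses offset and amplitude only, not noise.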

12 Time Series Data Mining Tasks
Image from E. J. Keogh. A decade of progress in indexing and mining large time series databases. In VLDB, page 1268, 2006.

