Mining Time-Series Databases

I – Introduction

Data Mining
The extraction of nontrivial, implicit, and useful knowledge from data.
Diagram labels: Data, Knowledge, Data Mining, Artificial Intelligence, Computer Science, Statistics, Information Retrieval.

Data Mining goals
To find “structure” in the large amount of information available from different sources.
To organize the data.
To identify patterns that translate into new understandings and viable predictions.
To discover relationships between data and phenomena that ordinary operations and routine analysis would otherwise overlook.
Or, put simply, to make sense of the data.

Time Series
People measure things (oil prices, Sócrates' popularity, blood pressure, etc.), and things change over time, creating a time series.

Introduction
A time-series database is a database that contains data for each point in time.
Examples: weather data, stock prices.

What to Mine?
Full periodic patterns: every point in time contributes to the cyclic behavior of the time series for each period, e.g., describing the weekly stock price pattern considering all the days of the week.
Partial periodic patterns: describe the behavior of the time series at some but not all points in time, e.g., discovering that stock prices are high every Saturday and low every Tuesday.
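The distinction between full and partial periodic patterns can be sketched in a few lines of Python. This is only an illustrative sketch, not the method used in the slides: it assumes the series has already been discretized into symbols (here the hypothetical labels H, M, L for high, medium, low prices), that the period is known in advance, and that a position counts as "periodic" when one symbol occupies it in at least `min_conf` of the periods. The function name, the threshold, and the sample data are all assumptions made for the example.

```python
from collections import Counter


def partial_periodic_patterns(symbols, period, min_conf=0.8):
    """Return {offset: (symbol, confidence)} for every offset within the
    period whose most frequent symbol appears in at least min_conf of the
    complete periods. Trailing symbols of an incomplete period are ignored."""
    n_periods = len(symbols) // period
    if n_periods == 0:
        return {}
    patterns = {}
    for offset in range(period):
        # Symbols observed at this offset in each complete period.
        column = [symbols[k * period + offset] for k in range(n_periods)]
        symbol, count = Counter(column).most_common(1)[0]
        confidence = count / n_periods
        if confidence >= min_conf:
            patterns[offset] = (symbol, confidence)
    return patterns


# Four hypothetical weeks of discretized prices, Monday (offset 0) to Sunday (offset 6).
series = "MLMHMHL" + "HLLMHHM" + "LLHMLHH" + "MLHLMHM"
patterns = partial_periodic_patterns(series, period=7)
print(patterns)

# A full periodic pattern is the special case where every offset qualifies.
is_full = len(patterns) == 7
print(is_full)
```

With this made-up data only offsets 1 (Tuesday, always L) and 5 (Saturday, always H) are reported, so the pattern is partial rather than full.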

Time Series definition
A (numeric) time series is a sequence of observations of a numeric property over time, e.g.: -1.25, -1.00, 0.01, 0.05, …, 5.45, 0.00

Motivation to Work in Time Series
Time series are ubiquitous.
Most of the information (data) produced in a variety of areas is time series; e.g., about 50% of all newspaper graphics are time series.
Other types of data can be converted to time series.
Image from E. J. Keogh. A decade of progress in indexing and mining large time series databases. In VLDB, page 1268, 2006.
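As one illustration of converting another type of data to a time series, a DNA string can be turned into a numeric sequence by treating each base as a step up or down and accumulating the steps. The sketch below is an assumption made for illustration, not something taken from the slides: the step values in `STEP` and the function name are hypothetical choices, and other mappings are equally possible.

```python
from itertools import accumulate

# Hypothetical step size per base; any fixed mapping would serve the illustration.
STEP = {"A": 2, "G": 1, "C": -1, "T": -2}


def dna_to_time_series(sequence):
    """Map a DNA string to a numeric series: each base moves the running
    total up or down by its step value (a cumulative 'walk')."""
    steps = (STEP[base] for base in sequence.upper())
    return list(accumulate(steps))


print(dna_to_time_series("GATTACA"))
```

Once the data is in this form, it can be handled like any other numeric time series.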

Time Series Examples
Electroencephalogram, physiology (muscle activation), sensors, historical archives, motion data, ECG.
Images from a variety of papers by E. J. Keogh. Available at: www.cs.ucr.edu/~eamonn

Time Series Examples (cont.)
Stocks data, sales, goods consumption, animal ECG, images, motion capture, handwritten character recognition, DNA sequences.
Image from E. J. Keogh. A decade of progress in indexing and mining large time series databases. In VLDB, page 1268, 2006.

Time Series data characteristics
Analysis is hard, as we are typically dealing with massive data sets:
One hour of EEG: 1 GB of data
Typical weblog: 5 GB / week
MACHO database: 5 TB (growing 3 GB a day)
Stanford Linear Accelerator database: 500 TB
Quadratic-complexity algorithms are too slow at this scale.
The data also present some distortions (noise, scaling effects, etc.) that make the analysis more difficult.
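A quick calculation shows why quadratic-complexity algorithms break down on large collections. The sketch below assumes the simplest quadratic task, comparing every series with every other series exactly once; the function name and the collection sizes are illustrative assumptions, not figures from the slides.

```python
def pairwise_comparisons(n):
    """Number of distinct pairs among n series: n * (n - 1) / 2."""
    return n * (n - 1) // 2


# Hypothetical collection sizes, chosen only to show the growth rate.
for n in (1_000, 1_000_000, 1_000_000_000):
    print(f"{n:>13,} series -> {pairwise_comparisons(n):,} pairwise comparisons")
```

Growing the collection by a factor of 1,000 multiplies the work by roughly 1,000,000, which is why sub-quadratic methods are needed for databases of the sizes listed above.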

Time Series Data Mining Tasks
Image from E. J. Keogh. A decade of progress in indexing and mining large time series databases. In VLDB, page 1268, 2006.