Mining and Processing Biomedical Data

Slides:



Advertisements
Similar presentations
Indexing Time Series Based on original slides by Prof. Dimitrios Gunopulos and Prof. Christos Faloutsos with some slides from tutorials by Prof. Eamonn.
Advertisements

SAX: a Novel Symbolic Representation of Time Series
Ch2 Data Preprocessing part3 Dr. Bernard Chen Ph.D. University of Central Arkansas Fall 2009.
Learning Trajectory Patterns by Clustering: Comparative Evaluation Group D.
Mining Time Series.
Hidden Markov Model based 2D Shape Classification Ninad Thakoor 1 and Jean Gao 2 1 Electrical Engineering, University of Texas at Arlington, TX-76013,
September 2, 2009ECE 366, Fall 2009 Introduction to ECE 366 Selin Aviyente Associate Professor.
A STUDY ON SPEECH RECOGNITION USING DYNAMIC TIME WARPING CS 525 : Project Presentation PALDEN LAMA and MOUNIKA NAMBURU.
Indexing Time Series. Time Series Databases A time series is a sequence of real numbers, representing the measurements of a real variable at equal time.
Indexing Time Series Based on Slides by C. Faloutsos (CMU) and D. Gunopulos (UCR)
Efficient Query Filtering for Streaming Time Series
Making Time-series Classification More Accurate Using Learned Constraints © Chotirat “Ann” Ratanamahatana Eamonn Keogh 2004 SIAM International Conference.
Jessica Lin, Eamonn Keogh, Stefano Loardi
Distance Functions for Sequence Data and Time Series
Based on Slides by D. Gunopulos (UCR)
Using Relevance Feedback in Multimedia Databases
A Multiresolution Symbolic Representation of Time Series
Ashish Uthama EOS 513 Term Paper Presentation Ashish Uthama Biomedical Signal and Image Computing Lab Department of Electrical.
Presented by Arun Qamra
Chapter One Characteristics of Instrumentation بسم الله الرحمن الرحيم.
Building Efficient Time Series Similarity Search Operator Mijung Kim Summer Internship 2013 at HP Labs.
Exact Indexing of Dynamic Time Warping
Multimedia and Time-series Data
CS910: Foundations of Data Analytics Graham Cormode Time Series Analysis.
Qualitative approximation to Dynamic Time Warping similarity between time series data Blaž Strle, Martin Možina, Ivan Bratko Faculty of Computer and Information.
Similarity measuress Laboratory of Image Analysis for Computer Vision and Multimedia Università di Modena e Reggio Emilia,
Analysis of Constrained Time-Series Similarity Measures
Data Mining & Knowledge Discovery Lecture: 2 Dr. Mohammad Abu Yousuf IIT, JU.
Similarity based Retrieval from Sequence Databases using Automata as Queries 作者 : A. Prasad Sistla, Tao Hu, Vikas howdhry 出處 :CIKM 2002 ACM 指導教授 : 郭煌政老師.
Time Series Data Analysis - I Yaji Sripada. Dept. of Computing Science, University of Aberdeen2 In this lecture you learn What are Time Series? How to.
K. Selçuk Candan, Maria Luisa Sapino Xiaolan Wang, Rosaria Rossini
ECE 8443 – Pattern Recognition ECE 3163 – Signals and Systems Objectives: The Trigonometric Fourier Series Pulse Train Example Symmetry (Even and Odd Functions)
Mining Time Series.
The Landmark Model: An Instance Selection Method for Time Series Data C.-S. Perng, S. R. Zhang, and D. S. Parker Instance Selection and Construction for.
Data Preprocessing Dr. Bernard Chen Ph.D. University of Central Arkansas Fall 2010.
Acknowledgements Contact Information Anthony Wong, MTech 1, Senthil K. Nachimuthu, MD 1, Peter J. Haug, MD 1,2 Patterns and Rules  Vital signs medoids.
Dynamic Learning Maps Paul Horner Newcastle University.
11/20/2015 Fourier Series Chapter /20/2015 Fourier Series Chapter 6 2.
k-Shape: Efficient and Accurate Clustering of Time Series
Exact indexing of Dynamic Time Warping
Indexing Time Series. Outline Spatial Databases Temporal Databases Spatio-temporal Databases Data Mining Multimedia Databases Text databases Image and.
Indexing Time Series. Outline Spatial Databases Temporal Databases Spatio-temporal Databases Multimedia Databases Time Series databases Text databases.
Using Word Based Features for Word Clustering The Thirteenth Conference on Language Engineering 11-12, December 2013 Department of Electronics and Communications,
Introduction to Earth Science Notes: Pages
Environmental and Exploration Geophysics II tom.h.wilson
The Trigonometric Fourier Series Representations
Mustafa Gokce Baydogan, George Runger and Eugene Tuv INFORMS Annual Meeting 2011, Charlotte A Bag-of-Features Framework for Time Series Classification.
Dr S D AL_SHAMMA Dr S D AL_SHAMMA11.
Topic 4: Cluster Analysis Analysis of Customer Behavior and Service Modeling.
Data statistics and transformation revision Michael J. Watts
Feature learning for multivariate time series classification Mustafa Gokce Baydogan * George Runger * Eugene Tuv † * Arizona State University † Intel Corporation.
Cluster Analysis This work is created by Dr. Anamika Bhargava, Ms. Pooja Kaul, Ms. Priti Bali and Ms. Rajnipriya Dhawan and licensed under a Creative Commons.
Matrix Profile I: All Pairs Similarity Joins for Time Series: A Unifying View that Includes Motifs, Discords and Shapelets Chin-Chia Michael Yeh, Yan.
Mining and Processing Biomedical Data
Data Processing As a science major, you will all eventually have to deal with data. All data has noise Devices do not give useful measurements; must convert.
Supervised Time Series Pattern Discovery through Local Importance
Biomedical Signal processing Chapter 1 Introduction
Distance Functions for Sequence Data and Time Series
A Time Series Representation Framework Based on Learned Patterns
School of Computer Science & Engineering
Time Series Data and Moving Object Trajectory
Biomedical Signal processing Chapter 1 Introduction
An Improved Neural Network Algorithm for Classifying the Transmission Line Faults Slavko Vasilic Dr Mladen Kezunovic Texas A&M University.
Alan Jovic1, Kresimir Jozic2, Davor Kukolja1,
What Is Good Clustering?
Some open questions: aggregation and privacy protection Coding, information theory and signal processing Du, Kargupta, Vora.
Determining the Function Obtained from a Series of Transformations.
Topic 5: Cluster Analysis
Data Processing Chapter 3
Measuring the Similarity of Rhythmic Patterns
Presentation transcript:

Mining and Processing Biomedical Data Dr. rer. nat. Krisztian Buza adiunkt naukowy Faculty of Mathematics, Informatics and Mechanics University of Warsaw, Poland chrisbuza@yahoo.com

Preprocessing Time Series

Time Series Sequence of numbers: 10, 12, 13 … Example: observation of the temperature 6:00 am – 10 ºC 7:00 am – 12 ºC 8:00 am – 13 ºC 9:00 am – 15 ºC ...

Multivariate Time Series Electrocardiograph (ECG) and Electroencephalograph (EEG) signals Several values (observations) at each point of time Excerpt from the publicly available EEG database at https://archive.ics.uci.edu/ml/datasets/EEG+Database Time point Senor 1 Sensor 2 Sensor 3 Sensor 4 Sensor 5 ... 1 -8.921 0.834 -19.847 8.148 -2.146 2 -8.433 3.276 -12.522 1.801 3 -2.574 5.717 1.149 -2.594 -1.658 4 5.239 7.67 14.821 -4.547 -0.682 5 11.587 9.623 20.681 -5.035 2.248 Image: http://en.wikipedia.org/wiki/File:EEG_cap.jpg

Nearest Neighbor Classification of Time Series How to compare two time series?

Nearest Neighbor Classification of Time Series How to compare two time series? Compare the values one by one: first time series: -1.34 -2 3.5 1.7 ... second time series: 0.32 -1.5 2.8 0.9 ...

Nearest Neighbor Classification of Time Series How to compare two time series? Compare the values one by one: first time series: -1.34 -2 3.5 1.7 ... second time series: 0.32 -1.5 2.8 0.9 ...

Nearest Neighbor Classification of Time Series How to compare two time series? Compare the values one by one: first time series: -1.34 -2 3.5 1.7 ... second time series: 0.32 -1.5 2.8 0.9 ...

Nearest Neighbor Classification of Time Series How to compare two time series? Compare the values one by one: first time series: -1.34 -2 3.5 1.7 ... second time series: 0.32 -1.5 2.8 0.9 ... Problems with this simple approach What if time series are of different length? Shiftings and elongations, i.e., patterns reflecting real-world phenomena may have various length and may begin at (slightly) different position of the time-series Noise

Preprocessing of Time Series Fourier transformation Aggregation of consequitve values Transformation to a symbolic representation Moving average Normalisation ...

Aggregation of consequtive values Time series: 1.0, 1.2, 1.3, 1.7, 1.7, 1.8, 1.8, 1.9, 1.9, 2.0, 2.1 2.2 … Aggregated time-series: 1.63 , 2.4 , ... (Average of the first 10 values) (Average of the next 10 values)

Aggregation of consequtive values Time series: 1.0, 1.2, 1.3, 1.7, 1.7, 1.8, 1.8, 1.9, 1.9, 2.0, 2.1 2.2 … Aggregated time-series: 1.63 , 2.4 , ... Symbolic representation B , C , ... (Average of the first 10 values) (Average of the next 10 values) 0..1 1..2 2..3 ... A B C

Moving Average Time series: Transformed time-series: 1.38 1.0, 1.2, 1.3, 1.7, 1.7, 1.8, 1.8, 1.9, 1.9, 2.0, 2.1 2.2 … Transformed time-series: 1.38 (Average of the first 5 values)

Moving Average Time series: Transformed time-series: 1.38, 1.54, 1.0, 1.2, 1.3, 1.7, 1.7, 1.8, 1.8, 1.9, 1.9, 2.0, 2.1 2.2 … Transformed time-series: 1.38, 1.54,

Moving Average Time series: Transformed time-series: 1.38, 1.54, ... 1.0, 1.2, 1.3, 1.7, 1.7, 1.8, 1.8, 1.9, 1.9, 2.0, 2.1 2.2 … Transformed time-series: 1.38, 1.54, ...

Normalisation Time series: 1.0, 1.2, 1.3, 1.7, 1.7, 1.8, 1.8, 1.9, 1.9, 2.0, 2.1 2.2 … Average of all the values: 3.2 Standard deviation of all the values: 1.5 Normalised time-series: -1.46, -1.33, -1.26 ... (1.0 – 3.2) / 1.5 = -1.46

Dynamic Time Warping (DTW)

Proximity measures for time series Similarity measures and distance measure Similarity measure High value → two time series are similar Low value → two time series are different Distance measure High value → two time series are different (dissimilar) Low value → two time series are similar Dynamic Time Warping (DTW) is a distance measure

Example x*: 0, 0, 0, 1, 3, 0, -2, -1, 0, 0, 0 x1: 0, 0, 2, 4, 1, -1, 0, 0, 0, 0, 0 x2: 0, 1, 2, 1, 0, 1, 2, 1, 0, 1, 2 x* x1 x2

Dynamic Time Warping

Dynamic Time Warping

Dynamic Time Warping

Dynamic Time Warping |3-1| + 1 = 3

Dynamic Time Warping For example, for this entry, we record that the yellow entry had the minimum out of the three entries that were considered while filling-in this entry.