Fast Shapelets: All Figures in Higher Resolution.

Figure 1: left) Skulls of horned lizards and turtles. right) The time series representing the images. The 2D shapes are converted to time series using the technique in [14].

Figure: Time series of two skulls of horned lizards

Figure 2: left) The shapelet that best distinguishes between skulls of horned lizards and turtles, shown as the purple/bold subsequence. right) The shapelet projected back to the original 2D shape space

Figure 3: The orderline shows the distance between the candidate subsequence and all time series as positions on the x-axis. The three objects on the left-hand side of the line correspond to horned lizards and the three objects on the right correspond to turtles.
[Figure labels: Orderline; 0; ∞; split; candidate]
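
The orderline idea in Figure 3 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the distance is a plain (unnormalized) Euclidean distance to the best-matching window, and information gain is assumed as the split-quality measure, which the caption does not specify.

```python
import math

def subsequence_distance(candidate, series):
    """Smallest Euclidean distance between the candidate and any window of the same length."""
    m = len(candidate)
    best = math.inf
    for start in range(len(series) - m + 1):
        window = series[start:start + m]
        d = math.sqrt(sum((c - w) ** 2 for c, w in zip(candidate, window)))
        best = min(best, d)
    return best

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((labels.count(c) / n) * math.log2(labels.count(c) / n)
                for c in set(labels))

def best_split(candidate, dataset):
    """dataset is a list of (series, label) pairs.

    Places every series on the orderline by its distance to the candidate,
    then returns the (threshold, information_gain) of the best split point.
    """
    orderline = sorted((subsequence_distance(candidate, s), y) for s, y in dataset)
    labels = [y for _, y in orderline]
    base = entropy(labels)
    best_thr, best_gain = None, -1.0
    for i in range(1, len(orderline)):
        if orderline[i][0] == orderline[i - 1][0]:
            continue  # no threshold can separate two equal distances
        left, right = labels[:i], labels[i:]
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        gain = base - weighted
        if gain > best_gain:
            best_gain = gain
            best_thr = (orderline[i - 1][0] + orderline[i][0]) / 2
    return best_thr, best_gain

# Toy data (made up): three "lizard" series contain the candidate, three "turtle" series do not.
toy = [([0, 0, 1, 0, 0], "lizard"), ([0, 1, 0, 0, 0], "lizard"), ([0, 0, 0, 1, 0], "lizard"),
       ([4, 4, 4, 4], "turtle"), ([5, 5, 5, 5], "turtle"), ([6, 6, 6, 6], "turtle")]
threshold, gain = best_split([0, 1, 0], toy)
```

On this toy data the candidate separates the two classes perfectly, so the split falls between the last lizard and the first turtle on the orderline.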

Figure 4: top.left) The SAX word adbacc created from a subsequence of the time series corresponding to P. coronatum. bottom) Sliding window technique.
[Figure labels: a a d b c c; b c a a c d (another example of a SAX word)]
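
As a rough sketch of how a SAX word like the one in Figure 4 is produced, the following Python follows the standard SAX recipe: z-normalize the subsequence, average it into equal-width segments (PAA), then map each segment mean to a letter using equiprobable breakpoints of the standard normal distribution. The function names, the four-letter alphabet, and the requirement that the subsequence length be divisible by the word length are assumptions made for this example.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev

def sax_word(subsequence, word_length, alphabet="abcd"):
    """Convert one subsequence into a SAX word of the given length."""
    if len(subsequence) % word_length != 0:
        raise ValueError("this sketch assumes len(subsequence) is divisible by word_length")
    mu, sigma = mean(subsequence), pstdev(subsequence)
    # z-normalize; a constant subsequence maps to all zeros
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in subsequence]
    seg = len(z) // word_length
    # Piecewise Aggregate Approximation: one mean per segment
    paa = [mean(z[i * seg:(i + 1) * seg]) for i in range(word_length)]
    a = len(alphabet)
    # breakpoints that cut the standard normal into a equiprobable regions
    breakpoints = [NormalDist().inv_cdf(k / a) for k in range(1, a)]
    return "".join(alphabet[bisect_right(breakpoints, v)] for v in paa)

def sax_words(series, window, word_length, alphabet="abcd"):
    """Sliding window: one SAX word per window position."""
    return [sax_word(series[i:i + window], word_length, alphabet)
            for i in range(len(series) - window + 1)]

word = sax_word([-3, -3, -1, -1, 1, 1, 3, 3], 4)
words = sax_words(list(range(10)), window=8, word_length=4)
```

Because every window is normalized on its own, windows with the same shape at different offsets or scales map to the same word.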

Figure 5: left) SAX words of each object. right) SAX words after masking two symbols. Note that the masking positions are picked randomly.
[Figure labels: Obj 1, Obj 2, Obj 3; SAX Words; 1st Random Mask; 2nd Random Mask]

Figure 6: The first (A) and second (B) iterations of the counting process. left) Hashing process to match all identical signatures. Signatures are created by removing the marked symbols from the SAX words. right) Collision tables showing the number of objects matched by each word.
[Figure labels: A), B); Signatures; ID; Object List; Obj 1, Obj 2, Obj 3]
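
The masking and counting steps of Figures 5 and 6 can be sketched as below. This is only an illustration under stated assumptions: `count_collisions` and its parameters are hypothetical names, each object is represented by a set of equal-length SAX words, and an object is counted as colliding with its own words (the caption does not say whether self-matches are included).

```python
import random
from collections import defaultdict

def count_collisions(words_by_object, num_masked, iterations, seed=0):
    """words_by_object maps an object id to its set of SAX words.

    Returns {word: {object id: number of iterations in which the object
    holds a word with the same masked signature}}.
    """
    rng = random.Random(seed)
    word_length = len(next(w for words in words_by_object.values() for w in words))
    all_words = set().union(*words_by_object.values())
    counts = {w: defaultdict(int) for w in all_words}
    for _ in range(iterations):
        # pick the positions to mask at random, anew in every iteration
        masked = set(rng.sample(range(word_length), num_masked))

        def signature(word):
            # the signature is the word with the masked symbols removed
            return "".join(ch for i, ch in enumerate(word) if i not in masked)

        # hash every word by its signature and remember which objects own it
        buckets = defaultdict(set)
        for obj, words in words_by_object.items():
            for w in words:
                buckets[signature(w)].add(obj)
        # one row of the collision table per word
        for w in all_words:
            for obj in buckets[signature(w)]:
                counts[w][obj] += 1
    return {w: dict(c) for w, c in counts.items()}

table = count_collisions({1: {"abcd"}, 2: {"abcd"}, 3: {"dcba"}},
                         num_masked=2, iterations=5)
```

Words that differ only in the masked positions land in the same bucket, so repeating the process with fresh masks gives similar words many chances to collide while dissimilar words rarely do.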

Figure 7: A) The collision table of all words after five iterations. Note that the counts show the number of times an object shares the same signature with the reference word. B) Grouping the counting scores of objects in the same class. C) Complement of (B), showing how many times the objects in each class do not share the same signature with the reference word. D) The distinguishing power of each SAX word.
[Figure labels: A), B), C), D); Obj 1, Obj 2, Obj 3, Obj 4; Close to Ref; Far from Ref; Class1, Class2; Distinguishing Power]
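
Steps B to D of Figure 7 might be computed along the following lines. The caption does not give the scoring formula, so this sketch assumes that the distinguishing power of a word is the sum, over classes, of the absolute difference between its "close to reference" and "far from reference" counts. The function name and arguments are hypothetical.

```python
def distinguishing_power(collision_counts, labels, iterations):
    """Score one reference word.

    collision_counts: {object id: collision count} for this word (one row of table A)
    labels:           {object id: class label}
    iterations:       number of masking iterations used to build the table
    """
    close, size = {}, {}
    for obj, cls in labels.items():
        # B) group the counts of all objects belonging to the same class
        close[cls] = close.get(cls, 0) + collision_counts.get(obj, 0)
        size[cls] = size.get(cls, 0) + 1
    # C) complement: how often objects of each class did NOT collide
    far = {cls: size[cls] * iterations - close[cls] for cls in close}
    # D) assumed scoring rule: sum of |close - far| over the classes
    power = sum(abs(close[cls] - far[cls]) for cls in close)
    return close, far, power

labels = {1: "Class1", 2: "Class1", 3: "Class2", 4: "Class2"}
close, far, power = distinguishing_power({1: 5, 2: 4, 3: 1, 4: 0}, labels, iterations=5)
```

In this made-up example the word collides almost always with Class1 objects and almost never with Class2 objects, which yields a high score under the assumed rule.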

Figure 8: Classification accuracy of our algorithm and the state-of-the-art on 32 datasets from the UCR archive.
[Figure labels: Classification Accuracy Comparison; Current state-of-the-art; Our algorithm; In this area, our algorithm is better; In this area, SOTA is better; wins 15 loses]

Figure 9: Running time comparison between our algorithm and the state-of-the-art on 32 datasets from the UCR time series archive.
[Figure labels: Execution Time Comparison; Current state-of-the-art; Our algorithm; 1X, 10X, 100X, 1000X, 10000X; sec]

Figure 10: Scalability of our algorithm and the current state-of-the-art on the StarlightCurves dataset. left) The number of time series in the dataset is varied. right) The length of the time series is varied.
[Figure labels: Scalability on Number of Time Series; Scalability on Time Series Length; number of time series; length of time series; second; our algorithm; state-of-the-art; (average from 30 runs)]

Figure 11: Accuracy ratio between the FastShapelet algorithm and the Euclidean-distance-based one-nearest-neighbor classifier on all 45 datasets from the UCR archive.
[Figure labels: Expected Ratio; Actual Ratio; FP, TP, FN, TN]

Figure 12: bottom) The accuracy of the algorithm is not sensitive to either parameter, r or k. top) The running time of the algorithm is approximately linear in either parameter. Note that when we vary r (k), we fix k (r) to ten, so only one parameter changes at a time.
[Figure labels: Vary K; Vary R; Accuracy (%); Time (sec); (average from 30 runs)]

Figure 13: Examples of starlight curves in three classes: Eclipsed Binaries, Cepheids, and RR Lyrae Variables.

Figure 14: left) Decision tree of the StarlightCurve dataset created by our algorithm. right) Two shapelets shown as the red/bold part of the time series.
[Figure labels: EB, RR, Cep; I, II; Shapelet I, Shapelet II; dist thres = ; dist thres = 5.79; object from RR; object from Cep]
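
A shapelet decision tree such as the one in Figure 14 classifies a time series by comparing its distance to each node's shapelet against that node's distance threshold. The sketch below uses made-up shapelets, thresholds, and class names rather than the ones in the figure. The `Node` class, the unnormalized Euclidean distance, and the convention that the "near" branch is taken when the distance is at or below the threshold are all assumptions.

```python
import math
from dataclasses import dataclass
from typing import Sequence, Union

@dataclass
class Node:
    shapelet: Sequence[float]
    threshold: float
    near: Union["Node", str]  # taken when distance <= threshold (assumed convention)
    far: Union["Node", str]   # taken otherwise

def subsequence_distance(shapelet, series):
    """Smallest Euclidean distance between the shapelet and any window of the same length."""
    m = len(shapelet)
    return min(
        math.sqrt(sum((a - b) ** 2 for a, b in zip(shapelet, series[i:i + m])))
        for i in range(len(series) - m + 1)
    )

def classify(tree, series):
    """Walk the tree until a leaf (a class label string) is reached."""
    node = tree
    while isinstance(node, Node):
        d = subsequence_distance(node.shapelet, series)
        node = node.near if d <= node.threshold else node.far
    return node

# Hypothetical two-level tree with toy shapelets and thresholds.
tree = Node([0, 1, 0], 0.5, "A", Node([2, 2], 0.5, "B", "C"))
predictions = [classify(tree, s) for s in ([0, 0, 1, 0, 0], [2, 2, 2, 5], [9, 9, 9, 9])]
```

Each internal node needs only one shapelet and one threshold, which is why a tree of this kind stays small and easy to read.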

Figure 15: Examples of all outdoor activities from the PAMAP dataset. Note that the time series of each activity generally have different lengths.
[Figure labels: Outdoor Activities from PAMAP Dataset; Slow Walk; Normal Walk; Nordic Walk; Run; Cycle; Soccer; Rope Jump]

Figure 16: top) ECG time series when first recorded. left) Time series from the two classes are so similar that they are hard to distinguish by eye. right) The shapelet discovered by our algorithm, shown in red/bold.
[Figure labels: Original long time series when recorded; Time series of class 1 and class 2; Shapelet shown in red/bold; dist threshold = 2.446]