Eamonn Keogh Li Wei Xiaopeng Xi Stefano Lonardi Jin Shieh Scott Sirowy

Presentation transcript:

Intelligent Icons: Integrating Lite-Weight Data Mining and Visualization into GUI Operating Systems
Eamonn Keogh, Li Wei, Xiaopeng Xi, Stefano Lonardi, Jin Shieh, Scott Sirowy
Computer Science & Engineering Dept., University of California, Riverside

Outline
Overview
An Example: DNA to Intelligent Icon
Icon Generation Algorithm
Experimental Evaluation
Conclusion

"Eamonn, patent this idea!" (Christos Faloutsos)

Dataset Kalpakis_ECG Icons in a traditional browser

Dataset Kalpakis_ECG
Suppose I magically...
Color the icons to somehow reflect the contents of the file.
Position the icons based on their colors/patterns.
[Figure: file browser showing the icons for normal1.txt through normal18.txt]

Let us start with visualizing a special data type, DNA. The DNA of two species… Are they similar?

TGGCCGTGCTAGGCCCCACCCCTACCTTGCAGTCCCCGCAAGCTCATCTGCGCGAACCAGAACGCCCACCACCCTTGGGTTGAAATTAAGGAGGCGGTTGGCAGCTTCCCAGGCGCACGTACCTGCGAATAAATAACTGTCCGCACAAGGAGCCCGACGATAGTCGACCCTCTCTAGTCACGACCTACACACAGAACCTGTGCTAGACGCCATGAGATAAGCTAACACAAAAACATTTCCCACTACTGCTGCCCGCGGGCTACCGGCCACCCCTGGCTCAGCCTGGCGAAGCCGCCCTTCA

CCGTGCTAGGGCCACCTACCTTGGTCCGCCGCAAGCTCATCTGCGCGAACCAGAACGCCACCACCTTGGGTTGAAATTAAGGAGGCGGTTGGCAGCTTCCAGGCGCACGTACCTGCGAATAAATAACTGTCCGCACAAGGAGCCGACGATAAAGAAGAGAGTCGACCTCTCTAGTCACGACCTACACACAGAACCTGTGCTAGACGCCATGAGATAAGCTAACA

[Figure: level-1 bitmap, a 2×2 grid with cells C T / A G, each colored by the frequency of that base in the sequence below; values shown: 0.20, 0.24, 0.26, 0.30]

CCGTGCTAGGGCCACCTACCTTGGTCCGCCGCAAGCTCATCTGCGCGAACCAGAACGCCACCACCTTGGGTTGAAATTAAGGAGGCGGTTGGCAGCTTCCAGGCGCACGTACCTGCGAATAAATAACTGTCCGCACAAGGAGCCGACGATAAAGAAGAGAGTCGACCTCTCTAGTCACGACCTACACACAGAACCTGTGCTAGACGCCATGAGATAAGCTAACA
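The level-1 bitmap above only needs the relative frequency of each base. A minimal sketch of that count follows; the function name `base_frequencies` is hypothetical, and the slides do not say how a frequency is turned into a color, so that step is left out.

```python
from collections import Counter


def base_frequencies(dna: str) -> dict[str, float]:
    """Return the relative frequency of each base C, T, A, G in `dna`.

    Characters other than C, T, A, G are ignored (an assumption; the
    slides only show clean sequences).
    """
    counts = Counter(dna.upper())
    total = sum(counts[base] for base in "CTAG")
    if total == 0:
        raise ValueError("sequence contains no C, T, A or G")
    return {base: counts[base] / total for base in "CTAG"}


# Example on a short prefix of the sequence shown on the slide.
freqs = base_frequencies("CCGTGCTAGG")
```

Each of the four values would then color one cell of the 2×2 grid.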

[Figure: the grid is refined recursively, one letter per level.

Level 1 (single bases):
C T
A G

Level 2 (pairs):
CC CT TC TT
CA CG TA TG
AC AT GC GT
AA AG GA GG

Level 3 (triples) continues the same way: its first row starts CCC CCT CTC, and its second row starts CCA CCG CTA.]

CCGTGCTAGGGCCACCTACCTTGGTCCGCCGCAAGCTCATCTGCGCGAACCAGAACGCCACCACCTTGGGTTGAAATTAAGGAGGCGGTTGGCAGCTTCCAGGCGCACGTACCTGCGAATAAATAACTGTCCGCACAAGGAGCCGACGATAAAGAAGAGAGTCGACCTCTCTAGTCACGACCTACACACAGAACCTGTGCTAGACGCCATGAGATAAGCTAACA

[Figure: level-2 bitmap in which each cell (CA, AC, AT, AA, AG, ...) is colored by the frequency of that pair in the sequence below; values shown include 0.02, 0.03, 0.04, 0.07, 0.09, 0.11 and 1]

CCGTGCTAGGCCCCACCCCTACCTTGCAGTCCCCGCAAGCTCATCTGCGCGAACCAGAACGCCCACCACCCTTGGGTTGAAATTAAGGAGGCGGTTGGCAGCTTCCCAGGCGCACGTACCTGCGAATAAATAACTGTCCGCACAAGGAGCCCGACGATAGTCGACCCTCTCTAGTCACGACCTACACACAGAACCTGTGCTAGACGCCATGAGATAAGCTAACA
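The recursive grid can be sketched in a few lines: the first letter of a word picks a quadrant (C top-left, T top-right, A bottom-left, G bottom-right), the next letter picks a sub-quadrant, and so on. The names `cell` and `dna_bitmap` are hypothetical. Dividing by the largest count is an assumption, suggested by the value 1 that appears among the frequencies on the slide but not stated in the transcript.

```python
from collections import Counter

# (row, col) offset contributed by each base, following the level-1 layout
#   C T
#   A G
QUADRANT = {"C": (0, 0), "T": (0, 1), "A": (1, 0), "G": (1, 1)}


def cell(word: str) -> tuple[int, int]:
    """Map a word over {C, T, A, G} to its (row, col) in the 2^k x 2^k grid."""
    row = col = 0
    for base in word:
        r, c = QUADRANT[base]
        row = 2 * row + r  # each letter refines the position by one bit
        col = 2 * col + c
    return row, col


def dna_bitmap(dna: str, k: int) -> list[list[float]]:
    """Count every length-k subword of an upper-case sequence and place the
    counts in the grid. Counts are divided by the largest count (assumption)."""
    side = 2 ** k
    grid = [[0.0] * side for _ in range(side)]
    counts = Counter(dna[i:i + k] for i in range(len(dna) - k + 1))
    for word, n in counts.items():
        if all(base in QUADRANT for base in word):  # skip words with other symbols
            row, col = cell(word)
            grid[row][col] = float(n)
    peak = max(max(r) for r in grid)
    if peak > 0:
        grid = [[v / peak for v in r] for r in grid]
    return grid


bitmap = dna_bitmap("CCCT", 2)  # subwords: CC, CC, CT
```

With `k = 2` this yields the 4×4 layout CC CT TC TT / CA CG TA TG / AC AT GC GT / AA AG GA GG.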

OK. Given any DNA string I can make a colored bitmap, so what? CCGTGCTAGGCCCCACCCCTACCTTGCAGTCCCCGCAAGCTCATCTGCGCGAACCAGAACGCCCACCACCCTTGGGTTGAAATTAAGGAGGCGGTTGGCAGCTTCCCAGGCGCACGTACCTGCGAATAAATAACTGTCCGCACAAGGAGCCCGACGATAGTCGACCCTCTCTAGTCACGACCTACACACAGAACCTGTGCTAGACGCCATGAGATAAGCTAACA

[Figure: intelligent icons for the following DNA files]
African elephant.dna
Indian elephant.dna
chimpanzee.dna
pygmy chimpanzee.dna
hippopotamus.dna
Human.dna
orangutan.dna
pygmy sperm whale.dna
sperm whale.dna
rhesus monkey.dna
Indian rhinoceros.dna
white rhinoceros.dna

Note: Elephas maximus is the Indian elephant, Loxodonta africana is the African elephant, and Pan troglodytes is the chimpanzee.

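Can we make Intelligent Icons for time series? Yes, with SAX!

SAX converts the time series into a string over a small alphabet, for example:
accbabcdbcabdbcadbacbdbdcadbaacb…

[Figure: Time Series Bitmap. The same recursive grid is used, with the alphabet a, b, c, d.

Level 1:
a b
c d

Level 2:
aa ab ba bb
ac ad bc bd
ca cb da db
cc cd dc dd

Level 3 continues the same way (aaa, aab, aba, ...).]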
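The slides do not spell out the SAX conversion, so the following is only a minimal sketch of the usual procedure: z-normalize the series, average it over equal-width segments (PAA), and map each average to a letter using breakpoints that cut the standard normal distribution into equiprobable regions. The function name `sax` and its parameters are hypothetical, and the sketch assumes the series length is a multiple of the number of segments.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def sax(series: list[float], n_segments: int, alphabet: str = "abcd") -> str:
    """Convert a numeric series into a SAX word of length `n_segments`."""
    if len(series) % n_segments != 0:
        raise ValueError("series length must be a multiple of n_segments")
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        raise ValueError("constant series cannot be z-normalized")
    z = [(x - mu) / sigma for x in series]

    # Piecewise Aggregate Approximation: mean of each equal-width segment.
    width = len(z) // n_segments
    paa = [mean(z[i * width:(i + 1) * width]) for i in range(n_segments)]

    # Breakpoints split N(0, 1) into len(alphabet) equiprobable regions,
    # e.g. roughly -0.67, 0, 0.67 for a four-letter alphabet.
    size = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(i / size) for i in range(1, size)]

    # The number of breakpoints at or below a value is its letter index.
    return "".join(alphabet[bisect_right(breakpoints, v)] for v in paa)


word = sax([1, 2, 3, 4, 5, 6, 7, 8], n_segments=4)
```

Counting the subwords of the resulting string and placing the counts in the a b / c d grid then gives the time series bitmap, in the same way as for DNA.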

While they are all examples of EEGs, example_a.dat is from a normal trace, whereas the others contain examples of spike-wave discharges.

We can further enhance the time series bitmaps by arranging the thumbnails by "cluster", instead of arranging by date, size, name, etc. We can achieve this with MDS.

[Figure: One Year of Italian Power Demand, January to December, with icons for the monthly files August.txt, July.txt, June.txt, April.txt, May.txt, Sept.txt, Oct.txt, Feb.txt, Dec.txt, March.txt, Nov.txt and Jan.txt arranged by cluster]
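As a rough illustration of the MDS step, the sketch below embeds items in two dimensions from a matrix of pairwise distances using classical (Torgerson) MDS. Several things here are assumptions rather than the authors' stated method: the slides do not say which MDS variant or which distance is used, so Euclidean distance between bitmaps is taken as a stand-in, and the names `bitmap_distance` and `classical_mds` are hypothetical. Power iteration is used only to keep the example dependency-free.

```python
import math


def bitmap_distance(a, b):
    """Euclidean distance between two equally sized bitmaps (lists of rows)."""
    return math.sqrt(sum((x - y) ** 2
                         for ra, rb in zip(a, b) for x, y in zip(ra, rb)))


def _top_eigenpair(m, iterations=500):
    """Power iteration for the dominant eigenpair of a symmetric matrix.

    Assumes the matrix is positive semi-definite, which holds for the
    centered matrix of Euclidean distances."""
    n = len(m)
    v = [1.0 + 0.1 * i for i in range(n)]  # deterministic, non-constant start
    for _ in range(iterations):
        w = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:  # nothing left to extract
            return 0.0, [0.0] * n
        v = [x / norm for x in w]
    mv = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(x * y for x, y in zip(v, mv)), v


def classical_mds(dist, dims=2):
    """Return one `dims`-dimensional coordinate per item of a symmetric
    distance matrix `dist`."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(r) / n for r in sq]
    total_mean = sum(row_mean) / n
    # Double centering: B = -1/2 * J * D^2 * J
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
          for j in range(n)] for i in range(n)]
    coords = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        value, vec = _top_eigenpair(b)
        if value <= 1e-12:
            break
        scale = math.sqrt(value)
        for i in range(n):
            coords[i][d] = scale * vec[i]
        # Deflate so the next pass finds the next eigenpair.
        b = [[b[i][j] - value * vec[i] * vec[j] for j in range(n)]
             for i in range(n)]
    return coords


# Three items whose distances correspond to points 0, 1 and 3 on a line.
layout = classical_mds([[0, 1, 3], [1, 0, 2], [3, 2, 0]])
```

The returned coordinates could be used directly as icon positions, so that files with similar bitmaps end up near each other.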

Text Example
Here are some papers that reference Eamonn Keogh's work…

Text Example
[Figure: icons for the papers, with these labels]
Cluster of "warping" papers
Cluster of "classification" papers
Paper on using "warping" to classify
Classification paper in Italian
"Warping" paper in Portuguese

Intelligent Icon Search

Paper Summary
We show how to map DNA, time series and natural language into intelligent icons.
We give a generic framework for mapping any kind of data into intelligent icons.
We show the utility of intelligent icons for finding patterns (clusters, outliers, etc.).

Questions?