Dr. Hesam Izakian October 2014

Presentation transcript:

Cluster-Centric Anomaly Detection and Characterization in Spatial Time Series
Dr. Hesam Izakian, October 2014
The slides can be downloaded from http://www.ece.ualberta.ca/~izakian/p1.ppt

Outline
  Spatial time series
  Problem formulation
  Anomaly detection in spatial time series: questions
  Overall scheme of the proposed method
  Time series segmentation
  Spatial time series clustering
  Assigning anomaly scores to clusters
  Visualizing the propagation of anomalies
  An outbreak detection scenario
  Application
  Conclusions

Spatial time series
Structure of data
  A set of spatial coordinates
  One or more time series for each point
Examples
  Daily average temperature at different climate stations
  Stock market indexes in different countries
  Number of absent students in different schools
  Number of emergency department (ED) visits in different hospitals
  Measured signals in different parts of the brain
Alberta Health Services records the following for outbreak detection in the province
  Number of ED visits in different hospitals
  Absenteeism in different schools, etc.

Problem formulation
There are N spatial time series
Objective: find a spatial neighborhood of data, in a time interval, containing a high level of unexpected changes
Notation
  N: number of spatial time series
  r: number of features in the spatial part of the data
  n: length of the time series
  x_i(s): spatial part of the data
  x_i(t): time series part of the data
  l: length of the time interval

Anomaly detection in spatial time series: questions
Spatial neighborhood of data
  Size of the neighborhood
  Overlapping neighborhoods
Unexpected changes (anomalies)
  What kinds of changes are expected or not expected?
  How to evaluate the level of unexpected changes?
Anomaly visualization
Anomaly characterization
  What was the source of the anomaly?
  How does the anomaly propagate over time?

Overall scheme of the proposed method
  Revealing the structure of data in various time intervals
  Comparing the revealed structures
[Diagram blocks: spatial time series data; sliding window; anomaly scores; spatial time series clustering; fuzzy relations]

Time series part segmentation
  Sliding window
  Spatio-temporal subsequences
  Local view of the time series part
  The spatial part of the data is always fixed
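The segmentation step can be sketched in a few lines. This is a minimal illustration, not the author's implementation: the function names are hypothetical, and the convention that only windows fitting entirely inside the series are kept is an assumption.

```python
from typing import List, Tuple


def sliding_windows(n: int, length: int, step: int) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, for every window
    of the given length that fits fully inside a series of length n."""
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    return [(s, s + length) for s in range(0, n - length + 1, step)]


def subsequences(series: List[List[float]], length: int, step: int) -> List[List[List[float]]]:
    """Cut every time series with the same windows.

    result[w][k] is the subsequence of series k inside window w.
    The spatial coordinates are not touched: they stay fixed across windows.
    """
    n = len(series[0])
    return [[ts[a:b] for ts in series] for a, b in sliding_windows(n, length, step)]
```

With the values used in the example later in the talk (series of length 100, window length 20, movement 10), `sliding_windows(100, 20, 10)` starts at `(0, 20)` and ends at `(80, 100)` under this convention.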

Fuzzy C-Means clustering: visual illustration
Clustering: grouping objects (data) so that objects in the same group are similar and objects in different groups are dissimilar
K-means is one of the well-known clustering techniques, with Boolean membership assignment

Fuzzy C-Means clustering: visual illustration
In fuzzy clustering, membership degrees are employed instead of a Boolean assignment of data to clusters

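Fuzzy C-Means clustering…
Partitions N data into c clusters
Result: cluster centers v_1, ..., v_c and a partition matrix U = [u_ik]
Objective function: $J = \sum_{i=1}^{c} \sum_{k=1}^{N} u_{ik}^{m} \lVert x_k - v_i \rVert^2$
Minimization: $v_i = \dfrac{\sum_{k=1}^{N} u_{ik}^{m} x_k}{\sum_{k=1}^{N} u_{ik}^{m}}$, $u_{ik} = \dfrac{1}{\sum_{j=1}^{c} \left( \lVert x_k - v_i \rVert / \lVert x_k - v_j \rVert \right)^{2/(m-1)}}$
  N: number of data
  c: number of clusters
  u_ik indicates the membership degree of x_k to v_i
  m controls the overlap between clusters (fuzziness)
FCM algorithm
  1. Generate a partition matrix U randomly
  2. Calculate the cluster centers
  3. Update the partition matrix
  4. If not converged, go to 2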
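The four algorithm steps can be sketched as follows. This is a plain-Python illustration of standard FCM, not the code behind the talk; the function name `fcm`, the stopping rule (largest change in membership below `tol`), and the default parameter values are assumptions.

```python
import random
from typing import List, Tuple

Vector = List[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fcm(data: List[Vector], c: int, m: float = 2.0, max_iter: int = 100,
        tol: float = 1e-6, seed: int = 0) -> Tuple[List[List[float]], List[Vector]]:
    """Return (U, V). U[i][k] is the membership of data[k] in cluster i;
    V[i] is the center of cluster i from the last center update."""
    if m <= 1.0:
        raise ValueError("fuzziness m must be greater than 1")
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])

    # Step 1: random partition matrix whose columns sum to 1.
    u = [[rng.random() + 1e-9 for _ in range(n)] for _ in range(c)]
    for k in range(n):
        col = sum(u[i][k] for i in range(c))
        for i in range(c):
            u[i][k] /= col

    centers: List[Vector] = []
    for _ in range(max_iter):
        # Step 2: centers are means of the data weighted by u_ik^m.
        centers = []
        for i in range(c):
            w = [u[i][k] ** m for k in range(n)]
            total = sum(w)
            centers.append([sum(w[k] * data[k][d] for k in range(n)) / total
                            for d in range(dim)])

        # Step 3: update memberships from distances to the new centers.
        new_u = [[0.0] * n for _ in range(c)]
        for k in range(n):
            d2 = [sq_dist(data[k], centers[i]) for i in range(c)]
            zeros = [i for i in range(c) if d2[i] == 0.0]
            if zeros:
                # A point sitting exactly on a center belongs fully to it.
                for i in zeros:
                    new_u[i][k] = 1.0 / len(zeros)
            else:
                inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
                s = sum(inv)
                for i in range(c):
                    new_u[i][k] = inv[i] / s

        # Step 4: stop when the partition matrix no longer changes.
        change = max(abs(new_u[i][k] - u[i][k]) for i in range(c) for k in range(n))
        u = new_u
        if change < tol:
            break
    return u, centers
```

Each column of the returned `U` sums to 1, which is what distinguishes the membership degrees from a Boolean K-means assignment.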

Spatial time series clustering
Reveals the available structure within the data, in the form of partition matrices
Challenges
  Different sources: spatial part vs. temporal part
  Different dimensionality in each part
  Different structure within each part
A partition matrix is able to express the structure of data in terms of a set of membership degrees

Spatial time series clustering…
In spatial time series, we define: [equation not preserved in transcript]
Adopted FCM objective function: [equation not preserved in transcript]
Characteristics
  When λ = 0: only the spatial part of the data is used in clustering
  A higher value of λ: a higher impact of the time series part in clustering
  Optimal value of λ: optimal impact of each part in clustering
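The role of λ can be illustrated with a weighted distance. The formula on this slide did not survive the transcript, so the form below (squared spatial distance plus λ times squared time series distance) is an assumption chosen to agree with the characteristics listed above, in particular that λ = 0 leaves only the spatial part. The function name and argument names are hypothetical.

```python
from typing import Sequence


def augmented_sq_distance(x_s: Sequence[float], x_t: Sequence[float],
                          v_s: Sequence[float], v_t: Sequence[float],
                          lam: float) -> float:
    """Assumed form: ||x(s) - v(s)||^2 + lam * ||x(t) - v(t)||^2.

    x_s, x_t: spatial part and time series part of one datum.
    v_s, v_t: spatial part and time series part of one cluster center.
    lam:      weight of the time series part (lam = 0 ignores it).
    """
    if lam < 0:
        raise ValueError("lam must be non-negative")
    spatial = sum((a - b) ** 2 for a, b in zip(x_s, v_s))
    temporal = sum((a - b) ** 2 for a, b in zip(x_t, v_t))
    return spatial + lam * temporal
```

Replacing the squared Euclidean distance in a standard FCM loop with a distance of this kind lets a single parameter trade off the two parts, even though they differ in dimensionality.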

Spatial time series clustering: optimal value of λ
The reconstruction criterion evaluates the quality of clusters in terms of data granularization and de-granularization
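A reconstruction criterion of this kind can be sketched as follows. The slide does not give the formula, so this uses a commonly used form as an assumption: each datum is rebuilt (de-granularized) from the cluster centers weighted by its memberships raised to the power m, and the error is the sum of squared differences to the original data. Function names are hypothetical.

```python
from typing import List

Vector = List[float]


def reconstruct(memberships: List[float], centers: List[Vector], m: float = 2.0) -> Vector:
    """Rebuild one datum from the centers, weighted by u_ik^m."""
    w = [u ** m for u in memberships]
    total = sum(w)
    dim = len(centers[0])
    return [sum(w[i] * centers[i][d] for i in range(len(centers))) / total
            for d in range(dim)]


def reconstruction_error(data: List[Vector], u: List[List[float]],
                         centers: List[Vector], m: float = 2.0) -> float:
    """Sum of squared distances between each datum and its reconstruction.

    u[i][k] is the membership of data[k] in cluster i.
    """
    err = 0.0
    for k, x in enumerate(data):
        x_hat = reconstruct([u[i][k] for i in range(len(centers))], centers, m)
        err += sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return err
```

One way to use it, under the same assumptions, is to cluster with several candidate values of λ and keep the value that yields the lowest reconstruction error.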

Assigning anomaly scores to clusters in different time windows
  Assign an anomaly score to each single subsequence based on historical data
  Aggregate the anomaly scores inside the revealed clusters
f_k is the anomaly score calculated for the subsequence corresponding to x_k in the time window
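The aggregation step might look like the sketch below. The transcript does not preserve the aggregation formula, so a membership-weighted mean of the per-subsequence scores is used here purely as an assumption; how each f_k is derived from historical data is also not specified on the slide and is treated as a given input. The function name is hypothetical.

```python
from typing import List


def cluster_anomaly_scores(u: List[List[float]], f: List[float]) -> List[float]:
    """Aggregate per-subsequence scores f[k] into one score per cluster.

    u[i][k] is the membership of subsequence k in cluster i.
    Assumed aggregation: sum_k u_ik * f_k / sum_k u_ik.
    """
    scores = []
    for row in u:
        total = sum(row)
        if total > 0:
            scores.append(sum(w * fk for w, fk in zip(row, f)) / total)
        else:
            scores.append(0.0)  # an empty cluster carries no evidence
    return scores
```

With this choice, a subsequence contributes to a cluster's score in proportion to how strongly it belongs to that cluster, so points on the boundary between two clusters influence both.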

Visualizing the propagation of anomalies: fuzzy relations
Objective: quantifying the relations between clusters
Each datum in time interval W_i is expressed in terms of a set of membership degrees in U_i
So, each spatial time series x_k is expressed in U_i as a set of membership degrees

Visualizing the propagation of anomalies…
Objective function to construct the relation: [equation not preserved in transcript]
Optimization: [equation not preserved in transcript]
To construct a fuzzy relation R, we try to estimate the elements of U1 through the elements of U2
  o is a sup-t composition (e.g., the max-min operator)
  c1: number of clusters in U1, which corresponds to W1
  c2: number of clusters in U2, which corresponds to W2
  R is a matrix of size c1 × c2 whose elements are in the range [0, 1]
  alpha: learning rate
  r_ij = 1 means that the jth cluster in U2 has a strong relation with the ith cluster in U1
  r_ij = 0 indicates no relation
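The relation-fitting idea can be sketched as follows. The slide names a max-min composition and a learning rate alpha but its objective and update equations are missing from the transcript, so the squared-error objective, the subgradient update, the initial value 0.5, and all function names below are assumptions made for illustration.

```python
from typing import List

Matrix = List[List[float]]


def max_min_compose(r: Matrix, u2: Matrix) -> Matrix:
    """Estimate U1 as R o U2 with the max-min operator:
    est[i][k] = max_j min(r[i][j], u2[j][k])."""
    c2, n = len(u2), len(u2[0])
    return [[max(min(row[j], u2[j][k]) for j in range(c2)) for k in range(n)]
            for row in r]


def relation_loss(r: Matrix, u1: Matrix, u2: Matrix) -> float:
    """Assumed objective: sum of squared differences between U1 and R o U2."""
    est = max_min_compose(r, u2)
    return sum((e - t) ** 2 for er, tr in zip(est, u1) for e, t in zip(er, tr))


def fit_relation(u1: Matrix, u2: Matrix, alpha: float = 0.05,
                 epochs: int = 500, init: float = 0.5) -> Matrix:
    """Fit R (c1 x c2, entries in [0, 1]) by subgradient descent."""
    c1, c2, n = len(u1), len(u2), len(u2[0])
    r = [[init] * c2 for _ in range(c1)]
    for _ in range(epochs):
        grad = [[0.0] * c2 for _ in range(c1)]
        for i in range(c1):
            for k in range(n):
                # Only the entry that attains the max-min can change the estimate.
                j_star = max(range(c2), key=lambda j: min(r[i][j], u2[j][k]))
                est = min(r[i][j_star], u2[j_star][k])
                # The estimate depends on r only when r is the smaller argument.
                if r[i][j_star] <= u2[j_star][k]:
                    grad[i][j_star] += 2.0 * (est - u1[i][k])
        for i in range(c1):
            for j in range(c2):
                r[i][j] = min(1.0, max(0.0, r[i][j] - alpha * grad[i][j]))
    return r
```

In this sketch `r[i][j]` links the ith cluster of U1 to the jth cluster of U2, matching the indexing described above; entries near 1 mark clusters that largely share their members across the two windows.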

Example
An outbreak in the southern part of Alberta
Generated using NAADSM over 100 days
  NAADSM: North American Animal Disease Spread Model
For each station we have its x-y coordinates and a time series of length 100 measuring the rate of infected herds over the 100 days

Example…
A sliding window is used
  Length: 20
  Movement: 10
[Figure: generated spatio-temporal subsequences]

Example… The order of the clusters is different in different time windows (why?)

Example… Calculated anomaly scores for each cluster

Example… Only strong fuzzy relations corresponding to anomalous clusters are considered in this figure

Example…
One may represent the structure of data in different time intervals using a graph-based representation
  Nodes are clusters
  Edges are relations between clusters
  The numbers reported above the nodes are anomaly scores
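A graph of this kind can be assembled from the relation matrices and the cluster scores. The sketch below is illustrative only: the data layout, the function name, and the threshold of 0.5 used to decide which relations count as strong are all assumptions, since the talk does not state how strong relations were selected.

```python
from typing import Dict, List, Tuple

Node = Tuple[int, int]  # (time window index, cluster index)


def relation_graph(relations: List[List[List[float]]],
                   scores: List[List[float]],
                   threshold: float = 0.5
                   ) -> Tuple[Dict[Node, float], List[Tuple[Node, Node, float]]]:
    """Build nodes and edges for consecutive time windows.

    relations[w][i][j]: relation between cluster i of window w and
                        cluster j of window w + 1.
    scores[w][i]:       anomaly score of cluster i in window w.
    Only relations at or above the threshold become edges.
    """
    nodes = {(w, i): s for w, row in enumerate(scores) for i, s in enumerate(row)}
    edges = [((w, i), (w + 1, j), r_ij)
             for w, rel in enumerate(relations)
             for i, row in enumerate(rel)
             for j, r_ij in enumerate(row)
             if r_ij >= threshold]
    return nodes, edges
```

Following the edges from a high-scoring node in an early window to high-scoring nodes in later windows gives one way to read off how an anomaly moves between clusters over time.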

Application
Implemented for Agriculture and Rural Development (Government of Alberta)
Using KNIME (Konstanz Information Miner)
Animal health surveillance in Alberta
  Anomaly detection
  Data visualization

Conclusions
  A framework for anomaly detection and characterization in spatial time series is developed
  A sliding window is used to generate a set of spatio-temporal subsequences
  Clustering is used to discover the available structure within the spatio-temporal subsequences
  An anomaly score is assigned to each revealed spatio-temporal cluster
  A fuzzy relation technique is proposed to quantify the relations between clusters in successive time steps
For more information please see
  1. Hesam Izakian and Witold Pedrycz, Anomaly Detection and Characterization in Spatial Time Series Data: A Cluster-Centric Approach, IEEE Transactions on Fuzzy Systems, DOI: 10.1109/TFUZZ.2014.2302456, 2014.
  2. Hesam Izakian and Witold Pedrycz, Agreement-Based Fuzzy C-Means for Clustering Data with Blocks of Features, Neurocomputing, vol. 127, pp. 266–280, 2014.
  3. Hesam Izakian, Witold Pedrycz, and Iqbal Jamal, Clustering Spatio-temporal Data: An Augmented Fuzzy C-Means, IEEE Transactions on Fuzzy Systems, vol. 21, no. 5, pp. 855–868, 2013.

Thank you