The Automatic Explanation of Multivariate Time Series (MTS)
Allan Tucker

The Problem - Data
Datasets which are Characteristically:
–High Dimensional MTS
–Large Time Lags
–Changing Dependencies
–Little or No Available Expert Knowledge

The Problem - Requirement
Lack of Algorithms to Assist Users in Explaining Events, i.e. Algorithms that:
–Model Complex MTS Data
–Are Learnable from Data with Little or No User Intervention
–Retain Transparency Throughout the Learning and Explaining Process

Contribution to Knowledge
–Using a Combination of Evolutionary Programming (EP) and Bayesian Networks (BNs) to Overcome the Issues Outlined
–Extending Learning Algorithms for BNs to Dynamic Bayesian Networks (DBNs), with a Comparison of Efficiency
–Introduction of an Algorithm for Decomposing High Dimensional MTS into Several Lower Dimensional MTS

Contribution to Knowledge (Continued)
–Introduction of a New EP-Seeded GA Algorithm
–Incorporating Changing Dependencies
–Application to Synthetic and Real-World Chemical Process Data
–Transparency Retained Throughout Each Stage

Framework [diagram]: components include Real Data, Synthetic Data, Pre-processing, Data Preparation, Search Methods, Variable Groupings, Model Building, Explanation, Evaluation and Changing Dependencies.

Key Technical Points 1 - Comparing Adapted Algorithms
–New Representation
–K2/K3 [Cooper and Herskovits]
–Genetic Algorithm [Larrañaga]
–Evolutionary Algorithm [Wong]
–Branch and Bound [Bouckaert]
–Scoring by Log Likelihood / Description Length (see the sketch below)
Publications:
–International Journal of Intelligent Systems, 2001
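For reference, the two scores take the following standard forms for a candidate structure G with fitted parameters; this is a sketch of the usual definitions, and the exact variants used in the thesis may differ:

```latex
% Log likelihood of data D under structure G with fitted parameters \hat{\theta}:
\mathrm{LL}(G) = \log P(D \mid G, \hat{\theta})

% MDL-style description length: parameter cost plus data cost, assuming
% N data points and |\theta| free parameters in the network.
\mathrm{DL}(G) = \frac{\log_2 N}{2}\,\lvert\theta\rvert - \mathrm{LL}(G)
```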

Key Technical Points 2 - Grouping
–A Number of Correlation Searches
–A Number of Grouping Algorithms
–Designed Metrics
–Comparison of All Combinations on Synthetic and Real Data
Publications:
–IDA99
–IEEE Trans. Systems, Man and Cybernetics, 2001
–Expert Systems, 2000

Key Technical Points 3 - EP-Seeded GA
–Approximate Correlation Search Based on the One Used in the Grouping Strategy
–Results Used to Seed the Initial Population of a GA
–Uniform Crossover
–Specific Lag Mutation
Publications:
–Genetic and Evolutionary Computation Conference 1999 (GECCO99)
–International Journal of Intelligent Systems, 2001
–IDA2001

Key Technical Points 4 - Changing Dependencies
–Dynamic Cross-Correlation Function for Analysing MTS
–Extended Representation
–Introduction of a Heuristic Search: the Hidden Controller Hill Climb (HCHC)
 –Hidden Variables to Model the State of the System
 –Search for Structure and Hidden States Iteratively

Future Work
–Parameter Estimation
–Discretisation
–Changing Dependencies
–Efficiency
–New Datasets:
 –Gene Expression Data
 –Visual Field Data

DBN Representation [diagram]: variables a_0(t) to a_4(t) at time t, with lagged parent nodes a_2(t-2), a_3(t-2), a_4(t-3) and a_3(t-4) across time slices t-4 to t; the network is encoded as a list of triples: (3,1,4), (4,2,3), (2,3,2), (3,0,2), (3,4,2) (see the sketch below).
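As a minimal sketch (not the author's code, and assuming the triples are ordered (parent variable, child variable, lag)), the list representation above can be manipulated directly:

```python
# Each edge of the DBN is a triple (a, b, lag): variable a at time t - lag
# is a parent of variable b at time t.
dbn = [(3, 1, 4), (4, 2, 3), (2, 3, 2), (3, 0, 2), (3, 4, 2)]

def parents_of(dbn, child):
    """Return the lagged parents of `child` as (variable, lag) pairs."""
    return [(a, lag) for (a, b, lag) in dbn if b == child]

for child in range(5):
    print(child, parents_of(dbn, child))  # e.g. 1 -> [(3, 4)]
```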

Sample DBN Search Results [plots]: N = 5, MaxT = 10 and N = 10, MaxT = 60.

Grouping [diagram]: one High Dimensional MTS (A) passes through 1. a Correlation Search (EP), producing a list of (a, b, lag) triples, and 2. a Grouping Algorithm (GGA), yielding Several Lower Dimensional MTS, e.g. the groups {0,3}, {1,4,5} and {2}.
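The actual decomposition uses the EP correlation search followed by a grouping genetic algorithm (GGA); the sketch below illustrates only the underlying idea, approximating the grouping step by treating each discovered correlation as an edge and taking connected components (networkx is an assumed convenience, not part of the original work):

```python
import networkx as nx

# Made-up (a, b, lag) correlation triples standing in for the EP search output.
correlations = [(0, 3, 2), (1, 4, 1), (4, 5, 3)]

g = nx.Graph()
g.add_nodes_from(range(6))                           # variables 0..5
g.add_edges_from((a, b) for a, b, _ in correlations)

groups = [sorted(c) for c in nx.connected_components(g)]
print(groups)  # e.g. [[0, 3], [1, 4, 5], [2]]
```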

Sample Grouping Results [figures]: Original Synthetic MTS Groupings; Groupings Discovered from Synthetic Data; Sample of Variables from a Discovered Oil Refinery Data Group.

Parameter Estimation
–Simulate a Random Bag (Varying R, s, c and e)
–Calculate the Mean and SD for Each Distribution (the Probability of Selecting e from s)
–Test for Normality (Lilliefors' Test)
–Symbolic Regression (GP) to Determine the Function for the Mean and SD from R, s and c (e will be Unknown)
–Place Confidence Limits on P(Number of Correlations Found ≥ e)
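A hedged sketch of the first two steps, under one reading of the slide's symbols (draw c of the s possible correlations, e of which are genuine, and record how many genuine ones are found); the lilliefors test comes from statsmodels:

```python
import numpy as np
from statsmodels.stats.diagnostic import lilliefors

rng = np.random.default_rng(0)
s, c, e, trials = 1000, 200, 50, 2000   # illustrative sizes only

# Treat indices 0..e-1 as the genuine correlations hidden in the bag.
found = np.array([np.sum(rng.choice(s, size=c, replace=False) < e)
                  for _ in range(trials)])

print("mean:", found.mean(), "sd:", found.std(ddof=1))
stat, pval = lilliefors(found, dist="norm")   # Lilliefors' normality test
print("Lilliefors p-value:", pval)
```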

EP-Seeded GA [diagram]: the EP stage produces a Final EPList of triples, indexed 0 to EPListSize, each entry an (a,b,l); these seed the Initial GAPopulation of the GA, whose individuals 0 to GAPopsize are lists ((a,b,l), (a,b,l), ..., (a,b,l)); the GA outputs a DBN.
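A sketch, under assumptions, of how the pieces fit together: individuals are fixed-length lists of (a, b, l) triples sampled from the EP list, recombined with uniform crossover and mutated by resampling lags (the "specific lag mutation"); no parameter value here comes from the slides except GAPopSize = 10:

```python
import random

def seed_population(ep_list, pop_size, genome_len):
    """Seed each GA individual with triples drawn from the final EP list."""
    return [random.sample(ep_list, genome_len) for _ in range(pop_size)]

def uniform_crossover(p1, p2):
    """Take each gene from either parent with equal probability."""
    return [a if random.random() < 0.5 else b for a, b in zip(p1, p2)]

def lag_mutation(ind, max_lag, rate=0.1):
    """'Specific lag mutation': occasionally resample an edge's time lag."""
    return [(a, b, random.randint(1, max_lag)) if random.random() < rate
            else (a, b, lag) for (a, b, lag) in ind]

# Illustrative run with made-up sizes (N = 10 variables, MaxT = 60).
ep_list = [(random.randrange(10), random.randrange(10), random.randint(1, 60))
           for _ in range(40)]
pop = seed_population(ep_list, pop_size=10, genome_len=15)
child = lag_mutation(uniform_crossover(pop[0], pop[1]), max_lag=60)
```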

EP-Seeded GA Results [plots]: N = 10, MaxT = 60 and N = 20, MaxT = 60.

Varying the Value of c [plot].

Time Explanation [diagram]: posterior probabilities attached to time slices t, t-1, t-11, t-13, t-16, t-20 and t-60, e.g. P(TGF in state_0) = 1.0, P(TT in state_0) = 1.0, P(BPF in state_3) = 1.0 and P(TGF in state_3) = 1.0, with further probabilities for TT, SOT, C2%, T6T and RinT.

Changing Dependencies [plot]: Variable Magnitude against Time (Minutes) for the oil refinery variables A/M_GB and TGF.

Dynamic Cross-Correlation Function [plot].
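The slide shows only a plot; one plausible form of a dynamic cross-correlation function (an assumption, not taken from the thesis) slides a window along two series and computes the correlation at each lag, so that dependencies can be tracked as they change:

```python
import numpy as np

def dynamic_ccf(x, y, win, jump, max_lag):
    """Windowed cross-correlation: rows are windows, columns are lags."""
    rows = []
    for start in range(0, len(x) - win - max_lag + 1, jump):
        rows.append([np.corrcoef(x[start:start + win],
                                 y[start + lag:start + lag + win])[0, 1]
                     for lag in range(max_lag + 1)])
    return np.array(rows)
```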

Hidden Variable - OpState [diagram]: across time slices t-4 to t, the node a_2(t) has the hidden parent OpState_2 together with the lagged parents a_2(t-1), a_3(t-2) and a_0(t-4).

Hidden Controller Hill Climb [diagram]: iterate between updating the Segment_Lists through OpState Parameter Estimation and updating the DBN_List through DBN Structure Search, scoring the model at each step.
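Schematically, and with every callable supplied by the caller rather than taken from the thesis, the HCHC loop described above might look like this sketch of the two alternating updates:

```python
def hchc(mts, iterations, initial_segmentation, initial_dbn, score,
         neighbour_segmentations, neighbour_structures):
    """Alternate between re-segmenting OpState and hill-climbing the DBN.

    `score` plays the role of the log likelihood / description length
    fitness from the earlier slides; the neighbour generators propose
    candidate segmentations and structures.
    """
    segments = initial_segmentation(mts)
    dbn = initial_dbn(mts)
    best = score(dbn, segments, mts)
    for _ in range(iterations):
        # 1. Update Segment_Lists through OpState parameter estimation.
        for seg in neighbour_segmentations(segments):
            if score(dbn, seg, mts) > best:
                segments, best = seg, score(dbn, seg, mts)
        # 2. Update DBN_List through DBN structure search.
        for cand in neighbour_structures(dbn):
            if score(cand, segments, mts) > best:
                dbn, best = cand, score(cand, segments, mts)
    return dbn, segments
```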

HCHC Results - Oil Refinery Data [plots].

HCHC Results - Synthetic Data
1. Generate Data from Several DBNs
2. Append Each Section of Data Together to Form One MTS with Changing Dependencies
3. Run HCHC (see the sketch after this list)
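The data-generation step reduces to concatenating sections sampled from different DBNs; a minimal sketch, with the DBN sampler left to the caller as a placeholder:

```python
import numpy as np

def make_changing_mts(dbns, section_len, sample_from_dbn):
    """Concatenate one section per DBN so dependencies change at the joins."""
    sections = [sample_from_dbn(dbn, section_len) for dbn in dbns]
    return np.concatenate(sections, axis=0)
```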

Time Explanation [diagram]: posterior probabilities over time slices t, t-1, t-3, t-5, t-6 and t-9, e.g. P(OpState_1 is 0) = 1.0, P(a_1 is 0) = 1.0, P(a_0 is 0) = 1.0, P(a_2 is 1) = 1.0, P(a_0 is 1) = 0.778 and P(OpState_0 is 0) = 0.720, with further probabilities for a_0, a_2 and OpState_0.

Time Explanation [diagram]: posterior probabilities over time slices t, t-1, t-3, t-5, t-6, t-7 and t-9, e.g. P(OpState_1 is 4) = 1.0, P(a_1 is 0) = 1.0, P(a_0 is 0) = 1.0, P(a_2 is 1) = 1.0 and P(OpState_2 is 4) = 0.222, with further probabilities for a_0, a_2 and OpState_2.

Process Diagram [figure]: oil refinery variables TT, T6T, T36T, RBT, SOT, T11, SOF, T13, TGF, BPF, %C3, %C2, RINT, FF, PGM, PGB, AFT and C11/3T.

Typical Discovered Relationships [figure]: discovered dependencies among the same oil refinery variables (TT, T6T, T36T, RBT, SOT, T11, SOF, T13, AFT, TGF, BPF, %C3, %C2, RINT, FF, C11/3T, PGM, PGB).

Parameters

DBN Search             GA              EP
  PopSize              100             10
  MR
  CR
  Gen                  Based on FC     Based on FC

Correlation Search
  c                    Approx. 20% of s
  R                    Approx. 2.5% of s

Grouping GA            Synth. 1        Synth. 2-6        Oil
  PopSize
  CR
  MR
  Gen                                  (1000 for GPV)    150

Parameters (Continued)

EP-Seeded GA
  c                    Approx. 20% of s
  EPListSize           Approx. 2.5% of s
  GAPopSize            10
  MR                   0.1
  CR                   0.8
  LMR                  0.1
  Gen                  Based on FC

HCHC                   Oil             Synthetic
  DBN_Iterations       1×
  Win len              500             50
  Win jump