Time Series Prediction as a Problem of Missing Values: Application to ESTSP2007 and NN3 Competition Benchmarks

Presentation transcript:

Time Series Prediction as a Problem of Missing Values: Application to ESTSP2007 and NN3 Competition Benchmarks
Antti Sorjamaa and Amaury Lendasse
Time Series Prediction and ChemoInformatics Group, Adaptive Informatics Research Centre, Helsinki University of Technology

Antti Sorjamaa - TSPCi - AIRC - HUT 2/22 Outline
Time Series Prediction vs. Missing Values
Global methodology
– Self-Organizing Maps (SOM)
– Empirical Orthogonal Functions (EOF)
Results

3/22 Missing Values
[Figure: a data table laid out along a time axis, with known numeric entries and missing entries marked "?"]

4/22 Time Series Prediction vs. Missing Values
Methods designed for finding missing values in temporally related databases
A time series is such a database
The unknown future can be considered as a set of missing values
→ The same methods can be applied
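One way to make the "unknown future as missing values" idea concrete is to stack sliding windows of the series into a matrix and append the future as NaN entries. The sketch below is illustrative only: the function name `series_to_incomplete_matrix`, the parameters `regressor_size` and `horizon`, and the sliding-window layout are assumptions, not a construction stated on the slides.

```python
import numpy as np

def series_to_incomplete_matrix(series, regressor_size, horizon):
    """Stack sliding windows of the series, with `horizon` unknown future values as NaN.

    Hypothetical helper: the window layout is an assumption made for illustration.
    """
    series = np.asarray(series, dtype=float)
    # Append the unknown future as missing values.
    extended = np.concatenate([series, np.full(horizon, np.nan)])
    n_rows = len(extended) - regressor_size + 1
    # Row i holds extended[i : i + regressor_size].
    idx = np.arange(n_rows)[:, None] + np.arange(regressor_size)[None, :]
    return extended[idx]

# Six known values, two values to predict, windows of length three.
X = series_to_incomplete_matrix([1, 2, 3, 4, 5, 6], regressor_size=3, horizon=2)
print(X)
```

Any method that fills the NaN entries of `X` then yields a forecast, because the missing cells are exactly the future values.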

5/22 Global Methodology
Based on two methods:
– SOM: nonlinear projection / interpolation; topology preservation on a low-dimensional grid
– EOF: linear projection; projection to high-dimensional output space; needs initialization

6/22 SOM

7/22 SOM Interpolation
SOM learning is done with the known data
Missing values are left out
Approach proposed by Cottrell and Letrémy (in Applied Stochastic Models and Data Analysis, 2005)
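The idea of learning with known data only and leaving missing values out can be sketched as follows. This is a simplified online SOM written for illustration, not the implementation of Cottrell and Letrémy or of the authors: the function name `som_impute`, the grid size, the number of epochs, and the learning-rate and neighbourhood schedules are all assumptions.

```python
import numpy as np

def som_impute(X, grid=(3, 3), n_epochs=50, seed=0):
    """Fill NaNs in X from the best-matching SOM unit of each row (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    known = ~np.isnan(X)
    n, d = X.shape
    # Grid coordinates of the units, used by the neighbourhood function.
    coords = np.array([(i, j) for i in range(grid[0]) for j in range(grid[1])], dtype=float)
    # Assumed initialisation: small noise around the column means of the known data.
    col_mean = np.nanmean(X, axis=0)
    col_std = np.nanstd(X, axis=0) + 1e-12
    W = col_mean + 0.1 * col_std * rng.standard_normal((len(coords), d))

    def bmu(x, m):
        # The distance uses only the known components of x.
        return int(np.argmin(((W[:, m] - x[m]) ** 2).sum(axis=1)))

    for epoch in range(n_epochs):
        frac = epoch / n_epochs
        lr = 0.5 * (1.0 - frac) + 0.01                    # assumed decaying learning rate
        sigma = max(grid) / 2.0 * (1.0 - frac) + 0.5      # assumed shrinking neighbourhood
        for i in rng.permutation(n):
            m = known[i]
            if not m.any():
                continue  # a row with no known value carries no information
            b = bmu(X[i], m)
            h = np.exp(-((coords - coords[b]) ** 2).sum(axis=1) / (2.0 * sigma ** 2))
            # Update only the known components; missing ones are left out.
            W[:, m] += lr * h[:, None] * (X[i, m] - W[:, m])

    filled = X.copy()
    for i in range(n):
        m = known[i]
        if m.all() or not m.any():
            continue  # nothing to fill, or nothing to match on (row stays NaN)
        # Missing components are read from the best-matching prototype.
        filled[i, ~m] = W[bmu(X[i], m), ~m]
    return filled
```

Because every filled value is copied from one of a finite set of prototypes, the result is a discrete approximation of the data.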

8/22 EOF Projection
Based on Singular Value Decomposition (SVD)
Only q singular values and vectors are used
– q is smaller than K (the rank of X)
– Larger singular values contain more signal than smaller ones
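In formula form, the truncation to q components can be written as below. X, K and q are the slide's symbols; the factor names U, D, V and the singular values \rho_k with their vectors u_k, v_k are notation assumed here for illustration.

```latex
X = U D V^{\top} = \sum_{k=1}^{K} \rho_k \, u_k v_k^{\top},
\qquad
\rho_1 \ge \rho_2 \ge \dots \ge \rho_K > 0,
\qquad
\hat{X}_q = \sum_{k=1}^{q} \rho_k \, u_k v_k^{\top}, \quad q < K .
```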

9/22 EOF Projection (2)
SVD cannot deal with missing values
– Initialization is crucial!
Decomposition with SVD and reconstruction
– The q largest singular values and vectors are used in the reconstruction
– Original data is not modified!
– q is selected using validation

10/22 EOF Projection (3)
[Figure: the data table with missing entries marked "?", filled in step by step]
1. Initialization
2. Round 1
3. Round 2
4. Round 3
...
n. Done!
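The initialization-then-rounds procedure can be sketched as an iterative SVD fill. This is a minimal sketch under stated assumptions, not the authors' code: the function name `eof_fill`, the default column-mean initialization, and the stopping rule (`tol`, `max_iter`) are choices made here. In the combined methodology the initial values would come from the SOM step, which can be passed in through the hypothetical `init` argument.

```python
import numpy as np

def eof_fill(X, q, init=None, max_iter=500, tol=1e-8):
    """Iteratively fill NaNs in X using a rank-q SVD reconstruction (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    filled = X.copy()
    if init is None:
        # Assumed default initialization: column means of the known values.
        init = np.broadcast_to(np.nanmean(X, axis=0), X.shape)
    filled[missing] = np.asarray(init, dtype=float)[missing]

    for _ in range(max_iter):
        # One round: decompose, then reconstruct with the q largest components.
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        recon = (U[:, :q] * s[:q]) @ Vt[:q]
        change = np.max(np.abs(recon[missing] - filled[missing])) if missing.any() else 0.0
        # Only the missing entries are overwritten; the original data is not modified.
        filled[missing] = recon[missing]
        if change < tol:
            break
    return filled

# Usage: a rank-1 table with two holes.
full = np.outer(np.arange(1.0, 7.0), np.arange(1.0, 5.0))
holes = full.copy()
holes[1, 2] = np.nan
holes[4, 0] = np.nan
restored = eof_fill(holes, q=1)
```

The stopping rule here is only one possible choice.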

11/22 Global Methodology (2)
[Flowchart with the boxes: Missing Data, SOM, EOF, Data with filled values; and the labels: SOM grid size, Number of EOF, EOF iteration]

12/22 ESTSP2007 Competition Data
[Figure: the competition data, with Learning and Validation parts labeled]

13/22 Results, Regressor size 11
[Figure legend: EOF, SOM, SOM+EOF]

14/22 Results (2)
[Figure legend: EOF, SOM, SOM+EOF]

15/22 Prediction

16/22 NN3 Competition
Prediction of 111 time series
A single, automatic methodology for predicting all the series
Prediction of 18 values into the future for each series
All series are rather short, which makes the prediction tricky
Mean SMAPE over all series is evaluated in the competition
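For reference, a common definition of SMAPE (symmetric mean absolute percentage error) and its mean over several series can be sketched as follows. The slides do not give the formula, so the exact variant used by the competition is an assumption here, as are the function names `smape` and `mean_smape` and the convention for terms where both values are zero.

```python
import numpy as np

def smape(actual, forecast):
    """SMAPE in percent, using the mean of |actual| and |forecast| as denominator."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    denom = (np.abs(actual) + np.abs(forecast)) / 2.0
    # Assumed convention: a term where both values are zero counts as 0.
    ratio = np.divide(np.abs(actual - forecast), denom,
                      out=np.zeros_like(denom), where=denom > 0)
    return 100.0 * float(ratio.mean())

def mean_smape(actual_series, forecast_series):
    """Average the per-series SMAPE over a collection of series."""
    return float(np.mean([smape(a, f) for a, f in zip(actual_series, forecast_series)]))
```

With 111 series and 18 predicted values each, `mean_smape` would take two lists of 111 arrays of length 18.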

17/22 NN3: Long Series
Validation MSE = 0.1559
Validation MSE = 0.0076

18/22 NN3: Short Series
Validation MSE = 0.3493

19/22 NN3: Validation Errors

20/22 Summary
Time series prediction can be viewed as a problem of missing values
The SOM+EOF methodology works well, better than either method alone
– SOM projection is discrete
– EOF needs a sufficiently good initialization
→ The methods complement each other

21/22 Further Work
Improvements to the methodology
– The selection of singular values and vectors
– Convergence criterion
– How to guarantee quick convergence?
Applying the methodology to data sets from other fields
– Climatology, finance, process data

22/22 Questions?