Forecasting Wavelet Transformed Time Series with Attentive Neural Networks
Yi Zhao¹, Yanyan Shen*¹, Yanmin Zhu¹, Junjie Yao²
¹Shanghai Jiao Tong University  ²East China Normal University
ICDM 2018
Outline
- Motivation
- Preliminaries
- Model
- Experiments
- Conclusion
Motivation
- Forecasting complex time series (e.g., stock prices, web traffic) demands both time-domain and frequency-domain information.
- Various methods extract the local time-frequency features that are important for predicting future values:
  - Fourier Transform
  - Short-Time Fourier Transform
  - Wavelet Transform
- Key idea: use the varying global trend to identify the most salient parts of the local time-frequency information, and thereby better predict future values.
Preliminaries
- Problem Statement: Given a time series $X = \{x_t \in \mathbb{R} \mid t = 1, 2, \dots, T\}$, predict $x_{T+n}$ ($n \in \mathbb{N}^+$), the future value at time $T+n$, via a function $f$: $x_{T+n} = f(X)$.
- Wavelets: Given a basic wavelet function $h(\cdot)$, we obtain the family of wavelets $h_{a,\tau}(t) = \frac{1}{\sqrt{a}}\, h\!\left(\frac{t-\tau}{a}\right)$, where $a$ is the scale and $\tau$ the translation.
- Continuous Wavelet Transform (CWT): the "similarity" between the signal $x(t)$ and the basis function $h_{a,\tau}(\cdot)$: $CWT_x(\tau, a) = \frac{1}{\sqrt{a}} \int x(t)\, h\!\left(\frac{t-\tau}{a}\right) dt$
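To make the dilation/translation formula concrete, here is a minimal sketch in Python that builds $h_{a,\tau}$ from a Mexican-hat mother wavelet and approximates a single CWT coefficient by numerical integration. The choice of mother wavelet (unnormalized Mexican hat) and all function names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def ricker(t):
    # An (unnormalized) Mexican-hat mother wavelet h(t); any basic wavelet works here.
    return (1 - t**2) * np.exp(-t**2 / 2)

def wavelet(t, a, tau):
    # Dilated/translated wavelet: h_{a,tau}(t) = (1/sqrt(a)) * h((t - tau) / a).
    return ricker((t - tau) / a) / np.sqrt(a)

def cwt_point(x, t, a, tau):
    # One CWT coefficient CWT_x(tau, a): the inner product of the signal with
    # h_{a,tau}, approximating the integral on the sampling grid t.
    return np.trapz(x * wavelet(t, a, tau), t)
```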
Model Overview
1. Input time series
2. Scalogram
3. CNN feature extraction
4. Attention module
5. Fusion & prediction
Preprocessing: Given the input time series $X$, we denote by $CWT_x(\tau, a)$ the matrix of wavelet transform coefficients. The scalogram $X_s$ is defined as $X_s = \|CWT_x(\tau, a)\|^2$.
Source: Wavelet Tutorial by Robi Polikar, http://users.rowan.edu/~polikar/WTpart3.html
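A minimal sketch of this preprocessing step using PyWavelets; the wavelet choice ('morl') and the scale range are illustrative assumptions, not the paper's settings.

```python
import numpy as np
import pywt

x = np.random.randn(128)                   # stand-in for the input series X
scales = np.arange(1, 33)                  # scales a to evaluate
coef, freqs = pywt.cwt(x, scales, 'morl')  # CWT_x(tau, a), shape (32, 128)
scalogram = np.abs(coef) ** 2              # X_s = |CWT_x(tau, a)|^2
```

The resulting scalogram is a 2-D scale-by-time image, which is what makes the CNN feature extraction in the next step applicable.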
Model
[Architecture diagram: the scalogram is fed through a VGG-style CNN that outputs $C$ columns of local time-frequency features; the raw series $x_1, \dots, x_T$ is fed through an LSTM whose hidden states $h_1, \dots, h_T$ capture the global trend; an AttentionNet scores each feature column against $h_T$ to produce weights $\alpha_1, \dots, \alpha_C$; the attended features and $h_T$ are fused to predict $x_{T+n}$.]
Model
- CNN: extract local time-frequency features. Feed the scalogram $X_s$ to a stack of convolution layers: $X_s^{(l)} = \phi(W_s^{(l)} * X_s^{(l-1)} + b_s^{(l)})$, with $X_s^{(0)} = X_s$.
- LSTM: learn the global long-term trend and obtain the hidden state $h_T$ at the last step.
- Attention module: discriminate the importance of local features dynamically. Given time-frequency features $X_s^{(L)} = \{x_i \mid i \in [1, C]\}$ and $h_T$:
  - Attention score: $e_i = f_{att}(x_i, h_T) = w^{T} \phi(W_a [x_i; h_T] + b_a) + b$;  $\alpha_i = \frac{\exp(e_i)}{\sum_{k=1}^{C} \exp(e_k)}$
  - Weighted sum of local time-frequency features: $z = \sum_{i=1}^{C} \alpha_i x_i$
- Fusion & Prediction: combine local and global features for prediction: $\hat{x}_{T+n} = w_p^{T} f_p([z; h_T]) + b_p$
- Objective function (squared loss): $L = \sum_{i=1}^{N} \big(x_{t+n}^{(i)} - \hat{x}_{t+n}^{(i)}\big)^2 + \lambda \|W\|^2$
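A minimal PyTorch sketch of the attention scoring and fusion equations above. The layer sizes, the use of tanh for $\phi$ and $f_p$, and all module names are assumptions for illustration, not the authors' released code.

```python
import torch
import torch.nn as nn

class AttentiveFusion(nn.Module):
    # Sketch of the attention module plus fusion & prediction step.
    def __init__(self, feat_dim, hidden_dim, att_dim):
        super().__init__()
        self.W_a = nn.Linear(feat_dim + hidden_dim, att_dim)  # W_a, b_a
        self.w = nn.Linear(att_dim, 1)                        # w, b
        self.f_p = nn.Linear(feat_dim + hidden_dim, att_dim)  # fusion layer f_p
        self.w_p = nn.Linear(att_dim, 1)                      # w_p, b_p

    def forward(self, feats, h_T):
        # feats: (batch, C, feat_dim) local time-frequency features from the CNN
        # h_T:   (batch, hidden_dim) last hidden state of the LSTM
        h = h_T.unsqueeze(1).expand(-1, feats.size(1), -1)
        e = self.w(torch.tanh(self.W_a(torch.cat([feats, h], dim=-1))))  # scores e_i
        alpha = torch.softmax(e, dim=1)                                  # weights alpha_i
        z = (alpha * feats).sum(dim=1)                                   # z = sum_i alpha_i x_i
        fused = torch.tanh(self.f_p(torch.cat([z, h_T], dim=-1)))
        return self.w_p(fused).squeeze(-1)                               # predicted x_{T+n}
```

The $\lambda \|W\|^2$ term of the objective would typically be realized as the optimizer's weight decay rather than an explicit term in the loss.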
Datasets
- Stock opening prices: collected from Yahoo! Finance; daily opening prices of 50 stocks across 10 sectors from 2007 to 2016 (2,518 daily opening prices per stock). Prices from 2007 to 2014 are used for training; those in 2015 and 2016 are used for validation and testing, respectively.
- Power consumption: electric power consumption of one household over 4 years, sampled at a one-minute rate; 475,023 data points in year 2010.
Main Results
- Metric: Mean Squared Error, $MSE = \frac{1}{N} \sum_{i=1}^{N} \big(x_{t+n}^{(i)} - \hat{x}_{t+n}^{(i)}\big)^2$
- Baselines:
  - Naïve: take the last value in the series as the predicted value.
  - Ensemble of LSTM & CNN: feed the concatenation of the VGGNet features and the last LSTM hidden state directly into the fusion and prediction module (no attention).
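For completeness, the evaluation metric as a short Python sketch; the function name is ours, not from the paper.

```python
import numpy as np

def mse(y_true, y_pred):
    # Mean squared error over N test predictions, matching the formula above.
    return np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)
```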
Case Study
- Illustration of the attention mechanism: given an input of 20 stock prices, we show the scalogram and the attention weights.
- The model attends to the local features that are similar to the global trend, which helps in predicting the future value.
Conclusion
- Wavelet transform explicitly discloses the latent components at different frequencies in a complex time series.
- We develop a novel attention-based neural network that leverages a CNN to extract local time-frequency features and, simultaneously, applies an LSTM to capture the long-term global trend.
- Experimental results on two real-life datasets verify the usefulness of the time-frequency information in wavelet transformed time series and the effectiveness of our method in terms of prediction accuracy.
Thank you! Q&A