CS910: Foundations of Data Analytics Graham Cormode Time Series Analysis

Outline
• Introduction to time series and their applications
• Comparing time series: dynamic time warping
• Predicting time series: autoregressive moving averages

Time Series
• A sequence of (numeric) values, read at (regular) intervals
  – T = (t_1, t_2, ..., t_n) with each t_i a real number
  – E.g. temperature read every 1 minute
  – Usually single values, but could be multiple values at each step
• Drawn from many possible domains
  – Finance: stock values, other commodities
  – Sensors: temperature, pressure, rainfall, light...
  – (Computer, real-world) networks: traffic amount, velocity...
• Want to model, compare and predict time-series data
[Figure: example time series plotting voltage against time]

Time Series Similarity
• How to measure similarity between two time series?
• Euclidean/Manhattan distance computed point-by-point may do poorly
• Can we map points from one series to another to get a better match?
[Figure: two series where point-by-point comparison misaligns features, e.g. matching point i in one series to point i+2 in the other]

Time series alignment
• Find a mapping between two series t (length n) and u (length m)
• The mapping is a series of pairs p_1 ... p_L : (n_1, m_1) ... (n_L, m_L) such that:
  1. Endpoints match: the first points are mapped together and the last points are mapped together, i.e. p_1 = (1,1) and p_L = (n,m)
  2. Monotonicity: no “crossovers” in the mapping, so n_1 ≤ n_2 ≤ ... ≤ n_L and m_1 ≤ m_2 ≤ ... ≤ m_L
  3. No big steps: p_{l+1} − p_l is in {(1,0), (0,1), (1,1)}

Mapping cost
• The cost of a mapping given by path p is c_p(t, u) = Σ_{j=1}^{L} c(t_{n_j}, u_{m_j})
  – c(·,·) is a local cost function, e.g. absolute difference of values
• Define the distance between t and u as min{ c_p(t, u) | p is a valid path }
  – Call this the “warping distance”, written D(t, u)
• Properties of the warping distance:
  – Several different paths may achieve the same minimum cost
  – The distance is symmetric (provided the function c is symmetric): D(t, u) = D(u, t)
  – The distance may be zero even if t ≠ u
• Example: t = (0, 1, 2), u = (0, 1, 1, 2), v = (0, 2, 2), with c = Hamming cost
  – D(t, u) = 0 (both of u’s 1s can map to t’s single 1), D(t, v) = 1, D(u, v) = 2
  – D(u, v) > D(t, u) + D(t, v): the warping distance does not satisfy the triangle inequality

Permissible paths
• Paths can be drawn as steps through a grid
  – Conditions on permissible paths limit the possible routes
  – Can find optimal paths via dynamic programming (DP)
[Figure: three example grid paths, violating condition 1 (start/end), condition 2 (monotonicity) and condition 3 (big steps) respectively]

Dynamic Time Warping
• Fill in an n × m matrix of distances, starting from (1,1)
  – Initial condition: D(1,1) = c(1,1)
  – DP equation: D(i, j) = c(i, j) + min{ D(i, j−1), D(i−1, j−1), D(i−1, j) }
  – Fairly efficient: O(nm) time
  – Can recover the optimal path by tracing back through the matrix
[Figure: the DP matrix with one series per axis, filled from D(1,1) to D(n, m)]
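A minimal Python sketch of this DP (an illustration, not the course’s reference code): it fills the matrix exactly as in the recurrence above, with absolute difference as the default local cost, and re-checks the Hamming example from the mapping-cost slide.

```python
import numpy as np

def dtw(t, u, c=lambda a, b: abs(a - b)):
    """Warping distance D(t, u) via dynamic programming.

    D[i, j] = c(t_i, u_j) + min(D[i, j-1], D[i-1, j-1], D[i-1, j]),
    with initial condition D[0, 0] = c(t_0, u_0).  O(nm) time.
    """
    n, m = len(t), len(u)
    D = np.full((n, m), np.inf)
    for i in range(n):
        for j in range(m):
            cost = c(t[i], u[j])
            if i == 0 and j == 0:
                D[i, j] = cost
            else:
                D[i, j] = cost + min(
                    D[i, j - 1] if j > 0 else np.inf,
                    D[i - 1, j - 1] if i > 0 and j > 0 else np.inf,
                    D[i - 1, j] if i > 0 else np.inf)
    return D[n - 1, m - 1]

# Hamming-cost example from the mapping-cost slide: expect 0, 1, 2
ham = lambda a, b: 0 if a == b else 1
t, u, v = [0, 1, 2], [0, 1, 1, 2], [0, 2, 2]
print(dtw(t, u, ham), dtw(t, v, ham), dtw(u, v, ham))   # 0.0 1.0 2.0
```

The warping path itself can be recovered by tracing back from D[n−1, m−1], at each cell stepping to whichever of the three predecessors achieved the minimum.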

Dynamic Time Warping Variations
• Limit how large a “shift” is allowed between the sequences
  – Require |i − j| < r for some bound r
  – Equivalent to only looking at a diagonal band in the matrix
  – Improves the time cost to O(r(n + m)); see the sketch below
• Use different step-size restrictions instead of {(1,0), (0,1), (1,1)}
  – Avoids long “horizontal/vertical” subpaths
• Speed up the computation by approximation
  – Coarsen the sequences: average k values together
• Related to (Levenshtein) edit distance between text strings
  – Similar DP, but each symbol matches at most one symbol in the other sequence
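A sketch of the banded variation (the function name and example are illustrative): restricting the DP of the previous slide to cells with |i − j| < r means only O(r(n + m)) cells are ever filled.

```python
import numpy as np

def dtw_banded(t, u, r, c=lambda a, b: abs(a - b)):
    """DTW restricted to the diagonal band |i - j| < r.

    Cells outside the band stay at infinity, so the warping path is
    forced to remain near the diagonal.  Note the end cell (n-1, m-1)
    is only reachable when |n - m| < r.
    """
    n, m = len(t), len(u)
    D = np.full((n, m), np.inf)
    for i in range(n):
        for j in range(max(0, i - r + 1), min(m, i + r)):  # enforce |i - j| < r
            cost = c(t[i], u[j])
            if i == 0 and j == 0:
                D[i, j] = cost
            else:
                D[i, j] = cost + min(
                    D[i, j - 1] if j > 0 else np.inf,
                    D[i - 1, j - 1] if i > 0 and j > 0 else np.inf,
                    D[i - 1, j] if i > 0 else np.inf)
    return D[n - 1, m - 1]

print(dtw_banded([0, 1, 2, 3], [0, 1, 1, 3], r=2))  # small shifts still matched
```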

Time Series Modeling
• Goal: come up with a model to describe time series values
  – Incorporate seasonality, trends, and cycles if needed
• Give a formula to predict the next value based on history
  – Don’t expect to be exactly right
  – Aim to minimize error, e.g. root mean squared error (RMSE)
• In principle, the next value could depend on the entire history
  – Hard to model (too many parameters), so simplify
  – Markov assumption: the next value depends on only recent values, e.g. only the last two values are needed to predict the next

Simple Recurrent Predictors
• Simplest possible model: predict that the next value is the same as the last
  – p_{i+1} = t_i
  – “If it rained today, it will rain tomorrow”
  – A baseline to compare against
• Can generalize (see the sketch below):
  – Trend: p_{i+1} = t_i + (t_i − t_{i-1}), assuming a constant difference between successive observations
  – Rate of change: p_{i+1} = t_i × (t_i / t_{i-1}), assuming a constant ratio between successive observations
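The three baselines are one-liners; a small sketch (function names are mine, not from the slides):

```python
def predict_last(t):
    """Naive forecast: the next value equals the last observation."""
    return t[-1]

def predict_trend(t):
    """Assume a constant difference between successive observations."""
    return t[-1] + (t[-1] - t[-2])

def predict_ratio(t):
    """Assume a constant ratio between successive observations."""
    return t[-1] * (t[-1] / t[-2])

series = [10.0, 12.0, 14.4]
print(predict_last(series))    # 14.4
print(predict_trend(series))   # 16.8  (constant +2.4 step)
print(predict_ratio(series))   # 17.28 (constant x1.2 growth)
```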

Averaging Methods
• The last value may be quite changeable
  – Use averaging to remove variability
• Simple approach: always predict the global average value
  – Works well if the time series is stationary (no trend, cycles, or periodicity)
• Moving average: compute the average of the last k observations
  – For suitable k: 5? 10? 100? 1000?
  – Model: predict p_i = Σ_{j=1}^{k} t_{i-j} / k
  – Easy to update: add on the new term, subtract the k’th
• Exponential moving average: decay the contribution of old values
  – Predict p_i = Σ_{j=1}^{n} α(1−α)^{j-1} t_{i-j} for smoothing parameter 0 < α < 1
  – Easy to update: the next prediction is p_i = α t_{i-1} + (1−α) p_{i-1} (see the sketch below)
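A short sketch of both predictors; seeding the exponential average with the first observation is one common choice (the slides don’t fix an initialization):

```python
def moving_average_forecast(t, k):
    """Predict the next value as the mean of the last k observations."""
    return sum(t[-k:]) / k

def ema_update(prev_forecast, latest_obs, alpha):
    """Exponential smoothing update: p_i = alpha*t_{i-1} + (1-alpha)*p_{i-1}."""
    return alpha * latest_obs + (1 - alpha) * prev_forecast

series = [3.0, 5.0, 4.0, 6.0, 5.5]
print(moving_average_forecast(series, k=3))   # (4.0 + 6.0 + 5.5) / 3

forecast = series[0]            # seed the smoothed value (an assumption)
for obs in series[1:]:
    forecast = ema_update(forecast, obs, alpha=0.3)
print(forecast)                 # prediction for the next, unseen value
```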

Autoregressive moving average models
• Learn the parameters of how t_i depends on recent values
  – Autoregressive moving average (ARMA) models
• Consist of two parts:
  – Autoregressive model: predicts based on the k most recent values
  – Moving average: predicts based on recent prediction errors
  – The prediction is the sum of the two parts
• To simulate the model, may additionally add “noise” to the output

Autoregressive (AR) Model
• Assume that the value depends on the k most recent values
  – p_i = c + Σ_{j=1}^{k} φ_j t_{i-j}, with parameters c, φ_1, φ_2, ..., φ_k
• How to learn the model parameters (assuming k is fixed)?
  – Train on historic data, aiming to minimize the error (t_i − p_i)
  – Minimizing the squared error is a multilinear regression problem (see the sketch below)
• Describe the model as a function of k: AR(k)
[Figure: sample paths of AR(1) with φ_1 = 0.95, AR(1) with φ_1 = 0.5, and AR(2) with φ_1 = 0.9, φ_2 = −0.9]
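Since fitting AR(k) is a multilinear regression, it can be solved directly with NumPy’s least-squares routine. A hedged sketch (the helper name and the simulated example are mine): build the matrix of lagged values and solve for c and the φ’s.

```python
import numpy as np

def fit_ar(t, k):
    """Fit p_i = c + sum_{j=1}^{k} phi_j * t_{i-j} by least squares."""
    t = np.asarray(t, dtype=float)
    # One row per i >= k: [1, t_{i-1}, t_{i-2}, ..., t_{i-k}]
    X = np.column_stack([np.ones(len(t) - k)] +
                        [t[k - j:len(t) - j] for j in range(1, k + 1)])
    y = t[k:]
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs[0], coeffs[1:]          # intercept c, weights phi_1..phi_k

# Simulate an AR(1) with phi_1 = 0.95 (as in the figure) and recover it
rng = np.random.default_rng(0)
series = [0.0]
for _ in range(500):
    series.append(0.95 * series[-1] + rng.normal())
c, phi = fit_ar(series, k=1)
print(round(float(c), 2), np.round(phi, 2))   # c near 0, phi_1 near 0.95
```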

Moving Average (MA) model
• Let ε_i be “white noise” terms
  – Random variables with mean 0 and standard deviation 1
  – Think of them as randomly occurring events that influence the series
• The moving average (MA) model depends only on the last q of the ε_i’s
  – p_i = Σ_{j=0}^{q} θ_j ε_{i-j}
  – The mean of the model is 0; the variance is Σ_{j=0}^{q} θ_j²
• It is more complicated to fit the model
  – The noise values ε_i are not directly observed
• Autoregressive moving average model: the sum of the AR and MA models
  – p_i = c + Σ_{j=1}^{k} φ_j t_{i-j} + Σ_{j=0}^{q} θ_j ε_{i-j}
  – Depends on parameters k and q: written ARMA(k,q) (see the simulation sketch below)
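To make the definition concrete, a sketch that simulates an ARMA(k,q) process from given parameters (the function name and parameter values are illustrative):

```python
import numpy as np

def simulate_arma(c, phi, theta, n, seed=0):
    """Generate n values of t_i = c + sum_j phi_j*t_{i-j} + sum_j theta_j*eps_{i-j},
    where the eps_i are i.i.d. standard-normal white-noise terms."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n)
    t = np.zeros(n)
    for i in range(n):
        ar = sum(p * t[i - j] for j, p in enumerate(phi, start=1) if i - j >= 0)
        ma = sum(th * eps[i - j] for j, th in enumerate(theta) if i - j >= 0)
        t[i] = c + ar + ma
    return t

# AR part (0.9, -0.9) as in the AR(2) figure, MA weights theta_0=1, theta_1=0.5
series = simulate_arma(c=0.0, phi=[0.9, -0.9], theta=[1.0, 0.5], n=200)
print(series[:5])
```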

ARMA models
• To predict m steps ahead: predict 1 step ahead, treat the prediction as observed, then predict the next step, and so on (see the sketch below)
[Figure: multi-step predictions from an ARMA(2,4) model]
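A sketch of this iterated one-step forecasting for the AR part of a fitted model; future noise terms have mean 0, so they are dropped from the prediction (parameter values here are illustrative):

```python
def forecast_ar(history, c, phi, m):
    """Predict m steps ahead: each one-step prediction is appended to the
    history and then treated as if it had been observed."""
    h = list(history)
    for _ in range(m):
        h.append(c + sum(phi[j] * h[-1 - j] for j in range(len(phi))))
    return h[-m:]

print(forecast_ar([1.0, 0.8, 0.9], c=0.0, phi=[0.9, -0.9], m=4))
```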

Summary and Further Reading
• Comparing time series: dynamic time warping
• Predicting time series: autoregressive moving averages

Background Reading
• Survey: “Time series data mining”, P. Esling and C. Agon
• “Dynamic Time Warping”, M. Muller (book chapter in “Information Retrieval for Music and Motion”)