Nonstationarities in teletraffic data which may spoil your statistical tests Piotr Żuraniewski (UvA/TNO/AGH) Felipe Mata (UAM), Michel Mandjes (UvA), Marco.

Nonstationarities in teletraffic data which may spoil your statistical tests
Piotr Żuraniewski (UvA/TNO/AGH), Felipe Mata (UAM), Michel Mandjes (UvA), Marco Mellia (POLITO)

Stationarity
Many models assume stationarity: statistical properties do not change over time.
- strong stationarity: all statistical properties remain the same over time
- weak stationarity: statistical properties up to second order (mean, variance, covariance) remain unchanged
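The two notions above can be written out formally. This is the standard textbook formulation, not taken from the slides; the symbols X_t, \mu, \sigma^2, \gamma(h) and the lag h are notation assumed here for illustration.

```latex
% Strong (strict) stationarity: every finite-dimensional distribution
% is invariant under a time shift h.
(X_{t_1}, \dots, X_{t_k}) \overset{d}{=} (X_{t_1+h}, \dots, X_{t_k+h})
\quad \text{for all } k,\ t_1, \dots, t_k,\ h.

% Weak (second-order) stationarity: mean and variance are constant,
% and the covariance depends only on the lag h, not on t.
\mathbb{E}[X_t] = \mu, \qquad
\operatorname{Var}(X_t) = \sigma^2 < \infty, \qquad
\operatorname{Cov}(X_t, X_{t+h}) = \gamma(h)
\quad \text{for all } t,\ h.
```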

Nonstationarity: problems
Real life: things are changing...
Bad news: stationarity of a sample cannot be positively verified.
Best answer we can get: "we found no evidence of the given type of nonstationarity".
Some examples:
- mean shift
- polynomial deterministic trend
- variance change

Example
Change in the number of users in a VoIP system.
Model: load change in an M/G/inf queue.
The sample ACF suggests very high correlation:
- slow decay?
- long-range dependence?

Example
The changepoint detection procedure we developed allows us to separate parts with different load.
There is no significant correlation in either of these parts.
The sample ACF does not estimate the ACF in the case of nonstationarity.
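A minimal sketch of this effect, under simplifying assumptions: instead of an M/G/inf occupancy process it uses independent Gaussian noise around two different levels (10 and 14, both hypothetical), so the only "dependence" is the mean shift. The helper name sample_acf is made up for this illustration.

```python
import random

def sample_acf(x, max_lag):
    """Sample autocorrelation for lags 0..max_lag, using one global mean."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x) / n
    return [
        sum((x[t] - mean) * (x[t + h] - mean) for t in range(n - h)) / (n * c0)
        for h in range(max_lag + 1)
    ]

rng = random.Random(0)
# Two independent segments that differ only in their mean (assumed levels).
before = [rng.gauss(10.0, 1.0) for _ in range(500)]
after = [rng.gauss(14.0, 1.0) for _ in range(500)]
shifted = before + after

acf_full = sample_acf(shifted, 20)    # computed across the mean shift
acf_before = sample_acf(before, 20)   # computed within one regime
acf_after = sample_acf(after, 20)

print("lag 1, whole series:", round(acf_full[1], 2))
print("lag 1, first part:  ", round(acf_before[1], 2))
print("lag 1, second part: ", round(acf_after[1], 2))
```

Across the shift the sample ACF stays high and decays slowly, even though every observation is independent; within each segment it is close to zero.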

Changepoint detection
A window of 50 samples is presented to the detection procedure.
Add the newest observation, drop the oldest, and repeat the detection procedure.
In this example: the true change is in window number 51.
Changepoint detection works well: see the output of 500 experiments.
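The sliding-window scheme can be sketched as follows. The slides do not specify the test used inside each window, so this sketch swaps in a simple stand-in: the largest standardized difference of means over all split points, assuming independent Gaussian observations with known standard deviation. The function names, the threshold of 5.0, and the simulated data are all assumptions for illustration, not the authors' procedure.

```python
import math
import random

def max_split_statistic(window, sigma):
    """Largest |z| for a difference of means over all splits of the window."""
    n = len(window)
    total = sum(window)
    left = 0.0
    best = 0.0
    for k in range(1, n):
        left += window[k - 1]  # sum of the first k samples
        mean_left = left / k
        mean_right = (total - left) / (n - k)
        z = abs(mean_left - mean_right) / (sigma * math.sqrt(1.0 / k + 1.0 / (n - k)))
        best = max(best, z)
    return best

def sliding_alarms(series, width, sigma, threshold):
    """Slide a window one sample at a time; return start indices that raise an alarm."""
    alarms = []
    for start in range(len(series) - width + 1):
        window = series[start:start + width]
        if max_split_statistic(window, sigma) > threshold:
            alarms.append(start)
    return alarms

rng = random.Random(1)
# Assumed toy data: mean 0 for 100 samples, then mean 3, unit variance throughout.
series = [rng.gauss(0.0, 1.0) for _ in range(100)] + [rng.gauss(3.0, 1.0) for _ in range(100)]
alarms = sliding_alarms(series, width=50, sigma=1.0, threshold=5.0)
print("first alarm at window start:", alarms[0] if alarms else None)
```

Windows that straddle the change raise alarms, while windows lying entirely inside one regime should stay quiet for a threshold this conservative.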

Changepoint detection
However, if we add a deterministic trend, things go wrong.
Observe the high false alarm ratio after polluting the data with a trend.
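To illustrate why a trend produces false alarms, the sketch below applies a simple mean-shift statistic (largest standardized difference of means over all splits of a 50-sample window, a stand-in rather than the authors' test) to data that contain no changepoint at all. The slope of 0.1 per sample, the threshold of 5.0, and all names are assumptions made for this example.

```python
import math
import random

def max_split_statistic(window, sigma):
    """Largest |z| for a difference of means over all splits of the window."""
    n = len(window)
    total = sum(window)
    left = 0.0
    best = 0.0
    for k in range(1, n):
        left += window[k - 1]
        z = abs(left / k - (total - left) / (n - k)) / (
            sigma * math.sqrt(1.0 / k + 1.0 / (n - k))
        )
        best = max(best, z)
    return best

def alarm_ratio(series, width, sigma, threshold):
    """Fraction of sliding windows in which the statistic exceeds the threshold."""
    starts = range(len(series) - width + 1)
    hits = sum(
        1 for s in starts if max_split_statistic(series[s:s + width], sigma) > threshold
    )
    return hits / len(starts)

rng = random.Random(2)
flat = [rng.gauss(0.0, 1.0) for _ in range(300)]       # stationary, no change
trended = [v + 0.1 * t for t, v in enumerate(flat)]    # same noise plus a linear trend

ratio_flat = alarm_ratio(flat, 50, 1.0, 5.0)
ratio_trend = alarm_ratio(trended, 50, 1.0, 5.0)
print("alarm ratio without trend:", ratio_flat)
print("alarm ratio with trend:   ", ratio_trend)
```

Every alarm on the trended series is a false alarm in the changepoint sense: the test assumes a constant mean within each regime, and the trend violates that assumption in every window.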

Work in progress
Real VoIP data from an Italian service provider and aggregated IP data from a Spanish university backbone network.
Current research: estimate and remove the trend from the traffic.
Only then apply the changepoint detection procedure(s).

Work in progress
Trend estimation methods:
- moving average?
- kernel/wavelet smoothing?
- parametric methods?
- time series regression?
How to judge whether the estimated trend is really significant?
Models other than M/G/inf?
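As one possible reading of the parametric option, here is a hedged sketch that fits a straight line by ordinary least squares, reports a t-statistic for the slope as a rough significance check, and returns the detrended residuals. The function name, the simulated data, and the slope of 0.1 are assumptions. Note that the t-statistic itself assumes independent errors with constant variance, which is exactly the kind of assumption the slides warn about.

```python
import math
import random

def fit_linear_trend(y):
    """OLS fit of y[t] = a + b*t; returns (slope, t-statistic of slope, residuals)."""
    n = len(y)
    t_mean = (n - 1) / 2.0
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y[t] - y_mean) for t in range(n))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    residuals = [y[t] - (intercept + slope * t) for t in range(n)]
    # Residual variance with n - 2 degrees of freedom (two fitted parameters).
    s2 = sum(r * r for r in residuals) / (n - 2)
    t_stat = slope / math.sqrt(s2 / sxx)
    return slope, t_stat, residuals

rng = random.Random(3)
flat = [rng.gauss(0.0, 1.0) for _ in range(300)]
trended = [v + 0.1 * t for t, v in enumerate(flat)]

slope_flat, t_flat, _ = fit_linear_trend(flat)
slope_tr, t_tr, detrended = fit_linear_trend(trended)
print("no trend:   slope", round(slope_flat, 4), "t", round(t_flat, 2))
print("with trend: slope", round(slope_tr, 4), "t", round(t_tr, 2))
```

The residuals in detrended are what a changepoint procedure would then be applied to. A real trend need not be linear, and a mean shift inside the fitting range would itself bias the fitted slope, so this is only a starting point.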

Conclusions
Different types of nonstationarity may severely influence statistical tests or the values of estimators.
Even if we try to detect one type of nonstationarity, another type may ruin our original test.
We always have to pay attention to the assumptions of the theorems used.
Share your experience!