GAUSSIAN PROCESS REGRESSION FORECASTING OF COMPUTER NETWORK PERFORMANCE CHARACTERISTICS

Christina Garman 1 ('11) & Michael Frey, Ph.D. 2
1 Departments of Computer Science and Mathematics, 2 Department of Mathematics, Bucknell University, Lewisburg, PA

Introduction

Computer network conditions (available bandwidth, latency, and loss) concern:
– users with large data transfers or resource-intensive applications,
– network engineers monitoring the quality of their network, and
– network researchers.

We have investigated Gaussian process regression (GPR) for forecasting network conditions. GPR lets us predict continuous quantities by "learning" from a set of training data, and it accommodates:
– asynchronous data sources,
– periodic data,
– actively measured data,
– missing data, and
– structural data.

Because of the nature of computer networks, all of these situations are likely to occur, so our forecasting framework must be able to handle them.

Gaussian Process Regression

A Gaussian process is an indexed set of random variables, any finite number of which have a joint Gaussian distribution; it is completely specified by a mean function and a covariance function.

Important features:
– Covariance (or kernel) function: gives a model of the data and controls the properties of the Gaussian process.
– Hyperparameters: adjustable parameters, "learned" (inferred) from a set of training data, allowing the kernel function to provide the best description of the current data.

GPR can model many different trends and properties of a data set, and simple covariance functions can be combined to create more complex ones. In the example from [3] (figure not reproduced in this transcript), note the long-term rising trend, the seasonal variation, and the small irregularities, and how GPR incorporates all of them into the forecast. A minimal MATLAB sketch of a GPR forecast appears at the end of this transcript.

Basic Algorithm

[The algorithm's equations appear only as images in the original poster and are not reproduced in this transcript.]

New Formulae for Updating GPR Forecasts

– Computationally efficient: no new matrix inversions.
– No need to redo the whole process each time a new data point is received. (An illustrative sketch of this kind of update follows the GPR sketch at the end of this transcript.)

Variance – Two Questions

1. What is the effect of history length on prediction error?
2. How does the variance change as our forecasting point moves out in time?

Both questions boil down to a study of the same quantity. Using the Rayleigh–Ritz theorem, we can bound this quantity and then simplify the bound. [The quantity and its bounds appear only as images in the original poster and are not reproduced in this transcript.]

Implementation

Our forecasting efforts focus on the Department of Energy's Energy Sciences Network (ESnet) [1]. Forecasts are done in MATLAB [2]; we have created a framework that allows the code to be run directly in MATLAB or from a C program.

Future Work

– Revisit this work from an information-theoretic perspective.
– Improve network performance characteristics forecasting using multivariate data.

Acknowledgements

– Department of Energy Research Assistantship
– MATLAB code: Carl Edward Rasmussen and Hannes Nickisch

References

1. Department of Energy, Energy Sciences Network (ESnet).
2. Carl Edward Rasmussen and Hannes Nickisch, Gaussian Process Regression and Classification Toolbox, version 3.0.
3. Carl Edward Rasmussen and Christopher K. I. Williams, Gaussian Processes for Machine Learning, MIT Press, 2006.

[The poster's "Maximum Likelihood Estimation" and "Forecasting" panels also appear only as images and are not reproduced in this transcript.]
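As noted in the Gaussian Process Regression section above, the following is a minimal MATLAB sketch of a GPR forecast with a squared-exponential kernel. It is not the poster's code: the kernel choice, the synthetic measurement series, and the hyperparameter values (ell, sf, sn) are all illustrative assumptions; as the poster describes, real forecasts would infer the hyperparameters from training data.

% Minimal GPR forecast sketch (illustrative; not the authors' code).
% Squared-exponential kernel; a and b are column vectors of time indices.
k = @(a, b, ell, sf) sf^2 * exp(-(a - b').^2 / (2 * ell^2));

t   = (1:20)';                           % measurement times (synthetic)
y   = sin(0.3 * t) + 0.1 * randn(20, 1); % synthetic network measurement series
sn  = 0.1;                               % assumed noise standard deviation
ell = 2.0;  sf = 1.0;                    % assumed kernel hyperparameters
tstar = (21:25)';                        % forecast horizon

K     = k(t, t, ell, sf) + sn^2 * eye(numel(t)); % noisy training covariance
Kstar = k(tstar, t, ell, sf);                    % test/train cross-covariance
Kss   = k(tstar, tstar, ell, sf);                % test covariance

L     = chol(K, 'lower');                % factor once; no explicit inverse
alpha = L' \ (L \ y);                    % solves K * alpha = y
mu    = Kstar * alpha;                   % predictive mean forecast
v     = L \ Kstar';
s2    = diag(Kss - v' * v);              % predictive variance at tstar

In practice the hyperparameters would be fit by maximizing the marginal likelihood (the role of the poster's "Maximum Likelihood Estimation" panel), for example with the GPML toolbox [2].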
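The poster's new updating formulae themselves are shown only as images, so the sketch below illustrates the stated property by a standard route rather than by the authors' exact formulae: when one new measurement (tnew, ynew) arrives, the Cholesky factor L from the sketch above is extended by a block update, so the forecast weights are refreshed with triangular solves only and no new matrix inversion. The values tnew and ynew are hypothetical.

% Continues from the sketch above; extends L for one new point in O(n^2).
tnew = 21;  ynew = 0.8;                  % hypothetical new measurement

kvec = k(t, tnew, ell, sf);              % covariance with the history
knn  = k(tnew, tnew, ell, sf) + sn^2;    % noisy variance of the new point

l12 = L \ kvec;                          % forward solve against old factor
l22 = sqrt(knn - l12' * l12);            % Schur complement (positive)

L = [L, zeros(numel(t), 1); l12', l22];  % extended Cholesky factor
t = [t; tnew];  y = [y; ynew];           % extended training set

alpha = L' \ (L \ y);                    % refreshed weights: two triangular
                                         % solves, no new matrix inversion

Refreshing alpha this way costs O(n^2), versus O(n^3) to refactor the covariance from scratch, consistent with the poster's claim that forecasts can be updated without redoing the whole process for each new data point.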