Workload Characterization and Performance Assessment of Yellowstone using XDMoD and Exploratory data analysis (EDA) 1 August 2014 Ying Yang, SUNY, University.


Workload Characterization and Performance Assessment of Yellowstone using XDMoD and Exploratory data analysis (EDA) 1 August 2014 Ying Yang, SUNY, University at Buffalo Mentor: Tom Engel, NCAR Co-Mentors: Shawn Strande, Dave Hart, NCAR

Big Picture 2 Background XDMoD and Yellowstone Job Data Enhancement of XDMoD for Yellowstone Additional Analyses of Yellowstone Job Data Summary & Future Work

Background 4 What is XDMoD? Open XDMoD is an open source tool designed to audit and facilitate the utilization of supercomputers by providing a wide range of metrics on resources, including resource utilization, resource performance, and impact on scholarship and research. XDMoD is an acronym for "XSEDE Metrics on Demand"; it was developed by the University at Buffalo for NSF's XSEDE under NSF grant OCI

Background 5 XDMoD Architecture Details

XDMoD and Yellowstone Job Data 7 XDMoD runs on a dedicated server at NWSC; the software was installed and configured by the SSG group. Collaborated with CISL and SUNY at Buffalo developers to test a new shredder for ingesting LSF job termination accounting records. Shredded and ingested all of the LSF accounting data from Yellowstone, Geyser, and Caldera (November 2012 to the present) into Open XDMoD. Total job records are shredded jobs are ingested.

XDMoD and Yellowstone Job Data 8

XDMoD and Yellowstone Job Data 10 [Data-flow diagram. Labels: LSF, Yellowstone Shredded Data, Yellowstone Ingested Data, SuperMoD, REST Service API]

XDMoD and Yellowstone Job Data 11 XDMoD’s Summary tab

XDMoD and Yellowstone Job Data 12 XDMoD’s Metric Explorer (CPU time group by user)

Enhancement of XDMoD for Yellowstone 14 Two new metrics (1) Job Size: Weighted By Core Hours (Core Count): the average NCAR job size weighted by core hours. Defined as $\sum_{i=0}^{n} c_i h_i \,/\, \sum_{i=0}^{n} h_i$, where $c_i$ is the core count of job $i$ and $h_i$ is the core hours consumed by job $i$.
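The definition above can be illustrated with a minimal Python sketch. It is not the XDMoD implementation: the function names and the sample job list are hypothetical, and each job is assumed to be a (core count, core hours consumed) pair. The unweighted mean is included only to show how a few large jobs dominate the weighted figure.

```python
from typing import List, Sequence, Tuple

Job = Tuple[int, float]  # (core count, core hours consumed); hypothetical layout


def weighted_job_size(jobs: Sequence[Job]) -> float:
    """Average job size (in cores) weighted by core hours consumed."""
    total_hours = sum(hours for _, hours in jobs)
    if total_hours == 0:
        raise ValueError("jobs consumed no core hours; the weighted size is undefined")
    return sum(cores * hours for cores, hours in jobs) / total_hours


def mean_job_size(jobs: Sequence[Job]) -> float:
    """Plain (unweighted) average core count, for comparison."""
    if not jobs:
        raise ValueError("no jobs given")
    return sum(cores for cores, _ in jobs) / len(jobs)


# Made-up sample: two small jobs and one large, long-running job.
sample_jobs: List[Job] = [(16, 10.0), (1024, 5000.0), (32, 20.0)]

if __name__ == "__main__":
    print("unweighted mean size:", mean_job_size(sample_jobs))
    print("core-hour weighted size:", weighted_job_size(sample_jobs))
```

Because the weights are core hours, the result answers "how large is the job in which a typical core hour is spent", not "how large is a typical job".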

Enhancement of XDMoD for Yellowstone 15 XDMoD’s Average Job Size

Enhancement of XDMoD for Yellowstone 16 Sophie’s Job Size Weighted By Core Hours (Core Count)

Enhancement of XDMoD for Yellowstone 17 Two new metrics (2) Yellowstone %Scheduled: the percentage of resources scheduled to be utilized by jobs running on Yellowstone. Yellowstone Scheduled Utilization: the ratio of the total CPU hours scheduled to Yellowstone jobs over a given time period to the total CPU hours that the system could have provided during that period.
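As a rough illustration of the ratio described above, the following Python sketch divides scheduled CPU hours by the CPU hours the system could have provided. The function names, the core count, and the job figures are all hypothetical placeholders, not Yellowstone values, and the sketch assumes capacity is simply total cores multiplied by the length of the period.

```python
from typing import Iterable


def scheduled_utilization(scheduled_cpu_hours: Iterable[float],
                          total_cores: int,
                          period_hours: float) -> float:
    """Ratio of CPU hours scheduled to jobs to CPU hours the system could provide.

    Assumes capacity = total_cores * period_hours (no downtime adjustment).
    """
    available = total_cores * period_hours
    if available <= 0:
        raise ValueError("total_cores and period_hours must be positive")
    return sum(scheduled_cpu_hours) / available


def percent_scheduled(scheduled_cpu_hours: Iterable[float],
                      total_cores: int,
                      period_hours: float) -> float:
    """The same ratio expressed as a percentage."""
    return 100.0 * scheduled_utilization(scheduled_cpu_hours, total_cores, period_hours)


if __name__ == "__main__":
    # Hypothetical system: 1000 cores observed for 24 hours, two jobs scheduled.
    print(percent_scheduled([6000.0, 12000.0], total_cores=1000, period_hours=24.0))
```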

Enhancement of XDMoD for Yellowstone 18 Yellowstone %Scheduled: (by job size) Many 144-node (only 1 core per node) jobs are running.

Additional Analyses of Yellowstone Job Data 20 Exploratory data analysis with ingested data using R. Question: What is the average job size and how has it varied over time? Methods: Forecasting Using Exponential Smoothing; Forecasting Using ARIMA Model; Multiple Linear Regression; K-Nearest Neighbor. Experiments and Results.

Additional Analyses of Yellowstone Job Data 21 Methods: Exponential Smoothing. a) Simple Exponential Smoothing: an additive model with constant level and no seasonality. b) Holt’s Exponential Smoothing: an additive model with increasing or decreasing trend and no seasonality. c) Holt-Winters Exponential Smoothing: an additive model with increasing or decreasing trend and seasonality.
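The first two methods can be sketched in a few lines of Python. This is a textbook formulation, not the R code used for the slides: the smoothing parameters `alpha` and `beta` are fixed by hand here (an assumption; statistical packages usually estimate them), the initial level and trend are taken from the first observations, and the Holt-Winters seasonal variant is omitted.

```python
from typing import Sequence


def ses_forecast(series: Sequence[float], alpha: float) -> float:
    """Simple exponential smoothing: one-step-ahead forecast (constant level)."""
    if not series:
        raise ValueError("series must not be empty")
    level = series[0]  # initial level: first observation (a common simple choice)
    for y in series[1:]:
        level = alpha * y + (1.0 - alpha) * level
    return level


def holt_forecast(series: Sequence[float], alpha: float, beta: float,
                  horizon: int = 1) -> float:
    """Holt's linear method: additive level plus trend, no seasonality."""
    if len(series) < 2:
        raise ValueError("need at least two observations to initialise the trend")
    level = series[0]
    trend = series[1] - series[0]  # initial trend: first difference
    for y in series[1:]:
        previous_level = level
        level = alpha * y + (1.0 - alpha) * (previous_level + trend)
        trend = beta * (level - previous_level) + (1.0 - beta) * trend
    return level + horizon * trend


if __name__ == "__main__":
    daily_job_size = [100.0, 104.0, 103.0, 108.0, 111.0]  # made-up values
    print("SES forecast:", ses_forecast(daily_job_size, alpha=0.5))
    print("Holt forecast:", holt_forecast(daily_job_size, alpha=0.5, beta=0.5))
```

On a series with a steady trend, the simple method lags behind while Holt's method extrapolates the slope, which is the practical difference between models a) and b).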

Additional Analyses of Yellowstone Job Data 22 Methods: ARIMA Model. Autoregressive Integrated Moving Average (ARIMA) models include an explicit statistical model for the irregular component of a time series, which allows for non-zero autocorrelations in that component. Building the model: Step 1: Differencing a time series (diff() function). Step 2: Selecting a candidate ARIMA model (acf() and pacf() functions). Step 3: Forecasting using an ARIMA model.
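The slides name R's diff() and acf() for the first two steps. As a hedged illustration of what those steps compute, here is a small pure-Python analogue; the function names `difference` and `autocorrelation` are hypothetical, the partial autocorrelation (pacf) and the ARIMA fit itself are not reproduced, and the sample series is made up.

```python
from typing import List, Sequence


def difference(series: Sequence[float], lag: int = 1) -> List[float]:
    """Lagged differences: element t minus element t - lag."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


def autocorrelation(series: Sequence[float], max_lag: int) -> List[float]:
    """Sample autocorrelations for lags 0..max_lag.

    Uses the usual estimator: lag-k autocovariance divided by the lag-0
    autocovariance, both computed around the overall mean.
    """
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    return [
        sum((series[t] - mean) * (series[t + k] - mean) for t in range(n - k)) / denom
        for k in range(max_lag + 1)
    ]


if __name__ == "__main__":
    job_size = [1.0, 4.0, 9.0, 16.0, 25.0, 36.0]  # made-up, clearly non-stationary
    once = difference(job_size)   # removes a linear trend
    twice = difference(once)      # removes a quadratic trend
    print("first difference:", once)
    print("second difference:", twice)
    print("ACF of first difference:", autocorrelation(once, max_lag=2))
```

The number of differences needed to make the series look stationary gives the d in ARIMA(p, d, q); the autocorrelation pattern of the differenced series then guides the choice of p and q.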

Additional Analyses of Yellowstone Job Data 23 Experiments: Naive method; Mean method; Drift method (week and month); Simple Exponential Smoothing (SES); Holt’s Exponential Smoothing (HES); Holt-Winters Exponential Smoothing (HWES); ARIMA Model; Multiple Linear Regression; K-Nearest Neighbor. Descriptions: Data: data in 2013, total days: 364. Day as training data, day as testing data. Prediction error: the difference between the predicted value and the true value, expressed as a percentage of the true value. The Naive, Mean and Drift methods serve as baselines for comparison. ES forecasts use all days before the day being predicted (e.g. day predict 101, day predict 102, ...).
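The baseline methods and the error measure can be sketched as follows. This is an illustrative Python version under stated assumptions, not the original R experiment: the error is taken as an absolute percentage (the slide does not say whether it is signed), the drift slope is computed over the whole history rather than the week or month windows mentioned above, and the series and split point are made up.

```python
from typing import Callable, List, Sequence


def naive_forecast(history: Sequence[float]) -> float:
    """Naive method: tomorrow equals the last observed value."""
    return history[-1]


def mean_forecast(history: Sequence[float]) -> float:
    """Mean method: tomorrow equals the average of all past values."""
    return sum(history) / len(history)


def drift_forecast(history: Sequence[float]) -> float:
    """Drift method: last value plus the average change per step so far."""
    if len(history) < 2:
        raise ValueError("drift needs at least two observations")
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return history[-1] + slope


def percentage_error(predicted: float, actual: float) -> float:
    """Absolute difference as a percentage of the true value (assumed unsigned)."""
    return abs(predicted - actual) / abs(actual) * 100.0


def expanding_window_errors(series: Sequence[float], first_test_index: int,
                            forecast: Callable[[Sequence[float]], float]) -> List[float]:
    """Predict each test day from all days before it and record the error."""
    return [
        percentage_error(forecast(series[:t]), series[t])
        for t in range(first_test_index, len(series))
    ]


if __name__ == "__main__":
    series = [100.0, 110.0, 120.0, 130.0, 125.0, 140.0]  # made-up daily values
    for name, method in [("naive", naive_forecast), ("mean", mean_forecast),
                         ("drift", drift_forecast)]:
        errors = expanding_window_errors(series, first_test_index=3, forecast=method)
        print(name, sum(errors) / len(errors))
```

Any one-step forecaster that takes the history and returns a number can be passed as `forecast`, so the same loop can score the smoothing or ARIMA methods against these baselines.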

Additional Analyses of Yellowstone Job Data 24 Experiment Results

26 Summary: Ingested all Yellowstone accounting data into XDMoD (November 2012 to present). Developed two new metrics for Yellowstone and contributed them back to the open source project. Performed exploratory data analysis using R. Future Work: Enhancement of XDMoD. Further data analysis on Yellowstone data. Integrate EDA into XDMoD.

Acknowledgements 27 HSS and USS: Tom Engel HSS (Mentor) Shawn Strande HSS (Co-Mentor) Dave Hart USS (Co-Mentor) Davide Del Vento CSG Pamela Gillman DASG Erich Thanhardt MSSG Irfan Elahi SCSG IMAGe: Doug Nychka IMAGe
