Workload Characterization and Performance Assessment of Yellowstone using XDMoD and Exploratory Data Analysis (EDA)
1 August 2014
Ying Yang, SUNY University at Buffalo
Mentor: Tom Engel, NCAR
Co-Mentors: Shawn Strande, Dave Hart, NCAR
Big Picture
Background
XDMoD and Yellowstone Job Data
Enhancement of XDMoD for Yellowstone
Additional Analyses of Yellowstone Job Data
Summary & Future Work
Background: What is XDMoD?
Open XDMoD is an open-source tool designed to audit and facilitate the utilization of supercomputers by providing a wide range of metrics on resources, including resource utilization, resource performance, and impact on scholarship and research.
XDMoD is an acronym for "XSEDE Metrics on Demand"; it was developed by the University at Buffalo for NSF's XSEDE program under an NSF OCI grant.
Background: XDMoD Architecture Details (diagram)
XDMoD and Yellowstone Job Data
XDMoD runs on a dedicated server at NWSC; the software was installed and configured by the SSG group.
Collaborated with CISL and University at Buffalo developers to test a new shredder for ingesting LSF job-termination accounting records.
Shredded and ingested all of the LSF accounting data from Yellowstone, Geyser, and Caldera (November 2012 to the present) into Open XDMoD; all shredded job records were ingested.
XDMoD and Yellowstone Job Data: data-flow diagram.
LSF accounting records -> Yellowstone shredded data -> Yellowstone ingested data -> SuperMoD REST Service API
XDMoD and Yellowstone Job Data: screenshot of XDMoD's Summary tab
XDMoD and Yellowstone Job Data: screenshot of XDMoD's Metric Explorer (CPU time grouped by user)
Enhancement of XDMoD for Yellowstone: Two new metrics (1)
Job Size: Weighted by Core Hours (Core Count): the average NCAR job size weighted by core hours.
Defined as: sum over jobs i = 1..n of (core_count_i * core_hours_i), divided by the sum over the same jobs of core_hours_i.
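A minimal sketch of this metric in R (the language used for the EDA later in this talk); the data frame and column names (`jobs`, `core_count`, `core_hours`) are hypothetical stand-ins for the ingested job records:

```r
# Weighted-average job size: jobs that consume more core hours count more.
# `jobs`, `core_count`, and `core_hours` are hypothetical placeholder names.
weighted_job_size <- function(jobs) {
  sum(jobs$core_count * jobs$core_hours) / sum(jobs$core_hours)
}

# Toy example: one 16-core job, one 1024-core job.
jobs <- data.frame(core_count = c(16, 1024),
                   core_hours = c(160, 10240))
weighted_job_size(jobs)  # ~1008.5: dominated by the large job's core hours
```

Unlike a plain average, the core-hour weighting keeps many short small jobs from masking the large jobs that actually consume the machine.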
Enhancement of XDMoD for Yellowstone: screenshot of XDMoD's Average Job Size metric
Enhancement of XDMoD for Yellowstone: screenshot of Sophie's Job Size Weighted By Core Hours (Core Count) metric
Enhancement of XDMoD for Yellowstone: Two new metrics (2)
Yellowstone %Scheduled: the percentage of resources scheduled to be utilized by jobs running on Yellowstone.
Yellowstone Scheduled Utilization: the ratio of the total CPU hours scheduled to Yellowstone jobs over a given time period to the total CPU hours that the system could have provided during that period.
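A sketch of the Scheduled Utilization calculation under that definition. The column names and helper are assumptions, and the default core count reflects Yellowstone's 72,288 compute cores:

```r
# Scheduled Utilization = scheduled CPU hours / deliverable CPU hours.
# `sched_cores` and `wall_hours` are hypothetical column names for the
# cores a job reserved and its wall-clock duration.
scheduled_utilization <- function(jobs, period_hours,
                                  total_cores = 72288) {  # Yellowstone cores
  sum(jobs$sched_cores * jobs$wall_hours) / (total_cores * period_hours)
}

# Toy example: two jobs over a one-day window.
jobs <- data.frame(sched_cores = c(2304, 512), wall_hours = c(12, 24))
scheduled_utilization(jobs, period_hours = 24)  # fraction of capacity scheduled
```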
Enhancement of XDMoD for Yellowstone: Yellowstone %Scheduled (by job size)
Many 144-node jobs that use only one core per node are running; such jobs schedule far more CPU hours than they consume, which is exactly the pattern the %Scheduled metric exposes.
Additional Analyses of Yellowstone Job Data
Exploratory data analysis of the ingested data using R.
Question: What is the average job size, and how has it varied over time?
Methods:
Forecasting using exponential smoothing
Forecasting using an ARIMA model
Multiple linear regression
K-nearest neighbors
Experiments and results
Additional Analyses of Yellowstone Job Data
Methods: Exponential Smoothing
a) Simple Exponential Smoothing: an additive model with a constant level and no seasonality
b) Holt's Exponential Smoothing: an additive model with an increasing or decreasing trend and no seasonality
c) Holt-Winters Exponential Smoothing: an additive model with an increasing or decreasing trend and seasonality
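In base R, all three variants can be fit with HoltWinters() by switching the trend (beta) and seasonal (gamma) components off; the `daily_size` series below is a synthetic stand-in for the daily average-job-size series, not the real Yellowstone data:

```r
# Synthetic stand-in for the daily average-job-size series (weekly seasonality).
daily_size <- ts(500 + 10 * sin(2 * pi * (1:364) / 7) + rnorm(364, sd = 25),
                 frequency = 7)

ses  <- HoltWinters(daily_size, beta = FALSE, gamma = FALSE)  # a) level only
hes  <- HoltWinters(daily_size, gamma = FALSE)                # b) level + trend
hwes <- HoltWinters(daily_size)                               # c) + seasonality

predict(hwes, n.ahead = 7)  # forecast the next week
```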
Additional Analyses of Yellowstone Job Data
Methods: ARIMA Model
Autoregressive Integrated Moving Average (ARIMA) models include an explicit statistical model for the irregular component of a time series, which allows for non-zero autocorrelations in that component.
Building the model:
Step 1: Difference the time series (diff() function)
Step 2: Select a candidate ARIMA model (acf() and pacf() functions)
Step 3: Forecast using the fitted ARIMA model
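The three steps map directly onto base R functions. In this sketch the series is synthetic and the (1,1,1) order is a placeholder assumption, since the real candidate order would come from reading the ACF/PACF plots:

```r
daily_size <- ts(500 + cumsum(rnorm(364, sd = 5)))  # synthetic stand-in series

d1 <- diff(daily_size, differences = 1)       # Step 1: difference to stationarity
acf(d1)                                       # Step 2: ACF suggests the MA order q
pacf(d1)                                      #         PACF suggests the AR order p
fit <- arima(daily_size, order = c(1, 1, 1))  # hypothetical (p, d, q) choice
predict(fit, n.ahead = 7)                     # Step 3: forecast a week ahead
```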
Additional Analyses of Yellowstone Job Data
Experiments
Methods compared: naive method; mean method; drift method (week and month); Simple Exponential Smoothing (SES); Holt's Exponential Smoothing (HES); Holt-Winters Exponential Smoothing (HWES); ARIMA model; multiple linear regression; k-nearest neighbors.
Setup: data from 2013, 364 days in total; days 1-100 as training data, days 101-364 as testing data.
Prediction error: |predicted - actual| / actual, expressed as a percentage of the true value.
The naive, mean, and drift methods serve as performance baselines.
The exponential smoothing methods predict each day using all preceding days (e.g., days 1-100 predict day 101, days 1-101 predict day 102, and so on); see the sketch after this list.
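A sketch of that rolling one-step-ahead scheme, using SES as the representative model; the series is synthetic and the split (train through day 100, test on days 101-364) follows the setup above:

```r
set.seed(1)
daily_size <- ts(500 + rnorm(364, sd = 25))  # synthetic stand-in series

pct_err <- sapply(101:364, function(d) {
  train <- window(daily_size, end = d - 1)                  # all days before day d
  fit   <- HoltWinters(train, beta = FALSE, gamma = FALSE)  # SES
  pred  <- predict(fit, n.ahead = 1)                        # one-day-ahead forecast
  abs(pred - daily_size[d]) / daily_size[d] * 100           # percentage error
})
mean(pct_err)  # average prediction error over the test days
```

The same loop would swap in HoltWinters(train, gamma = FALSE), arima(train, ...), or a regression/KNN fit to score each method on identical test days.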
Additional Analyses of Yellowstone Job Data: Experiment Results (table of prediction errors by method)
Summary:
Ingested all Yellowstone accounting data (November 2012 to present) into XDMoD
Developed two new metrics for Yellowstone and contributed them back to the open source project
Performed exploratory data analysis using R
Future Work:
Further enhancement of XDMoD
Further data analysis of Yellowstone job data
Integrate EDA into XDMoD
Acknowledgements
Tom Engel, HSS (Mentor)
Shawn Strande, HSS (Co-Mentor)
Dave Hart, USS (Co-Mentor)
Davide Del Vento, CSG
Pamela Gillman, DASG
Erich Thanhardt, MSSG
Irfan Elahi, SCSG
Doug Nychka, IMAGe