Sampling and Analysis Tools for E-Center for Multi-domain Internet Performance Measurement
Prasad Calyam, Ph.D.
Winter ESCC Meeting, February 4th, 2010
Topics of Discussion
OSC/OARnet Project Overview
Multi-domain Performance Measurement
–Sampling and Analysis Requirements
–Past and On-going Efforts
–Focus Themes and Preliminary Results
E-Center Anomaly Detection Vision
Demo of “OnTimeDetect” – Anomalies Analysis Tool
The “network-awareness” gap!
Network Researcher: Bandwidth-on-demand, DDoS Traceback, Path Switching, …
Measurements Provider: Measurements collection, Measurement graphs, Measurement query, …
Network Researcher (assumption): “Measurements Provider can provide measurements when, where and however I want!”
Measurements Provider (assumption): “I am collecting all the measurements a network researcher would want!”
Network Researcher (request): “Hey Measurements Provider, I need pure periodic samples of available bandwidth on xyz paths crossing A, B and C domains for my performance forecasting”
Measurements Provider (reply): “Oh, I don’t collect pure periodic samples, and don’t know if A, B and C domains collect available bandwidth measurements!”
This gap between “assumptions of theory” (researchers) and “delivering ability of reality” (ISPs) can be bridged by:
–Efficient sampling techniques that meet measurement timing demands
–Measurement federation policies to provision multi-domain measurements
Project Overview
DOE ASCR Network Research Grant to OSC/OARnet
–PI: Prasad Calyam, Ph.D.
–Team: Weiping Mandrawa (Software Engineer), Amruta Joshi (Student Research Assistant), Pu Jialu (Student Research Assistant), Thomas Bitterman (Consultant)
Goal: To develop multi-domain network status sampling techniques and tools to measure/analyze multi-layer performance
–To be deployed on testbeds to support networking for DOE science
–E.g., “E-Center” for performance monitoring of LHC data feeds between Tier-1 sites
Collaborations: LBNL, FermiLab, Bucknell U., U. of Delaware, Internet2
Expected outcomes to bridge the “network-awareness” gap:
–Enhanced scheduling algorithms and tools to sample multi-domain and multi-layer network status with active/passive measurements
–Algorithms validation with measurement analysis tools for network weather forecasting, anomaly detection, fault-diagnosis
Examples to show Inter-sampling timing needs
–Network Weather Forecasting
–Anomaly Detection
–“Network-awareness” for Real-time Control
Past and Ongoing Efforts (Efforts that we are leveraging and extending)
Data Sources
–OARnet testbed active/passive measurements
–Multi-domain perfSONAR deployments (ESnet monitored sites)
Data Sampling and Analysis Frameworks
–OARnet OnTimeMeasure measurement scheduling algorithms
–NLANR/SLAC anomaly detection algorithms
–UCSD Network Weather Service forecasting algorithms
–OGF NMWG (perfSONAR) Standardization Efforts
–Many academic research papers…
Our R&D Focus Themes
Multi-domain Measurement Scheduling Algorithms
–Measurement schedulers that handle diverse sampling requirements and comply with diverse multi-domain policies
Algorithms Validation with Measurements Analysis
–Network anomaly detectors and network status predictors that are of interest to a variety of users (e.g., bulk data-transfer users, network engineers) and resource adaptation schemes
DOE E-Center Integration
–perfSONAR web-service extensions and new measurement analysis tools (e.g., “OnTimeMeasure”, “OnTimeDetect”, “OnTimePredict”) that request/analyze multi-layer and multi-domain measurements
E-Center Anomaly Detection Vision
Collaboration with Prof. Mike Frey, Bucknell University
Three-level abstraction of performance-anomaly-based adaptation
–Anomaly events can be boons (good) or faults (bad)
–Users actually want “value-neutral” notifications
–E.g., “you have a problem” or “your upgrade yielded x% improved performance”
[Figure: Calyam-Frey collaboration - “Process Control Charts”; labels: Researcher demands, ISP Delivers]
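The “Process Control Charts” idea above can be illustrated with a minimal sketch. It assumes a Shewhart-style chart with limits at the baseline mean plus or minus k standard deviations; this is a generic textbook technique, not necessarily the method used in “OnTimeDetect”. The function name `detect_anomalies`, the parameters `baseline_len` and `k`, and the sample throughput values are all hypothetical. Events are labeled only “above” or “below” the limits, in the spirit of “value-neutral” notifications.

```python
from statistics import mean, stdev

def detect_anomalies(samples, baseline_len=10, k=3.0):
    """Flag samples outside mean +/- k*stdev of an initial baseline window.

    Hypothetical helper for illustration; limits are computed once from
    the first `baseline_len` samples and then held fixed.
    """
    baseline = samples[:baseline_len]
    center = mean(baseline)
    sigma = stdev(baseline)          # sample standard deviation
    upper = center + k * sigma
    lower = center - k * sigma
    events = []
    for i, x in enumerate(samples[baseline_len:], start=baseline_len):
        if x > upper:
            events.append((i, x, "above"))   # possible boon
        elif x < lower:
            events.append((i, x, "below"))   # possible fault
    return events

# Made-up throughput samples (Mbps): 10 baseline points, then 4 new ones.
trace = [100, 101, 99, 100, 102, 98, 100, 101, 99, 100, 100, 60, 101, 140]
print(detect_anomalies(trace))
# prints [(11, 60, 'below'), (13, 140, 'above')]
```

Whether an “above” or “below” event is a boon or a fault depends on the metric (e.g., higher throughput is good, higher delay is bad), which is why the sketch leaves that judgment to the user.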
“OnTimeDetect” v0.1 Demo
[Screenshots: Trace-1, Trace-2]
Ideas for “OnTimeSAT” in E-Center
“OnTimeSAT”: OnTime Sampling and Analysis Toolkit
–“OnTimeMeasure”, “OnTimeDetect”, “OnTimePredict”
End-user toolkit that allows end-to-end performance analysis
–Allows users to specify monitoring objectives and kick off measurements on ESnet paths (if necessary)
–Uses multi-layer measurements from ESnet perfSONAR deployments for analysis such as: network paths monitoring, network weather forecasting, network performance anomaly detection, network-bottleneck fault-location diagnosis
–Reports interesting observations as “metadata”
Forum examples - ESnet’s Net Almanac, Twitter
Ideas for “OnTimeSAT” in E-Center (2)
Toolkit that could extend perfSONAR “Web-Admin” Analysis capabilities
–Add “Analyze” button in addition to existing “Graph” button
Could enable E-Center users to perform online (on current data sets) and offline (on historic data sets) analysis of multi-domain measurements
[Figure labels: Historic Data Set; Current Data Set; Existing “Graph” button; Existing “Graph” generation options; Proposed interactive web-form to adjust Anomaly Detection “Analysis” settings; OnTimeDetect output upon “Analyze” button click; Proposed graph with anomaly events marked]
Thank you for your attention!