The Network Weather Service: A Distributed Resource Performance Forecasting Service for Metacomputing. Rich Wolski, Neil T. Spring, and Jim Hayes.

Presentation transcript:

The Network Weather Service: A Distributed Resource Performance Forecasting Service for Metacomputing
Rich Wolski, Neil T. Spring, and Jim Hayes
Presented by: Mohammad Al-Saeed

Organization
- Introduction
  - Motivation: why the NWS?
  - The NWS: what is the NWS?
  - Related work
- NWS system architecture
  - Design goals
  - System components
- NWS components
- NWS interface
- Conclusion and future work

Motivation
- Searching for the environment that delivers the most
- Dynamic nature of metacomputing environments
- Adaptive applications
  - Adapt to changing environments
  - Knowledge needed for adaptation
- Resource discovery and allocation

The Network Weather Service
- A distributed system for producing short-term deliverable performance forecasts
- Goal: dynamically measure and forecast the performance deliverable at the application level from a set of network resources
- Measurements currently supported:
  - Available fraction of CPU time
  - End-to-end TCP connection time
  - End-to-end TCP network latency
  - End-to-end TCP network bandwidth

Related Work
- TReno: performance at the transport layer using TCP
- Pathchar: bandwidth over a path
- bprobe/cprobe: bottleneck link speed and competing traffic
- Topology-d: uses ping and netperf to find the bandwidth between hosts in a group, then analyzes this data to find a minimum-cost logical topology
- ReMoS: network resource monitoring

NWS System Architecture
Design objectives:
- Scalability: scales to any metacomputing infrastructure
- Predictive accuracy: provides accurate measurements and forecasts
- Non-intrusiveness: shouldn't load the resources it monitors
- Execution longevity: available at all times
- Ubiquity: accessible from everywhere, monitors all resources

System Components
Four different component processes:
- Persistent State (PS) process: handles storage of measurements
- Name Server (NS) process: directory server for the system
- Sensor processes: measure the current performance of different resources
- Forecaster process: predicts the deliverable performance of a resource during a given time

NWS Processes

NWS Components
- Persistent State Management
- Naming Server
- Performance Monitoring: NWS Sensors
  - CPU Sensor
  - Network Sensor
  - Sensor Control
    - Cliques: hierarchy and contention
    - Adaptive time-out discovery
- Forecasting
  - Forecaster and forecasting models
  - Sample forecaster results

Persistent State Management
- All NWS processes are stateless
- The system state (the measurements) is managed by the PS process:
  - Storage and retrieval of measurements
  - Measurements are time-stamped plain-text strings
  - Measurements are written to disk immediately and acknowledged
  - Measurements are stored in a circular queue of tunable size
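The circular queue of time-stamped text records can be sketched as follows. This is only an illustration of the data structure named on the slide, not the NWS implementation: the class name MeasurementStore, its methods, and the "timestamp value" record layout are assumptions, and the sketch keeps records in memory instead of writing them to disk.

```python
from collections import deque
import time


class MeasurementStore:
    """Hypothetical in-memory stand-in for the PS process's circular queue."""

    def __init__(self, capacity):
        # A deque with maxlen drops the oldest record once it is full,
        # which gives circular-queue behaviour with a tunable size.
        self._queue = deque(maxlen=capacity)

    def store(self, value, timestamp=None):
        """Append one measurement as a 'timestamp value' plain-text record."""
        if timestamp is None:
            timestamp = time.time()
        self._queue.append(f"{timestamp:.3f} {value}")
        return True  # stands in for the acknowledgement sent to the caller

    def fetch(self, count):
        """Return up to `count` of the most recent records, oldest first."""
        records = list(self._queue)
        return records[-count:] if count > 0 else []


store = MeasurementStore(capacity=3)
for t, bandwidth in enumerate([4.1, 3.9, 4.4, 5.0]):
    store.store(bandwidth, timestamp=float(t))
recent = store.fetch(10)  # only the three newest records survive
```

With a capacity of 3, the fourth write overwrites the oldest record, so old history ages out without any explicit cleanup.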

Naming Server
- Primitive text-string directory service for the NWS system
- The only component known system-wide
- Information stored includes:
  - Name-to-IP binding information
  - Group configuration
  - Parameters for various processes
- Each process must refresh its registration with the name server periodically
- Centralized
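One way to picture the periodic refresh requirement is a directory whose entries lapse unless they are re-registered. The sketch below assumes exactly that; the slide does not say what happens to a registration that is not refreshed, and the NameServer class, its methods, and the ttl_seconds parameter are hypothetical names chosen for illustration.

```python
class NameServer:
    """Hypothetical directory: name -> address bindings that expire."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._entries = {}  # name -> (address, expiry time)

    def register(self, name, address, now):
        """Create or refresh a binding; it stays valid for ttl seconds."""
        self._entries[name] = (address, now + self.ttl)

    def lookup(self, name, now):
        """Return the address, or None if unknown or no longer refreshed."""
        entry = self._entries.get(name)
        if entry is None or entry[1] <= now:
            self._entries.pop(name, None)  # assumption: stale bindings are dropped
            return None
        return entry[0]


ns = NameServer(ttl_seconds=60)
ns.register("sensor-a", "192.0.2.10", now=0)
alive = ns.lookup("sensor-a", now=30)          # still registered
ns.register("sensor-a", "192.0.2.10", now=50)  # periodic refresh
still_alive = ns.lookup("sensor-a", now=100)   # refresh extended the binding
expired = ns.lookup("sensor-a", now=200)       # no refresh since t=50
```

Passing the clock value in as `now` keeps the example deterministic; a real process would read the system clock.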

Performance Monitoring
- Actual monitoring is performed by a set of sensors
- Trade-off: accuracy vs. intrusiveness
- A sensor's life:
    {
      Register with the NS;
      Query the NS for parameters;
      Generate conditional test;
      Forever {
        if conditions are met then {
          perform test;
          time-stamp results and send them to the PS;
          refresh registration with the NS;
        }
      }
    }
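The sensor life cycle on this slide can be sketched in Python as below. Everything outside the slide's steps is an assumption made to keep the example runnable: StubNameServer, run_sensor, the "period" parameter, and the choice of a simple periodic condition are hypothetical, the PS is reduced to a list, and the endless loop is replaced by a finite sequence of clock ticks. As on the slide, the registration refresh happens inside the branch that runs the test.

```python
class StubNameServer:
    """Hypothetical stand-in for the NS: counts registrations, serves parameters."""

    def __init__(self, parameters):
        self._parameters = parameters
        self.registrations = 0

    def register(self, name):
        self.registrations += 1

    def query_parameters(self, name):
        return self._parameters


def run_sensor(name, ns, ps_records, probe, ticks):
    """Follow the slide's life cycle over a finite sequence of clock ticks."""
    ns.register(name)                    # Register with the NS
    params = ns.query_parameters(name)   # Query the NS for parameters
    period = params["period"]            # assumed condition: test every `period` ticks
    for now in ticks:                    # "Forever" in a real sensor
        if now % period == 0:            # conditions are met
            result = probe()             # perform test
            ps_records.append((now, result))  # time-stamp and send to the PS
            ns.register(name)            # refresh registration with the NS


ns = StubNameServer({"period": 10})
records = []
run_sensor("cpu-sensor", ns, records, probe=lambda: 0.75, ticks=range(0, 30, 5))
```

The run above visits ticks 0, 5, ..., 25 and tests at 0, 10, and 20, so the stub sees one initial registration plus three refreshes.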

CPU Sensor
- Measures the available CPU fraction
- Testing tools:
  - Unix uptime: reports the load average over the past x minutes
  - Unix vmstat: reports idle, user, and system time
  - Active probes
- Accuracy:
  - Results assume a full-priority job
  - Doesn't know the priority of jobs in the queue
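To make the step from a load average to an available CPU fraction concrete, here is a minimal sketch. The formula is an assumption, not something stated on the slide: it supposes that a new full-priority job would share the processor equally with the processes already counted in the load average. The function name is hypothetical.

```python
def available_cpu_fraction(load_average):
    """Estimate the CPU share a new full-priority job would receive.

    Assumed model: the job time-shares equally with `load_average`
    runnable processes, so it gets 1 / (load_average + 1) of the CPU.
    """
    if load_average < 0:
        raise ValueError("load average cannot be negative")
    return 1.0 / (load_average + 1.0)


idle_machine = available_cpu_fraction(0.0)   # nothing else is running
busy_machine = available_cpu_fraction(3.0)   # three competing processes
```

The equal-sharing assumption is where the accuracy caveat on the slide comes in: if the competing jobs run at a different priority, the real share differs from this estimate.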

Active Probing Improvements
[Figure: two panels, measurements produced using uptime and measurements produced using vmstat]

Network Sensor
- Carries out network-related measurements
- Testing: uses active network probes
  - Establish and release TCP connections
  - Move a large amount of data to measure bandwidth, a small amount to measure delay
- Measures connections with all peer sensors
- Problems:
  - Accuracy: depends on the socket interface
  - Complexity: N^2 - N tests, collisions (contention)

Network Sensor Control
- Sensors are organized into sensor sets called cliques
- Each clique is configurable and has one leader
- Clique sets are logical, but can be based on physical topology
- Leaders are elected using a distributed election protocol
- A sensor can participate in many cliques
- Advantages:
  - Scalability, by organizing cliques in a hierarchy
  - Reduces the N^2 - N test count
  - Accuracy, through more frequent tests
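A short calculation shows how a hierarchy cuts the N^2 - N test count. The sketch assumes a two-level layout in which every clique tests all ordered pairs of its own members and one representative per clique joins a single upper-level clique; the slide does not fix these details, and the function names are hypothetical.

```python
def pairwise_tests(n):
    """Ordered sensor pairs in one clique of n members: N^2 - N."""
    return n * n - n


def hierarchical_tests(clique_sizes):
    """Tests inside each clique plus tests among one representative per clique."""
    inside = sum(pairwise_tests(size) for size in clique_sizes)
    between = pairwise_tests(len(clique_sizes))
    return inside + between


flat = pairwise_tests(20)                     # one clique of 20 sensors
tiered = hierarchical_tests([5, 5, 5, 5])     # four cliques of 5, plus a top clique of 4
```

For 20 sensors the flat arrangement needs 20^2 - 20 = 380 tests per round, while the tiered one needs 4 x 20 + 12 = 92, at the cost of not measuring every host pair directly.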

Clique Hierarchy
[Figure: clique hierarchy with nodes labeled National, UTenn, SDSC, PCL, UCSD]

Contention
- Each leader maintains a clique token (and the time between tokens)
- The sensor that holds the token performs all its tests, then passes the token to the next sensor in the list
Adaptive time-out discovery
- Tokens have a time-out field
- Tokens have sequence numbers
- The leader adaptively controls the time-out
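The token scheme can be sketched as a loop in which only the current holder runs its tests, together with a rule the leader might use to adapt the time-out. The time-out rule here (mean circulation time plus a multiple of its standard deviation) is purely an assumption for illustration, as the slide does not describe how the leader adapts it; both function names and the slack and floor parameters are hypothetical.

```python
from statistics import mean, pstdev


def circulate_token(members, run_tests, sequence):
    """Pass one token around the clique; only the holder runs its tests."""
    visits = []
    for holder in members:
        run_tests(holder)                 # holder performs all its tests
        visits.append((sequence, holder)) # then hands the token onward
    return visits


def next_timeout(circulation_times, slack=2.0, floor=1.0):
    """Assumed rule: mean circulation time plus `slack` standard deviations."""
    if not circulation_times:
        return floor
    estimate = mean(circulation_times) + slack * pstdev(circulation_times)
    return max(floor, estimate)


tested = []
visits = circulate_token(["a", "b", "c"], tested.append, sequence=7)
steady = next_timeout([10, 10, 10])   # no variation: time-out equals the mean
jittery = next_timeout([8, 12])       # variation widens the time-out
```

Because tests run one holder at a time, probes from sensors in the same clique never overlap, which is the contention the token is meant to prevent.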

Forecaster Process
- A forecasting driver and a set of compile-time prediction modules
- Forecasting process:
  - Fetch the required measurements from the PS
  - Pass the time series to each prediction module
  - Choose the best returned prediction
- Incorporate sophisticated prediction techniques?
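The driver-plus-modules arrangement can be sketched as below. The slide does not say how "best" is judged, so this sketch assumes it means the module with the lowest cumulative absolute error on the measurements seen so far; the three predictors and all names are illustrative choices, not the NWS module set.

```python
from statistics import mean, median


def last_value(history):
    return history[-1]


def running_mean(history):
    return mean(history)


def sliding_median(history, window=5):
    return median(history[-window:])


# Hypothetical module table; a real forecaster would register its own.
PREDICTORS = {
    "last_value": last_value,
    "running_mean": running_mean,
    "sliding_median": sliding_median,
}


def forecast(series):
    """Return (module name, prediction) from the module with the lowest past error."""
    errors = {name: 0.0 for name in PREDICTORS}
    # Replay the series: each module predicts point i from points 0..i-1.
    for i in range(1, len(series)):
        for name, predict in PREDICTORS.items():
            errors[name] += abs(predict(series[:i]) - series[i])
    best = min(errors, key=errors.get)
    return best, PREDICTORS[best](series)


best_module, prediction = forecast([1, 2, 3, 4, 5, 6])
```

On this steadily rising series the last-value module accumulates the smallest error, so the driver returns its prediction; on a noisy but stationary series an averaging module would tend to win instead.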

Sample Forecaster Results
[Figure: two panels, recorded bandwidth and forecasted bandwidth, UC Santa Barbara – Kansas State U.]

NWS Interface
- C API
  - Quick short-term forecasts for applications
  - InitForecaster()
  - RequestForecasts()
- CGI interface
  - Continuous access to NWS forecasts through the web
  - Interactively produces graphs for performance and forecasts

Sample CGI-Generated Graph

Conclusion and Future Work
- NWS is scalable, stable, and always available
- NWS relies on adaptivity to achieve its design goals
- NWS is open (sensors and forecasting models can be added)
- Current forecasting compares well with more powerful, sophisticated forecasting techniques
- Enhancements:
  - Basing the NS on LDAP
  - Automatic clique configuration
  - Forecasting methodologies