A Mobile-Agent-Based Performance-Monitoring System at RHIC Richard Ibbotson.


2 Overview
- Motivation for a new monitoring system
- Design of the Instrumentation system
  - Use of mobile agents (mobile programs vs. remote procedures)
  - How it works, what it does and doesn’t do
- Practical experiences with a test instrument
  - What works well and what doesn’t
- Future enhancements

3 Monitoring System Purpose
The system should:
- Provide performance monitoring at service level
  - “End-to-end” tests yielding mixed information on the functioning of several services
  - Track performance changes during configuration changes
- Monitor current health of system
- Provide some error-tracking/reporting capabilities
- Be a tool for administrators & experimenters
It will not:
- Provide detailed system information for fault diagnosis (system-specific, vendor-supplied tools already exist)

4 Desired Features of the System
- View / compare past and current measurements
- Inspect correlations between metrics
- Allow variation of sampling rate
  - Automatically execute scheduled measurements
  - Can perform measurements on demand at shorter intervals
- Perform OS-independent measurements
- Use a small fraction of available resources
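The scheduling behaviour listed above (automatic scheduled measurements, plus on-demand measurements at shorter intervals) can be sketched as follows. This is a minimal Python illustration, not the system's code: `ScheduleEntry`, its fields, and `due_entries` are hypothetical names, and times are plain integer seconds.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class ScheduleEntry:
    """One row of a hypothetical schedule table (all field names are assumptions)."""
    instrument: str
    host: str
    interval_s: int   # sampling interval in seconds
    last_run_s: int   # time of the previous measurement, in seconds


def due_entries(schedule: List[ScheduleEntry], now_s: int) -> List[ScheduleEntry]:
    """Return the entries whose sampling interval has elapsed."""
    return [e for e in schedule if now_s - e.last_run_s >= e.interval_s]


schedule = [
    ScheduleEntry("nfs_write", "linux-a", interval_s=3600, last_run_s=0),
    ScheduleEntry("nfs_write", "solaris-c", interval_s=3600, last_run_s=1800),
]

# Regular schedule: only the entry whose hour has elapsed is due.
due_now = [e.host for e in due_entries(schedule, now_s=3600)]

# On-demand run at a shorter interval: shorten the second entry to 10 minutes.
faster = [schedule[0], replace(schedule[1], interval_s=600)]
due_faster = [e.host for e in due_entries(faster, now_s=3600)]
```

Keeping the interval as data rather than code is what lets the sampling rate vary without touching the Instrument itself.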

5 Components of the System
- “Instruments” which perform measurements
- Centralized database of Instruments (code) and time-stamped results
  - Allows simple addition of new metrics
  - Allows previously run tests to be reproduced
- Mechanism for remote execution of Instruments
  - IBM “Aglets” mobile-agent system
[Diagram labels: code, monitor, sequence of measurements, parameters]

6 Mobile Agents vs. RPC
- Remote Procedure Call
  - A pre-defined procedure on remote host executes and returns result
- Mobile Agent
  - Daemon on remote host accepts agent and allows execution
  - Increased network load for large agents
[Diagram labels, both panels: User’s system, Remote system, Search request, Local search utility, Dataset to search]
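The contrast on this slide can be illustrated with a small Python sketch. It is only an in-process analogy: `RpcHost`, `AgentHost`, and `count_long_entries` are hypothetical names, and a real mobile-agent system would serialize the agent and transfer it over the network (the source of the extra network load for large agents).

```python
from typing import Callable, Dict, List

DATASET = ["alpha", "beta", "gamma", "alphabet"]  # data living on the remote host


class RpcHost:
    """Remote host that exposes only procedures defined ahead of time."""

    def __init__(self, data: List[str]) -> None:
        self._data = data
        self._procedures: Dict[str, Callable[[str], List[str]]] = {
            "search": lambda term: [d for d in self._data if term in d],
        }

    def call(self, name: str, arg: str) -> List[str]:
        if name not in self._procedures:
            raise LookupError(f"no such remote procedure: {name}")
        return self._procedures[name](arg)


class AgentHost:
    """Remote host whose daemon accepts an agent and lets it run next to the data."""

    def __init__(self, data: List[str]) -> None:
        self._data = data

    def accept(self, agent: Callable[[List[str]], object]) -> object:
        return agent(self._data)


def count_long_entries(data: List[str]) -> int:
    """An 'agent' written after the host was deployed: a metric nobody pre-defined."""
    return sum(1 for d in data if len(d) > 5)


rpc_result = RpcHost(DATASET).call("search", "alpha")
agent_result = AgentHost(DATASET).accept(count_long_entries)
```

With RPC, a new kind of query needs a change on the remote host; with the agent, the new logic travels with the request.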

7 Advantages of Mobile Agents
- Metrics can be defined at any time, and implemented on the central host
- Performance is measured on the relevant host
- Aglets system is Java-based, providing platform-independent execution
- Sophisticated security model exists for restricting actions of the agents

8 Use of Mobile Agents in Monitoring
- Simplest approach, “Single-Remote-Host”, was implemented for initial configuration
- Waiting between tests is done on central server for reliability
[Diagram: “Itinerary” approach and “Single Remote Host” approach, each showing the central server and target hosts]
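The two mobility patterns named on this slide can be compared by listing the hops an agent makes. The Python sketch below is an illustration under assumed host names; the function names are hypothetical and no agent is actually transferred.

```python
from typing import List


def single_remote_host(central: str, targets: List[str]) -> List[List[str]]:
    """One round trip per target: central -> target -> central."""
    return [[central, t, central] for t in targets]


def itinerary(central: str, targets: List[str]) -> List[List[str]]:
    """One circuit through every target before returning to central."""
    return [[central, *targets, central]]


def transfers_touching(paths: List[List[str]], host: str) -> int:
    """Count hops that start or end at the given host."""
    return sum(
        1
        for path in paths
        for a, b in zip(path, path[1:])
        if host in (a, b)
    )


hosts = ["linux-a", "linux-b", "solaris-c"]  # hypothetical target hosts
single_paths = single_remote_host("central", hosts)
itinerary_paths = itinerary("central", hosts)

single_load = transfers_touching(single_paths, "central")
itinerary_load = transfers_touching(itinerary_paths, "central")
```

For three targets the single-remote-host pattern involves the central server in six transfers and the itinerary in two, at the cost of the agent depending on every host along the circuit.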

9 Anatomy of an Instrument
[Class diagram with two “Inherits from” arrows; classes and methods shown:]
- Aglet: onCreation(), run(), ...
- Instrument: loadParams(), storeResult(), ...
- SpecificInstrument: onMeasuring(), onInstrumentCreation(), ...
- Result: storeInDB(), setInvalid(), ...
- MobilityPattern: startTrip(), nextTransfer(), ...
- StatusUpdater: registerWithMonitor(), updateMonitor(), ...
- ParameterList: loadParams(), getValue(key), ...
The code defining a specific implementation of an Instrument is ≈30 lines
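A rough Python sketch of how such a hierarchy keeps a specific Instrument short. It assumes the inheritance chain SpecificInstrument -> Instrument -> Aglet, which the flattened diagram does not state explicitly. The `Aglet` class here is a local stand-in, not the IBM Aglets API, the method names are snake_case renderings of those on the slide, and `StringLengthInstrument` is a made-up example metric.

```python
from typing import Any, Dict, List, Optional


class Aglet:
    """Stand-in for a mobile-agent base class (NOT the real Aglets API)."""

    def on_creation(self, init: Any) -> None:
        pass

    def run(self) -> None:
        pass


class ParameterList:
    """Holds the parameters for one measurement."""

    def __init__(self, values: Dict[str, Any]) -> None:
        self._values = dict(values)

    def get_value(self, key: str) -> Any:
        return self._values[key]


class Instrument(Aglet):
    """Generic plumbing shared by all Instruments."""

    def __init__(self) -> None:
        self.params: Optional[ParameterList] = None
        self.results: List[Any] = []

    def on_creation(self, init: Dict[str, Any]) -> None:
        self.load_params(init)
        self.on_instrument_creation()

    def load_params(self, values: Dict[str, Any]) -> None:
        self.params = ParameterList(values)

    def store_result(self, value: Any) -> None:
        # The slide's system stores results in a database; a list stands in here.
        self.results.append(value)

    def run(self) -> None:
        self.store_result(self.on_measuring())

    def on_instrument_creation(self) -> None:
        pass  # optional hook for subclasses

    def on_measuring(self) -> Any:
        raise NotImplementedError


class StringLengthInstrument(Instrument):
    """A specific Instrument only has to say what to measure."""

    def on_measuring(self) -> int:
        return len(self.params.get_value("text"))


inst = StringLengthInstrument()
inst.on_creation({"text": "RHIC"})
inst.run()
```

Because the base class owns parameter loading and result storage, the subclass is a handful of lines, in the spirit of the slide's short specific implementations.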

10 Test Instrument: File Access
- NFS access time (write) used as test of concept
- File size, location (file-system) are passed as parameters in database (specified at run-time)
- Measurements are started by automated process as specified by Schedule table in database
- Tested access to one file-system on several client computers:
  - Linux (PIII) system with NFSv2, 1KB blocksize
  - Linux (PIII) system with NFSv2, 8KB blocksize
  - Linux (PIII) system with NFSv3
  - Solaris system with NFSv3
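A write-timing measurement of this kind might look like the following Python sketch. It is an assumption-laden illustration, not the Instrument from the talk: `time_file_write` is a hypothetical name, the target directory is the local temp directory rather than an NFS mount, and `block_size` here is the application's write size, which is not the same thing as the NFS mount block size compared on the slide.

```python
import os
import tempfile
import time
from typing import Tuple


def time_file_write(directory: str, size_bytes: int, block_size: int) -> Tuple[float, int]:
    """Write size_bytes to a temporary file in directory; return (seconds, bytes written)."""
    block = b"\0" * block_size
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            remaining = size_bytes
            while remaining > 0:
                chunk = block[: min(block_size, remaining)]
                f.write(chunk)
                remaining -= len(chunk)
            f.flush()
            os.fsync(f.fileno())  # include the time to push data out of local buffers
        elapsed = time.perf_counter() - start
        written = os.path.getsize(path)
    finally:
        os.remove(path)  # leave the file-system as it was found
    return elapsed, written


# File size and location would come from the parameter set; these values are examples.
elapsed, written = time_file_write(tempfile.gettempdir(), 64 * 1024, 8 * 1024)
```

Pointing `directory` at a path on the file-system under test is what would turn this into an NFS measurement.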

11 Report Generation Tool
- Sample tests are carried out automatically by a “Scheduler” Aglet
- Reports are requested via an HTML form. Users specify a test-type, parameter-set and target host. A Perl CGI script queries the database and plots results using Gnuplot.
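The query step of such a report can be sketched in Python with an in-memory SQLite database. The talk's tool is a Perl CGI script; everything below (the `results` table, its columns, the sample rows, and the function names) is hypothetical and only shows the shape of the selection by test-type, parameter-set and target host.

```python
import sqlite3
from typing import List, Tuple

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results ("
    "test_type TEXT, param_set TEXT, host TEXT, taken_at TEXT, seconds REAL)"
)
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?, ?, ?)",
    [
        # Made-up rows for illustration only.
        ("nfs_write", "1MB", "linux-a", "2001-03-01T02:00:00", 0.75),
        ("nfs_write", "1MB", "linux-a", "2001-03-01T01:00:00", 0.5),
        ("nfs_write", "1MB", "solaris-c", "2001-03-01T01:00:00", 0.25),
    ],
)


def report_series(
    db: sqlite3.Connection, test_type: str, param_set: str, host: str
) -> List[Tuple[str, float]]:
    """Return time-ordered (timestamp, value) pairs for one report request."""
    cur = db.execute(
        "SELECT taken_at, seconds FROM results "
        "WHERE test_type = ? AND param_set = ? AND host = ? "
        "ORDER BY taken_at",
        (test_type, param_set, host),
    )
    return cur.fetchall()


def to_plot_data(series: List[Tuple[str, float]]) -> str:
    """Format the series as whitespace-separated columns, one sample per line."""
    return "\n".join(f"{taken_at} {value}" for taken_at, value in series)


series = report_series(conn, "nfs_write", "1MB", "linux-a")
plot_data = to_plot_data(series)
```

Two-column text of this kind is the usual input for a plotting tool; parameterized queries keep form input out of the SQL text.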

12 Sample Report for File Access
[Plot annotations: Nightly backups; Weekly de-frag]
- Results indicate server load, client config

13 Problems With the Mobile Agents
- Transfer interrupted when several agents move to / from the same host within ≈1-2 sec
  - Small size of Aglets currently used (≈15KB) cannot explain the effective dead-time
  - The failure is presented to the Aglet as a refusal (can detect, wait and retry)
  - Congestion at central host can be relieved by following a “circuit” before returning (multiple hosts)
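The "detect, wait and retry" handling mentioned above can be sketched in Python. This is a generic retry loop, not Aglets code: `TransferRefused`, `dispatch_with_retry`, and the default of five attempts are assumptions, and the 2-second wait is chosen only because the slide reports collisions within roughly 1-2 seconds.

```python
import time
from typing import Callable, List


class TransferRefused(Exception):
    """Hypothetical signal that the destination host refused the transfer."""


def dispatch_with_retry(
    dispatch: Callable[[], None],
    attempts: int = 5,
    wait_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call dispatch until it succeeds; return the number of attempts used."""
    for attempt in range(1, attempts + 1):
        try:
            dispatch()
            return attempt
        except TransferRefused:
            if attempt == attempts:
                raise  # give up and let the caller report the failure
            sleep(wait_seconds)
    raise ValueError("attempts must be at least 1")


# Demonstration with a transfer that is refused twice, then accepted.
outcomes = [False, False, True]
waits: List[float] = []


def flaky_dispatch() -> None:
    if not outcomes.pop(0):
        raise TransferRefused("host busy")


attempts_used = dispatch_with_retry(flaky_dispatch, sleep=waits.append)
```

Injecting `sleep` keeps the demonstration instantaneous; in real use the default `time.sleep` would apply the wait.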

14 Future System Development
- Solve transfer interruption problem
- Development of other mobility patterns
  - NFS read-access may be tested by writing on one host and timing a read on a different host (to avoid caching)
  - Use of “itinerary” can ease network congestion at the central server
- A tracking / error-reporting system is being developed, and will be connected to a paging system

15 Summary
- Initial implementation is proving useful
- Mobile agent architecture adds design work but eases implementation, adds flexibility
- Transfer interruption causing scalability problems, but not insurmountable
- Plan to have expanded system running before data-taking begins

Thanks to…
- David Stampf, BNL
- Tom Throwe, BNL
- Bruce Gibbard, BNL
Questions...
Richard Ibbotson, BNL