California Institute of Technology


An Agent Based, Dynamic Service System to Monitor, Control and Optimize Distributed Systems
Iosif Legrand, California Institute of Technology
March 2006

The MonALISA Framework
MonALISA is a dynamic, distributed service system capable of collecting any type of information from different systems, analyzing it in near real time, and supporting automated control decisions and global optimization of workflows in complex grid systems. The MonALISA system is designed as an ensemble of autonomous, multi-threaded, self-describing agent-based subsystems that are registered as dynamic services and are able to collaborate and cooperate in performing a wide range of monitoring tasks. These agents can analyze and process the information in a distributed way and provide optimization decisions in large-scale distributed applications.

MonALISA is a Dynamic, Distributed Service Architecture
The framework is based on a hierarchical structure of loosely coupled agents acting as distributed services. These are independent, autonomous entities able to discover one another and to cooperate using a dynamic set of proxies or self-describing protocols. An agent-based architecture makes it possible to invest the system with increasing degrees of intelligence, to reduce complexity, and to make global systems manageable in real time. For an effective use of distributed resources, these services provide adaptability and self-organization.

The MonALISA Discovery System & Services
Fully distributed system with no single point of failure.
Layers shown in the diagram:
  Global services or clients: clients, HL services, repositories.
  Proxies: dynamic load balancing, scalability and replication, security (AAA) for clients.
  MonALISA services (agents): distributed system for gathering and analyzing information.
  Network of JINI-LUSs (secure and public): distributed dynamic discovery, based on a lease mechanism and REN.
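
The lease mechanism named on this slide can be illustrated with a small sketch. It is not MonALISA or JINI code: the class name `LeaseRegistry`, its methods, and the 30-second default are assumptions made for illustration. The idea shown is only that a registration expires unless the service keeps renewing it, so a crashed service drops out of discovery without any central cleanup.

```python
import time


class LeaseRegistry:
    """Toy lookup service (hypothetical): entries expire unless renewed."""

    def __init__(self, lease_seconds=30.0, clock=time.monotonic):
        self.lease_seconds = lease_seconds
        self.clock = clock          # injectable so the behavior can be tested
        self._expiry = {}           # service name -> lease expiry time

    def register(self, name):
        """Register a service and return the time at which its lease expires."""
        expiry = self.clock() + self.lease_seconds
        self._expiry[name] = expiry
        return expiry

    def renew(self, name):
        """Extend a lease that is still valid; expired leases must re-register."""
        if name not in self.lookup():
            raise KeyError(f"no valid lease for {name!r}")
        return self.register(name)

    def lookup(self):
        """Return the names of all services whose lease has not expired."""
        now = self.clock()
        # Forget expired entries so dead services disappear on their own.
        self._expiry = {n: t for n, t in self._expiry.items() if t > now}
        return sorted(self._expiry)
```

A service would call `register` once and then `renew` periodically, at an interval shorter than the lease. Clients only ever see services that renewed recently.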

Monitoring Grid Sites, Running Jobs, Network Traffic and Topology
Panel shown: Accounting.

Monitoring OSG: Resources, Jobs & Accounting
Panels shown: Running Jobs, Accounting.
42 sites, ~4,000 nodes (10,000 CPUs), thousands of jobs, 60,000 parameters.

Monitoring CMS Jobs
Panels shown: Running Jobs per Site, Type of Jobs, Rate for Processing Events per Site, Integrated No. of Events.

FTP Data Transfer between GRID Sites
Panel shown: Total FTP Traffic per VO.

Monitoring the Internet2 Backbone Network
Test for a Land Speed Record: ~7 Gb/s in a single TCP stream from Geneva to Caltech.

The UltraLight Network
Panel shown: BNL ESnet IN/OUT.

Monitoring the GLORIAD Ring

Monitoring Network Topology, Latency, Routers
Panels shown: Networks, AS, Routers.

Bandwidth Challenge at SC2005
151 Gb/s; ~500 TB total in 4 h.

Available Bandwidth Measurements
One-to-one real-time bandwidth estimation.

Monitoring the Execution of Jobs and the Time Evolution
Panels shown: Split Jobs, Lifelines for Jobs.
Diagram: submit a job DAG (Job, Job1, Job2, Job3, Job31, Job32).

ApMon: Application Monitoring
A library of APIs (C, C++, Java, Perl, Python) that can be used to send any information to MonALISA services.
Flexibility, dynamic configuration, high communication performance.
Features: automated system monitoring, accounting information, and dynamic reloading of the ApMon configuration, which is generated automatically by a servlet / CGI script.
Diagram: each application uses ApMon to send monitoring data over UDP/XDR to the MonALISA services running on the MonALISA hosts. No lost packets.
Example data shown in the diagram:
  Application monitoring: parameter1: value, parameter2: value, ..., tagged with Time; IP; procID.
  Application monitoring: Mbps_out: 0.52, Status: reading, MB_inout: 562.4.
  System monitoring: load1: 0.24, processes: 97, pages_in: 83.
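
To make the "monitoring data over UDP/XDR" idea concrete, here is a minimal sketch in Python. It does not use the ApMon library and does not reproduce ApMon's actual packet layout, which the slide does not describe. The datagram layout (cluster name, node name, parameter count, then name/value pairs), the function names, and the restriction to numeric values are all assumptions for illustration. The XDR rules applied are the standard ones: big-endian integers and doubles, and strings stored as a 4-byte length followed by the bytes, zero-padded to a multiple of four.

```python
import socket
import struct


def xdr_int(value: int) -> bytes:
    """XDR signed 32-bit integer: 4 bytes, big-endian."""
    return struct.pack(">i", value)


def xdr_double(value: float) -> bytes:
    """XDR double: 8 bytes, IEEE 754, big-endian."""
    return struct.pack(">d", value)


def xdr_string(text: str) -> bytes:
    """XDR string: 4-byte length, then the bytes, zero-padded to 4 bytes."""
    data = text.encode("utf-8")
    padding = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * padding


def encode_datagram(cluster: str, node: str, params: dict) -> bytes:
    """Hypothetical layout: cluster, node, count, then (name, double) pairs."""
    parts = [xdr_string(cluster), xdr_string(node), xdr_int(len(params))]
    for name, value in params.items():
        parts.append(xdr_string(name))
        parts.append(xdr_double(float(value)))  # numeric values only here
    return b"".join(parts)


def send_parameters(host: str, port: int, cluster: str, node: str,
                    params: dict) -> int:
    """Send one datagram; UDP is fire-and-forget, so no reply is awaited."""
    payload = encode_datagram(cluster, node, params)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(payload, (host, port))
```

With the system-monitoring values from the slide, a call could look like `send_parameters(host, port, "MyCluster", "node01", {"load1": 0.24, "processes": 97, "pages_in": 83})`, where the host, port, cluster and node names are placeholders. Because UDP needs no connection setup and no acknowledgement, the application is not blocked by the monitoring path, which fits the slide's emphasis on high communication performance.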

LISA: Localhost Information Service Agent
End user / client agent.
  Authorization.
  Service discovery.
  Local detection of the hardware and software configuration.
  Complete end-system monitoring: per-process load, I/O and network throughput, etc.
  End-to-end performance measurements.
  Will act as an active listener for all events related to the requests generated by its local applications.

MonALISA Agents to Create on Demand an Optical Path or Tree
Time to create a path on demand: < 1 s, independent of the location and the number of connections.
Diagram (steps numbered 1 to 4):
  An end host runs an ML Demon and issues: >ml_path IP1 IP4 "copy file IP4"
  Discovery and secure connection between the ML Demon and the MonALISA ML agents.
  Each MonALISA ML agent controls and monitors an optical switch.
  ML proxy services are used in agent communication.
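
The slide does not say how the agents decide which switches a path should traverse. Purely as an illustration, the sketch below assumes the agents share a view of the topology as an adjacency map and pick the path with the fewest hops using breadth-first search. The function name `find_path` and the switch names are hypothetical; only the end-point names `IP1` and `IP4` come from the slide.

```python
from collections import deque


def find_path(links, source, target):
    """Return a fewest-hops path from source to target, or None if none exists.

    `links` maps each node (host or optical switch) to its neighbours.
    """
    previous = {source: None}       # also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the predecessor chain back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in links.get(node, []):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


# Hypothetical topology: two hosts attached to a triangle of switches.
LINKS = {
    "IP1": ["SW1"],
    "SW1": ["IP1", "SW2", "SW3"],
    "SW2": ["SW1", "SW3"],
    "SW3": ["SW1", "SW2", "IP4"],
    "IP4": ["SW3"],
}
```

Here `find_path(LINKS, "IP1", "IP4")` returns `["IP1", "SW1", "SW3", "IP4"]`. In a real deployment each switch on the returned path would then be asked, through its agent, to set up the corresponding cross-connect.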

Monitoring and Controlling Optical Planes
Panel shown: port power monitoring.

Monitoring Optical Switches
Agents to create on demand an optical path.

The Functionality of the VINCI System
Diagram: MonALISA ML agents at Site A, Site B and Site C communicate through ML proxy services. Agents operate at each network layer:
  Layer 3: routers.
  Layer 2: Ethernet, LAN-PHY or WAN-PHY.
  Layer 1: DWDM, fiber.

Vertical Integration of Services
"On-demand", dynamic and self-adapting use of networking resources.
Real-time correlations and feedback between the major layers are crucial for dynamic load balancing, adaptability and self-organization.
Layers shown in the diagram:
  User.
  Applications (Job, Job1, Job2, Job3, Job31, Job32).
  Farms & data servers.
  Networking (MPLS/GMPLS/TL1).
Note: Grid services will have to interact and negotiate with network services for network "resources".

Communities Using MonALISA
Major communities: OSG, CMS, ALICE, D0, STAR, VRVS, LGC RUSSIA, SE Europe GRID, APAC Grid, UNAM Grid, ABILENE, ULTRALIGHT, GLORIAD, LHC Net, RoEduNET.
MonALISA:
  Running 24 x 7 at 250 sites.
  Collecting 250,000 parameters in near real time.
  25,000 parameter updates per second.
  Monitoring 12,000 computers and > 100 WAN links.
  Thousands of grid jobs running concurrently.
Demonstrated at: SC2003, Telecom 2003, WSIS 2003, SC 2004, I2 2005, TERENA 2005, IGrid 2005, SC 2005, CENIC 2006.
Panels shown: ABILENE, CMS-DC04, GRID3, VRVS, ALICE.

The MonALISA Architecture Provides:
  Distributed registration and discovery for services and applications.
  Monitoring of all aspects of complex systems:
    System information for computer nodes and clusters.
    Network information: WAN and LAN.
    The performance of applications, jobs or services.
    End-user systems and their performance.
    Video streaming.
  Interaction with any other services to provide customized information, based on monitoring data, in near real time.
  Secure, remote administration for services and applications.
  Agents to supervise applications, trigger alarms, restart or reconfigure them, and notify other services when certain conditions are detected.
  A framework for developing higher-level decision services, implemented as a distributed network of communicating agents, to perform global optimization tasks.
  Graphical user interfaces to visualize complex information.
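
The supervising agents mentioned above, which trigger alarms when certain conditions are detected, can be sketched as follows. This is not the MonALISA agent API: the class `SupervisorAgent`, its parameters, and the rule "alarm after N consecutive samples above a threshold" are assumptions chosen to show the pattern. The callback stands in for whatever action a real agent would take, such as restarting an application or notifying another service.

```python
class SupervisorAgent:
    """Hypothetical agent: raise an alarm after repeated threshold breaches."""

    def __init__(self, parameter, threshold, consecutive, on_alarm):
        self.parameter = parameter      # name of the monitored parameter
        self.threshold = threshold      # values above this count as breaches
        self.consecutive = consecutive  # breaches in a row needed to alarm
        self.on_alarm = on_alarm        # callback: on_alarm(parameter, value)
        self._breaches = 0
        self.alarm_active = False

    def update(self, parameter, value):
        """Feed one monitoring sample; return True if an alarm was triggered."""
        if parameter != self.parameter:
            return False
        if value > self.threshold:
            self._breaches += 1
        else:
            # A normal sample clears both the streak and any active alarm.
            self._breaches = 0
            self.alarm_active = False
        if self._breaches >= self.consecutive and not self.alarm_active:
            self.alarm_active = True    # fire once per incident, not per sample
            self.on_alarm(parameter, value)
            return True
        return False
```

Requiring several consecutive breaches is one simple way to avoid reacting to a single noisy sample. The alarm is raised once and is re-armed only after the parameter returns below the threshold.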