1 California Institute of Technology
An Agent-Based, Dynamic Service System to Monitor, Control and Optimize Distributed Systems. Iosif Legrand, California Institute of Technology. March

2 The MonALISA Framework
MonALISA is a dynamic, distributed service system capable of collecting any type of information from different systems, analyzing it in near real time, and providing support for automated control decisions and global optimization of workflows in complex grid systems. The MonALISA system is designed as an ensemble of autonomous, multi-threaded, self-describing agent-based subsystems which are registered as dynamic services and are able to collaborate and cooperate in performing a wide range of monitoring tasks. These agents can analyze and process the information in a distributed way and provide optimization decisions in large-scale distributed applications.

3 MonALISA is a Dynamic, Distributed Service Architecture
The framework is based on a hierarchical structure of loosely coupled agents acting as distributed services. These are independent and autonomous entities, able to discover one another and to cooperate using a dynamic set of proxies or self-describing protocols. An agent-based architecture makes it possible to give the system increasing degrees of intelligence, to reduce complexity, and to make global systems manageable in real time. For an effective use of distributed resources, these services provide adaptability and self-organization.

4 The MonALISA Discovery System & Services
Fully distributed system with no single point of failure. Diagram layers: global services or clients (clients, HL services, repositories); proxies (dynamic load balancing, scalability & replication, security, AAA for clients); MonALISA services and agents (distributed system for gathering and analyzing information); network of JINI-LUSs, secure & public (distributed dynamic discovery based on a lease mechanism and REN).
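The lease mechanism mentioned on this slide can be illustrated with a small sketch. It is not the JINI lookup service API: the class name `LeaseRegistry`, its methods, and the service names are hypothetical, and the clock is injectable only so the example runs deterministically. The idea shown is that a registration disappears from discovery unless its lease is renewed in time.

```python
import time


class LeaseRegistry:
    """Toy lookup service (hypothetical): entries expire unless renewed."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so the example can fake time
        self._expiry = {}        # service id -> absolute expiry time

    def register(self, service_id, lease_seconds):
        """Register a service and grant it a lease of the given duration."""
        self._expiry[service_id] = self._clock() + lease_seconds

    def renew(self, service_id, lease_seconds):
        """Extend a lease that is still valid; expired leases cannot be renewed."""
        if service_id not in self.discover():
            raise KeyError(f"lease for {service_id!r} has expired")
        self.register(service_id, lease_seconds)

    def discover(self):
        """Return the ids of all services whose lease has not run out."""
        now = self._clock()
        self._expiry = {s: t for s, t in self._expiry.items() if t > now}
        return sorted(self._expiry)


# Usage with a fake clock: site-B never renews, so it drops out of discovery.
now = [0.0]
registry = LeaseRegistry(clock=lambda: now[0])
registry.register("site-A", 30)
registry.register("site-B", 30)
now[0] = 20.0
registry.renew("site-A", 30)     # site-A is now valid until t = 50
now[0] = 40.0
live = registry.discover()
print(live)
```

A failed or unreachable service needs no explicit deregistration in this scheme; it simply stops renewing and is forgotten, which is one way a registry can avoid stale entries.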

5 Monitoring Grid sites, Running Jobs, Network Traffic and Topology
ACCOUNTING

6 Monitoring OSG: Resources, Jobs & Accounting
Running Jobs Accounting 42 SITES ~ Nodes ( CPUs) Thousands of Jobs parameters

7 Monitoring CMS Jobs
Panels: Running Jobs per Site; Type of Jobs; Rate for Processing Events per Site; Integrated No. of Events

8 FTP Data Transfer between GRID sites
Total FTP Traffic per VO

9 Monitoring Internet2 backbone Network
Test for a Land Speed Record: ~7 Gb/s in a single TCP stream from Geneva to Caltech

10 The UltraLight Network
BNL ESnet IN /OUT

11 Monitoring The GLORIAD Ring

12 Monitoring Network Topology Latency, Routers
NETWORKS AS ROUTERS

13 Bandwidth Challenge at SC2005
151 Gb/s; ~500 TB total in 4 h

14 Available Bandwidth Measurements
One-to-one real-time bandwidth estimation.

15 Monitoring the Execution of Jobs and the Time Evolution
Figure labels: submit a job; DAG; split jobs (Job, Job1, Job2, Job3, 31, 32); lifelines for jobs.

16 ApMon – Application Monitoring
Library of APIs (C, C++, Java, Perl, Python) that can be used to send any information to MonALISA services. Features: flexibility, dynamic configuration with dynamic reloading, high communication performance, automated system monitoring, accounting information. Diagram: each APPLICATION uses ApMon to send MonitoringData over UDP/XDR to the MonALISA services (MonALISA hosts). Labels: Time;IP;procID; no lost packets. Example application monitoring values (App. Monitoring): parameter1: value, parameter2: value, ...; Mbps_out: 0.52, Status: reading, MB_inout: 562.4. Example system monitoring values: load1: 0.24, processes: 97, pages_in: 83. The ApMon configuration is generated automatically by a servlet / CGI script (Config Servlet, ApMon Config).
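To make the "MonitoringData over UDP/XDR" idea concrete, here is a minimal sketch of packing named numeric parameters with XDR-style encoding and sending them in one UDP datagram. It does not reproduce the real ApMon API or its wire format: the datagram layout (cluster, node, count, then name/value pairs), the function names, and the node name `worker-01` are all assumptions made for illustration. Only the XDR string and double encodings themselves follow the XDR standard.

```python
import socket
import struct


def xdr_string(text: str) -> bytes:
    """XDR string: 4-byte big-endian length, the bytes, zero-padded to a multiple of 4."""
    raw = text.encode("utf-8")
    padding = (4 - len(raw) % 4) % 4
    return struct.pack(">I", len(raw)) + raw + b"\x00" * padding


def xdr_double(value: float) -> bytes:
    """XDR double: 8-byte big-endian IEEE 754 value."""
    return struct.pack(">d", value)


def encode_datagram(cluster: str, node: str, params: dict) -> bytes:
    """Hypothetical layout (not ApMon's): cluster, node, count, then (name, value) pairs."""
    parts = [xdr_string(cluster), xdr_string(node), struct.pack(">i", len(params))]
    for name, value in params.items():
        parts.append(xdr_string(name))
        parts.append(xdr_double(float(value)))
    return b"".join(parts)


def send_parameters(host: str, port: int, cluster: str, node: str, params: dict) -> int:
    """Fire-and-forget UDP send; returns the number of bytes handed to the socket."""
    payload = encode_datagram(cluster, node, params)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(payload, (host, port))


# Encode the numeric example values from the slide (no network traffic here).
payload = encode_datagram(
    "App. Monitoring", "worker-01", {"Mbps_out": 0.52, "MB_inout": 562.4}
)
print(len(payload), "bytes")
```

Calling `send_parameters` with the address of a listening collector would transmit the same payload. UDP keeps the sender from blocking on a slow or absent receiver, at the cost of possible datagram loss; string-valued parameters such as `Status: reading` would need an extra type tag that this sketch leaves out.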

17 LISA: Localhost Information Service Agent
LISA (Localhost Information Service Agent) is the agent on the end-user / client side. It provides: authorization; service discovery; local detection of the hardware and software configuration; complete end-system monitoring (per-process load, I/O and network throughput, etc.); end-to-end performance measurements. It will act as an active listener for all events related to the requests generated by its local applications.
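As a rough illustration of local detection and end-system monitoring, the following sketch gathers a few host facts using only the Python standard library. It is not LISA's implementation; the function name `collect_host_metrics` and the choice of fields are assumptions, and load averages are included only on platforms that expose them.

```python
import os
import platform
import time


def collect_host_metrics() -> dict:
    """Gather a small snapshot of the local host (illustrative fields only)."""
    metrics = {
        "timestamp": time.time(),
        "system": platform.system(),     # e.g. operating system name
        "machine": platform.machine(),   # e.g. hardware architecture
        "cpu_count": os.cpu_count(),
    }
    # os.getloadavg is not available on every platform, so probe for it.
    getloadavg = getattr(os, "getloadavg", None)
    if getloadavg is not None:
        try:
            load1, load5, load15 = getloadavg()
            metrics.update(load1=load1, load5=load5, load15=load15)
        except OSError:
            pass  # load average could not be obtained; leave it out
    return metrics


sample = collect_host_metrics()
for name, value in sample.items():
    print(f"{name}: {value}")
```

Per-process load, I/O and network throughput would need platform-specific sources that are outside the scope of this sketch.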

18 MonALISA agents to create on demand an optical path or tree
Diagram labels, with steps numbered 1 to 4 in the figure: runs a ML Demon (>ml_path IP1 IP4 “copy file IP4”); discovery & secure connection; MonALISA ML Agents control and monitor the optical switches; ML proxy services used in agent communication. Time to create a path on demand: <1 s, independent of the location and the number of connections.
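One way to picture what a request like `ml_path IP1 IP4` has to resolve is a path search over the switch topology. The sketch below uses a plain breadth-first search over a made-up topology; the switch names, the link list, and the function `find_path` are hypothetical, and the slides do not say which algorithm the MonALISA agents actually use.

```python
from collections import deque


def find_path(links, source, target):
    """Breadth-first search: returns a fewest-hop path as a list of nodes, or None."""
    graph = {}
    for a, b in links:                       # links are treated as bidirectional
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    previous = {source: None}                # also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:          # walk back to the source
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in sorted(graph.get(node, ())):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


# Hypothetical topology: two end hosts attached to three optical switches.
links = [
    ("IP1", "switch-1"),
    ("switch-1", "switch-2"),
    ("switch-2", "switch-3"),
    ("switch-1", "switch-3"),
    ("switch-3", "IP4"),
]
path = find_path(links, "IP1", "IP4")
print(path)
```

In a real deployment each hop on the returned path would correspond to a cross-connect that the agent controlling that switch has to set up, and links already in use would be excluded from the graph.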

19 Monitoring and Controlling Optical Planes
Port power monitoring

20 Monitoring Optical Switches
Agents to Create on Demand an Optical Path

21 The Functionality of the VINCI System
Diagram: MonALISA ML Agents at Site A, Site B and Site C communicate through ML proxy services. Agents operate at each layer: Layer 3 (routers); Layer 2 (Ethernet, LAN-PHY or WAN-PHY); Layer 1 (DWDM, fiber).

22 Vertical Integration of Services
Real-time correlations and feedback between the major layers are crucial for dynamic load balancing, adaptability and self-organization. Layers in the diagram: user; applications (Job, Job1, Job2, Job3, Job 31, Job 32); farms & data servers; networking (MPLS/GMPLS/TL1). Note: Grid services will have to interact and negotiate with network services for network “resources”. “On-demand”, dynamic and self-adapting use of networking resources.

23 Communities using MonALISA
Major communities: OSG, CMS, ALICE, D0, STAR, VRVS, LGC RUSSIA, SE Europe GRID, APAC Grid, UNAM Grid, ABILENE, ULTRALIGHT, GLORIAD, LHC Net, RoEduNET. MonALISA is running 24 x 7 at 250 sites: collecting 250,000 parameters in near real time; update rate of 25,000 parameter updates per second; monitoring 12,000 computers and > 100 WAN links; thousands of Grid jobs running concurrently. Demonstrated at: SC2003, Telecom 2003, WSIS 2003, SC 2004, I2 2005, TERENA 2005, IGrid 2005, SC 2005, CENIC 2006. Figure labels: CMS-DC04, GRID3, VRVS, ALICE, ABILENE.

24 The MonALISA Architecture Provides:
Distributed registration and discovery for services and applications. Monitoring of all aspects of complex systems: system information for computer nodes and clusters; network information (WAN and LAN); the performance of applications, jobs or services; the end-user systems and their performance; video streaming. Can interact with any other services to provide customized information in near real time, based on monitoring data. Secure, remote administration for services and applications. Agents to supervise applications, trigger alarms, restart or reconfigure them, and notify other services when certain conditions are detected. The MonALISA framework is used to develop higher-level decision services, implemented as a distributed network of communicating agents, to perform global optimization tasks. Graphical user interfaces to visualize complex information.
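The supervising agents described above (trigger alarms or restart an application when certain conditions are detected) can be sketched as a simple threshold trigger. This is an illustration under stated assumptions, not MonALISA code: the class `ThresholdAgent`, the rule "fire after N consecutive readings above a threshold", and the sample values are all hypothetical.

```python
class ThresholdAgent:
    """Calls an action after `consecutive` readings in a row exceed `threshold`."""

    def __init__(self, parameter, threshold, consecutive, action):
        self.parameter = parameter
        self.threshold = threshold
        self.consecutive = consecutive
        self.action = action          # e.g. raise an alarm or restart a service
        self._breaches = 0

    def observe(self, name, value):
        """Feed one monitoring update; returns True if the action fired."""
        if name != self.parameter:
            return False
        if value > self.threshold:
            self._breaches += 1
        else:
            self._breaches = 0        # a normal reading resets the streak
        if self._breaches >= self.consecutive:
            self._breaches = 0
            self.action(name, value)
            return True
        return False


# Usage: record an alarm when load1 stays above 4.0 for three updates in a row.
alarms = []
agent = ThresholdAgent("load1", 4.0, 3, lambda n, v: alarms.append((n, v)))
for value in [0.24, 5.1, 6.0, 3.0, 4.5, 4.8, 7.2]:
    agent.observe("load1", value)
print(alarms)
```

Requiring several consecutive breaches is a design choice that avoids reacting to a single spike; the action callback is where notifying other services would plug in.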

