LHCONE Monitoring Thoughts
June 14th 2011, LHCOPN/LHCONE Meeting
Jason Zurawski – Research Liaison, Internet2

LHCONE Monitoring (outline)
– LHCONE Monitoring = Hard Problem(?)
– Current Use Case: USATLAS
– Possible Solutions

Hard Problem To Solve?
Participants
– Multiple domains (e.g. building, campus, regional, backbone, exchange point)
– Multiple parties (e.g. VO management; local, regional, and national IT staff)
Technologies
– Monitoring at all layers of the OSI stack (e.g. light levels all the way up to application performance)
Governance
– Conversations about this over the last 2 days – who runs LHCONE? Can anyone enforce monitoring rules?
– Value add: installation of monitoring tools, plus someone to ensure they keep working
– Some central facility to manage the tickets/process?

Current Use Case: USATLAS

USATLAS
Hardware/Software
– perfSONAR-PS Performance Toolkit
– 2 dedicated machines per T1 and T2 (bandwidth and latency monitoring)
Use Case
– Regular full-mesh testing (OWAMP/BWCTL/PingER)
– Diagnostic tools on demand (NDT/NPAD)
– Alarms built using Nagios, firing when throughput drops below a threshold, when loss/latency increase beyond a threshold, or when monitoring hosts/services become unreachable (a minimal check sketch follows)
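A minimal sketch of what one such alarm could look like, following the standard Nagios plugin exit-code convention (0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN). The hostnames, thresholds, and the fetch_recent_throughput_mbps helper are illustrative assumptions; a real check would read the latest BWCTL result for the pair from the local measurement archive.

```python
#!/usr/bin/env python
"""Nagios-plugin-style threshold check on measured throughput.
The measurement fetch is a hypothetical stub, not a real API."""
import sys

# Nagios plugin exit codes
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

def fetch_recent_throughput_mbps(src, dst):
    """Hypothetical helper: latest measured throughput (Mbit/s)
    between two test hosts."""
    raise NotImplementedError("wire this up to your measurement store")

def check_throughput(src, dst, warn_mbps=500.0, crit_mbps=100.0):
    try:
        mbps = fetch_recent_throughput_mbps(src, dst)
    except Exception as exc:
        print("UNKNOWN - no recent measurement for %s -> %s (%s)" % (src, dst, exc))
        return UNKNOWN
    if mbps < crit_mbps:
        print("CRITICAL - %s -> %s at %.1f Mbit/s" % (src, dst, mbps))
        return CRITICAL
    if mbps < warn_mbps:
        print("WARNING - %s -> %s at %.1f Mbit/s" % (src, dst, mbps))
        return WARNING
    print("OK - %s -> %s at %.1f Mbit/s" % (src, dst, mbps))
    return OK

if __name__ == "__main__":
    sys.exit(check_throughput("bwctl.site-a.example.org",
                              "bwctl.site-b.example.org"))
```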

USATLAS – Setting Up Tests [figure]

USATLAS – Simple Graph [figure]

USATLAS
Implementing other components
– Dashboard
  Python-based, integrated with the perfSONAR-PS Nagios probes
  Web-service calls to remote instances to gather status information
  Developed by BNL for USATLAS
– Integration into data-movement software
  Still in the pipe-dream phase – use the perfSONAR-PS APIs to get data from the monitoring hosts
  Intelligent decisions about data movement (e.g. whom to download from, à la BitTorrent, or when to start a dynamic circuit vs. using routed IP); see the sketch below
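The source-selection idea can be made concrete with a small sketch. Everything here is hypothetical: the replica hostnames, the 200 Mbit/s circuit threshold, and the assumption that recent throughput figures have already been retrieved (e.g. via the perfSONAR-PS APIs) into a plain dict.

```python
"""Sketch of 'intelligent data movement': pick the fastest replica,
or flag that a dynamic circuit may be worth requesting when even
the best routed-IP path looks poor."""

def pick_download_source(throughput_mbps, circuit_threshold_mbps=200.0):
    """throughput_mbps: replica hostname -> recent measured
    throughput toward this site, in Mbit/s (hypothetical input)."""
    best = max(throughput_mbps, key=throughput_mbps.get)
    if throughput_mbps[best] < circuit_threshold_mbps:
        # Even the best routed path is slow: consider a circuit.
        return best, "request-dynamic-circuit"
    return best, "routed-ip"

# Example: three storage elements holding the same dataset.
replicas = {
    "se01.t2-a.example.org": 640.0,
    "se02.t2-b.example.org": 880.0,
    "se03.t2-c.example.org": 95.0,
}
print(pick_download_source(replicas))
# -> ('se02.t2-b.example.org', 'routed-ip')
```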

USATLAS – “Complete” View [figure]

USATLAS – Per-Pair Performance [figure]

USATLAS
Support Structure
– “Open Source” software = “Open Source” support
– Community mailing lists, and meetings with perfSONAR-PS engineers for debugging and feature requests
– Support for installation/upgrades (roughly twice a year) as required
– No PERT – performance problems are handled by USATLAS, with Internet2/ESnet typically organizing testing and coordinating resources with peer networks
Differences vs. MDM (the GÉANT perfSONAR MDM service)
– No help desk
– Machines are under local control only (we don’t maintain persistent login access)
– Testing is up to the VO; we can help get things started

Possible Solutions

Potential Solutions
Similar to the USATLAS approach:
– Mandate that direct participants purchase at least 1 (preferably 2) machines for monitoring purposes (see the mesh sketch below)
  Stationed at the network core (near the storage/processing equipment)
  Bonus: an additional deployment at the edge
– Encourage backbone/regional/exchange-point operators to do the same
  Harder to enforce outside of the VO…
– Compile a list of desirable functionality (e.g. regular testing, on-demand testing, complete OS vs. packages, etc.); a market-based study of what is available vs. what could be developed
– Form a WG (or use this one) to serve as community support for installation, configuration, and troubleshooting
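As a back-of-the-envelope illustration of what regular testing across all direct participants implies, the sketch below enumerates the full mesh: N measurement hosts give N*(N-1) ordered source/destination pairs, which is exactly the scaling question raised on the next slide. Hostnames are placeholders.

```python
"""Enumerate the full mesh of regular tests for N sites."""
from itertools import permutations

sites = ["ps.t1-a.example.org", "ps.t2-b.example.org",
         "ps.t2-c.example.org", "ps.t2-d.example.org"]

pairs = list(permutations(sites, 2))  # ordered (src, dst) pairs
print("%d sites -> %d regular test pairs" % (len(sites), len(pairs)))
for src, dst in pairs:
    print("schedule bandwidth test: %s -> %s" % (src, dst))
```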

Potential Solutions
Possible enhancements:
– New software development
  The LHC community is not afraid to innovate – do the current solutions in the monitoring space scale? Is anything new needed? Does anything need to change?
– PERT/Help Desk
  A home for otherwise-homeless trouble tickets
  Work with the networking partners to track progress
  Handle installation/configuration issues in the event that the open-source support model is not sufficient
– Non-local control of monitoring
  A central authority to own and maintain the infrastructure, instead of leaving that role to the individual domains

Discussion