1 LHC-OPN 2008, Madrid, 10-11th March. Bruno Hoeft, Aurelie Reymund. GridKa – DE-KIT procedures. Bruno Hoeft, LHC-OPN Meeting, 10.-11.03.08.

Presentation transcript:

2 LHC-OPN hardware at DE-KIT (GridKa)
A fully redundant border router setup is in place (resilience):
- Two border routers, Cisco Catalyst 6509
  - 2 supervisor engines WS-SUP720-3B (IOS s72033_rp-IPSERVICESK9_WAN-VM, Version 12.2(33)SXF9)
  - line cards WS-x GE, equipped with single mode transceivers XENPAK-10GB-SR
- DFN: 2 Huawei DWDM
  - one DWDM provides the light colour from DE-KIT (GridKa) to CERN and SARA (direction north from Karlsruhe)
  - the second DWDM provides the light colour from DE-KIT (GridKa) to IN2P3 and CNAF (direction south from Karlsruhe)
The direction to CERN from Karlsruhe is north because the DANTE peering to DFN for the DFN/DANTE link DE-KIT (GridKa) - CERN is located in Frankfurt.

3 DE-KIT LHC-OPN links
Routers: R-inet-gis-I, R-inet-gis-II

Interface (Layer-2) | VLAN | IP (Layer-3) | Link Name (DFN) | Description
Te 7/ | | /30 | GE10/HUA0674_FRA_FZK | (Frankfurt/Dante -> Geneva) CERN (fra-gen_LHC_CERN-DFN_06006)
Te 1/ | | /30 | GE10/HUA0778_FZK_MUE | Muenster/Surfnet -> Amsterdam/SARA (DFN/Surfnet CBF)

Interface (Layer-2) | VLAN | IP (Layer-3) | Link Name (DFN) | Description
Te 3/ | | /30 / | GE10/HUA1106_FZK_KEH | (Kehl) IN2P3 (DFN/RENATER CBF)
Te 2/ | | /30 / | GE10/HUA0673_BAS_FZK | (Milano) Bologna INFN(CNAF) (DFN/Switch/GARR CBF)

4 Operative service levels
Three service level entities:
- First level support is GGUS (5*8)
- General FZK network support (5*8, plus an automated incident broadcast (SMS) 24*7); Telematis, an external company, covers the on-call support for incident broadcasts outside working hours
- Expert support (5*8, plus experts on call)
Together, the three operative service levels provide 24*7 LHC-OPN support. This will match the requirements specified by the LHC experiments in their CDR.
All operators will be granted fully transparent access to the DE-KIT (GridKa) wiki knowledge base, the DE-KIT (GridKa) log analyser facility and monitoring system, as well as the LHC-OPN monitoring systems:
- DE-KIT (GridKa) local:
  - DE-KIT (GridKa) general monitoring site [cacti, netflow, ganglia, nagios, log analyser]
  - iepm
- LHC-OPN central monitoring pages:
  - BGP
  - ENOC monitoring page
  - Dante E2Ecu monitoring page
Several DE-KIT (GridKa) local information sites are restricted to local access only.
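The way the three service levels combine into 24*7 coverage can be sketched as a small lookup. This is an illustration only: the slides say "5*8" without giving the actual hours, so `WORK_START` and `WORK_END` below are assumed values, and the function names are hypothetical.

```python
from datetime import datetime, time

# Assumed working hours; the slides only state "5*8" (five days, eight hours).
WORK_START = time(8, 0)
WORK_END = time(16, 0)


def in_working_hours(moment: datetime) -> bool:
    """True on Monday to Friday between WORK_START and WORK_END."""
    is_weekday = moment.weekday() < 5  # Monday is 0, Sunday is 6
    return is_weekday and WORK_START <= moment.time() < WORK_END


def reachable_support(moment: datetime) -> list[str]:
    """List the support entities that can be reached at the given moment."""
    if in_working_hours(moment):
        return ["GGUS first level", "FZK network support", "Expert support"]
    # Outside working hours: SMS incident broadcast (covered by Telematis)
    # and the experts on call.
    return ["Automated SMS incident broadcast (Telematis)", "Experts on call"]
```

Whatever the real working hours are, the point of the sketch is that the function never returns an empty list, which is what 24*7 support means here.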

5 Incident origination:
- DE-KIT (GridKa) monitoring (LogMonitoring/PortMonitoring):
  - DE-KIT (GridKa) monitoring tools trigger an incident by automated SMS (e.g. router port up/down, flapping, bgp changes…), or the router operators do so
  - operations at DE-KIT (GridKa) will open a GGUS (or LCU) ticket
  - GGUS (or LCU) will control the ticket; the tier-1 site mainly involved (DE-KIT (GridKa)) will operate the ticket until it is solved or closed
  - appropriate partner(s) affected by the incident will be included in the ticket
- GGUS/LCU:
  - GGUS/LCU ticket initiated by a HEP user, a distant NOC/Tier-0/1 or an NREN
  - GGUS/LCU submits the ticket to the appropriate site (DE-KIT (GridKa))
  - the ticket will still be controlled by GGUS (/LCU), and DE-KIT (GridKa) will take over the operative part
- LIPCU (LCU)/E2ECU:
  - no difference to a GGUS/LCU ticket
- Information by a site:
  - request to open a GGUS/LCU ticket
  - however, appropriate actions will be taken immediately to solve the issue
- Maintenance/changes at DE-KIT (GridKa) / EGEE broadcast:
  - a GGUS (and/or LCU) ticket will be opened and announced in GOC; this should inform all LHC-OPN sites via EGEE broadcast as well as through GOC (each EGEE broadcast should have a corresponding ticket)
Incident and ticket handling
- the ticket of an incident is handled and controlled by either GGUS, LCU or E2ECU
- operation of certain actions is transferred to the affected/corresponding location, such as a tier-1 centre (DE-KIT (GridKa)) or an NREN
- the management still resides with the ticket owner (GGUS, LCU/LIPCU, E2ECU)
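The split between who controls a ticket and who operates it can be made concrete with a small data structure. This is a minimal sketch, not the GGUS data model: the class, its field names and the status strings are all hypothetical and only mirror the roles named on the slide.

```python
from dataclasses import dataclass, field

# Ticket owners named on the slide.
TICKET_OWNERS = {"GGUS", "LCU", "E2ECU"}


@dataclass
class Ticket:
    summary: str
    owner: str                    # controls the ticket for its whole lifetime
    operator: str = ""            # site or NREN doing the operational work
    partners: list[str] = field(default_factory=list)
    status: str = "open"

    def __post_init__(self) -> None:
        if self.owner not in TICKET_OWNERS:
            raise ValueError(f"unknown ticket owner: {self.owner}")

    def assign(self, site: str) -> None:
        """Hand the operational part to a site; the owner does not change."""
        self.operator = site

    def include_partner(self, partner: str) -> None:
        """Add a partner affected by the incident, once."""
        if partner not in self.partners:
            self.partners.append(partner)

    def solve(self) -> None:
        self.status = "solved"
```

For example, `Ticket("link down", owner="GGUS")` followed by `assign("DE-KIT (GridKa)")` leaves `owner` at `"GGUS"` while `operator` becomes the tier-1 site, which is the relationship the slide describes.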

6 Operation of an Incident (1)
Layer-1 incident (an issue on layer-1 means that there is no light on the path)
- No light (Descr.: there is a light cut somewhere on the path)
  Actions:
  - check the router / transceiver / hardware / cable / logs
  - evaluate the impact (backup path available)
  - contact DFN and Di-Data as well as T0/T1
  - send an EGEE broadcast if there is no backup path (depending on estimated length and impact) and escalate to experts
  - report the incident and its solution in the documentation
  Involved groups:
  - Internal: GIS / NG (Network Group)
  - External: DFN, Di-Data, T0/T1 network responsible, NREN / Dante
  - Monitoring, e.g.:
- Local hardware failure (Descr.: a hardware element seems to be deficient on the local network)
  Actions:
  - check the router / transceiver / hardware / cable / logs
  - evaluate the impact (backup path available)
  - contact T0/T1
  - send an EGEE broadcast if there is no backup path (depending on estimated length and impact) and escalate to experts
  - report the incident and its solution in the documentation
  Involved groups:
  - Internal: GIS / NG
  - External: DFN, Di-Data, T0/T1 network responsible, NREN / Dante
- Remote hardware failure (Descr.: a hardware element seems to be deficient on the remote network)
  Actions:
  - check the router / transceiver / hardware / cable / logs
  - evaluate the impact (backup path available)
  - if nothing suspicious is detected, contact T0/T1
  - send an EGEE broadcast if there is no backup path (depending on estimated length and impact) and escalate to experts
  - report the incident and its solution in the documentation
  Involved groups:
  - Internal: GIS / NG
  - External: DFN, Di-Data, T0/T1 network responsible, NREN / Dante
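The "no light" action list can be written down as an ordered checklist in which the broadcast step depends on the backup path. This is a sketch under assumptions: the function name is hypothetical, and it treats both the EGEE broadcast and the escalation to experts as conditional on a missing backup path, which is one possible reading of the slide. The slide's further conditions (estimated length and impact) are not modelled because no thresholds are given.

```python
def layer1_no_light_actions(backup_path_available: bool) -> list[str]:
    """Return the ordered actions for a layer-1 'no light' incident."""
    steps = [
        "check the router / transceiver / hardware / cable / logs",
        "evaluate the impact",
        "contact DFN and Di-Data as well as T0/T1",
    ]
    if not backup_path_available:
        # Assumption: broadcast and escalation both apply only when
        # traffic cannot be moved to a backup path.
        steps.append("send an EGEE broadcast")
        steps.append("escalate to experts")
    steps.append("report the incident and its solution in the documentation")
    return steps
```

Documentation is always the last step, with or without a backup path.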

7 Operation of an Incident (2)
Layer-2 (the light on the path is maintained, but there is no connectivity to the neighbour)
- No MAC (Descr.: missing MAC entry from the neighbour's network)
  Actions:
  - check router configuration
  - evaluate the impact
  - contact T0/T1
  - send an EGEE broadcast if there is no backup path (estimated length and impact), escalate to experts
  - report the incident and its solution in the documentation
  Involved groups:
  - Internal: GIS / NG
  - External: T0/T1 network responsible

8 Operation of an Incident (3)
Layer-3 (with a routing issue on layer-3, the light on the path is maintained, but the neighbour is not reachable)
- Routing issue: no route to neighbour (Descr.: T1 centre cannot reach the neighbour)
  Actions:
  - check router configuration / routing / acls
  - evaluate the impact
  - contact T0/T1
  - send an EGEE broadcast if there is no backup path (estimated length and impact), escalate to experts
  - report the incident and its solution in the documentation
  Involved groups:
  - Internal: GIS / NG
  - External: T0/T1 network responsible
- BGP issue: no announcement from neighbour (Descr.: the bgp table shows )
  Actions:
  - check router configuration / routing / acls
  - evaluate the impact
  - contact T0/T1
  - send an EGEE broadcast if there is no backup path (estimated length and impact), escalate to experts
  - report the incident and its solution in the documentation
  Involved groups:
  - Internal: GIS / NG
  - External: T0/T1 network responsible
- BGP issue: no routes advertised to neighbour (Descr.: local bgp does not advertise the network(s) correctly to the neighbour)
  Actions:
  - check router configuration / routing / acls
  - evaluate the impact
  - contact T0/T1
  - send an EGEE broadcast if there is no backup path (estimated length and impact), escalate to experts
  - report the incident and its solution in the documentation
  Involved groups:
  - Internal: GIS / NG
  - External: T0/T1 network responsible
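The three incident slides distinguish the layers by what still works: light on the path, connectivity to the neighbour, and reachability of the neighbour. A minimal sketch of that decision order follows. The enum, the function and the three boolean inputs are hypothetical names; how each observation is actually obtained (router logs, MAC table, BGP table) is outside the sketch.

```python
from enum import Enum
from typing import Optional


class IncidentLayer(Enum):
    LAYER_1 = "no light on the path"
    LAYER_2 = "light present, no connectivity to the neighbour"
    LAYER_3 = "light and connectivity present, neighbour not reachable"


def classify_incident(
    light_on_path: bool,
    neighbour_mac_seen: bool,
    neighbour_reachable: bool,
) -> Optional[IncidentLayer]:
    """Check the lowest layer first and return the first one that fails."""
    if not light_on_path:
        return IncidentLayer.LAYER_1
    if not neighbour_mac_seen:
        return IncidentLayer.LAYER_2
    if not neighbour_reachable:
        return IncidentLayer.LAYER_3
    return None  # nothing wrong at layers 1 to 3
```

Checking bottom-up matters: without light, the MAC and routing observations are necessarily negative too, so they carry no extra information.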

9 Maintenance window
The light path and/or the connectivity / reachability can be affected.
Descr.: T1 centre plans maintenance on the network infrastructure
Actions:
- send an EGEE broadcast
- contact T0/T1, NREN, Dante
Involved groups:
- Internal: GIS / NG / Security
- External: T0/T1 network responsible, NREN (DFN) / Dante

10 Configuration / Infrastructure change
- Configuration change (the light path and/or the connectivity / reachability can be affected. Descr.: T1 centre makes a change to the network configuration)
  Actions:
  - send an EGEE broadcast
  - contact T0/T1, NREN, Dante
  Involved groups:
  - Internal: GIS / NG / Security
  - External: T0/T1 network responsible, NREN (DFN) / Dante
- Infrastructure change (the light path and/or the connectivity / reachability can be affected. Descr.: T1 centre plans a change in the network infrastructure/topology)
  Actions:
  - send an EGEE broadcast
  - contact T0/T1, NREN, Dante
  Involved groups:
  - Internal: GIS / NG / Security
  - External: T0/T1 network responsible, NREN (DFN) / Dante
- General remarks:
  - all actions involving the LHC-OPN shall, as far as they can be planned, be announced 3 days in advance if possible (ticket, GOC, EGEE broadcast)
  - changes of the infrastructure (e.g. routing/reorganisation of router ports) shall be discussed with the affected site, CERN and the coordination unit (LCU/LIPCU)
  - the configuration of the DE-KIT (GridKa) installation will be documented, and all changes will be included in the documentation
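The 3-day announcement rule from the general remarks is simple date arithmetic. A minimal sketch, assuming the rule means at least 72 hours between the announcement and the start of the planned action; the function and constant names are hypothetical.

```python
from datetime import datetime, timedelta

# Assumption: "3 days in advance" is read as at least 72 hours.
MIN_NOTICE = timedelta(days=3)


def announced_in_time(announced_at: datetime, starts_at: datetime) -> bool:
    """True if the planned action was announced at least MIN_NOTICE ahead."""
    return starts_at - announced_at >= MIN_NOTICE
```

For instance, a change starting on 14 March at 09:00 would satisfy the rule if announced on 11 March at 09:00 or earlier.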