US LHC NWG Dynamic Circuit Services in US LHCNet Artur Barczyk, Caltech Joint Techs Workshop Honolulu, 01/23/2008.


1 US LHC NWG Dynamic Circuit Services in US LHCNet Artur Barczyk, Caltech Joint Techs Workshop Honolulu, 01/23/2008

2 US LHC NWG US LHCNet Overview
 Mission-oriented network: provide the trans-Atlantic network infrastructure to support the US LHC program
 Four PoPs: CERN, Starlight (→ Fermilab), Manlan (→ Brookhaven), SARA
 2008: 30 (40) Gbps trans-Atlantic bandwidth (roadmap: 80 Gbps by 2010)

3 US LHC NWG The Large Hadron Collider @ CERN
 27 km tunnel in Switzerland & France; start in 2008
 Four experiments: ALICE, ATLAS, CMS, LHCb
 pp collisions at √s = 14 TeV, L = 10^34 cm^-2 s^-1
 Physics: Higgs, SUSY, Extra Dimensions, CP Violation, QG Plasma, … the Unexpected
 6000+ physicists & engineers, 250+ institutes, 60+ countries
 Challenges: analyze petabytes of complex data cooperatively; harness global computing, data & network resources

4 US LHC NWG The LHC Data Grid Hierarchy
 Emerging vision: a richly structured, global dynamic system
 CERN/Outside resource ratio ~1:4; T0/(Σ T1)/(Σ T2) ~1:2:2; ~40% of resources in Tier2s
 Online system feeds Tier0 at CERN; 10 Gbps links to Tier1s (e.g. Germany T1 via GEANT2 + NRENs, BNL T1 via US LHCNet + ESnet); 10–40 Gbps onward distribution
 US T1s and T2s connect to US LHCNet PoPs
 Outside/CERN ratio larger; expanded role of Tier1s & Tier2s: greater reliance on networks

5 US LHC NWG The Roles of Tier Centers
 Tier 0 (CERN): prompt calibration and alignment; reconstruction; store the complete set of RAW data
 Tier 1: reprocessing; store part of the processed data
 Tier 2: Monte Carlo production; physics analysis
 Tier 3: physics analysis
 11 Tier1s, over 100 Tier2s → LHC computing will be more dynamic & network-oriented
 This hierarchy defines the dynamism of data transfers, and hence the requirements for dynamic circuit services in US LHCNet

6 US LHC NWG CMS Data Transfer Volume (May – Aug. 2007): 10 PetaBytes transferred over 4 months = 8.0 Gbps average (15 Gbps peak)

7 US LHC NWG End-system capabilities are growing: 88 Gbps peak; 80+ Gbps sustainable for hours, storage-to-storage (40 G in + 40 G out)

8 US LHC NWG Managed Data Transfers
 The scale of the problem and the capabilities of the end-systems require a managed approach with scheduled data transfer requests
 The dynamism of the data transfers defines the requirements for scheduling:
 Tier0 → Tier1: linked to the duty cycle of the LHC
 Tier1 → Tier1: whenever data sets are reprocessed
 Tier1 → Tier2: distribute data sets for analysis
 Tier2 → Tier1: distribute MC-produced data
 Transfer classes: fixed allocation, preemptible transfers, best effort
 Priorities and preemption: use LCAS to squeeze low(er)-priority circuits
 Interact with end-systems: verify and monitor capabilities
All of this will happen “on demand” from the experiments’ Data Management systems. Needs to work end-to-end: collaboration in GLIF, DICE
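The interplay of the three transfer classes can be sketched as a small admission routine: a fixed-allocation request is admitted by squeezing preemptible and best-effort circuits, lowest class and priority first, in the spirit of LCAS in-service bandwidth reduction. Class names, numbers and the `admit` function are illustrative assumptions, not the US LHCNet implementation.

```python
from dataclasses import dataclass

# Hypothetical transfer classes from the slide; values are illustrative.
FIXED, PREEMPTIBLE, BEST_EFFORT = 0, 1, 2

@dataclass
class Circuit:
    name: str
    cls: int            # transfer class (FIXED is never squeezed)
    priority: int       # within a class, lower priority is squeezed first
    rate_gbps: float    # currently allocated rate

def admit(circuits, request_gbps, link_capacity_gbps):
    """Try to admit a fixed-allocation request, squeezing preemptible and
    best-effort circuits (LCAS-style in-service rate reduction) as needed."""
    free = link_capacity_gbps - sum(c.rate_gbps for c in circuits)
    needed = request_gbps - free
    # best-effort first, then preemptible by ascending priority
    for c in sorted(circuits, key=lambda c: (-c.cls, c.priority)):
        if c.cls == FIXED or needed <= 0:
            continue
        squeeze = min(c.rate_gbps, needed)
        c.rate_gbps -= squeeze
        needed -= squeeze
    return needed <= 0
```

On a full 10 Gbps link, a 3 Gbps fixed request would empty the best-effort circuit first and only then shave the preemptible one, leaving fixed allocations untouched.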

9 US LHC NWG Managed Network Services: Operations Scenario
 Receive request, check capabilities, schedule network resources
 “Transfer N Gigabytes from A to B with target throughput R1”
 Authenticate/authorize/prioritize
 Verify end-host rate capabilities R2 (achievable rate)
 Schedule bandwidth B > R2; estimate time to complete T(0)
 Schedule path with priorities P(i) on segments S(i)
 Check progress periodically
 Compare rate R(t) to R2; update the time to complete from T(i-1) to T(i)
 Trigger on behaviours requiring further action:
 Error (e.g. segment failure)
 Performance issues (e.g. poor progress, channel underutilized, long waits)
 State change (e.g. new high-priority transfer submitted)
 Respond dynamically, to match policies and optimize throughput:
 Change channel size(s)
 Build alternative path(s)
 Create new channel(s) and squeeze others in class
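The periodic progress check above can be sketched as a single function: compare the measured rate R(t) against the achievable rate R2, recompute the estimated time to complete, and flag transfers that need intervention. The function name and the 50% underutilization threshold are assumptions made for illustration.

```python
def check_progress(total_gb, done_gb, rate_gbps, target_gbps,
                   underuse_frac=0.5):
    """Return (estimated seconds to complete, action hint).

    total_gb/done_gb are gigabytes; rate_gbps is the measured rate R(t);
    target_gbps is the verified end-host capability R2.
    """
    remaining_gbit = (total_gb - done_gb) * 8  # bytes -> bits
    if rate_gbps <= 0:
        # no progress at all: likely an error such as a segment failure
        return float("inf"), "error: no progress, check path segments"
    eta = remaining_gbit / rate_gbps
    if rate_gbps < underuse_frac * target_gbps:
        # channel underutilized or poor progress: candidate for re-provisioning
        return eta, "underperforming: consider resizing or an alternative path"
    return eta, "ok"
```

A scheduler would call this on each active transfer every monitoring interval and feed the hints into the dynamic-response actions listed on the slide.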

10 US LHC NWG Managed Network Services: End-System Integration
 Integration of network services and end-systems requires an end-to-end view of the network and the end-systems, with real-time monitoring
 Robust, real-time and scalable messaging infrastructure
 Information extraction and correlation, e.g. network state, end-host state, transfer queue state
 Obtained via network service ↔ end-host agent (EHA) interactions
 Provide sufficient information for decision support
 Cooperation of EHAs and network services
 Automate some operational decisions using accumulated experience
 Increase the level of automation to respond to increases in usage, number of users, and competition for scarce network resources
Required for a robust end-to-end production system
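One use of the correlated network-state and end-host-state information is deciding whether a transfer is limited by the network or by the end-systems. A minimal sketch, assuming invented EHA report fields (`disk_gbps`, `nic_gbps`) rather than the real MonALISA/EHA schema:

```python
def bottleneck(path_free_gbps, host_reports):
    """Correlate path headroom with end-host agent reports.

    host_reports: one dict per endpoint with the rates its disk subsystem
    and NIC can sustain (field names are illustrative assumptions).
    Returns which side limits the transfer and the limiting rate.
    """
    # each endpoint is capped by the slower of its disk and its NIC
    host_limit = min(min(h["disk_gbps"], h["nic_gbps"]) for h in host_reports)
    if host_limit < path_free_gbps:
        return "end-system", host_limit   # no point provisioning more bandwidth
    return "network", path_free_gbps      # candidate for channel resizing
```

Feeding this decision back to the scheduler avoids reserving circuit capacity that the end-systems cannot fill.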

11 US LHC NWG Lightpaths in the US LHCNet Domain
VINCI (Virtual Intelligent Networks for Computing Infrastructures in Physics): control plane and data plane.
Dynamic setup and reservation of lightpaths has been successfully demonstrated by the VINCI project controlling optical switches.

12 US LHC NWG Planned Interfaces
 I-NNI: VINCI (custom) protocols
 E-NNI: Web Services (DCN IDC)
 UNI: VINCI custom protocol, client = EHA
 UNI alternatives under consideration: DCN IDC? LambdaStation? TeraPaths?
 Most, if not all, LHC data transfers will cross more than one domain; e.g. to transfer data from CERN to Fermilab: CERN → US LHCNet → ESnet → Fermilab
 VINCI control plane for intra-domain, DCN (DICE/GLIF) IDC for inter-domain provisioning
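The inter-domain case above amounts to decomposing an end-to-end request into per-domain segments, each set up by that domain's own controller (VINCI inside US LHCNet, the DCN IDC web-service interface between domains). A hedged sketch of that decomposition, with rollback on failure; the `setup_segment` callback is a stand-in, not the actual IDC protocol:

```python
# Domain chain from the slide for a CERN -> Fermilab transfer.
PATH = ["CERN", "US LHCNet", "ESnet", "Fermilab"]

def provision(path, setup_segment):
    """Call setup_segment(a, b) for each adjacent domain pair.

    If any segment fails, tear down what was built (in reverse order)
    so no partial end-to-end path is left reserved.
    """
    done = []
    for a, b in zip(path, path[1:]):
        if not setup_segment(a, b):
            for a2, b2 in reversed(done):
                pass  # a real controller would call teardown(a2, b2) here
            return False
        done.append((a, b))
    return True
```

In the real system each `setup_segment` call would be an IDC web-service request (inter-domain) or a VINCI provisioning action (intra-domain).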

13 US LHC NWG Protection Schemes
 Mesh protection at Layer 1
 US LHCNet links are assigned to primary users: CERN – Starlight for CMS, CERN – Manlan for ATLAS
 In case of link failure, we cannot blindly use bandwidth belonging to the other collaboration
 Carefully choose protection links, e.g. use the indirect path (CERN – SARA – Manlan), via Designated Transit Lists (DTLs) and DTL-Sets
 High-level protection features implemented in VINCI: re-provision lower-priority circuits (preemption, LCAS)
Needs to work end-to-end: collaboration in GLIF, DICE
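The DTL idea above can be sketched as a lookup of pre-computed protection paths that skips any path riding the other collaboration's primary link. The topology, ownership table and DTL ordering are illustrative assumptions built from the links named on the slide:

```python
# Primary ownership of US LHCNet links, per the slide.
PRIMARY_OWNER = {("CERN", "Starlight"): "CMS", ("CERN", "Manlan"): "ATLAS"}

# Ordered Designated Transit Lists per protected link (illustrative DTL-Set).
DTLS = {
    ("CERN", "Starlight"): [
        ["CERN", "Manlan", "Starlight"],          # rides ATLAS's primary link
        ["CERN", "SARA", "Manlan", "Starlight"],  # indirect path, avoids it
    ],
}

def protection_path(failed_link, owner):
    """Pick the first DTL that does not use another collaboration's primary."""
    for dtl in DTLS.get(failed_link, []):
        hops = list(zip(dtl, dtl[1:]))
        # unlisted hops default to the requester's own (or shared) capacity
        if all(PRIMARY_OWNER.get(h, owner) == owner for h in hops):
            return dtl
    return None
```

With this ordering, a CMS failover on CERN – Starlight skips the direct CERN – Manlan alternative and lands on the indirect path via SARA, matching the slide's example.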

14 US LHC NWG Basic Functionality To-Date
Pre-production (R&D) setup: high-performance servers, Ciena CoreDirectors, US LHCNet routers, UltraLight routers.
 Local domain: routing of private IP subnets onto tagged VLANs
 Core network (TDM): VLAN-based virtual circuits
 Semi-automatic intra-domain circuit provisioning
 Bandwidth adjustment (LCAS)
 End-host tuning by the End-Host Agent
 End-to-end monitoring
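The local-domain piece of this setup, mapping private IP subnets onto tagged VLANs that the TDM core carries as virtual circuits, can be sketched as a simple subnet-to-tag lookup. The subnets and VLAN IDs are invented for illustration:

```python
import ipaddress

# Illustrative mapping: each private subnet rides one tagged VLAN / circuit.
VLAN_MAP = {
    ipaddress.ip_network("10.1.1.0/24"): 3001,  # e.g. a CERN <-> Starlight circuit
    ipaddress.ip_network("10.1.2.0/24"): 3002,  # e.g. a CERN <-> Manlan circuit
}

def vlan_for(ip):
    """Return the VLAN tag whose virtual circuit carries this host's traffic."""
    addr = ipaddress.ip_address(ip)
    for net, tag in VLAN_MAP.items():
        if addr in net:
            return tag
    return None  # not on a circuit subnet: falls back to routed best-effort IP
```

Hosts placed on a circuit's subnet thus get deterministic, circuit-switched capacity without any per-flow configuration on the end-systems.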

15 US LHC NWG MonALISA: Monitoring the US LHCNet Ciena CDCI Network (PoPs: CERN Geneva, Starlight, Manlan, SARA)

16 US LHC NWG Roadmap Ahead
 The current capabilities include:
 End-to-end monitoring
 Intra-domain circuit provisioning
 End-host tuning by the End-Host Agent
 Towards a production system (intra-domain):
 Integrate existing end-host agent, monitoring and measurement services
 Provide a uniform user/application interface
 Integration with the experiments’ Data Management Systems
 Automated fault handling
 Priority-based transfer scheduling
 Include Authorisation, Authentication and Accounting
 Towards a production system (inter-domain):
 Interface to the DCN IDC
 Work with DICE, GLIF on the IDC protocol specification
 Topology exchange, routing, end-to-end path calculation
 Extend the AAA infrastructure to multi-domain

17 US LHC NWG Summary and Conclusions
 Movement of LHC data will be highly dynamic
 Follows the LHC data grid hierarchy
 Different data sets (size, transfer speed and duration), different priorities
 Data Management requires network awareness
 Guaranteed bandwidth end-to-end (storage-system to storage-system)
 End-to-end monitoring including the end-systems
 We are developing the intra-domain control plane for US LHCNet
 VINCI project, based on the MonALISA framework
 Many services and agents are already developed or in an advanced state
 Use Internet2’s IDC protocol for inter-domain provisioning
 Collaboration with Internet2, ESnet, LambdaStation, TeraPaths on end-to-end circuit provisioning

