1  ESnet Network Requirements
ASCAC Networking Sub-committee Meeting, April 13, 2007
Eli Dart, ESnet Engineering Group, Lawrence Berkeley National Laboratory
2  Requirements from Instruments and Facilities
Instruments and facilities are the 'hardware infrastructure' of DOE science. The types of requirements they generate can be summarized as follows:
– Bandwidth: quantity of data produced, requirements for timely movement
– Connectivity: geographic reach – location of instruments, facilities, and users, plus the network infrastructure involved (e.g. ESnet, Internet2, GEANT)
– Services: guaranteed bandwidth, traffic isolation, etc.; IP multicast
Data rates and volumes from facilities and instruments (bandwidth, connectivity, services):
– Large supercomputer centers (NERSC, NLCF)
– Large-scale science instruments (e.g. LHC, RHIC)
– Other computational and data resources (clusters, data archives, etc.)
Some instruments have special characteristics that must be addressed, e.g. Fusion (bandwidth, services).
Next generation of experiments and facilities, and upgrades to existing facilities (bandwidth, connectivity, services):
– Addition of facilities increases bandwidth requirements
– Existing facilities generate more data as they are upgraded
– Reach of collaboration expands over time
– New capabilities require advanced services
3  Requirements from Examining the Process of Science (1)
The geographic extent and the size of the user base of scientific collaboration are continuously expanding:
– DOE's US and international collaborators rely on ESnet to reach DOE facilities
– DOE scientists rely on ESnet to reach non-DOE facilities nationally and internationally (e.g. LHC, ITER)
– In the general case, the structure of modern scientific collaboration assumes the existence of a robust, high-performance network infrastructure interconnecting collaborators with each other and with the instruments and facilities they use
– Therefore, close collaboration with other networks is essential for end-to-end service deployment, diagnostic transparency, etc.
Robustness and stability (network reliability) are critical:
– The large-scale investment in science facilities and experiments makes network failure unacceptable when the experiments depend on the network
– Dependence on the network is the general case
4  Requirements from Examining the Process of Science (2)
Science requires several advanced network services for different purposes:
– Predictable latency and quality-of-service guarantees, for:
  - Remote real-time instrument control
  - Computational steering
  - Interactive visualization
– Bandwidth guarantees and traffic isolation, for:
  - Large data transfers (potentially using TCP-unfriendly protocols)
  - Network support for deadline scheduling of data transfers (see the sketch below)
Science requires other services as well, for example:
– Federated Trust / Grid PKI for collaboration and middleware:
  - Grid authentication credentials for DOE science (researchers, users, scientists, etc.)
  - Federation of international Grid PKIs
– Collaboration services such as audio and video conferencing
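The deadline-scheduling requirement above can be made concrete with a back-of-the-envelope calculation. The sketch below is illustrative only: the function name and the 80% end-to-end efficiency figure are assumptions of ours, not part of the requirements document.

```python
# Illustrative sketch (not an ESnet tool): given a dataset size and a
# deadline, estimate the sustained rate a guaranteed-bandwidth
# reservation would need, padded for protocol and retransmission overhead.

def required_rate_gbps(data_tb: float, hours_to_deadline: float,
                       efficiency: float = 0.8) -> float:
    """Minimum guaranteed rate (Gbps) to move data_tb terabytes before
    the deadline, assuming the path delivers `efficiency` of its
    nominal rate end to end."""
    bits = data_tb * 8e12                      # terabytes -> bits
    seconds = hours_to_deadline * 3600.0
    return bits / seconds / 1e9 / efficiency   # Gbps, padded for overhead

if __name__ == "__main__":
    # e.g. 50 TB of output due at a collaborating site in 12 hours
    print(f"{required_rate_gbps(50, 12):.1f} Gbps")   # ~11.6 Gbps
```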
5  Science Network Requirements Aggregation Summary
Per science driver: connectivity, end-to-end reliability (where stated), 2006 and 2010 end-to-end bandwidth, traffic characteristics, and network services.

Advanced Light Source
  Connectivity: DOE sites, US universities, industry
  End-to-end bandwidth: 1 TB/day (300 Mbps) in 2006; 5 TB/day (1.5 Gbps) by 2010
  Traffic characteristics: bulk data, remote control
  Network services: guaranteed bandwidth; PKI / Grid

Bioinformatics
  Connectivity: DOE sites, US universities
  End-to-end bandwidth: 625 Mbps in 2006 (12.5 Gbps within two years); 250 Gbps by 2010
  Traffic characteristics: bulk data, remote control, point-to-multipoint
  Network services: guaranteed bandwidth; high-speed multicast

Chemistry / Combustion
  Connectivity: DOE sites, US universities, industry
  End-to-end bandwidth: 10s of gigabits per second by 2010
  Traffic characteristics: bulk data
  Network services: guaranteed bandwidth; PKI / Grid

Climate Science
  Connectivity: DOE sites, US universities, international
  End-to-end bandwidth: 5 PB per year (5 Gbps) by 2010
  Traffic characteristics: bulk data, remote control
  Network services: guaranteed bandwidth; PKI / Grid

High Energy Physics (LHC)
  End-to-end reliability: 99.95+% (less than 4 hours of downtime per year)
  Connectivity: US Tier1 (DOE), US Tier2 (universities), international (Europe, Canada)
  End-to-end bandwidth: 10 Gbps in 2006; 60 to 80 Gbps by 2010 (30-40 Gbps per US Tier1)
  Traffic characteristics: bulk data, remote control
  Network services: guaranteed bandwidth; traffic isolation; PKI / Grid
6  Science Network Requirements Aggregation Summary (continued)

Magnetic Fusion Energy
  End-to-end reliability: 99.999% (impossible without full redundancy)
  Connectivity: DOE sites, US universities, industry
  End-to-end bandwidth: 200+ Mbps in 2006; 1 Gbps by 2010
  Traffic characteristics: bulk data, remote control
  Network services: guaranteed bandwidth; guaranteed QoS; deadline scheduling

NERSC
  Connectivity: DOE sites, US universities, industry, international
  End-to-end bandwidth: 10 Gbps in 2006; 20 to 40 Gbps by 2010
  Traffic characteristics: bulk data, remote control
  Network services: guaranteed bandwidth; guaranteed QoS; deadline scheduling; PKI / Grid

NLCF
  Connectivity: DOE sites, US universities, industry, international
  End-to-end bandwidth: backbone bandwidth parity (2006 and 2010)
  Traffic characteristics: bulk data

Nuclear Physics (RHIC)
  Connectivity: DOE sites, US universities, international
  End-to-end bandwidth: 12 Gbps in 2006; 70 Gbps by 2010
  Traffic characteristics: bulk data
  Network services: guaranteed bandwidth; PKI / Grid

Spallation Neutron Source
  End-to-end reliability: high (24x7 operation)
  Connectivity: DOE sites
  End-to-end bandwidth: 640 Mbps in 2006; 2 Gbps by 2010
  Traffic characteristics: bulk data
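As a rough sanity check on the figures in the two summaries above, the short sketch below (an illustration of ours, not part of the requirements document) converts data-volume figures such as "1 TB/day" or "5 PB/year" into the average line rates they imply. The tabulated bandwidths sit a few times above these raw averages, which is consistent with provisioning headroom for bursty transfers.

```python
# Convert daily or yearly data volumes into the average line rate they imply.
SECONDS_PER_DAY = 86_400
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY

def tb_per_day_to_mbps(tb: float) -> float:
    return tb * 8e12 / SECONDS_PER_DAY / 1e6

def pb_per_year_to_gbps(pb: float) -> float:
    return pb * 8e15 / SECONDS_PER_YEAR / 1e9

print(f"1 TB/day  ~ {tb_per_day_to_mbps(1):.0f} Mbps average")   # ~93 Mbps
print(f"5 TB/day  ~ {tb_per_day_to_mbps(5):.0f} Mbps average")   # ~463 Mbps
print(f"5 PB/year ~ {pb_per_year_to_gbps(5):.2f} Gbps average")  # ~1.27 Gbps
```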
7  LHC ATLAS Bandwidth Matrix as of April 2007
Each entry: Site A -> Site Z, the ESnet A and Z endpoints, and the A-Z bandwidth required in 2007 and 2010.

CERN -> BNL
  ESnet A: AofA (NYC)   ESnet Z: BNL
  2007: 10 Gbps   2010: 20-40 Gbps

BNL -> U. of Michigan (calibration)
  ESnet A: BNL (LIMAN)   ESnet Z: Starlight (CHIMAN)
  2007: 3 Gbps   2010: 10 Gbps

BNL -> Boston University, Harvard University (Northeastern Tier2 Center)
  ESnet A: BNL (LIMAN)   ESnet Z: Internet2 / NLR peerings
  2007: 3 Gbps   2010: 10 Gbps

BNL -> Indiana U. at Bloomington, U. of Chicago (Midwestern Tier2 Center)
  ESnet A: BNL (LIMAN)   ESnet Z: Internet2 / NLR peerings
  2007: 3 Gbps   2010: 10 Gbps

BNL -> Langston University, U. Oklahoma Norman, U. of Texas Arlington (Southwestern Tier2 Center)
  ESnet A: BNL (LIMAN)   ESnet Z: Internet2 / NLR peerings
  2007: 3 Gbps   2010: 10 Gbps

BNL -> Tier3 aggregate
  ESnet A: BNL (LIMAN)   ESnet Z: Internet2 / NLR peerings
  2007: 5 Gbps   2010: 20 Gbps

BNL -> TRIUMF (Canadian ATLAS Tier1)
  ESnet A: BNL (LIMAN)   ESnet Z: Seattle
  2007: 1 Gbps   2010: 5 Gbps
8  LHC CMS Bandwidth Matrix as of April 2007

CERN -> FNAL
  ESnet A: Starlight (CHIMAN)   ESnet Z: FNAL (CHIMAN)
  2007: 10 Gbps   2010: 20-40 Gbps

FNAL -> U. of Michigan (calibration)
  ESnet A: FNAL (CHIMAN)   ESnet Z: Starlight (CHIMAN)
  2007: 3 Gbps   2010: 10 Gbps

FNAL -> Caltech
  ESnet A: FNAL (CHIMAN)   ESnet Z: Starlight (CHIMAN)
  2007: 3 Gbps   2010: 10 Gbps

FNAL -> MIT
  ESnet A: FNAL (CHIMAN)   ESnet Z: AofA (NYC) / Boston
  2007: 3 Gbps   2010: 10 Gbps

FNAL -> Purdue University
  ESnet A: FNAL (CHIMAN)   ESnet Z: Starlight (CHIMAN)
  2007: 3 Gbps   2010: 10 Gbps

FNAL -> U. of California at San Diego
  ESnet A: FNAL (CHIMAN)   ESnet Z: San Diego
  2007: 3 Gbps   2010: 10 Gbps

FNAL -> U. of Florida at Gainesville
  ESnet A: FNAL (CHIMAN)   ESnet Z: SOX
  2007: 3 Gbps   2010: 10 Gbps

FNAL -> U. of Nebraska at Lincoln
  ESnet A: FNAL (CHIMAN)   ESnet Z: Starlight (CHIMAN)
  2007: 3 Gbps   2010: 10 Gbps

FNAL -> U. of Wisconsin at Madison
  ESnet A: FNAL (CHIMAN)   ESnet Z: Starlight (CHIMAN)
  2007: 3 Gbps   2010: 10 Gbps

FNAL -> Tier3 aggregate
  ESnet A: FNAL (CHIMAN)   ESnet Z: Internet2 / NLR peerings
  2007: 5 Gbps   2010: 20 Gbps
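To show how a per-path matrix like the one above rolls up into a site's aggregate requirement, the sketch below simply sums the 2010 figures for paths terminating at FNAL. The per-path numbers are copied from the matrix; the straight summation (which ignores any overlap or statistical multiplexing between flows) is our own illustration, not an official ESnet aggregation.

```python
# Illustrative roll-up of the 2010 CMS bandwidth matrix at FNAL.
# Where the matrix gives a range (CERN -> FNAL: 20-40 Gbps),
# both ends of the range are carried through.

cms_2010_gbps = {
    "CERN":            (20, 40),
    "U. of Michigan":  (10, 10),
    "Caltech":         (10, 10),
    "MIT":             (10, 10),
    "Purdue":          (10, 10),
    "UC San Diego":    (10, 10),
    "U. of Florida":   (10, 10),
    "U. of Nebraska":  (10, 10),
    "U. of Wisconsin": (10, 10),
    "Tier3 aggregate": (20, 20),
}

low = sum(lo for lo, _ in cms_2010_gbps.values())
high = sum(hi for _, hi in cms_2010_gbps.values())
print(f"FNAL 2010 aggregate (simple sum): {low}-{high} Gbps")   # 120-140 Gbps
```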
9  Aggregation of Requirements from All Case Studies
Analysis of diverse programs and facilities yields dramatic convergence on a well-defined set of requirements:
– Reliability (the downtime each availability target allows is worked out in the sketch below)
  - Fusion: 1 minute of slack during an experiment (99.999%)
  - LHC: a small number of hours per year (99.95+%)
  - SNS: limited instrument time makes outages unacceptable
  - Drives the requirement for redundancy, both in site connectivity and within ESnet
– Connectivity
  - Geographic reach equivalent to that of scientific collaboration
  - Multiple peerings to add reliability and bandwidth to interdomain connectivity
  - Critical both within the US and internationally
– Bandwidth
  - 10 Gbps site-to-site connectivity today
  - 100 Gbps backbone by 2010
  - Multiple 10 Gbps R&E peerings
  - Ability to easily deploy additional 10 Gbps lambdas and peerings
  - Per-lambda bandwidth of 40 Gbps or 100 Gbps should be available by 2010
– Bandwidth and service guarantees
  - All R&E networks must interoperate as one seamless fabric to enable end-to-end service deployment
  - Flexible-rate bandwidth guarantees
– Collaboration support (federated trust, PKI, AV conferencing, etc.)
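The availability targets quoted above translate directly into allowed downtime. The following sketch (a simple conversion, assuming a 365-day year) shows why 99.95% corresponds to the "less than 4 hours per year" figure cited for the LHC, and why 99.999% leaves only a handful of minutes per year, forcing full redundancy.

```python
# Convert an availability fraction into allowed downtime per year.
SECONDS_PER_YEAR = 365 * 24 * 3600

def downtime_per_year(availability: float) -> str:
    seconds = (1.0 - availability) * SECONDS_PER_YEAR
    if seconds < 3600:
        return f"{seconds / 60:.1f} minutes/year"
    return f"{seconds / 3600:.1f} hours/year"

print("99.95%  ->", downtime_per_year(0.9995))    # ~4.4 hours/year
print("99.999% ->", downtime_per_year(0.99999))   # ~5.3 minutes/year
```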
10  ESnet Traffic Has Increased by 10X Every 47 Months, on Average, Since 1990
[Figure: log plot of ESnet monthly accepted traffic (terabytes/month), January 1990 – June 2006. Milestones: 100 MBy/month in Aug. 1990; 1 TBy/month in Oct. 1993 (38 months later); 10 TBy/month in Jul. 1998 (57 months later); 100 TBy/month in Nov. 2001 (40 months later); 1 PBy/month in Apr. 2006 (53 months later).]
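The headline growth rate is simply the average spacing of the milestones annotated on the plot; a one-line check:

```python
# Average interval between successive 10x traffic milestones
# (100 MBy -> 1 TBy -> 10 TBy -> 100 TBy -> 1 PBy per month).
intervals_months = [38, 57, 40, 53]
print(sum(intervals_months) / len(intervals_months), "months per 10x, on average")  # 47.0
```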
11  Requirements from Network Utilization Observation
In 4 years we can expect a 10x increase in traffic over current levels, even without the addition of production LHC traffic:
– The nominal average load on the busiest backbone links is greater than 1 Gbps today
– In 4 years that figure will exceed 10 Gbps if current trends continue (see the projection sketch below)
Measurements of this kind are science-agnostic:
– It doesn't matter who the users are; the traffic load is increasing exponentially
Bandwidth trends drive the requirement for a new network architecture:
– The new ESnet4 architecture is designed with these drivers in mind
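The 10 Gbps projection follows directly from the observed growth rate. A minimal sketch, assuming the ~1 Gbps busiest-link average load quoted above and the 10x-per-47-months trend from the previous slide:

```python
# Extrapolate busiest-link load using the observed exponential trend.
GROWTH_PERIOD_MONTHS = 47     # months per 10x increase (observed since 1990)
current_load_gbps = 1.0       # nominal busiest-link average load today

def projected_load(months_ahead: float) -> float:
    return current_load_gbps * 10 ** (months_ahead / GROWTH_PERIOD_MONTHS)

print(f"In 48 months: ~{projected_load(48):.1f} Gbps")   # ~10.5 Gbps
```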
12  Requirements from Traffic Flow Observations
Most ESnet science traffic has a source or sink outside of ESnet:
– Drives the requirement for high-bandwidth peering
– Reliability and bandwidth requirements demand that peering be redundant
– Multiple 10 Gbps peerings exist today; it must be possible to add more flexibly and cost-effectively
Bandwidth and service guarantees must traverse R&E peerings:
– "Seamless fabric"
– Collaboration with other R&E networks on a common framework is critical
Large-scale science is becoming the dominant user of the network:
– Satisfying the demands of large-scale science traffic into the future will require a purpose-built, scalable architecture
– Traffic patterns are different from those of the commodity Internet
– Since large-scale science will be the dominant user going forward, the network should be architected to serve large-scale science
13  Aggregation of Requirements from Network Observation
Traffic load continues to increase exponentially:
– The 15-year trend indicates an increase of 10x in the next 4 years
– This means the backbone traffic load will exceed 10 Gbps within 4 years, requiring increased backbone bandwidth
– A new architecture is needed: ESnet4
Large science flows typically cross network administrative boundaries, and are beginning to dominate:
– Requirements such as bandwidth capacity and reliability apply to peerings as well as to ESnet itself
– Large-scale science is becoming the dominant network user
14  Required Network Services Suite for DOE Science
We have collected requirements from diverse science programs, program offices, and network analysis. The following summarizes those requirements:
– Reliability
  - 99.95% to 99.999% reliability
  - Redundancy is the only way to meet the reliability requirements:
    - redundancy within ESnet
    - redundant peerings
    - redundant site connections where needed
– Connectivity
  - Geographic reach equivalent to that of scientific collaboration
  - Multiple peerings to add reliability and bandwidth to interdomain connectivity
  - Critical both within the US and internationally
– Bandwidth
  - 10 Gbps site-to-site connectivity today
  - 100 Gbps backbone by 2010
  - Multiple 10+ Gbps R&E peerings
  - Ability to easily deploy additional lambdas and peerings
– Service guarantees
  - All R&E networks must interoperate as one seamless fabric to enable end-to-end service deployment
  - Guaranteed bandwidth, traffic isolation, quality of service
  - Flexible-rate bandwidth guarantees
– Collaboration support
  - Federated trust, PKI (Grid, middleware)
  - Audio and video conferencing
– Production ISP service