1 Experiences and results from implementing the QBone Scavenger
Les Cottrell – SLAC
Presented at the CENIC meeting, San Diego, May 2002
www.slac.stanford.edu/grp/scs/talk/cenic-may02.html
Partially funded by the DOE/MICS Field Work Proposal on Internet End-to-end Performance Monitoring (IEPM); also supported by IUPAP
2 Outline
–Needs for High Energy & Nuclear Physics (HENP)
–Why we need a scavenger service
–What the scavenger service is
–How it is used
–Results of tests with 10 Mbps, 100 Mbps and 2 Gbps bottlenecks
–How we could use it
3 HENP Experiment Model
Worldwide collaborations are necessary for large undertakings
Regional computer centers in France, Italy, UK & US
–Spending Euros on a data center at SLAC is not attractive
–Leverage local equipment & expertise
Resources available to all collaborators
Requirements – bulk (60% of SLAC traffic):
–Bulk data replication (current goal > 100 MBytes/s)
–Optimized cached read access to 10-100 GB from a 1 PB data set
4 Data requirements for HEP
HEP accelerator experiments generate 10's to 100's of MBytes/s of raw data (100 MBytes/s == 3.6 TB/hr)
–Already heavily filtered in trigger hardware/software to choose only "potentially" interesting events
–Data rate limited by the ability to record and use the data
Data is analyzed to reconstruct tracks, events etc. from the electronics signal data
–Requires computing resources at several (tier 1) sites worldwide; for BaBar this includes France, UK & Italy
–Data has to be sent to sites, and reconstructions have to be shared
Reconstructed data is summarized into an object-oriented database providing the parameters of the events
Summarized data is analyzed by physicists around the world looking for physics and equipment understanding: thousands of physicists in hundreds of institutions in tens of countries
In addition, Monte Carlo methods are used to create simulated events to compare with real events
–Also very CPU intensive, so done at multiple sites such as LBNL, LLNL, Caltech, with results shared with other sites
5 HENP Data Grid Hierarchy
[Diagram: tiered data grid. Tier 0+1 at CERN (online system, ~700k SI95, ~1 PB disk, tape robot; ~100-400 MBytes/s from the experiment, ~PByte/s physics data cache) feeds Tier 1 centers (FNAL: 200k SI95, 600 TB; IN2P3; INFN; RAL) over ~2.5 Gbps links; Tier 2 centers connect at ~2.5 Gbps, Tier 3 institutes (~0.25 TIPS) at 100-1000 Mbits/s, with Tier 4 workstations below. CERN/outside resource ratio ~1:2; Tier0/(sum of Tier1s)/(sum of Tier2s) ~1:1:1. Physicists work on analysis "channels"; each institute has ~10 physicists working on one or more channels.]
6 HEP Next Generation Network Needs
Providing rapid access to event samples and subsets from massive data stores
–From ~400 Terabytes in 2001, ~Petabytes by 2002, ~100 Petabytes by 2007, to ~1 Exabyte by ~2012
Providing analyzed results with rapid turnaround, by coordinating and managing the LIMITED computing, data handling and NETWORK resources effectively
Enabling rapid access to the data and the collaboration
–Across an ensemble of networks of varying capability
Advanced integrated applications, such as Data Grids, rely on seamless operation of our LANs and WANs
–With reliable, quantifiable (monitored), high performance
–For "Grid-enabled" event processing, data analysis, and collaboration
7 Throughputs today
Can get 400 Mbits/s TCP throughput regularly from SLAC to well-connected sites on production ESnet or Internet2 within the US
–Need big windows & multiple streams, and > 500 MHz CPUs
Usually a single transfer is disk limited to < 70 Mbits/s
[Chart of measured throughputs by site; * marks trans-Atlantic paths]
Also see http://www-iepm.slac.stanford.edu/monitoring/bulk/ and the Internet2 E2E Initiative: http://www.internet2.edu/e2e
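The need for "big windows" follows from the bandwidth-delay product: to keep 400 Mbits/s flowing, roughly bandwidth × RTT bytes must be in flight at once. A minimal sketch of the arithmetic, using assumed example RTTs (the slide does not quote specific RTT figures):

```python
# The window a TCP connection needs is roughly the bandwidth-delay product.
# 400 Mbit/s is the throughput quoted on the slide; the RTTs below are
# assumed illustrative values, not measurements from the talk.
def tcp_window_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Bytes that must be in flight to fill the pipe: bandwidth * RTT."""
    return bandwidth_bps * rtt_s / 8

for label, rtt_ms in [("cross-US path, ~70 ms RTT (assumed)", 70),
                      ("trans-Atlantic path, ~160 ms RTT (assumed)", 160)]:
    mb = tcp_window_bytes(400e6, rtt_ms / 1000) / 1e6
    print(f"{label}: ~{mb:.1f} MB of window needed for 400 Mbit/s")
```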
8 Why do we need even higher speeds
Data growth exceeds Moore's law
New experiments coming on line
Experiment with higher speeds to:
–Understand the next limitations:
  End hosts: disks, memory, compression
  Application steering, windows, streams, tuning stacks, choosing replicas…
  Improving or replacing TCP stacks, forward error correction, non-congestion-related losses…
  Coexistence, need for QoS, firewalls…
–Set expectations
–Change mindset
NTON enabled us to be prepared to change from shipping tapes to using the network, and assisted in more realistic planning
9 In addition … Requirements – interactive: –Remote login, video conferencing, document sharing, joint code development, co-laboratory (remote operations, reduced travel, more humane shifts) –Modest bandwidth – often < 1 Mbps –Emphasis on quality of service & sub-second responses How to get the best of both worlds: –Use all available bandwidth –Minimize impact on others One answer is to be a scavenger
10 What is QBSS
QBSS stands for QBone Scavenger Service. It is an Internet2 initiative to let users and applications
–take advantage of otherwise unused bandwidth
–without affecting the performance of the default best-effort class of service
QBSS corresponds to a specific Differentiated Services Code Point (DSCP): DSCP = 001000 (binary)
The IPv4 ToS (Type of Service) octet looks like:
–Bits 0-2 = Class selector
–Bits 0-5 = DSCP (Differentiated Services Code Point)
–Bits 6-7 = Explicit Congestion Notification (ECN)
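Since the DSCP occupies the top six bits of the ToS octet, the byte value an application actually sets is the DSCP shifted left by two bits. A small sketch of the arithmetic, which also shows where the decimal TOS=32 used in the later bbcp tests comes from:

```python
# The DSCP sits in bits 0-5 of the ToS octet (bits 6-7 are ECN), so the byte
# value an application sets is the 6-bit DSCP shifted left two places.
QBSS_DSCP = 0b001000          # scavenger codepoint from the QBone definition
BE_DSCP   = 0b000000          # default best effort

def dscp_to_tos(dscp: int) -> int:
    """Return the ToS/DS octet value corresponding to a 6-bit DSCP."""
    return dscp << 2

print(dscp_to_tos(QBSS_DSCP))   # 32 -- the TOS=32 used in the bbcp tests
print(dscp_to_tos(BE_DSCP))     # 0  -- ordinary best-effort traffic
```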
11 How is it used
Users can voluntarily mark their traffic with the QBSS codepoint
–Much as they would type nice on Unix
Routers can mark packets for users/applications
Routers that see traffic marked with the QBSS codepoint can:
–Be configured to handle it: forward it at a lower priority than best-effort traffic, with the possibility of expanding its bandwidth when other traffic is not using all the capacity
–Not know about it: treat it as regular Best Effort (DSCP 000000)
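How an application might mark its own traffic is shown in the sketch below: on a Unix host the standard route is to set IP_TOS on the socket before sending. This is a minimal illustration, not the mechanism the talk's tools used; the destination host and port are placeholders.

```python
import socket

QBSS_TOS = 0x20   # DSCP 001000 shifted into the ToS octet (decimal 32)

# Minimal sketch: open a TCP connection and ask the kernel (Unix/Linux IP_TOS
# socket option) to mark every outgoing packet with the scavenger codepoint.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, QBSS_TOS)
sock.connect(("replica.example.org", 5001))   # hypothetical destination
sock.sendall(b"bulk data that can wait for spare capacity\n")
sock.close()
```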
12 Impact on Others
Make ping measurements with & without iperf TCP loading
–Loss, loaded vs unloaded
–RTT
Looking at how to avoid impact: e.g. QBSS/LBE, application pacing, a control loop on RTT, reducing streams; want to avoid scheduling
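A hedged sketch of the kind of measurement described above: run ping against the far end while the link is (or is not) loaded by an iperf TCP stream, and pull the packet loss and average RTT out of the summary. The hostname is a placeholder, and the parsing assumes the usual Linux ping summary format.

```python
import re
import subprocess

def ping_summary(host: str, count: int = 20) -> tuple[float, float]:
    """Run ping and return (loss %, average RTT ms); assumes Linux ping output."""
    out = subprocess.run(["ping", "-q", "-c", str(count), host],
                         capture_output=True, text=True).stdout
    loss = float(re.search(r"([\d.]+)% packet loss", out).group(1))
    avg_rtt = float(re.search(r"= [\d.]+/([\d.]+)/", out).group(1))
    return loss, avg_rtt

# Hypothetical usage: compare an idle link with one loaded by iperf.
print(ping_summary("far-end.example.net"))
```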
13 QBSS test bed with Cisco 7200s
Set up a QBSS testbed
–Has a 10 Mbps bottleneck
Configure router interfaces
–3 traffic types: QBSS, BE, Priority
–Define policy, e.g. QBSS > 1%, priority < 30%
–Apply policy to router interface queues
[Testbed diagram: hosts on 100 Mbps and 1 Gbps links feed Cisco 7200s joined by the 10 Mbps bottleneck]
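The Cisco configuration itself is not reproduced in the slides. As a rough illustration of what the policy above is meant to achieve, the toy model below (plain Python, not router code) caps Priority traffic at 30%, lets best effort go next, and gives QBSS whatever is left while guaranteeing it at least 1% of the bottleneck:

```python
# Toy model of the queueing policy on the 10 Mbps bottleneck (illustration only):
# priority is policed to 30%, best effort then takes what it needs, and QBSS
# scavenges the remainder but is guaranteed at least 1%.
LINK = 10.0  # Mbps bottleneck from the testbed

def allocate(priority_demand, be_demand, qbss_demand):
    priority = min(priority_demand, 0.30 * LINK)        # policed to < 30%
    qbss_floor = 0.01 * LINK                            # small QBSS guarantee
    be = min(be_demand, LINK - priority - qbss_floor)   # BE served ahead of QBSS
    qbss = min(qbss_demand, LINK - priority - be)       # QBSS takes the leftovers
    return priority, be, qbss

print(allocate(0, 0, 10))   # idle link: QBSS expands to ~10 Mbps
print(allocate(3, 10, 10))  # busy link: QBSS squeezed back to its ~1% floor
```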
14 Using bbcp to make QBSS measurements
Run bbcp with source data from /dev/zero and destination /dev/null, reporting throughput at 1-second intervals
–with TOS=32 (decimal) (QBSS)
–After 20 s, run bbcp with no TOS bits specified (BE)
–After 20 s, run bbcp with TOS=40 (decimal) (priority)
–After 20 more secs, turn off Priority
–After 20 more secs, turn off BE
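A sketch of the test timeline only, shown below; the actual bbcp command lines (including how bbcp is told which ToS value to use) came from the talk's test scripts and are not reproduced here, so a long-running placeholder command stands in for each transfer.

```python
import subprocess
import time

def start_transfer(label: str) -> subprocess.Popen:
    """Placeholder for launching a bbcp transfer; 'sleep' stands in for bbcp."""
    print("starting:", label)
    return subprocess.Popen(["sleep", "120"])

qbss = start_transfer("QBSS transfer, TOS=32")        # t = 0 s
time.sleep(20)
be = start_transfer("best-effort transfer, no TOS")   # t = 20 s
time.sleep(20)
prio = start_transfer("priority transfer, TOS=40")    # t = 40 s
time.sleep(20)
prio.terminate()                                      # t = 60 s: stop Priority
time.sleep(20)
be.terminate()                                        # t = 80 s: stop BE
qbss.terminate()                                      # end of observation window
```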
15 Example of effects Also tried: 1 stream for all, and priority at 70%
16 QBSS with Cisco 6500
6500s + Policy Feature Card (PFC)
–Routing by PFC2, policing on switch interfaces
–2 queues, 2 thresholds each
–QBSS assigned to its own queue with 5% of the bandwidth – guarantees QBSS gets something
–BE & Priority traffic in the 2nd queue with 95% of the bandwidth
–Apply an ACL to the switch port to police Priority traffic to < 30%
[Testbed diagram: hosts on 100 Mbps and 1 Gbps links feed Cisco 6500s with MSFC/Sup2; chart of bandwidth share over time shows BE up to 100%, Priority capped at 30%, QBSS ~5%]
17 Impact on response time (RTT)
Run ping while loading the link with iperf (~93 Mbps) under various QoS settings
–No iperf load: ping average RTT ~300 usec (regardless of QoS)
–Iperf = QBSS, ping = BE or Priority: RTT ~550 usec, i.e. 70% greater than unloaded
–Iperf and ping in the same QoS class (except Priority): RTT ~5 msec, more than a factor of 10 larger than unloaded
18 SC2001
Our challenge: Bandwidth to the world
Demonstrate current data transfer capabilities to several sites worldwide:
–26 sites all over the world
–Iperf servers at each remote site that can accept data coming from the show floor
Mimic a high energy physics tier 0 or tier 1 site (an accelerator or major computation site) distributing copies of the raw data to multiple replica sites
19 SC2001 Setup
[Diagram: booth network connected through two Cisco 6509 switches to the SC2001 NOC and out to the world]
The configuration of the two 6509 switches defined the baseline for QBSS traffic at 5% of the total bandwidth
The Gigabit Ethernet lines to the NOC were EtherChanneled together to give an aggregate 2 Gbps link
The setup at SC2001 had three Linux PCs with a total of 5 Gigabit Ethernet interfaces
20 SC2001 demo 1/2
Send data from 3 SLAC/FNAL booth computers to over 20 other sites with good connections in about 6 countries
–Iperf TCP throughputs ranged from 3 Mbps to ~300 Mbps
Saturate the 2 Gbps connection to the floor network
–Maximum aggregate throughput averaged over 5 min. ~1.6 Gbps
Apply QBSS to the highest-performance site, and BE to the rest
Pings to a host on the show floor: Priority 9±2 ms, BE 18.5±3 ms, QBSS 54±100 ms
[Chart: iperf TCP throughput per GE interface (Mbits/s) vs. time over ~5 minutes, with and without QBSS]
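A hedged sketch of driving transfers like these: launch an iperf client towards each remote server in parallel and mark the stream to one chosen site as scavenger. The hostnames, window size and stream count are placeholders, and the sketch assumes the iperf client's -S/--tos option for setting the ToS byte.

```python
import subprocess

# Hypothetical site list; the real demo used more than 20 well-connected sites.
sites = ["site-a.example.net", "site-b.example.net", "site-c.example.net"]
scavenged = {"site-a.example.net"}   # mark the best-performing site as QBSS

procs = []
for host in sites:
    cmd = ["iperf", "-c", host, "-w", "1M", "-P", "4", "-t", "300"]
    if host in scavenged:
        cmd += ["-S", "0x20"]        # assumed iperf ToS option; 0x20 = QBSS
    procs.append(subprocess.Popen(cmd))

for p in procs:
    p.wait()
```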
21 Possible usage
Apply Priority to lower-volume interactive voice/video-conferencing and real-time control
Apply QBSS to high-volume data replication
Leave the rest as Best Effort
Since 40-65% of the bytes to/from SLAC come from a single application, we have modified it to enable setting of the TOS bits
Need to identify bottlenecks and implement QBSS there
–Bottlenecks tend to be at the edges, so we hope to try this with a few HEP sites
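The application in question is not named on the slide. As a hedged sketch, the helper below shows the kind of per-class marking an application could apply when it opens a connection; the class names and the priority codepoint (0x28, i.e. the TOS=40 used in the tests) are chosen for illustration only, not taken from the application's actual configuration.

```python
import socket

# Illustrative mapping from traffic class to ToS byte; values are assumptions
# based on the codepoints used in the talk's tests, not the application's own.
TOS_BY_CLASS = {
    "interactive": 0x28,   # Priority: voice/video conferencing, real-time control
    "bulk":        0x20,   # QBSS: high-volume data replication
    "default":     0x00,   # Best Effort: everything else
}

def open_marked_socket(host: str, port: int, traffic_class: str) -> socket.socket:
    """Open a TCP connection whose packets carry the class's ToS marking."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.IPPROTO_IP, socket.IP_TOS,
                 TOS_BY_CLASS.get(traffic_class, 0x00))
    s.connect((host, port))
    return s
```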
22 Acknowledgements & More Information
Official Internet2 page:
–http://qbone.internet2.edu/qbss/
IEPM/PingER home site:
–www-iepm.slac.stanford.edu/
Bulk throughput site:
–www-iepm.slac.stanford.edu/bw
QBSS measurements:
–www-iepm.slac.stanford.edu/monitoring/qbss/measure.html
CENIC Network Applications Magazine, vol 2, April '02:
–www.cenic.org/InterAct/interactvol2.pdf
Thanks to Stanislav Shalunov of Internet2 for inspiration and encouragement, and to Paola Grosso, Stefan Luitz, Warren Matthews & Gary Buhrmaster of SLAC for setting up routers and helping with measurements.