Published by Timothy Curtis. Modified over 9 years ago.
1
Center of Excellence Wireless and Information Technology, CEWIT 2008
TeraPaths: Managing Flow-Based End-to-End QoS Paths
Experience and Lessons Learned
Dimitrios Katramatos, Dantong Yu, Kunal Shroff (Brookhaven National Laboratory)
Thomas Robertazzi (Stony Brook University)
Shawn McKee (University of Michigan)
2
Abstract
TeraPaths is a Department of Energy funded network research project to support efficient, predictable, and prioritized peta-scale data replication in modern high-speed networks. The TeraPaths network management framework establishes on demand, and then manages, true end-to-end, QoS-aware virtual network paths across multiple administrative network domains. TeraPaths dedicates network resources to data flows specifically authorized to use such network paths, in a transparent and scalable manner. This ensures that only selected flows receive a pre-determined, guaranteed level of QoS in terms of bandwidth, jitter, delay, etc.
3
Speaker’s Biography
Dantong Yu, Brookhaven National Laboratory
Dantong Yu received the Ph.D. degree in Computer Science from the State University of New York at Buffalo, USA, in 2001. His research interests include high-speed network performance, network Quality of Service, cluster/grid computing, information retrieval, data mining, databases, and data warehouses. He leads the large volume WAN data transfer between CERN, BNL, ATLAS and RHIC collaboration institutes over high-speed networks with Grid middleware.
4
Outline
Background: the TeraPaths project
Establishing flow-based end-to-end QoS paths
Domain interoperation
Encountered issues and proposed solutions
Project status and future work
Conclusions
5
Background
Provide QoS guarantees at the individual data flow level, all the way to the end hosts, transparently
  – Data flows have varying priority/importance
      » Video streams
      » Critical data
      » Long duration transfers
  – Default “best effort” network behavior treats all data flows as equal
  – Capacity is not unlimited
      » Congestion causes bandwidth and latency variations
      » Performance and service disruption problems, unpredictability
Dynamic flow-based SLAs = schedule network utilization
  – Regulate and classify (prioritize) traffic
6
End-to-End Setup
[Figure: Site A (hosts a1, a2), Site B (host b1), and Site C (host c1) connected across WAN domains through host routers, site border routers (one shown as a virtual border router), and regional provider routers. ACLs at Site A pair a1 with b1 and a2 with c1; ACLs at Site B pair b1 with a1; ACLs at Site C pair c1 with a2. Labels shown: VLAN X, VLAN Y, 10.100.1.x1, 10.100.1.x2, 10.100.1.y1, 10.100.1.y2.]
7
Establishing End-to-End QoS Paths
Multiple administrative domains
  – Cooperation, trust, but each maintains full control
  – Heterogeneous environment
  – Domain controller coordination through web services
Coordination models
  – Star
      » Requires extensive information for all domains
  – Daisy chain
      » Requires common flexible protocol across all domains
  – Hybrid (end-sites first)
      » Independent protocols
      » Direct end site negotiation
8
Path Setup (2)
End site subnets are configured by TeraPaths software instances (TeraPaths Domain Controllers or TDCs)
  – TDCs configure end site LANs to prioritize and regulate authorized flows via the DiffServ framework at the network device level
  – Source site polices/marks authorized flow packets
  – Destination site admits/re-polices/re-marks packets
  – End site LANs transmit/receive marked packets to/from the WAN
WAN provides MPLS tunnels or dynamic circuits
  – Initiating TDC requests MPLS tunnel or dynamic circuit with matching bandwidth and lifetime, or…
  – TDC groups flows with common src/dst into MPLS tunnel or dynamic circuit with aggregate bandwidth and lifetime
  – WAN preserves packet markings
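The source-site "police/mark" step can be illustrated with a token-bucket policer. This is only a minimal sketch and not TeraPaths code: in TeraPaths the policing is configured on network devices through DiffServ, whereas the class name `TokenBucketPolicer`, the use of the EF code point (46) for authorized traffic, and the choice to re-mark excess packets to best effort instead of dropping them are all assumptions made for illustration.

```python
from dataclasses import dataclass

DSCP_EF = 46           # Expedited Forwarding; assumed class for authorized flows
DSCP_BEST_EFFORT = 0   # default code point


@dataclass
class TokenBucketPolicer:
    """Single-rate policer: conforming packets keep the QoS mark, excess is re-marked."""
    rate_bps: float      # reserved bandwidth in bits per second
    burst_bytes: float   # bucket depth in bytes
    tokens: float = 0.0
    last_time: float = 0.0

    def __post_init__(self) -> None:
        self.tokens = self.burst_bytes  # start with a full bucket

    def mark(self, packet_bytes: int, now: float) -> int:
        """Return the DSCP value for a packet arriving at time `now` (seconds)."""
        elapsed = now - self.last_time
        self.last_time = now
        # Refill at rate_bps / 8 bytes per second, capped at the bucket depth.
        self.tokens = min(self.burst_bytes, self.tokens + elapsed * self.rate_bps / 8)
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return DSCP_EF
        return DSCP_BEST_EFFORT


# Hypothetical flow: 8 kbit/s reservation (1000 bytes/s) with a 1500-byte burst.
policer = TokenBucketPolicer(rate_bps=8_000, burst_bytes=1_500)
marks = [policer.mark(1_000, now=t) for t in (0.0, 0.1, 1.0)]
```

The second packet arrives before the bucket has refilled, so it falls back to best effort; the third arrives after the bucket is full again and keeps the QoS mark.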
9
Path Setup (3)
WAN domains interoperate
  – Each end site’s TDC has a single point of contact for WAN services
  – TDCs have no knowledge of WAN internals other than what is exposed by the WAN services
      » End sites have no direct control over the WAN
      » Either tunnel or circuit through WAN
  – TeraPaths does not mix and match layer 2 and layer 3 technologies
TeraPaths “proxy” servers
  – Implement interface required by TeraPaths core
  – Hide WAN service differences
  – Clients to WAN web services (currently OSCARS / DRAGON)
      » Close cooperation with ESnet and I2 development teams
  – Submit reservations for MPLS tunnels or dynamic circuits
  – Handle security requirements
  – Handle errors
10
Addressing L2-Specific Issues
Limitations with VLANs
  – Tag range (tentatively selected 50 VLANs: 3550 to 3599)
      » Each site may have its own range
  – Tag conflicts
      » Rely on WAN service
      » Eliminate by synchronizing site databases
      » VLAN renaming (if/when possible)
Scalability issues
  – Limited number of VLAN tags/circuits: flow grouping / circuit consolidation
      » Forward flows through same virtual WAN circuit
      » Create circuit with new parameters / switch current flows / cancel old circuit
      » Modify WAN reservations (if/when possible)
  – PBR overhead
      » Virtual border router
Sensitive/3rd party network segments
  – VLAN pass-thru
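Avoiding tag conflicts by synchronizing site databases amounts to choosing a tag that is free at both end sites. The sketch below assumes each site's database can be reduced to a set of tags in use; the function name `pick_vlan_tag` and the lowest-free-tag policy are hypothetical, and only the 3550 to 3599 range comes from the slide.

```python
from typing import Optional, Set

TAG_RANGE = range(3550, 3600)  # the tentatively selected 50 tags, 3550 to 3599


def pick_vlan_tag(used_at_src: Set[int], used_at_dst: Set[int]) -> Optional[int]:
    """Return the lowest tag free at both end sites, or None if the range is exhausted."""
    for tag in TAG_RANGE:
        if tag not in used_at_src and tag not in used_at_dst:
            return tag
    return None


# Hypothetical state: the source site uses 3550 and 3551, the destination uses 3552.
tag = pick_vlan_tag({3550, 3551}, {3552})
```

A `None` result corresponds to the scalability limit named on the slide, which is where flow grouping and circuit consolidation come in.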
11
Flow Grouping/Circuit Consolidation
Flows between same src and dst sites can share a circuit; policing maintains the bandwidth guarantee
Multiple TeraPaths reservations associate with the same circuit reservation
  – Easy when requirements are known in advance
  – Modification of reservations required otherwise
Selection/optimization to minimize resource waste
Trade-off based on Δbw (bandwidth difference) and Δt_b, Δt_a (time periods before and after a reservation)
[Figure: two panels of reservations 1 to 5 plotted as bandwidth vs. time, with Δt, Δbw, and the current time marked]
12
Flow Grouping/Circuit Consolidation (2)
Similar approach to disk buffering (read ahead / write behind)
  – Bring up ahead / tear down behind
  – Reuse existing active circuits
  – Reserve circuits with more bandwidth and longer duration depending on differences in start time, duration, bandwidth of reservations
  – Delay teardown, modify circuit duration and/or bandwidth if possible
[Figure: panels of reservations 1 to 5 plotted as bandwidth vs. time, with Δt_b, Δt_a, and the current time marked]
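One way to read the Δbw / Δt_b / Δt_a trade-off is as a test of how far an existing circuit would have to grow to carry a new reservation. The sketch below is an interpretation, not the TeraPaths selection algorithm: the `Circuit` and `Reservation` types, the treatment of committed bandwidth as constant over the circuit's lifetime, and the threshold values are all assumptions.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Circuit:
    start: float            # seconds
    end: float              # seconds
    bandwidth_mbps: float   # total circuit bandwidth
    committed_mbps: float   # already promised to grouped flows (assumed constant)


@dataclass(frozen=True)
class Reservation:
    start: float
    end: float
    bandwidth_mbps: float


def consolidation_deltas(circuit: Circuit, res: Reservation) -> Tuple[float, float, float]:
    """Return (d_bw, d_tb, d_ta): how far the circuit must grow to carry `res`."""
    spare = circuit.bandwidth_mbps - circuit.committed_mbps
    d_bw = max(0.0, res.bandwidth_mbps - spare)  # extra bandwidth needed
    d_tb = max(0.0, circuit.start - res.start)   # bring the circuit up this much earlier
    d_ta = max(0.0, res.end - circuit.end)       # delay teardown by this much
    return d_bw, d_tb, d_ta


def should_consolidate(circuit: Circuit, res: Reservation,
                       max_d_bw: float = 200.0, max_d_t: float = 600.0) -> bool:
    """Reuse the circuit only if every delta stays within its (assumed) threshold."""
    d_bw, d_tb, d_ta = consolidation_deltas(circuit, res)
    return d_bw <= max_d_bw and d_tb <= max_d_t and d_ta <= max_d_t


circuit = Circuit(start=1000, end=4600, bandwidth_mbps=1000, committed_mbps=700)
near = Reservation(start=900, end=5000, bandwidth_mbps=400)
far = Reservation(start=900, end=9000, bandwidth_mbps=400)
```

Here `near` needs 100 Mbps more, a 100 s earlier bring-up, and a 400 s later teardown, so it would share the circuit; `far` would stretch the teardown by 4400 s and would get its own circuit under these thresholds.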
13
Limitation of Dynamic Circuits
A recent incident in BNL’s LHCOPN subnet:
  – Cisco’s PBR implementation only uses the status of an interface to decide whether or not to forward packets
  – A network circuit breaks somewhere along the path, but the involved interfaces on both ends are still up
  – No probes and/or heartbeat exist to check the “health” of circuits
  – Fail-over to the backup link does not work since primary interfaces are up even when such a problem exists
End site monitoring is the most effective way to detect such a problem
14
Active Circuit Probing
Each TeraPaths site instance periodically verifies “well being” of reservations:
  – Selects active reservations initiated by site (site responsibility)
  – Finds circuit/VLAN associated with each reservation
  – Performs a circuit check with a quick pinging of other site’s router (private IP address space)
  – Less than 100% success triggers a recheck with longer duration pings in both directions (to and from other site)
  – Low success % triggers reservation cancellation, reverting traffic to best effort network
  – Optionally, the system adapts reservation data and attempts to set up a new end-to-end path (for given time period/number of attempts)
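The two-stage check described above can be sketched as a small decision function. The ping mechanics are left to caller-supplied callables so that the logic stays self-contained; the name `check_circuit`, the ping counts, and the 80% threshold for "low success" are assumptions, since the slide gives no concrete values.

```python
from typing import Callable

# A probe takes a ping count and returns the fraction answered, in [0.0, 1.0].
Probe = Callable[[int], float]


def check_circuit(probe_out: Probe, probe_back: Probe,
                  quick_count: int = 3, long_count: int = 20,
                  min_success: float = 0.8) -> str:
    """Return 'ok' or 'cancel' for the circuit behind one active reservation."""
    if probe_out(quick_count) == 1.0:
        return "ok"  # quick check fully succeeded
    # Anything below 100% triggers a longer recheck in both directions.
    worst = min(probe_out(long_count), probe_back(long_count))
    return "ok" if worst >= min_success else "cancel"


# Stand-in probes for illustration; a real probe would ping the other site's router.
healthy: Probe = lambda count: 1.0
lossy: Probe = lambda count: 0.9
dead: Probe = lambda count: 0.0
```

A `"cancel"` result corresponds to cancelling the reservation and reverting traffic to the best effort network; the optional attempt to set up a new end-to-end path would follow from there.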
15
Prioritizing Traffic
[Figure: plot annotated “competing traffic causes dramatic drop in bandwidth” and “QoS / circuit reservation active”]
16
Recovering from Circuit Failure
[Figure: plot annotated “circuit interruption” and “recovery to best effort”]
17
Competing against BE traffic
18
Status
BNL, UMich, BU, all with 10 Gbps connections, multiple pass-thru configurations (BNL, UMich, NoX, Merit, MiLR)
Utilization of L3 paths (MPLS tunnels, ESnet only), L2 paths (dynamic circuits, ESnet and Internet2)
Multiple QoS reservations through same circuit (support for circuit consolidation)
Multiple circuits per site subject to per-site VLAN availability (flow grouping/circuit consolidation)
Active circuit probing for failures with fallback to best effort network/attempt to reconfigure e2e path (in testing phase)
Dynamic bandwidth allocation within service classes (in testing phase)
New command line client
19
Future Work
Continue working on automatic flow grouping / circuit consolidation
Configurable reservation negotiation
Grid-style AAA (GUMS/VOMS)
Plug-ins: SRM (dCache), others
Compatibility with Lambda Station
Support for different hardware as needed
ATLAS Production:
  – Replicate ATLAS Physics data from BU and UMich with the existing ATLAS DDM stack, and with end-to-end QoS circuits
  – Tier 1 (BNL) and Tier 2 data replication
http://www.terapaths.org
20
Conclusions
Demonstrated the effective prioritization and protection from interference of selected data transfers between three LHC experiment institutes (Brookhaven National Laboratory, the University of Michigan, and Boston University) through guaranteed bandwidth virtual paths, in the presence of intensive best-effort IP traffic sharing the same network resources
A practical and economical end-to-end network resource reservation system, extending new capabilities to users/applications of end sites without requiring additional, expensive network infrastructure components