Packet Drop in HP Switches. Guoming.

Presentation transcript:

Packet Drop in HP Switches Guoming

Cause: packet-based hashing in the F10 LAG + the HP switch buffer
Assumption: link utilization of 50%
With hashing, several events with different IP_ID values can end up on the same output link, although in the long term the load is still balanced very well among all LAG members.
Example:
[Figure: F10 LAG ports feeding an HP switch, with congestion at an HP output port. Labels: "Event 1: destination 1"; "Another round after having sent to all farm nodes".]
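The effect described on this slide can be illustrated with a small sketch. It is not the F10 hash, which the slides do not specify: the function `lag_link` below is a hypothetical stand-in that uses MD5 of the IP_ID, and the LAG width and event count are assumed values. It shows that consecutive events can land on the same link in short runs even though the long-term share per link is close to equal.

```python
import hashlib
from collections import Counter

NUM_LINKS = 4        # assumption: LAG width, chosen for illustration
NUM_EVENTS = 20000   # assumption: number of consecutive events to hash


def lag_link(ip_id: int, num_links: int = NUM_LINKS) -> int:
    """Hypothetical stand-in for the switch hash: MD5 of the 16-bit IP_ID."""
    digest = hashlib.md5((ip_id & 0xFFFF).to_bytes(2, "big")).digest()
    return int.from_bytes(digest[:4], "big") % num_links


def longest_run(values) -> int:
    """Length of the longest run of identical consecutive values."""
    best = run = 0
    previous = object()  # sentinel that never equals a link number
    for value in values:
        run = run + 1 if value == previous else 1
        best = max(best, run)
        previous = value
    return best


# One event per IP_ID; each event is mapped to exactly one LAG member.
links = [lag_link(ip_id) for ip_id in range(NUM_EVENTS)]
share = {link: count / NUM_EVENTS
         for link, count in sorted(Counter(links).items())}

print("share per link:", share)
print("longest run on one link:", longest_run(links))
```

A long run means several back-to-back events share one link, which is the short-term imbalance that leads to congestion downstream.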

What makes things worse?
HP switch available buffer: 350 ~ 500 KB
The available buffer depends on the frame size
Big event size, e.g. the case in slide 2: two events contend for the same output port in the HP switch, and packets get dropped if the event size is bigger than the buffer
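The drop condition above can be checked with simple arithmetic. The sketch below assumes a fluid model that is not stated in the slides: all links run at the same speed, each input delivers one event at line rate, and the output port drains at line rate. Under that assumption, two events arriving together leave a backlog of about one event size, which is why an event larger than the buffer leads to drops. The function names and the sample event sizes are hypothetical.

```python
KB = 1000  # assumption: decimal kilobytes


def peak_queue_bytes(event_bytes: int, n_inputs: int = 2) -> int:
    """Peak backlog at one output port under the fluid model.

    While the inputs are sending, the port receives n_inputs * event_bytes
    and drains event_bytes, so the backlog peaks at the difference.
    """
    return (n_inputs - 1) * event_bytes


def dropped_bytes(event_bytes: int, buffer_bytes: int, n_inputs: int = 2) -> int:
    """Bytes that do not fit in the buffer at the peak (0 if everything fits)."""
    return max(0, peak_queue_bytes(event_bytes, n_inputs) - buffer_bytes)


# Buffer sizes from the slide; event sizes are illustrative only.
for event_kb in (300, 400, 600):
    for buffer_kb in (350, 500):
        lost = dropped_bytes(event_kb * KB, buffer_kb * KB)
        print(f"event {event_kb} KB, buffer {buffer_kb} KB: {lost // KB} KB dropped")
```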

Simulation
Studies with simulation
Assumptions / simplifications:
1) All frames have the same size
2) Full farm size: 100 LAGs of 5 ports each, 30 nodes per rack
3) To speed up the simulation: 12 packets per event, 1.2 KB per frame
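As a rough illustration of this kind of study, the following toy model tracks the maximum queue length on the members of a single LAG. It is not the simulator used for the slides. It assumes slotted time (one slot is one frame transmission time), events that arrive as a whole burst on one randomly chosen link, and a seeded pseudo-random generator in place of the real hash. Only the 5 links per LAG and the 12 packets per event come from the slide.

```python
import random


def max_queue_pkts(utilization: float, links: int = 5, pkts_per_event: int = 12,
                   slots: int = 50_000, seed: int = 1) -> int:
    """Maximum per-link queue length (in packets) seen during the run."""
    # Probability that a new event arrives in a slot, chosen so that the
    # average load per link equals the requested utilization.
    p_event = utilization * links / pkts_per_event
    if not 0.0 <= p_event <= 1.0:
        raise ValueError("utilization out of range for this toy model")

    rng = random.Random(seed)
    queues = [0] * links
    max_queue = 0
    for _ in range(slots):
        if rng.random() < p_event:
            # All packets of one event go to the same LAG member.
            link = rng.randrange(links)
            queues[link] += pkts_per_event
            max_queue = max(max_queue, queues[link])
        # Each link transmits at most one packet per slot.
        queues = [max(0, q - 1) for q in queues]
    return max_queue


for load in (0.5, 0.8):
    print(f"utilization {load:.0%}: max queue {max_queue_pkts(load)} packets")
```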

Result: Max. Queue Length vs MEP factor
12 packets/event, link utilization: 80%
No clear correlation between queue length and MEP factor
[Table columns: MEP factor | F10Q Max (pkt) | HPQ Max (pkt)]

Result: Max. Queue Length vs number of links per LAG
Link utilization: 80%
[Table: F10Q Max and HPQ Max for 5, 6, 7 and 8 links per LAG]

Result: Max. Queue Length vs link utilization
5 links/LAG x 100
[Table columns: Link utilization (%) | F10Q Max | HPQ Max]

Possible solutions
Enable flow control
Others, if flow control does not help:
- Small MEP factor
- Change IP_IDENT (see the result in the next slide)
  1) half with IP_IDENT, the other half with IP_IDENT+1
  2) 1/3 IP_IDENT, 1/3 IP_IDENT+1, 1/3 IP_IDENT+2
  3) some other schemes...
- Change back to the original scheme: no LAG, small VLAN
- Feature request for F10: round-robin hashing
- Upgrade HP switches
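The IP_IDENT schemes listed above can be sketched as follows, under one possible reading: the packets of a single event are split into equal shares, and each share carries a different identifier so that the hash can place the shares on different LAG members. The helper names are hypothetical, and `lag_link` is again an MD5-based stand-in for the unspecified F10 hash.

```python
import hashlib
from collections import Counter


def lag_link(ip_ident: int, num_links: int = 5) -> int:
    """Hypothetical stand-in for the switch hash: MD5 of the 16-bit IP_IDENT."""
    digest = hashlib.md5((ip_ident & 0xFFFF).to_bytes(2, "big")).digest()
    return int.from_bytes(digest[:4], "big") % num_links


def packet_idents(ip_ident: int, n_pkts: int, n_variants: int) -> list:
    """Give the k-th contiguous share of the packets the value ip_ident + k."""
    return [ip_ident + (i * n_variants) // n_pkts for i in range(n_pkts)]


def max_burst_per_link(ip_ident: int, n_pkts: int = 12,
                       n_variants: int = 1, num_links: int = 5) -> int:
    """Largest number of packets of one event that land on a single link."""
    per_link = Counter(lag_link(ident, num_links)
                       for ident in packet_idents(ip_ident, n_pkts, n_variants))
    return max(per_link.values())


for variants in (1, 2, 3):
    burst = max_burst_per_link(100, n_variants=variants)
    print(f"{variants} identifier(s) per event: up to {burst} packets on one link")
```

With one identifier the whole event stays on one link. With two or three identifiers the burst per link shrinks whenever the shares hash to different links, which is not guaranteed for every event.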

Simulation Result: changing IP_IDENT (1)
[Two tables, one for the 1/2 + 1/2 scheme and one for the 1/3 + 1/3 + 1/3 scheme; columns in each: Link utilization (%) | F10Q Max | HPQ Max]