1
Linux Traffic Control and usage of tc/tcng for traffic engineering in Linux
October 14, 2008
Laziz Yunusov
Advanced Networking Technology Lab. (YU-ANTL), Dept. of Information & Comm. Eng., Graduate School, Yeungnam University, Korea
2
Outline
Introduction
Processing of network data in Linux kernel
Queuing disciplines
Introduction to tc
Terminology
Example configurations using tc: FIFO, PQ
Overview of tcng
Installation of tcng
Slides to be added
References
Laziz Yunusov, Linux Traffic Control
3
Introduction
Linux traffic control can be used to build a complex combination of queuing disciplines, classes and filters that control the packets sent on the output interface.
The input de-multiplexer examines incoming packets to determine whether they are destined for the local node.
If so, they are sent to the higher layer for further processing.
If not, they are sent to the forwarding block.
The forwarding block, which may also receive locally generated packets from the higher layer, looks up the routing table and determines the next hop for each packet.
It then queues the packets for transmission on the output interface.
4
Processing of network data in Linux kernel
How does the kernel process data received from the network?
The blue blocks in the slide's diagram are called the "Traffic Control Code of the Linux kernel".
Ingress policing: undesirable incoming packets may be discarded, e.g. if traffic is arriving too fast.
After policing, packets are either forwarded directly to the network (e.g. on a different interface, if the machine is acting as a router or a bridge), or passed up to higher layers in the protocol stack (e.g. to a transport protocol like UDP or TCP) for further processing.
Those higher layers may also generate data on their own and hand it to the lower layers for tasks like encapsulation, routing, and eventually transmission.
5
Processing of network data in Linux kernel (2)
Forwarding includes the selection of the output interface, the selection of the next hop, encapsulation, etc.
Once all this is done, packets are queued on the respective output interface.
This is the second place where traffic control comes into play.
At output queuing, traffic control can:
decide whether packets are queued or dropped (e.g. if the queue has reached some length limit, or if the traffic exceeds some rate limit);
decide in which order packets are sent (e.g. to give priority to certain flows);
delay the sending of packets (e.g. to limit the rate of outbound traffic), etc.
Once traffic control has released a packet for sending, the device driver picks it up and emits it on the network.
6
Major components of Linux Traffic Control Code
Four major conceptual components:
Queuing disciplines
Classes
Filters
Policers
7
Queuing disciplines (qdisc)
Algorithms that control how enqueued packets are treated.
Simplest example: a queuing discipline that is a FIFO queue of 16 packets.
Packets enter at the tail and leave the queue from the head.
If packets arrive faster than they can be dispatched, the queue fills up and the router drops the excess packets.
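The FIFO behaviour described above can be modelled in a few lines of Python. This is only an illustrative sketch, not kernel code: the class name TailDropFifo and its fields are made up for this example, and the limit of 16 packets is taken from the slide.

```python
from collections import deque


class TailDropFifo:
    """Bounded FIFO: packets enter at the tail, leave from the head,
    and are dropped on arrival when the queue is already full."""

    def __init__(self, limit=16):
        self.limit = limit        # maximum number of queued packets
        self.queue = deque()
        self.dropped = 0          # count of tail-dropped packets

    def enqueue(self, packet):
        if len(self.queue) >= self.limit:
            self.dropped += 1     # queue full: drop the arriving packet
            return False
        self.queue.append(packet)
        return True

    def dequeue(self):
        # Head of the queue is sent first; None means the queue is empty.
        return self.queue.popleft() if self.queue else None


# A burst of 20 packets arrives before anything can be dispatched.
fifo = TailDropFifo(limit=16)
for seq in range(20):
    fifo.enqueue(seq)
print("dropped:", fifo.dropped, "queued:", len(fifo.queue))
```

With a burst larger than the limit, the packets that arrive last are the ones lost, while the order of the queued packets is preserved.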
8
Queuing disciplines (2)
But there are many other qdiscs, more complex than FIFO.
Some of these qdiscs do not store packets themselves.
Instead, they contain other qdiscs, which they give packets to and take packets from.
The interesting point is that these contained queuing disciplines can be arbitrary queuing disciplines.
In Linux this concept is handled by classes, and such qdiscs are known as classful qdiscs.
Classes
Each class can be viewed as a socket into which you can plug any other queuing discipline.
Classes normally don't store their packets themselves, but use another queuing discipline to take care of that.
That queuing discipline can be arbitrarily chosen from the set of available queuing disciplines, and it may well have classes, which in turn use queuing disciplines, etc.
9
Filters
Filters classify packets, i.e. they assign incoming packets to a particular class based on some packet properties.
These properties can be the IP address, port number, MPLS label, etc.
Queuing disciplines may use filters to distinguish among different classes of packets and process each class in a specific way, e.g. by giving one class priority over other classes.
Note: multiple filters may map to the same class!
10
Filters (2)
Packets which are selected by the filter go to the high-priority class, while all other packets go to the low-priority class.
Whenever there are packets in the high-priority queue, they are sent before packets in the low-priority queue (e.g. the sch_prio queuing discipline works this way).
To prevent high-priority traffic from starving low-priority traffic, the high-priority class uses the token bucket filter (TBF) queuing discipline, which enforces a rate of at most 1 Mbps.
Finally, the queuing of low-priority packets is done by a FIFO queuing discipline.
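The rate limit enforced by a token bucket can be sketched as follows. This is a simplified Python model, not the kernel's TBF code: the class name, the burst size of 3000 bytes and the explicit "now" timestamps are assumptions made for illustration, and the sketch only reports whether a packet conforms to the rate, whereas the real qdisc would hold a non-conforming packet back until enough tokens have accumulated.

```python
class TokenBucket:
    """Simplified token bucket: tokens (bytes) accumulate at the configured
    rate up to a burst size; a packet conforms if enough tokens are left."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps / 8.0         # refill rate in bytes per second
        self.burst = burst_bytes           # bucket depth in bytes
        self.tokens = float(burst_bytes)   # start with a full bucket
        self.last = 0.0                    # time of the previous update (s)

    def allow(self, size_bytes, now):
        # Refill for the time elapsed since the last call, capped at burst.
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes      # spend tokens for this packet
            return True
        return False                       # not enough tokens yet


# 1 Mbps as on the slide; burst size is an illustrative assumption.
tbf = TokenBucket(rate_bps=1_000_000, burst_bytes=3000)
print([tbf.allow(1500, now=0.0) for _ in range(3)])
print(tbf.allow(1500, now=0.02))
```

Two 1500-byte packets fit into the initial burst, the third has to wait, and after 20 ms enough tokens have been refilled for the next packet.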
11
Examples of qdiscs, currently supported in Linux
Class Based Queueing (CBQ)
Hierarchical Token Bucket (HTB), an advancement on CBQ
Token Bucket Filter (TBF)
First In First Out (FIFO)
Priority (PRIO)
Traffic Equalizer (TEQL)
Stochastic Fair Queuing (SFQ)
Asynchronous Transfer Mode (ATM)
Random Early Detection (RED)
Generalized RED (GRED)
As for schedulers, Linux provides a round-robin scheduler.
WRR (Weighted Round Robin) is implemented by Christian Worm Mortensen.
It can be used by applying patches to the kernel and iproute2 sources.
12
tc
Command line tool, written by Alexey N. Kuznetsov and distributed as part of the iproute2 package; the matching traffic control code has been included in Linux kernels since 2.2.
Provides an interface to the kernel structures which perform the shaping, scheduling, policing and classifying.
The syntax of tc is, however, puzzling.
13
Terminology
Queueing Discipline (qdisc): an algorithm that manages the queue of a device, either incoming (ingress) or outgoing (egress).
Root qdisc: the qdisc attached to the device.
Classless qdisc: a qdisc with no configurable internal subdivisions.
Classful qdisc: a qdisc that contains multiple classes. Some of these classes contain a further qdisc, which may be classful, but need not be. Whether a qdisc counts as classful is decided from the user's perspective, i.e., a qdisc is classless if its classes can't be touched with the tc tool.
Classes: a classful qdisc may have many classes, each of which is internal to the qdisc. A class can have a qdisc or another class as its parent. A leaf class is a class with no child classes. When you create a class, a fifo qdisc is attached to it. For a leaf class, this fifo qdisc can be replaced with another, more suitable qdisc.
14
Terminology (2)
Classifier: each classful qdisc needs to determine to which class it needs to send a packet. This is done using the classifier.
Filter: classification can be performed using filters. A filter contains a number of conditions which, if matched, make the filter match.
Scheduling: a qdisc may, with the help of a classifier, decide that some packets need to go out earlier than others. This process is called scheduling, and is performed for example by the pfifo_fast qdisc.
Shaping: the process of delaying packets before they go out, to make traffic conform to a configured maximum rate. Shaping is performed on egress. Colloquially, dropping packets to slow traffic down is also often called shaping.
Policing: delaying or dropping packets in order to make traffic stay below a configured bandwidth. In Linux, policing can only drop a packet and not delay it: there is no 'ingress queue'!
15
Syntax
Configure a qdisc
tc qdisc [ add | change | replace | link ] dev DEV [ parent qdisc-id | root ] [ handle qdisc-id ] qdisc [ qdisc specific parameters ]
Configure a class
tc class [ add | change | replace ] dev DEV parent qdisc-id [ classid class-id ] qdisc [ qdisc specific parameters ]
Configure a filter
tc filter [ add | change | replace ] dev DEV [ parent qdisc-id | root ] protocol protocol prio priority filtertype [ filtertype specific parameters ] flowid flow-id
Display the current configuration
tc [ -s | -d ] qdisc show [ dev DEV ]
tc [ -s | -d ] class show dev DEV
tc filter show dev DEV
16
Configure FIFO qdisc using tc
FIFO is the default queuing discipline used by Linux.
If you don't specify a qdisc, Linux sets up its interfaces with this type of queue.
Command to set up a FIFO queue on Ethernet interface eth0 using tc:
tc qdisc add dev eth0 root pfifo limit 10
'qdisc' tells tc that we are configuring a queuing discipline (it could be 'class' or 'filter' if we were configuring a class or a filter respectively).
'add' because we are adding a new qdisc.
'dev eth0' attaches the qdisc to the device (interface) eth0.
'root' marks it as a root qdisc (this carries little meaning for pfifo, which is classless so only a root qdisc exists, but it is required to keep the command form uniform).
'pfifo' because our queue is a pfifo (packet-fifo) queue.
Finally, pfifo requires only one parameter: 'limit', which gives the length of the queue (the number of packets that the queue can hold).
Our pfifo queue holds 10 packets.
17
Configure FIFO qdisc using tc (2)
After creating our queuing discipline we can ask tc what we have configured:
tc qdisc show dev eth0
The command responds that we have a pfifo qdisc numbered 8001: (this really means 8001:0) with a limit of 10 packets.
Qdiscs and their components are numbered, or better, identified by a 32-bit handle formed by a 16-bit major number and a 16-bit minor number.
The minor number is always zero for qdiscs.
When we added our pfifo queue we did not assign it a handle, so tc created one for it (8001:0).
To delete the pfifo queue from our device eth0:
tc qdisc del dev eth0 root
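The major:minor numbering can be made concrete with a small Python sketch that packs and unpacks such an identifier. The function names are hypothetical helpers for this example; the only facts relied on are the 16-bit major / 16-bit minor split described above and that tc writes both numbers in hexadecimal.

```python
def parse_handle(text):
    """Convert 'major:minor' (hexadecimal, either part may be empty)
    into the 32-bit value: major in the upper 16 bits, minor in the lower."""
    major, _, minor = text.partition(":")
    major_num = int(major, 16) if major else 0
    minor_num = int(minor, 16) if minor else 0
    return (major_num << 16) | minor_num


def format_handle(handle):
    """Convert a 32-bit value back into 'major:minor' notation."""
    return "%x:%x" % (handle >> 16, handle & 0xFFFF)


# '8001:' is shorthand for '8001:0', i.e. the qdisc itself.
print(hex(parse_handle("8001:")))
print(format_handle(parse_handle("1:3")))
```

Writing 8001: and 8001:0 therefore denotes the same 32-bit value, which is why tc accepts either form for a qdisc.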
18
PRIO queuing discipline
Classic Priority Queuing
In classic PQ, packets are first classified by the system and then placed into different priority queues.
Packets are scheduled from the head of a given queue only if all queues of higher priority are empty.
19
PRIO queuing discipline (2)
What is the PRIO qdisc in Linux?
A qdisc that contains an arbitrary number of classes of different priority.
When a packet is enqueued, a sub-qdisc is chosen based on a filter command that you give with tc.
When you create a new PRIO queue, three pfifo sub-queuing disciplines are created.
In fact, three classes named m:1, m:2 and m:3 are created automatically, where m is the major number of the queuing discipline.
Each of these classes is set up with a pfifo as its own qdisc.
20
PRIO queuing discipline (3)
Whenever a packet needs to be dequeued, class :1 is tried first.
When it is empty class :2 is tried, and finally class :3.
Problem
What if the departure rate is lower than the arrival rate because of congestion on the output link, and at the same time the arrival rate of AF11 flows is higher than the departure rate?
In this case there will always be AF11 packets waiting in the class :1 queue, and classes :2 and :3 will not be served.
This problem can be fixed by replacing the class :1 pfifo qdisc with a qdisc that puts an upper limit on AF11 flows.
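The dequeue order and the resulting starvation can be reproduced with a toy model. The Python below is only a sketch: the class name StrictPriority and the traffic pattern (two AF11 packets and one lower-priority packet arriving per tick, one packet sent per tick) are assumptions chosen to trigger the problem, not measurements.

```python
from collections import deque


class StrictPriority:
    """Strict priority scheduler: band 0 is always tried first,
    then band 1, then band 2 (classes :1, :2 and :3)."""

    def __init__(self, bands=3):
        self.bands = [deque() for _ in range(bands)]

    def enqueue(self, band, packet):
        self.bands[band].append(packet)

    def dequeue(self):
        # Serve the first non-empty band, highest priority first.
        for queue in self.bands:
            if queue:
                return queue.popleft()
        return None


# Per tick: two AF11 packets and one AF21 packet arrive, one packet departs.
prio = StrictPriority()
sent = []
for tick in range(5):
    prio.enqueue(0, ("AF11", tick, 0))
    prio.enqueue(0, ("AF11", tick, 1))
    prio.enqueue(1, ("AF21", tick, 0))
    sent.append(prio.dequeue())

print([name for name, _, _ in sent])
print("AF21 packets still waiting:", len(prio.bands[1]))
```

Because the first band never drains, every transmission slot goes to AF11 and the second band only grows, which is the starvation described above.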
21
Setting up PRIO qdisc using tc
The following parameters are recognized by tc when setting up PRIO:
bands: the number of bands to create. Each band is in fact a class. If it is changed, priomap should also be changed.
priomap: if you do not provide tc filters to classify traffic, the PRIO qdisc looks at the TC_PRIO priority to decide how to enqueue traffic.
The kernel assigns each packet a TC_PRIO priority, based on TOS flags or socket options passed by the application.
The bands are classes, and are called major:1 to major:3 by default, so if your PRIO qdisc is called 12:, use tc to filter traffic to 12:1 (band 0) to grant it more priority.
To repeat: band 0 goes to minor number 1, band 1 to minor number 2, etc.
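The band numbering can be illustrated with a short Python lookup. This is a sketch under assumptions: the function name is hypothetical, and the table used is the default priomap as documented in the tc-prio manual page (1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1), indexed by the packet's TC_PRIO priority; treat it as an assumption if your tc version documents a different default.

```python
# Default priomap as documented in tc-prio(8): index = TC_PRIO priority,
# value = band. Assumed here for illustration.
DEFAULT_PRIOMAP = [1, 2, 2, 2, 1, 2, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1]


def prio_classid(major, priority, priomap=DEFAULT_PRIOMAP):
    """Return the class a packet lands in when no filter is attached:
    the priomap picks the band, and band N lives in class major:(N+1)."""
    band = priomap[priority & 0xF]          # priorities are 0..15
    return "%x:%x" % (major, band + 1)      # tc prints handles in hex


# PRIO qdisc '12:' from the slide; 6 is the interactive priority,
# 0 is best effort, 2 is bulk.
for priority in (6, 0, 2):
    print(priority, "->", prio_classid(0x12, priority))
```

The off-by-one between band numbers and class minor numbers is the point to remember: band 0, the highest priority, is class 12:1.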
22
Setting up PRIO qdisc using tc (2)
tc qdisc add dev eth0 root handle 1:0 prio
Quite similar to the previous example.
However, this time we are numbering our PRIO qdisc ourselves as 1:0.
Because PRIO creates its classes automatically, this command instantly creates classes 1:1, 1:2 and 1:3, and each of them has its pfifo queue already installed.
How do we implement the filters for the PRIO qdisc?
The idea is to assign AF classes AF11, AF21 and AF31 to prio classes 1:1, 1:2 and 1:3.
Hexadecimal values of the TOS byte are used in the filter:
AF11 = 00101000 = 0x28
AF21 = 01001000 = 0x48
AF31 = 01101000 = 0x68
Now we can add our filter.
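The hexadecimal values above follow from how Assured Forwarding code points are laid out in the TOS byte. The Python below is a small worked calculation; the function names are made up for this example, and the layout used (three class bits, two drop-precedence bits, a zero bit, then two low-order bits not used by DSCP) is the standard AF encoding.

```python
def af_dscp(af_class, drop_precedence):
    """6-bit DSCP of AFxy: class in the top 3 bits, drop precedence
    in the next 2 bits, lowest bit zero."""
    return (af_class << 3) | (drop_precedence << 1)


def tos_byte(dscp):
    """The DSCP occupies the upper 6 bits of the 8-bit TOS field."""
    return dscp << 2


for name, (cls, drop) in {"AF11": (1, 1), "AF21": (2, 1), "AF31": (3, 1)}.items():
    value = tos_byte(af_dscp(cls, drop))
    print(name, format(value, "08b"), hex(value))
```

The filter has to match on the whole TOS byte, which is why the 6-bit code points appear shifted left by two bits.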
23
Setting up PRIO qdisc using tc (3)
tc filter add dev eth0 parent 1:0 prio 1 protocol ip u32 match ip tos 0x28 0xff flowid 1:1
'tc filter add dev eth0' asks tc to add a filter on device eth0.
'parent 1:0' means "the parent of the filter is the object identified by the number 1:0, which happens to be our PRIO qdisc".
'prio 1' means "try this filter with priority 1" (there can be other filters with prio 2, 3, ...).
'protocol ip' means "we are working with the IP protocol".
'u32' selects the u32 filter (the u32 match is provided by the kernel).
'match ip' means "what follows has to be matched against the IP header of the packet".
'tos 0x28 0xff' means "match exactly TOS byte 0x28": the second term is a bit mask that is ANDed with the packet's TOS byte, and the result is compared with the first term. With mask 0xff every bit is compared, so only 0x28 matches. Using a different mask, 0xf0 for example, restricts the comparison to the selected bits.
'flowid 1:1' means "flows matching this filter are sent directly to the class identified by the number 1:1"; this is our PRIO 1:1 class.
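The value/mask pair of a u32 match is a bitwise comparison, not an arithmetic product: only the bits selected by the mask have to agree. A minimal Python sketch of that rule follows; the function name is hypothetical and the code models only the comparison, not the rest of the u32 classifier.

```python
def u32_match(field, value, mask):
    """True when 'field' equals 'value' in every bit position selected
    by 'mask'; bits outside the mask are ignored."""
    return ((field ^ value) & mask) == 0


# 'tos 0x28 0xff': all eight bits must match.
print(u32_match(0x28, 0x28, 0xFF), u32_match(0x29, 0x28, 0xFF))
# Mask 0xfc ignores the two low-order bits of the TOS byte.
print(u32_match(0x29, 0x28, 0xFC))
# Mask 0xf0 compares only the upper four bits.
print(u32_match(0x2C, 0x28, 0xF0), u32_match(0x48, 0x28, 0xF0))
```

A narrower mask therefore widens the set of TOS values that the same filter accepts.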
24
Setting up PRIO qdisc using tc (4)
Similarly, we add two more filters: one matching tos 0x48 (AF21) with flowid 1:2, and one matching tos 0x68 (AF31) with flowid 1:3.
Benefits of PQ
For software-based routers, PQ places a lower computational load on the system when compared with more elaborate queuing disciplines.
PQ allows routers to organize buffered packets, and then service one class of traffic differently from other classes of traffic (real-time vs. non-real-time).
Disadvantages
If the amount of high-priority traffic is not policed at the edges of the network, lower-priority traffic may experience excessive delay as it waits for unbounded higher-priority traffic to be serviced.
If you attempt to use PQ to place TCP flows into a higher-priority queue than UDP flows, TCP window management and flow control mechanisms will attempt to consume all of the available bandwidth on the output port, thus starving your lower-priority UDP flows.
25
MPLS configuration example using tc
At E2:
tc qdisc add dev eth0 root handle 1: prio
tc filter add dev eth0 parent 1:0 protocol 0x8847 prio 1 tcindex mask 0xf shift 0 pass_on
tc filter add dev eth0 parent 1:0 protocol 0x8847 prio 1 handle 1 tcindex
tc filter add dev eth0 parent 1:0 protocol 0x8847 prio 2 handle 2 tcindex
tc filter add dev eth0 parent 1:0 protocol 0x8847 prio 3 handle 3 tcindex
The Ethernet layer generates a frame with the protocol field set to the code assigned to MPLS unicast packets (0x8847).
With 'mask 0xf shift 0', the tcindex classifier computes its lookup key as (skb->tc_index & 0xf) >> 0.
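The key computation quoted on the slide, (skb->tc_index & 0xf) >> 0, can be tried out with a one-line Python function. The function name is made up for this sketch, and the sample tc_index values are arbitrary illustrations rather than values taken from the MPLS setup.

```python
def tcindex_key(tc_index, mask=0xF, shift=0):
    """Lookup key of the tcindex classifier: mask the packet's tc_index,
    then shift the result right."""
    return (tc_index & mask) >> shift


# With 'mask 0xf shift 0' only the low four bits of tc_index are used.
print(tcindex_key(0x01), tcindex_key(0x12), tcindex_key(0x23))
# A different mask/shift pair would select the upper four bits instead.
print(tcindex_key(0x12, mask=0xF0, shift=4))
```

The resulting key is what gets compared with the 'handle 1', 'handle 2' and 'handle 3' values of the filter entries.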
26
TCNG
The goal of the tcng project is to make the existing traffic control more user-friendly, and to make the interfaces of the configuration system more flexible, by layering a language on top of the powerful tc command line tool.
This is done by adding another layer of abstraction.
27
Components of tcng
The traffic control compiler "tcc"
Takes configuration scripts in the new "tcng" language, translates them to a common internal representation, and then generates tc commands.
Layering tcng on top of tc also allows the use of tcng without requiring any change to the kernel or to the tc utility.
The "tcng" language
Its syntax is quite similar to C and Perl.
It supports variables, arithmetic expressions, etc.
Classifiers can be expressed in a C-like syntax.
cpp can be used to include files and write macros.
28
Compare tc and tcng scripts for the same configuration
29
Tcng configuration syntax
The configuration begins with the interface name and the role, i.e., ingress or egress.
Then, the entire classification is expressed in rules of the form
action if expression
where action can be the selection of a class, or just drop to drop a packet.
The Boolean expression uses the same syntax and the same precedence rules as C.
All the common fields in IP, UDP, TCP, etc. headers are predefined, e.g., tcp_dport is the TCP destination port and ip_src is the IPv4 source address.
Users can easily add their own definitions.
Classes are selected with class (<$variable>), where the variable is later defined in the queuing and scheduling section.
30
tcng installation
There are two ways to install tcng:
From the source tarball
From an installer package (.deb, .rpm)
Installation from source code is quite difficult and not always successful.
The build cannot automatically find installed packages such as iproute2.
It is especially troublesome when using a generic 2.6.* kernel (e.g., MPLS-Linux).
The easier way is to install from the .deb installer.
Download the latest tcng_*_i386.deb package.
To install on Debian/Ubuntu run:
dpkg -i tcng_*_i386.deb
31
tcng installation (2)
To install on Fedora we need to convert the .deb package to an .rpm package using the alien program.
If alien is not installed, download its tarball and install it as follows (this differs from the standard install method for tarballs):
tar xzf alien_*.tar.gz
cd alien/
perl Makefile.PL
make PREFIX=/usr
su -c 'make PREFIX=/usr install'
After you have alien installed on your RH/Fedora system, execute:
alien -r tcng_*_i386.deb (this creates the rpm file)
rpm -Uvh tcng_*_i386.rpm
Now we have tcng installed on our system.
32
Slides to be included soon
More on writing tcng scripts
How to add and use the user's own definitions in tcng, for example mpls_exp or mpls_label
Installation procedures of the WRR (weighted round-robin) scheduler in 2.6.* kernels
Kernel patching and iproute2+tc package patching
Fixing possible error messages
Demonstration of an example scenario with WRR, etc.
33
References
[1] Differentiated Service on Linux HOWTO
[2] MPLS-Linux labs
[3] tcng homepage
[4] Traffic Control - Next Generation, Reference Manual
34
Q&A
Thank you for your kind attention!