FFPF: Fairly Fast Packet Filters
Herbert Bos, Willem de Bruijn, Trung Nguyen, Mihai Cristea, Georgios Portokalidis
Vrije Universiteit Amsterdam, Universiteit Leiden
[Figure labels: uspace, kspace, nspace]
Network Monitoring
● Increasingly important
  – traffic characterisation, security, traffic engineering, SLAs, billing, etc.
● Existing solutions:
  – designed for slow networks or traffic engineering/QoS
  – not very flexible
● We’re hurting because of
  – hardware (bus, memory)
  – software (copies, context switches)
  - process at lowest possible level
  - minimise copying
  - minimise context switching
  - freedom at the bottom
demand for solution:
  - scales to high link rates
  - scales in no. of apps
  - flexible
[Figure: spread of SAPPHIRE in 30 minutes]
Generalised notion of flow
Flow: “a stream of packets that match arbitrary user criteria”
Flowgraph
[Figure: flowgraph; node labels: eth0, IP, TCP, UDP, TCP SYN, HTTP, RTSP, RTP, UID 0, U, “contains worm”, UDP with CodeRed, bytecount]
Efficient
● reduced copying and context switches
● sharing data
● flowgraphs: sharing computations
“push filtering tasks as far down the processing hierarchy as possible”
[Figure: processing hierarchy; labels: userspace, kernel, network card]
Reduce copying
● FFPF avoids both ‘horizontal’ and ‘vertical’ copies
  - no ‘vertical’ copies
  - no ‘horizontal’ copies within flow group
  - more than ‘just filtering’ in kernel (e.g., statistics)
[Figure: Application A and Application B in userspace (U), ‘filter’ in kernel (K)]
Extensible
✔ modular framework
✔ language agnostic
✔ plug-in filters
Example flowgraph expressions:
  (device,eth0) | (device,eth1) -> (sampler,2) -> (FPL-2,”..”) | (BPF,”..”) -> (bytecount)
  (device,eth0) -> (sampler,2) -> (BPF,”..”) -> (packetcount)
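To make the second flowgraph expression concrete, the sketch below chains a source, a sampler, a filter and a counter as plain Python generators. It is an illustration only, not FFPF's implementation: the helper names (sampler, packet_filter, chain), the reading of (sampler,2) as "pass every second packet", and the stand-in filter predicate are all assumptions made for this example.

```python
from typing import Callable, Iterable, Iterator

Packet = bytes
Stage = Callable[[Iterable[Packet]], Iterator[Packet]]


def sampler(n: int) -> Stage:
    """Pass every n-th packet (assumed meaning of (sampler,n))."""
    def stage(packets: Iterable[Packet]) -> Iterator[Packet]:
        for i, pkt in enumerate(packets, start=1):
            if i % n == 0:
                yield pkt
    return stage


def packet_filter(predicate: Callable[[Packet], bool]) -> Stage:
    """Stand-in for a plug-in filter such as (BPF,"..")."""
    def stage(packets: Iterable[Packet]) -> Iterator[Packet]:
        return (pkt for pkt in packets if predicate(pkt))
    return stage


def chain(source: Iterable[Packet], *stages: Stage) -> Iterator[Packet]:
    """Connect stages left to right, like the '->' operator."""
    stream: Iterator[Packet] = iter(source)
    for stage in stages:
        stream = stage(stream)
    return stream


def packetcount(packets: Iterable[Packet]) -> int:
    return sum(1 for _ in packets)


def bytecount(packets: Iterable[Packet]) -> int:
    return sum(len(pkt) for pkt in packets)


# Hypothetical stand-in for (device,eth0): ten packets of 1..10 bytes.
eth0 = [bytes(length) for length in range(1, 11)]


def at_least_six_bytes(pkt: Packet) -> bool:
    # Hypothetical predicate; a real filter would inspect headers.
    return len(pkt) >= 6


n_packets = packetcount(chain(eth0, sampler(2), packet_filter(at_least_six_bytes)))
n_bytes = bytecount(chain(eth0, sampler(2), packet_filter(at_least_six_bytes)))
```

Because every stage consumes and produces a packet stream, a new filter type can be plugged in without touching the others, which is the property the slide calls "modular".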
Buffers
● PacketBuf
  – circular buffer with N fixed-size slots
  – large enough to hold packet
● IndexBuf
  – circular buffer with N slots
  – contains classification result + pointer
[Figure: circular buffer of slots (O) with write pointer W and read pointer R]
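The two buffers can be sketched as follows. This is a minimal illustration under stated assumptions, not FFPF's data layout: the class and method names are hypothetical, the "pointer" is modelled as a slot number in PacketBuf, and the case where the writer overruns a reader is deliberately left out.

```python
class PacketBuf:
    """Circular buffer with N fixed-size slots holding packet data."""

    def __init__(self, n_slots: int, slot_size: int) -> None:
        self.slot_size = slot_size
        self.slots = [b""] * n_slots
        self.w = 0  # number of packets written so far

    def put(self, pkt: bytes) -> int:
        """Store a packet and return the slot number it landed in."""
        if len(pkt) > self.slot_size:
            raise ValueError("packet larger than slot")
        slot = self.w % len(self.slots)
        self.slots[slot] = pkt
        self.w += 1
        return slot


class IndexBuf:
    """Circular buffer of (classification result, pointer) entries."""

    def __init__(self, n_slots: int) -> None:
        self.entries = [None] * n_slots
        self.w = 0  # write position (W in the figure)
        self.r = 0  # read position (R in the figure)

    def put(self, classification: int, slot: int) -> None:
        self.entries[self.w % len(self.entries)] = (classification, slot)
        self.w += 1

    def get(self):
        """Return the next unread entry, or None if the reader caught up."""
        if self.r == self.w:
            return None
        entry = self.entries[self.r % len(self.entries)]
        self.r += 1
        return entry


# Usage: store three packets, index only the two-byte ones.
packet_buf = PacketBuf(n_slots=4, slot_size=64)
index_buf = IndexBuf(n_slots=4)
for pkt in [b"aa", b"bbb", b"cc"]:
    slot = packet_buf.put(pkt)
    if len(pkt) == 2:  # hypothetical classification rule
        index_buf.put(len(pkt), slot)
```

Keeping only small index entries per flow means the packet data itself is stored once, and a reader follows the pointer when it wants the payload.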
Buffer management
What to do if writer catches up with slowest reader?
● slow reader preference
  – drop new packets (traditional way of dealing with this)
  – overall speed determined by slowest reader
● fast reader preference
  – overwrite existing packets
  – application responsible for keeping up
    ● can check that packets have been overwritten
    ● different drop rates for different apps
[Figure: circular buffer of slots (O) with writer W and reader R1]
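The two policies can be compared with a small model of one ring shared by several readers. This is a sketch under assumptions, not FFPF code: the SharedRing class, its absolute write and read counters, and the way a reader learns how many packets it lost are all invented for illustration.

```python
class SharedRing:
    """One circular buffer, one writer, several readers."""

    def __init__(self, n_slots, readers, fast_reader_preference):
        self.n = n_slots
        self.slots = [None] * n_slots
        self.w = 0                               # packets written so far
        self.r = {name: 0 for name in readers}   # packets consumed per reader
        self.fast = fast_reader_preference
        self.dropped = 0                         # drops under slow reader preference

    def write(self, pkt):
        """Return True if the packet was stored."""
        slowest = min(self.r.values())
        if self.w - slowest >= self.n and not self.fast:
            # Slow reader preference: the ring is full, drop the new packet.
            self.dropped += 1
            return False
        # Fast reader preference may overwrite a slot a slow reader still needs.
        self.slots[self.w % self.n] = pkt
        self.w += 1
        return True

    def read(self, name):
        """Return (packet or None, number of packets this reader lost)."""
        r = self.r[name]
        oldest = self.w - self.n  # oldest position still present in the ring
        lost = 0
        if r < oldest:
            # The reader was overtaken: it can detect how much was overwritten.
            lost = oldest - r
            r = oldest
        if r == self.w:
            self.r[name] = r
            return None, lost
        self.r[name] = r + 1
        return self.slots[r % self.n], lost


# Slow reader preference: nobody reads, so packets 4 and 5 are dropped.
slow_ring = SharedRing(4, ["a", "b"], fast_reader_preference=False)
slow_accepted = [slow_ring.write(i) for i in range(6)]

# Fast reader preference: "a" keeps up, "b" does not read at all.
fast_ring = SharedRing(4, ["a", "b"], fast_reader_preference=True)
a_seen = []
for i in range(6):
    fast_ring.write(i)
    a_seen.append(fast_ring.read("a"))
b_first = fast_ring.read("b")
```

In the fast reader preference run, reader "a" loses nothing while reader "b" finds that its two oldest packets were overwritten, which is the "different drop rates for different apps" behaviour.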
Languages
● FFPF is language neutral
● Currently we support:
  – BPF
  – C
  – OKE Cyclone
  – FPL
FPL:
  - simple to use
  - compiles to optimised native code
  - resource limited (e.g., restricted FOR loop)
  - access to persistent storage (scratch memory)
  - calls to external functions (e.g., fast C functions or hardware assists)
  - compiler for uspace, kspace, and nspace (ixp1200)
Example:
  IF (PKT.IP_PROTO == PROTO_TCP) THEN
      // reg.0 = hash over flow fields
      R[0] = Hash (14,12,1024)
      // increment pkt counter at this
      // location in MBuf
      MEM[ R[0] ]++
  FI
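For readers unfamiliar with FPL, the example can be paraphrased in Python roughly as below. Several things here are assumptions rather than FPL semantics: Hash(14,12,1024) is read as "hash 12 bytes starting at byte offset 14 into 1024 buckets", zlib.crc32 is only a stand-in for whatever hash FPL uses, and PKT.IP_PROTO is located by assuming an Ethernet frame carrying IPv4.

```python
import zlib

PROTO_TCP = 6                    # IANA protocol number for TCP
ETH_HLEN = 14                    # assumed: Ethernet header precedes the IP header
IP_PROTO_OFFSET = ETH_HLEN + 9   # protocol field within the IPv4 header


def flow_hash(pkt: bytes, offset: int, length: int, buckets: int) -> int:
    """Stand-in for FPL's Hash(offset, length, buckets); CRC32 is an assumption."""
    return zlib.crc32(pkt[offset:offset + length]) % buckets


def count_tcp(pkt: bytes, mem: list):
    """Increment the per-flow counter for TCP packets; return the bucket used."""
    if len(pkt) > IP_PROTO_OFFSET and pkt[IP_PROTO_OFFSET] == PROTO_TCP:
        r0 = flow_hash(pkt, 14, 12, len(mem))  # R[0] = Hash (14,12,1024)
        mem[r0] += 1                           # MEM[ R[0] ]++
        return r0
    return None


# Usage with synthetic frames; mem plays the role of the persistent MBuf.
mem = [0] * 1024
tcp_frame = bytearray(64)
tcp_frame[IP_PROTO_OFFSET] = PROTO_TCP
udp_frame = bytearray(64)
udp_frame[IP_PROTO_OFFSET] = 17  # UDP

bucket = count_tcp(bytes(tcp_frame), mem)
count_tcp(bytes(tcp_frame), mem)
ignored = count_tcp(bytes(udp_frame), mem)
```

The counter array outlives individual packets, which is what the slide means by access to persistent storage.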
Packet sources
● currently three kinds implemented
  - netfilter
  - netif_rx()
  - IXP1200
● implementation on IXPs: NIC-FIX
  - bottom of the processing hierarchy
  - eliminates memory and bus bottlenecks
[Figure labels: uspace, kspace, nspace]
Network Processors
“programmable NIC”
  - zero copy
  - copy once
  - on-demand copy
Performance results
pkt loss: FFPF: < 0.5%, LSF: 2-3%
Performance results
pkt loss: LSF: 64-75%, FFPF: 10-15%
Performance
Summary
● concept of ‘flow’: generalised
● copying and context switching: minimised
● processing in kernel/NIC: complex programs + ‘pipes’
● FPL (FFPF Packet Languages): fast + flexible
● persistent storage: flow-specific state
● authorisation + third-party code: any user
● flow groups: applications sharing packet buffers
More Information
microbenchmarks