NetFlow and OARnet Mark Fullmer
Agenda OARnet - who we are, what we do. Flow-tools summary - software we use to collect and process NetFlow data. NetFlow deployment at OARnet. How we use NetFlow. Detailed examples with flow-tools. Hopefully some tips that can be used or passed on to customers.
OARnet Internet access to about 100 of Ohio’s colleges and universities. Some commercial clients / co-lo services. Internet2 / Abilene access to member schools in Ohio. Sink over 1Gb/s of traffic from the Internet during peak hours.
OARnet Multiple transit providers, access to two of the NAPs, OC48 to Abilene. In the middle of a statewide network rebuild, moving from ATM circuits over leased lines to POS and GigE over DWDM using Cisco 15454s.
Flow-tools Started back in June of ‘96 working from Darren Kerr’s fdg.c (a very simple example of displaying NetFlow v1 data). No commercial or open source NetFlow collectors available at the time. Released as an open source toolkit for working with NetFlow data from Cisco routers. At the time mostly used internally at Ohio State University, usually for network forensics.
Flow-tools Still available as open source, with active development on and off as resources (time) are available. Will keep working on it as long as people are interested. Many other options are available now for collectors and post-processing, both open source and commercial. A lot of sites use it; over 700 subscribers to the support mailing list.
NetFlow deployment Collector running FreeBSD and flow-capture at each POP (distributed collection). Sampled NetFlow data feeds (where software permits). Flow-capture stores the exports to disk with line compression at 5 minute intervals. Some reports run on the collectors, some back at a central server (combination of distributed and centralized report generation).
NetFlow deployment Flow collector. 80GB RAID storage. Core router Aggregation routers. OC48’s to other POPs.
NetFlow At Work
Traffic Engineering We have multiple transit providers with various contractual obligations. With provider T we must purchase a minimum of 300Mb/s / month. Provider Q recently reduced their prices to half of provider T. So we want to minimize our transit costs by using provider T for no more than 300Mb/s during peak hours until the contract runs out (or they reduce their prices).
Traffic Engineering Clearly above 300Mb/s on outbound traffic, a little over on inbound.
Traffic Engineering 170 prefixes are announced to the Internet. Prefix length does not necessarily correlate to traffic load, i.e. some smaller schools have /16s. So to influence provider T to send us less traffic to a prefix, pad the AS path.
Traffic Engineering Use NetFlow to determine which prefixes to pad. Select traffic that has an input interface of the provider circuit. Add local masks. Scale the packets and octets by 100 (sampling rate is 1/100). Summarize on destination IP & mask (zero out the host bits).
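The selection steps above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how flow-report is implemented: the record layout `(dst_ip, packets, octets)`, the function name, and the documentation prefixes in the usage note are all assumptions. Flow counts are left unscaled here because the slide only mentions scaling packets and octets.

```python
import ipaddress
from collections import defaultdict

SAMPLING_SCALE = 100  # sampling rate is 1/100, so multiply counts by 100


def summarize_by_destination_prefix(flows, announced, scale=SAMPLING_SCALE):
    """Sum sampled flows per announced prefix, scaling packets and octets.

    flows: iterable of (dst_ip, packets, octets) tuples (hypothetical layout).
    announced: CIDR strings standing in for the local mask list.
    Returns (prefix, flows, octets, packets) rows sorted by octets, descending.
    """
    # Longest prefixes first so the most specific announcement wins.
    nets = sorted((ipaddress.ip_network(p) for p in announced),
                  key=lambda n: n.prefixlen, reverse=True)
    totals = defaultdict(lambda: [0, 0, 0])  # flows, octets, packets
    for dst_ip, packets, octets in flows:
        addr = ipaddress.ip_address(dst_ip)
        net = next((n for n in nets if addr in n), None)
        if net is None:
            continue  # destination is not in an announced prefix
        row = totals[str(net)]
        row[0] += 1
        row[1] += octets * scale
        row[2] += packets * scale
    return sorted(((p, f, o, k) for p, (f, o, k) in totals.items()),
                  key=lambda r: r[2], reverse=True)
```

Called with flows already filtered to the provider's input interface, the first rows returned are the candidate prefixes for AS path padding.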
Traffic Engineering Collected flows (flow-capture). Select a day's data (flow-cat). Filter on provider interface (flow-nfilter). Fix masks (flow-mask). Summarize by destination IP & mask (flow-report). Format tabular data (flow-rptfmt).
Traffic Engineering Flow-nfilter configuration:
# ifIndex 46 is the POS interface to the Internet
filter-primitive CLMBO-R4-INTERNET
  type ifindex
  permit 46
# Match on traffic to the Internet by POS interface.
filter-definition CLMBO-R4-INTERNET-OUT
  match output-interface CLMBO-R4-INTERNET
# Match on traffic from the Internet by POS interface.
filter-definition CLMBO-R4-INTERNET-IN
  match input-interface CLMBO-R4-INTERNET
Traffic Engineering Flow-mask configuration:
mask-definition AS600-ADV-ALL
  prefix /16 16
  prefix /16 16
  prefix /16 16
  prefix /24 24
  prefix /24 24
  prefix /16 16
  prefix /16 16
  prefix /23 23
  prefix /16 16
  prefix /
Traffic Engineering Flow-report configuration:
stat-report CLMBO-R4-R8-FROM-INTERNET-BY-DESTINATION-PREFIX
  type ip-destination-address
  filter CLMBO-R4-INTERNET-IN
  ip-destination-address-format prefix-mask
  scale 100
  output
    options +header,+xheader
    sort +octets
    fields -duration,+bps,+pps
  output
    sort +octets
    fields -duration
    path |flow-rptfmt -Fip-destination-address,flows,octets,packets >
Traffic Engineering Run the report:
#!/bin/sh
FLOW_DATA=../CLMBO-R4/ /*
flow-cat $FLOW_DATA | flow-report -s report.cfg -Sdaily-summaries \
  -vP=daily-summaries/
Traffic Engineering Flow-report output:
# Report Information
# build-version: flow-tools 0.68p
# name: CLMBO-R4-FROM-INTERNET-BY-DESTINATION-PREFIX
# type: ip-destination-address
# scale: 100
# options: +header,+xheader
# ip-dst-addr-type: prefix-mask
# sort_field: +octets
# fields: +key,+flows,+octets,+packets,+other
# filter: CLMBO-R4-INTERNET-IN
# records: 139
# first-flow: Mon Nov 8 00:01:
# last-flow: Tue Nov 9 00:01:
# now: Thu Nov 11 19:57:
#
# mode: streaming
# compress: off
# byte order: little
# stream version: 3
# export version: 5
#
Traffic Engineering Flow-report output: XXX
# recn: ip-destination-address*,flows,octets,packets
/16, , ,
/18, , ,
/16,678257, ,
/16,786497, ,
/16,523565, ,
/16,651195, ,
/16,389654, ,
/16,379746, ,
/16,328851, ,
/16,416366, ,
/16,511276, ,
/16,746850, ,
/16,308359, ,
/16,441714, ,
/16,268486, ,
/16,380376, ,
…
Traffic Engineering Flow-rptfmt output:
# ['./flow-rptfmt', '-Fip-destination-address,flows,octets,packets']
ip-destination-address flows octets packets
…
Traffic Engineering Flow-rptfmt output (top 10 percent totals):
# ['./flow-rptfmt', '-m10', '-p', '-Fip-destination-address,flows,octets,packets']
ip-destination-address flows octets packets
…
Traffic Engineering A day-long summary usually works for situations like this, but we may want to look at a time-series report over a one-day period (or longer). Gnuplot, rrdtool, or gdchart (among others) can be used for this. Flow-rpt2rrd will convert the flow-report output to rrd format. For automated reports rrdtool works well since it bounds the storage requirements.
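For a time-series view it helps to turn each interval's octet total into a rate that can be compared against the 300Mb/s commitment. A minimal sketch, assuming 5 minute (300 second) intervals and octet counts that have already been scaled for sampling; the function names and the `(label, octets)` bucket layout are hypothetical.

```python
def octets_to_mbps(octets, interval_seconds=300):
    """Average rate in Mb/s for an octet count collected over one interval."""
    return octets * 8 / interval_seconds / 1_000_000


def intervals_over_commit(buckets, commit_mbps=300.0):
    """Return (label, Mb/s) for every interval whose average exceeds the commit.

    buckets: list of (label, octets) pairs, one per interval.
    """
    return [(label, octets_to_mbps(octets)) for label, octets in buckets
            if octets_to_mbps(octets) > commit_mbps]
```

The result is an average over the interval, so short bursts inside a 5 minute bucket are smoothed out.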
Traffic Engineering
#!/bin/sh
FLOW_DATA=../CLMBO-R4/ /*
for name in $FLOW_DATA; do
  echo working...$name
  flow-report -s report.cfg -S5min-summaries \
    -vP=5min-summaries < $name
done
for name in 5min-summaries/*; do
  echo working...$name
  ./flow-rpt2rrd -K as600.keys -p rrds < $name
done
Traffic Engineering During peak hours moving traffic for /16 elsewhere will drop inbound load by between 20 and 30Mb/s.
Traffic Engineering Minor changes to the report to get other results: Outbound traffic by source prefix (busy sources in OARnet). Outbound traffic by destination prefix or AS (where are our sinks). Inbound traffic by source prefix (where do our sinks draw traffic from). Source / Destination pairs.
Traffic Engineering
stat-report CLMBO-R4-TO-INTERNET-BY-SOURCE-PREFIX
  type ip-source-address
  filter CLMBO-R4-INTERNET-OUT
  ip-source-address-format prefix-mask
  scale 100
  output
    options +header,+xheader
    sort +octets
    fields -duration
Traffic Engineering
stat-report CLMBO-R4-TO-INTERNET-BY-DESTINATION-AS
  type destination-as
  filter CLMBO-R4-INTERNET-OUT
  scale 100
  output
    options +header,+xheader
    sort +octets
    fields -duration
Traffic Engineering # ['./flow-rptfmt', '-p', '-n', '-Fdestination-as,flows,octets,packets'] destination-as flows octets packets SCRR ATT-INTERNET DNEO-OSP CCINET ROGERS-AS CHARTER-NET-HKY-NC RR-CINCINNATI-ASN SCRR VIDEOTRON-LTEE SCRR
Traffic Engineering During peak hours moving traffic destined to AS elsewhere will drop outbound load by between 20 and 30Mb/s.
RPF checks Provider Q wants to turn on strict RPF checks on our peerings. For a variety of reasons this won’t work for us (load distribution for one), but we do agree to implement a traffic filter. There are places OARnet has not enforced RPF checks, mostly for historical reasons.
RPF checks So we want to understand the impact of the traffic filter before applying it, to avoid dropping valid customer traffic. Expect problems to come mostly from multi-homed clients who may not be advertising all their address space to us. …Plus unexpected surprises.
RPF checks Collected flows (flow-capture). Select a day's data (flow-cat). Filter on provider interface (flow-nfilter). Filter on announced address space. Summarize results with flow-report. Format tabular data (flow-rptfmt).
RPF checks Flow-nfilter configuration:
#; OSU
filter-primitive AS600-ADV-ALL
  type ip-address-prefix
  default permit
  deny /16
  deny /16
  deny /16
  deny /24
  deny /24
  deny /16
  deny /16
  deny /23
  deny /16
  deny /16
  ...
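The logic of the filter (default permit, deny every announced prefix) leaves only flows whose source address falls outside the announced address space, which are the flows a strict check would drop. A rough Python equivalent, assuming flow records shaped as `(src_ip, octets)` and using documentation prefixes in place of the real announcements; the function name is hypothetical.

```python
import ipaddress


def rpf_failures(flows, announced):
    """Return flows whose source address is outside every announced prefix.

    flows: iterable of (src_ip, octets) tuples (hypothetical layout).
    announced: CIDR strings for the address space we announce.
    """
    nets = [ipaddress.ip_network(p) for p in announced]
    failed = []
    for src_ip, octets in flows:
        addr = ipaddress.ip_address(src_ip)
        if not any(addr in net for net in nets):
            failed.append((src_ip, octets))
    return failed
```

Run against a day of flows leaving on the provider interface, a non-empty result lists the sources to investigate before the filter goes live.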
RPF checks Flow-report configuration:
stat-report CLMBO-R4-RPF-FAIL
  type ip-source-address
  filter CLMBO-R4-RPF-FAIL
  ip-source-address-format prefix-mask
  scale 100
  output
    options +header,+xheader
    sort +octets
    fields -duration
    path |./flow-rptfmt
stat-definition CLMBO-R4-RPF-FAIL
  report CLMBO-R4-RPF-FAIL
RPF checks Results
% FLOW_DATA=../clmbo-r4/ /*
%
% flow-cat $FLOW_DATA | flow-report -s report.cfg -SCLMBO-R4-RPF-FAIL
ip-source-address flows octets packets
…
Access Lists Same procedure can be used to test a firewall or access list before applying it to live traffic. Determine what the impact of a change will be before impacting customers.
Traffic Classification Abilene / Internet2 is a high speed research network that OARnet provides connectivity to for Ohio schools. Only some schools participate. Schools that participate can sponsor smaller schools or other institutions, for example a medical college.
Traffic Classification The participants all share the cost for an OC48 to Abilene. They want to know who’s using it, both by school and by group. Schools may have multiple networks. Not all schools have AS numbers. School traffic is a combination of prefixes. Group traffic is a combination of schools.
Traffic Classification Flow-tag adds a new 32 bit source tag and destination tag to the flows. Tags can be set based on criteria in the flow, such as source or destination prefix. Reports can then be run on the tagged traffic.
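As an illustration of tagging by prefix, the sketch below picks a destination tag for an address from a small rule table. It is not flow-tag's implementation: the rule table, the tag values, the use of longest-prefix matching, and returning 0 for untagged traffic are all assumptions made for the example.

```python
import ipaddress

# Hypothetical rules: (prefix, tag). Documentation prefixes, made-up tags.
RULES = [("192.0.2.0/24", 0x1), ("192.0.2.128/25", 0x2)]


def destination_tag(dst_ip, rules=RULES):
    """Return the tag of the most specific matching prefix, or 0 if none match."""
    addr = ipaddress.ip_address(dst_ip)
    best = None  # (prefix length, tag) of the best match so far
    for prefix, tag in rules:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, tag)
    return best[1] if best else 0
```

Once every flow carries a tag, a report can key on the tag instead of on individual prefixes.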
Traffic Classification Collected flows (flow-capture). Select a month's data (flow-cat). Add tags based on interface and prefix (flow-tag). Filter on provider interface (flow-nfilter). Summarize results with flow-report. Graph tabular data (gnuplot).
Traffic Classification
#
# tag format
#
# (32 bits)
# RRRRRRRRRRRRRR TTTT NNNNNNNNNNNNNNNNNNN
# |              |    | Site name
# |              | Site type
# | Reserved
#
# ID     Name
#
# 0x0001 OSU
# 0x0002 CWRU
# 0x0003 BGSU
# 0x0004 UC
# 0x0005 UAKRON
# 0x0006 WRIGHT
# 0x0007 KENT
# …
#
# ID   Type
#
# 0x01 Participant
# 0x02 SEGP
# 0x03 Sponsored-Participant
# 0x04 Gigapop
# 0x05 MULTICAST
Traffic Classification
tag-action OHIO-GIGAPOP_DST
  type dst-prefix
  # OSU
  match /16 set-dst 0x
  match /16 set-dst 0x
  match /16 set-dst 0x
  match /24 set-dst 0x
  match /30 set-dst 0x
  match /30 set-dst 0x
  # CHMCC
  match /24 set-dst 0x
  match /24 set-dst 0x
  match /24 set-dst 0x
  match /24 set-dst 0x
  match /24 set-dst 0x
  match /24 set-dst 0x
  match /30 set-dst 0x030014
Traffic Classification
stat-report TS-CLMBQ-R2-FROM-ABILENE-BY-TAG-GROUP
  type destination-tag
  tag-mask 0x00FF0000 0x00FF0000
  filter CLMBQ-R2-ABILENE-IN
  scale 100
  output
    options +header,+xheader
    sort +octets
    fields -duration
stat-definition clmbq-r2-daily-summaries
  tag OHIO-GIGAPOP
  report TS-CLMBQ-R2-FROM-ABILENE-BY-TAG-GROUP
  report TS-CLMBQ-R2-FROM-ABILENE-BY-TAG-CUSTOMER
  report TS-CLMBQ-R2-TO-ABILENE-BY-TAG-GROUP
  report TS-CLMBQ-R2-TO-ABILENE-BY-TAG-CUSTOMER
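The effect of the tag-mask can be shown with a few bit operations. The sketch assumes the layout implied by the mask 0x00FF0000 and the example tag 0x030014: site type in bits 16 to 23, site ID in the low 16 bits. That layout is an inference, and the record shape `(tag, octets)` and function names are hypothetical.

```python
from collections import defaultdict

TYPE_MASK = 0x00FF0000  # same value as the tag-mask in the report
SITE_MASK = 0x0000FFFF  # assumed: low 16 bits carry the site ID

# Site type IDs as listed in the tag format comments.
TYPE_NAMES = {
    0x01: "Participant",
    0x02: "SEGP",
    0x03: "Sponsored-Participant",
    0x04: "Gigapop",
    0x05: "MULTICAST",
}


def split_tag(tag):
    """Split a 32 bit tag into (site type, site ID)."""
    return (tag & TYPE_MASK) >> 16, tag & SITE_MASK


def octets_by_type(records):
    """Sum octets per site type for (tag, octets) records."""
    totals = defaultdict(int)
    for tag, octets in records:
        site_type, _site = split_tag(tag)
        totals[TYPE_NAMES.get(site_type, "unknown")] += octets
    return dict(totals)
```

Masking with `SITE_MASK` instead would give the per-customer view that the BY-TAG-CUSTOMER reports suggest.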
Traffic Classification Groups of IP addresses - campus department traffic summaries. Individual IP addresses - find busy hosts on a campus. Ports and protocols - find all the SMTP clients or hosts using file sharing services.
Traffic Classification We have the ability to provision dedicated λ’s (wavelengths) between many of the larger schools. These cost money… Some clients want to know if they should be purchasing dedicated bandwidth between each other.
Enterprise/Campus proactive Security A lot of low hanging fruit. Hosts with high packet/octet rates to/from Internet. Hosts that connect to a large number of external sites. Hosts that use many ports. Usually not useful in our environment -- problems easy to find, hard to fix.
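One of the checks above, hosts that connect to a large number of external sites, reduces to counting distinct destinations per source. A minimal sketch, assuming flow records reduced to `(src_ip, dst_ip)` pairs; the function name and the threshold are hypothetical and would need tuning per campus.

```python
from collections import defaultdict


def high_fanout_hosts(flows, threshold):
    """Return (src_ip, distinct destination count) for hosts at or above threshold.

    flows: iterable of (src_ip, dst_ip) pairs (hypothetical layout).
    Results are sorted by destination count, largest first.
    """
    peers = defaultdict(set)
    for src_ip, dst_ip in flows:
        peers[src_ip].add(dst_ip)
    hits = [(src, len(dsts)) for src, dsts in peers.items()
            if len(dsts) >= threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

The same pattern with destination ports in place of destination addresses covers the "hosts that use many ports" case.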
Enterprise/Campus security Usually use NetFlow to diagnose problems on the fly or historical events. Customer calls and wants to know why their T1 is full. Usually network scans or DoS events. A few weeks of historical data is very useful. If there is one compromised host there are usually others…
Enterprise/Campus security Hosts with known security issue(s) Campus Network Internet NetFlow collector & Archive. NetFlow v5 exports
Enterprise/Campus security Compromised hosts. Campus Network Internet NetFlow collector & Archive. NetFlow v5 exports Scan & compromise
Enterprise/Campus security Compromised hosts. 1/5 used to attack. Campus Network Internet NetFlow collector & Archive. NetFlow v5 exports Remote trigger attack
Enterprise/Campus security For some reason alerted to attack - proactive measures, local network slowdowns, victim network complaints, etc. Find and disable compromised host. Use NetFlow archive to find suspicious traffic from attacker. Look back in history for other traffic from that IP. Find other compromised hosts before they are used in future attacks. At least disable the attacker IP to campus.
Enterprise/Campus security Good success rate at retroactively identifying worms and viruses. They are not too smart and usually have the same packet signature: many 1-packet flows from an infected host, or constant packet size, constant source or destination port. Can at least identify infected machines that are causing disruption to a campus. One would guess that as more tools like NetFlow are deployed, miscreants will make efforts to hide signatures.
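The single-packet signature can be checked with a simple per-source ratio. This sketch assumes unsampled flow records reduced to `(src_ip, packets)` pairs, since sampling skews packets-per-flow counts; the function name and both thresholds are hypothetical starting points, not recommended values.

```python
from collections import Counter


def single_packet_suspects(flows, min_flows=100, min_ratio=0.9):
    """Return sources where most flows carry exactly one packet.

    flows: iterable of (src_ip, packets) pairs (hypothetical layout).
    A source is reported when it has at least min_flows flows and the
    share of 1-packet flows is at least min_ratio.
    """
    total = Counter()
    single = Counter()
    for src_ip, packets in flows:
        total[src_ip] += 1
        if packets == 1:
            single[src_ip] += 1
    return sorted(src for src, count in total.items()
                  if count >= min_flows and single[src] / count >= min_ratio)
```

Hosts flagged this way are candidates for a closer look at packet size and port patterns rather than confirmed infections.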
Flow-tools A lot more is possible. Flow-nfilter can filter on any field in NetFlow v1 - v8 flows. Flow-report has about 80 built-in reports. Usually reports and graphs are generated automatically. An old PC with FreeBSD/Linux is enough to get started.
Flow-tools Contributed software such as Dave Plonka’s FlowScan and Perl module Other information at
Thanks…