Netflow Collection & Processing. David Ripley, Lead Network Security Developer, Advanced Network Management Laboratory, Indiana University.


Netflow Collection & Processing David Ripley

2 Lead Network Security Developer, Advanced Network Management Laboratory, Indiana University. Network security infrastructure development and research for the ANML. Background in physics, image processing, satellite remote sensing, and system administration. David A. J. Ripley MSc., ARCS

3 Overview: What is a “flow”? What is Netflow specifically? Netflow collection infrastructure. Netflow processing, problems and issues.

4 Netflow Recap Q. What is a flow? A. In a general sense, a flow is a series of packets with some attribute(s) in common.

5 Netflow Recap: Common attributes define a flow: source and/or destination of the traffic; protocol (TCP, UDP, ICMP?); timing (start, end, and duration of the traffic); routing information (interfaces, AS, etc.).

6 Netflow Recap: Flows can be unidirectional or bidirectional; the latter can carry additional information. There are also aggregated flows, and application flows, which classify packets by inspecting their contents. We’re not going to worry too much about these cases.

7 Netflow Recap: As far as we’re concerned, a flow is a series of packets with the same IP protocol (UDP, TCP, ICMP), source and destination ports, and source and destination addresses.
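To make that definition concrete, here is a minimal sketch that groups packets into flows by those five fields. The packet dictionaries, the `FlowKey` and `group_into_flows` names, and the example addresses are all assumptions for illustration; a router does this in its flow cache, not in Python.

```python
from collections import defaultdict
from typing import NamedTuple


class FlowKey(NamedTuple):
    """The five fields that define a flow in the definition above."""
    protocol: int   # IP protocol number: 1 = ICMP, 6 = TCP, 17 = UDP
    src_addr: str
    src_port: int
    dst_addr: str
    dst_port: int


def group_into_flows(packets):
    """Sum packet and byte counts per five-tuple.

    `packets` is an iterable of dicts; the dict layout is hypothetical.
    """
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for p in packets:
        key = FlowKey(p["protocol"], p["src_addr"], p["src_port"],
                      p["dst_addr"], p["dst_port"])
        flows[key]["packets"] += 1
        flows[key]["bytes"] += p["length"]
    return dict(flows)


# Host A (192.0.2.1) fetches a web page from Host B (198.51.100.2).
packets = [
    {"protocol": 6, "src_addr": "192.0.2.1", "src_port": 12345,
     "dst_addr": "198.51.100.2", "dst_port": 80, "length": 60},
    {"protocol": 6, "src_addr": "198.51.100.2", "src_port": 80,
     "dst_addr": "192.0.2.1", "dst_port": 12345, "length": 1500},
    {"protocol": 6, "src_addr": "198.51.100.2", "src_port": 80,
     "dst_addr": "192.0.2.1", "dst_port": 12345, "length": 1500},
]
flows = group_into_flows(packets)
for key, counters in flows.items():
    print(key, counters)
```

Because the key includes direction, the request and the reply land in two separate flows.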

8 Netflow Recap: The recording of a flow is subject to the idiosyncrasies of sampling frequency and sampling window. Bucket timeout: systems typically consider one-minute windows. A flow longer than one minute will appear as two (or more) flow records; multiple flows with the same characteristics within a single one-minute window will appear as a single flow record. Sampling rate: the router will only consider one out of every N packets; the choice of N (N=???) trades data loss against expensive operations.
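A toy model of those two effects, for one five-tuple. This is a simplification that assumes fixed one-minute windows aligned to the clock and deterministic 1-in-N sampling; real routers use their own timeout and sampling logic, and `records_for_flow` is a made-up name.

```python
from collections import defaultdict

WINDOW_SECONDS = 60  # the one-minute bucket described above


def records_for_flow(packet_times, packet_sizes, sample_every=1):
    """Return one (window_start, packets, bytes) record per one-minute window.

    Only every `sample_every`-th packet is counted, which models
    deterministic 1-in-N sampling (an assumption for this sketch).
    """
    buckets = defaultdict(lambda: [0, 0])
    for i, (t, size) in enumerate(zip(packet_times, packet_sizes)):
        if i % sample_every != 0:
            continue  # this packet was not sampled
        window_start = int(t // WINDOW_SECONDS) * WINDOW_SECONDS
        buckets[window_start][0] += 1
        buckets[window_start][1] += size
    return [(start, pkts, nbytes)
            for start, (pkts, nbytes) in sorted(buckets.items())]


# One conversation lasting 90 seconds, five packets of 100 bytes each.
times = [0, 10, 50, 70, 90]
sizes = [100] * 5
print(records_for_flow(times, sizes))                  # split across two windows
print(records_for_flow(times, sizes, sample_every=2))  # sampled: fewer packets seen
```

With sampling, the usual estimate of the true volume is the recorded count multiplied by N, which is only an estimate.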

9 An example: Host A gets a web page from Host B. This will (usually) show up as two flows: Host A, port 12345 -> Host B, port 80; and Host B, port 80 -> Host A, port 12345.

10 Why Netflow? What kinds of information can we gather? What percentage of traffic on the network is web traffic? ssh? IRC? What is the average transfer rate for network communications? Who uses the network the most? Have usage patterns changed over time? For the Chicago region, how much of the region’s traffic stays in the region? And many others.
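The first of those questions can be sketched in a few lines once flow records are in hand. The record layout and the `traffic_share` helper are hypothetical, and matching on a well-known port is only a rough proxy for the application.

```python
def traffic_share(flows, port):
    """Fraction of total bytes carried by flows that use `port` on either end."""
    total = sum(f["bytes"] for f in flows)
    if total == 0:
        return 0.0
    matched = sum(f["bytes"] for f in flows
                  if port in (f["src_port"], f["dst_port"]))
    return matched / total


# Hypothetical flow records: web (80), ssh (22), IRC (6667).
flows = [
    {"src_port": 12345, "dst_port": 80, "bytes": 600},
    {"src_port": 40000, "dst_port": 22, "bytes": 300},
    {"src_port": 6667, "dst_port": 50000, "bytes": 100},
]
for name, port in (("web", 80), ("ssh", 22), ("IRC", 6667)):
    print(f"{name}: {traffic_share(flows, port):.0%}")
```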

11 Why Netflow? Historically: traffic accounting and acceptable use enforcement. Researchers and engineers needed to answer all kinds of questions about network traffic; traffic accounting in the form of flow records provided that information.

12 Why Netflow? Traffic engineering/accounting: how traffic is shared with competitors; how customers are billed. Security/policy monitoring: DoS/DDoS detection. Research: measuring the growth of networks; identifying how the network is being used.

13 What data is there? It depends. We keep talking about “flows”; we really mean Cisco’s Version 5 flow records: a Cisco-defined “standard”, used on Abilene, so that’s what we use.

14 Netflow Version 5: A Cisco-defined de facto standard. Efforts are underway in the IETF to make this standard official. Flows are exported as UDP packets. Each packet contains a number of flow records plus a header with information common to these records. Delivery is not guaranteed! There are sequence numbers so we know how many packets we’ve lost.
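A rough sketch of how a collector might use those sequence numbers. As I understand the v5 format, the header’s sequence counter is a running total of flow records rather than packets, so a gap tells you how many records went missing. The `count_lost_flows` name is made up, and the sketch assumes a single exporter whose datagrams arrive in order.

```python
def count_lost_flows(headers):
    """Count flow records missing from a stream of export datagrams.

    `headers` is an iterable of (flow_sequence, count) pairs in arrival
    order, where `count` is the number of records in that datagram.
    Assumes one exporter and no reordering; the counter is 32 bits wide.
    """
    lost = 0
    expected = None
    for flow_sequence, count in headers:
        if expected is not None and flow_sequence != expected:
            lost += (flow_sequence - expected) % 2**32
        expected = (flow_sequence + count) % 2**32
    return lost


# The datagram that should have started at sequence 60 never arrived.
print(count_lost_flows([(0, 30), (30, 30), (90, 30), (120, 10)]))
```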

15 Netflow V5 Header (each row is 4 bytes):
Byte 1 | Byte 2 | Byte 3 | Byte 4
Version (bytes 1-2) | Count (bytes 3-4)
SysUpTime
UNIX Seconds (seconds since Epoch)
UNIX Nanoseconds (residual nanoseconds)
Flow Sequence Number
Engine Type | Engine ID | Reserved (bytes 3-4)
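The header layout above maps directly onto a fixed 24-byte structure in network byte order. A minimal parsing sketch follows; the field names and the `parse_v5_header` function are my own choices, not part of any library.

```python
import struct
from typing import NamedTuple

# Network byte order: two 16-bit fields, four 32-bit fields,
# two 8-bit fields, one 16-bit field. 24 bytes in total.
V5_HEADER = struct.Struct("!HHIIIIBBH")


class V5Header(NamedTuple):
    version: int
    count: int            # number of flow records in this datagram
    sys_uptime_ms: int    # milliseconds since the exporter booted
    unix_secs: int
    unix_nsecs: int
    flow_sequence: int
    engine_type: int
    engine_id: int
    reserved: int


def parse_v5_header(datagram: bytes) -> V5Header:
    """Unpack the first 24 bytes of an export datagram."""
    header = V5Header(*V5_HEADER.unpack_from(datagram, 0))
    if header.version != 5:
        raise ValueError(f"not a version 5 datagram: version={header.version}")
    return header


# Build a fake header to show the round trip.
raw = V5_HEADER.pack(5, 30, 60_000, 1_000_000_000, 0, 1234, 0, 1, 0)
print(parse_v5_header(raw))
```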

16 Netflow V5 Record (each row is 4 bytes):
Byte 1 | Byte 2 | Byte 3 | Byte 4
Source IP Address
Destination IP Address
Next Hop IP Address
Input ifIndex (bytes 1-2) | Output ifIndex (bytes 3-4)
Packets
Bytes
Start time of flow
End time of flow
Source port (bytes 1-2) | Destination port (bytes 3-4)
Padding | TCP Flags | IP Protocol | TOS
Source AS (bytes 1-2) | Destination AS (bytes 3-4)
Source Mask Length | Destination Mask Length | Padding (bytes 3-4)
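Each record in this layout is 48 bytes, so it can be unpacked the same way as any fixed binary structure. The sketch below is one way to do it; the field names and `parse_v5_record` are illustrative, and in a real datagram the records follow the header, so the offset would start at 24.

```python
import ipaddress
import struct

# 48 bytes per record, network byte order, in the order listed above.
V5_RECORD = struct.Struct("!IIIHHIIIIHHBBBBHHBBH")

FIELDS = (
    "src_addr", "dst_addr", "next_hop", "input_if", "output_if",
    "packets", "bytes", "first_ms", "last_ms", "src_port", "dst_port",
    "pad1", "tcp_flags", "protocol", "tos", "src_as", "dst_as",
    "src_mask", "dst_mask", "pad2",
)


def parse_v5_record(buf: bytes, offset: int = 0) -> dict:
    """Unpack one flow record starting at `offset` into a dict."""
    record = dict(zip(FIELDS, V5_RECORD.unpack_from(buf, offset)))
    for name in ("src_addr", "dst_addr", "next_hop"):
        record[name] = str(ipaddress.IPv4Address(record[name]))
    return record


# Build a fake record: a TCP flow from 192.0.2.1:12345 to 198.51.100.2:80.
raw = V5_RECORD.pack(
    int(ipaddress.IPv4Address("192.0.2.1")),
    int(ipaddress.IPv4Address("198.51.100.2")),
    int(ipaddress.IPv4Address("203.0.113.1")),
    1, 2, 10, 5000, 30_000, 45_000, 12345, 80,
    0, 0x18, 6, 0, 64500, 64501, 24, 24, 0,
)
print(parse_v5_record(raw))
```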

17 Convenience, or lack of it: Flow records are exported in a format that is convenient for the router, not for us. For example, the flow start and end times are in a form that is not immediately useful: milliseconds since system boot. We have to combine data from individual flow records with header data. Seconds since epoch is the Right Thing: Flow Start Time = Unix Seconds + Unix Nanoseconds - sysUpTime + flow_start (after we’ve converted all of these to the same units). Also, the ICMP type is stored in the destination port field.
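The conversion in that formula can be written out as below. The function names are made up, the sketch ignores wraparound of the 32-bit millisecond uptime counter, and the ICMP decoding assumes the commonly documented convention that the destination port field holds type * 256 + code.

```python
def flow_time_to_epoch(unix_secs, unix_nsecs, sys_uptime_ms, flow_ms):
    """Convert a flow timestamp (ms since boot) to seconds since the epoch.

    The header gives the export time both as wall-clock time
    (unix_secs, unix_nsecs) and as uptime (sys_uptime_ms), so the
    boot time is the difference between the two.
    """
    export_time = unix_secs + unix_nsecs / 1e9
    boot_time = export_time - sys_uptime_ms / 1000.0
    return boot_time + flow_ms / 1000.0


def icmp_type_code(dst_port):
    """Split the destination port field of an ICMP flow into (type, code).

    Assumes the field holds type * 256 + code.
    """
    return dst_port >> 8, dst_port & 0xFF


# Exported at epoch second 1,000,000 with 60 s of uptime; the flow
# started 30 s after boot, i.e. 30 s before the export.
print(flow_time_to_epoch(1_000_000, 0, 60_000, 30_000))
print(icmp_type_code(2048))  # type 8, code 0 is an echo request
```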

18 Examining Netflow: Part of our job is using netflow data to see what has happened, or is happening, on the network. We spend a significant amount of time processing the archived data looking for particular behaviors, typically in response to institutional requests.

19 Netflow Collection: We collect flow data from Abilene core routers, archive the raw records (for up to 3 months), and redirect them to other lab machines. This is the primary data source for research and for responses to operational issues.

Problems with Pre-processing: We can do all kinds of pre-processing ahead of time, but you rarely know ahead of time what kind of behaviour you’re going to be looking for. You can’t cover all the bases, and you waste time generating products that you’ll never use. But there are some simple things that are very useful.

21 MS-RPC (Attempts)

22 MS-RPC Infections (Maybe)

Traffic Graphing: Something as simple as graphing traffic volume can be a pain in the neck: how much traffic went to/from a given range of addresses, IP ports, etc.? This is often done using counters on routers, but there are serious performance issues with this, and the number of counters is limited. It’s relatively easy if you know what you’re looking for. But we need perspective; we have to be able to turn back the clock, and counters on routers just don’t work for this.

Traffic Graphing: Even with services running on known ports, there are too many in use to record all of them using routers. “Bad” traffic has a habit of turning up on odd ports; it’s kind of obliged to.

Traffic Graphing: 2^16 source ports, 2^16 destination ports: a lot. We can get this information from the netflow archive, but it’s a lot of detailed data to plough through, and it takes a long time. Instead, we can aggregate the data as it comes in. There are even more hosts/networks than ports. It’s hard to estimate the number of hosts; somewhere around 9 or 10 million on Abilene.

Traffic Graphing: Simple aggregation of flow records in 15-minute intervals (convenient given archive granularity). Break data into ICMP/TCP/UDP. Aggregate by source port, destination port, source address, destination address, and AS number.
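The aggregation described here might look roughly like this. It is a sketch only: the record layout, the `aggregate` function, and the choice to key each interval on one field at a time are assumptions, not a description of the actual scripts.

```python
from collections import defaultdict

INTERVAL_SECONDS = 15 * 60
PROTOCOLS = {1: "ICMP", 6: "TCP", 17: "UDP"}


def aggregate(flows, key_field):
    """Total flows, packets, and bytes per (interval, protocol, key value).

    `key_field` is the record field to aggregate by, for example
    "dst_port", "src_addr", or "src_as". Other protocols are skipped.
    """
    totals = defaultdict(lambda: {"flows": 0, "packets": 0, "bytes": 0})
    for f in flows:
        protocol = PROTOCOLS.get(f["protocol"])
        if protocol is None:
            continue
        interval = int(f["start"]) // INTERVAL_SECONDS * INTERVAL_SECONDS
        cell = totals[(interval, protocol, f[key_field])]
        cell["flows"] += 1
        cell["packets"] += f["packets"]
        cell["bytes"] += f["bytes"]
    return dict(totals)


# Hypothetical records; `start` is in seconds since the epoch.
flows = [
    {"start": 10, "protocol": 6, "dst_port": 80, "packets": 5, "bytes": 4000},
    {"start": 700, "protocol": 6, "dst_port": 80, "packets": 3, "bytes": 2000},
    {"start": 950, "protocol": 6, "dst_port": 80, "packets": 1, "bytes": 60},
    {"start": 20, "protocol": 17, "dst_port": 53, "packets": 1, "bytes": 80},
]
for key, cell in sorted(aggregate(flows, "dst_port").items()):
    print(key, cell)
```

Running one pass per key field keeps each table small, at the cost of losing the link between, say, a port and the addresses that used it.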

Traffic Graphing: How do we go about this? Some cron jobs and some Perl scripts aggregate new flow records and put them into the database every half hour. There’s a web front end so we can take a look at the graphs.
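As a stand-in for that pipeline, here is a small sketch using Python and SQLite instead of Perl and whatever database the lab actually runs. The table name, schema, and function names are all invented for illustration; a cron job would call something like `store_totals` on each new batch, and the web front end would run queries like `port_history`.

```python
import sqlite3


def store_totals(conn, totals):
    """Add per-interval byte totals to the database.

    `totals` maps (interval_start, protocol, dst_port) to a byte count.
    Existing rows are incremented, so re-running on new data is additive.
    """
    conn.execute(
        """CREATE TABLE IF NOT EXISTS port_traffic (
               interval_start INTEGER,
               protocol TEXT,
               dst_port INTEGER,
               bytes INTEGER,
               PRIMARY KEY (interval_start, protocol, dst_port))"""
    )
    for (interval_start, protocol, port), nbytes in totals.items():
        conn.execute(
            "INSERT OR IGNORE INTO port_traffic VALUES (?, ?, ?, 0)",
            (interval_start, protocol, port),
        )
        conn.execute(
            "UPDATE port_traffic SET bytes = bytes + ? "
            "WHERE interval_start = ? AND protocol = ? AND dst_port = ?",
            (nbytes, interval_start, protocol, port),
        )
    conn.commit()


def port_history(conn, protocol, port):
    """Return (interval_start, bytes) rows for one port, oldest first."""
    return conn.execute(
        "SELECT interval_start, bytes FROM port_traffic "
        "WHERE protocol = ? AND dst_port = ? ORDER BY interval_start",
        (protocol, port),
    ).fetchall()


conn = sqlite3.connect(":memory:")
store_totals(conn, {(0, "TCP", 80): 6000, (900, "TCP", 80): 60})
store_totals(conn, {(900, "TCP", 80): 40, (900, "UDP", 53): 80})
print(port_history(conn, "TCP", 80))
```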

Traffic Graphing

This is not exactly rocket science, and yet not many people do this kind of thing. We get requests all the time: “Can I see the traffic on ports X, Y and Z for the last couple of weeks?”

Traffic Graphing: Upside: We can generate a historical view of traffic to or from any source or destination port; any Autonomous System; or any IP address or prefix. Downside: Aggregation means loss of data; if you plot traffic to a given port, you lose IP info, and vice versa. It still takes a while (but only a few minutes).

Traffic Graphing

Vague Questions: Why is this important? Perspective matters. History teaches us, even if it’s just the history of network traffic over the past couple of weeks. Why isn’t it more common? Why doesn’t everyone do it? Because they don’t think it’s especially important. It’s rather broad, isn’t it? Macro and micro.