Flow Overview

This article covers flow and the configuration of routers to collect and export it, in the following sections: About Flow, Flow Protocols, Flow Sampling, and Ingress and Egress.

Note: For information about configuring devices to send flow to Kentik, see:
- Router Configuration for routers;
- Host Configuration for hosts.

 

About Flow

Flow records are metadata about each IP conversation (a collection of related packets) that traverses a device such as a router, switch, or host. If a given device is configured for it, flow records are collected in a cache and exported to a specified destination (e.g. Kentik) at a specified interval. For example, if IP 1.1.1.1 is sending packets to 2.2.2.2 through a flow-enabled device, information about that conversation can be collected in a flow record that includes the following basic flow fields:

| Field | Description |
|---|---|
| source IP | Source IP address |
| dest IP | Destination IP address |
| protocol | IP protocol type (for example, TCP = 6; UDP = 17) |
| L4 source port | TCP/UDP source port number or equivalent |
| L4 dest port | TCP/UDP destination port number or equivalent |
| TCP flags (logical OR) | Cumulative OR of TCP flags |
| input interface ID | SNMP index of input interface |
| output interface ID | SNMP index of output interface |
| byte count | Total number of Layer 3 bytes in the packets of the flow |
| packet count | Packets in the flow |
| ToS / DSCP value | IP type of service (ToS) |
| next-hop IP | IP address of next hop router |
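
The way per-packet information rolls up into a single flow record can be sketched in a few lines of Python. This is only an illustration of the fields listed above, not how any particular router implements its flow cache: the `Packet` and `FlowRecord` classes and the `build_flow_cache` function are hypothetical names, and the interface, ToS, and next-hop fields are left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    """Hypothetical per-packet view used only for this illustration."""
    src_ip: str
    dst_ip: str
    protocol: int   # IP protocol number (TCP = 6, UDP = 17)
    src_port: int
    dst_port: int
    tcp_flags: int  # TCP flag bits carried by this packet
    length: int     # Layer 3 bytes in this packet


@dataclass
class FlowRecord:
    """Subset of the basic flow fields (assumed layout, not a wire format)."""
    src_ip: str
    dst_ip: str
    protocol: int
    src_port: int
    dst_port: int
    tcp_flags: int = 0
    byte_count: int = 0
    packet_count: int = 0


def build_flow_cache(packets):
    """Group packets by 5-tuple and accumulate counters per flow."""
    cache = {}
    for p in packets:
        key = (p.src_ip, p.dst_ip, p.protocol, p.src_port, p.dst_port)
        record = cache.get(key)
        if record is None:
            record = FlowRecord(*key)
            cache[key] = record
        record.tcp_flags |= p.tcp_flags   # cumulative OR of TCP flags
        record.byte_count += p.length     # total Layer 3 bytes
        record.packet_count += 1          # packets in the flow
    return cache


packets = [
    Packet("1.1.1.1", "2.2.2.2", 6, 51000, 443, 0x02, 60),   # SYN
    Packet("1.1.1.1", "2.2.2.2", 6, 51000, 443, 0x10, 52),   # ACK
    Packet("1.1.1.1", "2.2.2.2", 6, 51000, 443, 0x18, 500),  # PSH + ACK
]
record = build_flow_cache(packets)[("1.1.1.1", "2.2.2.2", 6, 51000, 443)]
print(record)
```

All three packets share the same source, destination, protocol, and ports, so they collapse into one record whose flags field is the OR of the flags seen on each packet.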

Note: For a list of devices that can send flow to Kentik, see Supported Device Types.

 

Flow Protocols

Four primary protocols exist for flow data, and Kentik accepts all four. The protocol used to send flow data to Kentik from a given device should be chosen based on what that device supports and handles most efficiently. Where multiple protocols are supported, Kentik recommends them in the following order:

  • sFlow: designed by the sFlow.org consortium as a statistical monitoring tool for networks; configurable through SNMP.
  • IPFIX (Internet Protocol Flow Information Export): created by the Internet Engineering Task Force (IETF) as a universal standard for exported flow data.
  • NetFlow version 9: a Cisco flow protocol that is compatible with and nearly identical to IPFIX.
  • NetFlow version 5 (known as “JFlow” on Juniper devices): an earlier Cisco flow protocol.
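
As a rough sketch, the recommended order above can be applied to the set of protocols a device supports as follows. The `PREFERENCE` list and `pick_protocol` function are hypothetical names used only for illustration, and the sketch ignores the separate question of which protocol a given device handles most efficiently.

```python
# Recommended order from the list above, most preferred first.
PREFERENCE = ["sFlow", "IPFIX", "NetFlow v9", "NetFlow v5"]


def pick_protocol(supported):
    """Return the most preferred protocol the device supports, or None."""
    for protocol in PREFERENCE:
        if protocol in supported:
            return protocol
    return None


# A device that supports only IPFIX and NetFlow v5 would use IPFIX.
print(pick_protocol({"NetFlow v5", "IPFIX"}))
```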

The following table shows some of the features that are supported in the various protocols:

| Features | NetFlow v5 | NetFlow v9 | IPFIX | sFlow |
|---|---|---|---|---|
| Basic flow fields (see About Flow) | Yes | Yes | Yes | Yes |
| Embedded sampling rate | Yes | No | Yes | Yes |
| IPv6 support | No | Yes | Yes | Yes |
| MAC address fields | No | Available on some platforms using custom template | Available on some platforms using custom template | Yes |
| VLAN ID fields | No | Available on some platforms using custom template | Available on some platforms using custom template | Yes |
| Includes payload sample | No | No | No | Yes |

Note: Some routers and switches will not report layer-2 traffic; they only report flows as they traverse a layer-3/route decision. For additional information, consult your device vendor.
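
As one illustration of what "embedded sampling rate" means in the feature table, the sketch below reads the sampling field from a NetFlow v5 export header. It assumes the header layout as commonly documented by Cisco: 24 bytes, ending in a 16-bit field whose top 2 bits carry the sampling mode and whose low 14 bits carry the sampling interval. The function name `parse_v5_sampling` is hypothetical, and exporters may populate this field differently, so treat this as a sketch rather than a reference parser.

```python
import struct

# Assumed NetFlow v5 header layout (network byte order, 24 bytes):
# version, count, sys_uptime, unix_secs, unix_nsecs, flow_sequence,
# engine_type, engine_id, sampling_interval
V5_HEADER = struct.Struct("!HHIIIIBBH")


def parse_v5_sampling(datagram: bytes):
    """Return (sampling_mode, sampling_interval) from a v5 export header."""
    fields = V5_HEADER.unpack_from(datagram)
    version, sampling = fields[0], fields[8]
    if version != 5:
        raise ValueError(f"not a NetFlow v5 datagram (version {version})")
    mode = sampling >> 14          # top 2 bits
    interval = sampling & 0x3FFF   # low 14 bits
    return mode, interval


# Build a synthetic header with mode 1 and an interval of 1000.
header = V5_HEADER.pack(5, 1, 0, 0, 0, 0, 0, 0, (1 << 14) | 1000)
print(parse_v5_sampling(header))
```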

 

Flow Sampling

Flow sampling means exporting a flow record for only one in every X flows. When X is 1, flow is unsampled (a flow record is generated for every flow); when X is 10,000, a flow record is generated for one out of every 10,000 flows. There is therefore an inverse relationship between the sampling rate (the ratio of total flows to sampled flows) and the resolution: a lower sampling rate (e.g. 100) gives higher resolution than a higher rate (e.g. 10,000).

While it's tempting to assume that accuracy requires setting each device to the lowest possible rate (e.g. 1), testing by Kentik and others has established that even when the sampling rate is high it's possible to measure small “needle-in-a-haystack” traffic flows with accuracy that is adequate for all common use cases (see our blog post Accuracy in Low-Volume Flow Sampling).
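
Because the sampling rate is the ratio of total flows to sampled flows, an estimate of the total can be recovered by multiplying what was sampled by the rate. The following is a minimal sketch of that arithmetic; the function name `estimate_total` is hypothetical, and the sketch says nothing about how any collector actually performs this scaling.

```python
def estimate_total(sampled_count, sampling_rate):
    """Scale a sampled count up by the sampling rate (1 in every X flows)."""
    if sampling_rate < 1:
        raise ValueError("sampling rate must be 1 or greater")
    return sampled_count * sampling_rate


# 250 flow records exported at a rate of 1,000 suggest about 250,000 flows.
print(estimate_total(250, 1000))

# At a rate of 1 (unsampled), the estimate equals the observed count.
print(estimate_total(250, 1))
```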

Why Flow Sampling

The advantage of flow sampling is that it makes far more efficient use of the resources devoted to collecting, transporting, ingesting, and storing flow records, so that much more network infrastructure can be covered for a given resource expenditure. Specifically, flow sampling is recommended (regardless of the size of your operation) for the following reasons:

  • Sampling reduces the device cycles required for processing and collection of flow.
  • Sampling reduces the network utilization (bandwidth) required for sending flow data (although the required bandwidth is typically not significant to begin with).
  • Sampling makes it easier to see flow data in real time, because when flow is unsampled some routers hold the flow data for minutes or longer before sending it to the collector.

Sampling is especially important during unexpected high traffic volumes. During an attack or high PPS event, for example, a router that may otherwise be able to handle unsampled flow can be overwhelmed by the sheer volume of data and may stop processing flow. That can cause a lack of visibility at precisely the moment when it is most critical to be able to collect and analyze traffic data.

Optimal Sampling Rates

For a given network device, the ideal sample rate is low enough to capture critical information but high enough to efficiently handle peaks. Optimal rates vary by the device role, the desired resolution of the flow record dataset (which is dependent on use case), and the total active throughput of the device. The following table provides recommended sample rate ranges for individual devices, which are based on Kentik’s analysis of hundreds of devices in live production accounts:

Sampling rate (flows per sample), by max device throughput

| Device role | Resolution | 1 Gbps | 10 Gbps | 100 Gbps | 1 Tbps |
|---|---|---|---|---|---|
| Edge/Internet-facing | Standard | N.A. | 3,000 - 7,000 | 8,000 - 10,000 | 11,000 - 15,000 |
| Edge/Internet-facing | Enhanced | N.A. | 2,000 - 4,000 | 5,000 - 7,000 | 8,000 - 10,000 |
| Data Center and Core | Standard | 400 - 800 | 1,000 - 1,500 | 10,000 - 20,000 | 25,000 - 50,000 |
| Data Center and Core | Enhanced | 200 - 400 | 500 - 800 | 5,000 - 14,000 | 15,000 - 30,000 |
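
For anyone scripting device configuration, the table above can be encoded as a simple lookup. The sketch below copies the ranges from the table as given; the dictionary keys, the `"edge"` and `"core"` role labels, and the `recommended_range` function are hypothetical names chosen for illustration.

```python
# (role, resolution) -> {max throughput: (low, high) flows per sample}
# None marks the "N.A." cells in the table.
RECOMMENDED = {
    ("edge", "standard"): {"1 Gbps": None, "10 Gbps": (3000, 7000),
                           "100 Gbps": (8000, 10000), "1 Tbps": (11000, 15000)},
    ("edge", "enhanced"): {"1 Gbps": None, "10 Gbps": (2000, 4000),
                           "100 Gbps": (5000, 7000), "1 Tbps": (8000, 10000)},
    ("core", "standard"): {"1 Gbps": (400, 800), "10 Gbps": (1000, 1500),
                           "100 Gbps": (10000, 20000), "1 Tbps": (25000, 50000)},
    ("core", "enhanced"): {"1 Gbps": (200, 400), "10 Gbps": (500, 800),
                           "100 Gbps": (5000, 14000), "1 Tbps": (15000, 30000)},
}


def recommended_range(role, resolution, throughput):
    """Return the (low, high) sampling-rate range, or None where N.A."""
    return RECOMMENDED[(role, resolution)][throughput]


print(recommended_range("core", "enhanced", "10 Gbps"))
print(recommended_range("edge", "standard", "1 Gbps"))
```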

Notes:
- Device vendors use a variety of algorithms (random, consistent, etc.) for flow sampling; Kentik works with all such algorithms in current use.
- Sample rate considerations specific to hosts are covered in Sample Rate for Hosts.
- Please consult with Kentik (see Customer Care) for answers to questions or for help calculating the proper sampling rate for your unique network environment.

 

Ingress and Egress

Depending on how a given device is configured, flow may be created by examining traffic at either of the following points:

  • Ingress — as traffic comes into an interface;
  • Egress — as traffic exits an interface.

It is recommended that you enable flow on all interfaces and configure all devices for ingress flow creation only. NetFlow was originally designed for this scenario; it has since expanded to allow egress flow creation as well, to handle special cases such as compression and VPN services.

Enabling ingress flow creation on all interfaces will give you a full picture of all traffic traversing the router. The Kentik system will, for example, allow you to examine traffic that has left an interface by grouping all flows that were destined to that interface as they traversed the ingress from other interfaces.
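
The grouping described above can be sketched as follows: traffic leaving an interface is reconstructed by summing the ingress flows whose output interface ID points at it. The dictionary keys (`input_if`, `output_if`, `bytes`) and the function name are hypothetical and only illustrate the idea; they do not reflect how Kentik stores or queries flow.

```python
from collections import defaultdict


def egress_bytes_by_interface(ingress_flows):
    """Sum bytes of ingress-created flows, grouped by output interface."""
    totals = defaultdict(int)
    for flow in ingress_flows:
        totals[flow["output_if"]] += flow["bytes"]
    return dict(totals)


# Flows recorded at ingress on interfaces 1 and 2.
ingress_flows = [
    {"input_if": 1, "output_if": 3, "bytes": 1000},
    {"input_if": 2, "output_if": 3, "bytes": 500},
    {"input_if": 1, "output_if": 2, "bytes": 200},
]
print(egress_bytes_by_interface(ingress_flows))
```

Interface 3 never created a flow record for this traffic itself, yet its outbound volume (1,500 bytes) is recovered from the records created at ingress on interfaces 1 and 2.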

Enabling both ingress and egress flow creation may result in the same traffic being counted twice. Kentik allows you to verify that flows are not being double-counted: whenever you view traffic for an individual interface, Kentik reports both the flow traffic and the SNMP traffic (use the device/interface tab and select traffic, or use the Data Explorer and filter by interface). When the flow traffic on an interface (or set of interfaces) is compared to the SNMP-recorded interface traffic, the two metrics should be within 20 percent of one another.

Note: Kentik does not currently remove duplicate flows resulting from enabling both ingress and egress flow creation.
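
The 20 percent comparison between flow traffic and SNMP traffic can be expressed as a simple check. This sketch assumes the difference is measured relative to the SNMP value, which the text above does not specify; the function name and the example figures are hypothetical.

```python
def flow_matches_snmp(flow_rate, snmp_rate, tolerance=0.20):
    """True if flow traffic is within `tolerance` of SNMP traffic.

    Assumption: the difference is taken relative to the SNMP value.
    """
    if snmp_rate <= 0:
        raise ValueError("SNMP rate must be positive")
    return abs(flow_rate - snmp_rate) / snmp_rate <= tolerance


# 950 Mbps of flow against 1,000 Mbps of SNMP traffic is within tolerance.
print(flow_matches_snmp(950, 1000))

# Roughly double the SNMP figure may point to ingress + egress double counting.
print(flow_matches_snmp(1900, 1000))
```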

© 2014- Kentik