Hosts that send flow data to Kentik do so via kprobe host agent software. The use of kprobe with Kentik is covered in the following topics:
- About kprobe
- Host Metrics and Dimensions
- kprobe Requirements
- kprobe Traffic Capacity
- Registering kprobe Devices
- kprobe Download and Install
- kprobe Configuration
- Host Flow Via Proxy
Notes:
- For help installing and configuring kprobe, please contact email@example.com.
- kprobe has replaced nProbe as the host agent software used by Kentik to collect traffic data from hosts. While existing nProbe devices will continue to work, it’s no longer possible to create new nProbe devices.
About kprobe
Kentik is designed to enable flow monitoring not only of routers and switches but also of hosts, which can send augmented flow data. In addition to flow records (NetFlow, sFlow, IPFIX), this augmented data includes Network Performance Metrics (retransmits, network latency, and application latency) as well as Layer 7 info such as requests/responses for both DNS and HTTP. This information is unified in the Kentik Data Engine (KDE) with other data such as GEO and BGP, providing the user — and our anomaly detection/alerting system — with a comprehensive view of where traffic is originating and terminating, traffic performance relative to Internet routing paths, and actual HTTP and DNS requests.
The host agent that enables the above functionality is kprobe, which runs on Linux hosts. kprobe enables customers to send encrypted flow records from a host to Kentik. The agent listens for incoming and outgoing packets on a network interface and generates flow data from the packets received.
- kprobe is included with your Kentik subscription or trial.
- Each instance of kprobe will send HTTPS-encrypted data directly to Kentik on port 443. If it’s not possible to connect to the Internet, the data can be sent through an HTTP proxy instead (see Host Flow Via Proxy).
Host Metrics and Dimensions
Based on the data sent from kprobe, Kentik is able to make available a comprehensive set of host-related metrics and dimensions, which are covered in the following topics:
- Host Traffic Dimensions: see Host Traffic Dimensions.
- Host Traffic Metrics: see Host Traffic Metrics.
kprobe Requirements
The following resources must be available to support the use of kprobe:
- Up to one CPU core per kprobe instance.
- RAM allocation of 2GB per instance; actual usage is typically 1GB.
In addition to the above, kprobe must be able to open multiple HTTPS sessions on port 443 to multiple *.kentik.com hosts (or *.kentik.eu hosts if your organization is registered on Kentik's EU cluster). Please ensure that any proxies, firewalls, routers, and NAT boxes permit this communication. kprobe works properly through NAT and proxies.
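Before installing kprobe it can be useful to confirm that outbound connections on port 443 are permitted from the host. The following Python sketch attempts a plain TCP connection to the hosts named in the default --api-url and --flow-url values listed later in this article. The function names are illustrative only, and a successful TCP connection does not prove that a full HTTPS session will be allowed by an intercepting proxy.

```python
import socket
from urllib.parse import urlsplit

# Default endpoints quoted in this article for --api-url and --flow-url.
DEFAULT_URLS = [
    "https://api.kentik.com/api/internal",
    "https://flow.kentik.com/chf",
]


def host_and_port(url):
    """Return (hostname, port) for a URL, defaulting to 443 for https."""
    parts = urlsplit(url)
    default_port = 443 if parts.scheme == "https" else 80
    return parts.hostname, parts.port or default_port


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refusals, and timeouts.
        return False


def report(urls=DEFAULT_URLS):
    """Print one line per endpoint stating whether it could be reached."""
    for url in urls:
        host, port = host_and_port(url)
        state = "reachable" if can_connect(host, port) else "not reachable"
        print(f"{host}:{port} {state}")
```

Calling report() on the host prints one line per endpoint. Organizations on the EU cluster would need to substitute the corresponding *.kentik.eu hosts.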
kprobe Traffic Capacity
Each monitored interface requires its own individual instance of kprobe, with only one such instance per interface. Each of these instances uses only a single core, which prevents excessive CPU utilization but also determines the volume of traffic that can be handled per interface. The following table provides a very rough guide to how kprobe’s traffic capacity per interface (maximum in-plus-out bits) varies by sampling rate.
| Sampling ratio | Max traffic volume |
|---|---|
| 1:1 or 1:2 | 100 Mbps |
| 1:4092 | > 10 Gbps |
- Actual performance is affected by a number of factors. A procedure for matching sample rate to traffic volume is outlined in Setting Sampling Rate.
- If protocol decoding (e.g. DNS/WWW data) is not needed, you can optimize performance by disabling decoding (see Disabling Protocol Decoding).
Registering kprobe Devices
Each kprobe instance that will be sending flow records to Kentik must be registered as a device with Kentik. Device registration may be handled in either of the following ways:
- On the Add Device page of the Kentik portal (see Add a Device).
- Using the Device Create call from the Device API in V5 Admin APIs.
Registering a host in Kentik involves specifying the fields that are described in the KB topic Device Admin Dialogs. The following information about specific fields will help ensure correct registration of a kprobe host:
- Device type: Choose type “Kentik Host Agent (kprobe).”
- Device IP(s): Enter the IP address of the host that will be sending data. This can be either public or private. If the device has multiple IPs you can choose any of them.
Note: This is an identifier used to associate a given device object in Kentik with the set of kprobe instances on a given host. It must be unique within each company's account.
- BGP Type: To look at Host Traffic Metrics by path you should assign a BGP table to the device by setting BGP Type to "Use table from another peered device."
Once a kprobe host is registered as a device it will be represented as a row in the Device List on the Devices page (Admin » Devices; see Device List):
- The Flow column will show a checkmark to indicate receipt of flow data.
- The device's type in the list will be indicated as DNS and its flow type will be indicated as “Kentik” (internal Kentik flow format).
- Note the value of the new host’s ID. You will need to enter this value in the command line when launching kprobe, thereby linking the newly registered device to a specific kprobe instance.
kprobe Download and Install
To use kprobe, you'll download and install the executable on each host that you want to monitor:
- Go to our downloads page at https://packagecloud.io/kentik.
- Click kprobe in the list of available downloads.
- On the kprobe Packages page, confirm that a version exists for your Linux distribution and version.
- Click Installation in the sidebar at left.
- On the resulting Installation Instructions page, use the Bash Scripts column to choose the tab corresponding to the package type (deb, rpm, node, python, or gem) that you need for your distribution of Linux.
- At the top of the resulting tab, click the Copy button to copy the quick install cURL to the clipboard.
- Run the quick install cURL in Terminal. The package script will run, downloading and installing the packages.
Note: kprobe must run as root.
kprobe Configuration
kprobe is configured with command line settings, which are covered in the topics that follow.
kprobe Command Line
The following command line arguments are used for the standard kprobe setup:
- --email (required): Your email address (assuming that you are a registered Kentik user) as displayed on your User Profile, which is accessed by clicking your username at the right of the navbar.
- --token (required): A Kentik-generated string used to authenticate a registered user (must be the same user as for --email), which is found in the API Token field of the User Profile.
- --interface (required): The name of the interface that kprobe will monitor, e.g. eth0. Each interface uses its own individual instance of kprobe.
- Any of the following mutually exclusive parameters may be used to bind the host to the corresponding device that has been registered in Kentik (see Registering kprobe Devices). If none of the below are present, kprobe will attempt to identify the device by name using the hostname of the system on which kprobe itself is running:
--device-id (recommended): The Device ID shown for the host in the Device List or returned as the value of the id field in the response JSON of the Device API.
--device-ip: The IP address (IPv4) shown for the host in the Device List or returned as the value of the sending_ips field in the response JSON of the Device API (see Device JSON).
--device-if: The name, e.g. eth0, of the interface having the IP address that the device has been registered to in Kentik. This option may be preferred for deployments involving automated or script-based provisioning.
--device-name: The name shown for the host in the Device List or returned as the value of the name field in the response JSON of the Device API. This option may be preferred for deployments involving automated or script-based provisioning.
Note: Any character other than A-Z, a-z, 0-9, or "_" (underscore) will be replaced with an underscore.
- --device-plan (may be required): The ID of the Kentik plan to which a new device should be assigned. Will be ignored if the device already exists (e.g. re-launching a device).
- --sample (optional): The denominator of a ratio that represents captured_flows/total_flows where captured_flows (numerator) always equals one and is not stated. For example, if this parameter is specified as 256 then one out of every 256 flows will be captured. For recommended setting, see Setting Sampling Rate.
Note: If you don't specify sample rate in the command line then the sample rate will be determined by the Sample Rate field (see Device General Settings) on the Add Device or Edit Device dialog of the portal's Devices page (Admin » Devices).
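The character replacement described in the note on --device-name can be sketched in Python as follows. This illustrates the rule as stated above and is not kprobe's own code. The function name is hypothetical. It may help predict which registered device name a given hostname will be matched against.

```python
import re


def normalized_device_name(name):
    """Replace every character other than A-Z, a-z, 0-9, or "_" with "_"."""
    return re.sub(r"[^A-Za-z0-9_]", "_", name)


# Example with a hypothetical hostname.
print(normalized_device_name("web-01.example.com"))  # web_01_example_com
```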
The following example shows the structure of a typical command line using the arguments described above (with placeholder values highlighted):
# /usr/local/bin/kprobe --email firstname.lastname@example.org --token user_api_token --interface eth0 --device-id ##### --device-plan ##### --sample ####
- The above example would result in protocol decoding (e.g. DNS/WWW data), which could impact kprobe traffic capacity (see kprobe Traffic Capacity). To disable protocol decoding, use the --no-decode flag (see kprobe Optional Features).
- To send encrypted flow from kprobe to Kentik via Kentik’s kproxy (NetFlow proxy agent), use the optional --proxy-url parameter.
- Either a space or "=" can be used between an argument name and its value.
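For scripted provisioning, the arguments described above can be assembled programmatically. The following Python sketch builds an argument list using only flags documented in this topic and rejects more than one device-binding option, since those are mutually exclusive. The function name, the example email address, and the device ID 12345 are placeholders and not real values.

```python
import shlex


def build_kprobe_args(email, token, interface, *, device_id=None,
                      device_ip=None, device_if=None, device_name=None,
                      device_plan=None, sample=None, no_decode=False,
                      binary="/usr/local/bin/kprobe"):
    """Return the kprobe command line as a list of strings."""
    bindings = {
        "--device-id": device_id,
        "--device-ip": device_ip,
        "--device-if": device_if,
        "--device-name": device_name,
    }
    chosen = [(flag, value) for flag, value in bindings.items()
              if value is not None]
    if len(chosen) > 1:
        raise ValueError("device binding options are mutually exclusive")

    args = [binary, "--email", email, "--token", token,
            "--interface", interface]
    for flag, value in chosen:
        args += [flag, str(value)]
    if device_plan is not None:
        args += ["--device-plan", str(device_plan)]
    if sample is not None:
        args += ["--sample", str(sample)]
    if no_decode:
        args.append("--no-decode")
    return args


# Print a shell-ready command line built from placeholder values.
print(shlex.join(build_kprobe_args(
    "user@example.com", "user_api_token", "eth0",
    device_id=12345, sample=256)))
```

The returned list can be passed directly to subprocess.run, which avoids shell quoting issues. Because kprobe must run as root, the calling script needs the corresponding privileges.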
kprobe Optional Features
The following command line parameters and flags are used to enable optional features and behaviors:
- --no-decode: This flag disables all protocol decoding (see Disabling Protocol Decoding).
- --proxy-url: The IP address on which you want kproxy to listen when it is used to forward flow data from kprobe to Kentik (see Host Flow Via Proxy).
Example: --proxy-url http://proxy.example.com
- --http-port: Set a port on which to decode HTTP traffic (in addition to port 80). Specify multiple times for multiple additional ports.
Example: --http-port 8080
- --promisc: This flag enables promiscuous capture of all network traffic seen by the NIC (see the Wikipedia article on promiscuous mode).
- --translate: Replace, in the IP and port fields of kprobe-generated flow records, existing values with alternate values. Typically used to replace internal IP and port with external IP and port when running kprobe in a cloud environment. This parameter can be included multiple times per command line to configure multiple translations. The parameter value is a comma-separated list in the following order: existing IP, existing port, alternate IP, alternate port.
Example: --translate 184.108.40.206,80,220.127.116.11,8080
- --device-site: The ID of the site (see About Sites) to which a new device will be assigned when it is created.
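The value format described for --translate (existing IP, existing port, alternate IP, alternate port, separated by commas) can be generated rather than typed by hand. The following Python sketch formats one or more translations and performs basic validity checks first. The function names and the addresses used in the example are illustrative, and the checks are an assumption about what is sensible input, not a description of kprobe's own validation.

```python
import ipaddress


def translate_value(existing_ip, existing_port, alternate_ip, alternate_port):
    """Return one --translate value in the documented field order."""
    for ip in (existing_ip, alternate_ip):
        ipaddress.ip_address(ip)  # raises ValueError if not a valid IP
    for port in (existing_port, alternate_port):
        if not 0 < int(port) < 65536:
            raise ValueError(f"port out of range: {port}")
    return f"{existing_ip},{existing_port},{alternate_ip},{alternate_port}"


def translate_args(mappings):
    """Return repeated --translate arguments, one pair per mapping."""
    args = []
    for mapping in mappings:
        args += ["--translate", translate_value(*mapping)]
    return args


# Two hypothetical internal-to-external translations.
print(translate_args([
    ("10.0.0.5", 80, "203.0.113.10", 8080),
    ("10.0.0.5", 443, "203.0.113.10", 8443),
]))
```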
HTTP Status Server
The following additional command line options are used to start a simple HTTP status server:
- --status-host: The listen address for the HTTP status server. If status-port is present but this parameter is not then the host will default to 127.0.0.1.
- --status-port: The listen port for the HTTP status server. The status server will only be started if this parameter is present and its value is not zero.
The server started with the above parameters is accessible via a GET request to http://host:port/v1/status, where host and port are the values of --status-host and --status-port. The call returns some basic flow statistics in JSON.
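A minimal Python sketch for polling the status server is shown below. It assumes only what is stated above: the /v1/status path and a JSON response. The function names are hypothetical, and no particular fields in the response are assumed.

```python
import json
from urllib.request import urlopen


def status_url(host, port):
    """Build the status URL from the --status-host and --status-port values."""
    return f"http://{host}:{port}/v1/status"


def fetch_status(host, port, timeout=5.0):
    """GET the status endpoint and return the decoded JSON document."""
    with urlopen(status_url(host, port), timeout=timeout) as response:
        return json.load(response)
```

For example, if kprobe was launched with --status-port 9999 (an arbitrary port chosen for illustration) and no --status-host, then fetch_status("127.0.0.1", 9999) would return the statistics as a Python object.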
Consult with Kentik support (email@example.com) before using these command line parameters for debugging:
- --api-url: Kentik API URL. Default value is https://api.kentik.com/api/internal
Example: --api-url http://example.com/api
- --flow-url: Kentik flow intake URL. Default value is https://flow.kentik.com/chf
Example: --flow-url http://example.com/flow
- --metrics-url: Kentik metrics URL. Default value is https://flow.kentik.com/tsdb
Example: --metrics-url http://example.com/metrics
- --snaplen: Optional maximum packet capture length.
Example: --snaplen 1024
Note: The above parameters and flags should not be changed in normal use.
Setting Sampling Rate
General considerations related to setting the flow sampling rate for Kentik devices are covered in Flow Sampling. The following additional examples may help you optimize the sampling rate when using kprobe.
For a host handling 10-20 Gbps:
- Start with a sample:flow ratio of 1:256 (--sample 256) or 1:512.
- Check the FPS reported in the Max FPS 5m column of the Device List (Admin » Devices). The preferred range is 1000-2000. Adjust the sample rate as needed.
- Check CPU utilization. If kprobe is running at greater than 90% CPU, increase the --sample value (e.g. from 256 to 512) so that fewer flows are captured.
For a host handling only a few hundred Mbps:
- Start with a low sampling value such as 10 (--sample 10).
- Check FPS in the device list and tune the sample rate to achieve the desired FPS. At this flow volume, about 100 FPS provides good resolution, but you may want to vary that depending on how you are using the data (e.g. lower the sampling value for forensics, or increase it for traffic engineering).
Note: The maximum FPS available for any given device depends on the Plan (see About Plans) to which that device belongs. If the maximum FPS is exceeded, Kentik will downsample.
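The tuning loop described above (observe FPS, then adjust the sample rate) can be approximated with simple proportional arithmetic. The following Python sketch assumes that FPS is roughly inversely proportional to the --sample value, which is a simplification and not a documented property of kprobe. The function name is hypothetical. Treat the result as a starting point for the next iteration rather than a final setting.

```python
def suggest_sample(current_sample, observed_fps, target_fps):
    """Scale the --sample value so that FPS moves toward target_fps.

    Assumes FPS is roughly inversely proportional to the --sample value.
    """
    if current_sample < 1 or observed_fps <= 0 or target_fps <= 0:
        raise ValueError("all inputs must be positive")
    return max(1, round(current_sample * observed_fps / target_fps))


# A host at --sample 256 reporting 3000 FPS, aiming for 1500 FPS.
print(suggest_sample(256, 3000, 1500))  # 512
```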
Disabling Protocol Decoding
Collection of DNS/WWW data (see Host Traffic Dimensions) is enabled by default. When monitoring host interfaces where collection of DNS/WWW data is not required, kprobe's traffic capacity can be optimized by disabling protocol decoding. To disable decoding, add the optional --no-decode flag to the kprobe command line (see kprobe Optional Features).
Enabling OTT DNS Collection
Kentik’s OTT Service Tracking workflow enables you to track traffic for the various categories of OTT services reaching your subscribers (video, gaming, social media, etc.), to see top-X breakdowns of OTT providers in each category, and to drill down into traffic details for individual services and providers. These features depend on collection of DNS records, which requires enabling kprobe's DNS submode. When this submode is enabled, kprobe will collect DNS data from the host (presumably running on a DNS resolver) and send it to Kentik for analysis.
- kprobe's DNS submode does not make DNS data accessible to Kentik customers.
- When the DNS submode is enabled the kprobe device will appear as offline in the status columns of the Device List (Settings » Devices).
DNS Submode Example
The following command line is used to run kprobe in DNS submode:
kprobe --email firstname.lastname@example.org --token 012345API-Token543210 --device-id deviceID --interface ens18 --sample 1 dns
Note: OTT DNS collection must also be turned on by Kentik Customer Service for your organization's account.
Host Flow Via Proxy
In situations where it’s not possible for kprobe to communicate with Kentik directly via the Internet, kprobe can be used in conjunction with an HTTP proxy such as kproxy, Kentik’s NetFlow proxy agent. The proxy agent will enable Kentik customers to route flow data from multiple hosts to Kentik via a single point of contact rather than directly from each individual host. To do so:
- Modify the kprobe configuration as described in Configure kprobe for kproxy.
- Download kproxy as described in kproxy Download and Install.
- Configure the kproxy installation as described in Configure kproxy for kprobe.
Configure kprobe for kproxy
To use kprobe with kproxy, add a --proxy-url parameter when configuring kprobe (see kprobe Command Line) and set the value to the address on which you want kproxy to listen, written as a URL (e.g. --proxy-url http://proxy.example.com, the example given in kprobe Optional Features).
Configure kproxy for kprobe
The command line arguments used when configuring kproxy for use with kprobe are described in the following list.
- -api_email (required): The email address of a registered user as displayed on that user's API System page, which is accessed via the API button for that user on the Users page.
- -api_token (required): A Kentik-generated string that kproxy will use to authenticate a registered user (must be the same user as for -api_email). The API token of a registered user is found on that user's API System page.
- -proxy-http (required): Set to the listen address and port (e.g. 0.0.0.0:2020) on which kproxy will listen.
The following example shows the structure of a typical command line using the arguments described above (with placeholder values highlighted):
kproxy -api_email=api_email -api_token=api_token -proxy-http=0.0.0.0:2020
Alternatively, if you prefer to keep the API token out of the command line, you can pass it in an environment variable:
KENTIK_API_TOKEN=api_token kproxy -api_email=api_email -proxy-http=0.0.0.0:2020
- If kproxy fails to launch, add the -verbose flag and try again so that you can provide the output to email@example.com in order to facilitate troubleshooting.
- Use -h to return a list of arguments.
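When kproxy is launched from a provisioning script, the environment-variable form shown above keeps the token out of the process argument list. The following Python sketch builds the same command line and environment using only the arguments and the KENTIK_API_TOKEN variable documented in this topic. The function names are hypothetical, and it assumes the kproxy binary is on the PATH.

```python
import os
import subprocess


def kproxy_command(api_email, listen="0.0.0.0:2020", binary="kproxy"):
    """Return the kproxy argument list without the API token."""
    return [binary, f"-api_email={api_email}", f"-proxy-http={listen}"]


def kproxy_environment(api_token, base_env=None):
    """Return a copy of the environment with KENTIK_API_TOKEN set."""
    env = dict(os.environ if base_env is None else base_env)
    env["KENTIK_API_TOKEN"] = api_token
    return env


def start_kproxy(api_email, api_token, listen="0.0.0.0:2020"):
    """Launch kproxy as a child process and return the Popen handle."""
    return subprocess.Popen(kproxy_command(api_email, listen),
                            env=kproxy_environment(api_token))
```

Appending "-verbose" to the list returned by kproxy_command produces the troubleshooting form mentioned above.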