KDE Tables

This article covers the following topics related to the tables of the Kentik Data Engine (KDE):

Notes:
- For general information about KDE, see KDE Overview.
- For information about querying Kentik Data Engine see Querying KDE.

 

About KDE Tables

KDE keeps a discrete set of databases for the flow records of each customer. These databases are made up of “main tables” for the flow records and associated data received for each device (router or host) as well as for supplemental data that is derived from the flow data. The data for all devices of a given customer are also merged into that customer's All Devices Table.

The columns in these tables form the basis for the dimensions (see Dimensions Reference) that are used for filtering and group-by in Kentik queries. Most of these dimensions are exposed in the portal UI's dimension selectors for filtering (see Dimension Selector Dialog) and group-by (see Dimension Selectors). A small number of additional dimensions, covered in the topics below, are available for use only when querying via the Query SQL Method of the Query API.

 

Universal Data Records

Universal Data Records is Kentik's technique for populating the fields of KDE tables. As described in About KDE Tables, many of the columns of the main tables in which flow data is stored correspond to fields of the flow records ingested from individual devices. Rather than reserving columns for fields that are not used by a given device type (router, switch, firewall, etc.), Universal Data Records varies the mapping of fields to columns depending on the device type. This flexible schema enables Kentik to rapidly extend KDE's support for storing and querying data from new and disparate sources, which allows visibility into a much wider range of data points about customer networks and infrastructure. Among the many advantages of this approach are the ability to store vendor-specific flow fields (see Device-specific Dimensions), increased capacity for Custom Dimensions, and the ability to store data from flow records with non-standard fields.

 

Main Table Columns

The topics below cover the columns of the main table that are not exposed as filter and/or group-by dimensions in the portal UI:

Note:
- The columns in the topics below are used only when querying with SQL via the Query SQL Method of the Query API.
- For information on columns that are exposed in the portal as filter and/or group-by dimension, see Dimensions Reference.
- Some columns fit into more than one category. If you don’t find a column where you think it would be, check other categories.

 
top  |  section

Query Columns

Used to specify dataseries (Full or Fast; see Resolution Overview), and other query-related data.

Column Type Description
i_fast_dataset boolean Input only. Determines the dataseries on which a query is run:
- if not specified, dataseries selection is automatic;
- if false, use Full dataseries;
- if true, use Fast dataseries.
i_sub_limit integer Passes a limit value that applies to all subqueries used in the current KDE query. For example, if querying for top IPs with i_sub_limit = 2, then for every main table queried by a subquery only the top two IPs are contributed to the main query results.

 
top  |  section

Time Columns

Used to specify time ranges.

Column Type Description
ctimestamp bigint Unix Epoch when flow was received by Kentik, 1 second resolution.
i_start_time timestamp with time zone The time that flow was received, rounded to the nearest minute for display. Searching allows granularity to the nearest second (see Time Management in Queries).

 
top  |  section

Metrics Columns

Metrics are used to specify the units in which traffic is measured, e.g. for ranking in a top-X list. The metrics listed here are not exposed as dimensions in the portal, but may be used in API calls made with the Query SQL Method.

Column Type Description
both_bytes bigint Sum of in_bytes and out_bytes.
both_pkts bigint Sum of in_pkts and out_pkts.
ooorder_in_pkts bigint Out of order packets sent through the ingress interface (host devices only).
ooorder_out_pkts bigint Out of order packets sent through the egress interface (host devices only).
retransmitted_in_pkts bigint Retransmitted packets sent through the ingress interface (host devices only).
tcp_retransmit integer Number of packets re-transmitted (host only).

Note: For metrics that are exposed as dimensions in the portal, see Dimensions Reference.

 
top  |  section

Additional Columns

While the dimensions needed for almost any flow-based query are exposed in the portal UI (see Dimensions Reference), the columns listed below are available as additional dimensions that can be used in queries made with the Query SQL Method of the Query API.

Tags Columns

Matches to user-specified tags (see About Flow Tags).

Column Type Description
dst_flow_tags text Tags that match on destination flow.
src_flow_tags text Tags that match on source flow.

NPM Columns

Network performance monitoring data, including retransmits, out-of-order packets, and latency (client, server, and application).

Column Type Description
fragments bigint Packets that have been split into smaller packets for delivery across the network.
ooorder_in_pkts bigint Packets that arrived out of sequence.
ooorder_out_pkts bigint Packets that were sent out of sequence.
retransmitted_in_pkts bigint Packets re-sent from source to destination.
Note: Valid only with a reliable transport protocol such as TCP.

DNS/WWW Columns

Data related to DNS lookup and HTTP, including domain name, referrer, status, etc.).

Column Type Description
i_kflow_dns_query_type_name text The name of the resource record type requested by a DNS query.

Ultimate Exit Columns

Information about the Ultimate Exit of the traffic, which is where it leaves the network (device, interface, site, etc.; see Using Ultimate Exit).

Column Type Description
i_inet_ult_exit text The IP address, either IPv4 or IPv6, of the interface through which the flow leaves.
i_ult_exit_site_id bigint The ID of the site (see About Sites) through which the flow leaves.
ult_exit_device_id bigint ID of device through which the flow leaves.

Interface Classification Columns

Information about the interface though which a flow enters and leaves a device; see Interface Classification).

Column Type Description
dst_connect_type bigint A numeric code representing a connectivity type value (destination interface).
src_connect_type bigint A numeric code representing a connectivity type value (source interface).
ult_exit_connect_type bigint A numeric code representing a connectivity type value (ultimate exit interface).
dst_network_bndry bigint A numeric code representing a network boundary value (destination interface).
src_network_bndry bigint A numeric code representing a network boundary value (source interface).
ult_exit_network_bndry bigint A numeric code representing a network boundary value (ultimate exit interface).

CDN Attribution Columns

Information about content delivery networks associated with source and destination traffic (see About CDN Attribution):

Column Type Description
dst_cdn_int bigint Destination CDN (content delivery network).
i_dst_cdn text The commercial name of the CDN derived from the destination IP (inet_dst_addr) of an ingested flow.
i_src_cdn text The commercial name of the CDN derived from the source IP (inet_src_addr) of an ingested flow.
src_cdn_int bigint Source CDN (content delivery network).

Threat Feed Columns

Information about source and destination hosts and IPs that have been identified as a security threat by Spamhaus (updated daily):

Column Type Description
dst_threat_type text Destination IP type of threat (as defined by Spamhaus).
src_threat_type text Source IP type of threat (as defined by Spamhaus).

 

Using Main Tables

The following topics cover special considerations affecting the use of KDE Main Table columns:

 
top  |  section

Aggregation Duration

In queries covering a timespan of more than a few minutes, KDE performs aggregation to standardize both the interval between reported data points and the start and end times of the covered data. The duration used for this aggregation is determined internally by KDE depending on the query's specified timespan. An aggregation duration of, for example, 60 seconds results in a data point interval of one minute with the start and end of the timespan adjusted to the closest minute.

To represent the aggregation duration in the query itself, KDE supports a variable named i_duration. The i_duration value is expressed in seconds. It will be the same for all rows of all tables queried, and can be used to normalize bytes/packets to bps or pps. When a query includes a GROUP clause covering other fields (e.g. src_as), i_duration is used in an aggregate function in the SELECT clause, e.g. SELECT max(i_duration).

Note: i_duration cannot be used in a WHERE clause.

 
top  |  section

Required Select Columns

The inclusion in a KDE query of any of the main table columns described above is for the most part optional. But unless the SELECT statement in a given query includes one or more of the following columns, that query's results will be empty (no rows returned):

  • in_bytes
  • in_pkts
  • out_bytes
  • out_pkts
  • both_bytes
  • both_pkts
  • tcp_retransmit
 
top  |  section

MAC Address Columns

The following information applies to the Main Table columns src_eth_mac and dst_eth_mac, which store MAC address data collected from flow data of type sFlow, IPFIX (depending on configuration), and pcap:

  • MAC address entry in the portal, psql client, or API may be any of the following formats:
    - 11.11.11.11.11.11
    - 11:11:11:11:11:11
    Note: Addresses may be entered with or without leading zeros (e.g. 1.1.1.1.1.1 or 01.01.01.01.01.01).
  • MAC address display in the portal and in query results is always period-separated (e.g. 11.11.11.11.11.11).
  • Only the following operators are supported in queries for use with these columns:
    =, <>, !=.
 

All Devices Table

In addition to the individual table for each device, each customer’s database includes a special table named all_devices, which is a merge into one table of the data from all of that customer’s devices. The All Devices table includes all the fields of an individual device table, but also includes the following additional fields:

Column Type Description
i_device_id text Kentik-assigned unique numerical ID of the device.
Used for all functions.
i_device_name text User-defined name for the device.
i_device_site_name text Name of the site to which the device has been assigned (see About Sites). If the device hasn't been assigned to a site, returns an empty string.
Notes:
- Supported operators for WHERE clause: case-insensitive equality, LIKE, IN, and regex matching.
- Site assignments in the table may lag Admin settings by up to 10 minutes.
i_device_type text Type of device, e.g. router or host (see Supported Device Types).
Note: Used only for selection (filtering with WHERE clause), not for display or GROUP_BY.

The additional fields of the All Devices table can be used to filter returned data by individual routers, or to further group by device using i_device_id.

Note: The All Devices fields above are exposed as filtering and group-by dimensions in the portal (see Dimensions Reference).

© 2014- Kentik
In this article:
×