Kentik for AWS
Bringing VPC Flow Logs Into Kentik
As cloud-only and hybrid cloud services become increasingly commonplace, network operators of all kinds need a unified environment within which to view and analyze the data generated by their network activities. Kentik® collects, derives, and correlates this network traffic data — flow records, BGP, GeoIP, SNMP, etc. — to enable visualization, monitoring, alerting, and analytics. The data may be collected not only from routers (including related hardware like switches) and hosts (via a software host agent) but also from flow logs generated by your resources that are hosted by cloud service providers such as Amazon Web Services.
In this article we'll look at how to get log data from AWS to Kentik. The following topics will guide you through the setup process:
- This article outlines the manual setup procedure. Kentik also supports automated setup with Terraform; see AWS Automated Setup.
- For help with any aspect of the setup workflow outlined below, please contact Kentik Customer Support.
About AWS Flow Logs
The basics of AWS flow logs are covered in the following topics:
AWS Flow Log Overview
In the world of Amazon Web Services, flow logs are analogous to the flow records (e.g. NetFlow, sFlow, etc.; see About Flow) generated by devices on physical networks. A flow log consists of a set of records about the flows that either originated or ended in a given Virtual Private Cloud, with each individual record made up of a set of fields giving information about a single flow.
Amazon allows you to set up a VPC Flow Log for a VPC, a subnet, or a network interface, and to publish that flow log to a destination folder within a “bucket” in Amazon Simple Storage Service (S3), which is the location from which a Kentik service pulls the logs for ingest into Kentik.
AWS flow logs are ingested from an S3 bucket into Kentik via a "cloud export" that is configured on the Kentik portal's Monitor your AWS Cloud page (see AWS Cloud Setup) and managed via the Public Clouds Page (Settings » Public Clouds). By default, each destination log folder in the bucket will be represented in Kentik as a "cloud device." For more information, see Cloud Exports and Devices.
AWS Flow Log Formats
AWS now supports the export of flow logs in two distinct formats:
- Default format (AWS standard format version 2): Each line of the log is a space-separated string with fields in the following order:
<version> <account-id> <interface-id> <srcaddr> <dstaddr> <srcport> <dstport> <protocol> <packets> <bytes> <start> <end> <action> <log-status>
- Custom format (AWS log format version 3 through 5): Each line is made up of one or more fields in a custom-specified order. The available fields, and the VPC flow logs version in which each field was introduced, are listed in the AWS documentation topic Available Fields.
Custom AWS flow logs must meet the following requirements for ingest into Kentik:
- The following fields are required: <srcaddr>, <dstaddr>, <srcport>, <dstport>, <packets>, <bytes>, <protocol>, <version>, and <start>.
- In addition to the required fields, a custom log format must include at least six other AWS fields.
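The two requirements above can be checked before a flow log is created. The following is a minimal sketch, not a Kentik tool: the function name check_custom_format is hypothetical, and it assumes the format is written the way AWS expresses custom formats, as space-separated ${field-name} tokens.

```python
# Fields that a custom-format flow log must contain for ingest into Kentik
# (taken from the requirements listed above; dstaddr is the AWS field name
# for the destination address).
REQUIRED_FIELDS = {
    "srcaddr", "dstaddr", "srcport", "dstport",
    "packets", "bytes", "protocol", "version", "start",
}
MIN_EXTRA_FIELDS = 6  # "at least six other AWS fields"


def check_custom_format(log_format: str) -> list[str]:
    """Return a list of problems; an empty list means the format passes."""
    # Assumption: tokens look like ${srcaddr}; strip the ${ } wrapper.
    fields = [token.strip("${}") for token in log_format.split()]
    problems = []

    missing = sorted(REQUIRED_FIELDS - set(fields))
    if missing:
        problems.append("missing required fields: " + ", ".join(missing))

    # Count distinct fields that are not in the required set.
    extras = [f for f in dict.fromkeys(fields) if f not in REQUIRED_FIELDS]
    if len(extras) < MIN_EXTRA_FIELDS:
        problems.append(
            f"only {len(extras)} additional fields; need at least {MIN_EXTRA_FIELDS}"
        )
    return problems
```

Note that a custom format listing only the 14 fields of the default format would not pass this check, because it contains just five fields beyond the required nine.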
AWS Flow Log Deletion
Flow log deletion minimizes the costs associated with log data retention in the cloud. Kentik is designed to support the following approaches to flow log deletion:
- Deletion by Kentik: Kentik's built-in log ingest process results in log files being deleted within 15 minutes of being posted to an AWS S3 bucket. To utilize this approach, you must give Kentik full access to the bucket (AmazonS3FullAccess) when setting permissions for the AWS role associated with the bucket. This setting is made in the Filter policies field on the Create Role page; see Create an AWS Role.
- Deletion by customer: If your role permissions for a given log bucket are set to AmazonS3ReadOnlyAccess then Kentik will not be able to delete log files automatically. AWS provides a number of options for deleting the contents, including log files, from a bucket; see the AWS documentation at Empty a bucket.
AWS Flow Log Resources
For detailed information about VPC Flow Logs, please refer to AWS documentation:
- For information about the structure of each flow record, including its various fields, see the AWS topic Flow Log Records.
- For information about the limitations of AWS flow logging, see the AWS topic Flow Log Limitations.
- For general information from Amazon about AWS VPCs, see Amazon VPC FAQs or VPC Quick Start.
- For general information from Amazon about S3 buckets, see Creating and Configuring an S3 Bucket.
AWS Logging Setup Overview
Kentik accesses your flow logs by pulling them from a bucket in Amazon Simple Storage Service (S3). Assuming that you already have a VPC in AWS, the following setup workflow (detailed in the topics below) will enable Kentik to access the logs for ingest into Kentik:
- In AWS’s Identity and Access Management (IAM) console, create a new AWS role and configure it with permissions that enable AWS services associated with Kentik’s account to access resources associated with your account.
- In AWS’s S3 console, create a bucket to which logs can be published.
Note: You may create a separate bucket for each region from which you will collect VPC flow logs or a combined bucket for all regions (see Bucket Allocation).
- In AWS’s VPC Dashboard, configure each VPC (or subnet or interface) to publish logs to a destination folder in the bucket. Kentik recommends using a different destination folder for each VPC.
- Back in the S3 console, confirm that logs are being published to the destination folders.
- In the Kentik portal, create a new cloud export (see AWS Cloud Setup) pointing to the bucket, which typically results in the creation of one Kentik “cloud device” for each destination log folder (see Cloud Exports and Devices).
Note: As noted earlier, AWS allows you to set up a VPC Flow Log for a VPC, a subnet, or a network interface. Both the list above and the topics below describe these tasks using a VPC as the example. Individual steps in these tasks may vary slightly if you are instead enabling logging from a subnet or interface. For details, please refer to the AWS documentation topic Creating a Flow Log that Publishes to Amazon S3.
Cloud Exports in the Portal
By default, successful completion of the tasks listed in the overview above will have the following effect in the Kentik portal (see Cloud Exports and Devices):
- A new cloud export will be shown as an added row in the Kentik portal’s Cloud Exports list (Settings » Public Clouds). The cloud export will represent the collection of VPCs (or subnets or interfaces) whose logs are pulled from the bucket specified during setup of the cloud export.
- The Devices column in the Cloud Exports list will show one or more cloud devices:
- Each of these cloud devices will represent all VPCs, subnets, and/or interfaces that publish logs to one destination log folder in the bucket.
- Each device will be named after its corresponding log folder.
- Each flow record ingested into KDE from a given cloud device will include the device’s name in the virtual column i_device_name, enabling you to group-by and filter on the device using the Device Name dimension.
Note: If the default approach described above results in inefficient allocation of flow to cloud devices, Kentik Customer Support will contact your organization to propose alternative allocation strategies that we can use.
AWS Logging Setup Tasks
The tasks required to set up the publishing of VPC Flow Logs to an S3 bucket are covered in the following topics:
Create an AWS Role
One part of enabling Kentik to export your VPC flow logs is to give permission to our services to access the needed resources in your account. You’ll do this by creating a new “role” in AWS for each region from which you wish to export the logs. You'll then assign to each of those roles a set of permissions that grant access by Kentik to the EC2 APIs corresponding to the VPC instances from which the logs will be exported.
Note: AWS recommends creating a new role for logging rather than using an existing role.
Create Policy for Role
To create a new AWS role, you'll first need to create a policy:
- Log into the console of your AWS account and go to the Identity and Access Management (IAM) page at https://console.aws.amazon.com/iamv2/home#/policies
- Click the Create Policy button.
- Click the JSON tab on this page.
- In the resulting JSON editor, overwrite the existing JSON by pasting in the JSON shown below in AWS Policy JSON.
- Click the Next: Tags button at bottom right. In the subsequent step, you may choose to add tags to describe this policy.
- Click the Next: Review button at bottom right.
- Supply a name and description for the newly created policy, for example:
- Name: Kentik-Metadata-Policy
- Description: Policy allowing Kentik Technologies permissions to read and list all resources in the CloudWatch, Direct Connect, EC2, ELB and Network Firewall services.
- Click the Create Policy button at bottom right.
AWS Policy JSON
The following JSON is used on the JSON tab of the Create Policy page:
Attach Policy to Role
Once you've created a policy, you can attach it to a new role:
- Log into the console of your AWS account and go to the Identity and Access Management (IAM) Roles page at https://console.aws.amazon.com/iamv2/home#/roles.
- On the Roles page, click the Create Role button.
- On the first tab of the Create Role page:
- Select “Another AWS Account” as the type of trusted entity.
- Enter “834693425129” as the Account ID.
- Click the Next: Permissions button at bottom right.
- Use the Filter policies field to find the policy that was just created, then check the checkbox to attach it to the new role.
- Decide which of the following permissions you'd like to use (see AWS Flow Log Deletion):
- AmazonS3FullAccess if you want Kentik to delete the log files.
- AmazonS3ReadOnlyAccess if you want your own organization to manage the deletion of log files.
Note: Undeleted log files may lead to additional data storage charges from AWS.
- Clear the Filter policies field, then use it to find the permission you chose in the previous step. Check the checkbox to attach the permission to the new role.
- Click the Next: Tags button at the bottom right.
- Click the Next: Review button at bottom right:
- In the Role name field, enter a name for the new role.
- In the Role description field, enter a brief description for the new role.
- Next, click the Create Role button at bottom right. You'll be taken back to the main Roles page. Your new role should appear at the bottom of the list of roles (if the list is long you can filter for the new role by entering its name in the filter field).
Configure the AWS Role
Once you've created a new role, you'll need to configure the "trust relationship" that allows Kentik services to access resources owned by your account:
- On the main Roles page in the IAM section of the AWS console, click the new role in the list.
- On the resulting Summary page for the role, click the Trust Relationships tab, then click the Edit Trust Relationship button to open the Edit Trust Relationship page.
- Paste the Trust Relationships JSON (below) into the Policy Document field, then click the Update Trust Policy button.
- Back on the role's Summary page, click the Copy to Clipboard icon at the right of the Role ARN field (first line of summary). Save the copied role ARN (Amazon resource name), which you'll need when you finish the log import workflow in the Kentik portal.
We’ve now created a new role (e.g. “Flow_Logs_Test”) and established a trust relationship between that role and the role “eks-ingest-node” in the AWS account 834693425129 (Kentik). This relationship gives the Kentik role permission to use a specified set of AWS services (AmazonEC2ReadOnlyAccess, plus either AmazonS3FullAccess or AmazonS3ReadOnlyAccess) on some AWS resources (i.e. an S3 bucket) that we will create and assign to the new role.
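To illustrate what such a trust relationship looks like in general, the sketch below builds a standard IAM trust policy naming the Kentik account ID and role mentioned above. It is an assumption about the general shape only: the function name build_trust_policy is hypothetical, and the document that Kentik supplies may contain additional elements, so always paste the JSON provided by Kentik rather than this output.

```python
import json

# Values taken from the text above.
KENTIK_ACCOUNT_ID = "834693425129"
KENTIK_ROLE_NAME = "eks-ingest-node"


def build_trust_policy(account_id: str, role_name: str) -> dict:
    """Build a generic IAM trust policy that lets a role in another
    account assume the role this policy is attached to."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The principal is the ARN of the external role being trusted.
                "Principal": {
                    "AWS": f"arn:aws:iam::{account_id}:role/{role_name}"
                },
                "Action": "sts:AssumeRole",
            }
        ],
    }


if __name__ == "__main__":
    policy = build_trust_policy(KENTIK_ACCOUNT_ID, KENTIK_ROLE_NAME)
    print(json.dumps(policy, indent=2))
```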
Trust Relationships JSON
The following JSON defines the trust relationship for the AWS role that allows Kentik to export flow logs:
Create an S3 Bucket
After creating and configuring a new role you’ll need to establish a container in which your flow logs can be collected and from which they can be accessed by the Kentik role eks-ingest-node. To do this we’ll create a “bucket” in Amazon Simple Storage Service, commonly referred to as “S3.”
Flow logs may be exported from AWS to Kentik using either of the following approaches to allocating buckets:
- Local buckets: Send the logs from resources in a given region to an S3 bucket in the same region. You'll work through the steps below once for each local bucket, and set that bucket as the flow log destination for all of the resources that you want represented in Kentik as a single cloud export (see Exports and Devices in AWS).
- Centralized buckets: Send the logs generated in some enabled regions to S3 buckets that may not be in the same regions. You'll work through the steps below once to create each centralized bucket. You can then use that bucket as the flow log destination for AWS resources that may be in multiple regions.
To create an S3 bucket:
- Navigate to the Amazon S3 console at https://console.aws.amazon.com/s3/.
- Click the Create Bucket button, which opens the Create Bucket dialog.
- Enter a name for the new bucket in the Bucket Name field. You will need this name later when configuring one or more VPCs to send logs to this bucket (see Configure Log Publishing).
Note: For bucket naming conventions, see the AWS document Bucket Naming Rules.
- For Region, choose the region in which to locate the bucket (see Bucket Allocation), which is the region from which Kentik will access the collected VPC flow logs (see AWS Regions and Zones).
- Click the Create button.
Note: The default settings on the Configure Options, Set Permissions, and Review tabs of the dialog can be left as-is.
- Back on the Amazon S3 Console you’ll see your new bucket in your... bucket list.
AWS Regions and Zones
AWS resources are hosted in “regions,” each of which includes multiple isolated “availability zones.” “US East (Ohio)” (us-east-2) and “US East (N. Virginia)” (us-east-1), for example, are two separate regions, each with its own availability zones. The following factors may influence your choice of bucket location:
- You may be able to optimize latency, minimize costs, or address regulatory requirements by choosing an AWS Region that is geographically close to you.
- All else being equal, the best location for a bucket that will collect flow logs is likely to be the same region as the VPCs that will be publishing to it.
Additional information about regions and availability zones is available from AWS documentation:
- For additional information on choosing a region for AWS resources, please refer to Regions and Availability Zones.
- For a list of regions that are available for S3 buckets, refer to Amazon Simple Storage Service.
Configure Log Publishing
Now that we've created and configured an S3 bucket that can be accessed by Kentik, we need to configure the VPC from which we want to publish logs to the bucket. To do this we’ll use AWS’s VPC Dashboard to tell the VPC to create flow logs, and we’ll set the destination of those logs to a folder within the S3 bucket that we just created.
Note: Use a different folder (as described in step 6 below) for each of the VPCs whose logs you want to store in this bucket.
To publish logs to a destination folder using the VPC Dashboard:
- Navigate to the VPC Dashboard at https://console.aws.amazon.com/vpc/.
- In the sidebar at left, click on Your VPCs to go to the Your VPCs page.
- In the list of VPCs, find the row for the VPC from which you’d like to send flow logs to the bucket.
Note: Select only one VPC (see VPCs and Rate Limits).
- Click the button at the left of the row. A new pane, which includes a list of existing flow logs, will appear at the bottom of the page.
- From the drop-down Actions menu above the list of VPCs, choose Create flow log.
- In the resulting Create Flow Log dialog:
- Set Filter to All (recommended for best visibility).
- Set Destination to Send to an S3 bucket.
- Set S3 bucket ARN to a string built from “arn:aws:s3:::” plus the name of the bucket plus the name of the folder, e.g. arn:aws:s3:::test-logs-bucket/VPC_dest_folder.
Note: Clicking in the S3 bucket ARN field will show a list of available buckets. If you simply choose a bucket from this list, without constructing a full ARN as described above, the create flow log operation will fail.
- Click the Create button. The resulting Create flow log page will confirm creation of the log and state the ID assigned to the log by AWS. Click the Close button to go back to the Your VPCs page, where the new log will now be listed at the bottom of the page.
- To publish logs from additional VPCs to destination folders in this bucket, repeat steps 3 through 7 above.
Note: To log from an interface or subnet instead of a VPC, see Creating a Flow Log that Publishes to Amazon S3.
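The S3 bucket ARN described in the steps above is a plain string concatenation, sketched below. The helper name flow_log_destination_arn is hypothetical; it rejects an empty folder because, as noted above, a destination without a folder causes the create flow log operation to fail.

```python
def flow_log_destination_arn(bucket: str, folder: str) -> str:
    """Build the destination ARN: "arn:aws:s3:::" + bucket + "/" + folder."""
    if not bucket or "/" in bucket:
        raise ValueError("bucket must be a bare bucket name")
    # Tolerate leading or trailing slashes in the folder name.
    folder = folder.strip("/")
    if not folder:
        raise ValueError("a destination folder is required")
    return f"arn:aws:s3:::{bucket}/{folder}"
```

For example, flow_log_destination_arn("test-logs-bucket", "VPC_dest_folder") returns arn:aws:s3:::test-logs-bucket/VPC_dest_folder, the ARN used as the example in the steps above.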
VPCs and Rate Limits
When logs are pulled from an S3 bucket for ingest, Kentik treats the flow records from each individual destination log folder within your bucket as being from one "cloud device." Each such device is analogous, for the purpose of Kentik billing plans (see About Plans), to one physical device. As a result, for AWS flow logs, the per-device rate limits in your plan are applied per destination log folder within your bucket. Kentik therefore recommends using a separate destination log folder for each VPC.
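The folder-to-device mapping can be pictured with the following sketch, which groups S3 object keys by destination log folder. It illustrates the concept only and is not Kentik's ingest code: the function name and the sample keys are hypothetical, and it assumes that each destination folder contains an AWSLogs folder, as described in Check Log Contents.

```python
from collections import defaultdict


def group_by_destination_folder(keys: list[str]) -> dict[str, list[str]]:
    """Group object keys by the folder that precedes "/AWSLogs/".

    Each resulting group corresponds to one "cloud device".
    """
    devices = defaultdict(list)
    for key in keys:
        folder, separator, _rest = key.partition("/AWSLogs/")
        if separator:  # skip keys that are not flow log objects
            devices[folder].append(key)
    return dict(devices)


# Hypothetical keys for two VPCs publishing to separate folders.
sample_keys = [
    "vpc-a/AWSLogs/123456789012/vpcflowlogs/us-east-2/2018/06/01/a1.log.gz",
    "vpc-a/AWSLogs/123456789012/vpcflowlogs/us-east-2/2018/06/01/a2.log.gz",
    "vpc-b/AWSLogs/123456789012/vpcflowlogs/us-east-2/2018/06/01/b1.log.gz",
]
devices = group_by_destination_folder(sample_keys)
```

Here devices has two entries, vpc-a and vpc-b, so per-device rate limits would apply to each folder separately. If both VPCs published to a single folder, their combined flow would count against one device's limit.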
Check Log Collection
AWS flow logs are published to a directory — the destination log folder — that is automatically added into the designated S3 bucket when a flow log is created (see Configure Log Publishing). The logs are collected and published from the VPC every 5 minutes, so it may take several minutes for them to start appearing in the directory.
Check Log Creation
To check if any flow logs are actually being created and published to your S3 bucket:
- Navigate to the Amazon S3 console at https://console.aws.amazon.com/s3/.
- In the list of buckets, click on the bucket to which you’ve exported flow logs.
- On the Overview tab of the resulting bucket page you’ll see a list of top-level folders, which are the destination log folders you specified for each VPC that you set up to send logs. Click on one of these folders.
Note: Flow logs are only created when there is traffic on the VPC. If there is no destination log folder it may be because there’s no traffic in your VPC.
Check Log Contents
If a destination log folder exists, you can drill down into its contents to check whether the logs include flows from a given date or to see the contents of an individual log file:
- Inside the destination log folder you'll see a folder named AWSLogs. Click to open. The list will now show a folder within AWSLogs whose name is your AWS account number.
- Click on the folder in the list. The list will now show a folder named vpcflowlogs.
- Click on the vpcflowlogs folder. The list will now show a folder whose name corresponds to the code (e.g. us-east-2) for the region (e.g. “US East (Ohio)”) in which the VPC exists.
- Click on the folder named for the region. A set of one or more folders will appear that are each named after a year (e.g. “2018”).
- Click on a folder for a year, and continue clicking on folders to drill down through months and days until the list contains individual log files.
- Click on a file in the list to open the page corresponding to that file. The page will display information about the log file, including owner, last-modified timestamp, and size.
- To look at the file contents, click the Download button at upper left, which downloads a compressed (.gz) version of the file. Then uncompress the file and open it.
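As an alternative to uncompressing the file by hand, a downloaded log file can be read directly with a short script. The sketch below is a minimal example under one assumption: that the first line of the file is a header row of space-separated field names, with one record per line after it. The function name read_flow_log is hypothetical.

```python
import gzip


def read_flow_log(path: str) -> list[dict]:
    """Parse a downloaded .gz flow log file into one dict per record."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        # Split each non-empty line on whitespace.
        lines = [line.split() for line in handle if line.strip()]
    if not lines:
        return []
    # Assumption: the first line names the fields for all following lines.
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, row)) for row in rows]
```

Each returned record maps field names such as srcaddr or bytes to string values, which makes it easy to confirm that the expected fields are present before creating the cloud export in Kentik.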
Create a Kentik Cloud Export
So far we've established one or more S3 buckets in which flow logs can be collected, enabled Kentik to access that resource, set one or more VPCs to publish to folders within the buckets, and checked that flow is actually being published. Assuming that all has gone well, we're now done with setup in AWS. To complete the setup process we’ll move on to the Kentik portal.
The last stage of our workflow is to create a "cloud export" in Kentik that represents all of the VPCs publishing to the buckets created above, at which point a “cloud device” will be automatically created in Kentik for each individual VPC (assuming that, as recommended, a separate destination log folder has been specified for each).
Note: To create a Kentik cloud export from an AWS data source that includes logs from multiple regions, you must use Kentik portal v4.
Configure a Cloud Export
The creation of a new cloud export begins with getting to the Monitor your AWS Cloud page in the v4 portal:
- In the main navbar menu, click Settings at far left.
- At the top of the resulting Settings page, click on Public Clouds in the card at top right.
- On the resulting Public Clouds page, click the Add AWS Cloud button at top, which takes you to the Monitor your AWS Cloud page.
- To complete the settings for the cloud export, follow the instructions in AWS Manual Setup.
At this point we’ve completed the setup process. On the Settings » Public Clouds page, you should now be able to see the changes to the Cloud Exports list that are described in Cloud Exports in the Portal. As time passes and flow records from the VPC are ingested into Kentik you’ll be able to use the names of your cloud devices as group-by and/or filter values for the Device Name dimension in Kentik queries.