Kentik for AWS
Bringing VPC Flow Logs Into Kentik
As cloud-only and hybrid cloud services become increasingly commonplace, network operators of all kinds need a unified environment within which to view and analyze the data generated by their network activities. Kentik® collects, derives, and correlates this network traffic data — flow records, BGP, GeoIP, SNMP, etc. — to enable visualization, monitoring, alerting, and analytics. The data may be collected not only from routers (including related hardware like switches) and hosts (via a software host agent) but also from flow logs generated by cloud service providers such as Amazon Web Services.
In this article we'll look at how to get log data from AWS to Kentik. The following topics will guide you through the setup process:
Note: For help with any aspect of the setup workflow outlined below, please contact Kentik Customer Success (firstname.lastname@example.org).
About AWS VPC Flow Logs
The basics of AWS flow logs are covered in the following topics:
AWS Flow Log Overview
In the world of Amazon Web Services, VPC Flow Logs are analogous to the flow records (e.g. NetFlow, sFlow, etc.; see About Flow) generated by devices on physical networks. A flow log consists of a set of records about the flows that either originated or ended in a given Virtual Private Cloud, with each individual record made up of a set of fields giving information about a single flow.
Amazon allows you to set up a VPC Flow Log for a VPC, a subnet, or a network interface, and to publish that flow log to a destination folder within a “bucket” in Amazon Simple Storage Service (S3), which is the location from which a Kentik service pulls the logs for ingest into Kentik.
AWS Flow Log Formats
AWS now supports the export of flow logs in two distinct formats:
- Default format (AWS standard format version 2): Each line of the log is a space-separated string with fields in the following order:
<version> <account-id> <interface-id> <srcaddr> <dstaddr> <srcport> <dstport> <protocol> <packets> <bytes> <start> <end> <action> <log-status>
- Custom format (AWS log format version 3 through 5): Each line is made up of one or more fields in a custom-specified order. The available fields, and the VPC flow logs version in which each field was introduced, are listed in the AWS documentation topic Available Fields.
Custom AWS flow logs must meet the following requirements for ingest into Kentik:
- The following fields are required: <srcaddr>, <dstaddr>, <srcport>, <dstport>, <packets>, <bytes>, <protocol>, <version>, and <start>.
- In addition to the required fields, a custom log format must include at least six other AWS fields.
- V5 fields may be included in logs sent to Kentik but are not currently ingested or stored by Kentik.
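As an illustration of the default format above, the following is a minimal Python sketch that splits one space-separated record into named fields. The sample record values are hypothetical, not taken from a real log:

```python
# Minimal sketch: parse one line of a default-format (version 2) VPC flow
# log into a dict. Field names and order follow the AWS default format
# shown above; the sample line below is fabricated for illustration.

FIELDS = [
    "version", "account-id", "interface-id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log-status",
]

def parse_flow_record(line: str) -> dict:
    """Split a space-separated default-format record into named fields."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))

record = parse_flow_record(
    "2 123456789010 eni-1235b8ca 172.31.16.139 172.31.16.21 "
    "20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK"
)
```

A custom-format parser would work the same way, but with the `FIELDS` list replaced by the field order specified when the flow log was created.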
AWS Flow Log Deletion
Flow log deletion minimizes the costs associated with log data retention in the cloud. Kentik is designed to support the following approaches to flow log deletion:
- Deletion by Kentik: Kentik's built-in log ingest process results in log files being deleted within 15 minutes of being posted to an AWS S3 bucket. To utilize this approach, you must give Kentik full access to the bucket (AmazonS3FullAccess) when setting permissions for the AWS role associated with the bucket. This setting is made in the Filter policies field on the Create Role page; see Create an AWS Role.
- Deletion by customer: If your role permissions for a given log bucket are set to AmazonS3ReadOnlyAccess then Kentik will not be able to delete log files automatically. AWS provides a number of options for deleting the contents, including log files, from a bucket; see the AWS documentation at Empty a bucket.
AWS Flow Log Resources
For detailed information about VPC Flow Logs, please refer to AWS documentation:
- For information about the structure of each flow record, including its various fields, see the AWS topic Flow Log Records.
- For information about the limitations of AWS flow logging, see the AWS topic Flow Log Limitations.
- For general information from Amazon about AWS VPCs, see Amazon VPC FAQs or VPC Quick Start.
- For general information from Amazon about S3 buckets, see Creating and Configuring an S3 Bucket.
AWS Logging Setup Overview
Kentik accesses your flow logs by pulling them from a bucket in Amazon Simple Storage Service (S3). Assuming that you already have a VPC in AWS, the following setup workflow (detailed in the topics below) will enable Kentik to access the logs for ingest into Kentik:
- In AWS’s Identity and Access Management (IAM) console, create a new AWS role and configure it with permissions that enable AWS services associated with Kentik’s account to access resources associated with your account.
- In AWS’s S3 console, create a bucket to which logs can be published.
Note: You may create a separate bucket for each region from which you will collect VPC flow logs or a combined bucket for all regions (see Bucket Allocation).
- In AWS’s VPC Dashboard, configure each VPC (or subnet or interface) to publish logs to a destination folder in the bucket. Kentik recommends using a different destination folder for each VPC.
- Back in the S3 console, confirm that logs are being published to the destination folders.
- In the Kentik portal, create a new Cloud (see About Clouds) pointing to the bucket, which results in a Kentik “cloud device” being automatically created for each destination log folder.
Note: As noted earlier, AWS allows you to set up a VPC Flow Log for a VPC, a subnet, or a network interface. Both the list above and the topics below describe these tasks using a VPC as the example. Individual steps in these tasks may vary slightly if you are instead enabling logging from a subnet or interface. For details, please refer to the AWS documentation topic Creating a Flow Log that Publishes to Amazon S3.
Clouds in the Portal
Successful completion of the tasks listed in the overview above will have the following effect in the Kentik portal:
- A new Cloud will be shown as an added row in the Kentik portal’s Clouds list (Admin » Clouds). The Cloud will represent the collection of VPCs (or subnets or interfaces) whose logs are pulled from the bucket specified in the Add Cloud dialog.
- The Device Groups column in the Clouds list will show one or more cloud devices:
- Each of these cloud devices will represent all VPCs, subnets, and/or interfaces that publish logs to one destination log folder in the bucket.
- Each device will be named after its corresponding log folder.
- Each flow record ingested into KDE from a given cloud device will include the device’s name in the virtual column i_device_name, enabling you to group-by and filter on the device using the Device Name dimension.
AWS Logging Setup Tasks
The tasks required to set up the publishing of VPC Flow Logs to an S3 bucket are covered in the following topics:
Create an AWS Role
One part of enabling Kentik to export your VPC flow logs is to give permission to our services to access the needed resources in your account. You’ll do this by creating a new “role” in AWS for each region from which you wish to export the logs. You'll then assign to each of those roles a set of permissions that grant access by Kentik to the EC2 APIs corresponding to the VPC instances from which the logs will be exported.
Note: AWS recommends creating a new role for logging rather than using an existing role.
To create a new AWS role:
- Log into the console of your AWS account and go to the Identity and Access Management (IAM) page at https://console.aws.amazon.com/iam/home.
- In the sidebar at left, click Roles.
- On the Roles page, click the Create Role button.
- On the first tab of the Create Role page:
- Select “Another AWS Account” as the type of trusted entity.
- Enter “834693425129” as the Account ID.
- Click the Next: Permissions button at bottom right.
- On the resulting page, click the Create policy button, which will open a browser tab with a new page called "Create policy." Click the JSON tab on this page.
- In the resulting JSON editor, overwrite the existing JSON by pasting in the JSON shown in AWS Role JSON.
- Click the Next: Tags button. In the subsequent step, you may choose to add tags to describe this policy.
- Supply a name and description for the newly created policy, for example:
- Name: Kentik-Metadata-Policy
- Description: Policy allowing Kentik Technologies permissions to read and list all resources in the CloudWatch, Direct Connect, EC2, ELB and Network Firewall services.
- Click the Create policy button.
- Switch back to the browser tab you were on when you clicked the Create policy button in step 5, then click the refresh button to update the list of policies.
- Use the Filter policies field to find the policy that was just created, then check the checkbox to attach it to the new role.
- Clear the Filter policies field, then use it to find one of the two permission policies listed below. Check the checkbox to attach it to the new role.
- AmazonS3FullAccess if you want Kentik to delete the log files.
- AmazonS3ReadOnlyAccess if you want your own organization to manage the deletion of log files.
Note: Undeleted log files may lead to additional data storage charges from AWS.
- When done, click the Next: Review button at bottom right. On the third tab of the Create Role page:
- In the Role name field, enter a name for the new role.
- In the Role description field, enter a brief description for the new role.
- Next, click the Create Role button at bottom right. You'll be taken back to the main Roles page. Your new role should appear at the bottom of the list of roles (if the list is long you can filter for the new role by entering its name in the filter field).
AWS Role JSON
The following JSON is used on the JSON tab of the Create Policy page:
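A minimal sketch of such a policy, assuming read-and-list (Describe*/List*/Get*) permissions on the five services named in the suggested policy description above (CloudWatch, Direct Connect, EC2, ELB, and Network Firewall); the exact actions in Kentik's recommended JSON may differ:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "cloudwatch:Get*",
        "cloudwatch:List*",
        "directconnect:Describe*",
        "ec2:Describe*",
        "elasticloadbalancing:Describe*",
        "network-firewall:Describe*",
        "network-firewall:List*"
      ],
      "Resource": "*"
    }
  ]
}
```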
Configure the AWS Role
Once you've created a new role, you'll need to configure the "trust relationship" that allows Kentik services to access resources owned by your account:
- On the main Roles page in the IAM section of the AWS console, click the new role in the list.
- On the resulting Summary page for the role, click the Trust Relationships tab, then click the Edit Trust Relationship button to open the Edit Trust Relationship page.
- Paste the Trust Relationships JSON (below) into the Policy Document field, then click the Update Trust Policy button.
- Back on the role's Summary page, click the Copy to Clipboard icon at the right of the Role ARN field (first line of summary). Save the copied role ARN (Amazon resource name), which you'll need when you finish the log import workflow in the Kentik portal.
We’ve now created a new role, e.g. “Flow_Logs_Test,” and established a trust relationship between that role and the role “eks-ingest-node” from the AWS account 834693425129 (Kentik). This relationship gives the Kentik role permission to use a specified set of AWS services (AmazonEC2ReadOnlyAccess, plus either AmazonS3FullAccess or AmazonS3ReadOnlyAccess) on some AWS resources (i.e. an S3 bucket) that we will create and assign to the new role.
Trust Relationships JSON
The following JSON defines the trust relationship for the AWS role that allows Kentik to export flow logs:
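A minimal sketch of such a trust policy, assuming the standard cross-account sts:AssumeRole form and using the Kentik account (834693425129) and role (eks-ingest-node) named in Configure the AWS Role:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::834693425129:role/eks-ingest-node"
      },
      "Action": "sts:AssumeRole",
      "Condition": {}
    }
  ]
}
```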
Create an S3 Bucket
After creating and configuring a new role you’ll need to establish a container into which your flow logs can be collected and accessed by the Kentik role eks-ingest-node. To do this we’ll create a “bucket” in Amazon Simple Storage Service, commonly referred to as “S3.”
VPC flow logs may be exported to Kentik using either of the following approaches to allocating buckets:
- Combined bucket: Send the logs generated in all enabled regions to a single S3 bucket. You'll work through the steps below once to create the combined bucket, then use that bucket as the flow log destination for your VPCs in all regions.
- Regional buckets: Send the logs to one or more S3 buckets per region in which logging is enabled for EC2 resources. You'll work through the steps below once for each regional bucket, and set each bucket as the flow log destination for the VPCs in the corresponding region.
To create an S3 bucket:
- Navigate to the Amazon S3 console at https://console.aws.amazon.com/s3/.
- Click the Create Bucket button, which opens the Create Bucket dialog.
- Enter a name for the new bucket in the Bucket Name field. You will need this name later when configuring one or more VPCs to send logs to this bucket (see Configure Log Publishing).
Note: For bucket naming conventions, see the AWS document Bucket Naming Rules.
- For Region, choose the region in which to locate the bucket (see Bucket Allocation), which is the region from which Kentik will access the collected VPC flow logs (see AWS Regions and Zones).
- Click the Create button.
Note: The default settings on the Configure Options, Set Permissions, and Review tabs of the dialog can be left as-is.
- Back on the Amazon S3 Console you’ll see your new bucket in your... bucket list.
AWS Regions and Zones
In AWS, resources are hosted in geographic Regions, each of which contains multiple isolated locations called “availability zones.” Note that “US East (Ohio)” and “US East (N. Virginia),” for example, are two distinct Regions (us-east-2 and us-east-1), not two zones within a single region. The following factors may influence your choice of bucket location:
- You may be able to optimize latency, minimize costs, or address regulatory requirements by choosing an AWS Region that is geographically close to you.
- All else being equal, the best location for a bucket that will collect flow logs is likely to be the same Region as the VPCs that will be publishing to it.
Additional information about regions and availability zones is available from AWS documentation:
Configure Log Publishing
Now that we've created and configured an S3 bucket that can be accessed by Kentik, we need to configure the VPC from which we want to publish logs to the bucket. To do this we’ll use AWS’s VPC Dashboard to tell the VPC to create flow logs, and we’ll set the destination of those logs to a folder within the S3 bucket that we just created.
Note: Use a different folder (as described in step 6 below) for each of the VPCs whose logs you want to store in this bucket.
To publish logs to a destination folder using the VPC Dashboard:
- Navigate to the VPC Dashboard at https://console.aws.amazon.com/vpc/.
- In the sidebar at left, click on Your VPCs to go to the Your VPCs page.
- In the list of VPCs, find the row for the VPC from which you’d like to send flow logs to the bucket.
Note: Select only one VPC (see VPCs and Rate Limits).
- Click the button at the left of the row. A new pane, which includes a list of existing flow logs, will appear at the bottom of the page.
- From the drop-down Actions menu above the list of VPCs, choose Create flow log.
- In the resulting Create Flow Log dialog:
- Set Filter to All (recommended for best visibility).
- Set Destination to Send to an S3 bucket.
- Set S3 bucket ARN to a string built from “arn:aws:s3:::” plus the name of the bucket plus the name of the folder, e.g. arn:aws:s3:::test-logs-bucket/VPC_dest_folder.
Note: Clicking in the S3 bucket ARN field will show a list of available buckets. If you simply choose a bucket from this list, without constructing a full ARN as described above, the create flow log operation will fail.
- Click the Create button. The resulting Create flow log page will confirm creation of the log and state the ID assigned to the log by AWS. Click the Close button to go back to the Your VPCs page, where the new log will now be listed at the bottom of the page.
- To publish logs from additional VPCs to destination folders in this bucket, repeat steps 3 through 7 above.
Note: To log from an interface or subnet instead of a VPC, see Creating a Flow Log that Publishes to Amazon S3.
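The ARN construction described in step 6 is simple concatenation; a minimal sketch, using the hypothetical bucket and folder names from the example above:

```python
# Sketch of the ARN construction described in step 6 of Configure Log
# Publishing. Bucket and folder names are the hypothetical examples
# used in this article.

def s3_destination_arn(bucket: str, folder: str) -> str:
    """Build the S3 bucket ARN string expected in the Create Flow Log dialog."""
    return f"arn:aws:s3:::{bucket}/{folder}"

arn = s3_destination_arn("test-logs-bucket", "VPC_dest_folder")
```

Note that, per the warning above, the folder component is required: an ARN with only the bucket name will cause the create flow log operation to fail.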
VPCs and Rate Limits
When logs are pulled from an S3 bucket for ingest, Kentik treats the flow records from each individual destination log folder within your bucket as being from one "cloud device." Each such device is analogous, for the purpose of Kentik billing plans (see About Plans), to one physical device. As a result, for AWS flow logs, the per-device rate limits in your plan are applied per destination log folder within your bucket. Kentik therefore recommends using a separate destination log folder for each VPC.
Check Log Collection
AWS flow logs are published to a directory — the destination log folder — that is automatically added into the designated S3 bucket when a flow log is created (see Configure Log Publishing). The logs are collected and published from the VPC every 5 minutes, so it may take several minutes for them to start appearing in the directory.
Check Log Creation
To check if any flow logs are actually being created and published to your S3 bucket:
- Navigate to the Amazon S3 console at https://console.aws.amazon.com/s3/.
- In the list of buckets, click on the bucket to which you’ve exported flow logs.
- On the Overview tab of the resulting bucket page you’ll see a list of top-level folders, which are the destination log folders you specified for each VPC that you set up to send logs. Click on one of these folders.
Note: Flow logs are only created when there is traffic on the VPC. If there is no destination log folder it may be because there’s no traffic in your VPC.
Check Log Contents
If a destination log folder exists, you can drill down into its contents to check whether the logs include flows from a given date or to see the contents of an individual log file:
- Inside the destination log folder you'll see a folder named AWSLogs. Click to open. The list will now show a folder within AWSLogs whose name is your AWS account number.
- Click on the folder in the list. The list will now show a folder named vpcflowlogs.
- Click on the vpcflowlogs folder. The list will now show a folder whose name corresponds to the code (e.g. us-east-2) for the Region (e.g. “US East (Ohio)”) in which the VPC exists.
- Click on the folder named for the Region. A set of one or more folders will appear that are each named after a year (e.g. “2018”).
- Click on a folder for a year, and continue clicking on folders to drill down through months and days until the list contains individual log files.
- Click on a file in the list to open the page corresponding to that file. The page will display information about the log file, including owner, last-modified timestamp, and size.
- To look at the file contents, click the Download button at upper left, which downloads a compressed (.gz) version of the file. Then uncompress the file and open it.
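The folder hierarchy traversed in the steps above follows a predictable S3 key structure, and the downloaded files are gzip-compressed text. A sketch of both, where the account number, Region, and date are hypothetical placeholders:

```python
import gzip

# Sketch of the S3 key hierarchy described above:
#   <folder>/AWSLogs/<account-id>/vpcflowlogs/<region>/<year>/<month>/<day>/
# The account number, Region, and date below are hypothetical placeholders.

def log_key_prefix(folder: str, account_id: str, region: str,
                   year: int, month: int, day: int) -> str:
    """Build the key prefix under which AWS stores a day's flow log files."""
    return (f"{folder}/AWSLogs/{account_id}/vpcflowlogs/"
            f"{region}/{year}/{month:02d}/{day:02d}/")

prefix = log_key_prefix("VPC_dest_folder", "123456789010", "us-east-2",
                        2018, 7, 4)

def read_log_lines(path: str) -> list:
    """Decompress a downloaded .gz log file and return its lines."""
    with gzip.open(path, "rt") as f:
        return f.read().splitlines()
```

Each line returned by a helper like `read_log_lines` is one flow record in the format configured for the flow log.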
Create a Cloud in Kentik
So far we've established one or more S3 buckets in which flow logs can be collected, enabled Kentik to access that resource, set one or more VPCs to publish to folders within the buckets, and checked that flow is actually being published. Assuming that all has gone well, we're now done with setup in AWS. To complete the setup process we’ll move on to the Kentik portal.
The last stage of our workflow is to create a Cloud in Kentik that represents all of the VPCs publishing to the buckets created above, at which point a “cloud device” will be automatically created in Kentik for each individual VPC (assuming that, as recommended, a separate destination log folder has been specified for each).
Note: To create a Kentik cloud from an AWS data source that includes logs from multiple regions, you must use Kentik portal v4.
Open the Add AWS Cloud Page
The first stage in creating a new cloud is getting to the Add AWS Cloud page in the v4 portal:
- In the main navbar menu, click Settings at far left.
- At the top of the resulting Settings page, click on Public Clouds in the card at top right.
- On the resulting Public Clouds page, click the Add AWS Cloud button at top, which takes you to the Monitor your AWS Cloud page.
Complete AWS Cloud Settings
On the Monitor your AWS Cloud page:
- Select the Manual Configuration tab.
- Enter the complete ARN of the IAM role created to grant Kentik’s AWS services access to the bucket (see Create an AWS Role). An IAM role ARN is structured as arn:aws:iam:: plus your AWS account number plus :role/ plus the name you gave to your role when you created it.
- Click the Verify Role button to confirm that Kentik can access the role.
- From the Select Your Region drop-down, choose the AWS region where the VPC instances that you wish to represent with this cloud reside.
- Click the Verify Region button to confirm that Kentik can access the region.
- In the S3 Bucket Name field, enter the name of the bucket you created in Create an S3 Bucket, e.g. test-logs-bucket.
Note: If you're sending the flow logs for this Cloud to a combined bucket (see Bucket Allocation), and you've already entered the name of that bucket when setting up a different Cloud, then specify the S3 Bucket Name field as “KENTIK_NO_FLOW.”
- Click the Verify Bucket button to confirm that Kentik can find the bucket from which to pull the logs for this cloud.
- If you want your Kentik cloud to represent logs from multiple buckets that are not in the same region, turn on the Collect logs from alternative S3 bucket(s) switch. The S3 Bucket Name field will be inactivated.
- Use the Delete After Read switch to determine whether you'd like Kentik to delete the logs after they've been ingested into Kentik or if you prefer to manage log deletion on your own.
- Click the Save button to save the new Cloud and return to the Clouds Page.
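For reference, the IAM role ARN entered in step 2 above can be assembled from its parts. In this sketch the account number is a hypothetical placeholder, and “Flow_Logs_Test” is the example role name used earlier in this article:

```python
# Sketch of the IAM role ARN structure described in step 2 of Complete
# AWS Cloud Settings. The account number is a hypothetical placeholder;
# "Flow_Logs_Test" is the example role name used in this article.

def iam_role_arn(account_id: str, role_name: str) -> str:
    """Build the ARN for the IAM role entered on the Manual Configuration tab."""
    return f"arn:aws:iam::{account_id}:role/{role_name}"

role_arn = iam_role_arn("123456789012", "Flow_Logs_Test")
```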
At this point we’ve completed the setup process. On the Admin » Clouds page, you should now be able to see the Clouds list changes described in Clouds in the Portal. As time passes and flow records from the VPC are ingested into Kentik you’ll be able to use the names of your cloud devices as group-by and/or filter values for the Device Name dimension in Kentik queries.