   

 

Alert Dashboards

Kentik Detect’s Alerting section includes two pages, Active and History, that are used to view the alarms and mitigations generated by alert policies. These pages, along with alert states and alert debugging, are covered in the topics below.

Notes:
- Alert dashboards are specific to the alerting system and are distinct from the dashboards in the Dashboards section of the Kentik Detect portal.
- For general information on policy alerting, see Policy Alert Overview.
- For information on settings for alert policies, see Alert Policies.
- For information on alert-related notifications, see Alert Notifications.
- For information on mitigation for alerts, see Alert Mitigation.

 

 

Active Alerts

The Active Alerts page of the portal’s Alerting section is covered in the topics below.

 

 

About Active Alerts

The primary function of the Active Alerts page is to provide a list of active alerts and mitigations as well as those that are waiting for acknowledgement. The list displays important information about each alert or mitigation, including the alert policy that triggered it (see About Alerts and Policies), its current state (see Alert States), and the key whose traffic matched the conditions specified in one of the policy’s thresholds. The content on this page, which also contains additional indicators and information (see Active Alerts Page UI), is refreshed at a user-selectable interval and displays up to 500 items.

 

 

Active Alerts Page UI

The Active page is made up of the following UI elements:

  • Alerting Summary: A display across the top showing the total number of mitigations, alarms, and required acknowledgements reported during the last seven days. The background color of each tile varies depending on the severity of the current state. See Scoreboard Summary.
  • Alerting Scoreboard: See Scoreboard Matrix.
  • Filter field: Filters the Active Alerts List to show only rows containing the entered text in one of the following fields: policy, key, value, ID, start/end time.
  • Auto-refresh selector: Choose the interval at which the page will be refreshed.
  • Active Alerts List: A table listing currently active alerts (see Active Alerts List).
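
The behavior of the Filter field described above can be sketched as a simple substring match across the listed fields. This is only an illustration: the row layout, the field names in `FILTER_FIELDS`, and the case-insensitive comparison are assumptions, not the portal's actual implementation.

```python
from typing import Dict, List

# Hypothetical field names standing in for the columns the Filter field
# searches: policy, key, value, ID, start/end time.
FILTER_FIELDS = ("policy", "key", "value", "id", "start", "end")


def filter_rows(rows: List[Dict[str, str]], text: str) -> List[Dict[str, str]]:
    """Keep only rows that contain the entered text in one of FILTER_FIELDS."""
    needle = text.strip().lower()  # assumption: matching ignores case
    if not needle:
        return list(rows)  # an empty filter shows every row
    return [
        row
        for row in rows
        if any(needle in str(row.get(field, "")).lower() for field in FILTER_FIELDS)
    ]
```

Calling `filter_rows(rows, "ddos")` on a list of row dictionaries would keep only the rows in which one of the searched fields contains "ddos".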

 

 

Alerting Scoreboard

The alerting scoreboard at the top of the Active page is a high-level overview of current alert activity, enabling users to see at a glance the items most likely to need attention.

 

Scoreboard Summary

The top part of the scoreboard is a set of summary tiles, one for each of three types of events:

  • Mitigations: Shows a count of how many alerts are currently being mitigated, either automatically or manually. A button (+ sign) for manual mitigation is also included. The background color of the tile varies depending on the count:
    - Grey: No mitigations currently in progress.
    - Purple: 1 or more mitigations currently in progress.
  • Alarms: Shows a count of alerts that are in ALARM state, meaning that the conditions defined in the alert policy have been met and notifications have been triggered. A count of the alarms at each severity is also included. The background color of the tile varies depending on the severity (minor, major, critical) of the most severe alarm:
    - Grey: No alarms currently active.
    - Dark Red: The highest severity level is Critical.
    - Red: The highest severity level is Major.
    - Orange: The highest severity level is Minor.
  • Acknowledgements: Shows a count of alerts that are in ACK_REQ state, meaning that the conditions that resulted in an alarm are no longer present, but an acknowledgement is required from a user in your organization before the alert is removed from the active list. The background color of the tile varies depending on the count:
    - Grey: No acknowledgements pending.
    - Blue: 1 or more acknowledgements pending.
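
The tile color rules listed above can be summarized in a short sketch. The function names and the dictionary-based input are hypothetical and chosen only for illustration; the colors and their conditions come from the list above.

```python
from typing import Dict

# Severities ordered from most to least severe, with the tile color for each.
SEVERITY_COLORS = [("critical", "dark red"), ("major", "red"), ("minor", "orange")]


def mitigations_tile_color(count: int) -> str:
    """Purple when one or more mitigations are in progress, otherwise grey."""
    return "purple" if count >= 1 else "grey"


def acknowledgements_tile_color(count: int) -> str:
    """Blue when one or more acknowledgements are pending, otherwise grey."""
    return "blue" if count >= 1 else "grey"


def alarms_tile_color(counts_by_severity: Dict[str, int]) -> str:
    """Color of the Alarms tile, driven by the most severe active alarm."""
    for severity, color in SEVERITY_COLORS:
        if counts_by_severity.get(severity, 0) > 0:
            return color
    return "grey"  # no alarms currently active
```

For example, `alarms_tile_color({"minor": 3, "major": 1})` yields `"red"` because Major is the highest severity present.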

 

Scoreboard Matrix

Below the summary tiles is a matrix whose rows represent either mitigations or alert policies that are in alarm. The columns represent the top values of a dimension chosen when the matrix was configured (click the gear button to edit the configuration). The matrix lets you quickly see what’s going on with the policies that are most in need of attention.

 

 

Scoreboard Configuration

The alerting scoreboard is configured in the Configure Scoreboard dialog. To open the dialog, click the Configure Scoreboard button, which is labeled with a gear icon and appears in the heading row of the Scoreboard Matrix.

More information coming soon.

 

 

Active Alerts List

The Active Alerts List is a table of up to 500 rows in which each row is one of the following:

  • An alert that is currently active (in ALARM state).
  • An alert that is waiting for acknowledgement (in ACK_REQ state) as specified with the Acknowledge Required setting in a threshold; see General Threshold Settings.
  • A mitigation that is currently active or waiting for acknowledgement (see Threshold Mitigations).

The Active Alerts List provides the following information and actions for the rows in the list:

  • Select checkbox: A checkbox that includes the row in a set of rows that will be acted on by the Clear button. To select all rows at once, click the selection box in the column header.
  • Clear button: Appears to the right of the Filter field when one or more select checkboxes are checked. The clear action applied to each selected row depends on the type of that row (alarm or mitigation):
    - Clear alarm: Takes the alert that generated the alarm out of alarm state.
    Note: If the conditions that caused the alarm to trigger are still occurring at the next refresh (the timing of which depends on the polling frequency of the alert policy), then a new alarm for the same threshold will appear on the Active Alerts List.
    - Clear mitigation: Stops the mitigation (equivalent to clicking the Stop icon in the actions at the right of the row).
  • State: Indicates the current state of the alert (see Alert States).
  • Policy: Indicates the policy name and criticality (Critical, Major2, Major, Minor2, Minor) as defined in an alert threshold. Clicking the name drops down a menu with the following items:
    - View Alarms: Opens the History page, filtered by the policy (see Alert History Filter).
    - Edit Policy: Opens the Edit Alert Policy dialog (see Alert Policy Dialogs).
  • Key/Dimension: The dimensions of the key definition, and their values for the keys that caused the alert to enter alarm state (see About Keys). The key can be placed in Silent Mode by clicking the Plus (+) icon in the row’s column.
  • Value: For alarms and matches (not mitigations), this cell contains the following:
    - Value: A line giving the sum total value returned by the key as defined by the alert policy’s query. The top-X ranking of traffic is performed by evaluating the volume, as measured in the primary metric, of the traffic (across the selected devices and filtered by the specified filters) represented by the key.
    - Baseline: A line giving the baseline value from which the alarm threshold has deviated. The baseline can be either static or calculated as defined by the alert policy.
    Note: Hover over the baseline information to open a tool tip containing a baseline code (see Baseline Codes).
  • Mit ID/Alarm ID: The system-generated unique ID assigned to the alarm or mitigation when it was triggered. The ID can be clicked to display the item on the History page along with any related alarms and mitigations.
  • Start/End: The time (UTC) of the following:
    - The start time of the event that triggered the alarm state or mitigation.
    - If the event is waiting for an acknowledgement, the end time of the event that triggered the alarm state. Otherwise the alarm or mitigation is indicated as “Currently Active.”
  • Actions: See Active Alerts Actions.

 

 

Active Alerts Actions

The actions that can be taken on an active alert or mitigation in the Active Alerts List are applied with the action icons shown at the right of each row. Available actions depend on whether the row represents an alarm or a mitigation.

The following actions are available for alerts:

  • Open in Explorer: Opens Data Explorer in a new browser window or tab, with the sidebar set to correspond to the values of the alarm’s key. For example, if the key’s dimension is Destination:IP (IP_dst) and the value of the key in the alarm is 60.54.101.8, then there will be a filter in the Data Explorer sidebar for inet_dst_addr ILIKE 60.54.101.8.
  • Open in Dashboard: Opens, in a new browser window or tab, the dashboard associated with the policy of the alert that generated the alarm (see Policy Dashboard in General Policy Settings), with the dashboard set to correspond to the values of the alarm’s key. For example, if the key’s dimension is Destination:IP (IP_dst) and the value of the key in the alarm is 60.54.101.8, then there will be a filter in the dashboard for inet_dst_addr ILIKE 60.54.101.8.
  • Debug Alarm: Displays the alarm in a modal Debug window (see Alert Debug).

The following actions are available for mitigations:

  • Mitigation History: Opens the History page and displays the events (state changes; see Alert States) associated with the mitigation.
  • Stop Mitigation: Manually stops the mitigation.
  • Start Mitigation: Manually starts or restarts the mitigation.

 

 

Alert History

The History page of the portal’s Alerting section is covered in the topics below.

 

 

About Alert History

The primary function of the History page is to display the History List, a filterable table listing alarms, mitigations, and matches (up to 1000) for a specified time range (default is last 24 hours). The list displays important information about each alarm, mitigation, and match, including the alert policy that triggered it (see About Alerts and Policies), its current state (see Alert States), and the key whose traffic matched the conditions specified in one of the policy’s thresholds. The page also contains additional indicators, information, and filter controls (see History Page UI).

Note: Unlike the Active page, the History page is not refreshed automatically. Instead the page’s contents are updated in response to the following actions:
- Apply filtering in the Alert History Filter section.
- Click the Reset button at top right; if any filters have been set they will be removed and time range will be reset to default (last 24 hours).

 

 

History Page UI

The History page includes the following UI elements:

  • Alert History Filter: Filters the History List by time range, alarm properties, and row type (alarm, mitigation, match). See Alert History Filter.
  • Alert Activity graph: A plot of alarms along a timeline covering the specified time range in either one-hour increments (for time ranges of one week or less) or one-day increments (for time ranges longer than one week). At each increment:
    - The height of the red line shows the total number of alarms that are active at that time.
    - The height of the blue bar, if any, represents the number of new alarms that occurred during that time increment. Hovering the cursor over a blue bar opens a tool tip that shows the number of new and total alarms for that time increment.
  • History List: A table listing the alarms, mitigations, and matches that occurred during the specified time range (see History List).
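
The Alert Activity graph described above can be approximated with the following sketch, which picks the increment from the length of the time range and then counts new and active alarms per increment. The `(start, end)` tuple layout is hypothetical, and treating an alarm as "active" when it overlaps any part of an increment is an assumption; the portal may sample activity differently.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

# Hypothetical alarm record: (start, end); end is None while still active.
Alarm = Tuple[datetime, Optional[datetime]]


def bucket_size(range_start: datetime, range_end: datetime) -> timedelta:
    """One-hour increments for ranges of one week or less, otherwise one day."""
    if range_end - range_start <= timedelta(weeks=1):
        return timedelta(hours=1)
    return timedelta(days=1)


def activity_series(
    alarms: List[Alarm], range_start: datetime, range_end: datetime
) -> List[Tuple[datetime, int, int]]:
    """Return (increment start, new alarms, total active alarms) per increment."""
    step = bucket_size(range_start, range_end)
    series = []
    bucket_start = range_start
    while bucket_start < range_end:
        bucket_end = min(bucket_start + step, range_end)
        # Blue bar: alarms that started during this increment.
        new = sum(1 for start, _ in alarms if bucket_start <= start < bucket_end)
        # Red line: alarms active at some point during this increment.
        total = sum(
            1
            for start, end in alarms
            if start < bucket_end and (end is None or end > bucket_start)
        )
        series.append((bucket_start, new, total))
        bucket_start = bucket_end
    return series
```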

Note: When the browser window is sized to less than 1200 pixels wide, the UI elements listed above (except for the History List) will occupy the full width of the window.

 

 

Alert History Filter

The Alert History Filter allows you to filter the History List based on time range, alarm properties, and row type (alarm, mitigation, or match).

 

General Filter Controls

The filter section includes the following general controls at the right:

  • Apply button: Applies the current settings of the Alert History Filter.
  • Reset button: Restores the following defaults for all filter settings:
    - Time range: last 24 hours.
    - Filter By: none.
    - Types: Mitigations and Alarms.

 

Filter Time Range

The following controls filter the list by time range:

  • From: Two fields used to define the start of the time range:
    - Date field: Pops up a calendar.
    - Time field: Drops down a time list.
  • To: Two fields used to define the end of the time range (see From fields above).
  • Current time button: Click the circular arrow icon to set the end time to the current time.

 

Filter by Properties

The following controls filter the History page by properties of the underlying alerts and alarms:

  • Filter By: The alert property (see options listed below) that will be filtered for the value in the Filter Value field.
  • Filter Value: The string that the Filter By property will be filtered for.

The following Filter By options are supported:

  • Alert Policy: Filters for rows having the alert policy name chosen from the drop-down Filter Value list.
    Note: You can also filter by alert policy with one of the following actions:
    - Click a policy in the Policy/State column of the Active Alerts List (Active page) or History List (History page).
    - Click a policy in the Top Policies table.
  • Key (Exact): Filters for rows whose key is identical to the entered string.
  • Key (Partial): Filters for rows whose key contains the entered string.
  • Dimension:Key: Filters for rows whose dimension:key is identical to the entered string.
    Note: You can also filter for a key with one of the following actions:
    - Click a key in the Key/Dimension column of the Active Alerts List (Active page) or History List (History page).
    - Click a key in the Top Keys table.
  • Alarm ID: Filters for rows whose alarm ID is identical to the entered string.
    Note: You can also filter for an alarm ID by clicking it in the Mit ID/Alarm ID column of the Active Alerts List (Active page) or History List (History page).
  • Mitigation ID: Filters for alarms and mitigations whose mitigation ID is identical to the entered string.
    Note: You can also filter for a mitigation ID by clicking it in the Mit ID/Alarm ID column of the Active Alerts List (Active page) or History List (History page).
  • Old State (Partial): Filters for rows whose old state contains the entered string.
    Note: You can also filter for an old state by clicking it in the Policy/State column (left) on the History page.
  • New State (Partial): Filters for rows whose new state contains the entered string.
    Note: You can also filter for a new state by clicking it in the Policy/State column (right) on the History page.
  • Any State (Partial): Filters for alarms and mitigations whose old or new state contains the entered string.
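
The difference between the exact and partial Filter By options above can be illustrated with a small sketch. The row dictionaries and their field names (`policy`, `key`, `old_state`, and so on) are hypothetical, and case-sensitive comparison is an assumption.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]
Predicate = Callable[[Row, str], bool]


def exact(field: str) -> Predicate:
    """Match rows whose field is identical to the entered string."""
    return lambda row, value: row.get(field, "") == value


def partial(field: str) -> Predicate:
    """Match rows whose field contains the entered string."""
    return lambda row, value: value in row.get(field, "")


def any_state(row: Row, value: str) -> bool:
    """Match rows whose old or new state contains the entered string."""
    return value in row.get("old_state", "") or value in row.get("new_state", "")


# Filter By option -> predicate, following the option list above.
FILTER_BY: Dict[str, Predicate] = {
    "Alert Policy": exact("policy"),
    "Key (Exact)": exact("key"),
    "Key (Partial)": partial("key"),
    "Dimension:Key": exact("dimension_key"),
    "Alarm ID": exact("alarm_id"),
    "Mitigation ID": exact("mitigation_id"),
    "Old State (Partial)": partial("old_state"),
    "New State (Partial)": partial("new_state"),
    "Any State (Partial)": any_state,
}


def apply_filter(rows: List[Row], filter_by: str, value: str) -> List[Row]:
    """Return the rows that satisfy the chosen Filter By option."""
    predicate = FILTER_BY[filter_by]
    return [row for row in rows if predicate(row, value)]
```

With this sketch, filtering by "Key (Exact)" for 60.54.101.8 would not match a row whose key is 60.54.101.80, while "Key (Partial)" would match both.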

 

Filter by Type

A drop-down menu includes the following checkboxes, which filter the History page by the type of event:

  • Mitigations: Determines whether the History page will include mitigations that occurred in the specified time range.
  • Alarms: Determines whether the History page will include alarms (alerts that entered ALARM state; see Alert States) that occurred in the specified time range.
  • Matches: Determines whether the History page will include all matches (see Matches in History) that occurred in the specified time range.
  • Silenced: Includes only matches generated by alert policies that were in silent mode (see Alert Silent Mode) at the time of the match.
  • Debug: Internal use only.

Note: A match indicates that traffic met conditions defined in an alert policy threshold but does not necessarily indicate that an alarm was triggered.

 

Matches in History

Unlike the Active page, the History page can be filtered to show matches (conditions that meet alert threshold criteria; see Threshold Conditions) that didn’t cause an alert to enter alarm state. This allows you to graph a history of matches for an alert during a specified time range, even if that alert does not enter ALARM state because there were not enough matches during a defined time period (see Activate When Settings).

 

 

History List

The History List is a filterable table of up to 1000 rows in which each row represents an alarm, a mitigation, or a match that occurred during the specified time range.

The History List provides the following columns that display information and actions for the rows in the list:

  • Policy: Indicates the following:
    - Policy (all alert types): A line giving the name of the policy involved in the event.
    - State (alarms and mitigations only): A second line giving the prior and current states of the policy. For example, the event may be a change in the policy’s state from ALARM to ACK_REQ (see Alert States).
  • Key/Dimension: Includes the following:
    - Silent mode button: The key can be placed in Silent Mode by clicking the speaker icon.
    - Dimensions and values: The dimensions of the key definition, and the corresponding values for the keys that caused the alert to match (see About Keys).
  • Value: For alarms and matches (not mitigations), this cell contains the following:
    - Value: A line giving the sum total value returned by the key as defined by the alert policy’s query. The top-X ranking of traffic is performed by evaluating the volume, as measured in the primary metric, of the traffic (across the selected devices and filtered by the specified filters) represented by the key.
    - Baseline: A line giving the baseline value from which the alarm threshold has deviated. The baseline can be either static or calculated as defined by the alert policy.
    Note: Hover over the baseline information to open a tool tip containing a baseline code (see Baseline Codes).
  • Mit ID/Alarm ID: The system-generated unique ID assigned to the alarm or mitigation when it was triggered. The IDs can be clicked to filter the History page for that ID.
    Note: Match rows only include an alarm ID if the match was included in the count of matches that triggered an alarm.
  • Timestamp (UTC): The time of the event that triggered the alarm, mitigation, or match.
  • Actions: See History List Actions.

 

 

Baseline Codes

The following table describes the explanation codes that appear for matches in the Baseline Used column of the History List:

| Code | Comparison direction | Comparator found? | Description |
| --- | --- | --- | --- |
| NO_USE_BASELINE | N.A. | N.A. | The match was triggered by a Static threshold (no baselining). |
| CALCULATED_USED_FOR_BASELINE | Current to History | Yes | The match was triggered when the key’s traffic exceeded the baseline. |
| TRIGGER_USED_NO_BASELINE | Current to History | No | The match was triggered when no baseline was found. |
| DEFAULT_USED_FOR_BASELINE | Current to History | No | The match was triggered when no baseline was found and the key’s traffic exceeded the specified value. |
| LOWEST_USED_FOR_BASELINE | Current to History | No | The match was triggered when no baseline was found and the key’s traffic exceeded the lowest historical top-x value. |
| NOT_FOUND_EXISTS_NO_BASELINE | Current to History | N.A. | The match was triggered when the key was in the threshold’s current top-x but not in the historical top-x. |
| ACT_CURRENT_MISSING_TRIGGER | History to Current | No | The match was triggered when the key was in the threshold’s historical top-x but not in the current top-x. |
| ACT_CURRENT_USED_FOUND | History to Current | Yes | The match was triggered when the key’s historical traffic exceeded its current traffic. |
| ACT_CURRENT_NOT_FOUND_EXISTS | History to Current | N.A. | The match was triggered when the key was in the threshold’s historical top-x but not in the current top-x. |

Note: The situations in which the above codes are used depend on the Condition Type and the Threshold Configuration Settings of the threshold that triggered the alarm.
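
If baseline codes need to be interpreted outside the portal (for example, in a script that post-processes exported history), the table above can be captured as a lookup. This is a minimal sketch: the `BaselineCode` class and `explain` helper are hypothetical names, and the descriptions are shortened from the table.

```python
from dataclasses import dataclass
from typing import Dict, Optional

C2H = "Current to History"
H2C = "History to Current"


@dataclass(frozen=True)
class BaselineCode:
    direction: Optional[str]          # None where the table says N.A.
    comparator_found: Optional[bool]  # None where the table says N.A.
    description: str


BASELINE_CODES: Dict[str, BaselineCode] = {
    "NO_USE_BASELINE": BaselineCode(None, None, "Static threshold (no baselining)."),
    "CALCULATED_USED_FOR_BASELINE": BaselineCode(C2H, True, "Traffic exceeded the baseline."),
    "TRIGGER_USED_NO_BASELINE": BaselineCode(C2H, False, "No baseline was found."),
    "DEFAULT_USED_FOR_BASELINE": BaselineCode(
        C2H, False, "No baseline found; traffic exceeded the specified value."),
    "LOWEST_USED_FOR_BASELINE": BaselineCode(
        C2H, False, "No baseline found; traffic exceeded the lowest historical top-x value."),
    "NOT_FOUND_EXISTS_NO_BASELINE": BaselineCode(
        C2H, None, "Key in the current top-x but not in the historical top-x."),
    "ACT_CURRENT_MISSING_TRIGGER": BaselineCode(
        H2C, False, "Key in the historical top-x but not in the current top-x."),
    "ACT_CURRENT_USED_FOUND": BaselineCode(
        H2C, True, "Historical traffic exceeded current traffic."),
    "ACT_CURRENT_NOT_FOUND_EXISTS": BaselineCode(
        H2C, None, "Key in the historical top-x but not in the current top-x."),
}


def explain(code: str) -> str:
    """Return a one-line, human-readable explanation of a baseline code."""
    info = BASELINE_CODES.get(code)
    if info is None:
        return f"{code}: unknown baseline code"
    direction = info.direction or "N.A."
    found = {True: "Yes", False: "No", None: "N.A."}[info.comparator_found]
    return f"{code} ({direction}; comparator found: {found}): {info.description}"
```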

 

 

History List Actions

The actions that can be taken on an alarm or match in the History List are applied with the action icons shown at the right of each row. There are no actions available for mitigations in the History List.

The following actions are available for alarms and matches:

  • Open in Explorer: Opens Data Explorer in a new browser window or tab, with the sidebar set to correspond to the values of the alarm’s key. For example, if the key’s dimension is Destination:IP (IP_dst) and the value of the key in the alarm is 60.54.101.8, then there will be a filter in the Data Explorer sidebar for inet_dst_addr ILIKE 60.54.101.8.
  • Open in Dashboard: Opens, in a new browser window or tab, the dashboard associated with the policy of the alert that generated the alarm (see Policy Dashboard in General Policy Settings), with the dashboard set to correspond to the values of the alarm’s key. For example, if the key’s dimension is Destination:IP (IP_dst) and the value of the key in the alarm is 60.54.101.8, then there will be a filter in the dashboard for inet_dst_addr ILIKE 60.54.101.8.
  • Debug: Displays the alarm in a modal Debug window (see Alert Debug).
  • Positional Data: Opens the Positional Details dialog; see Positional Details Dialog.

 

 

Positional Details Dialog

The Positional Details dialog is accessed via the Positional Data button in the Actions at the right of each row of the History List. The dialog includes two tables, which provide the following information:

  • Current Key List: A list showing the relative position of all keys currently in the top-X (including the featured key, which is the key that triggered the alarm corresponding to the dialog), allowing you to determine the impact of other keys on the position of the key you are looking at (e.g. is another key about to push the current key out of the top-X?).
  • Baseline List: A list of the baseline values and relative position for the top-X keys of the policy at the time that the featured key triggered the alert to enter ALARM state.

Additional information is coming soon.

 

 

Alert States

The state of alarms and mitigations is covered in the topics below.

 

 

Alert State Display

The state of alarms and mitigations is displayed in the following locations:

  • Active Alerts List: The current state of each active alert is shown in the State column.
  • History List: The most recent change in state for each alert is shown in the second line of each cell of the Policy column.

In both lists the state is represented by a label rather than by the actual value of the backend constant for that state. The meanings of the state constants are covered in the topics below.

Note: In the History List, hovering over a state label opens a pop-up that displays the actual constant value.

 

 

Alarm Row States

Alarm rows can appear in both the Active Alerts List and the History List. The following table lists the possible states represented by alarm rows, as well as the corresponding labels:

| Label | State | Description |
| --- | --- | --- |
| Alarm | ALARM | Active alarm: an alarm that is currently in alarm state. |
| ACK Required | ACK_REQ | An alarm that is no longer active but still requires user acknowledgement before being cleared (see General Threshold Settings). |
| Cleared | CLEAR | An alarm that has been cleared. Note: Cleared alarms are removed from the Active Alerts List but appear in the History List. |

Notes:
- In the History List, which includes an entry for each time a given alarm or mitigation undergoes a change of state, the label is determined by the new state.
- The History List can also display matches (see About Matches), which have no state.

 

 

Mitigation Row States

Mitigation rows can appear in both the Active Alerts List and the History List. The following table lists the possible mitigation states represented by mitigation rows, as well as the corresponding labels:

| Label | State | Description |
| --- | --- | --- |
| ACK Required | ACK_REQ | A mitigation that is no longer active but still requires user acknowledgement before being cleared (see Common Method Settings). |
| Clear | CLEAR | The mitigation has been cleared, either manually or as a result of acknowledgement. Note: Cleared mitigations are removed from the Active Alerts List but appear in the History List. |
| Manual Clear | CLEAR_MANUAL | The mitigation has been manually cleared. Note: Cleared mitigations are removed from the Active Alerts List but appear in the History List. |
| End Grace | END_GRACE | The mitigation has ended but the grace period has not yet expired (see Grace period in Common Method Settings). |
| End Time Conf | END_TIMED_CONF | Mitigation stop is pending: The conditions that triggered the mitigation no longer exist but either expiration of the timer or user acknowledgement is required before stopping (see User Acknowledgement Unless Timer Expired under Clear Mitigation in Threshold Mitigations). |
| End Wait Conf | END_WAIT_CONF | Mitigation stop is pending: The conditions that triggered the mitigation no longer exist but user acknowledgement is required before stopping (see User Acknowledgement under Clear Mitigation in Threshold Mitigations). |
| Initializing | NONE | The mitigation is initializing and is not yet active. |
| Active | MITIGATING | The mitigation is active and was not started or restarted manually. |
| Failure | MITIGATING_FAIL | Mitigation was attempted but was unable to execute as configured. |
| Failure-Rogue | ROGUE_MITIGATING_FAIL | External mitigation system indicates the existence of a Kentik-initiated mitigation for which Kentik has no internal record. |
| Manual | MITIGATING_MANUAL | One of the following: the mitigation was triggered and then stopped (manually or automatically), and has been restarted manually; or the mitigation was manually started using either the manual mitigation button or API. |
| Start Timed Conf | START_TIMED_CONF | Mitigation start is pending: Mitigation has been triggered but requires either expiration of the timer or user acknowledgement before starting (see User Acknowledgement Unless Timer Expired under Apply Mitigation in Threshold Mitigations). |
| Start Wait Conf | START_WAIT_CONF | Mitigation start is pending: Mitigation has been triggered but user acknowledgement is required before starting (see User Acknowledgement under Apply Mitigation in Threshold Mitigations). |

Notes:
- In the History List, which includes an entry for each time a given alarm or mitigation undergoes a change of state, the label is determined by the new state.
- The History List can also display matches (see About Matches), which have no state.
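
The relationship between backend state constants and displayed labels in the alarm and mitigation state tables can be expressed as a pair of lookups. The sketch below is illustrative only; `state_label` and `history_label` are hypothetical helper names, and falling back to the raw constant for an unknown state is an assumption.

```python
from typing import Dict

ALARM_LABELS: Dict[str, str] = {
    "ALARM": "Alarm",
    "ACK_REQ": "ACK Required",
    "CLEAR": "Cleared",
}

MITIGATION_LABELS: Dict[str, str] = {
    "ACK_REQ": "ACK Required",
    "CLEAR": "Clear",
    "CLEAR_MANUAL": "Manual Clear",
    "END_GRACE": "End Grace",
    "END_TIMED_CONF": "End Time Conf",
    "END_WAIT_CONF": "End Wait Conf",
    "NONE": "Initializing",
    "MITIGATING": "Active",
    "MITIGATING_FAIL": "Failure",
    "ROGUE_MITIGATING_FAIL": "Failure-Rogue",
    "MITIGATING_MANUAL": "Manual",
    "START_TIMED_CONF": "Start Timed Conf",
    "START_WAIT_CONF": "Start Wait Conf",
}


def state_label(row_type: str, state: str) -> str:
    """Return the display label for a state constant on an alarm or mitigation row."""
    if row_type == "alarm":
        labels = ALARM_LABELS
    elif row_type == "mitigation":
        labels = MITIGATION_LABELS
    else:
        raise ValueError(f"unsupported row type: {row_type}")
    return labels.get(state, state)  # assumption: show the raw constant if unknown


def history_label(row_type: str, old_state: str, new_state: str) -> str:
    """In the History List, the label is determined by the new state."""
    return state_label(row_type, new_state)
```

Note that the same constant can map to different labels depending on the row type: CLEAR is shown as "Cleared" on alarm rows and "Clear" on mitigation rows.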

 

 

Alert Debug

The Debug page provides a list showing values of keys (combinations of one or more dimensions; see About Keys) from the most recent evaluation of a chosen alert and the corresponding baseline (if any) for that alert.

More information is coming soon.