Define a Recovery Alert for a Resource Type Alert

Available only in vFabric Hyperic

Understanding Recovery Alerts

A recovery alert is special type of alert definition that you pair with a properly configured primary alert definition to streamline alert management. The purpose of a recovery alert is to fire when the condition that fired another alert - the "primary" alert - is no longer true, and then mark the primary alert "fixed" and re-enable the primary alert definition. This strategy prevents redundant alerts and automates the task of marking an alert "fixed".

You can define a recovery alert for a resource alert, and in vFabric Hyperic, a resource type alert. You cannot and don't need to define a recovery alert for a resource group alert in vFabric Hyperic — recovery alert behavior is automatic for resource group alerts.

To effectively leverage the benefits of recovery alert functionality you need to:

  • Configure the primary alert definition to fire once when triggered and then disable itself until that fired alert is fixed. This prevents multiple alerts for a single incident.

  • Configure a recovery alert definition and assign it to the primary alert definition. Make the recovery alert condition the opposite of the primary alert condition. The recovery alert fires when the primary alert condition is no longer true. Upon firing, the recovery alert marks the alert fired by the primary alert "fixed", and re-enables the primarily alert definition, so that if the problem occurs again, the primary alert is again triggered.

Properly configured primary and recovery alert definitions keep users notified of problems without deluging them with alert notifications.

Define Primary Alert Definition to Disable Itself

You can only define a recovery alert for a primary alert definition that already exists. Before setting up a recovery alert, create the primary alert definition, and choose the "Disable alert until re-enabled manually or by recovery alert" option.

Create a Recovery Alert Definition for a Resource Type Alert

To create a recovery alert definition for a resource type alert:

  1. Click Administration in the masthead.

  2. Click Monitoring Defaults in "HQ Server Settings" section of the page.

  3. On the HQ Monitoring Defaults Configuration page, click Edit Alerts for the resource type to which the primary alert is defined. The Monitoring Defaults page will display any alert definitions already assigned to the alert.

  4. Click New and follow the directions in Define an Alert for a Resource Type, making sure, when defining the "Condition Set" to

    1. specify the condition that is the opposite of the primary alert definition's condition.  For example if the primary alert condition is "1 Minute Load Avg > 2.0.", define the recovery alert condition as "1 Minute Load Avg < 2.0.

    2. Use the Recovery Alert pulldown to select the primary alert.