To ensure optimal vSphere HA cluster performance, VMware recommends that you follow certain best practices. Networking configuration and redundancy are important when designing and implementing your cluster.

When vSphere HA or Fault Tolerance takes action to maintain availability, such as failing over a virtual machine, you can be notified of the change. Configure alarms in vCenter Server to trigger when these actions occur and to send alerts, such as emails, to a specified set of administrators.

Several default vSphere HA alarms are available.

Insufficient failover resources (a cluster alarm)

Cannot find master (a cluster alarm)

Failover in progress (a cluster alarm)

Host HA status (a host alarm)

VM monitoring error (a virtual machine alarm)

VM monitoring action (a virtual machine alarm)

Failover failed (a virtual machine alarm)

Note

The default alarms include the feature name, vSphere HA.

A valid cluster is one in which the admission control policy has not been violated.

A cluster enabled for vSphere HA becomes invalid (red) when the number of powered-on virtual machines exceeds the failover requirements, that is, when the current failover capacity is smaller than the configured failover capacity. If admission control is disabled, clusters do not become invalid.
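The validity rule can be expressed as a small predicate. The following Python sketch is illustrative only: the function and parameter names are hypothetical and are not part of any vSphere API, and the capacities are assumed to be expressed as a number of host failures.

```python
def cluster_is_valid(current_failover_capacity: int,
                     configured_failover_capacity: int,
                     admission_control_enabled: bool = True) -> bool:
    """Return False when the cluster would be flagged invalid (red).

    Hypothetical helper: models only the rule stated in the text.
    """
    # With admission control disabled, clusters do not become invalid.
    if not admission_control_enabled:
        return True
    # Invalid when the current capacity is smaller than the configured capacity.
    return current_failover_capacity >= configured_failover_capacity


print(cluster_is_valid(current_failover_capacity=1,
                       configured_failover_capacity=2))
print(cluster_is_valid(1, 2, admission_control_enabled=False))
```

The first call models a cluster that has powered on more virtual machines than its failover requirements allow. The second shows that the same numbers do not invalidate a cluster when admission control is disabled.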

The cluster's Summary tab in the vSphere Client displays a list of configuration issues for clusters. The list explains what has caused the cluster to become invalid or overcommitted (yellow).

DRS behavior is not affected if a cluster is red because of a vSphere HA issue.

Configuration issues and other errors can occur for your cluster or its hosts that adversely affect the proper operation of vSphere HA. You can monitor these errors on the Cluster Operational Status screen, which is accessible in the vSphere Client from the vSphere HA section of the cluster's Summary tab. Address any issues listed there.

Most configuration issues have a matching event that is logged. All vSphere HA events include "vSphere HA" in the description. You can search for this term to find the corresponding events.
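If you export event descriptions to a script, the same search can be done with a substring match. The following Python sketch assumes the descriptions are already available as plain strings. The function name and the sample strings are invented placeholders, not real vCenter Server event text.

```python
def find_ha_events(descriptions, term="vSphere HA"):
    """Return the event descriptions that contain the search term."""
    return [text for text in descriptions if term in text]


# Placeholder descriptions for illustration only; real event text differs.
sample_descriptions = [
    "vSphere HA placeholder event A",
    "Unrelated placeholder event",
    "Placeholder event B mentioning vSphere HA",
]

for description in find_ha_events(sample_descriptions):
    print(description)
```

The match is case-sensitive, which mirrors the exact term "vSphere HA" used in the event descriptions.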

In clusters where ESXi 5.0 hosts and ESX/ESXi 4.1 or prior hosts are present and where Storage vMotion is used extensively or Storage DRS is enabled, VMware recommends that you do not deploy vSphere HA. vSphere HA might respond to a host failure by restarting a virtual machine on a host with an ESXi version different from the one on which the virtual machine was running before the failure. A problem can occur if, at the time of failure, the virtual machine was involved in a Storage vMotion action on an ESXi 5.0 host, and vSphere HA restarts the virtual machine on a host with a version prior to ESXi 5.0. While the virtual machine might power on, any subsequent attempts at snapshot operations could corrupt the vdisk state and leave the virtual machine unusable.
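The condition described above can be checked against a host inventory. The following Python sketch is a hedged illustration: the function names and inputs are hypothetical, the version strings are assumed to be in major.minor form, and treating every version from 5.0 upward as "ESXi 5.0" is an assumption made for the example.

```python
def parse_version(text):
    """Turn a string such as '4.1' into a comparable tuple (4, 1)."""
    parts = [int(part) for part in text.split(".")[:2]]
    while len(parts) < 2:
        parts.append(0)  # treat '5' as '5.0'
    return tuple(parts)


def ha_discouraged(host_versions, storage_drs_enabled, heavy_storage_vmotion):
    """Return True when the mixed-version warning in the text applies."""
    versions = [parse_version(v) for v in host_versions]
    # Assumption: any version from 5.0 upward counts as an ESXi 5.0 host.
    has_5x = any(v >= (5, 0) for v in versions)
    has_legacy = any(v <= (4, 1) for v in versions)
    uses_storage_moves = storage_drs_enabled or heavy_storage_vmotion
    return has_5x and has_legacy and uses_storage_moves


print(ha_discouraged(["5.0", "4.1"], storage_drs_enabled=True,
                     heavy_storage_vmotion=False))
```

Whether Storage vMotion is used "extensively" is a judgment call, so the sketch takes it as an input flag instead of trying to measure it.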

The following recommendations are best practices for vSphere HA admission control.

Select the Percentage of Cluster Resources Reserved admission control policy. This policy offers the most flexibility in terms of host and virtual machine sizing. In most cases, reserving 1/N of the cluster's resources, where N is the total number of nodes in the cluster, yields adequate spare capacity.

Ensure that you size all cluster hosts equally. An unbalanced cluster results in excess capacity being reserved to handle failure of the largest possible node.

Try to keep virtual machine sizing requirements similar across all configured virtual machines. The Host Failures Cluster Tolerates admission control policy uses slot sizes to calculate the amount of capacity to reserve for each virtual machine. The slot size is based on the largest reserved memory and CPU needed for any virtual machine. When you mix virtual machines with different CPU and memory requirements, the slot size calculation defaults to the largest possible, which limits consolidation.
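The arithmetic behind these recommendations can be sketched in a few lines of Python. This is a simplified model, not the vSphere HA implementation. The class, function names, and sample figures are hypothetical. The slot calculation uses only the largest CPU and memory reservations and ignores any overhead or default values that the product might apply. Counting slots per host by integer division is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    cpu_reservation_mhz: int
    memory_reservation_mb: int


def percentage_to_reserve(num_hosts: int) -> float:
    """The 1/N rule of thumb, expressed as a percentage of cluster resources."""
    if num_hosts < 1:
        raise ValueError("a cluster needs at least one host")
    return 100.0 / num_hosts


def slot_size(vms):
    """Slot size as the largest CPU and memory reservation of any VM."""
    return (max(vm.cpu_reservation_mhz for vm in vms),
            max(vm.memory_reservation_mb for vm in vms))


def slots_per_host(host_cpu_mhz, host_memory_mb, slot):
    """Assumed model: whole slots that fit in both CPU and memory."""
    slot_cpu, slot_mem = slot
    return min(host_cpu_mhz // slot_cpu, host_memory_mb // slot_mem)


small = [VM("a", 500, 1024), VM("b", 500, 1024)]
mixed = small + [VM("big", 2000, 8192)]

print(percentage_to_reserve(4))                       # 25.0
print(slots_per_host(8000, 32768, slot_size(small)))  # 16
print(slots_per_host(8000, 32768, slot_size(mixed)))  # 4
```

In this made-up example, adding one large virtual machine raises the slot size for every virtual machine and cuts the number of slots on the host from 16 to 4, which is the consolidation limit the recommendation warns about.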

You can use vSphere HA and Auto Deploy together to improve the availability of your virtual machines. Auto Deploy provisions hosts when they power up, and you can also configure it to install the vSphere HA agent on those hosts during the boot process. To have Auto Deploy install the vSphere HA agent, the image profile you assign to the host must include the vmware-fdm VIB. See the Auto Deploy documentation included in vSphere Installation and Setup for details.

If you need to perform network maintenance that might trigger host isolation responses, VMware recommends that you first suspend vSphere HA by disabling Host Monitoring. After the maintenance is complete, reenable Host Monitoring.
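If you script maintenance windows, the disable-then-reenable sequence fits a context manager. The following Python sketch shows only the ordering. `set_host_monitoring` is a hypothetical callable that you would supply to change the cluster setting. It is not a real vSphere API call.

```python
from contextlib import contextmanager


@contextmanager
def host_monitoring_suspended(set_host_monitoring):
    """Disable Host Monitoring for the duration of the with-block.

    set_host_monitoring is a hypothetical callable taking True or False.
    """
    set_host_monitoring(False)
    try:
        yield
    finally:
        # Reenable even if the maintenance step raises an exception.
        set_host_monitoring(True)


# Demonstration with a recorder in place of a real cluster call.
calls = []
with host_monitoring_suspended(calls.append):
    calls.append("network maintenance")
print(calls)
```

The `finally` clause is a design choice: it reenables Host Monitoring even when the maintenance step fails, so the cluster is not left unprotected by an aborted script. If you prefer to leave monitoring off after a failure, handle the exception inside the block.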