VMware recommends best practices for the configuration of host NICs and network topology for vSphere HA. These best practices include recommendations for your ESXi hosts, and for cabling, switches, routers, and firewalls.

The following network maintenance suggestions can help you avoid false detection of host failure and network isolation caused by dropped vSphere HA heartbeats.

When making changes to the networks that your clustered ESXi hosts are on, VMware recommends that you suspend the Host Monitoring feature. Changing your network hardware or networking settings can interrupt the heartbeats that vSphere HA uses to detect host failures, and this might result in unwanted attempts to fail over virtual machines.

When you change the networking configuration on the ESXi hosts themselves, for example by adding port groups or removing vSwitches, VMware recommends that, in addition to suspending Host Monitoring, you place the hosts on which the changes are being made into maintenance mode. When a host comes out of maintenance mode, it is reconfigured, which causes the network information to be reinspected for the running host. If the host is not put into maintenance mode, the vSphere HA agent continues to run with the old network configuration information.

Note

Because networking is a vital component of vSphere HA, if network maintenance needs to be performed, inform the vSphere HA administrator.

To identify which network operations might disrupt the functioning of vSphere HA, you should know which management networks are being used for heartbeating and other vSphere HA communications.

On legacy ESX hosts in the cluster, vSphere HA communications travel over all networks that are designated as service console networks. VMkernel networks are not used by these hosts for vSphere HA communications.

On ESXi hosts in the cluster, vSphere HA communications, by default, travel over VMkernel networks, except those marked for use with vMotion. If there is only one VMkernel network, vSphere HA shares it with vMotion, if necessary. On ESXi hosts, if you want vSphere HA to use a network other than the one vCenter Server uses to communicate with the host, you must explicitly select the Management traffic checkbox for that network. On ESX hosts, vSphere HA uses all the service console networks by default. To restrict HA traffic to a subset of the ESX service console networks, use the allowedNetworks advanced option.

Note

To keep vSphere HA agent management traffic separate from other network traffic, VMware recommends that you configure hosts so that vmkNICs used by vSphere HA do not share subnets with vmkNICs used for other purposes. vSphere HA agents send packets using any pNIC that is associated with a given subnet if that subnet also has at least one vmkNIC configured for vSphere HA management traffic. Consequently, to ensure network flow separation, the vmkNICs used by vSphere HA and by other features must be on different subnets.
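The subnet separation described in this note can be checked offline before a change is rolled out. The following is a minimal sketch, not a VMware tool: it assumes you have already collected each vmkNIC's address and prefix length by hand or from your own inventory, and the function name and the dictionaries it takes are hypothetical.

```python
import ipaddress


def find_shared_subnets(ha_vmknics, other_vmknics):
    """Return (ha_vmknic, other_vmknic) pairs whose subnets overlap.

    Both arguments are hypothetical inventories that map a vmkNIC name
    to an interface string such as "10.0.0.11/24".
    """
    conflicts = []
    for ha_name, ha_addr in ha_vmknics.items():
        ha_net = ipaddress.ip_interface(ha_addr).network
        for other_name, other_addr in other_vmknics.items():
            other_net = ipaddress.ip_interface(other_addr).network
            # Overlapping subnets mean HA traffic and the other traffic
            # type are not separated on this host.
            if ha_net.overlaps(other_net):
                conflicts.append((ha_name, other_name))
    return conflicts


if __name__ == "__main__":
    ha = {"vmk0": "10.0.0.11/24"}
    other = {"vmk1": "10.0.0.12/24", "vmk2": "10.0.1.12/24"}
    for ha_nic, other_nic in find_shared_subnets(ha, other):
        print(f"{ha_nic} shares a subnet with {other_nic}")
```

An empty result means that, for the addresses supplied, no vmkNIC used by vSphere HA shares a subnet with a vmkNIC used for another purpose.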

A network isolation address is an IP address that is pinged to determine whether a host is isolated from the network. This address is pinged only when a host has stopped receiving heartbeats from all other hosts in the cluster. If a host can ping its network isolation address, the host is not network isolated, and the other hosts in the cluster have failed. However, if the host cannot ping its isolation address, it is likely that the host has become isolated from the network and no failover action is taken.

By default, the network isolation address is the default gateway for the host. Only one default gateway is specified, regardless of how many management networks have been defined. You should use the das.isolationaddress[...] advanced attribute to add isolation addresses for additional networks. See vSphere HA Advanced Attributes.
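The isolation check described above can be summarized as a small decision function. This is an illustrative sketch only, not the vSphere HA agent's implementation: the function name, the state labels, and the `ping` callable are hypothetical, and it assumes that a reply from any one configured isolation address is enough to treat the host as not isolated.

```python
from typing import Callable, Iterable


def classify_host(
    receiving_heartbeats: bool,
    isolation_addresses: Iterable[str],
    ping: Callable[[str], bool],
) -> str:
    """Classify a host using the isolation logic described in the text.

    `ping` is a caller-supplied function (hypothetical) that returns True
    when the given address replies.
    """
    if receiving_heartbeats:
        # Isolation addresses are pinged only after heartbeats from all
        # other hosts have stopped, so no ping is sent here.
        return "connected"
    if any(ping(address) for address in isolation_addresses):
        # The network is reachable, so the silence is attributed to the
        # other hosts rather than to this host.
        return "not isolated"
    return "likely isolated"


if __name__ == "__main__":
    reachable = {"192.0.2.1"}  # example address that answers pings
    state = classify_host(False, ["192.0.2.1", "198.51.100.1"],
                          lambda address: address in reachable)
    print(state)
```

Passing the ping function as a parameter keeps the sketch testable without sending real network traffic.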

Consider the following additional points when configuring the networking that supports your vSphere HA cluster.

Configuring Switches. If the physical network switches that connect your servers support the PortFast (or an equivalent) setting, enable it. This setting prevents a host from incorrectly determining that a network is isolated during the execution of lengthy spanning tree algorithms.

Port Group Names and Network Labels. Use consistent port group names and network labels on VLANs for public networks. Port group names are used to reconfigure access to the network by virtual machines. If you use inconsistent names between the original server and the failover server, virtual machines are disconnected from their networks after failover. Network labels are used by virtual machines to reestablish network connectivity upon restart.
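Naming consistency of this kind can be audited with a simple comparison across hosts. The sketch below assumes you have exported the port group names of each host into a dictionary yourself. The function name and data layout are hypothetical, and names are compared as exact strings.

```python
def missing_port_groups(host_port_groups):
    """Report port group names that exist on some hosts but not on others.

    `host_port_groups` is a hypothetical inventory that maps a host name
    to the set of port group names defined on that host. The result maps
    each host to the sorted names it lacks; consistent hosts are omitted.
    """
    all_names = set().union(*host_port_groups.values())
    report = {}
    for host, names in host_port_groups.items():
        missing = all_names - names
        if missing:
            report[host] = sorted(missing)
    return report


if __name__ == "__main__":
    inventory = {
        "esxi-01": {"VM Network", "Prod-VLAN10"},
        "esxi-02": {"VM Network", "prod-vlan10"},  # differs only in case
    }
    for host, missing in missing_port_groups(inventory).items():
        print(f"{host} is missing: {', '.join(missing)}")
```

In the usage example, the two spellings of the same label are reported as a mismatch on both hosts, which is the kind of inconsistency that would leave virtual machines disconnected after failover.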

Configure the management networks so that the vSphere HA agent on a host in the cluster can reach the agents on any of the other hosts using one of the management networks. If you do not set up such a configuration, a network partition condition can occur after a master host is elected.
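A rough way to review this requirement on paper is to confirm that every pair of hosts has a management network in common. The sketch below makes a simplifying assumption that is not from the source: two hosts are treated as mutually reachable only when they share at least one management network label, and routing between different networks is ignored. The function name and inventory format are hypothetical.

```python
from itertools import combinations


def unreachable_pairs(host_networks):
    """Return host pairs that share no management network.

    `host_networks` is a hypothetical inventory that maps a host name to
    the set of management network labels configured on that host.
    """
    pairs = []
    for host_a, host_b in combinations(sorted(host_networks), 2):
        # An empty intersection means these two agents have no common
        # management network under the assumption stated above.
        if not host_networks[host_a] & host_networks[host_b]:
            pairs.append((host_a, host_b))
    return pairs


if __name__ == "__main__":
    inventory = {
        "esxi-01": {"mgmt-a"},
        "esxi-02": {"mgmt-a", "mgmt-b"},
        "esxi-03": {"mgmt-b"},
    }
    print(unreachable_pairs(inventory))
```

Any pair reported here is a candidate for the network partition condition mentioned above and is worth verifying against the real topology.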