When a datastore becomes inaccessible, VMCP might not terminate and restart the affected virtual machines.

When an All Paths Down (APD) or Permanent Device Loss (PDL) failure occurs and a datastore becomes inaccessible, VMCP might not resolve the issue for the affected virtual machines.

In an APD or PDL failure situation, VMCP might not terminate a virtual machine for the following reasons:

The VM is not protected by vSphere HA at the time of failure.

VMCP is disabled for this virtual machine.

Furthermore, if the failure is an APD, VMCP might not terminate a VM for the following additional reasons:

The APD failure is corrected before the VM is terminated.

There is insufficient capacity on the hosts with which the virtual machine is compatible.

During a network partition or isolation, the host affected by the APD failure is not able to query the master host for available capacity. In such a case, vSphere HA defers to the user policy and terminates the VM if the VM Component Protection setting is aggressive.
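The conditions listed above can be summarized as a small decision function. This is only an illustrative sketch of the logic as described in this topic, not vSphere HA code or a vSphere API: the `ApdState` class, its field names, and `vmcp_terminates` are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ApdState:
    """Hypothetical snapshot of one VM during an APD event (not a vSphere API type)."""
    ha_protected: bool           # VM was protected by vSphere HA at the time of failure
    vmcp_enabled: bool           # VMCP is enabled for this VM
    apd_cleared: bool            # APD was corrected before the VM was terminated
    capacity_ok: Optional[bool]  # True/False if known; None if the master host cannot be queried
    aggressive: bool             # VM Component Protection setting is aggressive


def vmcp_terminates(state: ApdState) -> bool:
    """Return True if, under the conditions described above, the VM would be terminated."""
    if not state.ha_protected or not state.vmcp_enabled:
        return False
    if state.apd_cleared:
        return False
    if state.capacity_ok is None:
        # Network partition or isolation: capacity is unknown, so the user policy decides.
        return state.aggressive
    # Capacity is known: terminate only if a compatible host can restart the VM.
    return state.capacity_ok
```

For example, an isolated host with an aggressive setting yields `vmcp_terminates(ApdState(True, True, False, None, True)) == True`, while the same state with a non-aggressive setting yields `False`.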

vSphere HA terminates APD-affected VMs only after the following timeouts expire:

APD timeout (default 140 seconds).

APD failover delay (default 180 seconds). For faster recovery, this delay can be set to 0.

Note

Based on these default values, vSphere HA terminates the affected virtual machine after 320 seconds (APD timeout + APD failover delay).
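The timing arithmetic can be checked with a short calculation. This is a minimal sketch that uses only the two default values stated in this topic; the constant and function names are hypothetical and do not correspond to vSphere setting names.

```python
APD_TIMEOUT_S = 140         # default APD timeout, as stated above
APD_FAILOVER_DELAY_S = 180  # default APD failover delay, as stated above


def seconds_until_termination(apd_timeout_s: int = APD_TIMEOUT_S,
                              failover_delay_s: int = APD_FAILOVER_DELAY_S) -> int:
    """Return the total wait before an APD-affected VM is terminated."""
    if apd_timeout_s < 0 or failover_delay_s < 0:
        raise ValueError("timeouts must be non-negative")
    return apd_timeout_s + failover_delay_s


print(seconds_until_termination())                    # 320 with both defaults
print(seconds_until_termination(failover_delay_s=0))  # 140 with the failover delay set to 0
```

Setting the failover delay to 0 removes only the second term; the APD timeout still has to expire first.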

To address this issue, check and adjust any of the following:

Insufficient capacity to restart the virtual machine

User-configured timeouts and delays

User settings affecting VM termination

VM Component Protection policy

Host monitoring and VM restart priority settings, which must be enabled