vCenter Server supports a rich set of customization options, including monitoring and virtual machine fault tolerance. For each feature, this VMware Validated Design specifies the design decisions.

When VM and Application Monitoring is enabled, the VM and Application Monitoring service, which uses VMware Tools, evaluates whether each virtual machine in the cluster is running. The service checks for regular heartbeats and I/O activity from the VMware Tools process running on guests. If the service receives no heartbeats or I/O activity, it is likely that the guest operating system has failed or that VMware Tools is not being allocated time for heartbeats or I/O activity. In this case, the service determines that the virtual machine has failed and reboots the virtual machine.

Enable Virtual Machine Monitoring so that a failed virtual machine is restarted automatically. The application or service running on the virtual machine must be able to restart successfully after a reboot; otherwise, restarting the virtual machine does not restore the service.
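The failure check described above can be sketched as a small Python model. This is a minimal illustration, not the monitoring service's implementation: the `GuestStatus` type, the `should_restart` function, and the two time windows are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class GuestStatus:
    """Observations about one virtual machine (all values in seconds)."""
    seconds_since_heartbeat: float
    seconds_since_io: float


def should_restart(status: GuestStatus,
                   failure_interval: float = 30.0,
                   io_window: float = 120.0) -> bool:
    """Return True when neither heartbeats nor I/O activity were seen recently.

    failure_interval and io_window are illustrative values, not settings
    prescribed by this design.
    """
    heartbeat_lost = status.seconds_since_heartbeat > failure_interval
    io_idle = status.seconds_since_io > io_window
    # The VM is treated as failed only when both signals are missing.
    return heartbeat_lost and io_idle
```

In this sketch, a guest that has stopped sending heartbeats but still shows recent I/O activity is not restarted, which mirrors the requirement that both signals be absent.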

Monitor Virtual Machines Design Decision

Decision ID: SDDC-VI-VC-025
Design Decision: Enable Virtual Machine Monitoring for each cluster.
Design Justification: Virtual Machine Monitoring provides adequate in-guest protection for most VM workloads.
Design Implication: There is no downside to enabling Virtual Machine Monitoring.

Decision ID: SDDC-VI-VC-026
Design Decision: Create Virtual Machine Groups for use in startup rules in the management and shared edge and compute clusters.
Design Justification: By creating Virtual Machine groups, rules can be created to configure the startup order of the SDDC management components.
Design Implication: Creating the groups is a manual task and adds administrative overhead.

Decision ID: SDDC-VI-VC-027
Design Decision: Create Virtual Machine rules to specify the startup order of the SDDC management components.
Design Justification: The rules enforce the startup order of virtual machine groups to ensure the correct startup order of the SDDC management components.
Design Implication: Creating the rules is a manual task and adds administrative overhead.
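To illustrate how startup rules between virtual machine groups determine an overall startup order, the following sketch treats each rule as a "start after" dependency and sorts the groups topologically. The group names and the `startup_order` helper are hypothetical and are not taken from this design; the sketch only models the ordering, it does not configure vCenter Server.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical VM groups. Each key maps to the set of groups that must be
# started before it, which is the role a startup rule plays.
START_AFTER = {
    "identity": set(),
    "vcenter": {"identity"},
    "network-manager": {"vcenter"},
    "monitoring": {"vcenter", "network-manager"},
}


def startup_order(rules: dict[str, set[str]]) -> list[str]:
    """Return the groups in an order that satisfies every rule.

    Raises graphlib.CycleError if the rules contradict each other.
    """
    return list(TopologicalSorter(rules).static_order())


if __name__ == "__main__":
    print(startup_order(START_AFTER))
```

A cycle in the rules (for example, two groups that each wait for the other) raises an error in this model, which is a useful check to run before creating the rules by hand.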

vSphere Distributed Resource Scheduling (DRS) provides load balancing of a cluster by migrating workloads from heavily loaded hosts to less utilized hosts in the cluster. DRS supports manual and automatic modes.

Manual: Recommendations are made, but an administrator must confirm the changes.

Automatic: Automatic management can be set to five different levels. At the lowest setting, workloads are placed automatically at power-on and are migrated only to fulfill certain criteria, such as a host entering maintenance mode. At the highest level, any migration that would provide even a slight improvement in balancing is executed.
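The relationship between the two modes and the five automatic levels can be sketched as a toy model. It assumes that each recommendation carries a priority from 1 (mandatory, such as evacuating a host for maintenance) to 5 (slight improvement) and that a level applies every recommendation at or below its number. The `Recommendation` type and `migrations_to_apply` function are hypothetical; this is not the DRS algorithm itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recommendation:
    vm: str
    target_host: str
    priority: int  # assumed scale: 1 = mandatory ... 5 = slight improvement


def migrations_to_apply(recs: list[Recommendation],
                        mode: str,
                        threshold: int = 3) -> list[Recommendation]:
    """Return the recommendations that are executed without confirmation."""
    if not 1 <= threshold <= 5:
        raise ValueError("threshold must be between 1 and 5")
    if mode == "manual":
        # Nothing is applied automatically; an administrator confirms each one.
        return []
    if mode != "automatic":
        raise ValueError(f"unknown mode: {mode!r}")
    return [r for r in recs if r.priority <= threshold]
```

With the default `threshold=3` (the middle level), mandatory and moderate recommendations are applied while marginal ones are skipped, which is the trade-off between balancing and migration churn that the middle setting is meant to strike.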

vSphere Distributed Resource Scheduling Design Decision

Decision ID: SDDC-VI-VC-028
Design Decision: Enable DRS on all clusters and set it to Fully Automated, with the default setting (medium).
Design Justification: The default settings provide the best trade-off between load balancing and excessive migration with vMotion events.
Design Implication: In the event of a vCenter outage, mapping from virtual machines to ESXi hosts might be more difficult to determine.

Enhanced vMotion Compatibility (EVC) works by masking certain features of newer CPUs to allow migration between hosts containing older CPUs. EVC works only with CPUs from the same manufacturer, and there are limits to how far apart the CPU families can be.

If you set EVC during cluster creation, you can add hosts with newer CPUs at a later date without disruption. You can use EVC for a rolling upgrade of all hardware with zero downtime.

Set EVC to the highest level that all CPUs currently in use support. This level is determined by the oldest CPU generation among the hosts in the cluster.
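Choosing the EVC level for a cluster amounts to finding the highest level that every host supports, which is the level of the oldest CPU generation present. The sketch below shows that selection. The ordered list of generation names is illustrative and the host names are hypothetical; consult the VMware compatibility information for the levels that apply to your hardware.

```python
# Illustrative ordering of CPU generations from one vendor, oldest first.
# This list is an assumption made for the example.
EVC_LEVELS = [
    "merom", "penryn", "nehalem", "westmere", "sandybridge",
    "ivybridge", "haswell", "broadwell", "skylake",
]


def highest_common_evc_level(host_max_levels: dict[str, str]) -> str:
    """Return the highest EVC level that every host in the cluster supports.

    host_max_levels maps a host name to the highest level its CPU supports.
    An unknown level name raises ValueError.
    """
    if not host_max_levels:
        raise ValueError("cluster has no hosts")
    # The common level is limited by the oldest CPU generation in the cluster.
    return min(host_max_levels.values(), key=EVC_LEVELS.index)
```

Adding a host with a newer CPU to the cluster does not change the result, which is why such hosts can join an EVC-enabled cluster later without disruption.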

VMware Enhanced vMotion Compatibility Design Decision

Decision ID: SDDC-VI-VC-029
Design Decision: Enable Enhanced vMotion Compatibility on all clusters. Set EVC mode to the lowest available setting supported for the hosts in the cluster.
Design Justification: Allows cluster upgrades without virtual machine downtime.
Design Implication: You can enable EVC only if clusters contain hosts with CPUs from the same vendor.