VMware Validated Design for Software-Defined Data Center 3.0 Release Notes

Updated on: 15 NOV 2016

VMware Validated Design for Software-Defined Data Center 3.0 | 29 SEP 2016

Check for additions and updates to these release notes.

What's in the Release Notes

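The release notes cover the following topics:
  • About VMware Validated Design for Software-Defined Data Center 3.0
  • VMware Software Components in the Validated Design
  • What's New
  • Internationalization
  • Compatibility
  • Installation
  • Caveats and Limitations
  • Known Issues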

About VMware Validated Design for Software-Defined Data Center 3.0

VMware Validated Designs provide a set of prescriptive documents that explain how to plan, deploy, and configure a Software-Defined Data Center (SDDC). The architecture, the detailed design, and the deployment guides provide instructions about configuring a dual-region SDDC.

VMware Validated Designs are tested by VMware to ensure that all components and their individual versions work together, scale, and perform as expected. Unlike Reference Architectures, which focus on an individual product or purpose, a VMware Validated Design is a holistic approach to design, encompassing many products in a full stack for a broad set of use case scenarios in an SDDC.

This VMware Validated Design supports a number of use cases and is optimized for integration, expansion, day 2 operations, as well as future upgrades and updates. As new products are introduced, and new versions of existing products are released, VMware continues to qualify the cross-compatibility and upgrade paths of the VMware Validated Designs. Designing with a VMware Validated Design ensures that future upgrade and expansion options are available and supported.
 

VMware Software Components in the Validated Design

VMware Validated Design for Software-Defined Data Center 3.0 is based on a set of individual VMware products with different versions that are available in a common downloadable package.

For information about the products and versions in the VMware Validated Design for Software-Defined Data Center 3.0.2 release, see the VMware Validated Design for Software-Defined Data Center 3.0.2 Release Notes.

| Product Group and Edition | Product Name | Product Version |
| VMware vSphere Enterprise Plus | ESXi | 6.0 Update 2 |
| | vCenter Server Appliance | 6.0 Update 2 |
| VMware Virtual SAN Standard or higher | Virtual SAN | 6.2 |
| VMware vSphere Replication | VMware vSphere Replication | 6.1.1 |
| VMware Site Recovery Manager Enterprise | VMware Site Recovery Manager | 6.1.1 |
| VMware NSX for vSphere Enterprise | NSX for vSphere | 6.2.4 |
| VMware vRealize Automation Advanced or higher | vRealize Automation | 7.0.1 |
| | vRealize Orchestrator | 7.0.1 |
| | vRealize Orchestrator Plug-in for NSX | 1.0.3 |
| | vRealize Orchestrator Plug-in for vRealize Automation 7.0.1 | 7.0.1 |
| VMware vRealize Business for Cloud Advanced | vRealize Business for Cloud | 7.0.1 and 7.0.1 Express Patch |
| VMware vRealize Operations Manager Advanced or higher | vRealize Operations Manager | 6.2.1 |
| | vRealize Operations Management Pack for NSX for vSphere | 3.0.2 |
| | vRealize Operations Management Pack for vRealize Log Insight | 1.0.1 |
| | vRealize Operations Management Pack for vRealize Automation | 2.0 |
| | vRealize Operations Management Pack for Storage Devices | 6.0.4 |
| VMware vRealize Log Insight | vRealize Log Insight | 3.3.2 |
| | vRealize Log Insight Content Pack for NSX for vSphere | 3.3 |
| | vRealize Log Insight Content Pack for Virtual SAN | 2.0 |
| | vRealize Log Insight Content Pack for vRealize Automation 7.0 | 1.0 |
| | vRealize Log Insight Content Pack for vRealize Orchestrator 7.0 | 1.1 |
| | vRealize Log Insight Content Pack for vRealize Operations Manager 6.x | 1.6 |
| VMware vSphere Data Protection | vSphere Data Protection | 6.1.2 |
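
One way to use the product versions listed above during deployment checks is to keep them in a small lookup and compare them with what is actually installed. The following Python sketch is illustrative only: it covers a subset of the products, and the `deployed` dictionary is hypothetical input that you would have to collect yourself.

```python
# Validated versions copied from the product list above (subset only).
VALIDATED_BOM = {
    "ESXi": "6.0 Update 2",
    "vCenter Server Appliance": "6.0 Update 2",
    "Virtual SAN": "6.2",
    "NSX for vSphere": "6.2.4",
    "vRealize Automation": "7.0.1",
    "vRealize Operations Manager": "6.2.1",
    "vRealize Log Insight": "3.3.2",
    "vSphere Data Protection": "6.1.2",
}


def find_version_mismatches(deployed):
    """Return {product: (validated, deployed)} for every product that differs.

    `deployed` is a hypothetical {product: version} mapping gathered by the
    operator. Products missing from it are reported with None.
    """
    mismatches = {}
    for product, validated in VALIDATED_BOM.items():
        actual = deployed.get(product)
        if actual != validated:
            mismatches[product] = (validated, actual)
    return mismatches
```

A reported mismatch is not necessarily an error, because a component might be running a later security patch than the version listed here.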

VMware makes patches and releases available to address critical security issues for several products. When you deploy this VMware Validated Design, verify that you are using the latest security patches for each component.

What's New

VMware Validated Design for Software-Defined Data Center 3.0 provides the following new features:
  • Simplified 2-pod design that reduces the minimal hardware requirements for deploying the SDDC
  • Updated Bill of Materials that incorporates all new product versions
  • Implementation guides for deploying dual-region SDDC that supports disaster recovery
  • CertGenVVD tool that automates the generation of CA-signed certificates
For more information, see the VMware Validated Design for Software-Defined Data Center 3.0 page.

Internationalization

This VMware Validated Design release is available only in English.

Compatibility

This VMware Validated Design guarantees that product versions in the VMware Validated Design for Software-Defined Data Center 3.0, and the design chosen, are fully compatible. Any minor known issues that exist are described in this release notes document.

Installation

To install and configure an SDDC according to this validated design, follow the guidance in the VMware Validated Design for Software-Defined Data Center 3.0 documentation. For product and guide download information, see the VMware Validated Design for Software-Defined Data Center 3.0 page.

Caveats and Limitations

To install vRealize Automation, you must open certain ports in the Windows firewall. This VMware Validated Design instructs that you disable the Windows firewall before you install vRealize Automation. It is possible to keep Windows firewall active and install vRealize Automation by opening only the ports that are required for the installation. This process is described in the vRealize Automation Installation and Configuration documentation.

Known Issues

The known issues are grouped as follows.

vCenter Server
  • Heavy access of the Platform Services Controller Appliance 6.0 causes VMware Identity Management Service to crash

    The vCenter Server Web client becomes inaccessible and produces the following exception:

    HTTP Status 500 - Request processing failed; nested exception is java.lang.IllegalStateException: BadRequest

    Workaround: See VMware Knowledge Base article 2144070

  • During failover between regions, the vAPI Endpoint on the vCenter Server instances in the protected region might trigger an alarm and report that the vAPI Router fails to load the Inventory Service

    In the vSphere Web Client, you see the following warning alarms in the Alarms sidebar panel:

    vAPI Endpoint (vCenter Server Hostname)
    VMware vAPI Endpoint Service
    Warning
    vapi-endpoint status changed from green to yellow

    where vCenter Server Hostname in the context of this VMware Validated Design is mgmt01vc51.lax01.rainpole.local or comp01vc51.lax01.rainpole.local.

    The Administration > System Configuration section of the vSphere Web Client reports the following health message for the vAPI Endpoint:

    vAPI Router failed to load inventory service

    The endpoint.log file (/var/log/vmware/vapi/endpoint on vCenter Server Appliance and C:\ProgramData\VMware\vCenterServer\logs\vapi\endpoint\ on vCenter Server on Windows) contains the following log messages: 

    2016-09-08T17:26:29.662Z | INFO  | state-manager1  | InvProviderClientFactory | Login to IS server...
    2016-09-08T17:29:39.076Z | WARN  | state-manager1  | InvProviderClientFactory  | Unable to login/logout to/from inventory service

    Workaround: During disaster recovery, perform a failback operation and bring the vCenter Server and Platform Services Controller instances that were powered off back online. For information about resolving the vAPI Endpoint issue in the case of unavailable linked vCenter Server instances, see VMware Knowledge Base article 2146538.
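
To confirm this symptom without reading the log by hand, you can scan endpoint.log for the warning shown above. The following Python sketch assumes the pipe-delimited layout seen in the excerpt (timestamp, level, thread, component, message). The function name is hypothetical and not part of any VMware tool.

```python
import re

# Pipe-delimited layout as seen in the endpoint.log excerpt above:
# timestamp | LEVEL | thread | component | message
LOG_LINE = re.compile(
    r"^\s*(?P<timestamp>\S+)\s*\|\s*(?P<level>\w+)\s*\|\s*(?P<thread>[^|]+?)\s*\|"
    r"\s*(?P<component>[^|]+?)\s*\|\s*(?P<message>.*)$"
)

SYMPTOM = "Unable to login/logout to/from inventory service"


def find_inventory_login_failures(lines):
    """Return the timestamps of WARN entries that report the login failure."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line)
        if (
            match
            and match.group("level") == "WARN"
            and SYMPTOM in match.group("message")
        ):
            hits.append(match.group("timestamp"))
    return hits
```

On a vCenter Server Appliance, you could pass the open endpoint.log file from the /var/log/vmware/vapi/endpoint folder as the `lines` argument.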

vRealize Automation
  • vRealize Automation blueprint deployments that include NSX objects fail when provisioning to a cluster where the NSX manager has the secondary role

    In a cross-vCenter deployment of NSX, NSX universal objects, such as edge gateways, new virtual wires, and load balancers, must be provisioned using the NSX manager that has the primary role. If you attempt to provision universal objects to a secondary NSX manager, the process fails with an error. vRealize Automation does not support provisioning of NSX universal objects to a vSphere endpoint with network and security integration where the specified NSX manager has the secondary role.

    Workaround: To use NSX global objects, you must create a region-specific NSX local transport zone and virtual wires. For detailed instructions, see VMware Knowledge Base article 2147240.

  • After failover or failback during disaster recovery, login to the vRealize Automation Rainpole portal takes several minutes and a test login to vRealize Orchestrator fails

    This issue occurs during both failover to Region B and failback to Region A of the Cloud Management Platform when the root Active Directory is not available from the protected region. You see the following symptoms:

    • Login takes several minutes or times out
      • When you log in to the vRealize Automation Rainpole portal at https://vra01svr01.rainpole.local/vcac/org/rainpole as the ITAC-TenantAdmin user, the vRealize Automation portal loads after 2 to 5 minutes.
      • A Test Login to vRealize Orchestrator from the vRealize Orchestrator Control Center at https://vra01vro01a.rainpole.local:8283/vco-controlcenter/ with the svc-vra user account fails with the following error message:

        Error: I/O error on POST request for "https://vra01svr01.rainpole.local:443/SAAS/t/rainpole/auth/oauthtoken?grant_type=password": Read timed out; nested exception is java.net.SocketTimeoutException: Read timed out
    • Login to the vRealize Automation Rainpole portal fails with an error about incorrect user name and password.

    Workaround: Perform one of the following workarounds according to the recovery operation type.

    • Failover to Region B
      1. Log in to the vra01svr01a.rainpole.local appliance using SSH as the root user.
      2. Open the file /usr/local/horizon/conf/domain_krb.properties in a text editor.
      3. Add the list of domain-to-host values and save the domain_krb.properties file.
        Use only lowercase characters when you type the domain name. For example:
        rainpole.local=dc51rpl.rainpole.local:389
      4. Change the ownership of the domain_krb.properties file.
        chown horizon:www /usr/local/horizon/conf/domain_krb.properties
      5. Open the file /etc/krb5.conf in a text editor.
      6. Update the realms section of the krb5.conf file with the same domain-to-host values that you configured in the domain_krb.properties file, but omit the port number, as shown in the following example.
        [realms]
         RAINPOLE.LOCAL = {
          auth_to_local = RULE:[1:$0\$1](^RAINPOLE\.LOCAL\\.*)s/^RAINPOLE\.LOCAL/RAINPOLE/
          auth_to_local = RULE:[1:$0\$1](^RAINPOLE\.LOCAL\\.*)s/^RAINPOLE\.LOCAL/RAINPOLE/
          auth_to_local = RULE:[1:$0\$1](^SFO01\.RAINPOLE\.LOCAL\\.*)s/^SFO01\.RAINPOLE\.LOCAL/SFO01/
          auth_to_local = RULE:[1:$0\$1](^LAX01\.RAINPOLE\.LOCAL\\.*)s/^LAX01\.RAINPOLE\.LOCAL/LAX01/
          auth_to_local = DEFAULT
          kdc = dc51rpl.rainpole.local
        }
      7. Restart the workspace service.
        service horizon-workspace restart
      8. Repeat this procedure on the other vRealize Automation appliance vra01svr01b.rainpole.local.
    • Failback to Region A

      If dc51rpl.rainpole.local becomes unavailable in Region B during failback, repeat the steps for the failover case, using dc01rpl.rainpole.local as the domain controller instead of dc51rpl.rainpole.local, and restart the services.

    This workaround optimizes the synchronization with the Active Directory by pointing to a specific domain controller that is reachable from vRealize Automation appliance in the event of disaster recovery.
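
The two configuration entries in this workaround differ only in the domain controller and in whether the port is included. The following Python sketch shows one way to derive both lines from the same inputs. The function names and the `DOMAIN_CONTROLLERS` mapping are hypothetical helpers for illustration, and the default port 389 is taken from the example above.

```python
# Domain controller to point at for each recovery operation, as described
# in the workaround above.
DOMAIN_CONTROLLERS = {
    "failover": "dc51rpl.rainpole.local",
    "failback": "dc01rpl.rainpole.local",
}


def domain_krb_entry(domain, domain_controller, port=389):
    """Build the domain_krb.properties line: lowercase domain, host with port."""
    return f"{domain.lower()}={domain_controller}:{port}"


def krb5_kdc_entry(domain_controller):
    """Build the kdc line for the [realms] section of krb5.conf (no port)."""
    return f"kdc = {domain_controller}"
```

For example, `domain_krb_entry("rainpole.local", DOMAIN_CONTROLLERS["failover"])` yields the properties line from step 3, and `krb5_kdc_entry(DOMAIN_CONTROLLERS["failover"])` yields the kdc line from step 6.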

  • New vRealize Automation blueprint deployment fails with the error “The maximum number of transaction retry attempts, 40, was reached”

    The following exception is published to the log file.

    System.Data.Services.DataServiceException: Error requesting machine. ---> System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation. ---> System.Data.Services.DataServiceException: Error requesting machine. ---> DynamicOps.Common.RetryTransactionException: The maximum number of transaction retry attempts, 40, was reached. ---> System.Data.EntityCommandExecutionException: An error occurred while executing the command definition. See the inner exception for details. ---> System.Data.SqlClient.SqlException: Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding. ---> System.ComponentModel.Win32Exception: The wait operation timed out

    Workaround: Run the sp_updatestats stored procedure against the Microsoft SQL Server database used by vRealize Automation.

vRealize Orchestrator
  • vRealize Orchestrator nodes show a Not Responding status 

    Loss of network connectivity between vRealize Orchestrator and vRealize Log Insight might cause the vRealize Orchestrator nodes to become unresponsive. 
    The /var/log/vmware/vco/app-server/catalina.out log file reports the following messages:

    log4j:ERROR Attempted to append to closed appender named [SOCKET]. tcp: DOWN (org.productivity.java.syslog4j.SyslogRuntimeException: java.net.NoRouteToHostException: No route to host)
    INFO <134>Aug 19 15:00:00 vra01vro01b.rainpole.local vco: ae21b98e-1cff-4142-9ecc-9eb1f2f4ee52 prio:INFO

    The /var/log/vmware/vco/app-server/server.log log file reports the following messages:

    9444 2016-09-15 01:49:49.123+0000 [Shared Factory release pool] DEBUG {} [VSOFactoryClient] << Disconnecting Shared Factory !
    9445 2016-09-15 01:49:49.123+0000 [Shared Factory release pool] DEBUG {} [VSOFactoryClient] Disconnect from server
    9446 2016-09-15 01:49:35.777+0000 [http-nio-0.0.0.0-8281-exec-313] DEBUG {} [TokenAuthenticationFilter] Token Authentication Authorization header found for user 'cafe-PCZ_HVulmA@vsphere.local'
    9447 2016-09-15 01:47:14.194+0000 [Heartbeat] WARN {} [HeartBeatServiceImpl] Unable to send heartbeat signal:
    9448 org.springframework.jdbc.CannotGetJdbcConnectionException: Could not get JDBC Connection; nested exception is org.apache.tomcat.jdbc.pool.PoolExhaustedException: [Heartbeat] Timeout: Pool empty. Unable to fetch a connection in 30 seconds, none available[size:1; busy:1; idle:0; lastwait:30000].

    Workaround: After you restore the network connectivity between vRealize Orchestrator and vRealize Log Insight, restart the unresponsive vRealize Orchestrator nodes. After the nodes are running again, log in to Control Center and verify that the current status of the Orchestrator server service is Running for all nodes.

Virtual SAN
  • Virtual SAN datastores run out of disk space unusually fast due to thick provisioned OVA and OVF files

    When deploying OVA and OVF files on Virtual SAN datastores, the vSphere Web Client provides thick provisioning for virtual machine disks.
    As a result, datastores can quickly run out of disk space.

    Workaround: See VMware Knowledge Base article 2145798.

vRealize Operations Manager
  • You might not be able to access vRealize Operations Manager from the vSphere Web Client in the case of many simultaneous requests for access

    vRealize Operations Manager registers with vCenter Server by using the address of an individual node from the analytics cluster instead of the Virtual IP (VIP) address that is allocated on the load balancer. As a result, when you start the vRealize Operations Manager user interface from the vSphere Web Client, the vSphere Web Client redirects you to this node instead of to the load-balanced VIP, and access to the user interface is not balanced. The node to which the vSphere Web Client redirects you might become overloaded, and you might not be able to access vRealize Operations Manager.

    Workaround: Log in to vRealize Operations Manager directly at the VIP address https://vrops-cluster-01.rainpole.local instead of from the vSphere Web Client.

  • The dashboards in vRealize Operations Manager indicate that no data is available although data is collected

    The dashboards display "no data available" in vRealize Operations Manager. The settings of the dashboards are correct, the adapters collect metrics and the assigned license has enough capacity to collect data.

    Workaround: Edit each widget that does not show data and save it without making any changes.

  • vRealize Operations Manager health badges might be unavailable in the vSphere Web Client

    When vCenter Server is registered in vRealize Operations Manager using its fully-qualified domain name (FQDN) and not using its IP address, the health badges are missing from the Summary and Monitor tabs for the inventory objects in vSphere Web Client. You cannot see the health status of vCenter Server, clusters, ESXi hosts, virtual machines, and so on.

    Workaround: Use one of the following approaches:

    • To be able to view the health badges in the vSphere Web Client, see VMware Knowledge Base article 2145264.
    • To proceed with monitoring your environment until the health badges become available in the vSphere Web Client, log in to vRealize Operations Manager at https://vrops-cluster-01.rainpole.local.
  • After you perform a failover operation, the vRealize Operations Manager analytics cluster might fail to start because of an NTP time drift between the nodes
    • The vRealize Operations Manager user interface might report that some of the analytics nodes are not coming online with the status message Waiting for Analytics.
    • The log information on the vRealize Operations Manager master or master replica node might contain certain NTP-related details.
      • The NTP logs in the /var/log/ folder might report the following messages:
        ntpd[9764]: no reply; clock not set
        ntpd[9798]: ntpd exiting on signal 15
      • The analytics-wrapper.log file in the /storage/log/vcrops/logs/ folder might report the following message:
        INFO | jvm 1 | YYYY/MM/DD | >>> AnalyticsMain.run failed with error: IllegalStateException: time difference between servers is 37110 ms. It is greater than 30000 ms. Unable to operate, terminating...

         

    Workaround: Perform the following tasks:

    • Ensure that all NTP servers that are used by both the analytics and remote collector nodes are accessible.
    • Update the ntp.conf file with new NTP servers on each vRealize Operations Manager node if the original NTP servers are no longer available.
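
The log message above indicates that the analytics cluster refuses to start when the clocks of its nodes differ by more than 30000 ms. The following Python sketch shows how such a check could be reproduced against clock readings that you collect from the nodes yourself. The function names and the input format are assumptions for illustration and do not reflect how vRealize Operations Manager performs the check internally.

```python
MAX_DRIFT_MS = 30000  # limit quoted in the analytics-wrapper.log message above


def max_time_drift_ms(node_clocks_ms):
    """Return the largest difference, in milliseconds, between any two clocks.

    `node_clocks_ms` is a hypothetical {node name: epoch milliseconds} mapping
    sampled from all nodes at approximately the same moment.
    """
    readings = list(node_clocks_ms.values())
    return max(readings) - min(readings)


def drift_exceeds_limit(node_clocks_ms, limit_ms=MAX_DRIFT_MS):
    """Return True when the spread between node clocks is above the limit."""
    return max_time_drift_ms(node_clocks_ms) > limit_ms
```

With readings that are 37110 ms apart, as in the log excerpt, `drift_exceeds_limit` returns True, which suggests that the NTP configuration should be corrected before you restart the cluster.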

vRealize Log Insight
  • You see no logs from the vSphere Proxy Agent in vRealize Log Insight because these logs are located in a custom folder

    In VMware Validated Design 3.0, the vSphere Proxy Agent has the custom name vSphere-Agent-01. As a result, the log data is stored in the log-insight-home-dir\vSphere-Agent-01\logs folder instead of the default log-insight-home-dir\vSphere\logs folder.

    Workaround: Configure the log agent on the vSphere Proxy Agents with the correct log directory.

    1. Log in to the vRealize Log Insight instance in the region.
      | Region | vRealize Log Insight URL |
      | Region A | https://vrli-cluster-01.sfo01.rainpole.local |
      | Region B | https://vrli-cluster-51.lax01.rainpole.local |
    2. Click the Configuration drop-down menu icon and select Administration.
    3. Under Management, click Agents.
    4. From the drop-down at the top, select vRealize Automation 7 - Windows from the Available Templates section.
    5. Under File Logs, select vra-agent-vcenter and in the Directory text box enter the following value.
      | Region | Directory Value |
      | Region A | C:\Program Files (x86)\VMware\vCAC\Agents\vSphere-Agent-01\logs\ |
      | Region B | C:\Program Files (x86)\VMware\vCAC\Agents\vSphere-Agent-51\logs\ |
    6. Click Save Agent Group.
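
The Directory values in step 5 follow a single pattern: the agents base folder, the custom agent name for the region, and a logs subfolder. The following Python sketch builds the value for a region from that pattern. The constant and function names are hypothetical, and the base folder and agent names are taken from the values listed in the procedure.

```python
from pathlib import PureWindowsPath

# Base folder taken from the Directory values in step 5 above.
AGENTS_BASE = PureWindowsPath(r"C:\Program Files (x86)\VMware\vCAC\Agents")

# Custom agent names per region, as used in this VMware Validated Design.
AGENT_NAMES = {"Region A": "vSphere-Agent-01", "Region B": "vSphere-Agent-51"}


def agent_log_directory(region):
    """Return the Directory value for the vSphere Proxy Agent in a region."""
    # The trailing backslash matches the values shown in the procedure.
    return str(AGENTS_BASE / AGENT_NAMES[region] / "logs") + "\\"
```

Calling `agent_log_directory("Region B")` returns the Region B value from step 5. If your agents use different names, adjust `AGENT_NAMES` accordingly.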

vSphere Data Protection
  • The vSphere Data Protection icon is not visible in the vSphere Web Client after deployment and registration with vCenter Server

    After you deploy vSphere Data Protection and register it with vCenter Server, the VDP icon of the vSphere Data Protection plug-in is not visible on the Home page of the vSphere Web Client. The files of the vSphere Data Protection plug-in are not available in the /etc/vmware/vsphere-client/vc-packages/vsphere-client-serenity folder on the vCenter Server Appliance. As a result, the user interface of vSphere Data Protection is not accessible and you cannot back up and restore applications.

    Workaround: None.

  • The vSphere Web Client stops responding when you attempt to connect to the vSphere Data Protection appliance version 6.1.2

    An issue in vSphere Data Protection version 6.1.2 prevents the vSphere Data Protection plug-in in the vSphere Web Client from connecting to the vSphere Data Protection appliance if the appliance uses a distributed switch for networking.

    Workaround: Contact VMware Technical Support.

  • After you restore a Platform Services Controller appliance, you might not be able to enable the Bash shell or some services might not be started

    After you restore a Platform Services Controller appliance, you might encounter one or more of the following issues:

    • If you run the shell.set --enabled True command to enable the Bash shell on a restored Platform Services Controller virtual appliance, the following error might appear:

      Command> shell.set --enabled True
      Unknown command: 'shell.set'
      Command> shell
      Shell is disabled

       
    • Running the psc_restore script fails to restore the Platform Services Controller. Not all services of Platform Services Controller start, synchronization with the partner Platform Services Controller fails, and so on.

    Workaround: Restore the Platform Services Controller from another restore point in vSphere Data Protection.

  • An error message is reported during backup of the vRealize Automation VMs by using vSphere Data Protection

    During backup of the vRealize Automation VMs, the following error message appears in the vSphere Events section: 

    Failed to add disk scsi0:4.

    Regardless of the error message, the backup job completes successfully. If you restore the VMs from the backup, the process completes and validates successfully.

    Workaround: None.

  • vSphere Data Protection backup of the vRealize Log Insight VMs slows down after the job reaches 92%

    When you back up the vRealize Log Insight VMs with vSphere Data Protection, the process becomes slower after it reaches 92%. The rest of the process takes about 4 to 5 hours to complete. No errors are reported during the entire job.

    Workaround: None.

  • vSphere Data Protection restore of the vRealize Log Insight VMs slows down after the job reaches 92%

    When you restore the vRealize Log Insight VMs with vSphere Data Protection, the process becomes slower after it reaches 92%. The rest of the process takes about 4 to 5 hours to complete. No errors are reported during the entire job.

    Workaround: None.

  • Backup of a VM that requires disk consolidation fails

    If you try to back up a VM that requires disk consolidation, the job fails due to a disk lock error. vSphere Data Protection cannot release that lock and perform the backup. If you try to power on the VM or consolidate its disks, the same error occurs.

    Workaround: Perform either of the following workarounds:

    • Migrate the VM that experiences the issue to a different ESXi host and consolidate the disk manually. After successful consolidation, back up the VM.
    • Migrate the VM that experiences the issue to a different datastore and consolidate the disk manually. After successful consolidation, back up the VM.