The routing design considers the different levels of routing within the environment and derives from them a set of principles for designing a scalable routing solution.

North/south

The Provider Logical Router (PLR) handles north/south traffic to and from a tenant and the management applications inside application virtual networks.

East/west

Internal east/west routing at the layer beneath the PLR deals with the application workloads.

Routing Model Design Decisions

| Decision ID | Design Decision | Design Justification | Design Implications |
|---|---|---|---|
| SDDC-VI-SDN-017 | Deploy NSX Edge Services Gateways (ESGs) in an ECMP configuration for north/south routing in both the management cluster and the shared edge and compute cluster. | The NSX ESG is the recommended device for managing north/south traffic. Using ECMP provides multiple paths in and out of the SDDC. This results in faster failover times than deploying ESGs in HA mode. | ECMP requires two VLANs for uplinks, which is one more VLAN than a traditional HA ESG configuration needs. |
| SDDC-VI-SDN-018 | Deploy a single NSX UDLR for the management cluster to provide east/west routing across all regions. | Using the UDLR reduces the hop count between nodes attached to it to 1. This reduces latency and improves performance. | UDLRs are limited to 1,000 logical interfaces. When that limit is reached, a new UDLR must be deployed. |
| SDDC-VI-SDN-019 | Deploy a single NSX UDLR for the shared edge and compute cluster and the compute clusters to provide east/west routing across all regions for workloads that require mobility across regions. | Using the UDLR reduces the hop count between nodes attached to it to 1. This reduces latency and improves performance. | UDLRs are limited to 1,000 logical interfaces. When that limit is reached, a new UDLR must be deployed. |
| SDDC-VI-SDN-020 | Deploy a DLR for the shared edge and compute cluster and the compute clusters to provide east/west routing for workloads that require on-demand network objects from vRealize Automation. | Using the DLR reduces the hop count between nodes attached to it to 1. This reduces latency and improves performance. | DLRs are limited to 1,000 logical interfaces. When that limit is reached, a new DLR must be deployed. |
| SDDC-VI-SDN-021 | Deploy all NSX UDLRs without the local egress option enabled. | When local egress is enabled, ingress traffic must also be controlled (for example, by using NAT). This becomes hard to manage for little to no benefit. | All north/south traffic is routed through Region A until those routes are no longer available. At that time, all traffic dynamically switches to Region B. |
| SDDC-VI-SDN-022 | Use BGP as the dynamic routing protocol inside the SDDC. | Using BGP as opposed to OSPF eases the implementation of dynamic routing. There is no need to plan and design access to OSPF area 0 inside the SDDC. OSPF area 0 varies based on customer configuration. | BGP requires configuring each ESG and UDLR with the remote router that it exchanges routes with. |
| SDDC-VI-SDN-023 | Configure the BGP Keep Alive Timer to 1 second and the Hold Down Timer to 3 seconds between the UDLR and all ESGs that provide north/south routing. | With the Keep Alive and Hold Down Timers between the UDLR and the ECMP ESGs set low, a failure is detected more quickly and the routing table is updated faster. | If an ESXi host becomes resource constrained, the ESG running on that host might no longer be used even though it is still up. |
| SDDC-VI-SDN-024 | Configure the BGP Keep Alive Timer to 4 seconds and the Hold Down Timer to 12 seconds between the ToR switches and all ESGs that provide north/south routing. | This provides a good balance between detecting failures between the ToRs and the ESGs and overburdening the ToRs with keep alive traffic. | Because longer timers are used to detect when a router is dead, a dead router stays in the routing table longer and traffic continues to be sent to it. |
| SDDC-VI-SDN-025 | Create one or more static routes on ECMP-enabled edges for subnets behind the UDLR and DLR, with a higher admin cost than the dynamically learned routes. | When the UDLR or DLR control VM fails over, router adjacency is lost and routes from upstream devices such as ToRs to subnets behind the UDLR are lost. | This requires each ECMP edge device to be configured with static routes to the UDLR or DLR. If any new subnets are added behind the UDLR or DLR, the routes must be updated on the ECMP edges. |
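The fallback behavior behind SDDC-VI-SDN-025 can be illustrated with a small route-selection model. This is only a sketch, not NSX code: the `Route` class, the `best_route` helper, the addresses, and the admin cost values are all hypothetical, chosen to show that a static route with a higher admin cost stays unused until the dynamically learned route disappears.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Route:
    prefix: str       # destination subnet
    next_hop: str     # hypothetical next-hop address
    source: str       # "bgp" or "static"
    admin_cost: int   # lower value wins when prefixes are equal


def best_route(routes, destination):
    """Return the preferred route for destination, or None if nothing matches."""
    dest = ip_address(destination)
    candidates = [r for r in routes if dest in ip_network(r.prefix)]
    if not candidates:
        return None
    # Longest prefix first, then lowest admin cost.
    return min(
        candidates,
        key=lambda r: (-ip_network(r.prefix).prefixlen, r.admin_cost),
    )


# Illustrative values only: a subnet behind the UDLR, learned over BGP and
# also configured statically with a deliberately higher admin cost.
bgp_route = Route("192.168.11.0/24", "192.168.10.3", "bgp", 20)
static_route = Route("192.168.11.0/24", "192.168.10.3", "static", 240)

table = [bgp_route, static_route]
normal = best_route(table, "192.168.11.50")

# Control VM failover: the adjacency drops and the learned route is withdrawn.
table.remove(bgp_route)
during_failover = best_route(table, "192.168.11.50")

print(normal.source, during_failover.source)  # bgp static
```

In this model the static route never competes with BGP during normal operation, which is why the design asks for a higher admin cost rather than an equal one.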

Dedicated networks are needed to carry traffic between the universal distributed logical routers (UDLRs) and the edge gateways, and between the edge gateways and the top of rack (ToR) switches. These networks are used for exchanging routing tables and for carrying transit traffic.
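The BGP sessions that exchange routes over these networks use the timer values from SDDC-VI-SDN-023 and SDDC-VI-SDN-024. The following sketch compares the two settings. It assumes the timers are expressed in seconds, as is standard for BGP, and that a silent peer is declared down when the hold timer expires. The `BgpTimers` class and its method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BgpTimers:
    keepalive_s: int  # seconds between keepalive messages (assumed unit)
    hold_s: int       # seconds of silence before the peer is declared down

    def missed_keepalives_tolerated(self) -> int:
        # Number of keepalive intervals that fit into one hold interval.
        return self.hold_s // self.keepalive_s

    def keepalives_per_minute(self) -> float:
        # Keepalive messages each side sends per session per minute.
        return 60 / self.keepalive_s


udlr_to_esg = BgpTimers(keepalive_s=1, hold_s=3)   # SDDC-VI-SDN-023
tor_to_esg = BgpTimers(keepalive_s=4, hold_s=12)   # SDDC-VI-SDN-024

for name, timers in (("UDLR-ESG", udlr_to_esg), ("ToR-ESG", tor_to_esg)):
    print(
        name,
        timers.hold_s,
        timers.missed_keepalives_tolerated(),
        timers.keepalives_per_minute(),
    )
# UDLR-ESG 3 3 60.0
# ToR-ESG 12 3 15.0
```

Both settings keep the same 1:3 ratio. The ToR-facing sessions send a quarter of the keepalive traffic, at the cost of keeping a failed peer in the routing table for up to 12 seconds instead of 3.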

Transit Network Design Decisions

| Decision ID | Design Decision | Design Justification | Design Implications |
|---|---|---|---|
| SDDC-VI-SDN-026 | Create a universal virtual switch for use as the transit network between the UDLR and ESGs. The UDLR provides east/west routing in both the compute and management stacks, while the ESGs provide north/south routing. | The universal virtual switch allows the UDLR and all ESGs across regions to exchange routing information. | Only the primary NSX Manager can create and manage universal objects, including this UDLR. |
| SDDC-VI-SDN-027 | Create a global virtual switch in each region for use as the transit network between the DLR and ESGs. The DLR provides east/west routing in the compute stack, while the ESGs provide north/south routing. | The global virtual switch allows the DLR and ESGs in each region to exchange routing information. | A global virtual switch for use as a transit network is required in each region. |
| SDDC-VI-SDN-028 | Create two VLANs in each region. Use those VLANs to enable ECMP between the north/south ESGs and the ToR switches. Each ToR has an SVI on one of the two VLANs, and each north/south ESG has an interface on each VLAN. | This enables the ESGs to have multiple equal-cost routes and provides more resiliency and better bandwidth utilization in the network. | Extra VLANs are required. |
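To show how multiple equal-cost routes can improve both resiliency and bandwidth utilization, the following sketch maps flows onto next hops with a hash. It is a generic illustration of hash-based ECMP path selection. The source does not state which hashing scheme the ESGs or ToR switches use, so the `pick_next_hop` function, the SHA-256 hash, and all addresses are assumptions.

```python
import hashlib


def pick_next_hop(flow, next_hops):
    """Map a flow 5-tuple onto one of the equal-cost next hops."""
    if not next_hops:
        raise ValueError("no next hops available")
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]


# Hypothetical ToR SVI addresses, one on each of the two uplink VLANs.
tor_next_hops = ["172.27.11.1", "172.27.12.1"]

# Source IP, destination IP, protocol, source port, destination port.
flow = ("192.168.11.50", "10.0.0.5", 6, 49152, 443)

chosen = pick_next_hop(flow, tor_next_hops)

# The same flow always maps to the same path, which keeps its packets in order.
same_again = pick_next_hop(flow, tor_next_hops)

# If the chosen path is withdrawn, the flow moves to a surviving path.
remaining = [hop for hop in tor_next_hops if hop != chosen]
survivor = pick_next_hop(flow, remaining)

print(chosen, same_again, survivor)
```

Different flows hash to different paths, so both uplink VLANs carry traffic at the same time, and losing one path only shrinks the set of candidates.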