High Availability Deployment
This topic covers the high availability (HA) deployments and configurations supported by SD-WAN appliances (Standard Edition and Enterprise Edition).
SD-WAN appliances can be deployed in a high availability configuration as a pair of appliances in Active/Standby roles. There are three modes of high availability deployment:
- Parallel Inline high availability
- Fail-to-Wire high availability
- One-Arm high availability
These high availability deployment modes are similar to Virtual Router Redundancy Protocol (VRRP) and use a proprietary SD-WAN protocol. Both Client Nodes (Clients) and Master Control Nodes (MCNs) within an SD-WAN network can be deployed in a high availability configuration as long as the selected SD-WAN platform model supports high availability.
In high availability configuration, one SD-WAN appliance at the site is designated as the Active appliance and is monitored by the Standby appliance. Configuration is mirrored across both appliances. When the Standby appliance loses connectivity with the Active appliance for a defined period, the Standby appliance assumes the identity of the Active appliance and takes over the traffic load. Depending on the deployment mode, this fast failover has minimal impact on the application traffic passing through the network.
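The takeover rule described above can be sketched as a simple timer. This is only an illustrative model, not the proprietary SD-WAN protocol: the class name `StandbyMonitor`, the idea of a periodic heartbeat, and the millisecond timestamps are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class StandbyMonitor:
    """Hypothetical model of a Standby appliance watching the Active one."""

    failover_time_ms: int      # how long the Active may stay silent
    last_heard_ms: int = 0     # time of the last message from the Active
    role: str = "standby"

    def on_heartbeat(self, now_ms: int) -> None:
        """Record that the Active appliance was reachable at now_ms."""
        self.last_heard_ms = now_ms

    def tick(self, now_ms: int) -> str:
        """Promote the Standby once the silence reaches the failover time."""
        silent_for = now_ms - self.last_heard_ms
        if self.role == "standby" and silent_for >= self.failover_time_ms:
            self.role = "active"
        return self.role


monitor = StandbyMonitor(failover_time_ms=1000)
monitor.on_heartbeat(now_ms=5000)
print(monitor.tick(now_ms=5400))  # standby: silence is shorter than the failover time
print(monitor.tick(now_ms=6000))  # active: silence has reached the failover time
```

A shorter failover time reduces the outage after a real failure, but in this model it also makes the Standby more likely to take over after a brief loss of connectivity.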
High availability deployment modes
One-Arm high availability mode:
In One-Arm mode, the high availability appliance pair is outside of the data path. Application traffic is redirected to the appliance pair with Policy Based Routing (PBR). One-Arm mode is implemented when a single insertion point in the network is not feasible, or to counter the challenges of fail-to-wire. In the following illustration, the Standby appliance can be added to the same VLAN or subnet as the Active appliance and the router.
In One-Arm mode, it is recommended that the SD-WAN appliances do not reside in the data network subnets. This way, the virtual path traffic does not have to traverse the PBR, which avoids route loops. The SD-WAN appliance and the router must be directly connected, either through an Ethernet port or by being in the same VLAN.
IP SLA monitoring for fallback:
Active traffic continues to flow even if the virtual path is down, as long as one of the SD-WAN appliances is active. The SD-WAN appliance redirects traffic back to the router as Intranet traffic. However, if both the Active and Standby SD-WAN appliances become inactive, the router still tries to redirect traffic to the appliances. IP SLA monitoring can be configured at the router to disable PBR if the next-hop appliance is not reachable. This allows the router to fall back to a normal route lookup and forward packets appropriately.
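The fallback decision can be sketched as follows. This is a simplified model of the router's behavior, not router configuration: the function name `select_next_hop`, the boolean probe result, and the documentation-range addresses are assumptions made for the example.

```python
import ipaddress


def select_next_hop(dst_ip, pbr_next_hop, appliance_reachable, routing_table):
    """Return the next hop for a packet destined to dst_ip.

    pbr_next_hop: address of the SD-WAN appliance set by the PBR rule.
    appliance_reachable: result of an IP SLA style reachability probe.
    routing_table: list of (prefix, next_hop) pairs used for fallback.
    """
    if appliance_reachable:
        # PBR stays in effect: redirect to the appliance pair.
        return pbr_next_hop
    # PBR is disabled: fall back to a longest-prefix-match route lookup.
    addr = ipaddress.ip_address(dst_ip)
    best = None
    for prefix, next_hop in routing_table:
        network = ipaddress.ip_network(prefix)
        if addr in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, next_hop)
    return best[1] if best else None


ROUTES = [("0.0.0.0/0", "203.0.113.1"), ("198.51.100.0/24", "203.0.113.2")]
print(select_next_hop("198.51.100.7", "192.0.2.10", True, ROUTES))   # 192.0.2.10
print(select_next_hop("198.51.100.7", "192.0.2.10", False, ROUTES))  # 203.0.113.2
```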
Parallel Inline high availability mode:
In Parallel Inline high availability mode, the SD-WAN appliances are deployed alongside each other, inline with the data path. Only one path, through the Active appliance, is used. It is important that the bypass interface groups are configured to be fail-to-block and not fail-to-wire, so that you do not get bridging loops during a failover.
The high availability state can be monitored through the inline interface groups, or through a direct connection between the appliances. External Tracking can be used to monitor the reachability of the upstream or downstream network infrastructure (for example, a switch port failure) to direct a high availability state change, if needed.
If both the Active and Standby SD-WAN appliances are disabled or fail, a tertiary path can be used directly between the switch and the router. This path must have a higher spanning tree cost than the SD-WAN paths so that it is not used under normal conditions. Failover in Parallel Inline high availability mode is quick and nearly hitless, because no physical state change occurs. Fallback to the tertiary path is not hitless and can block traffic for 5-30 seconds, depending on the spanning tree configuration. If there are out-of-path connections to other WAN links, both appliances must be connected to them.
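The role of the higher spanning tree cost can be illustrated with a small sketch. It reduces spanning tree to "pick the lowest-cost path that is currently forwarding", which is a deliberate simplification; the function name `forwarding_path`, the path names, and the cost values are hypothetical.

```python
def forwarding_path(paths):
    """Return the name of the lowest-cost path that is forwarding.

    paths maps a path name to a (cost, is_forwarding) tuple.
    """
    candidates = [(cost, name) for name, (cost, up) in paths.items() if up]
    return min(candidates)[1] if candidates else None


# Normal operation: the Standby is fail-to-block, so only the Active path
# and the higher-cost tertiary path are candidates.
normal = {"active_sdwan": (4, True), "standby_sdwan": (4, False), "tertiary": (100, True)}
# Both appliances have failed: only the tertiary path remains.
both_down = {"active_sdwan": (4, False), "standby_sdwan": (4, False), "tertiary": (100, True)}

print(forwarding_path(normal))     # active_sdwan
print(forwarding_path(both_down))  # tertiary
```

If the tertiary path had a cost equal to or lower than the SD-WAN paths, it could be chosen under normal conditions and traffic would bypass the appliances.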
In more complex scenarios, where multiple routers might be using VRRP, non-routable VLANs are recommended to ensure the LAN side switch and routers are reachable at layer 2.
Fail-to-Wire high availability mode:
In Fail-to-Wire mode, the SD-WAN appliances are inline in the same data path. The bypass interface groups must be in fail-to-wire mode, with the Standby appliance in a passthrough or bypass state. A direct connection between the two appliances on a separate port must be configured and used for the high availability interface group.
- High availability switchover in Fail-to-Wire mode takes longer, approximately 10–12 seconds, because of the delay while ports recover from the Fail-to-Wire state.
- When the high availability connection between the appliances fails, both appliances go into Active state and cause a service interruption. This can be mitigated by assigning multiple high availability connections so that there is no single point of failure.
- In Fail-to-Wire high availability mode, it is imperative that a separate port on each appliance in the hardware pair is used for the high availability control exchange mechanism, to help with state convergence.
- Because a physical state change occurs when the SD-WAN appliances switch over from Active to Standby, failover can cause partial loss of connectivity, depending on how long auto-negotiation takes on the Ethernet ports.
- It is recommended that Fail-to-Wire mode is not used on ports that are auto-negotiated, because auto-negotiation increases failover time.
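The benefit of multiple high availability connections, mentioned in the list above, can be sketched as follows. The model assumes that the Standby only takes over when none of its high availability connections can reach the Active appliance; the function name and the list-of-booleans input are hypothetical.

```python
def standby_takes_over(links_reaching_active):
    """Return True if the Standby should become Active.

    links_reaching_active holds one boolean per high availability
    connection: True if the Active appliance is reachable over that link.
    """
    return not any(links_reaching_active)


# One HA connection, and it fails while the Active is still healthy:
# the Standby also becomes Active, so both appliances are Active.
print(standby_takes_over([False]))        # True

# Two HA connections, one fails: the Standby still reaches the Active
# over the second link and stays in the Standby role.
print(standby_takes_over([False, True]))  # False
```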
The following illustration shows an example of the Fail-to-Wire deployment.
The One-Arm high availability configuration or the Parallel Inline high availability configuration is recommended for data centers or sites that forward a high volume of traffic, to minimize disruption during failover.
If minimal loss of service is acceptable during a failover, then Fail-to-Wire high availability mode is a better solution. Fail-to-Wire high availability mode protects against appliance failure, and Parallel Inline high availability mode protects against all failures. In all scenarios, high availability is valuable for preserving the continuity of the SD-WAN network during a system failure.
Configuring high availability
To configure high availability:
In the Configuration Editor, navigate to Sites > [site name] > High Availability. Select Enable High Availability.
Type values for the following parameters:
- High availability Appliance Name: This is the name of the high availability (secondary) appliance.
- Failover Time: This specifies the wait time (in milliseconds) after contact with the primary appliance is lost, before the standby appliance becomes active.
- Shared Base MAC: This is the shared MAC address for the high availability pair appliances. If a failover occurs, the secondary appliance has the same virtual MAC addresses as the failed primary appliance.
- Swap Primary/Secondary: When this is selected, if both appliances in the high availability pair come up simultaneously, the secondary appliance becomes the primary appliance, and takes precedence.
- Primary Reclaim: If this is selected, the designated primary appliance reclaims control upon restart after a failover event.
- HA Fail-to-Wire Mode: Choose this for Fail-to-wire high availability deployment mode.
For hypervisor and cloud-based platforms, an extra parameter, Disable Shared Base MAC, is available. Choose this option to disable the shared virtual MAC address.
For hypervisor-based platforms, ensure that promiscuous mode is enabled on the hypervisors to allow packet sourcing from the high availability shared MAC address. When promiscuous mode is not enabled, you can select the Disable Shared Base MAC option.
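The effect of the Swap Primary/Secondary and Primary Reclaim options described above can be sketched as two small decision functions. This is an illustrative reading of the parameter descriptions only: the `HaSettings` class and both function names are hypothetical, and the sketch does not cover how the two options interact.

```python
from dataclasses import dataclass


@dataclass
class HaSettings:
    """Hypothetical container for a few of the HA parameters."""

    failover_time_ms: int
    swap_primary_secondary: bool = False
    primary_reclaim: bool = False


def active_on_simultaneous_start(settings: HaSettings) -> str:
    """Which appliance takes precedence when both come up together."""
    return "secondary" if settings.swap_primary_secondary else "primary"


def active_after_primary_restart(settings: HaSettings) -> str:
    """Which appliance is active once the failed primary has restarted.

    Assumes a failover has already occurred, so the secondary is active
    when the primary comes back.
    """
    return "primary" if settings.primary_reclaim else "secondary"


defaults = HaSettings(failover_time_ms=1000)
reclaim = HaSettings(failover_time_ms=1000, primary_reclaim=True)
print(active_on_simultaneous_start(defaults))  # primary
print(active_after_primary_restart(defaults))  # secondary
print(active_after_primary_restart(reclaim))   # primary
```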
Click + next to HA IP Interfaces to configure interface groups. Enter values for the following parameters:
- Virtual Interface – This is the Virtual Interface to be used for communication between the appliances in the high availability pair. This interface monitors the Active appliance for reachability. For One-Arm high availability mode, only one interface group is required.
- Primary – This is the unique Virtual IP address for the primary appliance. The secondary appliance uses this for communication with the primary appliance.
- Secondary – This is the unique Virtual IP address for the secondary appliance. The primary appliance uses this for communication with the secondary appliance.
For Inline high availability mode, extra interface groups are required for External Tracking to monitor the upstream or downstream network infrastructure (for example, a switch port failure) and to detect when a high availability state change is required.
Click + to the left of the new HA IP Interfaces entry. In the External Tracking IP Address field, enter the IP Address of the external device that responds to ARP requests to determine the state of the primary appliance.
To monitor high availability configuration:
Log in to the SD-WAN web management interface of the Active and Standby appliances for which high availability is implemented. View the high availability status under the Dashboard tab.
For Network Adapter details of Active and Standby high availability appliances, navigate to Configuration > Appliance Settings > Network Adapters > Ethernet tab.