WCCP Clustering

The WCCP clustering feature enables you to multiply your acceleration capacity by assigning more than one SD-WAN appliance to the same links. You can cluster up to 32 identical appliances, for up to 32 times the capacity. Because it uses the WCCP 2.0 standard, WCCP clustering works on most routers and some smart switches, most likely including those you are already using.

Because it uses a decentralized protocol, WCCP clustering allows SD-WAN appliances to be added or removed at will. If an appliance fails, its traffic is rerouted to the surviving appliances.

Unlike SD-WAN high availability, in which an active/passive pair of appliances provides the performance of a single appliance, the same two appliances deployed as a WCCP cluster have twice the performance of a single appliance, delivering both redundancy and improved performance.

In addition to adding more appliances as your site’s needs increase, you can use Citrix’s “Pay as You Grow” feature to increase your appliances’ capabilities through license upgrades.

Citrix Command Center is recommended for managing WCCP clusters. The following figure shows a basic network of a cluster of SD-WAN appliances in WCCP mode, administered by using Citrix Command Center.

Figure 1. SD-WAN Cluster Administered by Using Citrix Command Center

Load-Balanced WCCP Clusters

The WCCP protocol supports up to 32 appliances in a fault-tolerant, load-balanced array called a cluster. In the example below, three identical appliances (same model, same software version) are cabled identically and configured identically except for their IP addresses. Appliances using the same service groups with the same router can become a load-balanced WCCP cluster. When a new appliance registers itself with the router, it can join the existing pool of appliances and receive its share of traffic. If an appliance leaves the network (as indicated by the absence of heartbeat signals), the cluster is rebalanced so that only the remaining appliances are used.

Figure 2. A load-balanced WCCP cluster with three appliances

One appliance in the cluster is selected as the designated cache, and controls the load-balancing behavior of the appliances in the cluster. The designated cache is the appliance with the lowest IP address. Because the appliances have identical configurations, it doesn’t matter which one is the designated cache. If the current designated cache goes offline, a different appliance becomes the designated cache.

The designated cache determines how the load-balanced traffic is allocated and informs the router of these decisions. The router shares information with all members of the cluster, so the cluster can operate even if the designated cache goes offline.
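The selection rule described above can be sketched in a few lines. This is an illustration only, not the appliance's implementation: the function name `designated_cache` and the sample addresses are hypothetical, and it assumes "lowest IP address" means lowest by numeric value rather than by string order.

```python
import ipaddress

def designated_cache(cache_ips):
    """Return the cluster member with the numerically lowest IPv4 address."""
    # Compare as integers, not strings: as text, "10.0.0.10" sorts before "10.0.0.9".
    return str(min(ipaddress.IPv4Address(ip) for ip in cache_ips))

cluster = ["10.0.0.10", "10.0.0.9", "10.0.0.12"]  # hypothetical appliance addresses
print(designated_cache(cluster))

# If the current designated cache goes offline, the same rule picks a new one.
cluster.remove("10.0.0.9")
print(designated_cache(cluster))
```

Because every member sees the same membership list, each one can apply the rule and reach the same answer without a separate election.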

Note: As normally configured, an SD-WAN 4000/5000 appliance appears as two WCCP caches to the router.

Load-Balancing Algorithm

Load balancing in WCCP is static, except when an appliance enters or leaves the cluster, which causes the cluster to be rebalanced among its current members.

The WCCP standard supports load balancing based on a mask or a hash. SD-WAN WCCP clustering uses the mask method only, with a mask of 1-6 bits of the 32-bit IP address. These address bits can be non-consecutive. All addresses that yield the same result when masked are sent to the same appliance. Load-balancing effectiveness depends on choosing an appropriate mask value: a poor mask choice can result in poor load balancing or even none, with all traffic sent to a single appliance.
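To make the effect of the mask concrete, the following minimal sketch maps an address to a bucket number by collecting the address bits selected by the mask. The function name `bucket_index` and the sample addresses are hypothetical, and how buckets are then allocated to appliances is decided by the designated cache and is not modeled here.

```python
import ipaddress

def bucket_index(address: str, mask: int) -> int:
    """Pack the address bits selected by the mask into a small bucket number."""
    value = int(ipaddress.IPv4Address(address))
    index = 0
    out_bit = 0
    for bit in range(32):               # walk from least- to most-significant bit
        if mask & (1 << bit):           # this bit is part of the mask
            if value & (1 << bit):
                index |= 1 << out_bit
            out_bit += 1
    return index

# Four consecutive mask bits in the third octet: hosts in one /24 share a bucket.
print(bucket_index("10.1.5.20", 0x00000F00), bucket_index("10.1.5.200", 0x00000F00))

# A poor mask: the top four bits are identical for every 10.x.x.x address,
# so all such traffic falls into a single bucket.
print(bucket_index("10.1.5.20", 0xF0000000), bucket_index("10.200.7.1", 0xF0000000))
```

The second call pair shows the failure mode mentioned above: when the masked bits do not vary across your traffic, every flow lands in the same bucket and therefore on the same appliance.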

Deployment Topology

Depending on your network topology, you can deploy a WCCP cluster with either a single router or multiple routers. In both cases, each appliance in the cluster must be connected identically to all routers in use.

Single Router Deployment

In the following diagram, three SD-WAN appliances accelerate the datacenter’s 200 Mbps WAN. The site supports 750 XenApp users.

[Diagram: single-router deployment with three SD-WAN appliances]

As shown on the SD-WAN Datasheet, an SD-WAN 3000-100 can support 100 Mbps and 400 users, so a pair of these appliances supports 200 Mbps and 800 users, which satisfies the datacenter’s requirements of a 200 Mbps link and 750 users.

For fault tolerance, however, the WCCP cluster should continue to operate without becoming overloaded if one appliance fails. That can be accomplished by using three appliances when the calculations call for two. This is called the N+1 rule.

Failure is an unusual event, so usually all three appliances are in operation. In this case, each appliance is supporting only 67 Mbps and 250 users, leaving plenty of headroom, and making good use of the fact that the cluster has three times the CPU power and three times the compression history of a single appliance.
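The arithmetic behind this example can be checked with a short sketch. It assumes the load is split evenly across the active appliances, which is an idealization; the function name `per_appliance_load` is hypothetical, and the specification figures are the ones quoted above for the SD-WAN 3000-100.

```python
SPEC_MBPS, SPEC_USERS = 100, 400   # per-appliance figures quoted above

def per_appliance_load(link_mbps, users, active):
    """Return (Mbps, users) carried by each appliance, assuming an even split."""
    return link_mbps / active, users / active

for active in (3, 2):              # normal operation, then one appliance failed
    mbps, users = per_appliance_load(200, 750, active)
    within_spec = mbps <= SPEC_MBPS and users <= SPEC_USERS
    print(f"{active} active: {mbps:.0f} Mbps and {users:.0f} users each, "
          f"within spec: {within_spec}")
```

With one appliance down, the two survivors each carry 100 Mbps and 375 users, which is still within the per-appliance specification. That is the point of the N+1 rule.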

Without WCCP clustering, the same capacity and fault tolerance would require a pair of SD-WAN 4000-500 appliances in high availability mode. Only one of these appliances is active at a time.

Multiple Router Deployments

Using multiple WAN routers is similar to using a single WAN router. If the previous example is changed to include two 100 Mbps links instead of one 200 Mbps link, the topology changes, but the calculations do not.

[Diagram: multiple-router deployment with two 100 Mbps WAN links]

Limitations

Configuring appliances in a WCCP cluster has the following limitations:

  • All appliances within a cluster must be the same model and use the same software release.
  • Parameter synchronization between appliances within the cluster is not automatic. Use Command Center to manage the appliances as a group.
  • SD-WAN traffic shaping is not effective, because it relies on controlling the entire link as a unit, and none of the appliances are in a position to do this. Router QoS can be used instead.
  • The WCCP-based load-balancing algorithms do not vary dynamically with load, so achieving a good load balance can require some tuning.
  • The hash method of cache assignment is not supported. Mask assignment is the supported method.
  • While the WCCP standard allows mask lengths of 1-7 bits, the appliance supports masks of 1-6 bits.
  • Multicast service groups are not supported. Only unicast service groups are supported.
  • All routers using the same service group pair must support the same forwarding method (GRE or L2).
  • The forwarding and return method negotiated with the router must match: both must be GRE or both must be L2. Some routers do not support L2 in both directions, resulting in an error of “Router’s forward or return or assignment capability mismatch.” In this case, the service group must be configured as GRE.
  • SD-WAN VPX does not support WCCP clustering.
  • The appliance supports (and negotiates) only unweighted (equal) cache assignments. Weighted assignments are not supported.
  • Some older appliances, such as the SD-WAN 700, do not support WCCP clustering.
  • (SD-WAN 4000/5000 only) Two accelerator instances are required per interface in L2 mode. No more than three interfaces are supported per appliance (and then only on appliances with six or more accelerator instances).
  • (SD-WAN 4000/5000 only) WCCP control packets from the router must match one of the router IP addresses configured on the appliance for the service group. In practice, the router’s IP address for the interface that connects it to the appliance should be used. The router’s loopback IP cannot be used.

Deployment worksheet and cluster limitations

On the following worksheet, you can calculate the number of appliances needed for your installation and the recommended mask field size. The recommended mask size is 1–2 bits larger than the minimum mask size for your installation.

     
| Parameter | Value | Notes |
|---|---|---|
| Appliance Model Used | | |
| Supported XenApp and XenDesktop Users Per Appliance | Uspec = | From data sheet |
| XenApp and XenDesktop Users on WAN Link | Uwan = | |
| User Overload Factor | Uoverload = Uwan/Uspec = | |
| Supported BW Per Appliance | BWspec = | From data sheet |
| WAN Link BW | BWwan = | |
| BW Overload Factor | BWoverload = BWwan/BWspec = | |
| Number of appliances required | N = max(Uoverload, BWoverload) + 1 = | Includes one spare |
| Min number of buckets | Bmin = N, rounded up to a power of 2 = | |
| Min number of buckets (SD-WAN 4000 or 5000) | Bmin = 2N, rounded up to a power of 2 = | |
| Recommended value | B = 4 * Bmin if Bmin <= 16, else 2 * Bmin = | |
| Number of “one” bits in address mask | M = log2(B) = | If B=16, M=4. |
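The worksheet calculation can be sketched as a small function. This is an illustration under stated assumptions, not a sizing tool: the function and parameter names are hypothetical, and rounding the larger overload factor up to a whole appliance before adding the spare is an assumption, since the worksheet does not state the rounding explicitly.

```python
import math

def next_power_of_two(n: int) -> int:
    """Smallest power of 2 that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length() if n > 1 else 1

def worksheet(u_spec, u_wan, bw_spec, bw_wan, two_caches_per_appliance=False):
    """Follow the worksheet rows: overload factors, N, Bmin, B, and M."""
    u_overload = u_wan / u_spec
    bw_overload = bw_wan / bw_spec
    # Assumption: round up to whole appliances, then add one spare (N+1 rule).
    n = math.ceil(max(u_overload, bw_overload)) + 1
    # SD-WAN 4000/5000 appliances appear as two caches, so the worksheet uses 2N.
    b_min = next_power_of_two(2 * n if two_caches_per_appliance else n)
    b = 4 * b_min if b_min <= 16 else 2 * b_min
    m = b.bit_length() - 1          # exact log2 for a power of 2
    return {"N": n, "Bmin": b_min, "B": b, "M": m}

# Values from the single-router example: 400 users and 100 Mbps per appliance,
# 750 users on a 200 Mbps link.
print(worksheet(u_spec=400, u_wan=750, bw_spec=100, bw_wan=200))
```

For the single-router example, this yields three appliances, 16 buckets, and a 4-bit mask, which agrees with the worksheet note that B=16 gives M=4.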

Mask value: The mask value is a 32-bit address mask in which the number of “one” bits equals M from the worksheet above. Often these bits can be the least-significant bits in the WAN subnet mask used by your remote sites. If the masks at your remote sites vary, use the median mask. (Example: With /24 subnets, the least-significant bits of the subnet are 0x00 00 nn 00. The number of bits to set to one is log2(B), where B is the number of buckets: if B is 16, set 4 bits to one. So with 16 buckets and a /24 subnet, set the mask value to 0x00 00 0f 00.)

The above guidelines work only if the selected subnet field is evenly distributed in your traffic, that is, if each address bit selected by the mask is a one for half the remote hosts and a zero for the other half. Otherwise, load balancing is impaired. This even distribution might hold for only a few bits in the network field (for example, only 2 bits). If that is the case in your network, instead of masking bits in the offending area of the subnet field, displace those bits to a portion of the host address field that has the 50/50 property. For example, if only three subnet bits in a /24 subnet have the 50/50 property, and you are using four mask bits, a mask of 0x00 00 07 10 avoids the offending bit at 0x00 00 08 00 and displaces it to 0x00 00 00 10, a portion of the address field that is likely to have the 50/50 property if your remote subnets generally use at least 32 IP addresses each.
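One way to check the 50/50 property before committing to a mask is to measure, for each mask bit, the fraction of remote hosts that have that bit set. The following sketch assumes you can export a representative list of remote host addresses; the function name `bit_balance` and the sample addresses are hypothetical.

```python
import ipaddress

def bit_balance(addresses, mask):
    """Map each mask bit to the fraction of addresses that have that bit set."""
    values = [int(ipaddress.IPv4Address(a)) for a in addresses]
    balance = {}
    for bit in range(32):
        if mask & (1 << bit):
            ones = sum(1 for v in values if v & (1 << bit))
            balance[1 << bit] = ones / len(values)
    return balance

# Hypothetical sample of remote hosts spread over four /24 subnets.
hosts = ["10.0.0.5", "10.0.1.20", "10.0.2.9", "10.0.3.30"]

# Candidate mask 0x00000F00: the two upper subnet bits never vary in this sample.
for bit, fraction in bit_balance(hosts, 0x00000F00).items():
    print(f"0x{bit:08x}: {fraction:.2f}")

# Displacing the unused bits into the host field restores the balance.
for bit, fraction in bit_balance(hosts, 0x00000310).items():
    print(f"0x{bit:08x}: {fraction:.2f}")
```

Bits whose fraction is far from 0.5 contribute little to load balancing and are candidates for displacement into the host field, as described above.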

     
| Parameter | Value | Notes |
|---|---|---|
| Final Mask Value | | |
| Accelerated Bridge | | Usually apA |
| WAN Service Group | | A service group not already in use on your router (51-255) |
| LAN Service Group | | Another unused service group |
| Router IP address | | IP address of router interface on port facing the appliance |
| WCCP Protocol (usually “Auto”) | | |
| DC Algorithm | | Use “Deterministic” if you have only two appliances or are using dynamic load balancing like HSRP or GSLB. Otherwise, use “Least Disruptive.” |

Testing and Troubleshooting

The Monitoring > Appliance > Application Performance > WCCP page shows the current state not only of the local appliance but also of all other appliances that have joined the cluster. Select a WCCP cache and click Get Info.

The Cache Status tab shows the local appliance’s status. When all is well, the status is “25: has assignment.” You must refresh the page manually to monitor changes in status. If the appliance does not reach the status of “25: has assignment” within a timeout period, other informative status messages are displayed.

Additional information is displayed when you click on the Service Group or the Routers tabs.

The Cluster Summary tab displays information about the WCCP cluster as a whole. As a side effect of the WCCP protocol, each member of the cluster has information about all the others, so this information can be monitored from any appliance in the cluster.

Your router can also provide status information. See your router documentation.

Configure WCCP Clustering

After you have finalized the deployment topology, considered all limitations, and filled in the deployment worksheet, you are ready to deploy your appliances in a WCCP cluster. To configure the WCCP cluster, you need to perform the following tasks: