Product Documentation

Limiting Failovers Caused by Route Monitors in non-INC mode

Sep 01, 2016

In an HA configuration in non-INC mode, if route monitors fail on both nodes, failover happens every 180 seconds until one of the nodes is able to reach all of the routes monitored by the respective route monitors.

However, you can limit the number of failovers that a node undergoes in a given interval by setting the Maximum Number of Flips and Maximum Flip Time parameters on the node. When either limit is reached, no more failovers occur, and the node is assigned as primary (with its node state shown as NOT UP) even if a route monitor fails on that node. This combination of HA state Primary and node state NOT UP is called the Sticky Primary state.

If the node is then able to reach all of the monitored routes, the next route-monitor failure resets the Maximum Number of Flips and Maximum Flip Time parameters on the node and restarts the interval specified by the Maximum Flip Time parameter.

These parameters are set independently on each node and therefore are neither propagated nor synchronized.
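The per-node accounting described above can be sketched as a small model. This is a simplified illustration inferred from the behavior described in this topic and its FAQ examples, not the appliance's implementation. The class name FlipLimiter, its fields, and the method names are hypothetical, and the model covers only how one node reacts to a new route-monitor failure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FlipLimiter:
    """Hypothetical model of one node's flip accounting (not appliance code)."""
    max_flips: int                        # corresponds to maxFlips
    max_flip_time: int                    # corresponds to maxFlipTime, in seconds
    completed_flips: int = 0
    window_start: Optional[float] = None  # None until the first failure is seen

    def remaining_flip_time(self, now: float) -> float:
        """Seconds left in the current Maximum Flip Time interval."""
        if self.window_start is None:
            return float(self.max_flip_time)
        return max(0.0, self.max_flip_time - (now - self.window_start))

    def on_route_down(self, now: float) -> str:
        """Handle a new route-monitor failure at time `now` (seconds)."""
        if self.window_start is None or self.remaining_flip_time(now) == 0:
            # Interval never started or has elapsed: reset and restart it.
            self.completed_flips = 0
            self.window_start = now
        if self.completed_flips >= self.max_flips:
            return "sticky primary"       # limit reached, no further failover
        self.completed_flips += 1
        return "failover"


node = FlipLimiter(max_flips=2, max_flip_time=200)
print(node.on_route_down(now=0))    # failover
print(node.on_route_down(now=180))  # failover
print(node.on_route_down(now=190))  # sticky primary
print(node.on_route_down(now=400))  # failover
```

In this sketch, the failure at 190 seconds arrives inside the interval with both flips used, so the node stays primary. The failure at 400 seconds arrives after the interval has elapsed, so the counters are reset and failovers are allowed again.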

Parameters for limiting the number of failovers

Maximum Number of Flips (maxFlips)

Maximum number of failovers allowed, within the Maximum Flip Time interval, for a node in an HA configuration in non-INC mode, when the failovers are caused by route-monitor failure.

Maximum Flip Time (maxFlipTime)

Amount of time, in seconds, during which failovers resulting from route-monitor failure are allowed for a node in an HA configuration in non-INC mode.


To limit the number of failovers by using the command line interface

At the command prompt, type:

  • set HA node [-maxFlips <positive_integer>] [-maxFlipTime <positive_integer>]
  • show HA node [<id>]
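If you generate the command from a script, a small helper can validate the values first. The following is a minimal sketch. The function name build_set_ha_node is hypothetical and is not part of any Citrix tool or API. It only assembles the command text shown above and does not connect to an appliance.

```python
from typing import Optional


def build_set_ha_node(max_flips: Optional[int] = None,
                      max_flip_time: Optional[int] = None) -> str:
    """Build a 'set HA node' command line (hypothetical helper)."""
    parts = ["set HA node"]
    for flag, value in (("-maxFlips", max_flips), ("-maxFlipTime", max_flip_time)):
        if value is None:
            continue  # optional parameter left out
        # The syntax above asks for a positive integer. bool is rejected
        # explicitly because Python treats True and False as integers.
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            raise ValueError(f"{flag} expects a positive integer, got {value!r}")
        parts.append(f"{flag} {value}")
    return " ".join(parts)


print(build_set_ha_node(max_flips=2, max_flip_time=200))
# set HA node -maxFlips 2 -maxFlipTime 200
```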

To limit the number of failovers by using the configuration utility

  1. Navigate to System > High Availability and, on the Nodes tab, open the local node.
  2. Set the following parameters:
  • Maximum Number of Flips
  • Maximum Flip Time 
Sample Configuration

> set ha node -maxFlips 30 -maxFlipTime 60

Done

> sh ha node

1) Node ID: 0

IP: 10.102.169.82 (NS)

Node State: UP

Master State: Primary

Fail-Safe Mode: OFF

INC State: DISABLED

Sync State: ENABLED

Propagation: ENABLED

Enabled Interfaces : 1/1

Disabled Interfaces : None

HA MON ON Interfaces : 1/1

Interfaces on which heartbeats are not seen :None

Interfaces causing Partial Failure:None

SSL Card Status: NOT PRESENT

Hello Interval: 200 msecs

Dead Interval: 3 secs

Node in this Master State for: 0:4:24:1 (days:hrs:min:sec)

 

2) Node ID: 1

IP: 10.102.169.81

Node State: UP

Master State: Secondary

Fail-Safe Mode: OFF

INC State: DISABLED

Sync State: SUCCESS

Propagation: ENABLED

Enabled Interfaces : 1/1

Disabled Interfaces : None

HA MON ON Interfaces : 1/1

Interfaces on which heartbeats are not seen : None

Interfaces causing Partial Failure: None

SSL Card Status: NOT PRESENT

 

Local node information:

Configured/Completed Flips: 30/0

Configured Flip Time: 60

Critical Interfaces: 1/1

Done 
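To check these values from a script, you can extract the flip settings from the captured command output. The following is a minimal sketch that assumes the output has already been saved as text. The function name parse_flip_settings and the dictionary keys are hypothetical, and the patterns are based only on the output lines shown in this topic.

```python
import re
from typing import Dict, Optional


def parse_flip_settings(output: str) -> Dict[str, Optional[int]]:
    """Extract flip counters from 'show ha node' output (hypothetical helper)."""
    flips = re.search(r"Configured/Completed Flips:\s*(\d+)/(\d+)", output)
    # The time line appears either as "Configured Flip Time: 60" or as
    # "Configured/Remaining Flip Time: 200/0" in the outputs in this topic.
    flip_time = re.search(
        r"Configured(?:/Remaining)? Flip Time:\s*(\d+)(?:/(\d+))?", output)
    if flips is None or flip_time is None:
        raise ValueError("flip settings not found in output")
    remaining = flip_time.group(2)
    return {
        "configured_flips": int(flips.group(1)),
        "completed_flips": int(flips.group(2)),
        "configured_flip_time": int(flip_time.group(1)),
        "remaining_flip_time": int(remaining) if remaining is not None else None,
    }


sample = """Local node information:
Configured/Completed Flips: 30/0
Configured Flip Time: 60
Critical Interfaces: 1/1
"""
print(parse_flip_settings(sample))
```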

SNMP Alarm for Sticky Primary State

Enable the HA-STICKY-PRIMARY SNMP alarm on a node of a high availability setup if you want to be alerted when that node becomes sticky primary. When the node becomes sticky primary, it generates a trap message (stickyPrimary (1.3.6.1.4.1.5951.1.1.0.138)) and sends it to all the configured SNMP trap destinations. For more information about configuring SNMP alarms and trap destinations, see Configuring the NetScaler to Generate SNMPv1 and SNMPv2 Traps.
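On the trap destination, a receiver script can recognize this trap by its OID. The following is a minimal sketch. The function name is hypothetical, and the sketch assumes that your trap receiver already hands you the trap OID as a dotted string. Only the OID value itself comes from this topic.

```python
# OID of the stickyPrimary trap, as given in the text above.
STICKY_PRIMARY_TRAP_OID = "1.3.6.1.4.1.5951.1.1.0.138"


def is_sticky_primary_trap(trap_oid: str) -> bool:
    """Return True if trap_oid names the stickyPrimary trap.

    Accepts the OID with or without a leading dot, because tools differ
    in how they print OIDs.
    """
    return trap_oid.strip().lstrip(".") == STICKY_PRIMARY_TRAP_OID


print(is_sticky_primary_trap(".1.3.6.1.4.1.5951.1.1.0.138"))  # True
```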

Frequently Asked Questions

Consider a high availability setup of two NetScaler appliances, NS-1 and NS-2, in non-INC mode. The maximum number of flips and the maximum flip time are set to the same values on both nodes.

The following table lists the settings used in this example:

Entity                     Detail
IP address of NS-1         10.102.173.211
IP address of NS-2         10.102.173.212
Maximum number of flips    2
Maximum flip time          200

The following table lists some FAQs and answers about maximum number of flips and maximum flip time settings:

Question

Answer

What should you do after one of the nodes becomes sticky primary?

Rectify the routes that are being monitored.

After the maximum flip time has elapsed, any route monitor failure resets the maximum number of flips and the maximum flip time, and then restarts the maximum flip time interval.

The following example shows that NS-1 (10.102.173.211) becomes sticky primary.

> show ha node

1)     Node ID:      0 

        IP:  10.102.173.211 

        Node State: NOT UP

        Master State: Primary

        .

        .

2)     Node ID:      1 

        IP:  10.102.173.212

        Node State: UP

        Master State: Secondary

       .

       .

Local node information:

Route Monitor -  Network: 10.102.173.216   Netmask: 255.255.255.255   State: DOWN

Critical Interfaces: 1/1 1/2
Configured/Completed Flips: 2/2
Configured/Remaining Flip Time: 200/0

Done
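A monitoring script can detect this condition in captured output by looking for a node that is Primary while its node state is NOT UP. The following is a minimal sketch. The function name sticky_primary_ips is hypothetical, and the parsing is based only on the output layout shown in this topic.

```python
import re
from typing import Dict, List


def sticky_primary_ips(output: str) -> List[str]:
    """Return IPs of nodes shown as Master State Primary with Node State NOT UP."""
    nodes: List[Dict[str, str]] = []
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        if re.match(r"(?:\d+\)\s*)?Node ID:", line):
            current = {}          # a new per-node block starts here
            nodes.append(current)
            continue
        if line.startswith("Local node information"):
            current = None        # the remaining lines are not per-node data
            continue
        if current is None or ":" not in line:
            continue
        key, _, value = line.partition(":")
        current[key.strip()] = value.strip()
    return [
        n["IP"].split()[0]        # drop suffixes such as "(NS)"
        for n in nodes
        if n.get("Node State") == "NOT UP"
        and n.get("Master State") == "Primary"
        and n.get("IP")
    ]


sample = """> show ha node
1)     Node ID:      0
        IP:  10.102.173.211
        Node State: NOT UP
        Master State: Primary
2)     Node ID:      1
        IP:  10.102.173.212
        Node State: UP
        Master State: Secondary
Local node information:
Critical Interfaces: 1/1 1/2
Configured/Completed Flips: 2/2
Configured/Remaining Flip Time: 200/0
Done
"""
print(sticky_primary_ips(sample))  # ['10.102.173.211']
```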

What happens if a node recovers from sticky primary state before the maximum flip time has elapsed?

Nothing happens. The maximum number of flips and the maximum flip time are not reset.

What happens if a node recovers from sticky primary state after the maximum flip time has elapsed?

Nothing happens. The maximum number of flips and the maximum flip time are not reset.

What happens if a node recovers from sticky primary state and the monitored route then goes down again before the maximum flip time has elapsed?

The node becomes sticky primary again without a failover. The maximum number of flips and the maximum flip time are not reset.

The following example shows NS-1 (10.102.173.211) recovering from sticky primary state. NS-1 becomes sticky primary again when the monitored route goes down before the maximum flip time has elapsed.

> show ha node

1)      Node ID:      0 

       IP:  10.102.173.211

       Node State: UP

       Master State: Primary

       .

       .        

2)     Node ID:      1 

      IP:  10.102.173.212 

      Node State: UP

      Master State: Secondary

       .

       .

Local node information:

Route Monitor -  Network: 10.102.173.216   Netmask: 255.255.255.255   State: UP

Critical Interfaces: 1/1 1/2
Configured/Completed Flips: 2/2
Configured/Remaining Flip Time: 200/113

Done

> show ha node

1)      Node ID:      0 

       IP:  10.102.173.211

       Node State: NOT UP

       Master State: Primary

       .

       .        

2)     Node ID:      1 

      IP:  10.102.173.212 

      Node State: UP

      Master State: Secondary

       .

       .

Local node information:

Route Monitor -  Network: 10.102.173.216   Netmask: 255.255.255.255   State: DOWN

Critical Interfaces: 1/1 1/2 
Configured/Completed Flips: 2/2
Configured/Remaining Flip Time: 200/83

Done

What happens if a node recovers from sticky primary state and the monitored route then goes down again after the maximum flip time has elapsed?

The maximum number of flips and the maximum flip time are reset to the configured values, and the maximum flip time interval starts. Failovers then continue until either of the following conditions is met:

  • One of the nodes is able to reach all of the routes monitored by its route monitors.
  • The number of failovers equals the maximum number of flips.

 

The following example shows NS-1 (10.102.173.211) recovering from sticky primary state.

When the monitored route (10.102.173.216) goes down again after the maximum flip time has elapsed, the maximum number of flips and the maximum flip time are reset, and the maximum flip time interval starts.

The second output of show ha node shows NS-1 becomes secondary after a failover.

> show ha node

1)      Node ID:      0 

       IP:  10.102.173.211

       Node State: UP

       Master State: Primary

       .

       .        

2)     Node ID:      1 

      IP:  10.102.173.212

      Node State: UP

      Master State: Secondary

       .

       .

Local node information:

Route Monitor -  Network: 10.102.173.216   Netmask: 255.255.255.255   State: UP

Critical Interfaces: 1/1 1/2

Configured/Completed Flips: 2/2

Configured/Remaining Flip Time: 200/0

Done

> show ha node

1)      Node ID:      0 

       IP:  10.102.173.211

       Node State: UP

       Master State: Primary

       .

       .        

2)     Node ID:      1 

      IP:  10.102.173.212 

      Node State: UP

      Master State: Secondary

       .

       .

Local node information:

Route Monitor -  Network: 10.102.173.216   Netmask: 255.255.255.255   State: UP
Critical Interfaces: 1/1 1/2
Configured/Completed Flips: 2/1
Configured/Remaining Flip Time: 200/196

Done

What happens when the maximum number of flips and the maximum flip time are unset?

After the maximum number of flips and the maximum flip time are unset, the setup falls back to the failover cycle of 180 seconds until the route monitor state becomes UP.

What happens when the maximum flip time is over, but the maximum number of flips has not been reached, and there is a route-down event?

The setup enters a continuous flip cycle. If the maximum flip time elapses before the maximum number of flips is completed, both parameters are reset to the configured values. As a result, the flip cycle continues indefinitely. Configure the maximum flip time so that the maximum number of flips can be completed within it.
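A rough pre-check of a planned configuration can be sketched as follows. The function name flips_fit_in_window is hypothetical. The timing model is an assumption rather than documented behavior: it supposes that the first failover happens as the interval starts and that later failovers follow at the 180-second cadence stated at the top of this topic. Treat the result as an estimate and verify it on a test setup.

```python
FAILOVER_INTERVAL_SECONDS = 180  # failover cadence stated in this topic


def flips_fit_in_window(max_flips: int, max_flip_time: int,
                        interval: int = FAILOVER_INTERVAL_SECONDS) -> bool:
    """Estimate whether max_flips failovers can complete within max_flip_time.

    Assumption (not from the product documentation): the first failover
    happens as the interval starts and later ones follow every `interval`
    seconds, so the last of N failovers happens (N - 1) * interval seconds
    into the interval.
    """
    if max_flips <= 0 or max_flip_time <= 0:
        raise ValueError("both parameters must be positive integers")
    return (max_flips - 1) * interval < max_flip_time


print(flips_fit_in_window(2, 200))  # True: second failover at about 180 s
print(flips_fit_in_window(3, 200))  # False: third failover would be at about 360 s
```

Under this assumption, the values used in the FAQ example (2 flips in 200 seconds) leave room for both failovers, which is consistent with the example outputs that reach 2/2 completed flips.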