Troubleshooting common issues
While joining a node to the cluster, I get the following message, “ERROR: Invalid interface name/number.” What must I do to resolve this error?
This error occurs if you provided an invalid or incorrect backplane interface while using the add cluster node command to add the node. To resolve this error, verify the interface that you provided while adding the node. Make sure that you have not specified the appliance’s management interface as the backplane interface, and that the <nodeId> part of the interface name is the same as the node’s ID. For example, if the nodeId is 3, the backplane interface must be 3/<c>/<u>.
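For example, a node with ID 3 could be added with interface 3/1/1 as its backplane (the NSIP address and interface number here are illustrative values):

add cluster node 3 10.102.29.60 -state ACTIVE -backplane 3/1/1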
While joining a node to the cluster, I get the following message, “ERROR: Clustering cannot be enabled, because the local node is not a member of the cluster.” What must I do to resolve this error?
This error occurs when you try to join a node without adding the node’s NSIP to the cluster. To resolve this error, you must first add the node’s NSIP address to the cluster by using the add cluster node command and then execute the join cluster command.
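As a sketch of this sequence, assuming node ID 1 with NSIP address 10.102.29.60 and a cluster IP address of 10.102.29.61 (example values), first run the following on the cluster IP address:

add cluster node 1 10.102.29.60 -state ACTIVE

Then run the following on the node that you are joining:

join cluster -clip 10.102.29.61 -password nsroot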
While joining a node to the cluster, I get the following message, “ERROR: Connection refused.” What must I do to resolve this error?
This error can occur due to the following reasons:
Connectivity problems. The node cannot connect to the cluster IP address. Try pinging the cluster IP address from the node that you are trying to join.
Duplicate cluster IP address. Check to see if the cluster IP address exists on some non-cluster node. If it does, create a new cluster IP address and try re-joining the cluster.
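For example, to verify connectivity from the node that you are trying to join (10.102.29.61 is an example cluster IP address):

ping 10.102.29.61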
While joining a node to the cluster, I get the following message, “ERROR: License mismatch between the configuration coordinator and the local node.” What must I do to resolve this error?
The appliance that you are joining to the cluster must have the same licenses as the configuration coordinator. This error occurs when the licenses on the node you are joining do not match the licenses on the configuration coordinator. To resolve this error, run the following commands on both the nodes and compare the outputs.
From the command line:
show ns hardware
show ns license
From the shell:
nsconmsg -g feature -d stats
View the contents of the /var/log/license.log file
What must I do when the configurations of a cluster node are not in sync with the cluster configurations?
In most cases, the configurations are automatically synchronized among all the cluster nodes. However, if you suspect that the configurations are not synchronized on a specific node, you must force the synchronization by executing the force cluster sync command from the node that you want to synchronize. For more information, see Synchronizing Cluster Configurations.
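For example, run the following on the node whose configuration you suspect is out of sync:

force cluster sync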
When configuring a cluster node, I get the following message, “ERROR: Session is read-only; connect to the cluster IP address to modify the configuration.”
All configurations on a cluster must be done through the cluster IP address and the configurations are propagated to the other cluster nodes. All sessions established through the NSIP address of individual nodes are read-only.
Why does the node state show “INACTIVE” when the node health shows “UP”?
A healthy node can be in the INACTIVE state for a number of reasons. A scan of ns.log or error counters can help you determine the exact reason.
What must I do when the health of a node shows “NOT UP”?
A node health of “NOT UP” indicates that there are issues with the node. To find the root cause, run the show cluster node command. This command displays the node’s properties and the reason for the node failure.
What must I do when the health of a node shows as “NOT UP” and the reason indicates that configuration commands have failed on a node?
This issue arises when some configuration commands fail to execute on the cluster nodes. In such cases, you must make sure that the configurations are synchronized by using one of the following options:
If some of the cluster nodes are in this state, you must perform the force cluster synchronization operation on those nodes. For more information, see Synchronizing Cluster Configurations.
If all cluster nodes are in this state, you must disable and then enable the cluster instance on all the cluster nodes.
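For example, to disable and then enable the cluster instance (instance ID 1 is an illustrative value):

disable cluster instance 1
enable cluster instance 1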
When I run the set vserver command, I get the following message, “No such resource.” What must I do to resolve this issue?
The set vserver command is not supported in clustering. The unset vserver, enable vserver, disable vserver, and rm vserver commands are also not supported. However, the show vserver command is supported.
I cannot configure the cluster over a Telnet session. What must I do?
Over a Telnet session, the cluster IP address can be accessed only in read-only mode. Therefore, you cannot configure a cluster over a Telnet session.
I notice a significant time difference across the cluster nodes. What must I do to resolve this issue?
Time is not synchronized when PTP packets are dropped by the backplane switch, or when the physical resources of a virtual environment are over-committed.
To synchronize the times, you must do the following on the cluster IP address:
set ptp -state disable
Configure Network Time Protocol (NTP) for the cluster. For more information, see Setting up Clock Synchronization.
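A minimal sequence on the cluster IP address might look like the following, assuming an NTP server reachable at 10.102.29.30 (example address):

set ptp -state disable
add ntp server 10.102.29.30
enable ntp sync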
What must I do, if there is no connectivity to the cluster IP address and the NSIP address of a cluster node?
If you cannot access the cluster IP address or the NSIP address of a cluster node, you must access the appliance through the serial console. If the NSIP address is reachable, you can SSH to the cluster IP address by executing the following command at the shell prompt:
# ssh nsroot@<cluster IP address>
What must I do to recover a cluster node that has connectivity issues?
To recover a node that has connectivity issues:
Disable the cluster instance on that node (since you cannot execute commands from the NSIP of a cluster node).
Execute the commands required to recover the node.
Enable the cluster instance on that node.
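Assuming a cluster instance ID of 1 (example value), the recovery sequence run on the affected node is of the following form:

disable cluster instance 1
<commands required to recover the node>
enable cluster instance 1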
Some nodes of the cluster have two default routes. How can I remove the second default route from the cluster node?
To delete the additional default route, do the following on each node that has the extra route:
Disable the cluster instance.
disable cluster instance <clId>
Remove the route.
rm route <network> <netmask> <gateway>
Enable the cluster instance.
enable cluster instance <clId>
The cluster functionality gets affected when an existing cluster node comes online. What must I do to resolve this issue?
If the RPC password of a node is changed from the cluster IP address while that node is out of the cluster, the RPC credentials do not match when the node comes back online, and this mismatch can affect cluster functionality. To resolve this issue, use the set ns rpcNode command to update the password on the NSIP of the node that has come back online.
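For example, assuming the node’s NSIP address is 10.102.29.60 (an illustrative value), run the following on that node:

set ns rpcNode 10.102.29.60 -password <new RPC password>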