Bug identified in MemSQL Tools online upgrade utility

We identified a bug that manifests when using the MemSQL Tools online upgrade utility. The utility is invoked with the following command (this does not include manual rolling upgrades of a cluster):

memsql-deploy upgrade --online --version 7.1

More information on our online upgrade utility is here.

If you have not used, and do not plan to use, MemSQL Tools to perform an online upgrade from MemSQL 6.7 or later to another version of MemSQL, this bug will not impact you. If you have already used the online upgrade utility, please continue reading the summary and mitigation sections below.

Bug Summary

When executing the online upgrade using MemSQL Tools, some global variables are updated and reverted throughout the process (these variable changes are hidden from the user). We identified that one of those variables, attach_rebalance_delay_seconds, was updated but not set back to its default and recommended value.

During an online upgrade, the bug leaves the variable set to 86400 seconds (one day) instead of reverting it to the default of 120 seconds. As a result, any cluster where the online upgrade utility was used will have this variable set to 86400, unless it has since been changed manually.

The variable affects auto-healing for MemSQL clustering operations, and a value of 86400 may slow recovery after a leaf failover occurs. We therefore strongly recommend setting this variable to its default and correct value of 120 seconds.

Bug Mitigation

If you have already used the online upgrade utility, you must manually set the variable attach_rebalance_delay_seconds to the recommended value of 120 seconds.

You can confirm the value for this variable by running the following command in your SQL client on any MemSQL aggregator:

SHOW VARIABLES LIKE "attach_rebalance_%";

If you used the online upgrade utility and have not changed the variable since, its value will be 86400. To revert it to the recommended setting, please execute the following command:

SET GLOBAL attach_rebalance_delay_seconds = 120;
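
If you manage several clusters, the check and the reset above can be scripted. The following is a minimal sketch, not an official MemSQL tool. It assumes a Python DB-API cursor that returns rows as tuples and is already connected to an aggregator; the function names current_delay and repair_delay are hypothetical. It only resets the variable when it finds the leftover value of 86400, so a value you chose deliberately is left alone.

```python
VARIABLE = "attach_rebalance_delay_seconds"
RECOMMENDED_SECONDS = 120   # default and recommended value
LEFTOVER_SECONDS = 86400    # value left behind by the affected online upgrade


def current_delay(cursor):
    """Read the variable through a DB-API cursor and return it as an int."""
    cursor.execute("SHOW VARIABLES LIKE 'attach_rebalance_delay_seconds'")
    row = cursor.fetchone()
    if row is None:
        raise LookupError(f"{VARIABLE} was not returned by SHOW VARIABLES")
    # Assumption: rows are (Variable_name, Value) tuples; Value is a string.
    return int(row[1])


def repair_delay(cursor):
    """Reset the variable only if it still holds the leftover value.

    Returns True if a SET GLOBAL statement was issued, False otherwise.
    """
    if current_delay(cursor) != LEFTOVER_SECONDS:
        return False
    cursor.execute(f"SET GLOBAL {VARIABLE} = {RECOMMENDED_SECONDS}")
    return True
```

To use it, obtain a cursor from a MySQL-compatible driver of your choice, call repair_delay(cursor), and then call current_delay(cursor) to confirm the value is 120.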

If you have not used the online upgrade utility in MemSQL Tools but plan to use it in the future, the fix for this bug is included in the MemSQL Toolbox 1.6.5 release. With the fix, the variable is properly reverted to its correct value during an online upgrade.
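
Before running an online upgrade, you may want to confirm that the installed Toolbox is at least the release containing the fix. The sketch below compares version numbers numerically, since a plain string comparison would wrongly rank "1.10.0" below "1.6.5". It assumes the version is a plain dotted string such as "1.6.5" that you have already obtained; how you obtain it is not covered here, and the helper names are hypothetical.

```python
FIXED_IN = (1, 6, 5)  # Toolbox release that contains the fix


def parse_version(text):
    """Turn a dotted version string such as '1.6.5' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def has_upgrade_fix(toolbox_version):
    """Return True if the given Toolbox version is 1.6.5 or later."""
    return parse_version(toolbox_version) >= FIXED_IN
```

For example, has_upgrade_fix("1.6.4") is False, while has_upgrade_fix("1.6.5") and has_upgrade_fix("1.10.0") are both True.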