Use of JSON utf8mb4 column without utf8mb4 server_collation when using JSON_EXTRACT_STRING can require cluster restart

Using all of the features listed below together in SingleStoreDB (at or before versions 7.5.23, 7.6.25, 7.8.20, 7.9.15 and 8.0.6) can cause a query to execute forever, and become unkillable.

  • create a JSON column with any kind of utf8mb4 collation
  • not have collation_server set to a utf8mb4 collation
  • use JSON_EXTRACT_STRING on the JSON column in a query
  • set the json_extract_string_collation variable to a value other than 'json'
  • store extended (non-utf8mb3) characters in the JSON column accessed with JSON_EXTRACT_STRING

This will require the cluster to be restarted to end the query and bring the cluster back to normal operation.

After the versions listed, it’s recommended that you set json_extract_string_collation to SERVER_V2 to avoid this issue. Available settings are discussed in the documentation.

If you are using JSON_EXTRACT_STRING in a computed column that’s part of a shard key, and the source column has a utf8mb4 collation (such as utf8mb4_bin), it will not be possible to use the SERVER_V2 setting. In this case, if the system will allow it, you can instead use the SERVER_V1 setting. However, if using SERVER_V1, be aware that persisted computed column values might change when recomputed, so it is safest to double-check the values in the persisted computed columns for correctness after changing this setting. You may need to delete and re-insert rows with incorrect (e.g. truncated) computed column values.

If you must work with a version prior to the ones patched to fix the issue, consider working around the issue by:

  1. restarting the cluster to kill all non-terminating queries
  2. run “SET GLOBAL json_extract_string_collation = 'json'” to prevent the issue from recurring

However, be aware that setting json_extract_string_collation = 'json' causes the extracted string to use the collation of the JSON object rather than the value of collation_server.

Releases before 7.5 are unaffected because utf8mb4 support began in 7.5.

If you are using the JSON data type and utf8mb4 collations, it is recommended that you apply the appropriate patch and consider using the SERVER_V2 setting. Please monitor the release notes for your release version for the arrival of this patch.