Update table using join on the same table

Hi guys,
i’m having rather interesting problem. We have a server_history table that keeps some information about our servers. we use version column to keep track of changed things. when we do some update queries through golang+mysql driver, we got Error 1749: Feature ‘fully distributed join inside a multi-statement transaction’ is not supported by MemSQL Distributed.
the same queries run on DBeaver connected to the same server via mariadb driver performs update without a blink or warning, and correctly. Can somebody shed some light on this?
I’m attaching queries that we run on the table after batch loading data with mark version=-1 (no signify new coming records)

UPDATE
collector.SERVER_HISTORY
SET
UPDATED_TS = CURRENT_TIMESTAMP(6)
WHERE
VERSION = 0
AND SERVER_ID IN (
select
sh2.SERVER_ID
from
collector.SERVER_HISTORY sh
inner join collector.SERVER_HISTORY sh2 ON
(sh.SERVER_ID = sh2.SERVER_ID
AND sh.VERSION =-1
AND sh2.VERSION = 0)
WHERE
sh.CHECK_SUM = sh2.CHECK_SUM );

DELETE
FROM
collector.SERVER_HISTORY
WHERE
VERSION =-1
AND SERVER_ID IN (
SELECT
sh.SERVER_ID
FROM
collector.SERVER_HISTORY sh
INNER JOIN collector.SERVER_HISTORY sh2 ON
(sh.SERVER_ID = sh2.SERVER_ID
AND sh.VERSION =-1
AND sh2.VERSION = 0)
WHERE
sh.CHECK_SUM = sh2.CHECK_SUM );

UPDATE
collector.SERVER_HISTORY
SET
VERSION = VERSION + 1
WHERE
SERVER_ID IN (
SELECT
sh.SERVER_ID
FROM
collector.SERVER_HISTORY sh
INNER JOIN collector.SERVER_HISTORY sh2 ON
(sh.SERVER_ID = sh2.SERVER_ID
AND sh.VERSION =-1
AND sh2.VERSION = 0)
WHERE
sh.CHECK_SUM <> sh2.CHECK_SUM );

UPDATE
collector.SERVER_HISTORY
SET
VERSION = 0
WHERE
VERSION =-1
and SERVER_ID NOT IN (
SELECT
SERVER_ID
FROM
collector.SERVER_HISTORY
WHERE
VERSION = 0);

and the table definition is
CREATE TABLE collector.SERVER_HISTORY (
SERVER_ID varchar(36) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,
DEVICE_ID int(11) DEFAULT NULL,
VERSION int(11) DEFAULT -1,
DATA JSON COLLATE utf8_bin NOT NULL,
CREATED_TS timestamp(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(),
UPDATED_TS timestamp(6) NULL DEFAULT NULL,
CHECK_SUM varchar(32) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,
SKIP int(11) NOT NULL DEFAULT 0,
SKIP_REASON text CHARACTER SET utf8 COLLATE utf8_general_ci,
UPDATE_LOCK tinyint(4) DEFAULT 0,
ASK_UPDATE int(11) DEFAULT 0,
KEY idx_version (VERSION),
KEY idx_server_version (SERVER_ID,VERSION));

I’m curious how this is happening.

answering my own question, it wasn’t evident in the beginning, but all 4 queries were done in single transaction in program, but were executed as sequence of queries in DBeaver. Removing the transaction requirement in programmatic approach makes the procedure work. It is not 100% correct solution but close enough to give results with minor chance of messing up.

@ladislav.kosco thanks for sharing the problem and your solution. I guess you mean your application did a BEGIN WORK or START TRANSACTION before running those queries, and you took it out, then things started to work?

Yes, you understood it correctly. Funny thing is that this does not show up when developing application using dockerised cluster-in-a-box, this is a warning to other developers. Test your code occasionally in distributed setup

What version are you running on, @ladislav.kosco1? Support for distributed joins in multi-statement transactions was added in the 7.0 release.

Hi Hanson, it looks we are running 6.8.3, we were trying to upgrade to latest, but install script was quite stubborn on /var space requirements, even if we have data directory elsewhere so we could not perform the upgrade (space limitation on our servers). It’s a good news we can get rid of this limitation in 7.x, but first we must either force script to perform upgrade of our servers, or move to another DB altogether.

We are thinking about PostgreSQL due to possibility to talk directly to other databases - via foreign data wrappers. This is our number one - not having to have multiple DB drivers in code, and let DB handle this stuff for us. Another point are foreign keys, triggers.

Your DB is blazing fast and distributed, but sometimes this seems not to be enough to ask from DB. for our use case, playing with other DBs seems to be the key in transparent way (so we can build query across multiple DB sources without importing them and keeping them somehow in sync). Are you planning on such functionality in your DB - foreign data wrappers approach or DB gateway for asking external DBs for data in question?

Foreign data wrappers (ability to query foreign databases) is something we expect to do but I don’t have a firm time commitment on that. Same goes for triggers and foreign keys.

Feel free to open or up-vote feature requests for these on the forums. That’s a great place for this kind of feedback.