Backup Failure due to OOM

Our backup command is failing because oom.

Leaf Error (memsql-leaf-01:3307): Memory used by MemSQL (28703.00 Mb) has reached the ‘maximum_memory’ setting (28665 Mb) on this node. Possible causes include (1) available query execution memory has been used up for table memory (in use table memory: 18855.88 Mb) and (2) the query is large and complex and requires more query execution memory than is available (in use query execution memory 0.00 Mb

Not only this is failing, the queries, instead of just taking more time are failing due to OOM. A simple count distinct or count_approx_distinct fails because of that. Other databases just run the queries until they timeout (I would prefer that instead).

I already analyzed the available memory (without any query running) and it’s even more confusing:

IP_ADDR TYPE MAX_MEMORY_MB memory_used_mb max_table_memory_mb table_memory_used_mb
memsql-cluster-aggl CA 14285 4579 12856 3410
memsql-cluster-leaf-04l LEAF 28665 26186 25798 18411
memsql-cluster-leaf-03 LEAF 28665 26099 25798 18413
memsql-cluster-leaf-02 LEAF 28665 25954 25798 18351
memsql-cluster-leaf-01 LEAF 28665 27227 25798 18501
memsql-cluster-master MA 14276 8327 12848 3443

we see 26GB of memory use which is a lot a probably the cause of the failure. It doesn’t decrease even if I run analyze table XXX flush;

Analyzing what’s is consuming that memory I ran:

select sum(memory_use)/(1024*1024)
from information_schema.table_statistics

select sum(memory_use)/(1024*1024)
from information_schema.INDEX_STATISTICS is2

Total: 9979 + 1652 = 11GB.

Any idea how can we resolve this issue? We have almost all the queries failing by memory issues.

MEMSQL VERSION: 7.3.12
Thanks in advance for your help.

3 Likes

I’m trying to run OPTIMIZE TABLE table_name full, already ran flush… but it fails due to out of meemory as well… I’m blocked

Are you using rowstore or columstore tables? With Universal Storage features in 7.6, if you are using rowstore today, you may be able to switch to columnstore and still meet the needs of your application. That would free a lot of memory.

See SingleStore’s Patented Universal Storage - Part 4

7.6 also supports hash group-by operation spilling (the most common cause of out-of-memory during query execution). See docs on enable_spilling variable.

What’s your most common query that runs out of memory?