Error ER_INDEX_CORRUPT

Hello SingleStore,

With a cluster in 7.6.5, 2 leafs with 32GB RAM each, and a dataset that takes less that 17GB in “Databases > Memory usage”.

The dashboard reports that the cluster memory usage used is 41.5GB over 62.8GB (with 6.3GB reserved).

Running a significant query throws:

Error: ER_INDEX_CORRUPT: Leaf Error (xx.xx.xx.xx:3306): Memory used by MemSQL (28751.88 Mb) has reached the 'maximum_memory' setting (28952 Mb) on this node. Possible causes include (1) available query execution memory has been used up for table memory (in use table memory: 20991.25 Mb) and (2) the query is large and complex and requires more query execution memory than is available (in use query execution memory 1640.57 Mb). See https://docs.memsql.com/troubleshooting/latest/memory-errors for additional information.

Could you please explain what’s going? why is SingleStore eating so much memory to the point it cannot compute queries anymore?

Thanks for your help

Hi @pierre.

Here’s a discussion of possible causes and solutions for this:

Also, if the big query you ran was using a lot of memory for a hash group-by operation, you can enable spilling for that. See

and look for mention of the “enable_spilling” variable.

Hi @hanson, thanks for the links.

I’ve activated the spilling while trying to figure out what’s going on with SingleStore.

Could you please detail why a 17GB Rowstore dataset eats 41.5GB?

PS: I’m computing real-time analytics with pivots and joins over significative tables, and the dataset is relatively small compared to what could be done in this my industry.

The rowstore table is not using all 41.5 GB of space. The system has allocated more memory from the total available capacity, beyond the 17GB storing the rowstore table data. It hangs onto it in its “allocator” structures and will use it for rows or query execution memory as needed. When it runs low, it may grab more from the as-yet-unused memory. Or from time to time it may free part of it.

Happy to hear that it’s by design and not a bug.

Could you confirm that SingleStore requires to provision 2.5x the rowstore memory to work?

Why is this ratio not explicitly written in the docs please?

Thanks

That ratio is highly-dependent on the application, so it won’t work to publish a blanket recommendation. We have lots of customers running rowstores where the table data takes up to 70% of total system memory. If you’re only doing small OLTP-like queries, that will work fine. If you have high concurrency, and are doing analytical queries with large hash joins, hash groups, and sorts, you may need more memory, like the 2.5x ratio you mentioned.

This 2.5x ratio is constant, even when the DB activity is iddle. Also, this 2.5x ratio of used memory is not freed, because when I run a query that requires more than the remaining memory it crashes with Memory used by MemSQL (28751.88 Mb) has reached the 'maximum_memory' setting

Could you please list the things stored in memory that is not part of the rowstore dataset, and that is not freed?

Thanks for the details

Hi @hanson , I’m still stuck with this issue, I need to be able to predict the RAM to provision clusters properly.

Could you please list the things stored in memory that is not part of the rowstore dataset, and that is not freed?

Thanks for your help

EDIT: after a sb-admin restart-node all , memory returns from 28GB to 16.5GB, why?

Many people are having trouble understanding what’s going on with the memory… please don’t let this question unanswered @hanson .
Thanks for your help

EDIT: after a sb-admin restart-node all , memory returns from 28GB to 16.5GB, why?

As @hanson, said above, the cluster memory usage metric in the studio dashboard includes memory that is cached from the operating system. This will be reused from query to query and is not a concern. The memory drops after restarting because we don’t proactively request this memory from the OS, only as it is needed.

To find precisely where your memory is being used, can you share a query profile as well as the output of these queries:

select node_id,variable_name,variable_value 
from information_schema.mv_global_status 
where node_type = 'LEAF' and (variable_name like 'Alloc%' or variable_name like 'Buffer%');

and

select(memory_use) from information_schema.table_statistics;

Thanks @rob for your answer,

Could you please explain how to predict this memory usage in order to provision RAM properly?

That’s a hard question to answer without more workload details. If you get the results of the information requested above, we can provide some advice.