Singlestore is queuing most queries, but CPU or Memory is not at maximum usage

Singlestore is queuing most queries, but CPU or Memory is not at maximum usage

I have version 8.9 (self-managed), and it has 5 databases. For example, one database has a table with over 70 million records. This is just for context.

Is a cluster with 4 units (1 Master, 1 Child and 2 Leaves)

The leaves has 64GB Memory and 4 OCPU.

Last week, singlestore has been queuing most queries, but I noticed some strange things.

I ran this command:

SELECT * FROM information_schema.WORKLOAD_MANAGEMENT_STATUS

And result:

Node Id	1
Node Site Queue ID	1
Queued Queries (from global queue)	0
Queued Queries (from local queue)	62
Total Queued Queries Since Startup	243052
Running Queries (from global queue)	0
Running Queries (from local queue)	20
Running Memory (MB) On Leaves (from global queue)	0
Running Threads Per Leaf (from global queue)	0
Running Connections Per Leaf (from global queue)	0
Running Memory (MB) On Leaves (from local queue)	0
Allocated Memory (MB) On Leaves (for local queue)	38413
Required Memory (MB) On Leaves (from local queue)	0
Avg Required Memory (MB) On Leaves (from local queue)	0
Running Threads Per Leaf (from local queue)	3960
Allocated Threads Per Leaf (for local queue)	4096
Required Threads Per Leaf (from local queue)	15890
Avg Required Threads Per Leaf (from local queue)	16139
Running Connections Per Leaf (from local queue)	4100
Allocated Connections Per Leaf (for local queue)	5000
Required Connections Per Leaf (from local queue)	16422
Avg Required Connections Per Leaf (from local queue)	16677
Memory Threshold (MB) to Queue Locally	768
Memory Threshold (MB) to Queue Globally	38413
Connections Threshold to Queue Globally	2500
Threads Threshold to Queue Globally	2048
Total queries started since startup	324150
Total queries finished since startup	324125
Total medium queries upgraded since startup	0
Total queries cancelled since startup	40

Only 20 queries is running and 62 is queued, is possible increase the query parallelism or the problem is other?

Other observations that I found strange, maybe this running threads is too high?

Running Threads Per Leaf (from local queue)	3960
Allocated Threads Per Leaf (for local queue)	4096

But the usage of CPU is 11% on average, that is, there are still plenty of resources on the machine. The same about memory, is 20% of usage.

The fact that singlestore is queuing most queries, is making the system slow.

When I restart all node with

sdb-admin restart-node --all

The singlestore start fast, but after sometime it’s getting slow. And after some minutes the Total Queued Queries Since Startup presents a very high number.

I’ve already made sure there were no new features we deployed to our system that could affect this.

See if you can find out why the queries are getting queued (looks like they are getting queued locally). Then maybe you can change the query or physical database design (sorting/sharding/indexes) to make that queueing less frequent, or change the settings for workload management to affect the queueing.

Here’s some useful info from our AI bot: SingleStore | Develop with SQrL, Your SingleStore Copilot.

To find the reason why a query is queued by workload management in SingleStore, check the WORKLOAD_MANAGEMENT_QUEUE system table. This table lists queries waiting for resources and explains which resource is lacking, such as memory, threads, or connections, by showing details like required, limit, and used values for each resource1.

You can also check the REASON_FOR_QUEUEING column in INFORMATION_SCHEMA.MV_PROCESSLIST for additional context on what component is responsible for queuing a query2.