Singlestore is queuing most queries, but CPU or Memory is not at maximum usage
I have version 8.9 (self-managed), and it has 5 databases. For example, one database has a table with over 70 million records. This is just for context.
Is a cluster with 4 units (1 Master, 1 Child and 2 Leaves)
The leaves has 64GB Memory and 4 OCPU.
Last week, singlestore has been queuing most queries, but I noticed some strange things.
I ran this command:
SELECT * FROM information_schema.WORKLOAD_MANAGEMENT_STATUS
And result:
Node Id 1
Node Site Queue ID 1
Queued Queries (from global queue) 0
Queued Queries (from local queue) 62
Total Queued Queries Since Startup 243052
Running Queries (from global queue) 0
Running Queries (from local queue) 20
Running Memory (MB) On Leaves (from global queue) 0
Running Threads Per Leaf (from global queue) 0
Running Connections Per Leaf (from global queue) 0
Running Memory (MB) On Leaves (from local queue) 0
Allocated Memory (MB) On Leaves (for local queue) 38413
Required Memory (MB) On Leaves (from local queue) 0
Avg Required Memory (MB) On Leaves (from local queue) 0
Running Threads Per Leaf (from local queue) 3960
Allocated Threads Per Leaf (for local queue) 4096
Required Threads Per Leaf (from local queue) 15890
Avg Required Threads Per Leaf (from local queue) 16139
Running Connections Per Leaf (from local queue) 4100
Allocated Connections Per Leaf (for local queue) 5000
Required Connections Per Leaf (from local queue) 16422
Avg Required Connections Per Leaf (from local queue) 16677
Memory Threshold (MB) to Queue Locally 768
Memory Threshold (MB) to Queue Globally 38413
Connections Threshold to Queue Globally 2500
Threads Threshold to Queue Globally 2048
Total queries started since startup 324150
Total queries finished since startup 324125
Total medium queries upgraded since startup 0
Total queries cancelled since startup 40
Only 20 queries is running and 62 is queued, is possible increase the query parallelism or the problem is other?
Other observations that I found strange, maybe this running threads is too high?
Running Threads Per Leaf (from local queue) 3960
Allocated Threads Per Leaf (for local queue) 4096
But the usage of CPU is 11% on average, that is, there are still plenty of resources on the machine. The same about memory, is 20% of usage.
The fact that singlestore is queuing most queries, is making the system slow.
When I restart all node with
sdb-admin restart-node --all
The singlestore start fast, but after sometime it’s getting slow. And after some minutes the Total Queued Queries Since Startup
presents a very high number.
I’ve already made sure there were no new features we deployed to our system that could affect this.