Recommended Nodepool VM size for AKS

Hi,

I’m trying to configure a node pool for a cluster deployment, but am having difficulty finding a compatible VM size. If we run with a leaf height of 1, the cluster makes CPU requests for slightly more than 8 CPUs (exactly 8 for the leaf, plus 100m for the exporter). The allocatable CPU on a standard 8-vCPU AKS node is about 7.8, as some is reserved for the OS. Running with a height of 0.5 on these nodes works, but then we either leave roughly half of the node unused, or let other pods run on it, and MemSQL gets memory errors if it doesn’t have a dedicated node.

We really need either a height of 0.95 (I’m going to try this, but the docs say increments of 0.5, so I’m not hopeful) or a VM size with 8.2 vCPUs. Does anyone know the recommended Azure VM size (or indeed any working VM size) for a leaf height of 1? As it stands this seems poorly thought through, so I feel I must be missing something.
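For context, the height in question is set per node type in the operator’s CR. A minimal sketch of the relevant part of memsql-cluster.yaml, assuming the SingleStore Kubernetes Operator’s documented mapping of one height unit to 8 vCPUs and 32 GB of RAM (the cluster name, counts, and storage values here are illustrative, not recommendations):

```yaml
# Illustrative fragment of memsql-cluster.yaml; only the sizing fields are shown.
apiVersion: memsql.com/v1alpha1
kind: MemsqlCluster
metadata:
  name: memsql-cluster
spec:
  redundancyLevel: 1
  aggregatorSpec:
    count: 1
    height: 0.5     # 0.5 => requests ~4 vCPUs / 16 GB per aggregator pod
    storageGB: 256
  leafSpec:
    count: 2
    height: 1       # 1 => requests 8 vCPUs / 32 GB per leaf pod
    storageGB: 1024
```

With height 1, each leaf pod alone requests 8 CPUs, which is already more than the ~7.8 allocatable CPUs on an 8-vCPU AKS node.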

Regards,
Will

What errors are you getting, specifically?

Hi Will,

Your assessment is correct: you will need an appropriately sized VM SKU to run a height of 1, which is 8 cores for the MemSQL Node plus 1 for the MemSQL Exporter. I’ve taken note of this and will relay the info to our documentation team.

… but Memsql gets memory errors if it doesn’t have a dedicated node.

It’s OK to have more than one MemSQL Pod running on a node, as long as the node has the necessary available cores and memory; the Kubernetes scheduler will schedule correctly against the resources requested in the MemSQL CR config (memsql-cluster.yaml).
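Concretely, the operator translates height into ordinary Kubernetes resource requests on the pods it creates, so the scheduler bin-packs MemSQL pods like any other workload. A hedged sketch of what the scheduler sees for a height-1 leaf pod, inferred from the 8-cores-per-height-unit mapping discussed in this thread (the exact fields are generated by the operator, not written by hand, and the memory figure assumes the usual 32 GB per height unit):

```yaml
# Illustrative, operator-generated pod resource requests for a height-1 leaf.
resources:
  requests:
    cpu: "8"        # one height unit = 8 cores
    memory: 32Gi    # assumed mapping; check your operator version's docs
```

Because these are plain requests, a 20-core node can host a height-1 leaf (8 CPUs) plus a height-0.5 aggregator (4 CPUs) plus exporters, with room left for the node’s reserved overhead.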

In fact, we run many MemSQL Pods on large VM SKUs without any impact on performance, memory, disk IO, etc.

You might consider using a VM SKU with 20 cores and running two MemSQL Pods per VM. On Azure, we’ve used SKUs from Standard_E16s_v3 up through Standard_E48s_v3 for our MemSQL Pods. We would go with larger VM SKUs, but then we would have to deal with NUMA, and Kubernetes is only just starting to support NUMA-aware scheduling via the Kubernetes Topology Manager.
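If you do still want MemSQL pods isolated on their own pool (per the memory-error concern above), a common Kubernetes pattern is to taint that node pool and let only MemSQL pods tolerate it. A hedged pod-spec sketch; the `workload=memsql` label and taint key are made-up names, and whether your operator version lets you set these fields through the CR is worth checking before relying on it:

```yaml
# Pod-spec fragment: pin pods to a dedicated, tainted node pool.
# Label/taint key and value are illustrative, not SingleStore conventions.
nodeSelector:
  workload: memsql
tolerations:
  - key: workload
    operator: Equal
    value: memsql
    effect: NoSchedule
```

The taint keeps other workloads off the pool, while the nodeSelector keeps MemSQL pods from landing elsewhere.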

What VM SKU are you trying to run a height of 1 cluster deployment on?

Kind regards,
Cindy

Hi Cindy, Hanson,

Thank you for your replies. I should have explained that we get the errors when we run a SingleStore pod alongside another of our workloads; at that point we get errors saying the operating system could not assign enough memory. I found this thread suggesting we should dedicate a node per MemSQL instance: The operating system failed to allocate memory (MemSQL memory use 25576.12 Mb). The request was not processed - SingleStore, Inc.

We had been deploying on a per-environment basis, and my plan/hope was that I could find a server size that would dedicate exactly one server per pod. I was therefore playing with the D8 v4 nodes, in the mistaken belief that the requirement was 8 CPUs.

A 9-CPU requirement is frustrating, as the Azure sizes generally scale in 8-CPU increments; indeed, your suggested E16s has 16 CPUs, so I can’t see how it could run two pods. Even ignoring the exporter, two CPU requests of 8 exceed the allocatable CPUs on a 16-CPU node, since the node doesn’t offer all of its resources to pods. However, I see there are 20-core offerings, so I will try one of those (probably a B20, as I don’t understand how the memory constraints applied by a height of 1 would make use of the 160 GB of RAM on an E20) and will re-jig things to put two MemSQL pods on each of these nodes.

Thanks for your help,
Will

Sorry, I should also have explained: the main error I was getting was that the pod could not be scheduled because there wasn’t a node with sufficient CPU or memory resources. This was with a D8as v4 node pool and a leaf height of 1.

Thank you for the additional information, much appreciated @william.pegg.

Is it possible, in the case where you had MemSQL running alongside your other workload services, that the underlying host kernel and environment settings weren’t set?

System Requirements
Setting Requirements in K8’s
Since you are on Azure, it’s easier to use a DaemonSet
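A DaemonSet is a natural fit here because it applies the host settings on every node in the pool, including nodes added later by the autoscaler. A hedged sketch of that approach; the sysctl names come from SingleStore’s published system requirements, but the exact values should be taken from the System Requirements page for your version rather than from this example:

```yaml
# Sketch: DaemonSet that applies host kernel settings MemSQL expects.
# Values are illustrative; confirm them against the System Requirements docs.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: memsql-sysctl
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: memsql-sysctl
  template:
    metadata:
      labels:
        app: memsql-sysctl
    spec:
      containers:
        - name: sysctl
          image: busybox:1.36
          securityContext:
            privileged: true   # required to change host sysctls
          command:
            - sh
            - -c
            - |
              sysctl -w vm.max_map_count=1000000000
              sysctl -w vm.min_free_kbytes=658096   # ~1% of node RAM; adjust per node size
              # Keep the pod running so the DaemonSet reports healthy.
              sleep infinity
```

Running it in kube-system with a privileged container is the simplest form; a stricter setup would do the sysctl work in an init container and run an unprivileged pause container afterwards.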

Kind regards,
Cindy

In Azure Kubernetes Service (AKS), nodes of the same configuration are grouped together into node pools. These node pools contain the underlying VMs that run your applications. The initial number of nodes and their size (SKU) is defined when you create an AKS cluster. To support applications that have different compute or storage demands, you can create additional user node pools. System node pools serve the primary purpose of hosting critical system pods such as CoreDNS and tunnelfront.

Sorry, I should have followed this up with a conclusion.

A B20ms VM size does allow us to stack a height-1 leaf and an aggregator node on top of each other, with minimal remaining CPU or RAM left unutilized, and we now only get errors if a heavy query exceeds the RAM capacity.

Thanks for the recommendations on DaemonSets; I’ve created those as well, so in theory it should be optimised.

I know that it is possible to change the scale of the height parameter by modifying some of the operator’s environment variables. I assume this is unsupported, though? Otherwise it would seem sensible to make the height much more granular.