Kubernetes and SingleStore: 5 Things You Absolutely Need to Know, Part 1

Cindy Parker

Principal Enterprise Solutions Engineer

Using Kubernetes for SingleStoreDB Self-Managed process orchestration is a great idea—right now, one of our customers is orchestrating 15,000 SingleStore instances with a DevOps/Site Reliability Engineer (SRE) team of just eight people. But Kubernetes is not out-of-the-box magic you can unleash by watching a quick tutorial. Part one of this three-part series covers five things you absolutely need to know.

Unless you’ve been coding under a rock for almost a decade, you know that Kubernetes (abbreviated to K8s) is a wildly popular open-source orchestration platform used to manage container technology across cloud environments. Initially released by Google in 2014 and later donated to the Cloud Native Computing Foundation, Kubernetes is now deployed in almost 50% of organizations worldwide.

As it happens, Kubernetes is an ideal tool for orchestrating SingleStoreDB Self-Managed instances. But — and this is a big but — in order to be successful in this use case, K8s needs to be set up properly and performantly. Before you spin up a Kubernetes instance on your laptop or in the cloud and dive in, this blog series will tell you five things you absolutely need to know and prepare for:

  1. Running a production Kubernetes cluster is different from running a test environment
  2. A Kubernetes production cluster environment is a multi-layered stack
  3. Kubernetes novices face steep challenges
  4. Singlestore Helios makes all of the above easy for you
  5. Still want to go it alone? How SingleStoreDB Self-Managed fits into a production environment

This blog, Part 1, touches on Azure-related issues, but Google Kubernetes Engine (GKE) and Amazon Elastic Kubernetes Service (EKS) have their own idiosyncrasies to consider too. At the end of Part 3 I’ll share a bit about our customer who’s easily managing more than 15,000 instances of SingleStoreDB Self-Managed with Kubernetes, and a checklist to recap the skills your IT shop should have if you’re deploying K8s.

Let’s get started with the first of the five things you absolutely need to know.

1. Running a production Kubernetes cluster is different from running a test environment.

Setting up an instance of Kubernetes is astonishingly simple. Google Cloud provides a wizard that will give you a standard-issue GKE cluster with just a few clicks. In this cluster the node pools will already be decided for you. If you want to run Kubernetes on your laptop, you can install minikube, which quickly sets up a local Kubernetes cluster on macOS, Linux and Windows.

But let’s be clear here. Neither of these simple options is suitable for using Kubernetes to orchestrate SingleStoreDB Self-Managed instances at scale, or for any enterprise-class production environment. To illustrate why, let’s look at the important K8s node topology points: scheduling, virtual machine (VM) sizing, storage and networking, all of which must be set up and working properly.

Kubernetes node topology

  • Scheduling details: With just a few clicks a Kubernetes cluster can be set up on Google Cloud, Amazon Web Services (AWS) or Microsoft Azure. How you configure and use your VMs is a critical component of K8s operations. You can set up different node pools in Kubernetes; you might want a high-performance compute pool (VMs with few NUMA sockets but high processing core counts) and a pool of low-power infrastructure VMs for running a simple maintenance web interface, for example. From an architecture standpoint, how you configure VM pools matters: you need enough high-powered resources available to schedule onto when needed. (A quick way to inventory the pools in a running cluster is sketched after this list.)

  • VM sizing: It’s important to size your VM resources correctly to adequately support the Kubernetes services you will run. Because VM sizing matters, SingleStore does not recommend running Kubernetes on AMD architectures; Intel is preferable because its NUMA sockets are much larger, with more processing cores per socket. The processor architecture (AMD or Intel) of the VMs is important because if your instructions execute on the same processor as the memory they touch, access times are measured in nanoseconds. But if the kernel schedules the process elsewhere, away from the memory, access times can be orders of magnitude worse—a very expensive proposition. The fact is, Kubernetes does not handle NUMA well; you can’t specify where a process should execute, so it’s far safer to choose Intel-based VMs.

    Admittedly, this is fairly gnarly configuration territory. If you’re an infrastructure engineer setting up a K8s cluster, you’ll instinctively know how to configure the VMs. If not, you’ll likely click on a standard, low-cost VM configuration, typically with 32 processing cores, 256 GB of RAM and a performance-killing four NUMA sockets. (A quick way to check a VM’s NUMA layout is sketched after this list as well.)
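
To make the node-pool point concrete, here is a minimal sketch using the official Kubernetes Python client that inventories the pools in a running cluster, so you can confirm your high-performance pool actually exists before scheduling SingleStoreDB Self-Managed onto it. The pool labels shown are provider conventions (GKE, AKS and EKS respectively), and the script assumes a working kubectl context; treat it as a starting point, not a definitive implementation.

```python
from collections import defaultdict
from kubernetes import client, config

# Node-pool labels are provider conventions; these cover GKE, AKS and
# EKS respectively. Adjust for your environment.
POOL_LABELS = (
    "cloud.google.com/gke-nodepool",
    "kubernetes.azure.com/agentpool",
    "eks.amazonaws.com/nodegroup",
)

config.load_kube_config()  # uses your current kubectl context

pools = defaultdict(list)
for node in client.CoreV1Api().list_node().items:
    labels = node.metadata.labels or {}
    pool = next((labels[k] for k in POOL_LABELS if k in labels), "<unlabeled>")
    pools[pool].append(node.metadata.name)

for pool, names in sorted(pools.items()):
    print(f"{pool}: {len(names)} node(s)")
```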
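
And because the NUMA layout of a VM shape is the thing most people never check, here is a quick sanity check you can run on a candidate VM image itself (Linux only; it reads the kernel’s sysfs topology rather than the Kubernetes API). Fewer NUMA nodes, each with more CPUs, is what you want.

```python
import os
import re

NODE_DIR = "/sys/devices/system/node"

# Each nodeN directory is one NUMA node; its cpulist file names the
# CPUs that belong to it.
numa_nodes = sorted(
    (d for d in os.listdir(NODE_DIR) if re.fullmatch(r"node\d+", d)),
    key=lambda d: int(d[4:]),
)
print(f"{len(numa_nodes)} NUMA node(s) on this machine")
for n in numa_nodes:
    with open(os.path.join(NODE_DIR, n, "cpulist")) as f:
        print(f"  {n}: CPUs {f.read().strip()}")
```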

Storage

  • Performant vs. non-performant: Google Cloud, AWS and Microsoft Azure have very different default storage resources. Again, to set up storage correctly you’ve got to know your Kubernetes environment.

    For example, the default storage type for K8s on AWS EKS is usually good for most workloads. But on Azure, basic disk storage is not very performant, so when you set up Azure Kubernetes Service (AKS) you will want to choose premium storage. Here’s why: Azure sets throttling limits. The system permits a fixed amount of reads and writes to storage, and if your Kubernetes production environment exceeds that limit, throughput is throttled. This important bit of information is not disclosed when provisioning an AKS cluster, nor is the fact that premium storage is faster and has a much higher read-write quota.

    If you haven’t worked with storage arrays, or are unsure of how storage works in the cloud, you’re apt to choose Azure’s standard, slower disk option. Depending on how you’re using Kubernetes to orchestrate SingleStoreDB Self-Managed, this can have a significant impact on your environment. (A sketch of defining a premium-backed storage class follows below.)
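
If you do provision AKS, one way to steer persistent volumes onto premium disks is to define a storage class up front. The sketch below uses the Kubernetes Python client; the class name is hypothetical, and you should verify the provisioner (the Azure Disk CSI driver) and the Premium_LRS SKU against your cluster before relying on it.

```python
from kubernetes import client, config

config.load_kube_config()

# "managed-premium-retain" is a hypothetical name; disk.csi.azure.com
# is the Azure Disk CSI driver, and skuName: Premium_LRS selects the
# premium SSD tier. Verify both against your cluster's CSI setup.
premium = client.V1StorageClass(
    api_version="storage.k8s.io/v1",
    kind="StorageClass",
    metadata=client.V1ObjectMeta(name="managed-premium-retain"),
    provisioner="disk.csi.azure.com",
    parameters={"skuName": "Premium_LRS"},
    reclaim_policy="Retain",                  # keep disks if the PVC is deleted
    allow_volume_expansion=True,
    volume_binding_mode="WaitForFirstConsumer",
)

client.StorageV1Api().create_storage_class(body=premium)
print("StorageClass managed-premium-retain created")
```

PersistentVolumeClaims that name this class will then land on premium disks rather than Azure’s throttled standard tier.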

Networking is critical

  • Choosing a performant network stack: Each cloud services provider has network configuration options that can make or break the performance of your K8s cluster. In Azure, when you stand up a Kubernetes environment you have two networking options. The first, default option is kubenet:

    With kubenet an Azure virtual network and subnet are created for you. [N]odes get an IP address from the Azure virtual network subnet. Pods receive an IP address from a logically different address space to the Azure virtual network subnet of the nodes. Network address translation (NAT) is then configured so that the pods can reach resources on the Azure virtual network. The source IP address of the traffic is NAT'd to the node's primary IP address. This approach greatly reduces the number of IP addresses that you need to reserve in your network space for pods to use.

    All of which sounds great—but for a production Kubernetes environment, it’s a ticket to failure. Setting up Azure’s networking successfully requires provisioning specific IP address ranges, to make sure you have enough address space to handle the Kubernetes workload. Again, if you’re not an experienced network engineer, it’s easy to skip over this option and go for kubenet. As one Microsoft cloud engineer put it, “In other words, Azure CNI forces you to design networking in advance. With Kubenet, problems of this type are simply delayed.”

    So exactly what are the problems? SingleStore has heard ample anecdotal evidence that kubenet can’t handle production volumes of Kubernetes traffic, dropping connections left and right. A production K8s cluster running on kubenet is markedly slow, bound by the number of connections Azure supports in that environment.

  • Network observability: When problems do occur, it’s important that the networking stack can be observed. A production-caliber network stack or driver offers hooks that provide log output—essential information for troubleshooting. The log output will point to the specific layer of the OSI stack where the problem is occurring.

Azure kubenet does not provide network visibility. Azure Container Network Interface (CNI), the performant option, does; it can be paired with Calico for network policy and observability on top of the Azure stack. Cilium is another popular option for network observability in Kubernetes environments on Google Cloud and AWS. (A quick way to check which networking model a cluster is running appears below.)
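
As a quick diagnostic, the sketch below (again using the Kubernetes Python client and assuming a working kubectl context) lists node and pod IPs side by side: with kubenet, pod IPs come from a separate NAT’d range, while with Azure CNI they come from the same VNet subnet as the nodes.

```python
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

# Node internal IPs come from the VNet subnet.
node_ips = sorted({
    addr.address
    for node in v1.list_node().items
    for addr in (node.status.addresses or [])
    if addr.type == "InternalIP"
})
# Pod IPs, excluding host-network pods (which just reuse the node IP).
pod_ips = sorted({
    pod.status.pod_ip
    for pod in v1.list_pod_for_all_namespaces().items
    if pod.status.pod_ip and not pod.spec.host_network
})

print("node IPs:", node_ips)
print("pod IPs :", pod_ips[:20])
# kubenet: pod IPs come from a separate, NAT'd address space.
# Azure CNI: pod IPs come from the same VNet subnet as the nodes.
```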

No small engineering feat

As you can see, standing up a production-class Kubernetes environment takes more than a few quick clicks on default options. Simple, standard-issue K8s clusters are uniformly unfit for the demands of production SingleStoreDB Self-Managed orchestration.

The fact is, getting started with Kubernetes is deceptively easy; without the right engineering skills and technical background, that early ease won’t survive contact with production. Read my next blog, on why a Kubernetes production cluster environment is a multi-layered stack and why Kubernetes novices face steep challenges, to dive deeper into the five things you absolutely need to know before deploying Kubernetes for SingleStoreDB Self-Managed orchestration.

Follow @SingleStoreDB on Twitter to keep up with all of our latest news. And try Singlestore Helios for free.

