New Environment Validation Checks in SingleStore Tools

Roxanna Pourzand

Previous Product Manager

You can now validate the hardware and software environment for SingleStore before you install the software. As you may already know, the SingleStore-Report module within SingleStore Tools allows you to run health checks on your database. New functionality in SingleStore-Report lets you run environment validation checks on your machines before SingleStore is installed. These checks help ensure that machines are appropriately configured for the best database performance and the least likelihood of future problems.

SingleStore is a high-performance database; it ingests data rapidly, and supports workloads that require fast transaction processing, low query latency, and high concurrency. (High concurrency means lots of simultaneous users, including individual SQL queries, business intelligence tools, applications, and machine learning models.)

You may have heard the database analogy that compares a fast database to a race car. At SingleStore, we like this comparison. To confirm your race car is ready to perform, you need to ensure that the foundation of the car – the configuration of the environment that your database will run in – is in good shape.

The Importance of Configuration for a Distributed Database

The world of databases includes a large number of databases that run best on a single machine, or that require very careful configuration and management to “scale out” in a limited fashion. There is also a small number of newer relational databases that are distributed, which are referred to as NewSQL databases. SingleStore is a NewSQL database. There is also a wide, and growing, range of NoSQL databases. (We have our own take on NoSQL.)

Any truly distributed database, whether NewSQL or NoSQL, depends on the cooperation of many separate nodes to function. For a distributed database, your performance is only as good as your slowest node – and, the more time each and every node is up and running, the faster and more reliable the whole database is.

Since performance issues can often be tied directly to configuration – whether it be the operating system, network, or disk – it is important to have a foolproof way to check that your entire system is set up properly, down to the last node, so you can unleash the full power of the database.

Configuring SingleStore

What does this mean for the SingleStore database? Optimal configuration will lead to the best possible performance, with the fewest possible problems. This translates to obtaining more value out of your data faster, and spending less time on tuning and troubleshooting.

We will review some examples below of configuration recommendations that can affect the database. Our system requirements documentation contains a full list of these items.

At a high level, it is essential to confirm that your machines have enough resources to operate the database, and that your operating system is configured properly. Here are three examples:

  • From a hardware perspective, we require a minimum of 4 cores, and 8 GB of RAM, per server.
  • Some operating system configuration recommendations include checks for settings like ‘Transparent Huge Pages’; if this setting is not disabled, you may experience inconsistent query performance.
  • Configuring Non-Uniform Memory Access (NUMA), on machines that can benefit from it, can improve performance significantly, depending on your workload.
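As a rough illustration of the first two items, the sketch below checks a single Linux host against the minimums quoted above (4 cores, 8 GB of RAM) and reports whether Transparent Huge Pages is disabled. It is not how SingleStore-Report is implemented; the function names are hypothetical, and it assumes the standard Linux locations `/proc/meminfo` and `/sys/kernel/mm/transparent_hugepage/enabled`.

```python
import os

MIN_CORES = 4    # minimum cores per server, as stated in the list above
MIN_RAM_GB = 8   # minimum RAM per server, as stated in the list above

THP_PATH = "/sys/kernel/mm/transparent_hugepage/enabled"  # standard Linux sysfs path


def active_thp_mode(text):
    """Return the bracketed (active) value from a THP sysfs file.

    The kernel prints all options and brackets the active one,
    for example "always madvise [never]".
    """
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def mem_total_gb(meminfo_text):
    """Extract MemTotal (reported in kB) from /proc/meminfo content, in GiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1]) / (1024 * 1024)
    raise ValueError("MemTotal not found")


def evaluate_host(cores, ram_gb, thp_mode):
    """Return a list of problems; an empty list means all three checks pass."""
    problems = []
    if cores < MIN_CORES:
        problems.append(f"{cores} cores found, at least {MIN_CORES} required")
    if ram_gb < MIN_RAM_GB:
        problems.append(f"{ram_gb:.1f} GB RAM found, at least {MIN_RAM_GB} required")
    if thp_mode != "never":
        problems.append(f"Transparent Huge Pages is '{thp_mode}', expected 'never'")
    return problems


def check_local_host():
    """Hypothetical helper: gather values from the local Linux host and evaluate them."""
    with open("/proc/meminfo") as f:
        ram_gb = mem_total_gb(f.read())
    with open(THP_PATH) as f:
        thp_mode = active_thp_mode(f.read())
    return evaluate_host(os.cpu_count() or 0, ram_gb, thp_mode)
```

Calling `check_local_host()` on a Linux machine returns a list of problem descriptions. Note that `MemTotal` is usually a little below the installed RAM, so a host with a nominal 8 GB may be flagged by this strict comparison.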

SingleStore recommends more than a dozen specific system configuration settings be checked and, where needed, changed, before you install the SingleStore database. It’s tedious to have to check/change each of them by hand, across every host in your cluster – and any tedious manual effort opens the door for potential errors.

To avoid the manual effort that would otherwise be needed, use SingleStore-Report to do this work. SingleStore-Report summarizes all the information in one place, through an easy-to-use interface.

Pre-Installation Validation in SingleStore Tools

The SingleStore-Report module collects a report on your cluster that covers a series of checks around the SingleStore cluster, the databases within it, and the system hosting it. It also outputs a set of pass/fail checks on settings, based on SingleStore-recommended best practices.

In previous versions, the Report module expected you to have an existing SingleStore cluster. Recently, we released a version of SingleStore-Report that adds the ability to run pre-environment checks, which report only on components that are applicable to host machines without the SingleStore software installed on them.

This feature allows you to confirm the validity of the environment before installing the database and loading data. Incorporating this pre-check functionality in SingleStore Tools means you have a clear-cut path to identify any problems, before you proceed with the installation process.

How Does It Work?

In the first step of SingleStore software installation, you download SingleStore Tools to manage the software. (Later in the process, Tools will deploy the database for you as well.) After you register the machines that you plan to install SingleStore on with Tools, don’t proceed immediately with SingleStore installation as the next step. Instead, run the following command to check your environment first:

memsql-report collect --validate-env

This collects a report with pre-installation environment checks, without installing anything. After the report has been collected, you can run:

memsql-report check --validate-env --report-path </path/to/report>

This outputs a list of all pre-environment checks in a pass/fail/warn manner, and alerts you to any potential configuration changes that you need to make before proceeding with the installation.
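If you want to script these two steps, a minimal sketch might look like the following. It uses only the subcommands and flags shown above and assumes `memsql-report` is on the `PATH`. The helper names are hypothetical, and the report path is supplied by the caller, since how `collect` names its output file is not covered here.

```python
import subprocess

TOOL = "memsql-report"  # assumed to be installed and on PATH


def collect_command():
    """Argument list for the pre-installation collection step."""
    return [TOOL, "collect", "--validate-env"]


def check_command(report_path):
    """Argument list for checking a previously collected report."""
    return [TOOL, "check", "--validate-env", "--report-path", str(report_path)]


def run_check(report_path):
    """Run the check step and return its text output.

    No assumption is made about the exit code, so the output is
    returned as-is for the caller to inspect.
    """
    result = subprocess.run(
        check_command(report_path), capture_output=True, text=True
    )
    return result.stdout
```

Building the argument lists separately from running them keeps the commands easy to log or review before anything is executed on the hosts.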

Below is a sample output of this check, followed by the actions we recommend you take if you get this report in your own environment.

$ memsql-report check --validate-env --report-path report-2020-05-05T000204.tar.gz
✘ minFreeKbytes ................................. [FAIL]
FAIL vm.min_free_kbytes = 67584 too low on 172.31.68.57
NOTE https://docs.singlestore.com/db/latest/en/reference/configuration-reference/cluster-configuration/system-requirements-and-recommendations.html
✓ validateSsd ................................... [PASS]
✘ partitionsConsistency ......................... [WARN]
WARN Some partitions start sector on nvme0n1 are inconsistent (should be a multiple of 4096): [nvme0n1p1]
✓ diskUsage ..................................... [PASS]
✓ chronydDisabled ............................... [PASS]
✓ cpuHyperThreading ............................. [PASS]
✓ cpuModel ...................................... [PASS]
NOTE AMD EPYC 7571 on all
✓ orchestratorProcesses ......................... [PASS]
✓ cpuFeatures ................................... [PASS]
✓ vmOvercommit .................................. [PASS]
✓ defunctProcesses .............................. [PASS]
✓ kernelVersions ................................ [PASS]
NOTE 4.18 on all
✓ cpuFreqPolicy ................................. [PASS]
✘ maxMapCount ................................... [FAIL]
FAIL vm.max_map_count = 65530 too low on 172.31.68.57
NOTE https://docs.singlestore.com/db/latest/en/reference/configuration-reference/cluster-configuration/system-requirements-and-recommendations.html
✓ collectionErrors .............................. [PASS]
✘ transparentHugepage ........................... [FAIL]
FAIL /sys/kernel/mm/transparent_hugepage/enabled is [always] on 172.31.68.57
FAIL /sys/kernel/mm/transparent_hugepage/defrag is [madvise] on 172.31.68.57
NOTE https://docs.singlestore.com/db/latest/en/reference/troubleshooting-reference/query-errors/issue--inconsistent-query-run-times.html

Some checks failed: 11 PASS, 2 WARN, 3 FAIL

Seeing this report, as a user, I would do the following:

  • Increase the kernel settings flagged as too low, vm.max_map_count and vm.min_free_kbytes, to the values recommended in the linked documentation; raising vm.max_map_count decreases the risk of memory errors.
  • Check on the consistency of disk partitions on this host, so that performance of disk operations across the cluster falls within a similar range.
  • Disable Transparent Huge Pages to ensure that the system has consistent query performance times.
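When a report covers many hosts, it can help to triage the output programmatically. The sketch below parses text in the format of the sample report into check names, statuses, and detail lines. The line format is inferred from that sample only, and the function names are hypothetical, so treat it as a starting point rather than a supported interface.

```python
import re
from collections import Counter

# Header lines in the sample look like: "✘ maxMapCount ....... [FAIL]"
HEADER = re.compile(r"^\S\s+(\w+)\s+\.+\s+\[(PASS|WARN|FAIL)\]$")
DETAIL_PREFIXES = ("FAIL", "WARN", "NOTE")


def parse_check_output(text):
    """Return a list of {'name', 'status', 'details'} dicts, in output order."""
    checks = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = HEADER.match(line)
        if match:
            checks.append(
                {"name": match.group(1), "status": match.group(2), "details": []}
            )
        elif checks and line.split(" ", 1)[0] in DETAIL_PREFIXES:
            # Detail lines belong to the most recent check header.
            checks[-1]["details"].append(line)
    return checks


def needs_attention(checks):
    """Return only the checks whose status is WARN or FAIL."""
    return [c for c in checks if c["status"] != "PASS"]


def tally(checks):
    """Count checks per status."""
    return Counter(c["status"] for c in checks)
```

Feeding the text output of the check command to `parse_check_output` and then calling `needs_attention` gives a short list of the items to fix before installation, each with the host and value reported in its detail lines.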

For more information on these commands, please see the documentation on memsql-report collect and memsql-report check.

Conclusion and What’s Next

The ability to check your system, prior to installation, against a vetted set of SingleStore best practices helps ensure that your database is production-ready to serve your critical applications.

Stay tuned for additional functionality around pre-install checks that we will provide in the future. This includes the ability for the tool to run performance benchmarks on your hardware. Also, we plan to incorporate the validation check directly in the installation process, so you won’t have to run it separately anymore.

If you have a SingleStore cluster that’s managed by SingleStore Tools, you can try this today! Check and see if your servers are configured appropriately.

If you are not yet using SingleStore, you can try SingleStore for free or contact SingleStore.

