SingleStoreDB for Wide Table Use Cases

SingleStoreDB specializes in applications with high-performance analytic query latency and concurrency requirements, including real-time, customer-facing streaming analysis.

Many of these streaming analytics applications leverage wide tables — particularly where the results require a vast (or all) data to be aggregated, or when the schema evolves. One of the most common questions I get after a demo is, “this makes sense for more structured data and distributed data models, but how does this work with my wide table with n-hundred columns?”

The question typically comes from experience with tuning analytical queries for performance, or updating speed on wide tables in MySQL, cold-storage themed cloud-only columnstores or data lakes. The question can also arise from trying to take data from NOSQL and ETL to pre-parse json into spare and wide tables. Sometimes it leads developers to time-series solutions which are resource intensive, difficult to tune and limited in extensibility. What should developers do when the requirement is for real-time ingest and millisecond response for complex queries on large datasets?

This blog explores how SingleStore’s Universal Storage is designed to handle complex queries on large datasets in narrow or wide tables before discussing sizing and design considerations for bringing wide tables into SingleStore. We’ll see how SingleStore can meet real-time streaming ingest and query response times for web scale applications as long as we understand our data and queries, apply the appropriate indexing and schema strategy, and select a workspace with enough memory, diskspace and CPU.

Wide Tables Need Wider Capabilities

If you’ve found this blog, you likely know at least a little about the benefits of SingleStoreDB’s Universal Storage — and might even be curious what you need to consider for wide table-specific use cases. It is a modern scalable distributed architecture designed to handle transactions and analytics with low millisecond latency.

It is a multi-modal and memory-optimized columnstore that is compressed and presorted for seekability. This engine diagram provides an overview of how SingleStoreDB is designed for real-time speed and real-world scale.

In this diagram, we see how data is ingested into SingleStoreDB. Inserts and updates are streamed with native pipelines or app-driven commands into an in-memory rowstore buffer before flushing synchronously to disk and asynchronously to a cloud storage blob. This three-tier storage is partitioned across independent compute nodes with attached SSD storage that can scale out according to workload requirements. When queried, SingleStoreDB leverages memory to speed up queries by reducing seeks within the columnstore — or eliminating them entirely.

A few notable features of this three-tiered storage engine help it perform particularly well in wide-table use cases compared to OLAP solutions, or OLTP-only solutions:

Distributed scale-out SQL brings massively parallelized lock-free inserts, upserts and queries
Compression and query compilation reduces disk scanning for lower latency queries
Real-time streaming with native SingleStoreDB Pipelines parallelize ingest across partitions with ability to transform and query data
Separation of storage and compute; expand storage beyond working set in disk by writing asynchronously to object store

In wide table use cases, SingleStoreDB’s storage engine’s design foundation on speed and complexity ensure app performance. Indexing and query optimization engines work together to guarantee low latency for transactional and analytical use cases when properly sized.

Keys and Indexes for Performance

Schema tuning is always important, but in wide tables where queries span many column segments, optimizing schemas is vital to guaranteeing sub-second response times. All schema tuning should be done against critical queries from your use case to ensure the SingleStoreDB engine is tuned properly.

This can be done with a profile of the query with the Visual Explain Tool and considering the results to adjust shard, sort and index selection appropriately. Check out this blog post for a guide on how to get started with query tuning.

Key Selection

Data is partitioned and segmented by shard and sort key into rows and column segments that power bitmapping that facilitates segment elimination and subsegment access. Combine this with hash indexes to speed up equality lookups — and the fact that some queries only need to scan in-memory metadata ex. Select count(*) if no recent deletes — and columnstore is built to speed up complex analytical queries while maintaining the ability to insert and update efficiently, even in wide tables.

When optimizing schema for wide tables, consider shard and sort key selection carefully to maximize query speed by taking advantage of the architecture (and add a hash on equality filters). More detail on indexes can be found here.

In wide-table uses cases, consider the following advice on key selection:

The effect of skew often has more weight as tables become wider; choose a shard key with minimal skew (sub 10%) and on common filters to maximize the distributed engine
Selective filters on sort key seek within required column segments are required instead of the whole row, reducing the volume of data scanned for the query to return

As tables become larger or wider, the parameters behind the columnstore segment can be adjusted with alter commands run on the sort key.

Columnstore_segment_rows determines the size of columnstore segments for elimination; the default is one million, with smaller rows allowing for more elimination
Increasing `Columnstore_flush_bytes` can facilitate wider rows loading through memory, instead of direct to disk

As shown in the following example, both can be set at table creation or with an alter.

1CREATE TABLE t2 (2  id INT, 3  SORT KEY (id) WITH (4    columnstore_flush_bytes = 4194304, 5    columnstore_segment_rows = 1000006  )7);8
9ALTER TABLE t2 MODIFY KEY id SET (10  columnstore_flush_bytes = 838860811);

Column Group Index

SingleStoreDB has also addressed a traditional pain point of columnstores for wide tables: slow I/O intensive seeks of the whole row when doing single or multi-row lookup or insert. This is done through the addition of a columngroup index. This section synthesizes the great work done in a recent blog post, “Universal Storage, Part 6: Column Group,” which outlines transactional performance improvement of up to 6x with the addition of column groups.

Here we see a table with a column group created. As an in-memory rowstore segment flushes to disk, SingleStoreDB creates a single columnstore segment containing the standard columnstore blobs and a secondary column group index (shown in green) which represents the entire row.

A query:

Filters down to required rows using segment elimination on columnstore blobs and filtering on hash indexes
This happens before materializing only required rows after filtering from column group.

While this second copy increases disk utilization significantly, it gives SingleStoreDB’s columnstore the transactional chops on wide tables to back up the Universal Storage claim. Note, column group indexes can be created during table creation, or afterwards with the following alter command: `ALTER TABLE col_table ADD COLUMN GROUP col_group_ind (*));`

Query Compilation and Plancache

SingleStoreDB’s in-memory buffer provides a query compilation plancache that compiles query shapes and uses them to optimize queries for speed by keeping required blobs in memory, and maintaining a bitmap to speed up the retrieval of blobs from disc or object.

This asynchronous compilation dramatically improves performance of queries on subsequent runs, but does add compilation time on initial run and consumes additional memory and disk resources.

Wide tables have more potential query shapes to compile depending on the distinct queries being run (think ad-hoc vs. repeated queries), so you might max out compilation memory on an undersized workspace. This can be resolved by increasing max_compilation_memory, as outlined here. If you find compilation for queries in a wide table consistently consuming too high a percentage of memory, then consider setting plancache_expiration to a shorter timespan or scaling the workspace to add additional memory as required to better manage memory.

Sizing Considerations

Sizing a wide-table use case starts with the same considerations as any other. The easiest way to do this is to work with SingleStore’s Solutions Engineering team to do sizing, testing and tuning.

Otherwise, start with uncompressed data size as a baseline; this can be calculated: RawDataSizeinTBs * (1 - CompressionRatio)*2 + (RawDataSizeinTBs * (1 - CompressionRatio)*.25). Review the sizes in Singlestore Helios and select an workspace instance size with the appropriate disk as a starting point.

This is when you should load data, test queries and ingest, apply indexes and scale as needed. You can also get in touch with our team for help. In wide tables with high concurrency, real-time streaming and low latency, you may need to scale-out as dictated by your requirements.

In high concurrency queries and large ingest volume across wide tables, you’ll want to preserve enough CPU per partition to perform operations quickly by testing relationship between CPU:Partition, typically best at 4:1.. For example, a database with eight partitions on a S-4 with 32 CPU as: create database test partitions 8;. This must be balanced against the overall simultaneous threads required, and may require added scale to a workspace to meet concurrency SLAs when required to meet a more aggressive latency target.

Summary

SingleStoreDB is the ideal solution for your wide table use case because it is built from the ground up to maximize speed in transactional and complex analytics use cases in a distributed, scale-out universal storage engine. We have customers doing this at scale in tables with hundreds of columns — some with thousands.

The question isn't can we handle it, but would your queries be better served with some normalization considering SingleStoreDB’s powerful distributed joins. If your table is wide enough to bring you here, you’ve probably already found eliminating joins is the best approach. Start there and test normalization as time permits later on.

As shown in transactional benchmarking on wide tables and analytical-transactional benchmarking across industry-standard data, SingleStoreDB allows you to transact, analyze and contextualize data as it is created with low millisecond latency. In this blog we’ve seen how SingleStoreDB’s three-tier storage, query engine and index capabilities can be considered for tables with thousands of columns.

Still think your table is too wide for SingleStoreDB? Try it out for yourself today with a free trial.

SingleStoreDB for Wide Table Use Cases

Wide Tables Need Wider Capabilities