Objects in SingleStore, Part 2


Shrinivas Sagare

Senior Professional Services Architect

Objects in SingleStore, Part 2

With a number of exciting features, it is always important for any database to perform better, while simultaneously delivering a better user experience. Indexes are the key objects in any database system, providing path and metadata to the optimizer. SingleStore works on unique indexes, which are best on different table types. 

This is the second article in a three-part series on using objects in SingleStore. In case you missed it, check out part one here to explore using table types.  

Types of Indexes:

SingleStore allows you to create indexes for search paths like other standard databases. Indexes are generally used by optimizers for filter (WHERE clause) and JOIN conditions. Main indexes include skiplist for Rowstore tables, and clustered Columnstore for Columnstore tables. 

Skiplist Index:

SingleStoreDB supports skiplists only on rowstore tables. These are similar to B-tree indexes. Skiplists are optimized to run in memory, support lock-free implementation and offer extremely fast insert performance. 

Get the full story behind SingleStore’s skiplist indexes

Columnstore Index

Clustered columnstore indexes provide significant data compression and are backed by disk. Currently, clustered columnstore indexes cannot be combined with in-memory row store indexes on the same table. 

Five key advantages of Columnstore index / SORT KEY include:

  1. All queries can take advantage of the fact that only the column segments containing columns referenced in the query will need to be scanned, as well as the fact that the columnstore’s compression results in less data needing to be scanned. 
  2. Some queries can be performed simply by reading the in-memory metadata for column segments referenced in the query. Queries that need only sum, count, min and max aggregates eliminate the need to read column segments from the disk.
  3. Queries that join tables on columns within a columnstore index can be performed very efficiently through the use of a merge join algorithm, allowing the join to be performed by simply scanning two segments that need to be joined in lock step.
  4. Certain types of data allow filters and group-by operations to be performed without decompressing data from its serialized-for-disk format. This greatly improves performance by reducing the amount of data that needs to be processed.
  5. Queries using equality filters can take advantage of hash indexes. 

Hash Index: 

These are classic indexes on rowstore and columnstore tables. A HASH index will only be utilized if the query employs an equality filter for every column in the index. For multi-column indexes, query filters must match all of the index columns to take advantage of the index. It can be created with a clause using HASH. 

If a rowstore or columnstore table is being created, the following applies:

  • You can create single-column or multi-column hash indexes
  • When you create a unique single-column hash index, the shard key can contain only one column — which must be the same column that you have created the index on. When you create a unique multi-column hash index, the shard key must be a subset of the columns that you have created the index on
  • You can create multiple single-column hash indexes on a reference table

If a columnstore table is being created, the following applies:

  • You can create, at most, one unique hash index. You can create multiple multi-column hash indexes.
  • You cannot create a unique hash index on a FLOAT, REAL or DOUBLE column.

Full-Text Indexes:

Full-text search enables you to search for words or phrases in a large body of text through an inverted index. These types of indexes can be defined during the CREATE TABLE statement on columnstore tables. Full-text indexes cannot be altered or dropped, and can only be  created on char, varchar, text or longtext. 

Indexing JSON Data:

SingleStore enables storage of  JSON data, as well as quickly  querying the data using indexing on persisted columns. JSON columns are not indexed directly — they are indexed using computed columns. For the fastest performance, you should not use JSON built-ins or :: notation in your filters. Instead, create a computed column that includes the JSON column in the computation, and then use the computed column for the index. This way, the index gets updated only when the relevant JSON data is updated in a row.

description TEXT,
properties JSON NOT NULL,
weight AS properties::%weight PERSISTED DOUBLE,
license_plate AS properties::$license_plate PERSISTED TEXT,
KEY(license_plate), KEY(weight));

Don’t miss part three of this series, where I’ll dive into additional details about procedural objects. 

Interested in seeing how objects in SingleStore work? Try it out free today.