
There has been an explosion of interest, sparked by LLMs, in vector search for machine learning, semantic search, and generative AI applications. The market for vector databases is estimated to be around $3 billion in 2026, growing from almost nothing before 2022. Early implementations of vector search focused on functionality and used the standard 32-bit (4-byte) floating point type to represent vector elements.
As people added more data to their vector databases, cost, particularly memory usage, became a concern. As it turns out, vector similarity checks are not very sensitive to the precision of vector elements. That characteristic makes using 16-bit float vector elements instead of 32-bit float vector elements feasible. Further, Intel and ARM instruction sets have supported 16-bit floats for native math operations since around 2022-2023. In fact, fast conversion functions (F16C) on Intel hardware date back to 2012 (Ivy Bridge).
The combination of vector operations' tolerance for reduced precision and hardware support for 16-bit floating point operations has enabled SingleStore to dramatically reduce the compute and storage costs of vector search in our 9.1 release: storage costs are approximately halved and compute costs are reduced by almost 40%, while recall stays essentially the same!
Our new data type VECTOR(<N>, F16), which capitalizes on 16-bit floating point instructions, provides these benefits. Several other vector-search-capable database products have also added float16 support since 2024, though some notable products have not.
It's our pleasure to tell you how we made vector search in SingleStore cheaper, faster, and simply better using VECTOR(<N>, F16).
When working with vector embeddings, such as those produced by text embedding models, it's helpful to understand the numeric values involved. Modern embedding models typically output vectors in 32-bit floating point ("single precision") format. In SingleStore, you can now store and index vectors in both 32-bit floating point (F32) and 16-bit floating point (F16) formats.
F32 (32-bit float) can represent an enormous range of values—from approximately ±1.2 × 10⁻³⁸ up to ±3.4 × 10³⁸. That's incredibly tiny numbers and astronomically large ones. And it provides about 7 decimal digits of precision.
F16 (16-bit float), also called "half precision," covers a smaller but still substantial range of values: roughly ±6.1 × 10⁻⁵ to ±6.5 × 10⁴. In comparison to F32, F16 has reduced precision—only about 3-4 decimal digits—but F16 uses half the memory and is faster to process.
The key differences between F32 and F16 floating point representations are that F32 can store a greater range of values and has more precision.
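The precision gap is easy to see with Python's standard `struct` module, which supports both IEEE-754 half precision (format `e`) and single precision (format `f`). This sketch is purely illustrative and independent of SingleStore:

```python
import struct

def roundtrip(value: float, fmt: str) -> float:
    """Pack a Python float into the given IEEE-754 format and back.
    'e' = 16-bit half precision, 'f' = 32-bit single precision."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]

x = 0.123456789
f32 = roundtrip(x, "f")   # keeps ~7 significant digits
f16 = roundtrip(x, "e")   # keeps ~3-4 significant digits

print(f"F32: {f32:.9f}")  # 0.123456791
print(f"F16: {f16:.9f}")  # 0.123474121

# The largest finite F16 value is 65504; it survives the round trip exactly.
print(roundtrip(65504.0, "e"))  # 65504.0
```

Note how the half-precision round trip already diverges in the fourth decimal digit, while single precision is faithful to about seven digits.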
In vector embeddings, you rarely see values anywhere near the theoretical limits of F32 or F16. Most embedding models normalize their output vectors to unit length (length = 1). Normalization means that if you calculate the Euclidean distance from the origin to any embedding vector, you get 1.
Because of this normalization, individual components of embedding vectors fall in a much tighter range: -1 to 1. In practice, they're often clustered even closer to zero, depending on the dimensionality of your embeddings. The higher the dimension count, the smaller the typical magnitude of individual components, since the same total "energy" (that unit length) is being distributed across more dimensions. The full range of values possible with F32 vectors is typically not necessary for vector embeddings.
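A quick sketch of this effect using only the Python standard library; the Gaussian vectors and dimension choices here are arbitrary illustrations, not properties of any particular embedding model:

```python
import math
import random

def normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

random.seed(42)
for dim in (8, 960):
    v = normalize([random.gauss(0.0, 1.0) for _ in range(dim)])
    # Unit length means the squared components sum to 1, so higher
    # dimensionality forces each individual component to be smaller.
    largest = max(abs(x) for x in v)
    print(f"dim={dim:4d}  largest |component| = {largest:.4f}")
```

At 960 dimensions (the GIST dataset's dimensionality), the largest component of a typical unit vector is a small fraction of 1, comfortably inside F16's representable range.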
Further, while F16 has lower precision than F32, F16 is acceptable for many vector search applications. Vector search applications such as semantic search and generative AI retrieval systems do not need exact distances; they only need the correct ranking. So even with some loss of precision, vector search still maintains the correct ordering. Furthermore, the embedding vectors themselves are not precise and have good tolerance to some noise and lost precision.
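Here is a small self-contained sketch of why ranking survives quantization. The candidate vectors are synthetic, constructed to have known cosine similarities to the query; all names are illustrative, not SingleStore APIs:

```python
import math
import struct

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def to_f16(v):
    """Round every element to IEEE-754 half precision."""
    return list(struct.unpack(f"{len(v)}e", struct.pack(f"{len(v)}e", *v)))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

dim = 960
q = normalize([math.sin(i + 1) for i in range(dim)])
u = [math.cos(3 * i + 1) for i in range(dim)]
# Gram-Schmidt: make u orthogonal to q, then unit length
proj = dot(u, q)
u = normalize([x - proj * y for x, y in zip(u, q)])

# Candidates with known cosine similarities to q: 0.95, 0.80, 0.60, 0.40
sims = [0.95, 0.80, 0.60, 0.40]
cands = [[s * a + math.sqrt(1 - s * s) * b for a, b in zip(q, u)]
         for s in sims]

# Rank candidates by dot product at full precision and after F16 rounding
rank32 = sorted(range(4), key=lambda i: -dot(q, cands[i]))
rank16 = sorted(range(4), key=lambda i: -dot(to_f16(q), to_f16(cands[i])))
print(rank32, rank16)  # [0, 1, 2, 3] [0, 1, 2, 3]
```

The half-precision rounding perturbs each dot product by a tiny amount (well under 0.01 here), far smaller than the gaps between candidates, so the ordering is unchanged.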
We compared the storage used by F16 vectors versus F32 vectors in SingleStore tables using the GIST 1M dataset from http://corpus-texmex.irisa.fr/. This dataset contains one million vectors of dimension 960. As expected, F16 uses about 50% of the storage of F32.
| Vector Type | Storage Size |
| --- | --- |
| F16 | 1.79 GB |
| F32 | 3.58 GB |
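These numbers line up with simple back-of-the-envelope arithmetic for the raw vector data: one million 960-dimensional vectors at 2 or 4 bytes per element.

```python
rows, dims = 1_000_000, 960

for name, bytes_per_elem in (("F16", 2), ("F32", 4)):
    gb = rows * dims * bytes_per_elem / 1024**3  # GiB of raw vector data
    print(f"{name}: {gb:.2f} GB")
# F16: 1.79 GB, F32: 3.58 GB -- matching the measured table sizes above
```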
The F16 vectors provide faster search and indexing performance compared to F32 vectors for operations such as DOT_PRODUCT (which calculates cosine similarity on normalized vectors) and EUCLIDEAN_DISTANCE.
As with the storage test, the performance tests were conducted with the GIST 1M dataset. These vectors were stored in two tables - one with a VECTOR(960, F16) column, and one with a VECTOR(960, F32) column. The F16 vectors were obtained from the F32 column by casting using SingleStore's built-in conversions.
The results show that an exact kNN (k-Nearest Neighbor) search using DOT_PRODUCT on F16 vectors is 38% faster than when using F32, and a similar query using EUCLIDEAN_DISTANCE is 37% faster when using F16 versus F32.
| Operation | F16 | F32 | Improvement |
| --- | --- | --- | --- |
| DOT_PRODUCT | 491.8 ms | 794.6 ms | 38.1% |
| EUCLIDEAN_DISTANCE | 500.6 ms | 797.6 ms | 37.2% |
The query used to test DOT_PRODUCT performance on F16 vectors is shown below; @vec_f16 is a variable that holds the query vector. Queries for F32 and EUCLIDEAN_DISTANCE are similar and are included in the Methodology section.
```sql
SELECT DOT_PRODUCT(f16_col, @vec_f16) AS score
FROM t_f16
ORDER BY score DESC LIMIT 10;
```
ANN indexes are supported on both F16 and F32 vectors. The results show that index build and search times are similar for F16 and F32 vectors. You can save vector storage space, while maintaining and even improving search performance.
We compared IVF_PQFS and HNSW_FLAT indexes because those are the indexes we recommend using. IVF_PQFS uses product quantization (PQ) to reduce storage space and produces a smaller index. HNSW_FLAT does not compress vectors and has high accuracy (recall) but produces a larger index.
SingleStore uses the Faiss library, which does not support F16 hardware instructions. Thus, while building the index, vectors are upcast to F32, and the index stores F32 vectors. As a result, vector index sizes are the same for F16 and F32. For IVF_PQFS indexes, the use of product quantization reduces index storage space significantly.
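Because every F16 value is exactly representable in F32, the upcast itself loses no information. A quick exhaustive check in Python (using the stdlib `struct` module, independent of SingleStore or Faiss):

```python
import struct

def f16_bits_to_float(bits: int) -> float:
    """Interpret a 16-bit pattern as an IEEE-754 half-precision value."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]

# Every positive finite F16 bit pattern survives an F16 -> F32 -> F16
# round trip, so upcasting for the index preserves the vectors exactly.
for bits in range(0, 0x7C00):  # 0x7C00 is where infinities/NaNs begin
    h = f16_bits_to_float(bits)
    f32 = struct.unpack("<f", struct.pack("<f", h))[0]  # store as F32
    back = struct.unpack("<e", struct.pack("<e", f32))[0]
    assert back == h
print("all finite F16 values upcast losslessly")
```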
The search times are shown below. Index search on F16 is slightly faster, likely due to the smaller memory footprint of F16 vectors leading to better cache utilization as more of the working set fits in the cache.
| Index | F16 | F32 | Improvement |
| --- | --- | --- | --- |
| IVF_PQFS | 33.25 ms | 34.73 ms | 4.3% |
| HNSW_FLAT | 30.71 ms | 33.50 ms | 8.3% |
The query used to test F16 ANN search times is shown below; @qvec holds the query vector. Queries for F32 and the index creation commands are included in the Methodology section.
```sql
SET @qvec = UNHEX('<gist_query_vector_hex>'):>VECTOR(960, F16);

-- ANN search: find 100 approximate nearest neighbors
SELECT id, EUCLIDEAN_DISTANCE(f16_col, @qvec) AS dist
FROM t_f16
ORDER BY dist ASC
LIMIT 100;
```
The index build times are below. The index build is slightly slower for F16 vectors due to the overhead of typecasting from F16 to F32.
| Index | F16 | F32 | Improvement |
| --- | --- | --- | --- |
| IVF_PQFS | 12.11 s | 11.06 s | -9.5% |
| HNSW_FLAT | 463.3 s | 429.25 s | -7.3% |
The queries to create the IVF_PQFS and HNSW_FLAT indexes are shown in the Methodology section below.
Okay, maybe you doubted we could make vector search better, but you read this far to find out. And you're right in one sense: the quality of the results is not better, it's essentially the same as when using F32 vectors. But using F16 vectors can save you money, because F16 vector searches can be done with half the hardware and storage, at about the same speed as F32 searches, and, on our test dataset, with nearly identical recall. Half the price and just as good!
Using F32 is, however, still important for use cases that require higher numerical precision, such as scientific computing, medical imaging, and financial applications. But for the majority of use cases, our experiments show that F16 is the better choice, especially for larger datasets.
The GIST 1M dataset includes ground truth data. Recall was calculated by comparing query results against the ground truth. The results are shown below.
F16 and F32 have statistically equivalent recall across both index types. The sub-1% variations are within the margin of error inherent to approximate nearest neighbor (ANN) search and the benchmark methodology. In practice, F16 delivers the same search quality as F32.
| | IVF_PQFS F16 | IVF_PQFS F32 | HNSW_FLAT F16 | HNSW_FLAT F32 |
| --- | --- | --- | --- | --- |
| Recall | 96.4% | 95.8% | 97.9% | 98.0% |
When building a new application using vector search, if the application doesn't need high precision, all you need to do to get the benefits of F16 is to create your tables with columns of the VECTOR(<N>, F16) type, and use them as we describe in our documentation on working with vector data.
If migrating an existing application that uses F32 vectors to F16, you can either ALTER the table to add a new F16 vector column and populate it with UPDATE (or a series of UPDATEs for smaller batches), or create a new table and move the data over with INSERT…SELECT (also in batches if needed). Example SQL for doing this is given in the later section Migrating to F16 from F32.
The new F16 vector support in SingleStore 9.1 allows you to reduce vector storage space while maintaining and even improving performance. It's a win-win.
We have argued before that your vector database should not be a vector database:
Why Your Vector Database Should Not be a Vector Database
Why Your Vector Database Should Still Not be a Vector Database
Vectors and vector search are a data type and a query processing approach, not a foundational change. Vector processing can be effectively integrated into modern SQL databases. Using SingleStore gives you the benefits of vector processing plus all the other benefits of a full data platform.
The appendix contains supplemental information: a methodology section describing our test process, and an example of how to migrate from F32 vectors to F16 vectors.
The tests were conducted with the GIST 1M dataset from http://corpus-texmex.irisa.fr/, which contains 1 million vectors of 960 dimensions.
Setup
- The GIST 1M vectors were stored in two tables: one with a VECTOR(960, F16) column, and one with a VECTOR(960, F32) column.
```sql
CREATE TABLE t_f16 (f16_col VECTOR(960,F16), id INT);
CREATE TABLE t_f32 (f32_col VECTOR(960,F32), id INT);
```
- The F16 vectors were obtained from the F32 column by using an INSERT INTO statement which uses SingleStore's built-in conversions from F32 to F16.
```sql
INSERT INTO t_f16
SELECT * FROM t_f32;
```
- All tests were performed using a SingleStore Helios S-00.
- All tests were run 5 times; the results are the average of the middle 3 values.
- The following query was used to calculate the table sizes.
```sql
SELECT
  cs.database_name,
  cs.table_name,
  dp.role,
  FORMAT(SUM(cs.uncompressed_size) / 1024 / 1024, 0) AS uncompressed_mb,
  FORMAT(SUM(cs.compressed_size) / 1024 / 1024, 0) AS compressed_mb
FROM information_schema.columnar_segments AS cs
JOIN information_schema.distributed_partitions AS dp
  ON cs.database_name = dp.database_name
  AND cs.partition = dp.ordinal
  AND cs.node_id = dp.node_id
WHERE cs.database_name = '<db-name>'
  AND dp.role = 'Master'
GROUP BY
  cs.database_name,
  cs.table_name,
  dp.role;
```
DOT_PRODUCT and EUCLIDEAN_DISTANCE Performance
The performance of DOT_PRODUCT and EUCLIDEAN_DISTANCE was evaluated as follows.
A query vector for the F16 test was selected using this query.
```sql
SELECT f32_col INTO @vec_f32
FROM t_f32
ORDER BY f32_col DESC LIMIT 1;

SET @vec_f16 = @vec_f32 :> VECTOR(960, F16);
```
The following queries were timed to obtain the F16 performance results.
```sql
SELECT DOT_PRODUCT(f16_col, @vec_f16) AS score
FROM t_f16
ORDER BY score DESC LIMIT 10;

SELECT EUCLIDEAN_DISTANCE(f16_col, @vec_f16) AS dist
FROM t_f16
ORDER BY dist DESC LIMIT 10;
```
A query vector for the F32 test was selected using this query.
```sql
SELECT f32_col INTO @vec_f32
FROM t_f32
ORDER BY f32_col DESC LIMIT 1;
```
The following queries were timed to obtain the F32 performance results.
```sql
SELECT DOT_PRODUCT(f32_col, @vec_f32) AS score
FROM t_f32
ORDER BY score DESC LIMIT 10;

SELECT EUCLIDEAN_DISTANCE(f32_col, @vec_f32) AS dist
FROM t_f32
ORDER BY dist DESC LIMIT 10;
```
The performance of indexed ANN queries was evaluated using similar queries, but on the table with indexes.
Index Creation for F16 column:
```sql
-- Build IVF_PQFS index
ALTER TABLE t_f16 ADD VECTOR INDEX (f16_col)
INDEX_OPTIONS '{"index_type":"IVF_PQFS", "metric_type":"EUCLIDEAN_DISTANCE", "nlist":128, "nprobe":100, "m":240}';

-- Build HNSW_FLAT index
ALTER TABLE t_f16 ADD VECTOR INDEX (f16_col)
INDEX_OPTIONS '{"index_type":"HNSW_FLAT", "metric_type":"EUCLIDEAN_DISTANCE", "M":16, "efConstruction":128, "ef":200}';
```
Index Creation for F32 Vector column:
```sql
-- Build IVF_PQFS index
ALTER TABLE t_f32 ADD VECTOR INDEX (f32_col)
INDEX_OPTIONS '{"index_type":"IVF_PQFS", "metric_type":"EUCLIDEAN_DISTANCE", "nlist":128, "nprobe":100, "m":240}';

-- Build HNSW_FLAT index
ALTER TABLE t_f32 ADD VECTOR INDEX (f32_col)
INDEX_OPTIONS '{"index_type":"HNSW_FLAT", "metric_type":"EUCLIDEAN_DISTANCE", "M":16, "efConstruction":128, "ef":200}';
```
ANN Search for F16 Column (gist_query_vector_hex is the hex-encoded representation of a 960-dimensional query vector from the GIST1M benchmark dataset):
```sql
SET @qvec = UNHEX('<gist_query_vector_hex>'):>VECTOR(960, F16);

SELECT id, EUCLIDEAN_DISTANCE(f16_col, @qvec) AS dist
FROM t_f16
ORDER BY dist ASC LIMIT 100;
```
ANN Search for F32 Column:
```sql
SET @qvec = UNHEX('<gist_query_vector_hex>'):>VECTOR(960, F32);

SELECT id, EUCLIDEAN_DISTANCE(f32_col, @qvec) AS dist
FROM t_f32
ORDER BY dist ASC LIMIT 100;
```
SingleStore supports conversions from F32 to F16 vectors, which makes migrating from F32 to F16 straightforward. So, you can halve your vector storage costs right away.
To convert a table that uses F32 vectors to F16 vectors:
- Add a new column with type VECTOR(<N>, F16) to the table.
- Use an UPDATE … SET statement to convert the F32 values to F16 values.
- Drop the F32 column and its associated index.
- Rename the new column and create an index on it, if desired.
Given the following CREATE TABLE and ADD VECTOR INDEX statements for F32 vectors:
```sql
CREATE TABLE t_vecs(c_orig VECTOR(1024, F32));

-- ... Insert Data ...

ALTER TABLE t_vecs ADD VECTOR INDEX i_orig (c_orig)
  INDEX_OPTIONS '...';
```
Use the following queries to add a new column of F16 vectors and to set the value of those F16 vectors to the value of the F32 vectors.
The UPDATE … SET command uses SingleStore's built-in casting to convert the F32 vectors to F16 vectors.
```sql
ALTER TABLE t_vecs ADD COLUMN c_new VECTOR(1024,F16);
UPDATE t_vecs SET c_new = c_orig;
```
Once the new column has been created and updated, drop the old column and index, rename the new column to the original column name, and re-create the index, if there was an index.
The new column and re-created index use the same names as the original column and index so that existing queries execute without modifications.
```sql
DROP INDEX i_orig ON t_vecs;
ALTER TABLE t_vecs DROP COLUMN c_orig;

ALTER TABLE t_vecs CHANGE c_new c_orig;
ALTER TABLE t_vecs ADD VECTOR INDEX i_orig (c_orig)
  INDEX_OPTIONS '...';
```

















