SingleStore vs. ClickHouse: Why Consistent Vector Search Latency Matters

5 min read

Mar 5, 2026

Introduction

As AI-powered applications become mainstream, vector search has evolved from a niche capability to a critical database feature. Whether you're building recommendation engines, semantic search, or RAG (Retrieval-Augmented Generation) applications, the performance of your vector operations directly impacts user experience.

We recently conducted comprehensive benchmarks comparing SingleStore's vector capabilities against ClickHouse, another popular analytical database. The results demonstrate why SingleStore's unified architecture delivers the consistent, low-latency performance that production AI applications demand.

The Benchmark Setup

To ensure a fair comparison, we tested both databases on comparable cloud infrastructure in AWS us-east-1, using the same dataset of 45,466 movie records. Each record includes a 768-dimensional vector embedding generated using the all-mpnet-base-v2 model—a realistic representation of production semantic search workloads.

SingleStore Configuration: S-00 cluster (16 GB RAM, 2 vCPUs) running on Helios, our fully managed cloud service

ClickHouse Configuration: Cloud deployment (16 GB RAM, 4 vCPUs)—notably with double the CPU resources

Despite running on half the vCPUs, SingleStore consistently matched or outperformed ClickHouse across the majority of test scenarios.

Schema Design

We used a real-world dataset of 45,466 movies from Kaggle, representing a realistic semantic search workload. The schema was designed to mirror production use cases:
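A minimal sketch of the kind of schema used is shown below. The table and column names are illustrative assumptions, not the exact benchmark DDL; only the 768-dimension vector column reflects the setup described above:

```sql
-- Illustrative schema sketch; names are assumptions, not the benchmark's exact DDL
CREATE TABLE movies (
    id INT NOT NULL,
    title VARCHAR(512),
    overview TEXT,
    embedding VECTOR(768, F32) NOT NULL,  -- all-mpnet-base-v2 produces 768-dim vectors
    SORT KEY (id)
);
```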

The embeddings were generated using the all-mpnet-base-v2 model—a popular choice for semantic similarity tasks. SingleStore stores these vectors using columnar storage for optimal analytical performance.

Testing Methodology

We divided testing into two phases to capture both programmatic and interactive performance characteristics:

Phase 1 — Python Script Testing: We built a Python test harness that simulates real application behavior. A user provides a natural language query (e.g., "A movie about sports like basketball"), which the script converts to vector embeddings matching the 768 dimensions stored in the database. The script then fires the similarity query and measures response time. Importantly, timing begins after the database connection and cursor are established, and ends when the result rows are written—isolating pure query execution time from connection overhead and embedding generation.
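In outline, the measurement pattern looks like the following sketch. The query function here is a stand-in so the example runs on its own; only the timing approach is taken from the methodology above:

```python
import time

def time_query_ms(execute):
    """Run a query callable and return (rows, elapsed_ms).

    The clock starts only once the connection and cursor already exist,
    so connection overhead and embedding generation are excluded,
    matching the methodology described above.
    """
    start = time.perf_counter()
    rows = execute()  # e.g. cursor.execute(sql, params) + fetchall()
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return rows, elapsed_ms

# Stand-in for a real similarity query so the sketch is self-contained:
def fake_query():
    time.sleep(0.02)  # pretend the database round trip takes ~20 ms
    return [("Space Jam", 0.91), ("Hoosiers", 0.89)]

rows, ms = time_query_ms(fake_query)
print(f"{len(rows)} rows in {ms:.1f} ms")
```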

Phase 2 — Portal Testing: To eliminate any scripting overhead and validate our programmatic results, we also ran queries directly through each database's native web portal.

Handling Cold vs. Warm Cache

Cache behavior significantly impacts real-world performance, so we measured both scenarios:

Cold Cache (Run 1): Before each test series, we cleared the query plan cache to force the optimizer to generate fresh execution plans. This simulates first-time queries or queries after system restarts—critical for understanding worst-case latency.

Warm Cache (Average): We then ran each query multiple times consecutively and averaged the results. This represents steady-state performance for frequently-executed queries.
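The exact cache-clearing statements vary by database version; the following are the documented commands we would expect to use for each system (verify against the current vendor docs before relying on them):

```sql
-- SingleStore: discard all compiled query plans
DROP ALL FROM PLANCACHE;

-- ClickHouse: discard the query result cache
SYSTEM DROP QUERY CACHE;
```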

We also made a deliberate choice: no query tuning or optimizer hints. All queries ran with default optimizer behavior, reflecting what most developers experience out of the box. The only optimization applied was adding indexes where documented—and those results are reported separately.

Key Performance Findings

Dot Product Search: Faster Out of the Box

For dot product similarity searches—the foundation of many recommendation systems—SingleStore delivered first-query response times of 1,197 ms compared to ClickHouse's 2+ seconds: at least 40% lower latency on cold cache.

When we added SingleStore's DOT_PRODUCT index, query times dropped to ~70 ms. ClickHouse doesn't offer a dedicated dot product index, leaving a significant performance gap for this common use case.
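As a sketch, adding a dot-product-tuned ANN index in SingleStore and querying against it looks roughly like this. The index name, options, and table/column names are illustrative; check the current VECTOR INDEX syntax in the SingleStore docs:

```sql
-- Illustrative: an ANN index using dot product as the metric
ALTER TABLE movies ADD VECTOR INDEX ivf_dot (embedding)
    INDEX_OPTIONS '{"index_type": "IVF_PQFS", "metric_type": "DOT_PRODUCT"}';

-- @query_vec holds the 768-dimension embedding of the user's prompt
SELECT title, DOT_PRODUCT(embedding, @query_vec) AS score
FROM movies
ORDER BY score DESC
LIMIT 10;
```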

L2/Euclidean Distance: Consistent Performance

Both databases offer L2 distance calculations, but SingleStore showed more predictable performance patterns. Without indexes, SingleStore completed queries in ~75 ms, while ClickHouse took 241-366 ms.

With indexes, SingleStore reduced latency even further while incurring less computational overhead.

Concurrency: Where Real-World Performance Matters

Production applications don't run one query at a time. Our Python test harness included concurrency tests that reveal how each database performs under parallel load.

Basic Concurrency (5 Parallel Queries):

We fired 5 similarity queries simultaneously, each searching for the closest match to a different user prompt:

SingleStore's latency ranged from 450-580ms with tight variance, while ClickHouse ranged from 540-810ms. More importantly, SingleStore delivered this performance with half the CPU resources.
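The basic concurrency test can be sketched with a thread pool. The query function below is a stub standing in for the real similarity query, and the prompts are illustrative; only the fire-5-at-once pattern mirrors the test described above:

```python
from concurrent.futures import ThreadPoolExecutor
import time

def run_similarity_query(prompt):
    """Stub for one vector similarity query; returns (prompt, elapsed_ms)."""
    start = time.perf_counter()
    time.sleep(0.01)  # placeholder for the real database round trip
    return prompt, (time.perf_counter() - start) * 1000.0

prompts = [
    "A movie about sports like basketball",
    "A heist gone wrong",
    "Coming-of-age drama",
    "Space exploration thriller",
    "Animated family comedy",
]

# Fire all 5 queries simultaneously, mirroring the basic concurrency test
with ThreadPoolExecutor(max_workers=5) as pool:
    results = list(pool.map(run_similarity_query, prompts))

for prompt, ms in results:
    print(f"{ms:6.1f} ms  {prompt}")
```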

Advanced Concurrency (Sustained Background Load):

We then ran a more realistic test: for every query being timed, 4 other queries ran continuously in the background. The script spawned 5 threads, each executing its query 5 times while the others ran simultaneously. This measures performance under sustained concurrent load—what your database actually experiences in production.
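That sustained-load pattern can be sketched as 5 threads, each timing its own query 5 times while the other 4 keep running. The query function and prompts are placeholders; the thread structure is what matches the test described above:

```python
import threading
import time

def run_similarity_query(prompt):
    time.sleep(0.005)  # placeholder for the real database round trip

timings = {}  # prompt -> list of per-run latencies in ms

def worker(prompt):
    runs = []
    for _ in range(5):  # each thread times its query 5 times...
        start = time.perf_counter()
        run_similarity_query(prompt)
        runs.append((time.perf_counter() - start) * 1000.0)
    timings[prompt] = runs  # ...while the other 4 threads keep running

prompts = [f"prompt {i}" for i in range(5)]  # illustrative stand-ins
threads = [threading.Thread(target=worker, args=(p,)) for p in prompts]
for t in threads:
    t.start()
for t in threads:
    t.join()

for prompt, runs in timings.items():
    print(f"{prompt}: avg {sum(runs) / len(runs):.1f} ms over {len(runs)} runs")
```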

Overall, SingleStore demonstrated better concurrency performance, with improvements ranging from 50 to 260ms per query depending on the workload.

Complex Analytical Queries

We also tested a complex k-nearest-neighbor query that combines multiple vector operations:

  1. Select N pivot rows with 768-dimensional embeddings
  2. For each pivot, compare against all other rows computing both dot product similarity and L2 distance
  3. For each pivot, return only the top K nearest neighbors by L2 distance
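One way to express the steps above in SQL is the following hedged sketch; the table and column names are assumed, and N and K are hard-coded for illustration:

```sql
-- For each of N pivot movies, find the K nearest neighbors by L2 distance,
-- also computing dot product similarity. Names and constants are illustrative.
WITH pivots AS (
    SELECT id, title, embedding FROM movies ORDER BY id LIMIT 5   -- N pivots
),
scored AS (
    SELECT p.title AS pivot_title,
           m.title AS neighbor_title,
           DOT_PRODUCT(p.embedding, m.embedding)        AS dot_sim,
           EUCLIDEAN_DISTANCE(p.embedding, m.embedding) AS l2_dist,
           ROW_NUMBER() OVER (
               PARTITION BY p.id
               ORDER BY EUCLIDEAN_DISTANCE(p.embedding, m.embedding)
           ) AS rnk
    FROM pivots p
    JOIN movies m ON m.id <> p.id
)
SELECT pivot_title, neighbor_title, dot_sim, l2_dist
FROM scored
WHERE rnk <= 3;   -- K nearest neighbors per pivot
```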

Both databases initially took over 2 seconds. However, SingleStore's execution time dropped to one-third of its initial time on subsequent runs—and to one-fifth with proper indexing. ClickHouse showed minimal improvement in this scenario.

Portal-Based Query Results

To validate our Python script results and eliminate any scripting overhead, we ran the same queries directly through each database's native web portal. Before each test, we cleared the plan cache to ensure fresh optimization.

The results tell a clear story: SingleStore delivers dramatically better cold-cache performance (often 3-7x faster on first run), while maintaining competitive warm-cache times. For user-facing applications where cold starts impact experience, this difference is critical.

Why SingleStore Excels for Vector Workloads

  1. Native Vector Type: SingleStore's purpose-built VECTOR data type and functions (DOT_PRODUCT, EUCLIDEAN_DISTANCE, COSINE_SIMILARITY) are optimized from the ground up, not bolted on.
  2. Flexible Indexing: Support for multiple ANN index types (IVF_PQFS, HNSW) lets you optimize for your specific access patterns.
  3. Consistent Latency: Whether it's your first query or your thousandth, SingleStore delivers predictable performance under varying loads.
  4. Unified Platform: Combine vector search with transactional data, analytics, and full-text search in a single database—no complex data pipelines required.

The Bottom Line

For teams building AI-powered applications, database choice matters. SingleStore's consistent performance across cold and warm queries, superior concurrency handling, and comprehensive vector indexing options make it the stronger choice for production vector workloads—especially when you need predictable latency at scale.

Ready to see how SingleStore performs with your vector data? Start your free trial today and experience the difference unified, real-time data architecture makes for AI applications.
Dataset: The Movies Dataset on Kaggle