Why Your Vector Database Should Not be a Vector Database
The database market is seeing a proliferation of specialty vector databases.

People who buy these products and plumb them into their data architectures may be excited at first by what they can do with vector similarity queries. But eventually, they will regret bringing yet another component into their application environment.

Vectors and vector search are a data type and query processing approach, not a foundation for a new way of processing data. Using a specialty vector database (SVDB) will lead to the usual problems we see (and solve) again and again with our customers who use multiple specialty systems: redundant data, excessive data movement, lack of agreement on data values among distributed components, extra labor expense for specialized skills, extra licensing costs, limited query language power, programmability and extensibility, limited tool integration, and poor data integrity and availability compared with a true DBMS.

Instead of using an SVDB, we believe that application developers using vector similarity search will be better served by building their applications on a general, modern data platform that meets all their database requirements, not just one. SingleStoreDB is such a platform.

SingleStoreDB

SingleStoreDB is a high-performance, scalable, modern SQL DBMS and cloud service that supports multiple data models including structured data, semi-structured data based on JSON, time-series, full text, spatial, key-value and vector data. Our vector database subsystem, first made available in 2017 and subsequently enhanced, allows extremely fast nearest-neighbor search to find objects that are semantically similar, easily using SQL.
Moreover, so-called "metadata filtering" (which is billed as a virtue by SVDB providers) is available in SingleStoreDB in a far more powerful and general form than they provide, simply by using SQL filters, joins and all other SQL capabilities.

The beauty of SingleStoreDB for vector database management is that it excels at vector-based operations and it is truly a modern database management system. It has all the benefits one expects from a DBMS including ANSI SQL, ACID transactions, high availability, disaster recovery, point-in-time recovery, programmability, extensibility and more. Plus, it is fast and scalable, supporting both high-performance transaction processing and analytics in one distributed system.

SingleStoreDB Support for Vectors

SingleStoreDB supports vectors and vector similarity search using the dot_product and euclidean_distance functions; dot_product yields cosine similarity when the vectors are normalized to unit length. These functions are used by our customers for applications including face recognition, visual product photo¹ search and text-based semantic search [Aur23]. With the explosion of generative AI technology, these capabilities form a firm foundation for text-based AI chatbots.

The SingleStore vector database engine implements vector similarity matching extremely efficiently using Intel SIMD instructions.
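
To make the two similarity measures and the idea of metadata filtering concrete, here is a minimal pure-Python sketch. It is not SingleStoreDB's implementation, which is built on SIMD instructions inside the engine. The helper names only mirror the SQL function names for readability, and the sample rows, the category labels and the nearest function are hypothetical, introduced purely for illustration.

```python
import math

def dot_product(a, b):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(dot_product(v, v))
    return [x / norm for x in v]

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return dot_product(a, b) / (norm_a * norm_b)

# Hypothetical table rows: (id, category, unit-length embedding).
rows = [
    (1, "shoes", normalize([1.0, 0.0, 0.0])),
    (2, "shoes", normalize([0.6, 0.8, 0.0])),
    (3, "hats", normalize([0.9, 0.1, 0.0])),
]

def nearest(query, rows, category, k=1):
    """Brute-force top-k search restricted to rows in one category.

    The category test plays the role of a metadata filter; ranking by
    dot product on unit vectors is ranking by cosine similarity.
    """
    q = normalize(query)
    scored = [
        (dot_product(q, emb), row_id)
        for row_id, cat, emb in rows
        if cat == category
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [row_id for _, row_id in scored[:k]]

print(nearest([1.0, 0.1, 0.0], rows, "shoes", k=2))  # [1, 2]
```

In a SQL DBMS the category test would simply be a WHERE clause and the ranking an ORDER BY over the similarity score, which is why the filter can be as general as the query language allows.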