Every organization strives to process more data, faster, so that it can react in real time to customers and make timely decisions about manufacturing, logistics, pricing and other parts of the business. Doing so requires optimizing two related data functions.
The first is to make data broadly accessible. Legacy data systems were purpose-built for individual applications and their specific performance needs. Advances in flash data storage and networking have provided the technology foundation for modern, distributed data platforms that can serve multiple applications and data types effectively. Companies are simplifying data management by removing data silos and natively supporting multiple data types.
The second is to reduce the overhead of legacy data pipelines. The traditional data cycle was transaction, extraction, transformation, combination and then eventually… analytics. Real-time decision making requires running analytics queries very close to the point of data creation. Newer frameworks, including AI inference and Spark, can only be effective when they have access to the data.
SingleStoreDB meets these needs by unifying data-intensive applications in a real-time distributed SQL database. It provides fast ingest, and supports multiple data types and queries.
At a customer’s request, IBM Storage had the opportunity to test SingleStoreDB with IBM Spectrum Scale. It is natural to want to deploy a highly scalable database (SingleStoreDB) on the leader in scalable, distributed storage. The combined solution is more impressive and complementary than we originally anticipated.
Our customers adopt IBM Spectrum Scale to provide a Global Data Platform on which they deploy multiple databases, NoSQL data stores, HDFS and related applications. The IBM storage solution eliminates data movement and provides fast, multi-protocol access for different teams and applications. It also delivers superior data storage economics through consolidation and data tiering.
SingleStoreDB combines transactional and analytical workloads, which is one of the key reasons our client intends to deploy it to consolidate and simplify. They wanted fast, scalable data access that would be flexible in data ingest, but also able to tap their existing data lake on HDFS (on IBM Spectrum Scale).
The combined solution provides deployment simplicity, independent scaling of compute and data, and enhanced data agility. We used OpenShift Kubernetes to deploy IBM Spectrum Scale and SingleStoreDB across multiple servers. IBM Spectrum Scale presents itself to each server as a local filesystem, even though it is distributed across the SingleStoreDB servers. The combined solution performs as if each server were independent, yet it provides a consolidated storage platform that eliminates over-provisioning and increases utilization.
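To make the over-provisioning point concrete, here is a rough back-of-the-envelope sketch in Python. Every number in it is hypothetical and was not measured in our testing: four servers with made-up peak demands, an assumed 20% headroom, and an assumed combined peak that is lower than the sum of the individual peaks because the servers do not all peak at once. It only illustrates why a shared pool can be sized smaller than a set of isolated volumes.

```python
def siloed_capacity(peak_demands_tb, headroom=0.2):
    """Capacity when every server is sized for its own peak plus headroom."""
    return sum(peak * (1 + headroom) for peak in peak_demands_tb)


def pooled_capacity(combined_peak_tb, headroom=0.2):
    """Capacity when one shared pool is sized for the combined peak plus headroom."""
    return combined_peak_tb * (1 + headroom)


def utilization(used_tb, provisioned_tb):
    """Fraction of provisioned capacity that is actually in use."""
    return used_tb / provisioned_tb


if __name__ == "__main__":
    # Hypothetical figures, for illustration only.
    per_server_peaks_tb = [10, 6, 8, 4]  # each server's own peak demand
    combined_peak_tb = 20                # assumed: peaks do not coincide
    average_used_tb = 16                 # assumed average data in use

    siloed = siloed_capacity(per_server_peaks_tb)
    pooled = pooled_capacity(combined_peak_tb)

    print(f"Siloed: {siloed:.1f} TB provisioned, "
          f"{utilization(average_used_tb, siloed):.0%} utilized")
    print(f"Pooled: {pooled:.1f} TB provisioned, "
          f"{utilization(average_used_tb, pooled):.0%} utilized")
```

The benefit in this sketch comes entirely from the assumption that the combined peak is smaller than the sum of the per-server peaks. If every server peaked at the same moment, the two sizing approaches would converge.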
The shared filesystem also has the intrinsic benefit of giving data ingest and data extraction a common location. This is important to our customer, which already takes advantage of IBM Spectrum Scale’s native storage protocols to write data directly from some applications. Each SingleStoreDB node has direct access to this data and can ingest the raw data into the database.
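One way this kind of ingest could be expressed is with a SingleStore filesystem pipeline (`CREATE PIPELINE ... AS LOAD DATA FS`). As we understand it, this type of pipeline expects the files to be reachable at the same path on the nodes, which is what a shared filesystem provides. The Python sketch below only builds the SQL text. The pipeline name, table name, mount path and CSV format are all hypothetical, and this is not necessarily how the ingest was configured in our testing.

```python
def build_fs_pipeline_sql(pipeline, table, path_glob, delimiter=","):
    """Build CREATE PIPELINE and START PIPELINE statements for a filesystem source.

    All argument values are supplied by the caller; nothing here is specific
    to a particular deployment.
    """
    # Keep the sketch safe: accept only simple identifiers and quote-free strings.
    for name in (pipeline, table):
        if not name.isidentifier():
            raise ValueError(f"not a simple identifier: {name!r}")
    for text in (path_glob, delimiter):
        if "'" in text:
            raise ValueError(f"single quotes are not allowed: {text!r}")

    create = (
        f"CREATE PIPELINE {pipeline} AS\n"
        f"LOAD DATA FS '{path_glob}'\n"
        f"INTO TABLE {table}\n"
        f"FIELDS TERMINATED BY '{delimiter}';"
    )
    start = f"START PIPELINE {pipeline};"
    return create, start


if __name__ == "__main__":
    # Hypothetical names and path on the shared filesystem mount.
    create_sql, start_sql = build_fs_pipeline_sql(
        pipeline="raw_events_pipeline",
        table="raw_events",
        path_glob="/gpfs/landing/events/*.csv",
    )
    print(create_sql)
    print(start_sql)
```

The generated statements would then be run through any SQL client connected to the database. The target table has to exist first, with columns that match the files.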
Our thanks to the technical team at our customer who initiated this work, and to those at SingleStore and IBM Storage who quickly tested and demonstrated the solution.
More to come!