While Hadoop excels at storing large volumes of data, it is too slow for building real-time applications. However, our recent collaboration with Cisco offers Hadoop users a better way to process real-time data. Using Cisco’s Application Centric Infrastructure (ACI), including the Application Policy Infrastructure Controller (APIC) and Nexus switch technology, we have demonstrated exceptional throughput on concurrent SingleStore and Hadoop 2.0 workloads.
Here’s How It Works
Cisco’s new networking technology automatically prioritizes the smaller packet streams generated by real-time workloads over the larger packet streams typically generated by Hadoop. This keeps throughput high on clusters running SingleStore and Hadoop workloads simultaneously.
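Cisco’s ACI fabric classifies and prioritizes traffic automatically in the switches, so no application changes are required. For readers curious about the underlying mechanism, a common way latency-sensitive applications signal priority on an ordinary network is by marking packets with a DSCP code point that QoS-aware switches honor. The sketch below is illustrative only (it is not how ACI or SingleStore is configured) and uses the standard socket API:

```python
import socket

# DSCP "Expedited Forwarding" (EF, value 46): a standard code point that
# QoS-aware switches can use to queue latency-sensitive packets ahead of
# bulk traffic such as Hadoop shuffle streams.
DSCP_EF = 46

def mark_realtime(sock: socket.socket) -> None:
    """Tag a socket's outgoing packets with a high-priority DSCP marking.

    The IP TOS byte carries the 6-bit DSCP value in its upper bits,
    so the code point is shifted left by 2.
    """
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, DSCP_EF << 2)

# Example: a client socket for small, real-time queries.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
mark_realtime(sock)
# Switches along the path can now prioritize these packets over
# traffic left in the default (best-effort) class.
```

The advantage of doing this in the fabric, as ACI does, rather than per application, is that the prioritization policy is applied consistently across the whole cluster without touching application code.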
At the Strata + Hadoop conference last week in New York, Cisco demonstrated the solution on an 80-node cluster running both SingleStore and Hadoop. With no additional network traffic, the cluster serves 2.4 million reads per second from SingleStore’s in-memory database. When a simulated Hadoop workload is added to saturate the cluster’s network, performance drops to under 600 thousand reads per second without packet prioritization. With packet prioritization enabled, performance recovers to 1.4 million reads per second, more than double the unprioritized throughput.
Why Does It Matter?
This advance makes it possible to colocate SingleStore, handling real-time, mission-critical data ingest and analysis, with Hadoop workloads that are less time-sensitive and run as large batch jobs over historical data.
By combining Hadoop’s storage infrastructure with SingleStore’s real-time data processing, businesses get the best of both worlds: real-time analytics at Hadoop scale. As an added bonus, the solution saves on hardware costs by running SingleStore and Hadoop together on the same cluster.
If you want to learn more, contact a SingleStore representative at email@example.com or at (855) 463-7660.