Real-Time Stream Processing Architecture with Hadoop and SingleStore


Lesia Myroshnichenko

Product Marketing Specialist

While SingleStore and Hadoop are both data stores, they fill different roles in the data processing and analytics stack. The Hadoop Distributed File System (HDFS) enables businesses to store large volumes of immutable data, but by design, it is used almost exclusively for batch processing. Moreover, newer execution frameworks that are faster and storage-agnostic are challenging MapReduce as businesses’ batch processing interface of choice.

Lambda Architecture

A number of SingleStore customers have implemented systems using the Lambda Architecture (LA). LA is a common design pattern for stream-based workloads in which hot, recent data requires fast updates and analytics, while long-term history is kept on cheaper storage. Using SingleStore as the real-time path and HDFS as the historical path has been a winning combination for many companies. SingleStore serves as a real-time analytics serving layer, ingesting and processing millions of streaming data points per second, and gives analysts immediate access to operational data via SQL. Long-term analytics and longer-running, batch-oriented workflows are pushed to Hadoop.
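The split between a real-time path and a historical path can be sketched in a few lines of Python. This is only an illustration of the pattern, not SingleStore or Hadoop code: `SpeedLayer`, `BatchLayer`, and the `Event` fields are hypothetical in-memory stand-ins, and the sketch assumes events arrive in time order.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Event:
    ts: int        # event time in seconds (hypothetical field)
    device: str    # hypothetical dimension to filter on
    value: float   # hypothetical metric


def _average(rows: List[Event], device: str) -> float:
    values = [r.value for r in rows if r.device == device]
    return sum(values) / len(values) if values else 0.0


@dataclass
class SpeedLayer:
    """Stand-in for the real-time path: holds only a recent window of events."""
    retention_s: int
    rows: List[Event] = field(default_factory=list)

    def ingest(self, event: Event) -> None:
        self.rows.append(event)
        # Drop everything older than the retention window (assumes ordered arrival).
        cutoff = event.ts - self.retention_s
        self.rows = [r for r in self.rows if r.ts > cutoff]

    def average(self, device: str) -> float:
        return _average(self.rows, device)


@dataclass
class BatchLayer:
    """Stand-in for the historical path: append-only, never trimmed."""
    rows: List[Event] = field(default_factory=list)

    def append(self, event: Event) -> None:
        self.rows.append(event)

    def average(self, device: str) -> float:
        return _average(self.rows, device)


speed = SpeedLayer(retention_s=60)
batch = BatchLayer()
for e in [Event(0, "stb-1", 10.0), Event(50, "stb-1", 20.0), Event(100, "stb-1", 30.0)]:
    speed.ingest(e)   # hot path: recent data only
    batch.append(e)   # cold path: full history

recent = speed.average("stb-1")    # average over the last 60 seconds
history = batch.average("stb-1")   # average over everything ever seen
```

With these three events, the speed layer retains only the two most recent ones, so `recent` is 25.0 while `history` is 20.0. The same question gets a different answer depending on which path serves it, which is the trade-off the pattern makes explicit.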

Use Case: Real-Time Analytics at Comcast

As an example, SingleStore customer Comcast focuses on real-time operational analytics. By using SingleStore and Hadoop together, Comcast can proactively diagnose potential issues from real-time intelligence and deliver the best possible video experience. Their Lambda architecture writes one copy of data to a SingleStore instance and another one to Hadoop.
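Writing one copy of each record to two destinations is a simple fan-out. The sketch below shows the idea in Python under loud assumptions: `fan_out`, the sink names, and the record fields are all hypothetical, the sinks are plain lists rather than real database or HDFS writers, and failure handling is omitted. It does not describe Comcast's actual pipeline.

```python
from typing import Callable, Dict, Iterable

# A sink is any callable that accepts one record (hypothetical interface).
Sink = Callable[[dict], None]


def fan_out(records: Iterable[dict], sinks: Dict[str, Sink]) -> Dict[str, int]:
    """Write every record to every sink and return per-sink write counts."""
    written = {name: 0 for name in sinks}
    for record in records:
        for name, write in sinks.items():
            write(dict(record))  # each sink receives its own copy
            written[name] += 1
    return written


realtime_rows: list = []   # stand-in for the real-time store
archive_rows: list = []    # stand-in for the historical store

counts = fan_out(
    [{"session": "a", "bitrate_kbps": 4200}, {"session": "b", "bitrate_kbps": 3100}],
    {"realtime": realtime_rows.append, "archive": archive_rows.append},
)
```

Giving each sink its own copy of the record means a later change on one path cannot silently alter what the other path stored.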

SingleStore enables Comcast to run lightning-fast real-time analytics on large, changing datasets and makes their analytics infrastructure more performant overall. Instead of just logging all Xfinity data and analyzing it hours or days later, Comcast can get both viewership and infrastructure monitoring metrics in real time. HDFS provides a quasi-infinite data store where they can run machine learning jobs and other “offline” analytics.

Watch the recorded session from Strata+Hadoop World to find out more about how SingleStore helps Comcast's Xfinity platform serve millions of users, process enormous volumes of data, and perform advanced real-time analytics at the same time.

Review more customer stories.

If you’re interested in test driving SingleStore, download it now.