Video: Real-Time Analytics at UBER Scale


Mason Hooten

 Digital Marketing Associate

We’ve created an updated version of this blog post with much more detail. – Editor

At Strata+Hadoop World, James Burkhart, technical lead on real-time data infrastructure at Uber, shared how Uber supports millions of analytical queries daily across real-time data with Apollo, Uber’s internal analytics querying language.

James covers architectural decisions and lessons learned from building an exactly-once ingest pipeline that captures raw events across in-memory row storage and on-disk columnar storage. He also details how Uber's custom metalanguage and query layer take advantage of partial OLAP result set caching and query canonicalization. Together, these pieces give thousands of Uber employees analytical queries with subsecond p95 latency across hundreds of millions of recent events.
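To give a rough feel for how query canonicalization and partial result caching can work together, here is a minimal Python sketch. It is not Uber's implementation: the query shape, the `canonicalize` and `cache_key` helpers, and the `PartialResultCache` class are all hypothetical names invented for illustration. It also assumes that results are additive across fixed time buckets and that a bucket never changes once computed.

```python
import hashlib
import json


def canonicalize(query: dict) -> str:
    """Return a stable string so that equivalent queries compare equal.

    Assumption: the order of group-by fields and filters does not change
    the meaning of the query, so both can be sorted.
    """
    normalized = {
        "metric": query["metric"].strip().lower(),
        "group_by": sorted(g.strip().lower() for g in query.get("group_by", [])),
        "filters": sorted(
            (f["field"].strip().lower(), f["op"], str(f["value"]))
            for f in query.get("filters", [])
        ),
    }
    return json.dumps(normalized, sort_keys=True, separators=(",", ":"))


def cache_key(query: dict, bucket_start: int) -> str:
    """Hash the canonical query together with one time bucket."""
    payload = f"{canonicalize(query)}|{bucket_start}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class PartialResultCache:
    """Caches per-bucket partial results (hypothetical, in-memory only)."""

    def __init__(self, bucket_seconds: int = 60):
        self.bucket_seconds = bucket_seconds
        self._store = {}
        self.misses = 0

    def query(self, query: dict, start: int, end: int, compute_bucket) -> int:
        """Sum bucket results covering [start, end), computing only misses.

        The start time is rounded down to a bucket boundary. A real system
        would also avoid caching the newest, still-changing bucket.
        """
        total = 0
        first = start - start % self.bucket_seconds
        for bucket_start in range(first, end, self.bucket_seconds):
            key = cache_key(query, bucket_start)
            if key not in self._store:
                self.misses += 1
                self._store[key] = compute_bucket(query, bucket_start)
            total += self._store[key]
        return total
```

With this shape, two dashboards that issue the same query with differently ordered filters share cache entries, and a query over a sliding time window only recomputes the buckets it has not seen before.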

Video and Slides:

About James

James Burkhart is the technical lead on real-time data infrastructure at Uber. James has a strong background in time series data storage, processing, and retrieval. Previously, he worked on Blueflood, a time series database built on top of Cassandra, while at Rackspace.

Additional Resources

Uber Engineering Blog:

Uber Open Source:

Uber Eng Twitter:

Uber Slides: