Organizations once waited hours, days, or even weeks to get a handle on their data. In an earlier era, that sufficed. But with today’s endless stream of zeros and ones, data must be usable right away. It’s the crux of decision making for enterprises competing in the modern era.
Recognizing cross-industry interest in massive data ingest and analytics, we teamed up with O’Reilly Media on a new book: The Path to Predictive Analytics and Machine Learning. In this book, we cover the latest step in the real-time analytics journey, predictive analytics, and offer a playbook for building applications that take advantage of machine learning.
Chapter 1: Building Real-Time Data Pipelines
We begin with a review of our previous O’Reilly book: Building Real-Time Data Pipelines – Unifying Applications and Analytics with In-Memory Architectures. It covers the emergence of in-memory architectures and provides a framework for building real-time pipelines that serve as the foundation for machine learning applications.
Chapter 2: Processing Transactions and Analytics in a Single Database
This chapter details the shift from Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP) to converged, multi-model systems designed for Hybrid Transaction/Analytical Processing (HTAP).
Chapter 3: Dawn of the Real-Time Dashboard
Data visualization is arguably the most powerful method for enabling humans to understand and spot patterns in a dataset. Chapter three explores the role of Business Intelligence (BI) tools and how they provide a visualization layer that lets data analysts detect historical trends and make predictions about the future.
Chapter 4: Redeploying Batch Models in Real Time
Applying existing batch processes based on statistical models to real-time data pipelines opens a multitude of easily accessible opportunities for machine learning and predictive analytics. In this chapter, we look at ways to apply machine learning to real-time problems by repurposing familiar models.
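As a rough illustration of the idea (not code from the book), the sketch below fits a simple linear model on historical data in one batch step and then reuses that same model to score events one at a time as they arrive. The function names, the toy data, and the choice of ordinary least squares are all assumptions made for this example.

```python
from statistics import mean


def fit_linear(xs, ys):
    """Batch step: ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def score_stream(events, model):
    """Real-time step: score each incoming event with the pre-trained model."""
    slope, intercept = model
    for x in events:
        yield x, slope * x + intercept


# Hypothetical historical data, generated from y = 2x + 1.
history_x = [1.0, 2.0, 3.0, 4.0]
history_y = [3.0, 5.0, 7.0, 9.0]
model = fit_linear(history_x, history_y)

# Stand-in for a live feed; in practice this would be a queue or stream consumer.
for x, prediction in score_stream(iter([5.0, 6.0]), model):
    print(x, prediction)
```

The point of the split is that training stays a periodic batch job, while scoring is cheap enough to run on every record in the pipeline.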
Chapter 5: Applied Introduction to Machine Learning
Choosing the proper machine learning technique requires evaluating a series of tradeoffs, such as training versus scoring latency, bias versus variance, and in some cases accuracy versus complexity. This chapter provides a broad introduction to applied machine learning, with emphasis on resolving these tradeoffs with business objectives in mind.
Chapter 6: Real-Time Machine Learning Applications
Chapter six details how real-time data processing systems and the ready availability of machine learning libraries make it possible to apply machine learning and execute finely tuned decisions in the moment.
Chapter 7: Preparing Data Pipelines for Predictive Analytics and Machine Learning
Although certain techniques are better suited to real-time analytics and tight training or scoring latency requirements, the challenges preventing adoption relate largely to infrastructure rather than machine learning theory. This part of the book provides best practices for building database systems that are well suited to predictive analytics and machine learning.
Chapter 8: Predictive Analytics in Use
Expanding on the subject of taking machine learning from batch to real-time, this chapter explores Internet of Things (IoT) and renewable energy use cases. It also provides machine learning code samples and connects the dots from real-time data pipelines to BI visualizations.
Chapter 9: Techniques for Predictive Analytics in Production
This section makes predictive analytics more accessible by combining well-understood machine learning techniques with technology advances in software and hardware. It also includes numerous code samples to help you get started.
Chapter 10: From Machine Learning to Artificial Intelligence
The move from machine learning to broader artificial intelligence will happen. The final chapter paves a logical path forward for making the leap.