Designing Data-Intensive Applications, as a complete book, is more than 500 pages long. Its premise is that data is at the center of many of today’s system design challenges. (That premise is no surprise to us at SingleStore.)
Author Martin Kleppmann is an expert on this topic. He’s a researcher in distributed systems at Cambridge University. Before that, he was a software engineer tackling these topics at several companies, including LinkedIn and Rapportive.
Martin is a frequent conference speaker, blogger, and contributor to open source projects. Besides the work under discussion here, his other currently available book is a free e-book on stream processing, which goes into depth on Apache Kafka and other streaming data platforms. Check it out!
You’ll find that reading the complete Designing Data-Intensive Applications book will be good for you in every possible way. You’ll grow professionally and personally. Martin’s work is that good.
However, we here at SingleStore will also challenge you to go beyond even Martin’s ambitious thesis and book. Rather than focusing on data-intensive applications as some kind of special category, as the author does, we believe that most applications under development today should be treated as data-intensive.
That is, we believe that almost every application should be reconsidered, during design, around key data-related questions:
- What is the core data needed as input to this application?
- What additional data could usefully be gathered as part of the application?
- What is the short-term and long-term value of the core data?
- What is the short-term and long-term value of the potential, additional data?
- Where will the operational data store be? Where will the archival data store be? (Which database; in the cloud vs. on-prem; etc.)
- Do you have the data science and data analytics resources in-house to make the best use of the data?
- If not, should you consider licensing the data, partnering around it, or using other means to ensure that you make full use of it?
- Once your data plans are fully in place, do any additional application opportunities open up?
Machine learning and AI, in particular, bring these questions to life. You can’t do machine learning or AI without data. With machine learning, AI, and the data needed to power the relevant applications, you may be able to accomplish things you had not previously believed achievable.