Incumbents and Contenders in the $33B Database Market


Bruce Armstrong

Operating Partner at Khosla Ventures

The database market continues to surprise those of us who have been in it for a while. After the initial wave of consolidation in the late 1990s and early 2000s, the market has exploded with new entrants: column-stores, document databases, NoSQL, in-memory, graph databases, and more. But who will truly challenge the incumbents for a position in the Top 5 rankings? Oracle, IBM, Microsoft, SAP, and Teradata dominate the $33B database market. Will it be a NoSQL database? Will it be an open source business model?

Ripping and replacing existing databases has been described as heart and brain surgery – at the same time. As such, new entrants must find new use cases to gain traction in the market. In addition, the new use cases must be of enough value to warrant adding a new database to the list of approved vendors. Splitting the world roughly into analytic use cases and operational use cases, we have seen a number of different vendors come and go without seriously disrupting the status quo. Part of the problem appears to be the strategy of using open source as a way to unseat the established vendors. While people seem willing to at least try free software (especially for new use cases), is it a sustainable business model?

The open-source market is growing rapidly. However, it is still less than 2% of the total commercial database market. Gartner’s latest numbers show the open-source database market at only $562M, and the total commercial database market at $33B, in 2014.

Furthermore, databases are complex, carrying decades of history behind them. To match, and ultimately exceed, incumbent offerings, the key is not to have armies of contributors working in individual lanes, but rather to have a focused effort on the features that matter most for today’s critical workloads. This is especially true with the increasing number of mixed analytical and transactional use cases driven by the new real-time, digital economy. In the case of MySQL, the most successful open source database product, less than 1% of the installed base pays anything. Monty Widenius, the creator of MySQL, himself pointed this out in a famous post a couple of years ago.

The business model needs to make sense too. The open source world almost never subtracts; it adds: more components, more configurations, more scratches for individual itches. Witness the explosion of projects in the Hadoop ecosystem, and the amount of associated services revenue. A commercial model embeds features into the primary product, efficiently generating value. Today customers seek to consolidate the plethora of extensive data processing tools into fewer multi-model databases.

So, it is likely that the next vendor to win a spot in database history will do so by winning on features and workload applicability, backed by a proven business model with a primary product roadmap.

However, there are many compelling aspects of the open source model, with three core value propositions: (1) a functional, free version; (2) open-source at the “edges” of the product; and (3) a vibrant community around the product. How can a commercial vendor balance both worlds?

Companies pursuing these strategies include MapR in the Hadoop space. With announcements earlier this summer, SingleStore appears to be heading there too, for operational and analytical databases. They now have a SingleStore Community Edition with unlimited size and scale, and full access to core database features. While the production version of the product requires a paid license, this seems to be a reasonable way to balance the need to support a growing, focused engineering team with the core value propositions of an open-source model.

So, the question remains: as the database wars heat up and the market gets crowded, who will prevail to lead the industry? With open-source becoming more mainstream, the true contenders will be the vendors that can offer a symmetry between open-source models and new critical workload features.