Full House: Developers Share 3 Signs You’ve Outgrown Your Open Source Database




Full House: Developers Share 3 Signs You’ve Outgrown Your Open Source Database

A developer’s life is anything but a sitcom. But if you’re building database functionality into an application or an operating environment, you eventually may run into a host of challenges with single-node open source databases that, if you weren’t tasked with fixing them, might seem comical.

In general, database performance problems relate to data ingestion, scaling, speed and not being able to easily store all the different kinds of data you want. In this blog you’ll get a personal view into what those problems look like, as developers share their experiences with the three signs of outgrowing popular open-source databases like MySQL, PostgreSQL and MariaDB. We’ve also included some tips on what to look for in a new database solution to ease the pain.

sign-1-application-performance-hits-a-wallSign 1: Application Performance Hits a Wall

Jack Ellis is co-founder of Fathom, Inc., a SaaS firm that believes website analytics should be simple, fast and privacy focused. Fathom delivers a simple, lightweight, privacy-first alternative to Google Analytics. Jack describes how his application’s performance suffered because he had maxed out MySQL:

"Despite keeping summary tables only (data rolled up by the hour), our [MySQL] database struggled to perform SUM and GROUP BY. And it was even worse with high cardinality data. One example was a customer who had 11,000,000 unique pages viewed on a single day. MySQL would take maybe 7 minutes to process a SUM/GROUP query for them, and our dashboard requests would just time-out. To work around this limitation, I had to build a dedicated cron job that pre-computed their dashboard data."

Read the impact story: Why Fathom Analytics Ditched MySQL, Redis and DynamoDB

Josh Blackburn is co-founder and head of technology at IEX Cloud, a data infrastructure and delivery platform for financial and alternative data sets that connects developers and financial data creators. Josh’s team builds high-performance APIs and real time streaming data services used by hundreds of thousands of applications and developers. He had hit a similar wall with MySQL running in Google Cloud:

"We average about 500,000 to 800,000 data ops per second, typically during market hours. These could be really tiny requests, but you can see our ingress and egress rates; we’re consuming a lot of data from multiple resources, but we’re also passing a lot of that out the door… In our case, we’ve got to keep up not just with the stock market, with real-time prices, but also with everyone coming in and needing all that data in real time."

Josh summed up his data ingestion challenge, “We were in a tight spot to find something that would scale and had better performance, especially on the ETL side, because we’re loading hundreds of gigs of data every day.”

Read the impact story: IEX Cloud Speeds Financial Data Distribution 15x With SingleStore

sign-2-an-open-source-database-doesnt-support-your-business-needsSign 2: An Open-Source Database Doesn’t Support Your Business Needs

Gerry Morgan is lead developer at dailyVest, a fintech company using 401(k) participant data and analytics to improve the health and performance of retirement plans. Each month, over 7 million investors and plan participants can access digestible insights delivered via visual dashboards.

Data volumes are growing at 36% a year, fueled by billions of transactions, and Gerry found that dailyVest’s Azure SQL database couldn’t support business growth. He said:

"[We were] not just increasing resource requirements in our cloud environment, but also increasing costs [of Azure Cloud resources]… We were also seeing some performance degradation in Azure SQL. Not so much that our customers would have noticed, but we noticed there was some drop off in speed in our ingestion of data. We wanted to improve our ETL operation, but at the same time improve the customer experience — all customers will be happy if you make things faster, even if they haven’t noticed if things were particularly slow."

Read the impact story: dailyVest Empowers 401(k) Plans for 7 Million Plan Participants

Mohammed Radwan is head of engineering at Foodics, a restaurant management software company serving more than 22,000 establishments in 35 markets. The company processes more than 5 billion orders per year, offering dashboard analytics for business owners and managers. At first, Foodics used a combination of CitusDB for and MySQL to power the business, later swapping out MySQL for a commercial version of PostgreSQL.

Foodics ran into reliability problems with CitusDB, experiencing outages that lasted three hours at a time up to four times per month. Only 200 users could concurrently use the existing system. Foodics had 5,000 customers, but downtime and a lack of fast data were accelerating churn. Although the company had just received $20 million in Series B funding in 2021, the unreliable system limited growth and put future funding at risk. Mohammed said:

"Like many tech companies, we started with MySQL. It was compatible with what we had and was easy to use. It fulfilled its purpose for a while, but when we needed to grow and expand, MySQL couldn’t enable that."

When experiencing scaling issues with MySQL or other open source databases, developers often turn to sharding middleware or NoSQL. These approaches, however, can compromise the performance of ACID-compliant transactions and complex joins, particularly in high-volume production environments.

sign-3-youre-dealing-with-database-sprawlSign 3: You’re dealing with database sprawl

Here, the writing on the wall is clear: if you need to incorporate multiple data types into your application or environment – such as time series, JSON, documents and other specialty data types – you are going to need to spin up specialty databases to contain them. These separate databases will need to be connected, maintained and upgraded, creating database sprawl — and exponential complexity.

If your single-node database supports only standard SQL numeric data, you will likely experience significant growing pains if you try to augment it to support multiple data types.

What should you look for in a replacement database?

Most single-node open source growing pains can be solved by a database that offers:

  • Streaming data ingestion overcomes open source databases’ inability to ingest, process and analyze streaming data necessary to power modern interactive SaaS applications and production environments.
  • Low-latency query performance solves query performance problems as data or concurrency demands grow.
  • Limitless scalability addresses the struggle that single-node architectures face when attempting to scale as business or user volumes grow.
  • Robust analytical abilities to overcome open source databases’ basic to non-existent analytical capabilities — driving fast, interactive user experiences.
  • Hybrid workload handling to eliminate the need for separate OLTP and OLAP systems; instead, these hybrid workloads can be handled in a single, unified system.

Your list may be much more granular. Jack at Fathom had a lengthy list of non-negotiables for any database he might consider to replace MySQL:

  1. It must be ridiculously fast
  2. It must grow with us. We don't want to be doing another migration any time soon
  3. It must be a managed service. We are a small team, and if we start managing our database software, we've failed our customers. We're not database experts and would rather pay a premium price to have true professionals manage something as important as our customers' analytics data
  4. It must be highly available. Multi-AZ would be ideal, but high availability within a single availability zone is acceptable too
  5. Cost of ownership should be under $5,000/month. We didn't want to spend $5,000 off the mark, as this would be on top of our other AWS expenses, but we were prepared to pay for value
  6. The software must be mature
  7. Companies much larger than us must already be using it
  8. Support must be great
  9. Documentation must be well-written and easy to understand

For Mohammed, delivering 24/7 resource availability was paramount. He said:

"We can’t take time off or delay reports. The most important thing for us is concurrency. As we grow, we need to ensure that our customer base grows with us. We needed a database that allows for seamless reporting without worrying about how many customers are using it all at once."

After all the challenges Foodics had weathered, Mohammed needed a database that would offer: 

  1. The ability to place all analytics-related data in a single unified data store
  2. A performant analytics engine with columnstore to democratize data access
  3. Real time and near real-time analytics with very fast reads and quick ingestio
  4. A multi-tenant architecture to use a single database for all customers 
  5. Support for a large and growing customer base in the tens of thousands 
  6. 100 concurrent queries per second, or approximately 1% of Foodics’ customer base at the time, to support the large number of reports being generated 
  7. The capability to process billions of orders and 5 million transactions per month
  8. Scale up and out capabilities to support Foodics’ accelerated growth strategy 
  9. High availability with almost zero downtime 

developers-choose-my-sql-wire-compatible-single-store-dbDevelopers Choose MySQL Wire-Compatible SingleStoreDB

Fathom, IEX Cloud, dailyVest and Foodics all chose SingleStoreDB, a real-time, distributed SQL database, to replace open source database technology. Mohammed’s reasons why are a common theme:

"We are a small team, so we did not want to spend time tuning a database. We wanted something that just worked out of the box. For this reason, we went with Singlestore Helios running on AWS. With SingleStore, we can just plug and play and do everything we need to empower our customers. It allows us to focus on what we are really here to do: serve our customers."

SingleStoreDB is MySQL wire-compatible, making it incredibly easy to migrate from any flavor of MySQL (including AWS RDS, Google Cloud SQL, Azure MySQL or others). It supports familiar SQL syntax, so developers don’t need to learn a completely new technology to get started.

Most developers can quickly complete their migration and get started with SingleStore in hours or a few days. To learn about migration, check out these resources:

after-migration-a-bigger-better-houseAfter Migration, a Bigger, Better House

All of the developers experienced major improvements in speed, performance, scalability and flexibility after they migrated to SingleStoreDB. Here’s how Jack tells Fathom’s “after” story:

  1. We no longer need a dedicated data-export environment…We do our data exports by hitting SingleStore with a query that it will output to S3 for you typically within less than 30 seconds. It's incredible. This means we can export gigantic files to S3 with zero concern about memory. We would regularly run into data export errors for our bigger customers in the past, and I've spent many hours doing manual data exports for them. I cannot believe that is behind me. I'm tearing up just thinking about it.
  2. Our queries are unbelievably fast. A day after migrating, two of my friends reached out telling me how insanely fast Fathom was now, and we've had so much good feedback.
  3. We can update and delete hundreds of millions of rows in a single query. Previously, when we needed to delete a significant amount of data, we had to chunk up deletes into DELETE with LIMIT. But SingleStoreDB doesn't need a limit and handles it so nicely
  4. We used to have a backlog, as we used INSERT ON DUPLICATE KEY UPDATE for our summary tables… [W]e had to put sites into groups to run multiple cron jobs side by side, aggregating the data in isolated (by group) processes. But guess what? Cron jobs don't scale, and we were starting to see bigger pageview backlogs each day. Well, now we're in SingleStore, data is fully real time. So if you view a page on your website, it will appear in your Fathom dashboard with zero delays.
  5. Our new database is sharded and can filter across any field we desire. This will support our brand new, Version 3 interface, which allows filtering over EVERYTHING.
  6. We are working with a team that supports us. I often feel like I'm being cheeky with my questions, but they're always so happy to help. We're excited about this relationship.
  7. SingleStoreDB has plans up to $119,000/month, which is hilarious. That plan comes with 5TB of RAM and 640 vCPU. I don't think we'll get there any time soon, but it feels good to see they're comfortable supporting that kind of scale. They're an exciting company because they're seemingly targeting smaller companies like us, but they're ready to handle enterprise-scale too.
  8. And as for price, we're spending under $2,000/month, and we're over-provisioned, running at around 10% - 20% CPU most of the day.

Josh from IEX Cloud summed up, “SingleStore enables us to do monitoring and analysis in the same system that houses our historical data, and this creates enormous efficiencies for us. We’ve been able to consolidate multiple databases, run our platform faster, and speed the onboarding processes for new data sets.”

If you’ve outgrown your single-node open source database and are ready to move into a bigger, better house, try SingleStore for free today.