How would you optimize a database for an app like Twitter?

trent27on · December 5, 2021, 3:03am

If someone were working with a dataset like Twitter’s, with over a trillion rows, how could they use SingleStore efficiently? Not for analytics, but simply for serving user-generated content.

Should they store Tweets in Rowstore or Columnstore? Should accounts be stored in Rowstore or Columnstore? How should they generate efficient IDs? They have upwards of 1 trillion Tweets. Would SingleStore replication and caching be enough to handle millions of requests per second, or would they need to cache with a system like Redis or Aerospike for the REST API?

How could they efficiently shard and index with SingleStore?

Is SingleStore even meant for this use case?

I’m assuming they should store accounts in Rowstore and use Columnstore for Tweets, but all in persistent memory to boost performance. And if not Rowstore for accounts, then Columnstore.

Now for feeds, I think using Redis would still be the best option for them.

For ID generation, I’m not sure. I imagine it depends on how you shard or index the data.
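To make the ID question concrete, here is a minimal sketch of one common approach: a time-ordered 64-bit ID in the style of Twitter's Snowflake (41 bits of millisecond timestamp, 10 bits of worker ID, 12 bits of per-millisecond sequence). The class name `IdGenerator`, the `clock` parameter, and the way worker IDs get assigned are all assumptions for illustration, and nothing here is specific to SingleStore.

```python
import threading
import time

# Assumed layout: 41-bit timestamp | 10-bit worker ID | 12-bit sequence.
EPOCH_MS = 1288834974657  # custom epoch in ms; any fixed past instant works
WORKER_BITS = 10
SEQUENCE_BITS = 12
MAX_WORKER = (1 << WORKER_BITS) - 1
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1


def _now_ms():
    return int(time.time() * 1000)


class IdGenerator:
    """Hypothetical Snowflake-style generator; one instance per worker."""

    def __init__(self, worker_id, clock=_now_ms):
        if not 0 <= worker_id <= MAX_WORKER:
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self._clock = clock
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # Sequence exhausted for this millisecond: wait for the next one.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS))
                | (self.worker_id << SEQUENCE_BITS)
                | self._sequence
            )
```

IDs built this way sort roughly by creation time and need no coordination between workers. Whether a time-ordered ID makes a good shard key is a separate question, since it depends on how the table is sharded and queried.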

I’m sure SingleStore could handle many requests per second, but latency is a big deal, so I’m assuming a cache like Redis or Aerospike would be needed to serve data efficiently.
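For reference, the caching setup I have in mind is the usual cache-aside pattern, roughly sketched below. The in-process dict only stands in for Redis or Aerospike, and `CacheAside` and `load_from_db` are hypothetical names; the real database query would go wherever `load_from_db` is called.

```python
import time


class CacheAside:
    """Cache-aside sketch: check the cache first, fall back to the database."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db      # hypothetical callable that queries the database
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}             # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]            # cache hit, still fresh
        value = self._load(key)        # cache miss or expired: read from the database
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        # Call after a write so the next read goes back to the database.
        self._entries.pop(key, None)
```

The open question is whether the database alone is fast enough that this extra layer, and the invalidation logic it requires, can be skipped.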

Just curious about the scale of SingleStore, and what it can really be used for.

Thanks!

Edit: Twitter is now using Google Bigtable and BigQuery for a lot of data. How can SingleStore compete?

hanson · December 9, 2021, 7:37pm

Great questions! Before we launch into answering them all, do you have a specific app that you are trying to build, or are you mainly trying to learn more about SingleStore and its capabilities?