MemSQL Database Backup Takes up Twice the Amount of Reported Disk Usage under MemSQL Studio

Hey all,

I don’t know if this has a very simple solution that I missed elsewhere, but I’m currently running a setup with 1 master aggregator and 1 leaf node, with high availability off. MemSQL Studio reports the total disk usage as 1.6 TB, yet whenever I back up the database, the backup ends up taking about twice that space (> 3 TB).

Is there a redundancy setting I can disable somewhere, so that this database backup size only ends up as 1.6 TB? This would actually help us save time, as we don’t own extra SSDs in the >= 3TB range.

Thanks

Are you using columnstores or rowstores?

And where are you seeing the size of the backup?

Hey Hanson, sorry for the delay, just now getting back to this topic.

The majority of my database (now 1.83 TB) is columnstore tables, while my rowstore data accounts for at most 1.1 GB. I am viewing the approximate database size within the MemSQL Studio dashboard.

Below is what’s displayed on my dashboard, whereas my actual disk usage is now 5.8 TB. I’ve verified that the MemSQL database is the only thing stored on this hard drive RAID array.

  • Is there a way to clear any possible duplicate data, further reducing the actual disk size of my database?
  • Is there a need for 24 partitions?

What do you get as a result for this?

select @@redundancy_level;

The number of partitions shouldn’t have a significant impact on database and backup data space usage, given the size of your data.

And are you sure you are only storing a single backup?

MySQL [(none)]> select @@redundancy_level;
+--------------------+
| @@redundancy_level |
+--------------------+
|                  1 |
+--------------------+

And yes, I just restored this backup onto a new server. This single database is reporting 1.83 TB of usage while also consuming 5+ TB, and there are no database backup files in the [memsql-node-path]/data directory.

So two machines are now hosting this database, and MemSQL Studio on both displays 1.83 TB.

Our engineer says that even with redundancy 2, we only back up the master partition, and the backup should be of size comparable to the DB, not double. I don’t have a next step to offer you at this point. If you’re a paying customer, you can file a support ticket.

No problem, thanks Hanson. The bulk of my data comes from the [leaf-path]/data/blobs directory, yet I’m starting to think perhaps the way I’m creating columnstore shards may also have something to do with the larger than expected database.
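
One way to narrow down where the extra space goes is to total the file sizes under each subdirectory of the node's data directory and compare the result with the figure MemSQL Studio reports. The following is only a minimal sketch, assuming Python 3 is available on the leaf host. The function name `usage_by_subdir` is hypothetical and not part of any MemSQL tooling, and the directory to scan is passed in by the user (for example the [leaf-path]/data directory mentioned above).

```python
import os
import stat
import sys
from collections import defaultdict


def usage_by_subdir(root):
    """Return {top-level entry under root: total bytes of regular files in it}.

    Files sitting directly in root are counted under the key ".".
    This is a hypothetical helper for illustration, not a MemSQL API.
    """
    totals = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        top = "." if rel == "." else rel.split(os.sep)[0]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # file disappeared while scanning a live node
            if stat.S_ISREG(st.st_mode):  # skip symlinks, sockets, etc.
                totals[top] += st.st_size
    return dict(totals)


# Usage (assumed invocation): python usage.py <data-directory>
if __name__ == "__main__" and len(sys.argv) == 2:
    sizes = usage_by_subdir(sys.argv[1])
    # Largest consumers first, reported in GiB.
    for name, size in sorted(sizes.items(), key=lambda kv: -kv[1]):
        print(f"{size / 2**30:12.2f} GiB  {name}")
```

If the blobs subdirectory alone accounts for most of the gap between the reported 1.83 TB and the space actually consumed, that would suggest the difference lies in how columnstore data is laid out on disk rather than in leftover backups or logs.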