Where to get the memsql_demo Database data

Hi and Good Morning.
Where can I get the memsql_demo database data from please?
It is referred to in “#DeepDive into #SingleStore’s #DistributedArchitecture (#DeepDive into #SingleStore's #DistributedArchitecture - YouTube)”
Many thanks.

That demo looks like it is using TPC-H data as a basis to build on. If you just want to load that data into a managed service cluster, you can run our managed service and there are prompts to load data which give you CREATE TABLE statements and PIPELINES commands to load a TPC-H database.

Are you looking for something beyond that?

Hi @hanson. Thank you.
I just wanted to use it locally for test data.
I see it is about 100 Gb which is too much for me to download at present.
Thank you for answering my query.

@jlb

@hanson is right, the memsql_demo database I use in that training video is an auto generated standard TPC-H dataset. For example, comments and other text columns are generated in the manner of the “lorem ipsum” filler content and are meaningless. The database is named for the previous name of SingleStore, MemSQL. It contains sales type data. I have generated a smaller sample dataset and loaded it up into a public GDrive for you: TPCH Sample Data - Google Drive

I provided two backups, depending on your preference:

SingleStore Backup
This dir contains a SingleStore structured backup of an 8 partition database. You can move the partitions into place on a SingleStore cluster and then restore the database. It’s a good idea to rebalance the partitions afterward to ensure you have an equal count of partitions per leaf.

mysqldump
This dir contains SQL format data without the SingleStore partition structure. This is useful if you want a different count of partitions than the 8 in the backup above. Additionally it can be ingested into any MySQL compatible database, not just SingleStore. For example see below in which the mysql client is used to load it into SingleStore. Loading it this way requires that the mysql client is installed (or the memsql client) and that you have access to your cluster. Be sure to replace with the appropriate connection parameters.

$ mysql -h 127.0.0.1 -u root -p -e "create database memsql_demo;"
$ mysql -h 127.0.0.1 -u root -p memsql_demo < schema.sql 
$ mysql -h 127.0.0.1 -u root -p memsql_demo < memsql_demo_dump.sql 

I hope you enjoy trying out the data!

1 Like

Thank you very much @glalonde.
A noob question.
How to I copy these data files to the home directory of the cluster-in-a-box Docker container please.
I access a volume from Studio but can’t figure out how to simply copy these files into the os home directory of ciab.
I’m just feeling stupid today.

@jlb,

I think you can SCP to a guest like a container, depending on permissions. Then untar and change permissions so all files are owned by the user you use to connect to the database. Then you can restore by specifying the filepath.

However I’ll give you another option. I also put it publicly accessible up on S3. You can restore from it like this:

memsql> RESTORE DATABASE memsql_demo FROM s3 "s3://memsql-demo/"  CONFIG '{"region":"us-west-2"}';
Query OK, 1 row affected (21.65 sec)

memsql> show databases;
+--------------------+
| Database           |
+--------------------+
| cluster            |
| information_schema |
| memsql             |
| memsql_demo        |
+--------------------+
4 rows in set (0.00 sec)

I also wanted to provide some info on the size.

  • Backup in S3: about 400 MB
  • Rows in lineitem: 1,225,090
  • Rows in partsupp: 800,000
  • Rows in orders: 306,481
  • Rows in part: 200,000
  • Rows in customer: 150,000
  • Rows in supplier: 10,000
  • Rows in supplier: 1000
  • Rows in nation: 25
  • Rows in region: 5
  • Rows in system: 4

Best,
Genevieve

2 Likes

@glalonde.
Hi Genevieve.
Thank you very much for your effort and assistance which I will definitely use.
As Cluster-in-a-Box(memsql-ciab) is my go-to practice and learning area I would like to learn more about how to use it.
I have previously set up the 4 node cluster as guided in the SingleStore Training Course.
I was able to follow the SSH instructions to access those nodes etc.

Obviously Cluster-in-a-Box would be set up differently.

Would there be some easy steps to enable me to access memsql-ciab (with SSH/SCP if necessary) from the Windows command line?

I can use
docker exec -it memsql-ciab sh
(and then, e.g.
memsql -uroot -p"xxx")

but am unable to copy files in etc.

I would greatly appreciate that.

Many thanks.

Hi @glalonde.
After my bad moment yesterday I figured out what to do. Sorry for that stupid question.

Your SingleStore Backup and mysqldump instructions worked 100%.

I did try “RESTORE DATABASE memsql_demo FROM s3” and everything worked fine except that the download from S3 timed out. I tried a number of times.
I suspect that it is the mobile data internet connection we have here. It averages about 1.5 megabytes per second but does go up and down a lot. Yes it is megabytes per second, not megabits.
We are unable to get a fiber connection where I live and work from.

I can backup and restore small databases using S3 absolutely successfully. It works extremely well.

Thank you very much again for your input.

I’m getting there.

You have an amazing product.