One of the most common tasks with any database is loading large amounts of data into it from an external data store. Both SingleStore and MySQL provide the LOAD DATA command for this task; this command is very powerful, but by itself, it has a number of restrictions:
Why We Built the SingleStore Loader
At SingleStore, we’ve acutely felt all of these limitations. That’s why we developed SingleStore Loader, which solves all of the above problems and more. SingleStore Loader lets you load files from Amazon S3, the Hadoop Distributed File System (HDFS), and the local filesystem. You can specify all of the files you want to load with one command, and SingleStore Loader will take care of deduplicating files, parallelizing the workload, retrying files if they fail to load, and more.
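To make the deduplicate, parallelize, and retry behavior described above concrete, here is a minimal Python sketch of that general pattern. It is not SingleStore Loader's actual implementation: the names `load_all`, `load_file`, `max_workers`, and `max_attempts` are hypothetical and chosen only for illustration. `load_file` stands in for whatever actually loads a single file into the database.

```python
from concurrent.futures import ThreadPoolExecutor


def load_all(paths, load_file, max_workers=4, max_attempts=3):
    """Load each distinct path once, in parallel, retrying failures.

    `load_file` is a hypothetical callable that loads one file and
    raises an exception on failure. Returns (loaded, failed) path lists.
    """
    # Deduplicate while preserving first-seen order.
    unique_paths = list(dict.fromkeys(paths))

    def load_with_retry(path):
        for _ in range(max_attempts):
            try:
                load_file(path)
                return path, True
            except Exception:
                continue  # try again until the attempts run out
        return path, False

    # Threads fit this sketch because file loading is mostly I/O-bound.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(load_with_retry, unique_paths))

    loaded = [path for path, ok in results if ok]
    failed = [path for path, ok in results if not ok]
    return loaded, failed
```

Calling `load_all(["a.csv", "b.csv", "a.csv"], load_file)` would attempt `a.csv` only once, run the loads concurrently, and report any file that still fails after three attempts. A real loader would also need to persist job state and back off between retries, which this sketch leaves out.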
Use a load command to load a set of files
View the progress of a job using the ps command
We have been using SingleStore Loader here at SingleStore for quite a while now, and a binary version has been available on our website for anyone to use. We are proud of the code we produced (or at least proud enough), so we have decided to open source the SingleStore Loader project.
Give SingleStore Loader a Try – Download Now on GitHub
The project uses several open source libraries, such as the Voluptuous data validation library and our own SingleStore Python connector. You can find the project on GitHub. Check it out, and let us know what you think!