
Managing Stage files with Fusion SQL
Notebook

Fusion SQL can be used to manage your workspace groups and workspaces, but it can also be used to upload, download, and manage files in your workspace group or starter workspace Stage. We'll show you how to work with files in Stage in this notebook.
Displaying the Stage Fusion SQL commands
The SHOW FUSION COMMANDS
displays the commands that are handled by the Fusion
engine. You can use the LIKE
to filter the commands.
In [1]:
1
commands = %sql SHOW FUSION COMMANDS LIKE '%stage%'2
for cmd in commands:3
print(*cmd, '\n')
Creating a workspace group
We'll start by creating a workspace group. We can get a region in the US by using the SHOW REGIONS
command and the random
package.
In [2]:
1
import random2
import secrets3
4
us_regions = %sql SHOW REGIONS LIKE '%us%'5
6
region_id = random.choice(us_regions).ID7
region_id
In [3]:
1
wg_name = 'Fusion Notebook'2
password = secrets.token_urlsafe(20) + '-x&'
In [4]:
1
%%sql2
CREATE WORKSPACE GROUP '{{ wg_name }}'3
IN REGION ID '{{ region_id }}' WITH PASSWORD '{{ password }}'4
WITH FIREWALL RANGES '0.0.0.0/0'
Uploading and downloading Stage files
Uploading and downloading files to your Stage is easy with Fusion SQL. The commands are shown below.
DOWNLOAD STAGE FILE '<stage-path>' [ IN { ID '<deployment-id>' | '<deployment-name>' } ] [ TO '<local-path>' ]
[ OVERWRITE ] [ ENCODING '<encoding>' ];
UPLOAD FILE TO STAGE '<stage-path>' [ IN { ID '<deployment-id>' | '<deployment-name>' } ] FROM '<local-path>' [ OVERWRITE ];
First we'll create a data file locally that we can work with.
In [5]:
1
%%writefile mydata.csv2
name,age,height3
Sue,27,654
Joe,32,705
Max,44,696
Ann,33,64
We can now upload our data file to our workspace group Stage.
In [6]:
1
%%sql2
UPLOAD FILE TO STAGE 'stats.csv' IN '{{ wg_name }}' FROM 'mydata.csv'
We can list the files in a Stage with the SHOW STAGE FILES
command.
In [7]:
1
%%sql2
SHOW STAGE FILES IN '{{ wg_name }}'
Downloading the file is just as easy as uploading.
In [8]:
1
%%sql2
DOWNLOAD STAGE FILE 'stats.csv' IN '{{ wg_name }}' TO 'stats.csv' OVERWRITE
In [9]:
1
!cat stats.csv
If you just want to display the contents of the Stage file without saving it to a local
file, you simply leave the TO
option off the DOWNLOAD STAGE FILE
.
In [10]:
1
%%sql2
DOWNLOAD STAGE FILE 'stats.csv' IN '{{ wg_name }}' ENCODING 'utf-8'
Creating folders
Up to this point we have just worked with files at the root of our Stage. We can use Fusion SQL
to create folder structures as well. This is done with the CREATE STAGE FOLDER
command.
CREATE STAGE FOLDER '<stage-path>' [ IN { ID '<deployment-id>' | '<deployment-name>' } ] [ OVERWRITE ];
The following code will create this folder structure:
project-1/
project-1/data/
project-2/
project-2/data/
In [11]:
1
for name in ['project-1', 'project-1/data', 'project-2', 'project-2/data']:2
%sql CREATE STAGE FOLDER '{{ name }}' IN '{{ wg_name }}';
In [12]:
1
%%sql2
SHOW STAGE FILES IN '{{ wg_name }}' RECURSIVE
Now that we have a folder structure we can put files into those folders.
In [13]:
1
%%sql2
UPLOAD FILE TO STAGE 'project-1/data/stats.csv' IN '{{ wg_name }}' FROM 'mydata.csv';3
UPLOAD FILE TO STAGE 'project-2/data/stats.csv' IN '{{ wg_name }}' FROM 'mydata.csv';
Now when we do a recursive listing of our Stage, we'll see the newly created files.
In [14]:
1
%%sql2
SHOW STAGE FILES IN '{{ wg_name }}' RECURSIVE
We can list the files at a specific path as well.
In [15]:
1
%%sql2
SHOW STAGE FILES IN '{{ wg_name }}' AT 'project-2/data'
Loading data from Stage
We are going to load data from a Stage into a database table. For this, we need to have a workspace and a database.
In [16]:
1
%%sql2
CREATE WORKSPACE 'stage-loader' IN GROUP '{{ wg_name }}' WITH SIZE 'S-00' WAIT ON ACTIVE
In [17]:
1
%%sql2
SHOW WORKSPACES IN GROUP 'Fusion Notebook'
Action Required
Make sure to select the stage-loader workspace from the drop-down menu at the top of this notebook.
In [18]:
1
%%sql2
CREATE DATABASE IF NOT EXISTS stage_loader
Action Required
Make sure to select the stage_loader database from the drop-down menu at the top of this notebook. It updates the connection_url to connect to that database.
In [19]:
1
%%sql2
DROP TABLE IF EXISTS stats;3
CREATE TABLE stats (4
name TEXT,5
age INT,6
height INT7
);
Load the data from the Stage using a pipeline.
In [20]:
1
%%sql2
CREATE PIPELINE IF NOT EXISTS stage_test3
AS LOAD DATA STAGE 'project-2/data/stats.csv'4
BATCH_INTERVAL 25005
SKIP DUPLICATE KEY ERRORS6
INTO TABLE stats7
IGNORE 1 LINES8
FIELDS TERMINATED BY ','9
LINES TERMINATED BY '\n'10
FORMAT CSV;11
START PIPELINE stage_test FOREGROUND;12
DROP PIPELINE stage_test;
We can now query the table and select the output into a Stage. Note that the
GROUP BY 1
is used here to combine the outputs from all of the database partitions
into a single file. If you don't use that, you'll get multiple output files,
each with a portion of the result set.
In [21]:
1
%%sql2
SELECT * FROM stats GROUP BY 1 INTO STAGE 'project-3/data/stats.csv'3
FIELDS TERMINATED BY ','4
LINES TERMINATED BY '\n'
In [22]:
1
%%sql2
SHOW STAGE FILES IN '{{ wg_name }}' AT 'project-3' RECURSIVE
In [23]:
1
%%sql2
DOWNLOAD STAGE FILE 'project-3/data/stats.csv' ENCODING 'utf-8'
Deleting Stage files and folders
Files and folders can be deleted from a workspace Stage as well.
This is done with the DROP STAGE FILE
and DROP STAGE FOLDER
commands.
DROP STAGE FILE '<stage-path>' [ IN { ID '<deployment-id>' | '<deployment-name>' } ];
DROP STAGE FOLDER '<stage-path>' [ IN { ID '<deployment-id>' | '<deployment-name>' } ] [ RECURSIVE ];
Let's delete the stats.csv
file at the root of our Stage.
In [24]:
1
%%sql2
DROP STAGE FILE 'stats.csv' IN '{{ wg_name }}'
In [25]:
1
%%sql2
SHOW STAGE FILES IN '{{ wg_name }}'
Now let's delete the project-2
folder including all of the files in it.
In [26]:
1
%%sql2
DROP STAGE FOLDER 'project-2' IN '{{ wg_name }}' RECURSIVE
In [27]:
1
%%sql2
SHOW STAGE FILES IN '{{ wg_name }}' RECURSIVE
In [28]:
1
%%sql2
DROP STAGE FOLDER 'project-1' IN '{{ wg_name }}' RECURSIVE;3
DROP STAGE FOLDER 'project-3' IN '{{ wg_name }}' RECURSIVE;
Conclusion
We have demonstrated how to create and delete files and folders in a workspace group Stage using Fusion SQL. Note that it also supports managing Stage for starter workspaces. It is also possible to work with Stage files using the SingleStoreDB Python SDK, see the API documentation for more details.

Details
About this Template
Learn how to manage your Stage files in SingleStoreDB Cloud using Fusion SQL.
This Notebook can be run in Standard and Enterprise deployments.
Tags
License
This Notebook has been released under the Apache 2.0 open source license.
See Notebook in action
Launch this notebook in SingleStore and start executing queries instantly.