New

SingleStoreDB Notebook Basics

Notebook

SingleStore Notebooks

SingleStoreDB Notebook Basics

Note

This notebook can be run on a Free Starter Workspace. To create a Free Starter Workspace navigate to Start using the left nav. You can also use your existing Standard or Premium workspace with this Notebook.

Prototyping applications or analyzing datasets using notebooks in SingleStoreDB Cloud follows the same general principles as developing with a Jupyter Notebook. SingleStoreDB Cloud supports internal and external datasources. Internal datasources are databases that exist within your workspace. An external datasource could be an AWS S3 bucket for example. In this Notebook we cover:

  1. Connecting to a SingleStoreDB instance

  2. Connecting to an external datasource including firewall Settings

  3. Using SQL in a cell

  4. Using Python in a cell

  5. Using both SQL & Python

  6. Installing Libraries

  7. Using Magic Commands

To learn more about working with SingleStoreDB notebooks check out our docs!

1. Connecting to SingleStoreDB

Once you select a workspace, you can access all of the databases attached to that workspace. You cannot connect to databases that are not attached to the workspace you are using.

First select a workspace and the information_schema database from the drop-down menu at the top of this notebook.

With the database selected, the connection_url variable in the Python enviroment is now updated with that information and we can use the %%sql magic command to query the selected database.

In [1]:

%%sql
SELECT * FROM users
LIMIT 3;

When running SQL commands against a different database explicitly, you can specify the database in your SQL code with the USE command:

In [2]:

%%sql
USE information_schema;
SELECT * FROM users
LIMIT 3;

Alternatively, you can specify the database prefix on the table in the query itself.

In [3]:

%%sql
SELECT * FROM information_schema.users
LIMIT 3;

Connecting with SQLAlchemy

You can also connect to your SingleStoreDB datasource using Python and SQLAlchemy. As mentioned above, the connection_url variable is automatically populated by the notebook environment when selecting a database in the drop-down menu at the top of the notebook.

In [4]:

import sqlalchemy as sa
sa_conn = sa.create_engine(connection_url).connect()

You can also explicitly define a URL using the individual connection components.

In [5]:

database_name = "information_schema"
connection_url2 = f"singlestoredb://{connection_user}:{connection_password}@{connection_host}:{connection_port}/{database_name}"
url_conn = sa.create_engine(connection_url2).connect()

In addition, the SingleStoreDB Python package includes a wrapper create_engine function that uses the SINGLESTOREDB_URL without having to specify connection_url.

In [6]:

import singlestoredb as s2
conn = s2.create_engine().connect()

Using conn, we can run our queries much like the %%sql command.

In [7]:

query1 = sa.text('SELECT * FROM users LIMIT 3;')
for row in conn.execute(query1):
print(row)

2. Connecting to an external datasource

You can securely connect to external endpoints from your SingleStoreDB notebooks. By default, connections are limited to SingleStoreDB databases; however, you can enable and disable connections to other external endpoints via the allowlist. To add or remove endpoints from the allowlist:

  1. Select Edit Firewall at the top-left of this notebook.

  1. Select Edit to add new endpoints:

  1. In the Edit Allowlist dialog, you can add a Fully Qualified Domain Name (FQDN) or select from a list of suggested FQDNs (for example pypi.org or github.com). You can provide wildcard access to an endpoint by using the * character. For example, to access an AWS S3 endpoints, you can use the following syntax: *.s3.*.amazonaws.com

  2. Select Save.

3. Using SQL

The default language for SingleStoreDB Cloud notebooks is Python. However, the %%sql magic command can be used to submit SQL code for an entire cell.

In [8]:

%%sql
SELECT * FROM users
LIMIT 3;

By default, the results are displayed as a table. We can also store the result in a variable for use later in the notebook. The following code includes the result1 << syntax which indicates that the output of the SQL code should be stored in the result variable in the Python environment.

In [9]:

%%sql result1 <<
SELECT * FROM users
LIMIT 3;

We now have access to the result variable and can convert it to a DataFrame!

In [10]:

import pandas as pd
df = pd.DataFrame(result1)
df

4. Using Python in a code cell

By default, Python is the language for code cells. In the cell below, we are using a SQLAlchemy connection to execute the same query as the previous example. The result of this query can be converted into a DataFrame in the same manner as above

In [11]:

result = conn.execute(sa.text('SELECT * FROM users LIMIT 3;'))
df = pd.DataFrame(result)
df

5. Using both SQL & Python in a code cell

We can use a single line of SQL within a Python cell using a single %sql call. Below we combine SQL and Python in the same cell to capture the output in the result variable. We then convert it to a DataFrame as in previous examples.

In [12]:

result = %sql SELECT * FROM users LIMIT 3;
df = pd.DataFrame(result)
df

6. Preinstalled libraries

By default, a SingleStoreDB notebook has a large number of preinstalled libraries. Run the cell below to see what libraries are already installed!

In [13]:

!pip list

Our notebooks support libraries available from https://pypi.org/. For example, run the cell below to install the Kaggle open dataset library to install the opendatasets package.

In [14]:

!pip3 install opendatasets

You can even upgrade versions of a preinstalled library. Run the cell below to get the new version of Plotly.

In [15]:

!pip3 install plotly --upgrade

7. Magic commands

Magic commands in Jupyter Notebook are special commands that allow you to perform various tasks that are not part of the standard Python language. We have demonstrated two of the included magic commands already: %%sql for submitting entire cells of SQL code and %sql for submitting a single query in the context of a Python code cell.

There are many other magic commands as well for everything from file system access to debugging your Python code. For information about teh full list of magic commands available, run the code cell below.

In [16]:

%quickref

Learn more about SingleStoreDB notebooks here and get started with your first notebook!

Details

Tags

#starter#python#dataframes

License

This Notebook has been released under the Apache 2.0 open source license.