This guide walks through integrating Presto, a distributed SQL query engine, with SingleStoreDB, covering architecture, installation, configuration and sample queries.

What Is Presto?

Presto is a distributed query engine for big data that uses the SQL query language. Its architecture lets users query data sources such as Hadoop, Cassandra, Kafka, AWS S3, Alluxio, MySQL, MongoDB and Teradata, and a single query can combine data from multiple sources.

Simply put, Presto offers compute that runs on top of storage.

Presto architecture

What Is SingleStoreDB?

SingleStoreDB is a real-time, distributed SQL database. With familiar SQL tooling and MySQL wire protocol compatibility, SingleStoreDB eliminates the need for specialized databases and simplifies database architectures. 

SingleStoreDB is also built to handle multiple data types (including JSON, time-series, geospatial and full-text search) — delivering high-speed data ingest on a unified transactional and analytical foundation.

SingleStoreDB architecture

Using Presto with SingleStoreDB

Presto is more performant when integrated with SingleStoreDB for a few key reasons:

  • Similar to SingleStoreDB, Presto supports in-memory processing
  • Presto uses a pull-based execution model
  • Like SingleStoreDB, Presto supports columnar storage and execution in its query engine
  • Presto supports multi-level caching

Sample use case

Now, let’s take a look at installing and running Presto with SingleStoreDB.

Presto Installation

Presto on EC2 Amazon Linux:

First, switch to the root user:

sudo su

Then update yum:

yum update -y

Now install Amazon Corretto 11, Amazon's distribution of OpenJDK:

yum install java-11-amazon-corretto.x86_64

Check that Java 11 is correctly installed:

java --version
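
The version check above can also be scripted. The following is a minimal sketch, assuming the first line of the `java --version` output looks like `openjdk 11.0.17 2022-10-18 LTS`; the helper name `java_major` is illustrative and not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: print the major version found in the first line
# of `java --version` output, e.g. "openjdk 11.0.17 2022-10-18 LTS" -> 11.
java_major() {
  # First line only, second space-separated field, part before the first dot.
  printf '%s\n' "$1" | head -n 1 | cut -d' ' -f2 | cut -d. -f1
}

# Run the check only when a java binary is actually on the PATH.
if command -v java >/dev/null 2>&1; then
  major=$(java_major "$(java --version 2>/dev/null)")
  if [ "${major:-0}" -ge 11 ] 2>/dev/null; then
    echo "Java $major detected"
  else
    echo "Expected Java 11 as installed above, found: ${major:-unknown}" >&2
  fi
fi
```

If the output format differs on your system, adjust the field selection in `java_major` accordingly.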

Install the Presto binaries

Download the Presto release binaries onto the EC2 instance:

wget https://repo.maven.apache.org/maven2/io/prestosql/presto-server/330/presto-server-330.tar.gz

Extract the archive, which creates a directory named presto-server-330:

tar xvzf presto-server-330.tar.gz

Configure Presto and add a data source

Let’s provide a set of configuration files in presto-server-330/etc, add a data source and start the Presto daemon:

  • Presto server configuration etc/config.properties
  • Presto node configuration etc/node.properties
  • JVM configuration etc/jvm.config
  • Catalog properties file for the MySQL connector etc/catalog/mysql.properties

Create the etc directory in presto-server-330:

cd presto-server-330

mkdir etc

Then create the following files:

etc/config.properties

coordinator=true

node-scheduler.include-coordinator=true

http-server.http.port=8081

query.max-memory=5GB

query.max-memory-per-node=1GB

query.max-total-memory-per-node=2GB

discovery-server.enabled=true

discovery.uri=http://172.31.21.146:8081

etc/node.properties

node.environment=demo

etc/jvm.config

-server

-Xmx4G

-XX:+UseG1GC

-XX:G1HeapRegionSize=32M

-XX:+UseGCOverheadLimit

-XX:+ExplicitGCInvokesConcurrent

-XX:+HeapDumpOnOutOfMemoryError

-XX:+ExitOnOutOfMemoryError

-Djdk.nio.maxCachedBufferSize=2000000

-Djdk.attach.allowAttachSelf=true

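etc/catalog/mysql.properties (create the etc/catalog directory if it does not exist; SingleStoreDB is reached through the MySQL connector because it is compatible with the MySQL wire protocol)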

connector.name=mysql

connection-url=jdbc:mysql://localhost:3306

connection-user=root

connection-password=Singlestore@123
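
The catalog file above can also be generated from the shell, which helps when the host, port or credentials differ between environments. This is a minimal sketch: the function name `write_mysql_catalog` and its argument order are illustrative assumptions, while the property names are the ones shown above.

```shell
#!/bin/sh
# Hypothetical helper: write etc/catalog/mysql.properties for a
# SingleStoreDB endpoint reached through the MySQL connector.
# Arguments: etc_dir host port user password
write_mysql_catalog() {
  etc_dir=$1; host=$2; port=$3; user=$4; password=$5
  # Create the catalog directory if it is missing.
  mkdir -p "$etc_dir/catalog"
  cat > "$etc_dir/catalog/mysql.properties" <<EOF
connector.name=mysql
connection-url=jdbc:mysql://$host:$port
connection-user=$user
connection-password=$password
EOF
  # The file holds a password, so restrict it to the owner.
  chmod 600 "$etc_dir/catalog/mysql.properties"
}

# Example call, run from presto-server-330 (commented out to avoid side effects):
# write_mysql_catalog etc localhost 3306 root 'your-password'
```

Keeping the password out of the script itself, for example by passing it from an environment variable, is a reasonable refinement.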

Run Presto

Let’s start Presto! Begin by running it as a foreground process:

bin/launcher run

In the output of the previous command, you should see the following line:

INFO        main io.prestosql.server.PrestoServer ======== SERVER STARTED

This indicates that you have a running instance of Presto.

You can access the Presto UI at http://{ec2-public-ip}:8081
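
When startup is scripted, it can be useful to wait until the server answers before moving on. The sketch below polls over HTTP with curl. It assumes the server exposes a `/v1/info` endpoint on the configured port, which you should confirm for your Presto version. The function name `wait_for_presto` is illustrative.

```shell
#!/bin/sh
# Hypothetical helper: poll a Presto server until it answers, or give up.
# Arguments: base_url [attempts] [delay_seconds]
wait_for_presto() {
  base_url=$1; attempts=${2:-30}; delay=${3:-2}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -s: silent, -f: treat HTTP error statuses as failures.
    # The /v1/info path is an assumption, not taken from this guide.
    if curl -sf "$base_url/v1/info" >/dev/null; then
      return 0
    fi
    # Sleep between attempts, but not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example call (commented out to avoid side effects):
# wait_for_presto http://localhost:8081 30 2 && echo "Presto is up"
```

The function returns 0 as soon as one request succeeds and 1 after the last failed attempt.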

SingleStoreDB Installation

You can deploy SingleStoreDB using any of the methods listed in our deployment documentation.

Here is a sample session using the Presto CLI against SingleStoreDB:

[ec2-user@ip-172-31-21-146 presto-server-330]$ ./presto --server 172.31.21.146:8081 --catalog mysql --schema test

presto:test> show catalog;

Query 20230110_130053_00002_7qnha failed: line 1:6: mismatched input 'catalog'. Expecting: 'CATALOGS', 'COLUMNS', 'CREATE', 'CURRENT', 'FUNCTIONS', 'GRANTS', 'ROLE', 'ROLES', 'SCHEMAS', 'SESSION', 'STATS', 'TABLES'

show catalog

presto:test> show catalogs;

 Catalog 

---------

 mysql   

 system  

(2 rows)

Query 20230110_130101_00003_7qnha, FINISHED, 1 node

Splits: 19 total, 19 done (100.00%)

185ms [0 rows, 0B] [0 rows/s, 0B/s]

presto:test> show schemas from mysql;

       Schema       

--------------------

 cluster            

 information_schema 

 memsql             

 test               

(4 rows)

Query 20230110_130111_00004_7qnha, FINISHED, 1 node

Splits: 19 total, 19 done (100.00%)

167ms [4 rows, 55B] [23 rows/s, 329B/s]

presto:test> use mysql

          -> ;

Query 20230110_130307_00005_7qnha failed: Schema does not exist: mysql.mysql

presto:test> use test;

USE

presto:test> create table presto(id int);

CREATE TABLE

presto:test> insert into presto values (222);

INSERT: 1 row

Query 20230110_130420_00010_7qnha, FINISHED, 1 node

Splits: 35 total, 35 done (100.00%)

0:01 [0 rows, 0B] [0 rows/s, 0B/s]
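
The same statements can be run non-interactively, for example to read back the inserted row from a script. The sketch below wraps the CLI's `--execute` option. It assumes `./presto` is the Presto CLI executable used in the session above and reuses that session's server address, catalog and schema. The function name `presto_exec` and the `PRESTO_CLI` and `PRESTO_SERVER` variables are illustrative.

```shell
#!/bin/sh
# Path to the Presto CLI and the server address; both defaults mirror the
# session above and can be overridden from the environment.
PRESTO_CLI=${PRESTO_CLI:-./presto}
PRESTO_SERVER=${PRESTO_SERVER:-172.31.21.146:8081}

# Hypothetical helper: run one SQL statement against the mysql catalog
# and the test schema, then exit.
presto_exec() {
  "$PRESTO_CLI" --server "$PRESTO_SERVER" --catalog mysql --schema test --execute "$1"
}

# Example call (commented out to avoid side effects):
# presto_exec "select id from presto"
```

Because the statement is passed as a single quoted argument, the helper can be reused in cron jobs or smoke tests without opening the interactive prompt.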

Get Started Today with SingleStoreDB

In addition to integrating seamlessly with Presto, SingleStoreDB also works with a variety of analytics and BI tools, ETL platforms, security and governance tools, and monitoring technology. To see the full capabilities of SingleStoreDB integrations, get started with a free trial today.