Integrating SingleStoreDB with Presto

Jeetendra Ranjan

Senior Enterprise Solutions Engineer

This guide walks you through integrating Presto, a distributed SQL query engine, with SingleStoreDB. It covers architecture, installation, sample queries and more.

What Is Presto?

Presto is a distributed query engine for big data that uses the SQL query language. Its architecture lets users query data sources such as Hadoop, Cassandra, Kafka, AWS S3, Alluxio, MySQL, MongoDB and Teradata, and a single query can draw on multiple data sources.

Simply put, Presto offers compute that runs on top of storage.

Presto architecture

What Is SingleStoreDB?

SingleStoreDB is a real-time, distributed SQL database. With familiar SQL tooling and MySQL wire protocol compatibility, SingleStoreDB eliminates the need for specialized databases and simplifies database architectures. 

SingleStoreDB is also built to handle multiple data types (including JSON, time-series, geospatial and full-text search), delivering high-speed data ingest on a unified transactional and analytical foundation.

SingleStoreDB architecture

Using Presto with SingleStoreDB

Presto is more performant when integrated with SingleStoreDB for a few key reasons:

  • Similar to SingleStoreDB, Presto supports in-memory processing
  • Presto uses a pull model
  • Like SingleStoreDB, Presto supports columnar storage and execution in its query engine
  • Presto supports multi-level caching

Sample use case

Now, let’s take a look at installing and running Presto with SingleStoreDB.

Presto Installation

Presto on EC2 Amazon Linux:

First, switch to the root user:

sudo su

Then update yum:

yum update -y

Now, install Amazon Corretto 11 (Amazon's OpenJDK distribution):

yum install java-11-amazon-corretto.x86_64

Check that Java 11 is correctly installed:

java --version
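On a fresh instance it can help to script this check. The following is a minimal sketch, assuming the `java --version` output format of Java 9 and later (vendor name first, then the version number). The helper name java_major is hypothetical and not part of any tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract the major version from the first line of
# `java --version`, e.g. "openjdk 11.0.17 2022-10-18 LTS" gives 11.
java_major() {
  printf '%s\n' "$1" | awk '{ split($2, v, "."); print v[1] }'
}

# Only query the real JVM when java is on the PATH.
if command -v java >/dev/null 2>&1; then
  major=$(java_major "$(java --version 2>/dev/null | head -n 1)")
  # A non-numeric or empty result falls through to the error branch.
  if [ "${major:-0}" -ge 11 ] 2>/dev/null; then
    echo "Java ${major} found"
  else
    echo "Java 11 or later is expected by this guide" >&2
  fi
fi
```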

Install the Presto binaries

Download the Presto release binaries onto the EC2 instance:

wget https://repo.maven.apache.org/maven2/io/prestosql/presto-server/330/presto-server-330.tar.gz

Extract the archive, which creates a directory named presto-server-330:

tar xvzf presto-server-330.tar.gz

Configure Presto and add a data source

Next, provide a set of configuration files in presto-server-330/etc, add a data source and start the Presto daemon:

  • Presto server configuration etc/config.properties
  • Presto node configuration etc/node.properties
  • JVM configuration etc/jvm.config
  • Catalog properties file for the MySQL connector etc/catalog/mysql.properties (SingleStoreDB is compatible with the MySQL wire protocol, so Presto reaches it through this connector)

Create the etc directory in presto-server-330:

cd presto-server-330

mkdir etc

Then create the following files:

etc/config.properties

coordinator=true

node-scheduler.include-coordinator=true

http-server.http.port=8081

query.max-memory=5GB

query.max-memory-per-node=1GB

query.max-total-memory-per-node=2GB

discovery-server.enabled=true

discovery.uri=http://172.31.21.146:8081

etc/node.properties

node.environment=demo

etc/jvm.config

-server

-Xmx4G

-XX:+UseG1GC

-XX:G1HeapRegionSize=32M

-XX:+UseGCOverheadLimit

-XX:+ExplicitGCInvokesConcurrent

-XX:+HeapDumpOnOutOfMemoryError

-XX:+ExitOnOutOfMemoryError

-Djdk.nio.maxCachedBufferSize=2000000

-Djdk.attach.allowAttachSelf=true

etc/catalog/mysql.properties (create the catalog directory manually if it does not exist)

connector.name=mysql

connection-url=jdbc:mysql://localhost:3306

connection-user=root

connection-password=Singlestore@123
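The four files above can also be generated by a script, which makes it easier to repeat the setup on another instance. The following is a minimal sketch under stated assumptions: the function name write_presto_config and the SINGLESTORE_PASSWORD environment variable are hypothetical, and the values written are the ones shown in this section. The call at the end is a dry run into a scratch directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write the four files from this section into <dir>/etc.
# Arguments: target directory, coordinator host or IP, HTTP port.
write_presto_config() {
  local dir="$1" host="$2" port="$3"
  mkdir -p "$dir/etc/catalog"

  cat > "$dir/etc/config.properties" <<EOF
coordinator=true
node-scheduler.include-coordinator=true
http-server.http.port=${port}
query.max-memory=5GB
query.max-memory-per-node=1GB
query.max-total-memory-per-node=2GB
discovery-server.enabled=true
discovery.uri=http://${host}:${port}
EOF

  cat > "$dir/etc/node.properties" <<EOF
node.environment=demo
EOF

  cat > "$dir/etc/jvm.config" <<'EOF'
-server
-Xmx4G
-XX:+UseG1GC
-XX:G1HeapRegionSize=32M
-XX:+UseGCOverheadLimit
-XX:+ExplicitGCInvokesConcurrent
-XX:+HeapDumpOnOutOfMemoryError
-XX:+ExitOnOutOfMemoryError
-Djdk.nio.maxCachedBufferSize=2000000
-Djdk.attach.allowAttachSelf=true
EOF

  # SINGLESTORE_PASSWORD is an assumed variable; "changeme" is a placeholder.
  cat > "$dir/etc/catalog/mysql.properties" <<EOF
connector.name=mysql
connection-url=jdbc:mysql://localhost:3306
connection-user=root
connection-password=${SINGLESTORE_PASSWORD:-changeme}
EOF
}

# Dry run into a scratch directory; pass presto-server-330 for real use.
demo_dir=$(mktemp -d)
write_presto_config "$demo_dir" 172.31.21.146 8081
ls -R "$demo_dir/etc"
```

Passing the host and port as arguments keeps http-server.http.port and discovery.uri in sync, and reading the password from the environment avoids committing it to a script.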

Run Presto

Let’s start Presto! Begin as a foreground process:

bin/launcher run

In the output of the previous command, you should see the following line:

INFO        main io.prestosql.server.PrestoServer ======== SERVER STARTED

This indicates that you have a running instance of Presto.

You can access the Presto UI at http://{ec2-public-ip}:8081
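Besides opening the UI, readiness can be checked from a script. The following is a minimal sketch, assuming the server answers on /v1/info with JSON that contains a "starting" field, and that curl is installed. The helper names presto_is_ready and wait_for_presto are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when the JSON on stdin reports "starting": false.
presto_is_ready() {
  grep -Eq '"starting"[[:space:]]*:[[:space:]]*false'
}

# Hypothetical helper: poll http://<host>:<port>/v1/info until the server
# reports ready, or give up after <tries> attempts (default 30, 2s apart).
wait_for_presto() {
  local host="$1" port="$2" tries="${3:-30}"
  local i
  for ((i = 0; i < tries; i++)); do
    if curl -fsS "http://${host}:${port}/v1/info" 2>/dev/null | presto_is_ready; then
      echo "Presto is ready"
      return 0
    fi
    sleep 2
  done
  echo "Presto did not become ready in time" >&2
  return 1
}
```

For the setup in this guide, the call would be wait_for_presto localhost 8081, run from a second terminal while bin/launcher run is in the foreground.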

SingleStoreDB Installation

You can deploy SingleStoreDB using any of the methods listed in our deployment documentation.

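Here is a sample test session that uses the Presto CLI to query SingleStoreDB through the mysql catalog. Two statements in it fail: show catalog is not valid syntax (the statement is show catalogs), and use mysql fails because mysql is a catalog here, not a schema.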

[ec2-user@ip-172-31-21-146 presto-server-330]$ ./presto --server 172.31.21.146:8081 --catalog mysql --schema test

presto:test> show catalog;

Query 20230110_130053_00002_7qnha failed: line 1:6: mismatched input 'catalog'. Expecting: 'CATALOGS', 'COLUMNS', 'CREATE', 'CURRENT', 'FUNCTIONS', 'GRANTS', 'ROLE', 'ROLES', 'SCHEMAS', 'SESSION', 'STATS', 'TABLES'

show catalog

presto:test> show catalogs;

 Catalog 

---------

 mysql   

 system  

(2 rows)

Query 20230110_130101_00003_7qnha, FINISHED, 1 node

Splits: 19 total, 19 done (100.00%)

185ms [0 rows, 0B] [0 rows/s, 0B/s]

presto:test> show schemas from mysql;

       Schema       

--------------------

 cluster            

 information_schema 

 memsql             

 test               

(4 rows)

Query 20230110_130111_00004_7qnha, FINISHED, 1 node

Splits: 19 total, 19 done (100.00%)

167ms [4 rows, 55B] [23 rows/s, 329B/s]

presto:test> use mysql

          -> ;

Query 20230110_130307_00005_7qnha failed: Schema does not exist: mysql.mysql

presto:test> use test;

USE

presto:test> create table presto(id int);

CREATE TABLE

presto:test> insert into presto values (222);

INSERT: 1 row

Query 20230110_130420_00010_7qnha, FINISHED, 1 node

Splits: 35 total, 35 done (100.00%)

0:01 [0 rows, 0B] [0 rows/s, 0B/s]
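The same checks can be run non-interactively. The following is a minimal sketch, assuming the CLI executable is named presto and sits in the current directory as in the session above, and that it supports the --file option for running statements from a file. Fully qualified names (catalog.schema.table) are used so the script does not depend on USE.

```shell
#!/usr/bin/env bash
# Write the statements to a scratch file; names are fully qualified.
sql_file=$(mktemp)
cat > "$sql_file" <<'EOF'
SHOW CATALOGS;
SHOW SCHEMAS FROM mysql;
SELECT id FROM mysql.test.presto;
EOF

# Run the file only when the CLI executable is present in this directory.
if [ -x ./presto ]; then
  ./presto --server 172.31.21.146:8081 --file "$sql_file"
else
  echo "Presto CLI not found; statements saved in $sql_file"
fi
```

If the insert in the session above succeeded, the final SELECT should return the row that was written through Presto into SingleStoreDB.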

Get Started Today with SingleStoreDB

In addition to integrating seamlessly with Presto, SingleStoreDB also works with a variety of analytics and BI tools, ETL platforms, security and governance tools, and monitoring technology. To see the full capabilities of SingleStoreDB integrations, get started with a free trial today.
