As seen in the blog post AWS SageMaker + SingleStore, data scientists can take models that they build in AWS SageMaker and deploy them as user-defined functions within our general purpose relational database. But what if the system you want to access cannot be easily re-deployed inside our database? What if the system you want to use is a complex remote service like AWS Translate? In this article, three of SingleStore’s engineering team members, Stacey Christian, Cheng Chen and William Zhang, explain how SingleStore provides new external user-defined and table-valued functions to call out to arbitrary remote services from inside SingleStore queries.
Setting the Stage
Amazon Translate is a Neural Machine Translation (MT) service for translating text between supported languages. Powered by deep learning methods, the service provides high-quality, affordable, and customizable language translation, enabling developers to translate content across multiple languages. This service can be used via an API, enabling either real-time or batch translation of text.
It would be impractical to embed this service inside the SingleStore database as a UDF. The code is a large and complex neural network based algorithm and cannot be rewritten as SQL statements.
Instead SingleStore enables you to call out to this service with two new features:
- External User-Defined (and Table Valued) Functions
- HTTP Connection Links
In other words, SingleStore now provides the external function feature allowing your business to call out to large third-party code bases written in any programming language. How sweet is that?!
This blog will demonstrate that setting up Singlestore to use Amazon Translate can be done in three easy steps:
- Setup an AWS Lambda translate function
- Define the external function in SingleStore
- Run standard SQL queries using the external function
Setup AWS Lambda Function
The simplest way to set up the Amazon Translation Service for use with SingleStore is through an AWS Lambda Function using the AWS API Gateway.
The basic flow looks like this:
The use of the secure http proxy service (through the API Gateway) can increase security by authenticating requests to the remote service. The proxy service can also support subscription-based billing for a remote service.
In this setup, authentication of the remote service and data encryption are achieved with HTTPS used by AWS API Gateway. Authentication of the client (SingleStore) can be achieved with SingleStore’s HTTP connection link. It implements the basic access authentication by sending custom headers of all the credentials along with every HTTP request.
Create Lambda Function
Using the AWS Lambda Console, create a lambda function as listed below. This function is written in python and is fairly straightforward, following these five steps:
- Imports the boto3 library for creating AWS service clients
- Retrieves the credentials from the HTTP Request header
- Creates the translate client with the credentials
- Loops over the rows passed in the HTTP Request body and translates each one
- Returns the translated data in the HTTP Response body
Setup the API Gateway
Once your lambda function is created, use the AWS API Gateway Console to create a public REST API with Lambda Proxy Integration that references your lambda function.
In Lambda Proxy Integration, the input to the integrated Lambda function is expressed as a combination of request headers, path variables, query string parameters, and body accessed through the event object. The lambda function above assumes this and accesses request headers as well as the body in the request and response.
Setting up a Lambda Proxy Integration is simple. Other than choosing your lambda function, you have little else to do. API Gateway configures the integration request and integration response for you. You can secure access to your API with authentication and authorization controls. See AWS API Gateway documentation for more details on security options.
Finally, deploy the API. Once deployed, the console will tell you the exact url to invoke. This needs to be used in the external function declaration below.
Define Your External Function in S2
An external function calls code that is executed outside SingleStore. The remotely executed code runs in a remote service (as described above). Information sent to and received from the remote service is relayed through the HTTP Proxy Service. Security related information needed by the service can be stored in SingleStore’s new HTTP Connection Link.
Define HTTP Connection Link
An HTTP Connection Link can be used to pass credentials (or any arbitrary settings) to your remote service in the HTTP header area. For this example, we will define an HTTP Link called aws_creds with the aws credentials as follows:
CREATE LINK aws_creds as HTTP
\"aws_secret_access_key\":\"ARcUKFNfCc+Dt2vY5WL/8Wj5\"} } ";
The AWS lambda function needs both the aws_access_key_id and the aws_secret_access_key in order to create a Translate client for the Translation Service.
Note that when the HTTP Link is created, SingleStore never displays these Credentials back to any users. This allows multiple users to use the HTTP Link without knowing what the credentials are. Only users with CREATE LINK permissions can create an HTTP Connection Link.
Define the External Function
The external function needs to be created that specifies the signature of the function as well as the url for the remote service, the data format, and the HTTP Link to use.
In our case, the aws lambda function will take 3 text arguments as input:
- Text to translate
- The language to translate from
- The language to translate to
and will return a single text value with the translated text.
Our function expects the data sent to and received from it to be JSON.
So the external function definition looks like this:
CREATE EXTERNAL FUNCTION translate(t TEXT, lang_from TEXT(2), lang_to TEXT(2))
AS REMOTE SERVICE
FORMAT JSON LINK aws_creds;
Notice that we reference the aws_creds HTTP Link defined in the previous section. Also note that the url is the one specified by the AWS Gateway API Console when we deployed it.
Enable Your External Function
By default external functions are disabled in SingleStore. Before using them you must turn them on using the enable_external_functions global variable. While you can just turn them ON, it is preferable to set enable_external_functions to ALLOWLIST and then to place your url in the global variable external_functions_allowlist.
Note that if you are using the SingleStoreDB Cloud you will need to file a support ticket to enable external functions.
Run Standard Queries
Because the external function we defined is scalar, it can be used anywhere in a SQL statement where an expression is allowed. We created a table with blog entries in various languages as follows:
CREATE TABLE blog(entry TEXT NOT NULL, lang TEXT(2) NOT NULL);
With these rows:
| entry | lang |
| Guten Morgen allerseits | de |
| William is at home for seven days | en |
| Levi es un chico muy malo | es |
and submitted the following query:
SELECT translate(entry, lang, 'en') as english, entry, lang t.blog FROM t;
The following results were returned:
| english | entry | lang |
| William is at home for seven days | William is at home for seven days | en |
| Good morning, everyone | Guten Morgen allerseits | de |
| Levi is a very bad boy | Levi es un chico muy malo | es |
Notice that all entries were successfully translated to English from whatever language they were entered in. A client application could use a query like this to retrieve data from SingleStore and translate it into whatever language the client application is currently using, regardless of the language(s) the original blog entries were captured in.
SingleStore will attempt to execute external functions in batched mode instead of single-row mode. That is, when making external function calls to the remote service, SingleStore will try to send a batch of input rows instead of just a single input row. This improves the performance by amortizing the network cost. SingleStore will always use batched mode for table-valued external functions, and will convert scalar-valued external functions to table-valued external functions if possible in order to use batched mode.
So what have we demonstrated here? We have shown how SingleStore now provides the external function feature allowing your business to call out to large third-party code bases (such as commercial machine learning libraries) written in any programming language (such as python, golang, C++, etc.). These external functions can be used anywhere in a SQL statement where an expression is allowed. Furthermore, developers can write remote services that can be used not only by SingleStore but from any other application using the same interface.
If you have any questions about how SingleStore uses external functions to harness the power of remote services, ask them on our Forum. Development experts and engineers from SingleStore, as well as members of our user community are always happy to help out.
If you are ready to try SingleStore for yourself, sign up for a free trial version. To stay current with what’s happening, follow our developer-focused handle on Twitter.