Now Available — Building an Enterprise-Grade Gen AI App Using Google Vertex AI and SingleStoreDB


Madhukar Kumar



Most Large Language Models (LLMs) have been trained on publicly available data with a cut-off date. This is not ideal for enterprises and companies with vast amounts of data that are typically behind a firewall or on a virtual private network.

To take advantage of LLMs, more and more companies are building, or expanding, a technology stack with two key capabilities:

  1. The ability to use an ensemble of LLMs that are fine-tuned to their needs
  2. The ability to contextualize their enterprise data in real time so that LLM responses are tailored to their requirements

Companies can now use Google Vertex AI and SingleStore to build enterprise-grade, private LLM apps that are data-aware and tailored to their needs.

The Google Vertex AI APIs, which recently became Generally Available (GA), let companies choose from a variety of models in the Vertex AI Model Garden, fine-tune them to company-specific requirements and expose them as APIs. Companies can then use frameworks and tools like LangChain and LangSmith to chain different models, agents and tools.

To supply context data in real time and tailor the LLMs further to their requirements, organizations can now use the SingleStore database deployed on Google Cloud Platform (GCP). SingleStoreDB allows companies to store SQL and JSON data, run analytics in split seconds and do both lexical and semantic search to curate the most relevant and fresh data to send over to LLMs.

There are some specific features within SingleStoreDB that allow companies to take full advantage of the current advancements in the world of generative AI:

  1. SingleStore lets you store and query vector data alongside all other types of data, using SQL, in real time.
  2. SingleStore provides Notebooks, so customers can use libraries like LangChain and built-in functions from Ibis natively to chain multiple LLMs and take advantage of private data stored in SingleStoreDB.
  3. SingleStoreDB has connectors and Pipelines for fast ingest from different sources, such as streaming data from Kafka, JSON data from MongoDB, and other sources like Snowflake and CSV files.
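To make "semantic search" over vector data concrete, here is a minimal sketch in plain Python of ranking stored rows by cosine similarity to a query vector. It is an illustration only, not how SingleStoreDB executes the search internally; the rows, their three-dimensional embeddings and the `top_k` helper are all made up for the example.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, rows, k=2):
    """Return the text of the k rows whose embedding is closest to query_vec."""
    scored = [(cosine_similarity(query_vec, r["embedding"]), r["text"]) for r in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]

# Hypothetical rows: real embeddings come from a model and have many more dimensions.
rows = [
    {"text": "Quarterly revenue report", "embedding": [0.9, 0.1, 0.0]},
    {"text": "Kafka ingest runbook", "embedding": [0.0, 0.2, 0.9]},
    {"text": "Annual sales summary", "embedding": [0.8, 0.3, 0.1]},
]
query_vec = [1.0, 0.2, 0.0]  # hypothetical embedding of the user's question

results = top_k(query_vec, rows)
print(results)
```

In a database the same ranking would be expressed as a query over a vector column, so it can be combined with ordinary SQL filters on the other columns.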

In this article we will build a simple application that uses Vertex AI to choose and fine-tune models, exposes them as APIs and uses them in a SingleStore Notebook with LangChain.

Steps to Build out an end-to-end LLM app on Vertex AI and SingleStore

1. Deploy SingleStore on GCP

Sign up for SingleStore. From the left-hand menu, choose +Group, create a new Workgroup and choose GCP.

Next, create a workspace and pick a compute instance that fits your needs.

Once you have created the workspace, create a database and add tables based on your requirements. You can also do all of this in SQL by opening the SQL Editor from the left-hand menu.

2. Enable Vertex AI APIs

Sign in to the Google Cloud console, then search for and enable the Vertex AI APIs.

3. Copy Google Vertex AI project ID

Within Vertex AI, click the top-left menu and choose a project. The pop-up shows a project ID, which we need to copy for the next steps.
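It is easy to copy the project name or the numeric project number instead of the project ID. The small check below is a sketch that catches that mistake before the ID is pasted into the later commands. It assumes the usual format rule for project IDs (6 to 30 characters, lowercase letters, digits and hyphens, starting with a letter and not ending with a hyphen); `looks_like_project_id` is a hypothetical helper, not part of any Google library.

```python
import re

# Assumed format: 6-30 chars, lowercase letters/digits/hyphens,
# must start with a letter and must not end with a hyphen.
PROJECT_ID_PATTERN = re.compile(r"[a-z][a-z0-9-]{4,28}[a-z0-9]")

def looks_like_project_id(value: str) -> bool:
    """Return True if value has the shape of a Google Cloud project ID."""
    return bool(PROJECT_ID_PATTERN.fullmatch(value))

# Example values are made up for illustration.
print(looks_like_project_id("my-genai-project-123"))  # shaped like a project ID
print(looks_like_project_id("My GenAI Project"))      # a display name, not an ID
print(looks_like_project_id("123456789012"))          # a project number, not an ID
```

The check only validates the shape of the string; it cannot tell whether the project exists or whether you have access to it.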

4. (Optional step) Fine-tune/customize the models for your custom use cases

We will cover this in greater detail in a future blog/webinar.

5. Write your app in Google Colab or within a SingleStore Notebook

The code snippets below show how to use LangChain to scrape the Google Vertex AI documentation website and then create a chatbot that uses the Vertex AI PaLM model to answer questions about Vertex AI.

Install all libraries. We will be using:

!pip install gcloud
!pip install langchain
!pip install google-cloud-aiplatform
!pip install singlestoredb

Authenticate to your Google Cloud account. Once you execute this piece of code, you will be prompted to open a new browser window and follow the instructions to authenticate.

!gcloud auth application-default login

Next, add the Vertex AI project ID

!gcloud auth application-default set-quota-project <replace with your project id from the step 3 above>

Set the project ID to the same ID as above

!gcloud config set project <replace with your project id from the step 3 above>

Now let’s import the libraries from LangChain

from langchain.llms import VertexAI
from langchain.chains import RetrievalQA
from langchain.vectorstores import SingleStoreDB

Let’s now use LangChain to crawl the Vertex AI docs, load the pages and split the text into chunks. In the next steps we will create embeddings and store them in SingleStoreDB.

llm = VertexAI()
from langchain.document_loaders import WebBaseLoader

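# Point the loader at the documentation page to crawl
loader = WebBaseLoader("<replace with the URL of the Vertex AI docs page>")
data = loader.load()

from langchain.text_splitter import RecursiveCharacterTextSplitter

# Split the loaded documents into overlapping chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=<replace with your chunk size>, chunk_overlap=<replace with your chunk overlap>)
all_splits = text_splitter.split_documents(data)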
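To show what `chunk_size` and `chunk_overlap` control, here is a deliberately simplified sketch that slices a string into fixed-size windows that share a few characters with their neighbor. It is not the algorithm `RecursiveCharacterTextSplitter` uses (that splitter prefers to break on separators such as paragraphs and sentences); `split_with_overlap` and the sample values are assumptions for illustration.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Slice text into windows of chunk_size that overlap by chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks

sample = "abcdefghijklmnopqrstuvwxyz"
chunks = split_with_overlap(sample, chunk_size=10, chunk_overlap=3)
print(chunks)
# ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.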

Next, let’s use the Vertex AI model to create vector embeddings and then store these in SingleStoreDB as vectors. Make sure to replace the credentials below with your own SingleStore credentials, which you can find under the connections menu option of your workspace in SingleStoreDB.

from langchain.embeddings import VertexAIEmbeddings
from langchain.vectorstores import SingleStoreDB
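import os

# Assumption: the LangChain SingleStoreDB vector store reads its connection string from this variable
os.environ["SINGLESTOREDB_URL"] = "<replace with your SingleStore connection string, e.g. user:password@host:port/database>"

# Embed each chunk with Vertex AI and store the vectors in SingleStoreDB
vectorstore = SingleStoreDB.from_documents(documents=all_splits, embedding=VertexAIEmbeddings())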

As the last step, let’s ask a question and get the response back from Vertex AI

qa_chain = RetrievalQA.from_chain_type(llm, retriever=vectorstore.as_retriever())
qa_chain({"query": "What is Vertex AI?"})


{'query': 'What is Vertex AI?', 'result': 'Vertex AI is a
unified machine learning platform that helps you build, deploy,
and manage machine learning models.'}
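Conceptually, a retrieval question-answering chain does three things: it fetches the chunks most relevant to the question, places them in a prompt as context, and sends that prompt to the model. The sketch below shows that flow in plain Python with stand-ins. It is not LangChain's implementation: the prompt wording, `build_prompt`, `answer`, `fake_retrieve` and `fake_llm` are all hypothetical.

```python
def build_prompt(question, retrieved_chunks):
    """Place the retrieved chunks into the prompt as context for the question."""
    context = "\n\n".join(retrieved_chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

def answer(question, retrieve, llm, k=2):
    """Retrieve up to k chunks, build the prompt, and return the model's reply."""
    chunks = retrieve(question)[:k]
    prompt = build_prompt(question, chunks)
    return {"query": question, "result": llm(prompt)}

# Stand-ins so the sketch runs without a database or a model.
def fake_retrieve(question):
    return [
        "Vertex AI is a machine learning platform.",
        "Model Garden lists the available models.",
        "An unrelated chunk that k=2 leaves out.",
    ]

prompts_seen = []

def fake_llm(prompt):
    prompts_seen.append(prompt)  # record what the model would receive
    return "stubbed model reply"

response = answer("What is Vertex AI?", fake_retrieve, fake_llm)
print(response)
```

In the real chain, the retriever is backed by the vector store and the model call goes to Vertex AI, so the quality of the answer depends heavily on which chunks are retrieved.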


We saw how easy it is to create a SingleStore database within Google Cloud Platform and then use custom data and custom private LLMs to build an end-to-end LLM application. You can continue to expand on this by adding your own UI layer and bringing both streaming and JSON data into SingleStore, where you can contextualize it for your LLMs in real time.