Bring Your Generative AI Application to the Next Level With watsonx.ai and SingleStoreDB


Pranav Aurora

Product Manager


As a component of the IBM watsonx platform, watsonx.ai is designed to merge groundbreaking generative AI technologies with conventional machine learning.

This all-in-one studio simplifies the AI development process, providing a streamlined environment for training, validating, tuning and deploying generative AI, foundation models and machine learning capabilities. With watsonx.ai, you can create AI quickly across the entire enterprise — even when working with limited datasets.

Clients can also integrate the platform with SingleStoreDB for real-time contextual data. This enables organizations to customize their large language models (LLMs) to meet specific business requirements. SingleStoreDB combines hybrid search and analytics capabilities to deliver high performance, serving as a knowledge base for generative AI applications and feeding accurate contextual data to watsonx.ai LLMs in just milliseconds.

Deploying watsonx.ai and SingleStoreDB can help address network costs and performance bottlenecks, and also offer added layers of security and ease of deployment. This unified platform is designed to deliver a holistic solution for the many AI needs of a business.

SingleStoreDB is known for its hybrid transactional/analytical processing (HTAP) storage engine, which delivers strong performance on both transactional and analytical queries and enables real-time analytics. Those capabilities extend to vectors: fast ingestion and efficient storage let applications like ibm.com serve semantic searches with low-latency responses.
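As a quick illustration (separate from the notebook below), here is a minimal sketch of what a semantic search query against SingleStoreDB can look like with its Python client. The connection string, the `myvectors` table and its `text`/`vector` columns are hypothetical, and the three-dimensional query vector is a toy stand-in for a real embedding; `DOT_PRODUCT` and `JSON_ARRAY_PACK` are the SingleStoreDB SQL functions commonly used to score a stored vector column against a query vector.

import singlestoredb as s2

# Hypothetical connection string, table and column names; adjust to your workspace
conn = s2.connect("admin:password@svc-example.singlestore.com:3306/demo")

# The query embedding, serialized as a JSON array (toy 3-dimensional example)
query_vector = "[0.12, -0.03, 0.57]"

cur = conn.cursor()
# DOT_PRODUCT scores each stored vector against the packed query vector
cur.execute(
    "SELECT text, DOT_PRODUCT(vector, JSON_ARRAY_PACK(%s)) AS score "
    "FROM myvectors ORDER BY score DESC LIMIT 5",
    (query_vector,),
)
for text_value, score in cur.fetchall():
    print(score, text_value[:80])
conn.close()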

Let’s take a simple example that you can use on your own through IBM watsonx sample notebooks. It lets you use watsonx.ai and SingleStoreDB to respond to natural language questions using the Retrieval Augmented Generation (RAG) approach. We’ll use a LangChain integration to make the developer experience easy.

Outline + steps

  • Setup and configuration. We ensure all the required packages are installed and configuration information (e.g. credentials) is provided.
  • Define query. We establish the query up front because we will use the same one in a basic LLM completion and in the RAG pattern.
  • Initialize language model. We select and configure the LLM.
  • Perform basic completion. We perform a basic completion with our query and LLM.
  • Get data for documents. We get and preprocess (e.g. split) the data we want to use in our knowledge base.
  • Initialize embedding model. We select and configure the embedding model we would like to use to encode our data for our knowledge base.
  • Initialize vector store. We initialize our vector store with our data and embedding model.
  • Perform similarity search. We use our initialized vector store and perform a similarity search with our query.
  • Perform RAG. We perform a completion with a RAG pipeline. In this version, we are explicitly passing the relevant docs (from our similarity search).
  • Perform RAG with Q+A chain. We perform a completion with a RAG pipeline. In this version, there is no explicit passing of relevant docs.

Setup and configuration

Dev settings

# Ignore warnings
import warnings
warnings.filterwarnings("ignore")

Packages

!pip install langchain -q
!pip install ibm-watson-machine-learning -q
!pip install wget -q
!pip install sentence-transformers -q
!pip install singlestoredb -q
!pip install sqlalchemy-singlestoredb -q

  • langchain: Orchestration framework
  • ibm-watson-machine-learning: For IBM LLMs
  • wget: To download knowledge base data
  • sentence-transformers: For embedding model
  • singlestoredb and sqlalchemy-singlestoredb: For connecting to SingleStoreDB as the vector store

Import utility packages

import os
import getpass

Environment variables and keys

watsonx URL

try:
    wxa_url = os.environ["WXA_URL"]
except KeyError:
    wxa_url = getpass.getpass("Please enter your watsonx.ai URL domain (hit enter): ")

watsonx API key

try:
    wxa_api_key = os.environ["WXA_API_KEY"]
except KeyError:
    wxa_api_key = getpass.getpass("Please enter your watsonx.ai API key (hit enter): ")

watsonx project ID

try:
    wxa_project_id = os.environ["WXA_PROJECT_ID"]
except KeyError:
    wxa_project_id = getpass.getpass("Please enter your watsonx.ai Project ID (hit enter): ")

SingleStoreDB connection

If you do not have a SingleStoreDB instance, you can start today with a free trial here. To get the connection strings:

  • Select a workspace
  • If the workspace is suspended, click Resume
  • Click Connect
  • Click Connect Directly
  • Click SQL IDE, which gives you SINGLESTORE_USER (admin for trials), SINGLESTORE_PASS (password), SINGLESTORE_HOST (the workspace endpoint) and SINGLESTORE_PORT (usually 3306)
  • Pick a name for your SINGLESTORE_DATABASE
try:
    connection_user = os.environ["SINGLESTORE_USER"]
except KeyError:
    connection_user = getpass.getpass("Please enter your SingleStore username (hit enter): ")

try:
    connection_password = os.environ["SINGLESTORE_PASS"]
except KeyError:
    connection_password = getpass.getpass("Please enter your SingleStore password (hit enter): ")

try:
    connection_port = os.environ["SINGLESTORE_PORT"]
except KeyError:
    connection_port = input("Please enter your SingleStore port (hit enter): ")

try:
    connection_host = os.environ["SINGLESTORE_HOST"]
except KeyError:
    connection_host = input("Please enter your SingleStore host (hit enter): ")

try:
    database_name = os.environ["SINGLESTORE_DATABASE"]
except KeyError:
    database_name = input("Please enter your SingleStore database name (hit enter): ")

try:
    table_name = os.environ["SINGLESTORE_TABLE"]
except KeyError:
    table_name = input("Please enter your SingleStore table name (hit enter): ")

Query

query = "What did the president say about Ketanji Brown Jackson?"

Language model

For our language model we will use Granite, an IBM-developed LLM.

from ibm_watson_machine_learning.foundation_models.utils.enums import ModelTypes
from ibm_watson_machine_learning.foundation_models import Model
from ibm_watson_machine_learning.metanames import GenTextParamsMetaNames as GenParams
from ibm_watson_machine_learning.foundation_models.utils.enums import DecodingMethods

parameters = {
    GenParams.DECODING_METHOD: DecodingMethods.GREEDY,
    GenParams.MIN_NEW_TOKENS: 1,
    GenParams.MAX_NEW_TOKENS: 100
}

model = Model(
    model_id=ModelTypes.GRANITE_13B_CHAT,
    params=parameters,
    credentials={
        "url": wxa_url,
        "apikey": wxa_api_key
    },
    project_id=wxa_project_id
)

from ibm_watson_machine_learning.foundation_models.extensions.langchain import WatsonxLLM

granite_llm_ibm = WatsonxLLM(model=model)

Basic completion

result = granite_llm_ibm(query)
print("Query: " + query)
print("Response: " + result)

Response from LLM: The president said that Ketanji Brown Jackson is an “incredible judge” and that he is “proud” to have nominated her to the Supreme Court.<|endoftext|>

Data for documents

Let’s now download the knowledge base file that we’ll load into documents.

import wget

filename = './state_of_the_union.txt'
url = 'https://raw.github.com/IBM/watson-machine-learning-samples/master/cloud/data/foundation_models/state_of_the_union.txt'

if not os.path.isfile(filename):
    wget.download(url, out=filename)

Embeddings

By default, we will be using the LangChain Hugging Face embedding model — which at the time of this writing is sentence-transformers/all-mpnet-base-v2.

Let’s split the documents into chunks:
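The chunking code itself isn’t shown here, so below is a minimal sketch that produces the `texts` list used by the vector store later on. It assumes LangChain’s TextLoader and CharacterTextSplitter; the 1,000-character chunk size and zero overlap are assumptions rather than values from the original notebook.

from langchain.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter

# Load the downloaded file as LangChain documents (assumed loader)
documents = TextLoader(filename).load()

# Split into roughly 1,000-character chunks (assumed size and overlap)
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.split_documents(documents)

With `texts` in hand, initialize the embedding model: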

from langchain.embeddings import HuggingFaceEmbeddings
embeddings = HuggingFaceEmbeddings()

Vector store

We are going to store the embeddings in SingleStoreDB.

Create a SingleStore SQLAlchemy engine

# Import only the SQLAlchemy names we use below
from sqlalchemy import create_engine, text

# Connection URL without a database; we use this engine to create the database
connection_url = f"singlestoredb://{connection_user}:{connection_password}@{connection_host}:{connection_port}"
engine = create_engine(connection_url)

Create database for embeddings (if one doesn’t already exist)

# Create database in SingleStoreDB
with engine.connect() as conn:
    result = conn.execute(text("CREATE DATABASE IF NOT EXISTS " + database_name))

Verify the database exists

print("Available databases:")
with engine.connect() as conn:
    result = conn.execute(text(
"SHOW DATABASES"))
    
for row in result:
        print(row)

Drop table for embeddings (if exists)

with engine.connect() as conn:
    result = conn.execute(text("DROP TABLE IF EXISTS " + database_name + "." + table_name))

Instantiate SingleStoreDB in LangChain

# Connection string to use LangChain with SingleStoreDB
os.environ["SINGLESTOREDB_URL"] = f"{connection_user}:{connection_password}@{connection_host}:{connection_port}/{database_name}"

from langchain.vectorstores import SingleStoreDB

vectorstore = SingleStoreDB.from_documents(
    texts,
    embeddings,
    table_name=table_name
)

Check table

with engine.connect() as conn:
    result = conn.execute(text("DESCRIBE " + database_name + "." + table_name))
    print(database_name + "." + table_name + " table schema:")
    for row in result:
        print(row)

    result = conn.execute(text("SELECT COUNT(vector) FROM " + database_name + "." + table_name))
    print("\nNumber of rows in " + database_name + "." + table_name + ": " + str(result.first()[0]))

Perform similarity search

Here, we’ll find the texts most similar (i.e. relevant) to our query. You can modify the number of results returned with the `k` parameter of the `similarity_search` method.

texts_sim = vectorstore.similarity_search(query, k=5)
print("Number of relevant texts: " + str(len(texts_sim)))

Response: Number of relevant texts: 5

print("First 100 characters of relevant texts.")
for i in range(len(texts_sim)):
        print(
"Text " + str(i) + ": " + str(texts_sim[i].page_content[0:100]))

Response:

First 100 characters of relevant texts.

Text 1: Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Ac

Text 2: A former top litigator in private practice. A former federal public defender. And from a family of p

Text 3: As Frances Haugen, who is here with us tonight, has shown, we must hold social media platforms accou

Text 4: And I’m taking robust action to make sure the pain of our sanctions  is targeted at Russia’s economy

Text 5: But cancer from prolonged exposure to burn pits ravaged Heath’s lungs and body.

Perform RAG with explicit context control

We’ll perform RAG using our model and explicit relevant knowledge (documents) from our similarity search.

from langchain.chains.question_answering import load_qa_chain

chain = load_qa_chain(granite_llm_ibm, chain_type="stuff")
result = chain.run(input_documents=texts_sim, question=query)

print("Query: " + query)
print("Response: " + result)

Response:

Query: What did the president say about Ketanji Brown Jackson?

Response: The president said that Ketanji Brown Jackson is a consensus builder who will continue Justice Breyer's legacy of excellence.<|endoftext|>

RAG Q+A chain

Here we perform RAG using a chain that combines our model and vector store. The chain retrieves the relevant texts under the hood.

from langchain.chains import RetrievalQA

qa = RetrievalQA.from_chain_type(
    llm=granite_llm_ibm,
    chain_type="stuff",
    retriever=vectorstore.as_retriever()
)
response = qa.run(query)

print("Query: " + query)
print("Response: " + response)

Response:

Query: What did the president say about Ketanji Brown Jackson?

Response: The president said that Ketanji Brown Jackson is a consensus builder who will continue Justice Breyer's legacy of excellence.<|endoftext|>

Conclusion

We saw how easy it is to integrate SingleStoreDB with IBM watsonx.ai: a knowledge base colocated with your watsonx stack enhances your LLM with fast retrieval across hybrid search and analytics. Start with watsonx.ai and SingleStoreDB today!
