Semantic Search: What Is It + How Does It Work?


Pavan Belagatti

Developer Evangelist


Semantic search in the context of generative AI, or any AI system, refers to the capability of the system to understand and process user queries based on the intent and contextual meaning rather than just relying on keywords.

Generative AI is moving fast, and nobody wants to be left behind. Large Language Models (LLMs), along with techniques and frameworks such as Retrieval Augmented Generation (RAG), LangChain and LlamaIndex, are changing how applications are built.

While Natural Language Processing (NLP) has helped computers understand human language, searching for and retrieving the right data calls for one more advancement. That is where semantic search comes into play.

With keyword search, systems sometimes misinterpret user queries and return confusing or irrelevant results. Adding semantic search functionality gives applications stronger search capabilities.

Today, we’ll dive deeper into what semantic search is, and how it works in the world of generative AI.

In the context of generative AI (or any AI system), semantic search refers to the system’s ability to understand and process user queries based on intent and contextual meaning rather than just relying on keywords. In other words, semantic search seeks to understand the nuances and relationships of words in a query to produce more relevant results or outputs.

Semantic search plays a vital role in generative AI, since it's not just about retrieving information but also about generating content that aligns with the user's intent and context. For example, if a user is looking to generate a story based on a specific theme, the AI would need to comprehend that theme semantically to produce a relevant and coherent story.

How semantic search works

Semantic search stands at the forefront of a paradigm shift in the way we interact with information, embodying a transition from keyword-based retrieval to a more nuanced, intent-driven dialogue with data. The development of generative AI models like OpenAI's GPT series has made significant strides in semantic understanding, allowing for more natural and contextually relevant interactions between users and AI.

Here's a step-by-step explanation of the semantic search process as shown in the diagram:

  1. User submits query. The user initiates the process by entering a search query into the system.

  2. Analyze intent and context. The LLM analyzes the query to understand the user's intent and context of the query.
  3. Extract intent and relationships. Semantic search processes the query to determine the relationships between the terms and the overall meaning.
  4. Return intent and relationships. The extracted intent and relationships are sent back to the LLM.
  5. Retrieve relevant data. The LLM uses the understood intent to retrieve data that is relevant to the query.
  6. Rank data based on relevance. The ranking algorithm evaluates the retrieved data from a vector database, ranking it according to its relevance to the query.
  7. Return ranked results. The ranked results are then sent back to the LLM.
  8. Present generated content/output. Finally, the LLM presents the generated content or search results to the user, completing the semantic search process.
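
The retrieval and ranking steps (5 to 7) can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the pipeline of any particular product: the three-dimensional vectors are hand-picked toy values standing in for real embeddings, and the names `toy_index`, `cosine_similarity` and `rank_documents` are hypothetical.

```python
import math

# Toy "embeddings": hand-picked 3-dimensional vectors, NOT produced by a real model.
# A real system would get these from an embedding model and keep them in a vector database.
toy_index = {
    "Return policy for shoes": [0.9, 0.1, 0.0],
    "How to track my order": [0.1, 0.9, 0.1],
    "Sneaker refund process": [0.8, 0.2, 0.1],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_documents(query_vector, index, top_k=2):
    """Score every document against the query and return the top_k, best first."""
    scored = [(cosine_similarity(query_vector, vec), doc) for doc, vec in index.items()]
    scored.sort(reverse=True)
    return scored[:top_k]

# Assumed toy embedding of a query such as "Can I send my trainers back?"
query_vector = [0.85, 0.15, 0.05]

for score, doc in rank_documents(query_vector, toy_index):
    print(f"{score:.3f}  {doc}")
```

With these toy values, the two documents about returns and refunds score far higher than the order-tracking document, even though the assumed query shares no words with either of them.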

It is very important for any system to understand user queries and present results in an accurate, contextually relevant format. Imagine you’re browsing your favorite eCommerce website: you enter a query in the search bar for the product you are looking for, and the search results are broken. All you see is a set of clothes presented on your screen. This, of course, negatively affects your entire user experience.

Now, that is where semantic search functionality plays a vital role.

The essence of semantic search lies in its ability to understand the intent and contextual nuances behind user queries, transforming the search experience from a simplistic keyword match to a sophisticated, intent-driven interaction. This leap is critical as it ensures users find genuinely relevant content, not just pages with keyword matches.

Semantic search's pivotal role in improving data retrieval accuracy across industries — from eCommerce to healthcare — streamlines operations, empowers informed decision making and enriches the overall user experience. By capturing the subtleties of human language, semantic search is reshaping our access to and interaction with the vast expanse of digital information.

Amazon has integrated semantic search into its eCommerce websites around the globe. Other companies that use semantic search include Google, Microsoft (Bing), IBM (watsonx), OpenAI and Anthropic. Even Elon Musk is interested in adding semantic search functionality to X (formerly Twitter).

Enough theory. Let’s see how semantic search works through a simple tutorial.

Semantic search tutorial

We understand that semantic search is about understanding the query's context and intent to return the most relevant results, not just matching keywords. To demonstrate this, we can use the sentence-transformers library to create embeddings for a set of documents and a query, and then perform a similarity search to find the most relevant document.

SingleStore Notebooks extend the capabilities of Jupyter Notebook so that data and AI professionals can work and experiment easily.

What is SingleStoreDB?

SingleStoreDB empowers the world’s leading organizations to build and scale modern applications using the only database that allows you to transact, analyze and contextualize data in real time. It offers streaming data ingestion, support for both transactions and analytics, horizontal scalability and hybrid vector search capabilities. 

Here is a step-by-step tutorial you can follow in a SingleStore Notebook.

But first, we need to sign up for a free SingleStore Helios account to use the Notebook feature. When you sign up, you will receive $600 in free computing resources.

Once you sign in to your SingleStore Helios account, you’ll see the following dashboard, where you need to click ‘Notebooks’ as shown.

Then, create a blank Notebook and name it as you wish. I am naming mine ‘semantic-search-demo’.

Once you create your Notebook, you will be presented with a dashboard where you can add code snippets and start working.

Follow along with the tutorial, adding each code snippet from the next steps to your Notebook and running it as you go. Let’s get started!

Step 1. Install the necessary libraries

First, you need to install the sentence-transformers library. Run this in a Jupyter Notebook cell:

!pip install sentence-transformers

For the first time, let me show you how to add the preceding command into your Notebook and run it:

Now, you should understand how to add the code into the Notebook and run it every time. You’ll do the same for the following commands and code snippets.

Step 2. Import the libraries

from sentence_transformers import SentenceTransformer, util
import numpy as np

Step 3. Load the pre-trained model

We will use a pre-trained model from the sentence-transformers library. This model is trained to generate embeddings that are useful for semantic similarity tasks.

model = SentenceTransformer('all-MiniLM-L6-v2')

Step 4. Define your documents and query

Define some documents and a query. The documents can be sentences, paragraphs or longer blocks of text.

# Example documents
documents = [
    "The quick brown fox jumps over the lazy dog.",
    "I had a great time at the park with my friends.",
    "The economy is showing signs of recovery after the pandemic.",
    "The surface of Mars is red due to iron oxide.",
    "Machine learning models have become very sophisticated."
]

# Example query
query = "Natural language processing models"

Step 5. Encode the documents and the query

We will create embeddings for both our documents and the query.

# Encode the documents
document_embeddings = model.encode(documents)

# Encode the query
query_embedding = model.encode(query)

Step 6. Perform semantic search

Now, we will use cosine similarity to find the most semantically similar document to the query.

# Compute similarity scores of the query against all documents
similarity_scores = util.pytorch_cos_sim(query_embedding, document_embeddings)

# Find the index of the highest score
highest_score_index = int(np.argmax(similarity_scores.cpu().numpy()))

print("The most semantically similar document to the query:")
print(documents[highest_score_index])

The output will be the document from your predefined list that has the highest cosine similarity score with your query "natural language processing models". This score is a numerical representation of how similar the document is to the query in the context of the embedding space created by the sentence-transformers model.

Here's what the output might look like:

In this example, the model has determined that the document discussing machine learning models is most semantically similar to your query about natural language processing models. This is because both sentences are related to the field of AI and the underlying concepts of models and learning, even though the exact words from the query may not be present in the document.
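
To see why exact word matching is a weaker signal, here is a minimal keyword-overlap baseline over the same documents. It is only a sketch for comparison: the helper names `tokenize` and `keyword_overlap` and the second query, `paraphrased_query`, are hypothetical additions that are not part of the tutorial.

```python
import re

documents = [
    "The quick brown fox jumps over the lazy dog.",
    "I had a great time at the park with my friends.",
    "The economy is showing signs of recovery after the pandemic.",
    "The surface of Mars is red due to iron oxide.",
    "Machine learning models have become very sophisticated.",
]
query = "Natural language processing models"

# Hypothetical rewording of the same information need, sharing no words with any document
paraphrased_query = "Advances in AI algorithms"

def tokenize(text):
    """Lowercase the text and return its set of alphabetic word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))

def keyword_overlap(query_text, document):
    """Count how many distinct query words also appear in the document."""
    return len(tokenize(query_text) & tokenize(document))

for q in (query, paraphrased_query):
    print(f"Query: {q}")
    for doc in documents:
        print(f"  {keyword_overlap(q, doc)} shared word(s): {doc}")
```

For the tutorial query, the only shared word with any document is "models". For the reworded query, no document shares a single word, so a pure keyword matcher has nothing to rank on. The embedding-based search in Step 6 does not depend on shared words in this way.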

You can enhance the output to provide more information or format it differently according to your needs. Here are some suggestions:

Create a DataFrame display. If you prefer a table format, you can use pandas to create a DataFrame that shows documents and their similarity scores.

import pandas as pd

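# Create a DataFrame for better visualization
# similarity_scores has shape (1, number of documents), so row 0 holds every score
scores = similarity_scores[0].tolist()
df = pd.DataFrame({'Document': documents, 'Similarity Score': scores})

# Sort the DataFrame based on similarity scores
df = df.sort_values(by='Similarity Score', ascending=False)

# Display the sorted table (the last expression in a Notebook cell is rendered)
df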


You can see the output here:

Visualize similarity scores. You can create a bar chart to visualize the similarity scores for each document.

import matplotlib.pyplot as plt

# Plot the similarity scores
plt.bar(range(len(documents)), similarity_scores[0].tolist())
plt.xticks(range(len(documents)), range(1, len(documents)+1))
plt.xlabel('Document Number')
plt.ylabel('Similarity Score')
plt.title('Semantic Similarity Scores')
plt.show()

See the output here:

There is much more you can do with these Notebooks to understand the concepts clearly. Check out all available tutorials in SingleStore Spaces.

Semantic search with SingleStoreDB

SingleStoreDB aids in semantic search by enabling the storage and querying of high-dimensional vector data within its distributed SQL database system. With its patented Universal Storage, SingleStoreDB is optimized for both OLTP and OLAP workloads, which is crucial for modern semantic search platforms that require fast transactional and analytical processing.

Developers can store vector embeddings directly in the database using binary or blob columns and employ built-in functions for efficient vector operations, including similarity matching through dot_product, to perform semantic queries.
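
The idea of keeping an embedding in a binary column and scoring it with a dot product can be illustrated in plain Python. This is a sketch of the concept only, not SingleStoreDB's API: the little-endian float32 layout and the helper names `pack_vector`, `unpack_vector`, `normalize` and `dot_product` are assumptions made for illustration, so check the SingleStoreDB documentation for the actual storage format and functions.

```python
import math
import struct

def pack_vector(vector):
    """Serialize a list of floats as little-endian 32-bit floats (an assumed blob layout)."""
    return struct.pack(f"<{len(vector)}f", *vector)

def unpack_vector(blob):
    """Inverse of pack_vector: every element occupies 4 bytes."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))

def normalize(vector):
    """Scale a vector to unit length."""
    length = math.sqrt(sum(x * x for x in vector))
    return [x / length for x in vector]

def dot_product(a, b):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

# The bytes that would be written to a binary or blob column for one document
stored_blob = pack_vector(normalize([3.0, 4.0, 0.0]))

# A unit-length query vector compared against the stored document vector
query_vec = normalize([4.0, 3.0, 0.0])
score = dot_product(query_vec, unpack_vector(stored_blob))

print(len(stored_blob), "bytes stored; similarity:", round(score, 4))
```

Because both vectors are scaled to unit length first, the dot product here equals their cosine similarity, which is why a dot product alone can serve as the similarity measure for normalized embeddings.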

SingleStoreDB's architecture is designed to handle large-scale vector similarity workloads with ease, utilizing its distributed nature, parallelization and Intel SIMD-based vector processing for rapid retrieval and real-time analytics. This means SingleStoreDB powers applications to perform semantic search that understands the context and nuance of user queries, delivering precise and relevant results at speed.

Try SingleStoreDB free.


Semantic search represents a monumental leap in how we interact with the digital world. By prioritizing intent and meaning over mere keywords, it redefines the boundaries of user engagement and information retrieval. As this technology continues to evolve, it will not only refine the accuracy of search results but also revolutionize the way we navigate and utilize the burgeoning universe of online content.

The future of search is undeniably semantic — promising a more intuitive, efficient and contextually aware landscape for users to explore the depths of human knowledge with unprecedented ease.