New

Demonstrate some common AI function usecases

Notebook


SingleStore Notebooks

Demonstrate some common AI function usecases

Note

You can use your existing Standard or Premium workspace with this Notebook.

This feature is currently in Private Preview. Please reach out to support@singlestore.com to confirm if this feature can be enabled in your org.

This Jupyter notebook will help you:

  1. Load the Amazon Fine Foods Reviews dataset from Kaggle

  2. Store the data in SingleStore

  3. Demonstrate powerful AI Functions for text processing and analysis

Prerequisites: Ensure AI Functions are installed on your deployment (AI Services > AI & ML Functions).

Create some simple tables

This setup establishes a basic relational structure to store some reviews for restaurants. Ensure you have selected a database and have CREATE permissions to create/delete tables.

In [1]:

1%%sql2CREATE DATABASE IF NOT EXISTS temp;3USE temp;

In [2]:

1%%sql2DROP TABLE IF EXISTS reviews;3
4CREATE TABLE IF NOT EXISTS reviews (5    Id INT PRIMARY KEY,6    ProductId VARCHAR(20),7    UserId VARCHAR(50),8    ProfileName VARCHAR(255),9    HelpfulnessNumerator INT,10    HelpfulnessDenominator INT,11    Score INT,12    Time BIGINT,13    Summary TEXT,14    Text TEXT15);

Install the required packages

In [3]:

1!pip install -q httplib2 kagglehub pandas

Download and Load Dataset

In [4]:

1import kagglehub2import pandas as pd3
4# Download the Amazon Fine Foods Reviews dataset from Kaggle5print("Downloading dataset from Kaggle...")6path = kagglehub.dataset_download("snap/amazon-fine-food-reviews")7print(f"Dataset downloaded to: {path}")8
9# Read the CSV file10df = pd.read_csv(f"{path}/Reviews.csv")11
12# Display dataset info13print(f"\nDataset shape: {df.shape}")14print(f"Columns: {list(df.columns)}")15print("\nFirst few rows:")16df.head()

Load Data into SingleStore

In [5]:

1import singlestoredb as s22
3# Create SQLAlchemy engine instead of regular connection4engine = s2.create_engine(database='temp')5
6# Take a sample of 10,000 reviews for demo purposes7sample_df = df.head(10000).copy()8
9print(f"Loading {len(sample_df)} reviews into SingleStore...")10
11# Write dataframe to SingleStore table using SQLAlchemy engine12sample_df.to_sql(13    'reviews',14    con=engine,  # Use engine instead of connection15    if_exists='append',16    index=False,17    chunksize=100018)19
20print("Data loaded successfully!")

Verify Data Load

In [6]:

1%%sql2-- Check the number of reviews loaded3SELECT COUNT(*) as total_reviews FROM reviews;

Sample Data Preview

In [7]:

1%%sql2-- View sample reviews3SELECT Id, ProductId, Score, Summary, LEFT(Text, 100) as Review_Preview4FROM reviews5LIMIT 10;

AI Functions Demonstrations

Now let's explore the power of SingleStore AI Functions for text analysis and processing. Ensure that AI functions are enabled for the org and you are able to list the available AI functions

In [8]:

1%%sql2SHOW functions in cluster;

In [9]:

1%%sql2-- AI_COMPLETE: Ask general questions and get LLM-powered completions3SELECT cluster.AI_COMPLETE(4    'What is SingleStore?'5) AS completion;

In [10]:

1%%sql2-- AI_SENTIMENT: Analyze sentiment of customer reviews for a specific product3-- WHERE ProductId = <Your choice>4-- Remember to specify the datbase name. In this example 'temp' is the Database name5SELECT6    Id,7    ProductId,8    Score,9    LEFT(Text, 80) as Review_Snippet,10    cluster.AI_SENTIMENT(Text) AS sentiment11FROM temp.reviews12WHERE ProductId = 'B000NY8ODS'13LIMIT 10;

In [11]:

1%%sql2-- Aggregate sentiment analysis across products3-- Using CTE to filter and prepare data first4WITH filtered_reviews AS (5    SELECT6        ProductId,7        Text8    FROM temp.reviews9    WHERE ProductId IN (10        SELECT ProductId11        FROM temp.reviews12        GROUP BY ProductId13        HAVING COUNT(*) >= 514    )15    LIMIT 10016),17grouped_reviews AS (18    SELECT19        ProductId,20        COUNT(*) as review_count,21        GROUP_CONCAT(Text SEPARATOR '. ') as combined_text22    FROM filtered_reviews23    GROUP BY ProductId24    LIMIT 525)26SELECT27    ProductId,28    review_count,29    cluster.AI_SENTIMENT(combined_text) as overall_sentiment30FROM grouped_reviews;

In [12]:

1%%sql2-- AI_SUMMARIZE: Create concise summaries of lengthy reviews3-- Filter long reviews first using CTE4WITH long_reviews AS (5    SELECT6        Id,7        ProductId,8        Text,9        LEFT(Text, 150) as Original_Review10    FROM temp.reviews11    WHERE LENGTH(Text) > 20012    LIMIT 513)14SELECT15    Id,16    ProductId,17    Original_Review,18    cluster.AI_SUMMARIZE(19        Text,20        'aifunctions_chat_default',21        1522    ) AS summary23FROM long_reviews;

In [13]:

1%%sql2-- AI_CLASSIFY: Classify customer feedback into categories3-- Filter negative reviews first using CTE4WITH negative_reviews AS (5    SELECT6        Id,7        ProductId,8        Text,9        LEFT(Text, 100) as Review_Text10    FROM temp.reviews11    WHERE Score <= 312    LIMIT 1013)14SELECT15    Id,16    ProductId,17    Review_Text,18    cluster.AI_CLASSIFY(19        Text,20        '[quality, price, shipping, taste]'21    ) AS classification22FROM negative_reviews;

In [14]:

1%%sql2-- AI_EXTRACT: Extract specific information from reviews3-- Filter positive reviews first using CTE4WITH positive_reviews AS (5    SELECT6        Id,7        ProductId,8        Text,9        LEFT(Text, 100) as Review_Text10    FROM temp.reviews11    WHERE Score >= 412    LIMIT 1013)14SELECT15    Id,16    ProductId,17    Review_Text,18    cluster.AI_EXTRACT(19        Text,20        'Does this customer indicate they will buy this product again? Answer with yes, no, or unclear only'21    ) AS repeat_purchase_intent22FROM positive_reviews;

In [15]:

1%%sql2-- AI_EXTRACT: Identify reviews with high churn risk3-- Filter low-rated reviews first using CTE4WITH low_rated_reviews AS (5    SELECT6        Id,7        ProductId,8        Score,9        Text,10        LEFT(Text, 120) as Review_Text11    FROM temp.reviews12    WHERE Score <= 213    LIMIT 1014)15SELECT16    Id,17    ProductId,18    Score,19    Review_Text,20    cluster.AI_EXTRACT(21        Text,22        'Is this customer at high risk of not purchasing again? Answer with high, medium, or low only'23    ) AS churn_risk24FROM low_rated_reviews;

In [16]:

1%%sql2-- AI_TRANSLATE: Translate text between languages3-- Filter reviews with substantial summaries first using CTE4WITH translatable_reviews AS (5    SELECT6        Id,7        Summary as Original_English8    FROM temp.reviews9    WHERE Score = 510    AND Summary IS NOT NULL11    AND LENGTH(Summary) > 2012    LIMIT 513)14SELECT15    Id,16    Original_English,17    cluster.AI_TRANSLATE(18        Original_English,19        'english',20        'spanish'21    ) AS spanish_translation22FROM translatable_reviews;

In [17]:

1%%sql2-- Combined AI Functions: Comprehensive product analysis3-- Filter to products with multiple reviews first4WITH popular_products AS (5    SELECT ProductId6    FROM temp.reviews7    GROUP BY ProductId8    HAVING COUNT(*) >= 109    LIMIT 510),11product_reviews AS (12    SELECT13        r.ProductId,14        r.Text,15        r.Score,16        LEFT(r.Text, 80) as Review_Sample17    FROM temp.reviews r18    INNER JOIN popular_products p ON r.ProductId = p.ProductId19    LIMIT 1020)21SELECT22    ProductId,23    Score,24    Review_Sample,25    cluster.AI_SENTIMENT(Text) as sentiment,26    cluster.AI_CLASSIFY(Text, '[quality, value, taste, packaging]') as category,27    cluster.AI_SUMMARIZE(Text, 'aifunctions_chat_default', 10) as brief_summary28FROM product_reviews;

Cleanup

In [18]:

1%%sql2DROP TABLE IF EXISTS reviews;3DROP DATABASE IF EXISTS temp;

Details


About this Template

Learn how to connect to use pre configured AI Functions.

This Notebook can be run in Standard and Enterprise deployments.

Tags

advancednotebookspython

See Notebook in action

Launch this notebook in SingleStore and start executing queries instantly.

License

This Notebook has been released under the Apache 2.0 open source license.