Demonstrate some common AI function usecases
Notebook
Note
You can use your existing Standard or Premium workspace with this Notebook.
This feature is currently in Private Preview. Please reach out to support@singlestore.com to confirm if this feature can be enabled in your org.
This Jupyter notebook will help you:
Load the Amazon Fine Foods Reviews dataset from Kaggle
Store the data in SingleStore
Demonstrate powerful AI Functions for text processing and analysis
Prerequisites: Ensure AI Functions are installed on your deployment (AI Services > AI & ML Functions).
Create some simple tables
This setup establishes a basic relational structure to store some reviews for restaurants. Ensure you have selected a database and have CREATE permissions to create/delete tables.
In [1]:
1%%sql2CREATE DATABASE IF NOT EXISTS temp;3USE temp;
In [2]:
1%%sql2DROP TABLE IF EXISTS reviews;3 4CREATE TABLE IF NOT EXISTS reviews (5 Id INT PRIMARY KEY,6 ProductId VARCHAR(20),7 UserId VARCHAR(50),8 ProfileName VARCHAR(255),9 HelpfulnessNumerator INT,10 HelpfulnessDenominator INT,11 Score INT,12 Time BIGINT,13 Summary TEXT,14 Text TEXT15);
Install the required packages
In [3]:
1!pip install -q httplib2 kagglehub pandas
Download and Load Dataset
In [4]:
1import kagglehub2import pandas as pd3 4# Download the Amazon Fine Foods Reviews dataset from Kaggle5print("Downloading dataset from Kaggle...")6path = kagglehub.dataset_download("snap/amazon-fine-food-reviews")7print(f"Dataset downloaded to: {path}")8 9# Read the CSV file10df = pd.read_csv(f"{path}/Reviews.csv")11 12# Display dataset info13print(f"\nDataset shape: {df.shape}")14print(f"Columns: {list(df.columns)}")15print("\nFirst few rows:")16df.head()
Load Data into SingleStore
In [5]:
1import singlestoredb as s22 3# Create SQLAlchemy engine instead of regular connection4engine = s2.create_engine(database='temp')5 6# Take a sample of 10,000 reviews for demo purposes7sample_df = df.head(10000).copy()8 9print(f"Loading {len(sample_df)} reviews into SingleStore...")10 11# Write dataframe to SingleStore table using SQLAlchemy engine12sample_df.to_sql(13 'reviews',14 con=engine, # Use engine instead of connection15 if_exists='append',16 index=False,17 chunksize=100018)19 20print("Data loaded successfully!")
Verify Data Load
In [6]:
1%%sql2-- Check the number of reviews loaded3SELECT COUNT(*) as total_reviews FROM reviews;
Sample Data Preview
In [7]:
1%%sql2-- View sample reviews3SELECT Id, ProductId, Score, Summary, LEFT(Text, 100) as Review_Preview4FROM reviews5LIMIT 10;
AI Functions Demonstrations
Now let's explore the power of SingleStore AI Functions for text analysis and processing. Ensure that AI functions are enabled for the org and you are able to list the available AI functions
In [8]:
1%%sql2SHOW functions in cluster;
In [9]:
1%%sql2-- AI_COMPLETE: Ask general questions and get LLM-powered completions3SELECT cluster.AI_COMPLETE(4 'What is SingleStore?'5) AS completion;
In [10]:
1%%sql2-- AI_SENTIMENT: Analyze sentiment of customer reviews for a specific product3-- WHERE ProductId = <Your choice>4-- Remember to specify the datbase name. In this example 'temp' is the Database name5SELECT6 Id,7 ProductId,8 Score,9 LEFT(Text, 80) as Review_Snippet,10 cluster.AI_SENTIMENT(Text) AS sentiment11FROM temp.reviews12WHERE ProductId = 'B000NY8ODS'13LIMIT 10;
In [11]:
1%%sql2-- Aggregate sentiment analysis across products3-- Using CTE to filter and prepare data first4WITH filtered_reviews AS (5 SELECT6 ProductId,7 Text8 FROM temp.reviews9 WHERE ProductId IN (10 SELECT ProductId11 FROM temp.reviews12 GROUP BY ProductId13 HAVING COUNT(*) >= 514 )15 LIMIT 10016),17grouped_reviews AS (18 SELECT19 ProductId,20 COUNT(*) as review_count,21 GROUP_CONCAT(Text SEPARATOR '. ') as combined_text22 FROM filtered_reviews23 GROUP BY ProductId24 LIMIT 525)26SELECT27 ProductId,28 review_count,29 cluster.AI_SENTIMENT(combined_text) as overall_sentiment30FROM grouped_reviews;
In [12]:
1%%sql2-- AI_SUMMARIZE: Create concise summaries of lengthy reviews3-- Filter long reviews first using CTE4WITH long_reviews AS (5 SELECT6 Id,7 ProductId,8 Text,9 LEFT(Text, 150) as Original_Review10 FROM temp.reviews11 WHERE LENGTH(Text) > 20012 LIMIT 513)14SELECT15 Id,16 ProductId,17 Original_Review,18 cluster.AI_SUMMARIZE(19 Text,20 'aifunctions_chat_default',21 1522 ) AS summary23FROM long_reviews;
In [13]:
1%%sql2-- AI_CLASSIFY: Classify customer feedback into categories3-- Filter negative reviews first using CTE4WITH negative_reviews AS (5 SELECT6 Id,7 ProductId,8 Text,9 LEFT(Text, 100) as Review_Text10 FROM temp.reviews11 WHERE Score <= 312 LIMIT 1013)14SELECT15 Id,16 ProductId,17 Review_Text,18 cluster.AI_CLASSIFY(19 Text,20 '[quality, price, shipping, taste]'21 ) AS classification22FROM negative_reviews;
In [14]:
1%%sql2-- AI_EXTRACT: Extract specific information from reviews3-- Filter positive reviews first using CTE4WITH positive_reviews AS (5 SELECT6 Id,7 ProductId,8 Text,9 LEFT(Text, 100) as Review_Text10 FROM temp.reviews11 WHERE Score >= 412 LIMIT 1013)14SELECT15 Id,16 ProductId,17 Review_Text,18 cluster.AI_EXTRACT(19 Text,20 'Does this customer indicate they will buy this product again? Answer with yes, no, or unclear only'21 ) AS repeat_purchase_intent22FROM positive_reviews;
In [15]:
1%%sql2-- AI_EXTRACT: Identify reviews with high churn risk3-- Filter low-rated reviews first using CTE4WITH low_rated_reviews AS (5 SELECT6 Id,7 ProductId,8 Score,9 Text,10 LEFT(Text, 120) as Review_Text11 FROM temp.reviews12 WHERE Score <= 213 LIMIT 1014)15SELECT16 Id,17 ProductId,18 Score,19 Review_Text,20 cluster.AI_EXTRACT(21 Text,22 'Is this customer at high risk of not purchasing again? Answer with high, medium, or low only'23 ) AS churn_risk24FROM low_rated_reviews;
In [16]:
1%%sql2-- AI_TRANSLATE: Translate text between languages3-- Filter reviews with substantial summaries first using CTE4WITH translatable_reviews AS (5 SELECT6 Id,7 Summary as Original_English8 FROM temp.reviews9 WHERE Score = 510 AND Summary IS NOT NULL11 AND LENGTH(Summary) > 2012 LIMIT 513)14SELECT15 Id,16 Original_English,17 cluster.AI_TRANSLATE(18 Original_English,19 'english',20 'spanish'21 ) AS spanish_translation22FROM translatable_reviews;
In [17]:
1%%sql2-- Combined AI Functions: Comprehensive product analysis3-- Filter to products with multiple reviews first4WITH popular_products AS (5 SELECT ProductId6 FROM temp.reviews7 GROUP BY ProductId8 HAVING COUNT(*) >= 109 LIMIT 510),11product_reviews AS (12 SELECT13 r.ProductId,14 r.Text,15 r.Score,16 LEFT(r.Text, 80) as Review_Sample17 FROM temp.reviews r18 INNER JOIN popular_products p ON r.ProductId = p.ProductId19 LIMIT 1020)21SELECT22 ProductId,23 Score,24 Review_Sample,25 cluster.AI_SENTIMENT(Text) as sentiment,26 cluster.AI_CLASSIFY(Text, '[quality, value, taste, packaging]') as category,27 cluster.AI_SUMMARIZE(Text, 'aifunctions_chat_default', 10) as brief_summary28FROM product_reviews;
Cleanup
In [18]:
1%%sql2DROP TABLE IF EXISTS reviews;3DROP DATABASE IF EXISTS temp;

Details
About this Template
Learn how to connect to use pre configured AI Functions.
This Notebook can be run in Standard and Enterprise deployments.
Tags
See Notebook in action
Launch this notebook in SingleStore and start executing queries instantly.
License
This Notebook has been released under the Apache 2.0 open source license.