Python UDF to process images

Note

This tutorial is meant for Standard and Premium Workspaces. It cannot run on a Free Starter Workspace because of storage restrictions. Create a Workspace using +group in the left nav and select Standard for this notebook. Gallery notebooks tagged with "Starter" are suitable to run on a Free Starter Workspace.

In [1]:

# Install required libraries
!pip install -q langchain==0.3.27 langchain-openai==0.2.10 langchain-community==0.3.25 langchain-core==0.3.72 pillow==10.4.0 aiofiles==24.1.0

In [2]:

# Import necessary modules
import base64
import aiohttp
import io

from singlestoredb.functions import udf
import singlestoredb.apps as apps
from singlestoredb.management import get_secret
from openai import AsyncOpenAI
from PIL import Image

# Configuration for the multimodal LLM (replace with the endpoint and API key of your model)
INFERENCE_API_MODEL_NAME_1 = "gpt-4-1"
INFERENCE_API_MODEL_API_ENDPOINT_1 = "https://ai.us-east-1.cloud.singlestore.com/b45ad4a4-a620-4ed0-9ffc-76d83ebf9bc7/v1"
# Secrets are the recommended way to manage sensitive information like API keys for use within SingleStore Notebooks
INFERENCE_API_MODEL_API_AUTH_1 = get_secret('GPT_4_1')

# Set up the AsyncOpenAI client
async_llm_client = AsyncOpenAI(
    api_key=INFERENCE_API_MODEL_API_AUTH_1,
    base_url=INFERENCE_API_MODEL_API_ENDPOINT_1
)

async def download_image_from_url(image_url: str, max_size: int = 1024) -> str:
    """
    Download image from URL and encode to base64 for the vision model API.

    Parameters
    ----------
    image_url : str
        URL of the image to download
    max_size : int, optional
        Maximum size of the longer side, in pixels; larger images are
        resized to save on API costs, by default 1024

    Returns
    -------
    str
        Base64 encoded image string

    Raises
    ------
    Exception
        If image download fails or processing encounters an error
    """
    try:
        # Set proper headers to avoid 403 errors
        headers = {
            'User-Agent': 'SingleStore-ImageAnalyzer/1.0 (Product-Manager-Bot; https://singlestore.com)',
            'Accept': 'image/jpeg,image/png,image/webp,image/*,*/*;q=0.8',
            'Accept-Language': 'en-US,en;q=0.5',
            'Accept-Encoding': 'gzip, deflate',
            'DNT': '1',
            'Connection': 'keep-alive',
            'Upgrade-Insecure-Requests': '1',
        }

        # Configure timeout settings (30 seconds for the whole request)
        timeout = aiohttp.ClientTimeout(total=30)

        # Download image from URL with proper headers
        async with aiohttp.ClientSession(headers=headers, timeout=timeout) as session:
            async with session.get(image_url) as response:
                if response.status != 200:
                    raise Exception(f"Failed to download image: HTTP {response.status}")

                image_data = await response.read()

        # Process image with PIL
        with Image.open(io.BytesIO(image_data)) as img:
            # Convert to RGB if needed; JPEG cannot store alpha or palette data
            if img.mode != "RGB":
                img = img.convert("RGB")

            # Resize if too large to save on API costs
            if max(img.size) > max_size:
                img.thumbnail((max_size, max_size), Image.Resampling.LANCZOS)

            # Re-encode as JPEG and convert to base64
            buffer = io.BytesIO()
            img.save(buffer, format='JPEG', quality=85)
            return base64.b64encode(buffer.getvalue()).decode('utf-8')

    except Exception as e:
        # Chain the original exception so the root cause stays visible
        raise Exception(f"Error processing image from URL {image_url}: {str(e)}") from e

@udf
async def AI_IMG_COMPLETE(image_url: str, prompt: str) -> str:
    """
    Process image from URL with the multimodal LLM and return analysis.

    Parameters
    ----------
    image_url : str
        URL of the image to analyze
    prompt : str
        Text prompt for the vision model

    Returns
    -------
    str
        String output from the vision model analysis, or an error
        message if the download or the model call fails

    Notes
    -----
    This function downloads an image from the provided URL, processes it through
    the configured multimodal LLM, and returns the AI's analysis based on
    the given prompt.
    """

    try:
        # Download and encode image
        base64_image = await download_image_from_url(image_url)

        # Create vision request
        messages = [{
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": prompt
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/jpeg;base64,{base64_image}",
                        "detail": "high"
                    }
                }
            ]
        }]

        # Call the vision model API
        response = await async_llm_client.chat.completions.create(
            model=INFERENCE_API_MODEL_NAME_1,
            messages=messages,
            max_tokens=500,
            temperature=0.2
        )

        return response.choices[0].message.content

    except Exception as e:
        # Return the error as text so a failing row does not abort the whole query
        error_msg = f"Error processing image analysis: {str(e)}"
        return error_msg
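
The resize step in download_image_from_url caps the longer side of the image at max_size while keeping the aspect ratio. The arithmetic can be sketched in plain Python as below. Note that fit_within is a hypothetical helper written for illustration, not part of Pillow or of this notebook, and Pillow's own rounding inside thumbnail may differ by a pixel.

```python
def fit_within(width: int, height: int, max_size: int = 1024) -> tuple[int, int]:
    """Return (width, height) scaled so the longer side is at most max_size.

    Hypothetical helper for illustration only; in the notebook,
    Image.thumbnail does the real work.
    """
    longest = max(width, height)
    if longest <= max_size:
        # Already small enough: never upscale.
        return width, height
    scale = max_size / longest
    # Keep at least one pixel per side after rounding.
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(4000, 3000))  # (1024, 768)
print(fit_within(800, 600))    # (800, 600)
```

A 4000x3000 photo is reduced to 1024x768 before it is encoded, while an 800x600 image is sent unchanged.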

In [3]:

# Start Python UDF server
connection_info = await apps.run_udf_app()

Publish the Notebook as a UDF

  1. Click on the publish button at the top right of your notebook editor window to publish these functions in

Now open a new SQL Editor and run the following commands to test the UDFs.

SQL Commands

SHOW functions;

Example 1: Analyze a sample image with AI_IMG_COMPLETE

SELECT AI_IMG_COMPLETE(
    'https://upload.wikimedia.org/wikipedia/commons/b/b6/Mount_Everest_as_seen_from_Drukair2_PLW_edit_Cropped.jpg',
    'which country do you think this image is from? Answer in one word'
) as analysis_result;
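
For a call like this one, AI_IMG_COMPLETE pairs the prompt with the downloaded image, inlined as a base64 data URL. The sketch below rebuilds that message structure with the standard library only, so it can be inspected without calling a model. The build_vision_messages name is a hypothetical helper, and the three placeholder bytes merely stand in for a real JPEG.

```python
import base64


def build_vision_messages(image_bytes: bytes, prompt: str) -> list[dict]:
    """Build a chat message pairing a text prompt with an inline JPEG.

    Hypothetical helper mirroring the structure used in AI_IMG_COMPLETE.
    """
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return [{
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {
                "type": "image_url",
                "image_url": {
                    "url": f"data:image/jpeg;base64,{encoded}",
                    "detail": "high",
                },
            },
        ],
    }]


# Placeholder bytes stand in for a real JPEG; nothing is sent anywhere.
messages = build_vision_messages(b"\xff\xd8\xff", "Which country is this image from?")
print(messages[0]["content"][1]["image_url"]["url"])
```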

Example 2: Multiple images (better done in batches)

SELECT
    image_url,
    AI_IMG_COMPLETE(image_url, 'describe the main subject matter in 15 words or less') as description
FROM (
    SELECT 'https://upload.wikimedia.org/wikipedia/commons/4/41/A_Man_on_the_Moon%2C_AS11-40-5903_%28cropped%29.jpg' as image_url
    UNION ALL
    SELECT 'https://upload.wikimedia.org/wikipedia/commons/4/4c/Series-N700a-Mt.Fuji.jpg'
    UNION ALL
    SELECT 'https://upload.wikimedia.org/wikipedia/commons/b/b6/Mount_Everest_as_seen_from_Drukair2_PLW_edit_Cropped.jpg'
) as test_images;
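
One way to read "better done in batches" is to limit how many images are analyzed at the same time. The sketch below shows that idea in client-side Python with a semaphore, assuming the calls are issued from Python rather than from SQL. Both analyze_in_batches and fake_analyze are hypothetical names, and fake_analyze is a stand-in that never contacts a model.

```python
import asyncio
from typing import Awaitable, Callable


async def analyze_in_batches(
    urls: list[str],
    analyze: Callable[[str], Awaitable[str]],
    max_concurrent: int = 4,
) -> list[str]:
    """Run analyze(url) for every URL with at most max_concurrent in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(url: str) -> str:
        # Wait for a free slot before starting the call.
        async with semaphore:
            return await analyze(url)

    # gather returns results in the same order as the input URLs.
    return await asyncio.gather(*(guarded(u) for u in urls))


async def fake_analyze(url: str) -> str:
    # Stand-in for a real model call; hypothetical, for illustration only.
    await asyncio.sleep(0.01)
    return f"description of {url}"


results = asyncio.run(
    analyze_in_batches(["a.jpg", "b.jpg", "c.jpg"], fake_analyze, max_concurrent=2)
)
print(results)
```

Inside a notebook cell an event loop is already running, so the last call would be written as `results = await analyze_in_batches(...)` instead of using asyncio.run.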

Details


About this Template

Learn how to integrate with multimodal LLMs and call them from a Python UDF.

This Notebook can be run in Standard and Enterprise deployments.

Tags

advanced, notebooks, python

License

This Notebook has been released under the Apache 2.0 open source license.