Snowflake LLM Assisted Query Monitoring

If you don’t have a Snowflake DBA or you’re a Snowflake DBA that wants to get some assistence across all the databases in your account, you can leverage Snowflake’s built-in LLM functionality to look for and analyze expensive or poorly performing queries.

It can save you time by producing a documented starting point for identifying and addressing problems that might be costing more money than necessary.

Things often fall through the cracks because it’s no one’s job to constantly monitor and improve the query workload. This can result in higher than necessary compute costs and slow queries.

In Snowflake, this might fall into one or more categories (non-exhaustive):

  1. Poorly designed schema
  2. Queries using inefficient anti-patterns (such as NOT IN, or accidental JOIN explosions, …)
  3. Inefficient data loading patterns and long running ELT/ETL processes
  4. Poorly clustered tables and insufficient partition pruning (see Snowflake: Clustered Tables)

I built a project, in GitHub with a MIT License, that leverages Snowflake’s hosted LLM models to analyze your Snowflake query workload. It sends an HTML formatted email to an email address/email distribution list (preferrable).

When using Snowflake Cortex, your data never leaves Snowflake’s security boundaries. Customer data is strictly isolated within your account boundary and is never used to train or fine-tune third-party large language models (LLMs). See Snowflake AI Trust and Safety FAQs

Overview

The overview looks more complicated than it is! It’s basically: collect → analyze → email.

  • No external infrastructure is required. The whole system lives natively inside Snowflake: the DB, the scheduler (Task), the AI, and the email integration.
  • Isolated database + schema. MONITORING.AGENT is its own isolated database+schema that acts as a private audit trail. All four tables are append-only snapshots keyed by run_timestamp – so you get a full history of every monitoring run, not just the latest state. The AGENT_FINDINGS table stores the raw LLM prompt alongside the AI response, which gives you full reproducibility and lets you compare how findings change over time.
  • The execution chain is sequential. QUERY_MONITORING() is the sole entry point. It sets a QUERY_TAG = ‘QUERY_MONITORING’ at the start (so its own queries are excluded), then calls three procs in order: collect → analyze → email.
  • The AI layer sits inside Snowflake entirely. SEND_FINDINGS_TO_CORTEX() converts the query data into a Markdown table, crafts a detailed system prompt (senior DBA persona, output format specified as Outlook-compatible HTML tables, today’s date injected), then calls SNOWFLAKE.CORTEX.AI_COMPLETE() — the LLM model is claude-opus-4-7 by default but configurable. The AI output is stored in AGENT_FINDINGS before being consumed by the email proc.
  • The email is built entirely in SQL. SEND_FINDINGS_EMAIL() assembles a full HTML document then fires SYSTEM$SEND_EMAIL() via the notification integration. The Python to_html_table() helper (Snowpark, Python 3.13) renders Outlook-compatible zebra-striped tables with inline CSS.

Example Output

The format of the HTML output in the email can vary based on the model used. Here’s an example: