
The 25 Best AI Prompts for SQL Queries and Data Analysis

March 12, 2026 by Promptzy
ai prompts sql · chatgpt sql prompts · ai data analysis prompts · sql query ai prompts

SQL sits at the exact sweet spot where AI is useful without being dangerous. The language is strict enough that a wrong answer usually produces a clear error rather than silently wrong data, and verbose enough that writing it from scratch for the tenth time this week is genuinely tedious. The trap is in the silent wrong answers, which is why every SQL prompt should end with a sanity check. The prompts below are built around that discipline: get the query fast, then verify it actually says what you thought it said.

Below are 25 prompts I run when I am writing and debugging SQL. Table schemas, sample rows, slow queries, and raw business questions all go into {{clipboard}}. Pick the ones that match your workflow and keep them a keystroke away so you are not re-explaining the schema to an AI fifty times a day.


Promptzy syncs AI skills across Claude, Cursor, OpenClaw, ChatGPT, and Gemini

Download Free for macOS

Writing queries from plain English

1. Write a SQL query from a plain English description

I need a SQL query to answer a specific question. I will describe what I want and paste the relevant table schemas and a few sample rows.

Here is everything:

{{clipboard}}

Produce:

1. The SQL query, formatted cleanly with proper indentation and one condition per line.
2. A one or two sentence explanation of what the query does.
3. Any assumption I am making about the data that the query depends on (for example, "assuming every user has exactly one row in the users table").
4. A sanity check I could run to verify the result is reasonable.
5. A flag if the query is likely to be slow on large tables, with a note on what would help (index, rewrite, materialized view).
6. The SQL dialect you wrote it for (Postgres, MySQL, BigQuery, Snowflake, etc.). Flag any function that is dialect specific.

If the description is ambiguous, ask one clarifying question before writing the query.

2. Translate a SQL query between dialects

I have a SQL query that works in one database and I need it to work in another.

Here is the query, the source dialect, and the target dialect:

{{clipboard}}

Produce:

1. The translated query.
2. Any function or syntax that does not have a direct equivalent, with the workaround.
3. A note on any performance characteristic that changes between dialects.
4. A warning about any date, timestamp, or timezone difference between the two dialects.
5. A test case I could run in both dialects to verify they produce the same result.
6. A flag if the query cannot be cleanly translated, with an explanation.

3. Explain what a query does in plain English

I have a SQL query and I want a plain English explanation of what it does, before I run it or modify it.

Here is the query:

{{clipboard}}

Produce:

1. A one sentence summary of the query's goal.
2. A step by step walkthrough: what happens in each CTE, subquery, join, or clause, in the order the database actually evaluates them.
3. The final shape of the result: what columns, what rows, what order.
4. Any assumption the query makes about the data.
5. Any subtle thing in the query that a reader might misread (a filter inside a join versus after it, a distinct where it might be unexpected).
6. A one line verdict: is this query well structured, or would a reviewer flag anything?

Joins and subqueries

4. Pick the right join type for a specific question

I have a question and I am not sure whether to use INNER, LEFT, RIGHT, FULL OUTER, CROSS, or something else.

Here is the question and the relevant tables:

{{clipboard}}

Produce:

1. The recommended join type with a one sentence justification.
2. The exact ON condition, with attention to nullable columns and composite keys.
3. A query using the recommended join.
4. The behavior of the query if a row on either side has no match.
5. A sanity check: what counts I should expect and why.
6. An alternative approach using a subquery or EXISTS if that would be cleaner for this question.
7. A warning if the question requires deduplication after the join, and how to handle it.
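
The match behavior in point 4 is easy to see on toy data. A minimal runnable sketch using Python's bundled sqlite3 module; the users/orders schema and rows are invented for illustration:

```python
import sqlite3

# Hypothetical two-table setup: one user has no orders at all.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL);
    INSERT INTO users  VALUES (1, 'ana'), (2, 'bo'), (3, 'cy');
    INSERT INTO orders VALUES (10, 1, 20.0), (11, 1, 35.0), (12, 2, 5.0);
""")

# INNER JOIN drops users with no orders; LEFT JOIN keeps them with NULL totals.
inner = conn.execute(
    "SELECT COUNT(*) FROM users u JOIN orders o ON o.user_id = u.id"
).fetchone()[0]
left = conn.execute(
    "SELECT COUNT(*) FROM users u LEFT JOIN orders o ON o.user_id = u.id"
).fetchone()[0]

print(inner, left)
```

The user with no orders survives only the LEFT JOIN, which is exactly the count difference a sanity check should predict.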

5. Write a query that uses multiple joins

I need to join three or more tables to answer a question. The join path and the filtering order matter.

Here are the tables and the question:

{{clipboard}}

Produce:

1. The query with all joins in a readable order.
2. A comment next to each join explaining why it is there and why it is the join type it is.
3. A check for Cartesian products: is any join missing a condition that could explode the row count?
4. The filter order: which filters go in WHERE, which go in ON, and why.
5. A sanity check query that verifies the row count at each stage is reasonable.
6. A simpler version using CTEs to make the logic easier to read if the single query becomes unwieldy.

6. Rewrite a subquery as a join or vice versa

I have a query that uses subqueries where a join might be cleaner, or uses joins where a subquery might be clearer. Rewrite it for readability and check if the performance changes.

Here is the query:

{{clipboard}}

Produce:

1. The rewritten version using the alternative approach.
2. A note on which version is more readable for this specific case.
3. A note on which version is likely to be faster in a typical database, with reasoning.
4. Any semantic difference between the two versions (they should return the same rows, but sometimes subtle differences creep in with NULL handling or duplicates).
5. A recommendation on which to use and why.
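
The NULL subtlety in point 4 bites hardest with NOT IN versus NOT EXISTS. A runnable illustration using Python's sqlite3 module; the tables and rows are invented:

```python
import sqlite3

# Hypothetical data: one order row has a NULL user_id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY);
    CREATE TABLE orders (user_id INTEGER);
    INSERT INTO users  VALUES (1), (2), (3);
    INSERT INTO orders VALUES (1), (NULL);
""")

# "Users with no orders" -- two rewrites that look equivalent but are not.
not_in = conn.execute(
    "SELECT id FROM users WHERE id NOT IN (SELECT user_id FROM orders)"
    " ORDER BY id"
).fetchall()
not_exists = conn.execute(
    "SELECT id FROM users WHERE NOT EXISTS"
    " (SELECT 1 FROM orders o WHERE o.user_id = users.id) ORDER BY id"
).fetchall()

# NOT IN returns nothing once the subquery contains a NULL;
# NOT EXISTS returns users 2 and 3 as intended.
print(not_in, not_exists)
```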

Aggregation and window functions

7. Write a query with aggregation and grouping

I need a query that aggregates data by one or more groupings, with specific functions applied.

Here is the question and the table:

{{clipboard}}

Produce:

1. The query using GROUP BY with the right aggregate functions (SUM, COUNT, AVG, MIN, MAX, COUNT DISTINCT, etc.).
2. A HAVING clause if the question requires filtering on aggregated values.
3. A note on which columns must be in the GROUP BY versus which can be aggregated.
4. A treatment of NULLs: whether they are counted, excluded, or grouped separately.
5. A sanity check: does the total of the grouped result match the total of the raw data?
6. An alternative using window functions if grouping is the wrong approach for this question.
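
The sanity check in point 5 can be automated. A sketch with Python's sqlite3 module; the sales table is invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount REAL);
    INSERT INTO sales VALUES
        ('east', 100.0), ('east', 50.0), ('west', 75.0), (NULL, 25.0);
""")

raw_total = conn.execute("SELECT SUM(amount) FROM sales").fetchone()[0]
grouped = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region"
).fetchall()
grouped_total = sum(total for _, total in grouped)

# NULL regions form their own group in GROUP BY, so the totals reconcile.
print(raw_total, grouped_total)
```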

8. Write a query using a window function

I need a running total, rank, moving average, previous value, or percentile using a window function.

Here is the question and the data:

{{clipboard}}

Produce:

1. The query using the correct window function (ROW_NUMBER, RANK, DENSE_RANK, LAG, LEAD, SUM OVER, AVG OVER, NTILE, PERCENT_RANK, etc.).
2. The exact PARTITION BY and ORDER BY clauses.
3. An explanation of why those are the right choices for my question.
4. A note on whether the window needs a ROWS or RANGE frame, and why.
5. A sample expected result for a small subset of the data.
6. A warning if the query will be expensive on large datasets and how to mitigate.
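
For reference, a minimal PARTITION BY / ORDER BY running total, runnable with Python's sqlite3 module (assumes a SQLite build with window function support, 3.25+; the daily table is invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE daily (day TEXT, user_id INTEGER, amount REAL);
    INSERT INTO daily VALUES
        ('2024-01-01', 1, 10.0), ('2024-01-02', 1, 20.0),
        ('2024-01-01', 2, 5.0),  ('2024-01-03', 1, 30.0);
""")

# Running total per user: PARTITION BY resets the sum for each user,
# ORDER BY defines the accumulation order within the partition.
rows = conn.execute("""
    SELECT user_id, day,
           SUM(amount) OVER (
               PARTITION BY user_id ORDER BY day
           ) AS running_total
    FROM daily
    ORDER BY user_id, day
""").fetchall()

for r in rows:
    print(r)
```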

9. Calculate period over period changes

I need a query that compares a metric period over period: month over month, week over week, or year over year.

Here is the question and the data:

{{clipboard}}

Produce:

1. The query that returns the current period, the comparison period, the absolute change, and the percent change.
2. A handling for divide by zero when the comparison period is zero.
3. A note on how the date truncation should work (calendar weeks, rolling weeks, fiscal weeks).
4. Handling for missing periods: should they show as zero, as null, or be skipped?
5. A sanity check on a specific known period to verify the comparison is calculating correctly.
6. An alternative approach using LAG if that would be simpler than a self join.
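
The LAG alternative from point 6, with the divide-by-zero guard from point 2, in a runnable sketch using Python's sqlite3 module (monthly table invented; needs SQLite 3.25+ for window functions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE monthly (month TEXT, revenue REAL);
    INSERT INTO monthly VALUES
        ('2024-01', 0.0), ('2024-02', 120.0), ('2024-03', 150.0);
""")

# LAG pulls the prior month onto the current row; NULLIF turns a zero
# comparison period into NULL instead of a division error.
rows = conn.execute("""
    SELECT month,
           revenue,
           revenue - LAG(revenue) OVER (ORDER BY month) AS abs_change,
           ROUND(100.0 * (revenue - LAG(revenue) OVER (ORDER BY month))
                 / NULLIF(LAG(revenue) OVER (ORDER BY month), 0), 1)
               AS pct_change
    FROM monthly
    ORDER BY month
""").fetchall()

for r in rows:
    print(r)
```

The February row shows a NULL percent change, because growth from zero is undefined, not infinite.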

Debugging queries

10. Debug a query that returns the wrong number of rows

My query returns more rows or fewer rows than I expected. Help me find out why.

Here is the query and the expected row count:

{{clipboard}}

Walk me through:

1. The most likely causes of row inflation: joins without correct conditions, duplicate keys, Cartesian products.
2. The most likely causes of row deflation: inner join filtering out rows, WHERE clause on a nullable column, missing OR conditions.
3. A diagnostic query that counts rows at each stage of the query, so I can see where the number changes.
4. A fix for the most likely cause.
5. A note on whether the expected row count is actually correct, or whether my expectation is the thing that is wrong.
6. A test query that verifies the fix returns the expected count.

11. Debug a query that returns the wrong values

My query returns the right number of rows but the values are wrong. Help me trace the calculation.

Here is the query, a sample of the wrong output, and what I expected:

{{clipboard}}

Walk me through:

1. The specific column that is wrong, and what the correct calculation should be.
2. A trace of how the query arrives at the wrong value, step by step.
3. The most likely bug: wrong aggregate, wrong join condition causing duplicate contribution, wrong filter, NULL handling, or a CASE that is not exhaustive.
4. A corrected version of the query.
5. A verification step on a specific row or group where I know the right answer.
6. A defensive pattern to use next time to catch similar bugs early.

12. Diagnose a query that is returning duplicates

My query returns duplicate rows that should not exist. Help me find the cause.

Here is the query:

{{clipboard}}

Produce:

1. The most likely cause of duplicates: a join that matches multiple rows, a missing DISTINCT, a grouping that is not tight enough, or a many to many relationship I did not account for.
2. A diagnostic to confirm the cause: count the rows per join candidate to see which table is exploding.
3. A fix that removes the duplicates without hiding the real bug.
4. A note on whether DISTINCT is the right fix or a lazy one. DISTINCT is usually a symptom, not a cure.
5. A rewrite using the correct aggregation or filtering that produces unique rows naturally.
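
The diagnostic in point 2 is a one-line GROUP BY ... HAVING. A runnable sketch with Python's sqlite3 module; the users/addresses tables are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users     (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE addresses (user_id INTEGER, city TEXT);
    INSERT INTO users     VALUES (1, 'a@x.com'), (2, 'b@x.com');
    -- user 1 has two addresses: the join will duplicate that user
    INSERT INTO addresses VALUES (1, 'berlin'), (1, 'paris'), (2, 'rome');
""")

# The symptom: two rows for user 1 after the join.
joined = conn.execute(
    "SELECT u.id FROM users u JOIN addresses a ON a.user_id = u.id"
).fetchall()

# The diagnostic: which join key matches more than one row?
exploding = conn.execute("""
    SELECT user_id, COUNT(*) AS n
    FROM addresses GROUP BY user_id HAVING COUNT(*) > 1
""").fetchall()

print(len(joined), exploding)
```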

Query optimization

13. Explain why a query is slow and how to fix it

I have a query that is taking too long and I need to understand why and fix it.

Here is the query, the schema, and the EXPLAIN output if I have it:

{{clipboard}}

Produce:

1. The most likely reasons the query is slow: missing index, bad join order, full table scan, expensive function in WHERE, correlated subquery, or data skew.
2. The specific operation in the query plan that is the biggest contributor, if visible.
3. An index recommendation: which columns, in what order, and on which table.
4. A rewrite of the query that avoids the most expensive operation if possible.
5. A note on whether the query can be materialized, cached, or moved to a different database path.
6. A before and after plan comparison I could check to verify the fix worked.

Do not recommend rewriting the query if an index will fix it. Rewrite only when the structure is the problem.

14. Find a missing index for a query

I have a query and I think it is missing an index, but I am not sure which columns to index or in what order.

Here is the query and the relevant table schemas:

{{clipboard}}

Produce:

1. The column or set of columns that would most improve the query, with reasoning.
2. The index type (B-tree, hash, partial, expression) and why.
3. The exact DDL to create the index.
4. An estimate of the impact: from full scan to index scan, from seconds to milliseconds, etc.
5. A warning about the tradeoff: write performance, storage cost, and how it interacts with existing indexes.
6. A check: is there already an index that could be extended instead of adding a new one?

15. Rewrite a query to run faster

I have a query that is slow and I cannot add an index. I need to rewrite the query itself to be faster.

Here is the query:

{{clipboard}}

Produce:

1. A rewrite that is logically equivalent but faster, with a note on the specific optimization (early filtering, push down predicates, replace correlated subquery with a join, use EXISTS instead of IN, limit before join).
2. A note on the expected performance gain.
3. Any tradeoff in readability: a faster query that is harder to read might not be worth it for a small gain.
4. A verification step: a small query that proves the rewrite returns the same result as the original.
5. An alternative approach using a CTE or temporary table if the query is too complex for a single statement.

Schema exploration

16. Explore an unfamiliar schema

I am working with a database I do not know well. Help me understand its structure so I can write useful queries.

Here are the table and column names I have access to (or the output of information_schema queries):

{{clipboard}}

Produce:

1. A likely entity relationship map: which tables are facts, which are dimensions, and how they connect.
2. The probable primary and foreign keys, inferred from naming conventions.
3. The three most interesting tables for someone trying to understand the business domain.
4. Any table that looks like a log, audit, or staging table that I should be careful with.
5. Three questions I could ask the data using only these tables, ranked from easy to advanced.
6. A SELECT query I could run on the largest table to get a sense of what a row looks like without fetching too much data.

Do not make up tables or columns. Work only with what is in the material.

17. Generate an ERD description from a set of tables

I have a set of tables and I want a text description of their relationships that I could turn into an ERD.

Here are the tables:

{{clipboard}}

Produce:

1. Each table with its primary key and notable columns.
2. The relationships between tables: which table references which, in which direction, with what cardinality (one to one, one to many, many to many).
3. Any junction table used for many to many relationships.
4. Any foreign key that looks suspicious (wrong type, nullable when it should not be, missing).
5. A suggested diagram structure: which tables to group together visually.
6. A one paragraph summary of the domain this schema represents.

Data cleaning with SQL

18. Find and flag bad data in a table

I want to audit a table for bad data: nulls where they should not be, impossible values, format inconsistencies, and outliers.

Here is the schema and a sample:

{{clipboard}}

Produce:

1. A query that counts nulls in each column.
2. A query that finds format inconsistencies in text columns (case, trailing whitespace, unusual characters).
3. A query that finds numeric outliers using standard deviation or percentiles.
4. A query that finds rows with values outside a reasonable range (negative ages, future birthdates, etc.).
5. A summary query that counts each type of bad data so I can see the scale.
6. A recommendation on which issues to fix first.
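
The null count in point 1 and the range check in point 4 share one trick: COUNT(col) skips NULLs, so COUNT(*) minus COUNT(col) is the null count. A sketch with Python's sqlite3 module; the people table is invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (name TEXT, age INTEGER, email TEXT);
    INSERT INTO people VALUES
        ('ana', 34, 'ana@x.com'),
        ('bo',  NULL, NULL),
        (NULL, -5, 'c@x.com');
""")

# Null counts for every column in a single pass over the table.
nulls = conn.execute("""
    SELECT COUNT(*) - COUNT(name)  AS null_names,
           COUNT(*) - COUNT(age)   AS null_ages,
           COUNT(*) - COUNT(email) AS null_emails
    FROM people
""").fetchone()

# Impossible values: negative ages.
bad_ages = conn.execute(
    "SELECT COUNT(*) FROM people WHERE age < 0"
).fetchone()[0]

print(nulls, bad_ages)
```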

19. Deduplicate a table while keeping the best version of each row

I have a table with duplicate rows and I need to keep the best version of each one: usually the latest, or the most complete.

Here is the table and the definition of "duplicate":

{{clipboard}}

Produce:

1. A query that identifies the groups of duplicates.
2. A query that picks one row per group using a ROW_NUMBER window function.
3. The ORDER BY clause that defines what "best" means (most recent, most complete, or another criterion).
4. A safe deletion or materialization strategy (insert the kept rows into a new table, then swap, rather than deleting in place).
5. A sanity check: count the expected number of unique rows before and after.
6. A flag for any edge case where the definition of best is ambiguous.
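
The ROW_NUMBER pattern from point 2, with "best" defined as most recent, in a runnable sketch using Python's sqlite3 module (contacts table invented; needs SQLite 3.25+ for window functions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contacts (email TEXT, updated_at TEXT, phone TEXT);
    INSERT INTO contacts VALUES
        ('a@x.com', '2024-01-01', NULL),
        ('a@x.com', '2024-03-01', '555-1234'),  -- latest: the keeper
        ('b@x.com', '2024-02-01', '555-9999');
""")

# One row per email: rank rows within each duplicate group, keep rank 1.
kept = conn.execute("""
    SELECT email, updated_at, phone FROM (
        SELECT *,
               ROW_NUMBER() OVER (
                   PARTITION BY email ORDER BY updated_at DESC
               ) AS rn
        FROM contacts
    ) WHERE rn = 1
    ORDER BY email
""").fetchall()

for row in kept:
    print(row)
```

Materializing `kept` into a new table and swapping is the safe deletion strategy from point 4.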

20. Normalize a column of text into a consistent format

I have a text column where the same value appears in many different forms (different cases, typos, abbreviations). I want to normalize it.

Here is the column and a sample:

{{clipboard}}

Produce:

1. A query using LOWER, TRIM, and REPLACE to normalize the obvious variations.
2. A mapping table for known variations that cannot be handled by simple string functions.
3. A query that flags any value that does not match any known normalized version.
4. A recommendation on whether to fix this in the data, in a view, or at the application layer.
5. A note on any variation that might actually be a different value, not a spelling variation.

Analyzing business questions

21. Turn a fuzzy business question into a precise SQL query

My stakeholder asked me a fuzzy question ("how is engagement?") and I need to turn it into a specific SQL query that produces a defensible answer.

Here is the question and the data I have:

{{clipboard}}

Walk me through:

1. The specific interpretation of the question I am going to commit to, in one sentence.
2. The metric I am using and why I am using it instead of the alternatives.
3. The time range and the comparison period.
4. The population (all users, active users, new users) and why.
5. The query.
6. A caveat paragraph I would include when I send the answer, so the stakeholder understands the limits of what I measured.

Do not answer the fuzzy question literally. Answer the sharper version of it.

22. Build a funnel query

I need a funnel: count how many users or events made it through each step of a sequence.

Here are the steps, the event table, and the user identifier:

{{clipboard}}

Produce:

1. A query that counts users at each step, with drop off between steps.
2. A handling for the time window: is there a max time between steps? Should the steps be in order?
3. A way to handle users who skip a step but reach a later one.
4. A percentage calculation for each step relative to the top of the funnel.
5. A sanity check: the top of the funnel count should match a simpler query I can verify independently.
6. An alternative query using conditional aggregation if window functions are not available.
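
The conditional aggregation fallback from point 6 in runnable form, using Python's sqlite3 module. This simple version ignores step ordering and time windows; the events table is invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (user_id INTEGER, step TEXT);
    INSERT INTO events VALUES
        (1, 'visit'), (1, 'signup'), (1, 'purchase'),
        (2, 'visit'), (2, 'signup'),
        (3, 'visit');
""")

# One pass, one distinct user count per funnel step. CASE without ELSE
# yields NULL for non-matching rows, and COUNT ignores NULLs.
funnel = conn.execute("""
    SELECT COUNT(DISTINCT CASE WHEN step = 'visit'    THEN user_id END),
           COUNT(DISTINCT CASE WHEN step = 'signup'   THEN user_id END),
           COUNT(DISTINCT CASE WHEN step = 'purchase' THEN user_id END)
    FROM events
""").fetchone()

print(funnel)
```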

23. Cohort analysis by signup week

I want a cohort analysis: group users by the week they signed up, and then show their retention or activity over the following weeks.

Here are the tables and the definition of "retained":

{{clipboard}}

Produce:

1. A query that produces a cohort table with rows as signup weeks and columns as week offsets.
2. The specific definition of retention used in the query.
3. A note on how to handle incomplete cohorts (users who signed up too recently to have data for all weeks).
4. A smaller version of the query for a single cohort as a sanity check.
5. An interpretation guide: what to look for in the cohort table (horizontal is the cohort's behavior over time, vertical is how cohorts differ from each other).
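
A compact version of the cohort query, runnable with Python's sqlite3 module. The tables and the "active in a given week" retention definition are invented for illustration, and the strftime/julianday week math is SQLite specific:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE signups  (user_id INTEGER, signup_day TEXT);
    CREATE TABLE activity (user_id INTEGER, day TEXT);
    INSERT INTO signups  VALUES
        (1, '2024-01-01'), (2, '2024-01-02'), (3, '2024-01-08');
    INSERT INTO activity VALUES
        (1, '2024-01-01'), (1, '2024-01-09'),  -- user 1 returns in week 1
        (2, '2024-01-02'),                     -- user 2 only active at signup
        (3, '2024-01-08');
""")

# Cohort = signup week; offset = whole weeks between signup and activity.
rows = conn.execute("""
    SELECT strftime('%Y-%W', s.signup_day) AS cohort_week,
           CAST((julianday(a.day) - julianday(s.signup_day)) / 7 AS INTEGER)
               AS week_offset,
           COUNT(DISTINCT a.user_id) AS active_users
    FROM signups s JOIN activity a ON a.user_id = s.user_id
    GROUP BY cohort_week, week_offset
    ORDER BY cohort_week, week_offset
""").fetchall()

for r in rows:
    print(r)
```

Pivoting week_offset into columns turns this long format into the classic cohort triangle.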

Writing reports from results

24. Turn a query result into a one paragraph summary for a stakeholder

I have the result of a query and I need to write a short, clear summary for a non technical stakeholder.

Here is the query, the result, and who the stakeholder is:

{{clipboard}}

Produce:

1. A one paragraph summary that leads with the answer, not the method.
2. The specific number or numbers they care about, formatted appropriately.
3. One sentence on the time range and population so they do not misread the result.
4. One sentence on what the number means in context (is it good, bad, up, down, compared to what).
5. One caveat if the result has any important limitation.
6. A suggested next question they might want to ask, based on the result.

Do not include the SQL. Do not explain how the query works. Just the answer.

25. Write a short data brief for a product decision

I pulled data to inform a product decision and I need to write a brief that helps the team decide. Not a dump of numbers. An argument built on numbers.

Here is the data, the decision, and the team context:

{{clipboard}}

Produce:

1. A one sentence recommendation at the top.
2. The three numbers that most support the recommendation, with context for why each one matters.
3. The one number that might contradict the recommendation, with a note on how much weight it should carry.
4. The assumption behind the recommendation that is most likely to be wrong.
5. A "next step" that would reduce uncertainty if the team cannot commit to the decision yet.
6. A one paragraph caveats section with data quality notes.

Do not pad it. Brevity is the whole point.

Store and manage your prompts with Promptzy

Free prompt manager for Mac. Search with Cmd+Shift+P, auto-paste into any AI app.

Download Free for macOS