The 25 Best AI Prompts for Debugging Code
Most debugging sessions go wrong in one of two ways. Either you start changing code before you understand what the bug actually is, or you understand the bug but patch the symptom with a guard clause instead of fixing whatever made the guard necessary in the first place. AI is genuinely useful for the slow, careful middle of that work: reading the symptoms, proposing testable hypotheses, and refusing to guess when the evidence is thin. It will not magically find your bug, but it will force you to reason about it properly, which is usually the whole game.
Below are 25 prompts I run when I am stuck on something. Stack traces, logs, reproductions, diffs from the commit where the regression started, all of it. Each one expects you to paste the relevant evidence into {{clipboard}}. Once you find the ones that fit the kinds of bugs you hit most, keep them somewhere a keystroke away so you do not have to think about how to start asking.
Reproducing the bug
1. Turn a vague bug report into a reproduction plan
I have a bug report that is vague, secondhand, or missing context. I need to turn it into a concrete reproduction plan before I start looking at code.
Here is the report:
{{clipboard}}
Produce:
1. A one sentence statement of the suspected bug in precise terms.
2. A list of assumptions the reporter is making that I should verify.
3. The minimum information I still need before I can reproduce it: environment, inputs, user state, feature flags, timing, browser or OS version.
4. A suggested sequence of steps that would most likely reproduce it, based on what is in the report.
5. Any detail in the report that sounds plausible but is probably wrong (user attribution errors, coincidental correlation, misremembered sequence).
6. Three questions I should ask the reporter before I burn time reproducing.
Do not write any fixes. Do not speculate about code. Stay in the reproduction phase.
2. Minimize a reproduction case
I have a reproduction of a bug but it involves a lot of moving parts. I want to minimize it to the smallest possible case that still reproduces, so I can reason about it cleanly.
Here is the current reproduction:
{{clipboard}}
Walk me through the minimization:
1. List every component, dependency, and piece of state involved in the current repro.
2. For each one, ask "can this be removed without losing the bug?" and give your best guess based on the symptom.
3. Propose a minimization order: what to try removing first, second, third. Prioritize the things most likely to be irrelevant.
4. For each removal step, tell me what behavior change would confirm that the component was NOT the cause.
5. The end state I am aiming for: the minimum set of moving parts that still reproduces.
Be systematic. Do not try to solve the bug at the same time. The goal here is just to isolate.
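If the repro has many separable parts, the loop in steps 2 through 4 can be automated. Here is a rough delta-debugging-style sketch in Python, assuming you can express the repro as a list of parts and a still_fails predicate (both hypothetical names):

```python
def minimize(parts, still_fails):
    """Greedily drop any part whose removal still reproduces the bug,
    until no single removal keeps the bug alive."""
    parts = list(parts)
    changed = True
    while changed:
        changed = False
        for i in range(len(parts)):
            candidate = parts[:i] + parts[i + 1:]
            if still_fails(candidate):  # bug survives without parts[i], so drop it
                parts = candidate
                changed = True
                break
    return parts  # locally minimal: removing any one remaining part loses the bug

# Usage sketch: parts might be feature flags, fixtures, or input records.
# minimal = minimize(all_parts, lambda p: run_repro(p) == "crash")
```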
3. Reproduce an intermittent bug
I have a bug that happens sometimes but not always. I need a plan for reliably reproducing it so I can actually debug it.
Here is the description and any logs or patterns I have noticed:
{{clipboard}}
Produce:
1. A list of the likely sources of nondeterminism: race conditions, cache state, network timing, random seeds, clock, concurrent users, input order, memory pressure, retries.
2. For each one, a targeted way to force that source to be consistent so I can test whether it is the cause.
3. A list of environmental factors that might be making it more likely: load, specific times of day, recent deploys, feature flags, A/B buckets.
4. A plan for collecting evidence the next time it happens in production: what to log, what to capture, what to dump.
5. A technique I can use to artificially amplify the bug's probability (for example, add sleeps, inject latency, increase concurrency, shrink caches) so I can repro it in dev.
Do not tell me to "just add retries." The goal is to find the cause, not paper over it.
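For point 5, one cheap amplifier is to run the suspect operation in a tight concurrent loop and count failures. A minimal sketch, where suspect_operation is a placeholder for whatever is flaking:

```python
import concurrent.futures

def hammer(suspect_operation, iterations=1000, workers=32):
    """Run a flaky operation many times under heavy concurrency to
    amplify timing-dependent failures into something reproducible."""
    failures = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(suspect_operation) for _ in range(iterations)]
        for future in concurrent.futures.as_completed(futures):
            try:
                future.result()
            except Exception as exc:  # collect every failure, do not stop at the first
                failures.append(exc)
    print(f"{len(failures)} failures out of {iterations} runs")
    return failures
```

If the failure rate climbs as you raise workers, you are almost certainly looking at a concurrency bug rather than bad input.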
Reading errors and stack traces
4. Explain a stack trace from the top of the stack to the root cause
I have a stack trace and I need help understanding what actually caused the error, not just what threw it.
Here is the stack trace (and the code at the top frame if I have it):
{{clipboard}}
Walk me through:
1. The error type and message in plain English.
2. The frame where the error was thrown, and what it was trying to do when it threw.
3. The frame where the error was caused, which is often different from where it was thrown. If you cannot tell, say so.
4. Any frame in the stack that is suspicious: user code that receives unvalidated input, a library call with bad arguments, async boundaries that may have lost context.
5. The most likely root cause, stated as a hypothesis, with a confidence level (low, medium, high).
6. Three things I could check next to confirm or reject the hypothesis.
Do not propose fixes until I confirm the cause.
5. Decode a cryptic or truncated error message
I am getting an error message that is almost useless: truncated, machine-generated, or a numeric code without context. I need help understanding what it actually means.
Here is the message and any context about where it came from:
{{clipboard}}
Produce:
1. The most likely meaning of the message, in plain English.
2. The component or library that probably emitted it, based on its shape.
3. Common causes of this specific message or code, ranked by frequency.
4. What additional information I would need to narrow it down from "probably X" to "definitely X."
5. A suggested log line or debugger breakpoint that would capture that information the next time the error occurs.
If the message is genuinely too ambiguous to guess, say so and tell me what I need to add instrumentation for.
6. Compare two error messages to see if they are the same bug
I have two error messages from different occurrences and I need to know if they are the same underlying bug or two separate ones that happen to look similar.
Here are both errors with their context:
{{clipboard}}
Tell me:
1. The type of each error and whether they are structurally the same.
2. The call paths and whether they converge on a common frame.
3. The inputs, times, and conditions, and whether any of them look correlated.
4. A confidence assessment: same bug (same root cause), related bugs (same code path, different cause), or separate bugs that happen to look alike.
5. The single piece of evidence that would most strongly confirm or reject your assessment.
Do not conflate them unless you are confident.
Narrowing down the source
7. Bisect a regression across a range of commits
A feature used to work and now it does not. I have the current broken state and an older working state. I want a plan for finding the commit that introduced the regression.
Here is what I know:
{{clipboard}}
Produce:
1. A bisection plan: which commit to test first, and why that one, versus the naive midpoint.
2. The set of test cases I should run at each bisection step to decide "broken" or "working." Be specific.
3. Any commit in the range that I should skip (merges that would not matter, formatting commits, documentation changes, reverts).
4. A prediction about which area of the code is most likely to contain the regression based on the symptoms, so I can prioritize manual review if bisection is expensive.
5. A fallback if bisection does not converge (for example, if the bug is intermittent or depends on external state).
Keep the plan tight. I do not want to run git bisect on 300 commits if I can narrow by reading first.
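If the plan still ends in bisection, git can drive it for you. A sketch of a git bisect run checker in Python; the build and test commands are placeholders for your own:

```python
#!/usr/bin/env python3
"""Checker for `git bisect run`: exit 0 for a good commit, 1 for a bad
one, and 125 to tell bisect to skip a commit it cannot test."""
import subprocess
import sys

# Build first; if this commit does not even compile, skip it.
build = subprocess.run(["make", "build"], capture_output=True)
if build.returncode != 0:
    sys.exit(125)

# Run only the test that distinguishes "working" from "broken".
test = subprocess.run(["pytest", "tests/test_regression.py", "-x"])
sys.exit(0 if test.returncode == 0 else 1)
```

Kick it off with git bisect start <bad> <good> followed by git bisect run python3 check.py, and git walks the range for you.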
8. Narrow a bug to a specific module or function
I have a bug somewhere in a larger system and I need to narrow it to a specific module or function before I start reading line by line.
Here is the symptom and any evidence I have gathered so far:
{{clipboard}}
Walk me through:
1. Which layer of the system is most likely affected (UI, client logic, network, backend, database, cache, external service)?
2. Which modules within that layer are likely candidates based on what the bug does?
3. For each candidate, one observation or log line that would confirm or reject it.
4. The order to investigate them, from cheapest check to most expensive.
5. The single code path you would trace first, and why.
Do not look at code yet. Reason from symptoms.
9. Use git blame to find when a line of code changed
I have a line of code that I think is wrong and I want to understand when and why it was written. I am going to paste the line, the file it lives in, and the git blame output.
Here is the material:
{{clipboard}}
Produce:
1. Who last touched the line, when, and in what commit.
2. The commit message and whether it suggests this change was intentional, a refactor, or a side effect.
3. The surrounding lines in the same commit, to see if this line was part of a larger change.
4. A question I should ask the author or the PR if I can still reach them.
5. A guess at whether this line was the cause of the bug I am chasing, or a symptom of something upstream.
Do not assume the last author is guilty. They may have been moving code, not writing it.
Hypothesis and testing
10. Generate hypotheses for a bug and rank them
I need a list of possible causes for this bug so I can test them systematically instead of chasing one guess at a time.
Here is the symptom and context:
{{clipboard}}
Produce:
1. Five to ten hypotheses about what could cause the observed behavior.
2. For each hypothesis, a confidence score (1 to 5) based on how well it fits the symptoms.
3. For each hypothesis, a specific test I could run to confirm or reject it, ranked by how cheap the test is.
4. The one hypothesis I should test first, considering both confidence and cost.
5. Any hypothesis that is so expensive to test that I should gather more evidence before going after it.
Do not rank them by how interesting they would be to fix. Rank by probability.
11. Design a test that isolates a single variable
I have a hypothesis about a bug and I want to design a test that changes only one variable at a time to confirm or reject it.
Here is my hypothesis and the current state:
{{clipboard}}
Produce:
1. The single variable the test should isolate.
2. The baseline (what should happen if the hypothesis is wrong).
3. The prediction (what should happen if the hypothesis is right).
4. The setup: what fixtures, stubs, or environment I need.
5. Any confounding variables I need to hold constant, and how to hold them constant.
6. The expected duration and cost of running the test.
Do not design a test that would fail to distinguish the hypothesis from its alternatives. If the test is ambiguous, propose a better one.
12. Evaluate the evidence I have for a hypothesis
I have been investigating a bug and I am starting to get attached to a hypothesis. I want an honest evaluation of whether the evidence actually supports it.
Here is the hypothesis and the evidence I have gathered:
{{clipboard}}
Play devil's advocate:
1. Does the evidence logically imply the hypothesis, or only suggest it?
2. What other hypotheses would the same evidence also support?
3. Is there any evidence that contradicts the hypothesis that I might be downplaying?
4. What evidence would I expect to see if the hypothesis is true but have not looked for yet?
5. A confidence level from 1 to 10 based strictly on the evidence.
6. The single next experiment that would most strongly distinguish my hypothesis from its nearest alternative.
Do not be kind. If I am fooling myself, say so.
Debugging specific bug types
13. Debug a race condition
I think I have a race condition but I cannot prove it. Help me reason about the interleaving and find the bad ordering.
Here is the code and a description of the symptom:
{{clipboard}}
Walk me through:
1. Every shared piece of state that is accessed from multiple threads, tasks, or processes.
2. Every critical section that is not protected by a lock, transaction, or atomic operation.
3. A specific interleaving of operations that would produce the observed symptom. Walk it step by step.
4. Whether the bug is a true race, a memory ordering issue, or a missing happens-before relationship.
5. The minimum synchronization that would fix it, without introducing a worse bug like a deadlock.
6. A test strategy: can I reproduce it deterministically with sleeps, thread interleaving frameworks, or stress runs?
Do not suggest "just add a lock" without identifying exactly which operations need to be mutually exclusive.
14. Debug a memory leak
I have a memory leak and I need help finding what is being held on to that should have been released.
Here is the evidence I have: heap snapshots, profiling data, or just the symptom:
{{clipboard}}
Produce:
1. The object type or shape that is accumulating, based on the evidence.
2. Common root causes for this kind of accumulation in the language or runtime I am using: event listeners, closures, timers, caches, global state, circular references.
3. A list of candidate locations in the code where this kind of leak typically hides.
4. A diagnostic plan: what to measure, what to snapshot, and what to compare between runs.
5. A reproduction strategy that can leak predictably, so I can verify any fix.
6. The difference between a leak (grows without bound) and a working set issue (large but stable).
Do not recommend bumping heap size as a fix.
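If you are in Python, the standard library's tracemalloc is often enough evidence for step 4. A minimal sketch, with a deliberately leaky stand-in workload:

```python
import tracemalloc

cache = []  # stand-in for whatever is accumulating in your app

def suspect_workload():
    # Hypothetical leak: entries are appended and never evicted.
    cache.extend(bytes(1024) for _ in range(10_000))

tracemalloc.start(25)  # keep up to 25 stack frames per allocation site

baseline = tracemalloc.take_snapshot()
suspect_workload()
current = tracemalloc.take_snapshot()

# Rank allocation sites by how much they grew between the snapshots.
for stat in current.compare_to(baseline, "lineno")[:10]:
    print(stat)
```

Take the two snapshots around the operation you suspect, and the top of the diff is usually the leak.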
15. Debug a performance regression
Something got slower and I need to find out what. Help me narrow down where the time is going.
Here is the before and after, plus any profiling data I have:
{{clipboard}}
Produce:
1. A rough diagnosis: is the regression in CPU, memory, IO, network, lock contention, or allocator pressure?
2. The specific functions or code paths that likely account for most of the added time, based on the data.
3. For each candidate, a test I could run to confirm or reject it (turn it off, skip it, instrument it, bisect the commits).
4. Any pattern in the data that suggests the cause is not where the time shows up (for example, a cold cache caused by an upstream change).
5. A list of recent changes or deploys that are plausible suspects.
6. The smallest fix I could try to confirm the diagnosis, independent of the full fix.
Focus on finding the cause first. Do not propose optimizations for code that is not the bottleneck.
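Before pasting profiling data, it helps to actually have some. In Python, cProfile gives a ranked view of where the time goes; slow_path below is a stand-in for the operation that regressed:

```python
import cProfile
import pstats

def slow_path():
    # Hypothetical stand-in for the code path that got slower.
    return sum(i * i for i in range(2_000_000))

profiler = cProfile.Profile()
profiler.enable()
slow_path()
profiler.disable()

# Sort by cumulative time so callers of the hot code surface too.
pstats.Stats(profiler).sort_stats("cumulative").print_stats(15)
```

Run it against the before and after commits and diff the top entries.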
Runtime and environment
16. Debug an environment-specific bug
A bug reproduces in only one environment: staging works but production fails, or one specific instance misbehaves while the rest are fine. I need to find what is different.
Here is the symptom and what I know about the environments:
{{clipboard}}
Produce:
1. A list of the usual suspects for environment-specific bugs: config, secrets, environment variables, DNS, timezone, locale, file system, package versions, network policy, firewall, TLS, clock skew.
2. For each suspect, a specific way to check whether it is different between the failing and working environments.
3. The order to check them, from most likely to least.
4. Any symptom in the bug that points to a specific class of environment issue (for example, "works on my machine" patterns).
5. A diagnostic I could run in the failing environment that would capture all the relevant state at once.
Do not assume the code is the same across environments. Verify it.
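For item 5, one script run in both environments and diffed beats checking suspects one at a time. A Python sketch; extend the dict with whatever config matters to your system:

```python
import json
import locale
import os
import platform
import socket
import sys
import time

# Dump the state that most often explains environment-specific bugs, as
# JSON so the output from two environments can be diffed directly.
print(json.dumps({
    "hostname": socket.gethostname(),
    "python": sys.version,
    "platform": platform.platform(),
    "timezone": time.tzname,
    "locale": locale.getlocale(),
    "cwd": os.getcwd(),
    # Redact anything that looks like a credential before it hits a diff.
    "env": {k: v for k, v in sorted(os.environ.items())
            if "SECRET" not in k and "KEY" not in k and "TOKEN" not in k},
}, indent=2, default=str))
```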
17. Debug a dependency version mismatch
I think a dependency version mismatch is causing my bug. Help me find which dependency and which version.
Here is the error or symptom and the dependency tree if I have it:
{{clipboard}}
Produce:
1. The likely culprit package, based on what the error is complaining about.
2. The transitive path that brings it into my project, if visible.
3. Recent versions of that package and any breaking changes in their changelogs I should be aware of.
4. A diagnostic command to verify the exact version being used at runtime, not just the one listed in the lockfile.
5. A fix strategy: pin the version, override a transitive dependency, upgrade, or downgrade.
6. Any risk of the fix: will pinning cause other packages to break?
If you cannot identify the culprit from the information I gave you, tell me what to add to narrow it down.
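For step 4 in Python, the authoritative answer is what the interpreter imports, not what the lockfile lists. A quick check, with somepackage as a hypothetical placeholder for the suspect:

```python
from importlib.metadata import version

import somepackage  # hypothetical: replace with the suspect dependency

# What the environment's package metadata claims is installed...
print(version("somepackage"))
# ...versus where the import actually resolves from. A vendored copy, a
# stale virtualenv, or an unexpected site-packages path here is often
# the entire bug.
print(somepackage.__file__)
```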
18. Debug a production-only bug that will not reproduce in dev
A bug shows up in production but I cannot reproduce it locally. I need a plan that works without shipping debug code to production.
Here is the symptom and what I know:
{{clipboard}}
Produce:
1. The most likely reasons the bug does not reproduce in dev: data volume, concurrency, configuration, secrets, network, third party services, feature flags, specific user state.
2. For each reason, a safe way to make my dev environment more prod-like to try to reproduce it.
3. What non-invasive telemetry or logging I could add to production to capture the bug next time it happens, without performance or PII risk.
4. A way to replay production traffic against a staging environment if that is an option.
5. A plan B if the bug remains elusive: wait for it, shadow mode, or canary a targeted fix.
Do not suggest "just add more logging everywhere." Be targeted.
Writing debug logs strategically
19. Decide where to add log statements for the bug I am chasing
I need to add logging to find a bug, but I do not want to scatter print statements everywhere. Help me pick the right places.
Here is the bug and the code path I suspect:
{{clipboard}}
Produce:
1. The three to five most informative places to add a log statement, ranked by expected value.
2. For each, the exact log line I should write (variables, prefix, log level).
3. The format that would be easiest to grep or parse later.
4. Any log that I should add as a safety net, like at the entry and exit of a suspect function, to detect whether the code is even being reached.
5. Any log I should NOT add because it would be too noisy, too expensive, or would expose PII.
Aim for the minimum number of logs that would conclusively tell me whether my hypothesis is right.
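For the shape of those log lines, a stable greppable prefix plus key=value pairs makes filtering trivial later. A sketch in Python; every name here is a placeholder:

```python
import logging

logging.basicConfig(level=logging.DEBUG)
log = logging.getLogger(__name__)

def apply_discount(order_id, user_id, discount):
    # A fixed prefix ("BUGHUNT") keeps every diagnostic one grep away,
    # and key=value pairs stay machine-parseable.
    log.debug("BUGHUNT apply_discount enter order=%s user=%s discount=%s",
              order_id, user_id, discount)
    result = round(discount * 0.9, 2)  # stand-in for the suspect computation
    log.debug("BUGHUNT apply_discount exit order=%s result=%s", order_id, result)
    return result

apply_discount(order_id=42, user_id=7, discount=19.99)
```

When you are done, cleanup is one search for BUGHUNT.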
20. Turn a noisy log into a useful diagnostic
I added logging for a bug and now I am drowning in output. Help me turn the noise into a useful diagnostic by filtering, aggregating, or restructuring what is being logged.
Here is a sample of the current log output:
{{clipboard}}
Produce:
1. Patterns in the output that are pure noise (repeated lines, irrelevant entries, idle chatter).
2. A grep, awk, or jq pipeline that would isolate only the lines relevant to the bug.
3. A way to aggregate or count events so I can see frequency rather than every individual occurrence.
4. A restructured version of the logging that would be more useful if I could rerun the test: which logs to drop, which to add, which to change.
5. A small script or command that would turn the current log into a table or summary.
Assume I have basic unix tools. Do not suggest importing it into a full observability platform unless that is genuinely the right answer.
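For item 3, even a few lines of Python can collapse a flood of near-duplicate lines into a frequency table. A sketch that normalizes digits so per-request IDs do not make every line unique:

```python
import re
import sys
from collections import Counter

counts = Counter()
for line in sys.stdin:
    # "order 12345 failed" and "order 67890 failed" collapse into one bucket.
    counts[re.sub(r"\d+", "N", line.strip())] += 1

for pattern, n in counts.most_common(20):
    print(f"{n:8d}  {pattern}")
```

Run it as python3 summarize.py < app.log and read the top of the table instead of the whole log.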
Root cause and postmortem
21. Conduct a five whys analysis from a symptom
I have fixed a bug, but I want to make sure I understand the actual root cause, not just the surface symptom. Walk me through a five whys analysis.
Here is the bug and the immediate cause I identified:
{{clipboard}}
Walk through:
1. Why 1: what was the immediate technical cause?
2. Why 2: why did that happen? What allowed or caused the immediate cause?
3. Why 3: why did the thing in why 2 happen?
4. Why 4: why did the thing in why 3 exist in that state?
5. Why 5: why did the system allow the situation in why 4 to be possible?
After the five, give me a one sentence root cause statement that a coworker would understand. Then tell me which level of the chain is actually the right place to intervene, because the deepest why is not always the most actionable.
22. Write a blameless postmortem
I need to write a postmortem for an incident. I want it to be useful and blameless, not a witch hunt.
Here is what I know about the incident:
{{clipboard}}
Produce a postmortem with:
1. Summary: one paragraph of what happened and its impact.
2. Timeline: chronological list of events with timestamps, including when the issue started, when it was detected, when it was mitigated, when it was resolved.
3. Root cause: the actual technical cause, stated clearly without assigning blame.
4. Contributing factors: other conditions that made the incident possible or worse.
5. What went well: the parts of the response that worked.
6. What went poorly: the parts that slowed us down, stated without naming individuals.
7. Action items: specific, assignable, with clear owners and due dates as placeholders.
Do not speculate about intent. Do not use the word "human error" as a root cause. If someone made a mistake, the question is why the system allowed the mistake to cause an incident.
23. Identify the difference between the cause and the trigger of a bug
I fixed a bug but I am not sure I understood it properly. The thing I changed made the symptom go away, but I suspect the real cause is somewhere else.
Here is the bug, my fix, and the code that was changed:
{{clipboard}}
Tell me:
1. Whether my fix addresses the cause or just the trigger. Explain the difference in this specific case.
2. If I addressed only the trigger, what the cause is likely to be and where it lives.
3. Under what other triggers the same root cause could surface again.
4. Whether I should leave my fix in place AND address the cause, or revert and address the cause properly.
5. A test that would catch the cause, not just the current trigger.
Be honest. If the fix is good enough, say so. If it is patching a symptom, say that clearly.
Preventing regressions
24. Write a regression test from a fixed bug
I just fixed a bug and I want to lock the fix in with a regression test that will fail if the same bug comes back.
Here is the bug, the fix, and the relevant code:
{{clipboard}}
Produce:
1. The exact input or sequence of events that triggered the bug.
2. The assertion that would have failed before the fix.
3. A test function that sets up the state, triggers the condition, and asserts the correct behavior.
4. A note on which file the test should live in, based on the project's conventions.
5. Any dependency, mock, or fixture the test needs.
6. Whether the test should be unit, integration, or end to end, and why.
Keep the test minimal. It should check this specific behavior, not the whole feature.
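The output usually lands in a shape like this pytest sketch. The module, function, and bug are all hypothetical; the structure is the point:

```python
from app.billing import apply_credit  # hypothetical module and function

def test_apply_credit_clamps_to_zero():
    """Regression test: applying a credit larger than the balance used to
    produce a negative balance instead of clamping to zero."""
    # The exact input that triggered the bug.
    balance = apply_credit(balance=10, credit=25)
    # This assertion failed before the fix (it returned -15).
    assert balance == 0
```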
25. Propose process changes to prevent this class of bug
I just debugged a nasty bug and I want to think about whether any process change would prevent or catch this class of bug earlier next time.
Here is a summary of the bug and how it was caught:
{{clipboard}}
Suggest:
1. Any code review checklist item that would have caught this.
2. Any static analysis rule, linter, or type constraint that would have flagged it.
3. Any test that the original author could reasonably have written to catch it.
4. Any observability signal that would have surfaced it sooner in production.
5. Any architectural change that would make this class of bug impossible rather than just detectable.
6. A verdict on which of the above is worth actually doing, and which is overkill for the frequency of this bug type.
Do not propose process changes that are heavier than the problem. One good lint rule beats a new review policy.
Store and manage your prompts with Promptzy
Free prompt manager for Mac. Search with Cmd+Shift+P, auto-paste into any AI app.
Download Free for macOS