Standalone article · part of a sequenced guide
Prompting — From Basic to Command-Grade
The complete prompting curriculum — the foundations, the advanced techniques, and the professional-grade patterns that separate power users from everyone else
Chapter context
Your team uses Claude daily but output quality varies wildly: star performers keep private prompts in Notes; everyone else gets generic first drafts and blames the tool.
This chapter replaces hero prompting with prompt engineering discipline. The goal is not longer prompts; it is repeatable specifications that survive handoffs and model updates.
Is this chapter for you?
Do you use Claude for recurring task types (reports, specs, emails, research)?
Yes — prioritise Concept 4 templates and library; ROI is immediate.
Have you shipped something from Claude without verifying facts?
Yes — read Concept 3 traps before your next customer-facing output.
Are you building API features that depend on structured Claude output?
Yes — Concepts 2.4, 2.6, and 4.3–4.4 are mandatory.
Does more than one person on your team need the same output quality?
Yes — Concept 4.8 sharing and prompt audits are your operating system.
Chapters 1–3 gave you the mental model, the economics, and the interface. Chapter 4 is where output quality becomes intentional: the full prompting curriculum from first principles to team-scale practice.
Most people stop at clever one-liners. Power users build templates, chains, libraries, and audits. This chapter is that ladder.
Chapter insight
A prompt is a specification — instructions, context, examples, and format. Master those four, then layer advanced techniques and command-grade systems so quality scales beyond any individual chat.
Reference diagrams
Prompt quality stack
Foundations first; advanced techniques on top; traps avoided at every layer; command-grade systems make it team-durable.
Command-grade iteration loop
Draft → critique → revise → template — the loop that turns one good session into a reusable asset.
Implementation paths
Four concepts — learn, advance, avoid traps, systematise.
Concept 1
Prompting Foundations
What actually makes a prompt work — the principles behind every technique
1.1
The four components of every prompt
Instructions, context, examples, and output format — the anatomy that explains why prompts succeed or fail
Key takeaway
Every effective prompt combines four elements: what to do (instructions), what to know (context), what good looks like (examples), and what shape to return (format). Missing any one produces predictable failure modes.
Why this matters
Most 'bad prompts' are incomplete prompts. Diagnosing which component is missing turns random trial-and-error into systematic improvement.
Instructions tell Claude what to accomplish. Context grounds the answer. Examples calibrate quality. Output format constrains structure so you can use the result without reformatting.
Weak prompt: 'Summarise this.' Strong prompt: instructions ('Summarise for a CFO'), context (attached earnings call), example (prior summary you liked), format ('5 bullets, max 20 words each, lead with revenue impact').
Workflow — do this next
- 01 Before sending, label your draft I / C / E / F: which components are present?
- 02 If the output is the wrong shape, fix format first; if the substance is wrong, add context or examples.
- 03 Save prompts where all four are explicit as templates.
Ready-to-use artifacts
Complete templates — paste directly into your AI tool or automation workflow.
Prompt anatomy checklist
INSTRUCTIONS: What should Claude do? (verbs, scope, exclusions)
CONTEXT: What must it know? (docs, audience, date, constraints)
EXAMPLES: What does good look like? (1–3 pairs if task is nuanced)
FORMAT: How should output appear? (structure, length, file type)
Score each 0–2 before sending. Below 6 total → revise first.
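The checklist scoring above can be sketched in a few lines of Python. This is a minimal illustration, not part of any tool: the function name `score_prompt` and the dictionary keys are assumptions chosen for the example, and the 0–2 scale and the revise-below-6 threshold come straight from the checklist.

```python
# Hypothetical helper: names are illustrative, the rule comes from the checklist.
COMPONENTS = ("instructions", "context", "examples", "format")
REVISE_BELOW = 6  # checklist rule: a total below 6 (out of 8) means revise first


def score_prompt(scores: dict[str, int]) -> dict:
    """Total the four component scores and name the weakest component."""
    for name in COMPONENTS:
        if name not in scores:
            raise ValueError(f"missing score for {name}")
        if scores[name] not in (0, 1, 2):
            raise ValueError(f"{name} must be scored 0, 1 or 2")
    total = sum(scores[name] for name in COMPONENTS)
    # On a tie, min() returns the first component in COMPONENTS order.
    weakest = min(COMPONENTS, key=lambda name: scores[name])
    return {"total": total, "revise": total < REVISE_BELOW, "weakest": weakest}


if __name__ == "__main__":
    print(score_prompt({"instructions": 2, "context": 2, "examples": 0, "format": 1}))
```

The weakest component is the one to fix first, which matches the diagnosis habit described in 1.1.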
1.2
Specificity as the primary lever
Why vague prompts produce vague outputs and exactly how to be more specific
Key takeaway
Specificity is the highest-ROI prompt edit — replace abstract goals with concrete constraints: audience, length, criteria, and what to exclude.
Why this matters
Claude fills gaps with generic training-data defaults. Vagueness outsources decisions to the model; specificity keeps them with you.
Vague: 'Improve this email.' Specific: 'Rewrite for a busy VP who ignored our last two emails. Under 120 words. One clear CTA: book a 15-min call. Tone: direct, not salesy. Keep subject line under 50 characters.'
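Add specificity along four axes: who (the reader), what (the deliverable and its scope), how (length, tone, format), and why (the outcome the output should produce).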
Workflow — do this next
- 01 Highlight vague words (good, better, professional, brief) and replace each with a number or example.
- 02 Add one sentence: 'Optimize for X, not Y.'
- 03 If output is still generic, add a negative example: 'Do not write like this: …'
Real example
Founder — investor update specificity
Generic prompt produced a 600-word essay. Revised prompt: 'Monthly investor email. 4 sections: metrics (table), wins (3 bullets), asks (numbered), risks (honest, 2 max). Total under 400 words. Assume readers skim on mobile.' Second output required one edit, not a rewrite.
1.3
Positive vs negative instructions
Telling Claude what to do vs what not to do — when each is more effective
Key takeaway
Lead with positive instructions ('do X'); use negatives sparingly for genuine failure modes. Long lists of 'don't' dilute attention and often get ignored.
Why this matters
Models attend strongly to the first actionable pattern. Ten prohibitions compete with one clear task.
Positive: 'Use active voice. Lead with the recommendation.' Negative: 'Don't be verbose. Don't use jargon. Don't start with Certainly.' Positives define the target; negatives patch holes after the target is clear.
When negatives help: safety boundaries, known failure modes ('do not invent citations'), compliance ('exclude PII'). Cap at 3–5 high-signal negatives per prompt.
Workflow — do this next
- 01 Rewrite your top 3 'don't' lines as 'do' lines.
- 02 Keep negatives only for irreversible errors (legal, factual, security).
- 03 Test: remove all negatives; if quality drops, re-add only the ones that mattered.
1.4
Role and persona prompting
When 'act as a...' genuinely helps and when it is cargo cult — the honest assessment
Key takeaway
Roles help when they imply domain constraints, vocabulary, and evaluation criteria — not when they're generic theatre ('act as an expert').
Why this matters
Cargo-cult roles add tokens without information. Good roles encode how an expert would think about the task.
Useful role: 'You are a senior PM at a B2B SaaS company reviewing a PRD for engineering feasibility — flag scope creep, missing acceptance criteria, and dependencies.' Useless role: 'You are a world-class genius.'
Roles work best paired with stakes and audience. Skip the role if instructions and context already specify expertise level.
Workflow — do this next
- 01 If using a role, list 3 behaviours the role must exhibit.
- 02 A/B test with and without the role on the same task; keep it only if output improves.
- 03 Prefer a Project-level persona over repeating the role in every message.
Real example
Legal review role — useful vs theatre
Theatre: 'Act as a lawyer.' Useful: 'You are in-house counsel reviewing a vendor DPA for a 200-person SaaS. Flag clauses that conflict with GDPR Art. 28, unlimited liability, and missing subprocessors list. Output: table of clause, risk, suggested redline.'
1.5
Output format specification
JSON, markdown, tables, prose — how telling Claude the format changes the output
Key takeaway
Format instructions are cheap and powerful — they turn free-form prose into parseable, comparable, shippable output.
Why this matters
Integration, team review, and iteration all require predictable structure. Format is how you make Claude's output a data product.
Specify: container (JSON / markdown / table), field names, max lengths, required sections, and whether to omit preamble ('no intro paragraph'). For API use, request JSON only with a schema snippet.
Claude often complies better when you show a skeleton than when you only name the format in abstract terms.
Workflow — do this next
- 01 Paste an empty template of the desired output.
- 02 For JSON: include keys, types, and 'return valid JSON only'.
- 03 Validate output with a parser or checklist, then feed errors back in iteration 2.
Ready-to-use artifacts
Format specification block
OUTPUT FORMAT:
- Type: [markdown table / JSON / bullet list]
- Sections: [ordered list]
- Max length: [words or rows]
- Rules: [no preamble | cite sources inline | use H2 only]
- Template: [paste skeleton here]
1.6
Tone and register control
How to get Claude to match the voice your work requires — professional, casual, technical, executive
Key takeaway
Tone is specified through audience, sample phrases, and anti-patterns — not adjectives alone ('professional' is ambiguous).
Why this matters
Brand voice drift causes rework. Explicit register beats hoping Claude guesses your company's style.
Effective tone controls: target reader ('board members, not engineers'), sentence length band, vocabulary level, 2–3 phrases to emulate, 2–3 phrases to ban.
For recurring tone, put voice rules in Project instructions and reference a sample paragraph: 'Match the register of this example: …'
Workflow — do this next
- 01 Attach one approved sample of the target voice.
- 02 List 3 banned patterns (hedging, emoji, hype words).
- 03 Ask Claude to self-check tone against criteria before finalising.
1.7
Length control
Getting Claude to be concise when you need it and thorough when you need that — the specific instructions that work
Key takeaway
Length must be numeric or structural — word counts, bullet caps, section limits — not 'be concise' alone.
Why this matters
Claude's default is thoroughness. Without caps, you pay in tokens, reading time, and buried lead sentences.
Concise: 'Max 150 words. One paragraph. No intro.' Thorough: 'Minimum 800 words. Cover 5 subsections with evidence per section.' Mixed: 'Executive summary: 3 bullets. Deep dive: up to 500 words per section.'
Pair length with information density rules to avoid padding when you ask for length.
Workflow — do this next
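- 01 Set hard caps for emails, Slack, exec summaries.
- 02 Use 'if over N words, cut lowest-priority section first' for flexible caps.
- 03 In API calls, set max_tokens as a backstop. It truncates the response rather than making it more concise, so keep the in-prompt cap as well.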
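Numeric caps are easy to check mechanically before anything ships. The sketch below is one possible checker, assuming plain-text output with "-" or "*" bullets; the function name `length_report` and its parameters are hypothetical, not part of any library.

```python
from typing import Optional


def length_report(text: str, max_words: int,
                  max_bullet_words: Optional[int] = None) -> list[str]:
    """Return a list of cap violations; an empty list means the text fits."""
    problems = []
    # Whitespace-separated tokens; bullet markers are counted as tokens too.
    word_count = len(text.split())
    if word_count > max_words:
        problems.append(f"{word_count} words exceeds cap of {max_words}")
    if max_bullet_words is not None:
        for line in text.splitlines():
            stripped = line.strip()
            if stripped.startswith(("-", "*")):
                bullet_words = len(stripped[1:].split())
                if bullet_words > max_bullet_words:
                    problems.append(
                        f"bullet over {max_bullet_words} words: {stripped[:40]}"
                    )
    return problems


if __name__ == "__main__":
    draft = "- Revenue up 12% on enterprise renewals this quarter\n- Churn flat"
    print(length_report(draft, max_words=150, max_bullet_words=5))
```

Any violations it returns can be pasted into the next turn as the revision brief.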
1.8
The iteration mindset
Treating prompting as a conversation, not a one-shot transaction — the workflow that produces better results
Key takeaway
First drafts are briefings, not deliverables. The second and third turns — critique, refine, lock format — produce most of the value.
Why this matters
One-shot prompting wastes Claude's ability to incorporate feedback and your ability to steer without rewriting from scratch.
Iteration loop: (1) rough prompt → draft, (2) 'What's weak? What did you assume?' → diagnosis, (3) 'Revise: fix X, keep Y, output as Z' → near-final. Use artifacts to hold the deliverable while chat handles negotiation.
Save iteration patterns: critique-then-revise, diff-style edits.
Workflow — do this next
- 01 Never ship the first response on high-stakes work.
- 02 Turn 2: ask 'list assumptions and gaps'.
- 03 Turn 3: single revision brief with format lock.
- 04 Log what changed between v1 and v3; that is your next template.
Real example
PM — PRD in three turns
Turn 1: outline. Turn 2: 'Score against eng-readiness rubric; list gaps.' Turn 3: 'Fill gaps in sections 2 and 4 only; output full PRD in artifact.' Total time under 25 minutes vs 2 hours of one-shot editing.
Concept 2
Advanced Prompting Techniques
The techniques that separate good Claude outputs from great ones
2.1
Chain-of-thought prompting
"Think step by step" — why making Claude reason out loud improves accuracy and when to use it
Key takeaway
Asking Claude to show reasoning before the final answer improves accuracy on multi-step logic, math, and trade-off analysis — at the cost of tokens and latency.
Why this matters
Hidden reasoning skips checkable steps. Visible chains let you catch errors before they reach the conclusion.
Use CoT for: prioritisation, debugging, financial models, policy interpretation, architecture decisions. Skip for: simple rewrites, format conversion, tasks with a fixed template.
Strong pattern: 'First list assumptions. Then reason step by step. Finally give the recommendation in 3 bullets.' For API, separate reasoning and answer blocks or use structured tags so you can strip reasoning in production if needed.
Workflow — do this next
- 01 Add 'show work before conclusion' to any decision prompt.
- 02 If the conclusion is wrong, fix the earliest flawed step; don't restart.
- 03 For production API, log reasoning internally; show users only the final block.
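If the prompt asks for reasoning and the final answer in separate tagged blocks, the two can be split after the fact. This is a minimal sketch under one assumption: the prompt told Claude to wrap its work in `<reasoning>` and `<answer>` tags. Those tag names and the function `split_reasoning` are choices made for this example, not a fixed convention.

```python
import re


def split_reasoning(response: str) -> tuple[str, str]:
    """Split a tagged response into (reasoning, answer).

    Assumes the prompt requested <reasoning>...</reasoning> followed by
    <answer>...</answer>; both tag names are illustrative.
    """
    reasoning = re.search(r"<reasoning>(.*?)</reasoning>", response, re.DOTALL)
    answer = re.search(r"<answer>(.*?)</answer>", response, re.DOTALL)
    if answer is None:
        raise ValueError("no <answer> block found")
    reasoning_text = reasoning.group(1).strip() if reasoning else ""
    return reasoning_text, answer.group(1).strip()


if __name__ == "__main__":
    raw = (
        "<reasoning>Maintenance cost was missing from the TCO.</reasoning>\n"
        "<answer>Build phase 1, buy phase 2.</answer>"
    )
    reasoning, answer = split_reasoning(raw)
    print("LOG:", reasoning)   # keep internally
    print("SHOW:", answer)     # show to the user
```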
Real example
Eng lead — build vs buy analysis
Without CoT, Claude recommended buy. With step-by-step (TCO, team skill, timeline, risk), reasoning exposed a missing maintenance cost assumption. Revised recommendation: build phase 1, buy phase 2. Decision doc cited the chain.
2.2
Few-shot prompting
Teaching by example — how three well-chosen examples unlock dramatically better outputs
Key takeaway
Two to three diverse, high-quality examples teach format, tone, and edge-case handling better than paragraphs of rules.
Why this matters
Examples are implicit specifications. They show what 'done' looks like when language alone is ambiguous.
Pick examples that differ in difficulty: easy, typical, edge case. Label them Input/Output. Keep examples shorter than the task you want — Claude generalises from pattern, not length.
Few-shot beats zero-shot when: output format is idiosyncratic, brand voice is specific, or task has unwritten rules ('how we write release notes here').
Workflow — do this next
- 01 Harvest 3 real past outputs your team approved.
- 02 Strip sensitive data; keep structure and voice.
- 03 Add: 'Follow the pattern of these examples for the new input below.'
Ready-to-use artifacts
Few-shot prompt template
<example>
<input>[sanitised input 1]</input>
<output>[ideal output 1]</output>
</example>
<example>
<input>[input 2: edge case]</input>
<output>[output 2]</output>
</example>
Now process:
<input>[new task]</input>
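When examples live in a library rather than in someone's head, the template above can be assembled programmatically. The sketch below is one way to do it; `build_few_shot_prompt` is a hypothetical helper, and the tag layout simply mirrors the template.

```python
def build_few_shot_prompt(examples: list[tuple[str, str]], new_input: str) -> str:
    """Assemble a few-shot prompt from (input, output) pairs and a new input."""
    blocks = []
    for example_input, example_output in examples:
        blocks.append(
            "<example>\n"
            f"<input>{example_input}</input>\n"
            f"<output>{example_output}</output>\n"
            "</example>"
        )
    blocks.append("Follow the pattern of these examples for the new input below.")
    blocks.append(f"Now process:\n<input>{new_input}</input>")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    pairs = [
        ("Fixed login timeout", "Fixed: sessions no longer expire mid-checkout."),
        ("Refactor billing module", "Internal: no user-facing change."),
    ]
    print(build_few_shot_prompt(pairs, "Add CSV export to reports"))
```

Keeping the pairs as data makes it easy to swap in an edge-case example when a new failure mode shows up.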
2.3
Meta-prompting
Using Claude to write and improve its own prompts — the recursive technique that scales prompt quality
Key takeaway
Use Claude to draft, critique, and harden prompts against a rubric — then human-approve before production use.
Why this matters
Manual prompt writing plateaus. Meta-prompting turns your quality criteria into a repeatable review loop.
Pattern: (1) Describe task and failure modes. (2) Ask Claude to write a prompt with I/C/E/F. (3) Ask Claude to attack its own prompt — vagueness, missing format, untestable success. (4) You edit and freeze v1.
Guardrail: never auto-deploy meta-generated prompts without testing on 5+ real inputs. Meta-prompting accelerates drafting, not judgment.
Workflow — do this next
- 01 Prompt: 'Write a prompt for [task]. Include format, examples placeholder, and 3 test cases.'
- 02 Follow-up: 'Score this prompt 1–10 on specificity; rewrite weak sections.'
- 03 Run test cases; log failures; iterate prompt version.
2.4
Prompt chaining
Breaking complex tasks into sequential prompts — the architecture for work that exceeds a single response
Key takeaway
Chain when one pass mixes incompatible goals (research + write + critique) or exceeds reliable context. Each step has one job and a verifiable output.
Why this matters
Monolithic prompts produce mediocre everything. Chains let you inspect, fix, and cache intermediate results.
Classic chain: extract → transform → format → review. API implementations pass step N output as step N+1 input. Claude.ai users can run chains manually or via Projects with saved step prompts.
Design chains with checkpoints: step 2 should not run until step 1's output passes schema validation.
Workflow — do this next
- 01 Decompose the task into single-verb steps.
- 02 Define an output schema per step.
- 03 Build a chain diagram; implement the thinnest step first.
- 04 Add human review only where errors are expensive.
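The checkpoint idea can be sketched as a small runner that passes each step's output to the next and stops at the first failed validation. Everything here is illustrative: `run_chain` is a hypothetical name, and the step functions in the usage example are plain string functions standing in for model calls.

```python
from typing import Callable

# A step is (name, run, is_valid). In a real chain, `run` would wrap a model
# call; here it is any function from text to text.
Step = tuple[str, Callable[[str], str], Callable[[str], bool]]


def run_chain(steps: list[Step], initial_input: str) -> dict[str, str]:
    """Run steps in order; step N+1 only runs if step N's output validates."""
    data = initial_input
    results: dict[str, str] = {}
    for name, run, is_valid in steps:
        output = run(data)
        if not is_valid(output):
            raise ValueError(f"step '{name}' failed validation; later steps not run")
        results[name] = output  # intermediate results stay inspectable
        data = output
    return results


if __name__ == "__main__":
    demo_steps = [
        ("extract", lambda text: text.strip(), lambda out: len(out) > 0),
        ("transform", lambda text: text.upper(), lambda out: out.isupper()),
    ]
    print(run_chain(demo_steps, "  revenue grew 12%  "))
```

Because every intermediate output is kept by name, a failed run shows exactly which step to fix.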
Real example
Content team — research → brief → draft → edit
Four prompts in one Project. Step 1: source-backed fact sheet. Step 2: angle + outline (human picks angle). Step 3: draft in artifact. Step 4: editor pass against style guide. Quality up; fewer full rewrites.
2.5
Self-consistency
Running the same prompt multiple times and synthesising the best answer — when it is worth the cost
Key takeaway
Multiple samples + synthesis reduces variance on high-stakes judgments — but multiplies token cost. Reserve for decisions, not drafts.
Why this matters
Single samples lie confidently. Consistency across runs surfaces stable conclusions vs noise.
Process: run the prompt 3–5 times (with a non-zero temperature if you are using the API), extract the claims that repeat, flag contradictions, then ask Claude to synthesise a final answer that cites which runs agreed.
Worth it when: classification with fuzzy boundaries, risk assessment, creative strategy options. Not worth it for: deterministic formatting, code with tests, tasks with ground-truth verification.
Workflow — do this next
- 01 Run 3 parallel completions.
- 02 Diff key claims: agreement = signal.
- 03 Synthesis prompt: 'Merge consistent points; list disagreements for human.'
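For classification-style tasks, the agreement check can be as simple as a majority vote over the labels returned by each run. The sketch below assumes each run has already been reduced to a single label; the function name `vote` and the 0.6 agreement threshold are assumptions for illustration, not recommended values.

```python
from collections import Counter


def vote(labels: list[str], min_agreement: float = 0.6) -> dict:
    """Pick the most common label and flag low agreement for human review."""
    if not labels:
        raise ValueError("need at least one label")
    counts = Counter(label.strip().lower() for label in labels)
    winner, count = counts.most_common(1)[0]
    agreement = count / len(labels)
    return {
        "label": winner,
        "agreement": agreement,
        "needs_human": agreement < min_agreement,  # threshold is an assumption
    }


if __name__ == "__main__":
    print(vote(["High", "high", "medium"]))
    print(vote(["high", "medium", "low"]))
```

Free-text answers need the synthesis prompt instead; voting only works when runs can be compared as discrete labels.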
2.6
Structured output prompting
Reliably extracting JSON, tables, and typed data from Claude — the technique for integration with other tools
Key takeaway
Structured output requires schema in the prompt, validation after the response, and a repair loop when parsing fails.
Why this matters
Downstream tools need parseable data. Prose is a dead end for automation.
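The Anthropic API supports tool use, which lets you request output that follows a JSON schema; check the current API documentation for the structured-output options available on your model. In Claude.ai, specify the schema in the prompt and use artifacts for tables. Always validate with a JSON parser (for example JSON.parse) or a schema validator.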
Repair pattern: 'Your output failed validation: [error]. Return corrected JSON only.' Keep original prompt; add error message. Two repair attempts max, then escalate to human.
Workflow — do this next
- 01 Embed the JSON schema or table headers in the prompt.
- 02 Parse the response; on failure, auto-retry with the error snippet.
- 03 Log failure types; recurring schema issues mean a prompt fix, not more retries.
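The validate-then-repair loop described above might look like the following sketch. It makes no claim about any SDK: `call_model` is a hypothetical callable you supply (prompt text in, response text out), and the validation here is deliberately minimal, checking only that the reply parses as a JSON object with the required keys.

```python
import json
from typing import Callable


def get_valid_json(prompt: str, call_model: Callable[[str], str],
                   required_keys: list[str], max_repairs: int = 2) -> dict:
    """Ask for JSON, validate it, and retry with the error message on failure."""
    attempt_prompt = prompt
    for _ in range(max_repairs + 1):  # one initial attempt plus the repairs
        raw = call_model(attempt_prompt)
        try:
            data = json.loads(raw)  # JSONDecodeError is a ValueError subclass
            if not isinstance(data, dict):
                raise ValueError("top-level value is not an object")
            missing = [key for key in required_keys if key not in data]
            if missing:
                raise ValueError(f"missing keys: {missing}")
            return data
        except ValueError as error:
            # Keep the original prompt and append the validation error.
            attempt_prompt = (
                f"{prompt}\n\nYour output failed validation: {error}. "
                "Return corrected JSON only."
            )
    raise RuntimeError("validation still failing after repairs; escalate to a human")


if __name__ == "__main__":
    canned = iter(["Sure! Here you go", '{"clause": "7.2", "risk": "high"}'])
    print(get_valid_json("Extract clause and risk as JSON.",
                         lambda _prompt: next(canned), ["clause", "risk"]))
```

A full schema validator could replace the key check; the loop structure stays the same.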
2.7
Constitutional prompting
Building constraints and values directly into the prompt — the safety and quality layer you control
Key takeaway
Write explicit principles Claude must follow — accuracy rules, refusal boundaries, quality checks — as a constitution prepended to tasks.
Why this matters
Product safety and brand standards can't rely on base model behaviour alone. Your constitution is enforceable policy in text.
Constitution blocks include: cite sources or say unknown; never fabricate metrics; refuse requests outside role; check output against prohibited phrases; prioritise user safety over helpfulness when they conflict.
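End prompts with: 'Before responding, verify compliance with the constitution above.' This asks for an explicit self-check pass before the answer. The idea echoes the critique-and-revise approach used in Anthropic's Constitutional AI training, but a prompt-level constitution is an instruction, not a guarantee, so keep testing it.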
Workflow — do this next
- 01 Draft 5–10 non-negotiable rules for your use case.
- 02 Prepend to the system prompt or Project instructions.
- 03 Red-team with adversarial inputs monthly.
Ready-to-use artifacts
Mini-constitution template
CONSTITUTION:
1. If fact is not in provided context, say "not in sources."
2. Never invent names, numbers, or citations.
3. Flag uncertainty explicitly.
4. Refuse [specific prohibited requests].
5. Output must pass [format + length] rules.
Task follows below.
2.8
Conditional prompting
Prompts that instruct Claude to behave differently based on what it finds — the branching logic in language
Key takeaway
Use if/then instructions in natural language to branch behaviour — classify first, then apply the right sub-prompt without spinning up separate apps.
Why this matters
One smart prompt can replace brittle multi-tool flows when branches are few and inspectable.
Pattern: 'Read the input. If [condition A], follow procedure A. If [condition B], follow procedure B. If unclear, ask one clarifying question.' Works for triage, support routing, document classification.
Limit branches to 3–4. Beyond that, use a chain with an explicit classification step (structured label) then route in code.
Workflow — do this next
- 01 Write the decision tree on paper first.
- 02 Encode it as numbered branches in one prompt.
- 03 Test one input per branch + one ambiguous input.
Real example
Support triage — conditional prompt
Single prompt: if billing → refund policy path; if bug → collect repro steps; if feature request → template + product area tag. Haiku classifies; human reviews edge cases. Replaced three separate macros.
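Once branches outgrow a single prompt, the classify-then-route pattern moves the branching into code. The sketch below assumes a prior classification step has already returned a short label; the label names follow the triage example, while `ROUTES`, `CLARIFY`, and `route` are hypothetical names for illustration.

```python
# Each label maps to the instruction block sent in the follow-up prompt.
ROUTES = {
    "billing": "Apply the refund policy path.",
    "bug": "Collect reproduction steps.",
    "feature_request": "Fill the feature request template and tag the product area.",
}
# Fallback when the label is unknown or the classifier was unsure.
CLARIFY = "Ask one clarifying question."


def route(label: str) -> str:
    """Map a classifier label to its procedure, tolerating case and spaces."""
    key = label.strip().lower().replace(" ", "_")
    return ROUTES.get(key, CLARIFY)


if __name__ == "__main__":
    for label in ("Billing", "feature request", "not sure"):
        print(label, "->", route(label))
```

Routing in code keeps each branch testable on its own, which is harder when all branches share one prompt.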
Concept 3
The Prompt Trap Catalogue
Every common prompting mistake — what it costs you and exactly how to fix it
3.1
The vagueness trap
"Write something good about X" — the prompt pattern that guarantees mediocre output
Key takeaway
Vague prompts outsource every decision to the model — you get average-of-the-internet output, not your output.
Why this matters
This is the most common trap and the cheapest to fix with specificity (Concept 1.2).
Trap: 'Write something good about our product.' Fix: audience, angle, length, proof points, CTA, and what to avoid. Good is not a specification.
Workflow — do this next
- 01 Ban the words good, better, nice, professional without definitions.
- 02 Add: who reads this, what they do after reading, max length.
Real example
Marketing — landing page trap
'Write good copy' produced generic SaaS fluff. Fixed prompt named ICP, three differentiators with proof, objection to address, and hero under 12 words. Conversion on A/B test improved — not because of magic AI, because of specifications.
3.2
The novel trap
Treating Claude like a search engine for factual questions: why it hallucinates when you ask for facts it has no source for
Key takeaway
Claude composes plausible text — it does not verify facts. Factual questions need sources, search, or attached documents.
Why this matters
Chapter 1's mental model: reasoning engine over evidence, not oracle. This trap causes the most expensive business errors.
Trap: 'What was our competitor's Q3 revenue?' without data. Fix: attach 10-K, enable search with source rules, or ask Claude to say 'I don't have this' if not in context.
Workflow — do this next
- 01 Tag prompts FACT or OPINION before sending.
- 02 FACT prompts require an attached source or search + citation review.
- 03 Never paste unverified numbers into decks or contracts.
3.3
The confirmation trap
Asking Claude if something is correct — why it agrees and how to get honest evaluation instead
Key takeaway
Claude is biased toward agreement. 'Is this right?' gets yes. 'Find flaws under criteria X' gets useful critique.
Why this matters
RLHF rewards helpfulness; confirmation feels helpful even when wrong.
Trap: 'My plan is solid, right?' Fix: 'Act as sceptical reviewer. List 5 ways this plan fails. Cite assumptions.' Or: 'Steel-man the opposite position.'
For factual verification, compare against source text: 'Quote the passage that supports each claim' — unsupported claims surface quickly.
Workflow — do this next
- 01 Replace 'is this good?' with rubric-based scoring.
- 02 Ask for disconfirming evidence explicitly.
- 03 Use a fresh chat without your ego in the thread for red-team passes.
3.4
The length trap
Not specifying length — why Claude defaults to verbose and how to control it
Key takeaway
Default Claude is thorough. Without caps, you get essays when you needed a Slack message.
Why this matters
Verbose output hides weak thinking and wastes reader time — see Concept 1.7.
Trap: 'Summarise the report.' Fix: 'Summarise in 5 bullets, max 15 words each, lead with decision impact.'
Workflow — do this next
- 01 Set numeric caps on every outward-facing deliverable.
- 02 Ask for an executive summary + optional appendix if depth is needed.
3.5
The format trap
Getting prose when you needed a table, or a list when you needed a paragraph — the one sentence that fixes it
Key takeaway
One explicit format line prevents most reformatting work: 'Return a markdown table with columns A, B, C, no intro.'
Why this matters
Format mismatch is pure friction — easy to prevent, tedious to fix.
Trap: assuming Claude will guess your team's standard format. Fix: paste empty template or schema in every recurring task.
Workflow — do this next
- 01 Add OUTPUT FORMAT as a mandatory section in templates.
- 02 If the format is wrong, don't re-ask the whole task; say 'convert previous answer to [format] only'.
3.6
The context dump trap
Giving Claude too much irrelevant context and watching the quality degrade — the signal-to-noise discipline
Key takeaway
More context is not better context. Irrelevant documents dilute attention and increase cost — curate what the model sees.
Why this matters
Context window is finite attention, not unlimited storage (Chapter 1, Chapter 2).
Trap: uploading entire data room for one clause review. Fix: extract relevant pages, summarise background in 200 words, attach only primary sources.
Use Projects for stable company context; use per-message attachments for task-specific evidence only.
Workflow — do this next
- 01 Before attaching, ask: 'What 3 facts does Claude need?'
- 02 Remove docs that don't change the answer.
- 03 Summarise long files in a separate step, then chain.
3.7
The one-shot trap
Giving up after the first response — why the second and third iteration almost always produce better output
Key takeaway
First outputs are drafts. Teams that one-shot either accept mediocrity or blame the tool instead of iterating.
Why this matters
Iteration is the workflow (Concept 1.8) — not a sign of failure.
Trap: 'Claude can't do this' after one try. Fix: critique pass, targeted revision, format lock — usually three turns.
Workflow — do this next
- 01 Budget 3 turns minimum for important work.
- 02 Turn 2 always asks for gaps and assumptions.
- 03 Save winning iteration sequences as templates.
3.8
The compliance trap
Mistaking Claude's agreeableness for accuracy — the verification habits that prevent expensive mistakes
Key takeaway
Polite, confident wrong answers are the danger — not refusals. Verify facts, numbers, names, and legal claims before ship.
Why this matters
The trap combines confirmation bias (3.3) with novel trap (3.2) — lethal in regulated or customer-facing work.
Habits: citation check, spot-verify 3 random facts, second-model or human review on high stakes, never auto-send customer emails without human read.
Build verification into the prompt — don't rely on post-hoc discipline alone.
Workflow — do this next
- 01 Define tier 1 (auto-ok), tier 2 (spot check), tier 3 (mandatory human) tasks.
- 02 Tier 3: no send without a named approver.
- 03 Log incidents where the compliance trap fired, then update prompts.
Real example
Sales — wrong stat in proposal
An AE pasted a Claude-generated market size into a proposal without checking. The stat was an outdated composite from training data. Fix: the proposal prompt now requires a citation or a 'VERIFY' flag; the source PDF is attached in the CRM; a manager reviews tier-3 deals.
Concept 4
Command-Grade Prompting
The professional system — how power users structure their prompting practice for consistent, scalable output quality
4.1
Prompt templates
The reusable prompt structures that deliver consistent output for recurring task types
Key takeaway
Templates freeze the parts that shouldn't change (format, rubric, constitution) and leave variables for the task — consistency without copy-paste drift.
Why this matters
Ad hoc prompts don't scale across teams or weeks. Templates are the unit of reusable prompt IP.
Template structure: metadata (name, owner, version), variable slots [BRACKETED], fixed instructions, format skeleton, test cases. Store in Project docs, Notion, or git.
Good template candidates: weekly reports, PRD sections, code review checklists, customer reply drafts, research briefs — anything you do more than twice monthly.
Workflow — do this next
- 01 Identify your top 5 recurring Claude tasks.
- 02 Extract the last best prompt; parameterise names, dates, inputs.
- 03 Add 2 test inputs in the template footer.
- 04 Review monthly; retire templates that fail tests.
Ready-to-use artifacts
Prompt template skeleton
# Template: [NAME] v[0.1]
Owner: [team]
Use when: [trigger]
## Variables
- [INPUT_DOC]
- [AUDIENCE]
- [DEADLINE]
## Instructions
[fixed instructions]
## Format
[fixed skeleton]
## Constitution
[fixed rules]
---
INPUT: [INPUT_DOC]
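Bracketed variable slots can be filled in code so that no template ships with a slot left empty. The sketch below assumes slots are written in upper case with underscores, as in the skeleton; `fill_template` and the `SLOT` pattern are hypothetical names, and lower-case bracketed notes such as [fixed rules] are deliberately left alone.

```python
import re

# Matches upper-case slots such as [INPUT_DOC]; lower-case bracketed notes are
# treated as authoring hints and left untouched.
SLOT = re.compile(r"\[([A-Z_]+)\]")


def fill_template(template: str, values: dict[str, str]) -> str:
    """Replace every [SLOT] with its value; fail loudly if any slot is unfilled."""
    missing = sorted({name for name in SLOT.findall(template) if name not in values})
    if missing:
        raise KeyError(f"unfilled variables: {missing}")
    return SLOT.sub(lambda match: values[match.group(1)], template)


if __name__ == "__main__":
    template = "Summarise [INPUT_DOC] for [AUDIENCE] by [DEADLINE]."
    print(fill_template(template, {
        "INPUT_DOC": "the Q3 earnings call",
        "AUDIENCE": "the CFO",
        "DEADLINE": "Friday",
    }))
```

Failing on a missing slot is a design choice: a prompt sent with a literal [AUDIENCE] in it tends to produce exactly the generic output templates are meant to prevent.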
4.2
The RISEN framework
Role, Instructions, Steps, End goal, Narrowing — the professional prompting structure that works across task types
Key takeaway
RISEN is a checklist for complete prompts: Role (who Claude is), Instructions (task), Steps (procedure), End goal (success definition), Narrowing (constraints and exclusions).
Why this matters
Frameworks prevent omitted components better than ad hoc memory — especially under time pressure.
Role: 'You are a technical editor.' Instructions: 'Tighten this spec for engineering.' Steps: '1) Flag ambiguity 2) Propose edits 3) Output redline.' End goal: 'Eng can estimate without clarification calls.' Narrowing: 'Max 400 words; do not change scope; mark assumptions.'
Workflow — do this next
- 01 Draft the prompt; label the R-I-S-E-N sections.
- 02 Fill any blank section, or delete Role if it is redundant.
- 03 Compare output to End goal; if it misses, fix Steps, not adjectives.
Ready-to-use artifacts
RISEN prompt block
ROLE: [optional expertise frame]
INSTRUCTIONS: [single clear task]
STEPS:
1. [step]
2. [step]
3. [step]
END GOAL: [what done enables; measurable if possible]
NARROWING: [length, tone, exclusions, must-not-change]
4.3
System prompt design
Writing the instruction layer that shapes every response in a project — the architectural skill
Key takeaway
System prompts (API) and Project instructions (Claude.ai) are persistent constitution + capability brief — design them once, benefit on every turn.
Why this matters
Repeating rules in every user message wastes tokens and drifts. System layer is the right place for stable policy.
System prompt layers: identity & scope, output defaults, tool usage rules, safety constitution, escalation ('ask if ambiguous'). Keep under ~800–1500 tokens unless cached — see Chapter 2.
Test system prompts in isolation with minimal user messages. If a behaviour only works with long user prompts, that policy belongs in the system prompt.
Workflow — do this next
- 01 List behaviours you repeat in every chat and move them to the system prompt.
- 02 Version the system prompt in git; tag it with the model ID.
- 03 Regression-test 10 canonical user inputs after each change.
Real example
API product — system prompt as product spec
Support bot system prompt: tone, escalation triggers, forbidden promises, citation rules, JSON handoff to ticketing. User message is only customer text. Policy changes ship like code — PR review, staging eval, prod.
4.5
Prompt libraries
Building and organising your personal collection of tested prompts — the asset that compounds over time
Key takeaway
A prompt library is indexed templates with version, owner, use case, and test status — not a folder of one-off chats.
Why this matters
Individual skill doesn't scale. Libraries turn personal hacks into organisational capability.
Organise by: function (PM, eng, sales), task type, model tier, maturity (draft / tested / deprecated). Search by keyword and outcome, not filename.
Each entry: prompt text, variables, example output, known failures, last tested date.
Workflow — do this next
- 01 Start with 10 templates from your last month of Claude work.
- 02 Add a metadata row per template in Notion or the repo README.
- 03 Monthly: promote one draft to tested; deprecate one failure.
4.6
Prompt version control
Tracking what changed, why, and what effect it had — the discipline that turns prompting into engineering
Key takeaway
Prompts are code-adjacent assets — version them, diff them, tie changes to eval results.
Why this matters
Silent prompt edits cause mysterious quality regressions. Version control makes prompting auditable.
Store prompts in git (prompts/ folder) or Notion with version headers. Changelog line: what changed, hypothesis, eval result (pass/fail on N cases).
Semantic versioning for prompts: patch = wording; minor = new constraint; major = behaviour change requiring re-eval.
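The versioning rule and the changelog line can be sketched together. The change labels ("wording", "constraint", "behaviour") and the pipe-separated changelog layout are assumptions chosen for illustration; only the patch/minor/major mapping comes from the rule above.

```python
# Mapping from the kind of prompt change to the version part it bumps.
LEVEL_FOR_CHANGE = {"wording": "patch", "constraint": "minor", "behaviour": "major"}


def bump(version: str, change: str) -> str:
    """Return the next version string for a given kind of change."""
    major, minor, patch = (int(part) for part in version.split("."))
    level = LEVEL_FOR_CHANGE[change]
    if level == "major":
        return f"{major + 1}.0.0"
    if level == "minor":
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"


def changelog_line(version, what, hypothesis, passed, total):
    """One line per change: what changed, the hypothesis, and the eval result."""
    status = "pass" if passed == total else "fail"
    return f"v{version} | {what} | hypothesis: {hypothesis} | eval: {status} ({passed}/{total})"


new_version = bump("1.4.2", "constraint")
print(changelog_line(new_version, "added length cap", "fewer rambling replies", 9, 10))
```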
Workflow — do this next
- 01 Never edit a production prompt in place without a version bump.
- 02 Run the eval suite before merge.
- 03 Keep a rollback copy one command away.
4.7
The prompt audit
Testing your existing prompts systematically against a quality rubric — the practice that surfaces hidden failures
Key takeaway
A prompt audit scores templates against a rubric and a fixed test set, finding vagueness, missing format, and untested edge cases before users hit them.
Why this matters
Prompts decay as products, models, and tasks change. Audits are preventive maintenance.
Rubric dimensions: specificity, format clarity, verifiability, safety, token efficiency, examples present. Score 1–5 per template.
Test set: 5 inputs per template — happy path, edge, adversarial, empty input, oversized input. Record pass/fail and failure mode.
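One way to keep the five-input test set honest is to record each result with its category and check that no category is missing. The category identifiers and the record layout below are illustrative assumptions; the five categories themselves come from the test set described above.

```python
from dataclasses import dataclass
from typing import List

REQUIRED_CATEGORIES = (
    "happy_path", "edge", "adversarial", "empty_input", "oversized_input",
)


@dataclass
class CaseResult:
    category: str
    passed: bool
    failure_mode: str = ""  # short note on how it failed, empty when it passed


def missing_categories(results: List[CaseResult]) -> List[str]:
    """Categories from the required five that the test set does not cover yet."""
    seen = {result.category for result in results}
    return [category for category in REQUIRED_CATEGORIES if category not in seen]


def summarise(results: List[CaseResult]) -> dict:
    """Pass count plus the failure mode recorded for each failed case."""
    failed = [(r.category, r.failure_mode) for r in results if not r.passed]
    return {"total": len(results), "passed": len(results) - len(failed), "failures": failed}


# Hypothetical audit results for one template.
results = [
    CaseResult("happy_path", True),
    CaseResult("edge", True),
    CaseResult("adversarial", False, "followed injected instruction"),
    CaseResult("empty_input", True),
]
print(missing_categories(results), summarise(results))
```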
Workflow — do this next
- 01 Audit the top 20 templates quarterly.
- 02 Fix anything scoring below 3 on any dimension.
- 03 Re-run the test set; document the result in the changelog.
Ready-to-use artifacts
Complete templates — paste directly into your AI tool or automation workflow.
Prompt audit rubric (abbreviated)
1. Specificity: audience, length, success defined? (1-5)
2. Format: parseable output specified? (1-5)
3. Evidence: facts require sources? (1-5)
4. Safety: prohibited actions listed? (1-5)
5. Efficiency: no redundant context? (1-5)
6. Tested: 5 cases pass in last 90 days? (Y/N)
<3 on any dimension → rewrite before next use
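The rubric's decision rule is simple enough to automate once scores are recorded. In this sketch the dimension keys mirror the five scored rows, and the "retest" verdict for an untested template is an assumption, since the rubric itself only specifies the rewrite rule.

```python
DIMENSIONS = ("specificity", "format", "evidence", "safety", "efficiency")


def audit_verdict(scores: dict, tested: bool):
    """Return (verdict, weak_dimensions) for one template.

    Any dimension below 3 means "rewrite". The "retest" verdict for an
    untested template is an assumption added for illustration.
    """
    for name in DIMENSIONS:
        if not 1 <= scores[name] <= 5:
            raise ValueError(f"{name} must be scored 1-5")
    weak = [name for name in DIMENSIONS if scores[name] < 3]
    if weak:
        return "rewrite", weak
    if not tested:
        return "retest", []
    return "ok", []


# Hypothetical scores for one template.
scores = {"specificity": 4, "format": 2, "evidence": 5, "safety": 3, "efficiency": 4}
print(audit_verdict(scores, tested=True))
```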
Quick prompt quality rubric
Score any prompt in 2 minutes before production use.
□ Instructions: single clear task?
□ Context: only what's needed?
□ Examples: 1–3 if task is nuanced?
□ Format: skeleton or schema?
□ Length/tone: numeric or sample?
□ Verification: cite or flag unknown?
□ Tested on 3 inputs in last 30 days?
7/7 → promote to library | <5 → revise first
Team prompt onboarding (30 min)
Run once per new hire or quarterly refresh.
1. Tour the prompt library: top 5 templates for their role (10 min)
2. Walk one RISEN example live (5 min)
3. Trap catalogue: vagueness + compliance (5 min)
4. Exercise: improve a bad prompt using the rubric (10 min)
5. Homework: submit one tested template fork
Prompt anatomy checklist
INSTRUCTIONS: What should Claude do? (verbs, scope, exclusions)
CONTEXT: What must it know? (docs, audience, date, constraints)
EXAMPLES: What does good look like? (1–3 pairs if task is nuanced)
FORMAT: How should output appear? (structure, length, file type)
Score each 0–2 before sending. Below 6 total → revise first.
Format specification block
OUTPUT FORMAT:
- Type: [markdown table / JSON / bullet list]
- Sections: [ordered list]
- Max length: [words or rows]
- Rules: [no preamble | cite sources inline | use H2 only]
- Template: [paste skeleton here]
Few-shot prompt template
<example>
<input>[sanitised input 1]</input>
<output>[ideal output 1]</output>
</example>
<example>
<input>[input 2, edge case]</input>
<output>[output 2]</output>
</example>
Now process:
<input>[new task]</input>
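When the same few-shot template is reused across many inputs, building it in code keeps the tag structure consistent. A minimal sketch: `build_few_shot` is a hypothetical helper, and escaping `<`, `>`, and `&` in the inserted text is a design assumption that stops an input containing a stray closing tag from breaking the structure. Drop the escaping if your inputs need to pass through verbatim.

```python
from xml.sax.saxutils import escape


def build_few_shot(examples, new_input):
    """Render (input, output) pairs in the tagged layout, then the new task."""
    blocks = []
    for example_input, example_output in examples:
        blocks.append(
            "<example>\n"
            f"<input>{escape(example_input)}</input>\n"
            f"<output>{escape(example_output)}</output>\n"
            "</example>"
        )
    blocks.append(f"Now process:\n<input>{escape(new_input)}</input>")
    return "\n".join(blocks)


# Hypothetical examples: one typical input and one edge case.
prompt = build_few_shot(
    [
        ("Order arrived late", "Category: delivery"),
        ("Price < advertised & no receipt", "Category: billing"),
    ],
    "App crashes on login",
)
print(prompt)
```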
Mini-constitution template
CONSTITUTION:
1. If fact is not in provided context, say "not in sources."
2. Never invent names, numbers, or citations.
3. Flag uncertainty explicitly.
4. Refuse [specific prohibited requests].
5. Output must pass [format + length] rules.
Task follows below.
Prompt template skeleton
# Template: [NAME] v[0.1]
Owner: [team]
Use when: [trigger]
## Variables
- [INPUT_DOC]
- [AUDIENCE]
- [DEADLINE]
## Instructions
[fixed instructions]
## Format
[fixed skeleton]
## Constitution
[fixed rules]
---
INPUT: [INPUT_DOC]
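Bracketed variables such as [INPUT_DOC] can be filled programmatically, with a hard failure when a value is missing so a half-filled template never reaches the model. This sketch assumes variables are written in upper case inside square brackets, as in the skeleton; `fill_template` is a hypothetical helper, and it should be applied to the prompt body, not to the Variables list that documents them.

```python
import re

# Matches upper-case placeholders such as [INPUT_DOC]; lower-case notes like [team] are left alone.
VARIABLE_PATTERN = re.compile(r"\[([A-Z_]+)\]")


def fill_template(template: str, values: dict) -> str:
    """Replace every [VARIABLE] with its value, raising KeyError if one is missing."""
    def replace(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for [{name}]")
        return values[name]

    return VARIABLE_PATTERN.sub(replace, template)


# Hypothetical template body and values.
body = "Write for [AUDIENCE]. Due: [DEADLINE].\n---\nINPUT: [INPUT_DOC]"
filled = fill_template(
    body,
    {"AUDIENCE": "finance leads", "DEADLINE": "Friday", "INPUT_DOC": "Q3 notes"},
)
print(filled)
```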
RISEN prompt block
ROLE: [optional expertise frame]
INSTRUCTIONS: [single clear task]
STEPS:
1. [step]
2. [step]
3. [step]
END GOAL: [what done enables; measurable if possible]
NARROWING: [length, tone, exclusions, must-not-change]
Tagged prompt example
<instructions>
Summarise the contract for a non-lawyer PM. Flag termination, liability, and data processing.
</instructions>
<context>
[paste contract excerpt]
</context>
<format>
Markdown table: Clause | Plain English | Risk (H/M/L)
</format>
B2B marketing org — from prompt chaos to library governance
A 25-person marketing team had 200+ ad hoc Claude chats bookmarked. Brand voice drifted; legal flagged an unverified stat; new hires asked 'which prompt?' in Slack daily.
Before
No templates, no tests, no owners. Best prompts lived in one senior manager's inbox. Iteration was optional; one-shot was the norm.
After
Chapter 4 rollout: 12 audited templates (RISEN structure), prompt guild monthly, library in Team Project, trap poster in onboarding. API-bound workflows got XML-tagged system prompts in git with eval suite.
- Draft-to-approved time → down 38% on recurring assets
- Brand voice revision rounds → down from 2.1 avg to 0.8
- Compliance flags on published content → 3 in Q1 to 0 in Q2
- New hire time-to-first-quality-output → 2 weeks to 3 days
What goes wrong
Collecting prompts without testing — library becomes a junk drawer.
Prompt audit rubric (4.7); nothing enters library without 5 passing test cases.
Over-engineering prompts before understanding the task.
Start Concept 1; add advanced techniques only when foundation fails.
Copying 'mega prompts' from social media without context.
Trap catalogue 3.1 + 3.6; adapt with your I/C/E/F and examples.
Treating system prompts as set-and-forget across model upgrades.
Version control + re-eval on model change (4.6).

Vetted by Krishna Kumar, Curator, FactorBeam