
What you'll unlock: A prompt is a specification — instructions, context, examples, and format. Master those four, then layer advanced techniques and command-grade systems so quality scales beyond any individual chat.


Tool guide · Chapter 4 of 10

Prompting — From Basic to Command-Grade

~90 min read

The complete prompting curriculum — the foundations, the advanced techniques, and the professional-grade patterns that separate power users from everyone else

Chapter context

Your team uses Claude daily but output quality varies wildly: star performers keep private prompts in Notes, while everyone else gets generic first drafts and blames the tool. This chapter replaces hero prompting with prompt engineering discipline. The goal is not longer prompts; it is repeatable specifications that survive handoffs and model updates.

Is this chapter for you?

Do you use Claude for recurring task types (reports, specs, emails, research)?

Yes — prioritise Concept 4 templates and library; ROI is immediate.

Have you shipped something from Claude without verifying facts?

Yes — read Concept 3 traps before your next customer-facing output.

Are you building API features that depend on structured Claude output?

Yes — Concepts 2.4, 2.6, and 4.3–4.4 are mandatory.

Does more than one person on your team need the same output quality?

Yes — Concept 4.8 sharing and prompt audits are your operating system.

Chapters 1–3 gave you the mental model, the economics, and the interface. Chapter 4 is where output quality becomes intentional: the full prompting curriculum from first principles to team-scale practice. Most people stop at clever one-liners. Power users build templates, chains, libraries, and audits. This chapter is that ladder.


Reference diagrams

Prompt quality stack

Foundations first; advanced techniques on top; traps avoided at every layer; command-grade systems make it team-durable.

I/C/E/F · Four components · Foundation

CoT + few-shot · Advanced · Technique

Trap audit · Failure modes · Quality

Templates + RISEN · Command-grade · Scale

Command-grade iteration loop

Draft → critique → revise → template — the loop that turns one good session into a reusable asset.

Draft · First pass · Chat

Critique · Gaps + assumptions · Rubric

Revise · Targeted fix · Artifact

Template · Version + test · Library

Implementation paths

Four concepts — learn, advance, avoid traps, systematise.

Concept 1

Prompting Foundations

What actually makes a prompt work — the principles behind every technique

1.1

The four components of every prompt

Instructions, context, examples, and output format — the anatomy that explains why prompts succeed or fail

Key takeaway

Every effective prompt combines four elements: what to do (instructions), what to know (context), what good looks like (examples), and what shape to return (format). Missing any one produces predictable failure modes.

Why this matters

Most 'bad prompts' are incomplete prompts. Diagnosing which component is missing turns random trial-and-error into systematic improvement.

Instructions tell Claude what to accomplish. Context grounds the answer. Examples calibrate quality. Output format constrains structure so you can use the result without reformatting.

Weak prompt: 'Summarise this.' Strong prompt: instructions ('Summarise for a CFO'), context (attached earnings call), example (prior summary you liked), format ('5 bullets, max 20 words each, lead with revenue impact').

Workflow — do this next

01 Before sending, label your draft I / C / E / F: which components are present?
02 If the output has the wrong shape, fix format first; if it has the wrong substance, add context or examples.
03 Save prompts where all four are explicit as templates.

Ready-to-use artifacts

Complete templates — paste directly into your AI tool or automation workflow.

Prompt anatomy checklist

INSTRUCTIONS — What should Claude do? (verbs, scope, exclusions)
CONTEXT — What must it know? (docs, audience, date, constraints)
EXAMPLES — What does good look like? (1–3 pairs if task is nuanced)
FORMAT — How should output appear? (structure, length, file type)

Score each 0–2 before sending. Below 6 total → revise first.
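The scoring rule above can be sketched in a few lines. This is a minimal illustration, assuming you score each component by hand; the function name `score_prompt` and the returned keys are hypothetical, not part of any tool.

```python
from typing import Dict, List, Union

COMPONENTS = ("instructions", "context", "examples", "format")
REVISE_BELOW = 6  # threshold taken from the checklist above


def score_prompt(scores: Dict[str, int]) -> Dict[str, Union[int, bool, List[str]]]:
    """Total the 0-2 scores for the four components and name the weakest ones."""
    for name in COMPONENTS:
        if scores.get(name) not in (0, 1, 2):
            raise ValueError(f"{name} must be scored 0, 1, or 2")
    total = sum(scores[name] for name in COMPONENTS)
    lowest = min(scores[name] for name in COMPONENTS)
    weakest = [name for name in COMPONENTS if scores[name] == lowest]
    return {"total": total, "revise": total < REVISE_BELOW, "weakest": weakest}


result = score_prompt({"instructions": 2, "context": 1, "examples": 0, "format": 2})
print(result)
```

Here the total is 5, so the draft is flagged for revision, and the missing examples are named as the first thing to fix.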

1.2

Specificity as the primary lever

Why vague prompts produce vague outputs and exactly how to be more specific

Key takeaway

Specificity is the highest-ROI prompt edit — replace abstract goals with concrete constraints: audience, length, criteria, and what to exclude.

Why this matters

Claude fills gaps with generic training-data defaults. Vagueness outsources decisions to the model; specificity keeps them with you.

Vague: 'Improve this email.' Specific: 'Rewrite for a busy VP who ignored our last two emails. Under 120 words. One clear CTA: book a 15-min call. Tone: direct, not salesy. Keep subject line under 50 characters.'

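Add specificity along four axes: who (the audience), what (the deliverable and its scope), how (format, tone, length), and why (the goal the output serves).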

Workflow — do this next

01 Highlight vague words (good, better, professional, brief) and replace each with a number or example.
02 Add one sentence: 'Optimize for X, not Y.'
03 If output is still generic, add a negative example: 'Do not write like this: …'

Real example

Founder — investor update specificity

Generic prompt produced a 600-word essay. Revised prompt: 'Monthly investor email. 4 sections: metrics (table), wins (3 bullets), asks (numbered), risks (honest, 2 max). Total under 400 words. Assume readers skim on mobile.' Second output required one edit, not a rewrite.

1.3

Positive vs negative instructions

Telling Claude what to do vs what not to do — when each is more effective

Key takeaway

Lead with positive instructions ('do X'); use negatives sparingly for genuine failure modes. Long lists of 'don't' dilute attention and often get ignored.

Why this matters

Models attend strongly to the first actionable pattern. Ten prohibitions compete with one clear task.

Positive: 'Use active voice. Lead with the recommendation.' Negative: 'Don't be verbose. Don't use jargon. Don't start with Certainly.' Positives define the target; negatives patch holes after the target is clear.

When negatives help: safety boundaries, known failure modes ('do not invent citations'), compliance ('exclude PII'). Cap at 3–5 high-signal negatives per prompt.

Workflow — do this next

01 Rewrite your top 3 'don't' lines as 'do' lines.
02 Keep negatives only for irreversible errors (legal, factual, security).
03 Test: remove all negatives; if quality drops, re-add only the ones that mattered.

1.4

Role and persona prompting

When 'act as a...' genuinely helps and when it is cargo cult — the honest assessment

Key takeaway

Roles help when they imply domain constraints, vocabulary, and evaluation criteria — not when they're generic theatre ('act as an expert').

Why this matters

Cargo-cult roles add tokens without information. Good roles encode how an expert would think about the task.

Useful role: 'You are a senior PM at a B2B SaaS company reviewing a PRD for engineering feasibility — flag scope creep, missing acceptance criteria, and dependencies.' Useless role: 'You are a world-class genius.'

Roles work best paired with stakes and audience. Skip the role if instructions and context already specify expertise level.

Workflow — do this next

01 If using a role, list 3 behaviours the role must exhibit.
02 A/B test with and without the role on the same task; keep it only if output improves.
03 Prefer a Project-level persona over repeating the role in every message.

Real example

Legal review role — useful vs theatre

Theatre: 'Act as a lawyer.' Useful: 'You are in-house counsel reviewing a vendor DPA for a 200-person SaaS. Flag clauses that conflict with GDPR Art. 28, unlimited liability, and missing subprocessors list. Output: table of clause, risk, suggested redline.'

1.5

Output format specification

JSON, markdown, tables, prose — how telling Claude the format changes the output

Key takeaway

Format instructions are cheap and powerful — they turn free-form prose into parseable, comparable, shippable output.

Why this matters

Integration, team review, and iteration all require predictable structure. Format is how you make Claude's output a data product.

Specify: container (JSON / markdown / table), field names, max lengths, required sections, and whether to omit preamble ('no intro paragraph'). For API use, request JSON only with a schema snippet.

Claude often complies better when you show a skeleton than when you only name the format in abstract terms.

Workflow — do this next

01 Paste an empty template of the desired output.
02 For JSON: include keys, types, and 'return valid JSON only'.
03 Validate output with a parser or checklist; feed errors back in iteration 2.

Ready-to-use artifacts

Complete templates — paste directly into your AI tool or automation workflow.

Format specification block

OUTPUT FORMAT:
- Type: [markdown table / JSON / bullet list]
- Sections: [ordered list]
- Max length: [words or rows]
- Rules: [no preamble | cite sources inline | use H2 only]
- Template:
[paste skeleton here]

1.6

Tone and register control

How to get Claude to match the voice your work requires — professional, casual, technical, executive

Key takeaway

Tone is specified through audience, sample phrases, and anti-patterns — not adjectives alone ('professional' is ambiguous).

Why this matters

Brand voice drift causes rework. Explicit register beats hoping Claude guesses your company's style.

Effective tone controls: target reader ('board members, not engineers'), sentence length band, vocabulary level, 2–3 phrases to emulate, 2–3 phrases to ban.

For recurring tone, put voice rules in Project instructions and reference a sample paragraph: 'Match the register of this example: …'

Workflow — do this next

01 Attach one approved sample of the target voice.
02 List 3 banned patterns (hedging, emoji, hype words).
03 Ask Claude to self-check tone against criteria before finalising.

1.7

Length control

Getting Claude to be concise when you need it and thorough when you need that — the specific instructions that work

Key takeaway

Length must be numeric or structural — word counts, bullet caps, section limits — not 'be concise' alone.

Why this matters

Claude's default is thoroughness. Without caps, you pay in tokens, reading time, and buried lead sentences.

Concise: 'Max 150 words. One paragraph. No intro.' Thorough: 'Minimum 800 words. Cover 5 subsections with evidence per section.' Mixed: 'Executive summary: 3 bullets. Deep dive: up to 500 words per section.'

Pair length with information density rules to avoid padding when you ask for length.

Workflow — do this next

01 Set hard caps for emails, Slack, exec summaries.
02 Use 'if over N words, cut lowest-priority section first' for flexible caps.
03 In API calls, set max_tokens as a backstop; it truncates the response rather than making it concise, so keep the caps in the prompt too.

1.8

The iteration mindset

Treating prompting as a conversation, not a one-shot transaction — the workflow that produces better results

Key takeaway

First drafts are briefings, not deliverables. The second and third turns — critique, refine, lock format — produce most of the value.

Why this matters

One-shot prompting wastes Claude's ability to incorporate feedback and your ability to steer without rewriting from scratch.

Iteration loop: (1) rough prompt → draft, (2) 'What's weak? What did you assume?' → diagnosis, (3) 'Revise: fix X, keep Y, output as Z' → near-final. Use artifacts to hold the deliverable while chat handles negotiation.

Save iteration patterns: critique-then-revise, diff-style edits.

Workflow — do this next

01 Never ship the first response on high-stakes work.
02 Turn 2: ask 'list assumptions and gaps'.
03 Turn 3: single revision brief with format lock.
04 Log what changed between v1 and v3; that is your next template.

Real example

PM — PRD in three turns

Turn 1: outline. Turn 2: 'Score against eng-readiness rubric; list gaps.' Turn 3: 'Fill gaps in sections 2 and 4 only; output full PRD in artifact.' Total time under 25 minutes vs 2 hours of one-shot editing.

Concept 2

Advanced Prompting Techniques

The techniques that separate good Claude outputs from great ones

2.1

Chain-of-thought prompting

"Think step by step" — why making Claude reason out loud improves accuracy and when to use it

Key takeaway

Asking Claude to show reasoning before the final answer improves accuracy on multi-step logic, math, and trade-off analysis — at the cost of tokens and latency.

Why this matters

Hidden reasoning skips checkable steps. Visible chains let you catch errors before they reach the conclusion.

Use CoT for: prioritisation, debugging, financial models, policy interpretation, architecture decisions. Skip for: simple rewrites, format conversion, tasks with a fixed template.

Strong pattern: 'First list assumptions. Then reason step by step. Finally give the recommendation in 3 bullets.' For API, separate reasoning and answer blocks or use structured tags so you can strip reasoning in production if needed.

Workflow — do this next

01 Add 'show work before conclusion' to any decision prompt.
02 If the conclusion is wrong, fix the earliest flawed step; don't restart.
03 For production API use, log reasoning internally; show users only the final block.
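One way to separate reasoning from the final answer is to ask for each in its own tag and split them after the response arrives. The sketch below assumes the prompt requested `<reasoning>` and `<answer>` tags; both tag names and the function `split_reasoning` are illustrative choices, not a fixed convention.

```python
import re
from typing import Tuple


def split_reasoning(response: str) -> Tuple[str, str]:
    """Return (reasoning, answer) from a response that uses the two tags."""
    reasoning = re.search(r"<reasoning>(.*?)</reasoning>", response, re.DOTALL)
    answer = re.search(r"<answer>(.*?)</answer>", response, re.DOTALL)
    if answer is None:
        raise ValueError("response has no <answer> block")
    reasoning_text = reasoning.group(1).strip() if reasoning else ""
    return reasoning_text, answer.group(1).strip()


sample = (
    "<reasoning>Assumption: team of 4. TCO favours buying in year 1.</reasoning>\n"
    "<answer>Buy now; revisit build in 12 months.</answer>"
)
reasoning, answer = split_reasoning(sample)
print(answer)  # show this to users; write `reasoning` to internal logs
```

Raising on a missing answer block, rather than returning the raw text, keeps unreviewed reasoning from leaking to users when the model ignores the format.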

Real example

Eng lead — build vs buy analysis

Without CoT, Claude recommended buy. With step-by-step (TCO, team skill, timeline, risk), reasoning exposed a missing maintenance cost assumption. Revised recommendation: build phase 1, buy phase 2. Decision doc cited the chain.

2.2

Few-shot prompting

Teaching by example — how three well-chosen examples unlock dramatically better outputs

Key takeaway

Two to three diverse, high-quality examples teach format, tone, and edge-case handling better than paragraphs of rules.

Why this matters

Examples are implicit specifications. They show what 'done' looks like when language alone is ambiguous.

Pick examples that differ in difficulty: easy, typical, edge case. Label them Input/Output. Keep examples shorter than the task you want — Claude generalises from pattern, not length.

Few-shot beats zero-shot when: output format is idiosyncratic, brand voice is specific, or task has unwritten rules ('how we write release notes here').

Workflow — do this next

01 Harvest 3 real past outputs your team approved.
02 Strip sensitive data; keep structure and voice.
03 Add: 'Follow the pattern of these examples for the new input below.'

Ready-to-use artifacts

Complete templates — paste directly into your AI tool or automation workflow.

Few-shot prompt template

<example>
<input>[sanitised input 1]</input>
<output>[ideal output 1]</output>
</example>
<example>
<input>[input 2 — edge case]</input>
<output>[output 2]</output>
</example>

Now process:
<input>[new task]</input>
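If the approved examples live in a list, the template above can be assembled programmatically. This is a minimal sketch; the function name `build_few_shot_prompt` and the release-note examples are hypothetical.

```python
from typing import List, Tuple


def build_few_shot_prompt(examples: List[Tuple[str, str]], new_input: str) -> str:
    """Render (input, output) pairs in the tagged layout, then append the new task."""
    if not examples:
        raise ValueError("few-shot prompting needs at least one example")
    blocks = [
        f"<example>\n<input>{ex_in}</input>\n<output>{ex_out}</output>\n</example>"
        for ex_in, ex_out in examples
    ]
    return "\n".join(blocks) + f"\n\nNow process:\n<input>{new_input}</input>"


prompt = build_few_shot_prompt(
    [
        ("Fixed typo on pricing page", "Fixed: pricing page typo."),
        (
            "Rewrote auth flow; breaks v1 tokens",
            "Breaking: auth flow rewritten; v1 tokens no longer work.",
        ),
    ],
    "Added dark mode toggle",
)
print(prompt)
```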

2.3

Meta-prompting

Using Claude to write and improve its own prompts — the recursive technique that scales prompt quality

Key takeaway

Use Claude to draft, critique, and harden prompts against a rubric — then human-approve before production use.

Why this matters

Manual prompt writing plateaus. Meta-prompting turns your quality criteria into a repeatable review loop.

Pattern: (1) Describe task and failure modes. (2) Ask Claude to write a prompt with I/C/E/F. (3) Ask Claude to attack its own prompt — vagueness, missing format, untestable success. (4) You edit and freeze v1.

Guardrail: never auto-deploy meta-generated prompts without testing on 5+ real inputs. Meta-prompting accelerates drafting, not judgment.

Workflow — do this next

01 Prompt: 'Write a prompt for [task]. Include format, examples placeholder, and 3 test cases.'
02 Follow-up: 'Score this prompt 1–10 on specificity; rewrite weak sections.'
03 Run test cases; log failures; iterate on the prompt version.

2.4

Prompt chaining

Breaking complex tasks into sequential prompts — the architecture for work that exceeds a single response

Key takeaway

Chain when one pass mixes incompatible goals (research + write + critique) or exceeds reliable context. Each step has one job and a verifiable output.

Why this matters

Monolithic prompts produce mediocre everything. Chains let you inspect, fix, and cache intermediate results.

Classic chain: extract → transform → format → review. API implementations pass step N output as step N+1 input. Claude.ai users can run chains manually or via Projects with saved step prompts.

Design chains with checkpoints. Step 2 should not run until step 1's output passes schema validation.

Workflow — do this next

01 Decompose the task into single-verb steps.
02 Define an output schema per step.
03 Build a chain diagram; implement the thinnest step first.
04 Add human review only where errors are expensive.
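A two-step chain with a checkpoint between the steps might look like the sketch below. The model calls are replaced with stub functions (`fake_extract`, `fake_draft`), and the fact-sheet schema with a `facts` list is an assumption made for illustration; a real chain would call the API in each step.

```python
import json
from typing import Callable, List

Step = Callable[[str], str]  # a step takes text in and returns text out


def validate_fact_sheet(raw: str) -> List[str]:
    """Checkpoint for step 1: output must be JSON with a non-empty 'facts' list."""
    data = json.loads(raw)  # raises ValueError on malformed JSON
    if (
        not isinstance(data, dict)
        or not isinstance(data.get("facts"), list)
        or not data["facts"]
    ):
        raise ValueError("fact sheet needs a non-empty 'facts' list")
    return [str(fact) for fact in data["facts"]]


def run_chain(source: str, extract: Step, draft: Step) -> str:
    facts = validate_fact_sheet(extract(source))  # step 2 never runs if this raises
    return draft("\n".join(facts))


# Stub steps standing in for real model calls (assumption for the sketch).
def fake_extract(text: str) -> str:
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    return json.dumps({"facts": sentences})


def fake_draft(facts: str) -> str:
    return "DRAFT based on:\n" + facts


chain_output = run_chain("Revenue rose 12%. Churn fell.", fake_extract, fake_draft)
print(chain_output)
```

Because each step is a plain function, the validated intermediate result can be inspected, cached, or replayed without rerunning the earlier steps.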

Real example

Content team — research → brief → draft → edit

Four prompts in one Project. Step 1: source-backed fact sheet. Step 2: angle + outline (human picks angle). Step 3: draft in artifact. Step 4: editor pass against style guide. Quality up; fewer full rewrites.

2.5

Self-consistency

Running the same prompt multiple times and synthesising the best answer — when it is worth the cost

Key takeaway

Multiple samples + synthesis reduces variance on high-stakes judgments — but multiplies token cost. Reserve for decisions, not drafts.

Why this matters

Single samples lie confidently. Consistency across runs surfaces stable conclusions vs noise.

Process: run the prompt 3–5 times (with a non-zero temperature if using the API, so the runs can differ), extract claims that repeat, flag contradictions, then ask Claude to synthesise a final answer citing which runs agreed.

Worth it when: classification with fuzzy boundaries, risk assessment, creative strategy options. Not worth it for: deterministic formatting, code with tests, tasks with ground-truth verification.

Workflow — do this next

01 Run 3 parallel completions.
02 Diff key claims; agreement is signal.
03 Synthesis prompt: 'Merge consistent points; list disagreements for human.'
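For classification-style tasks, the "agreement is signal" step can be reduced to a majority vote over the sampled labels. The sketch below assumes each run has already been reduced to a single label string; `majority_label` is a hypothetical helper, and free-text answers would still need the synthesis prompt.

```python
from collections import Counter
from typing import List, Tuple


def majority_label(labels: List[str]) -> Tuple[str, float, List[str]]:
    """Return (winning label, agreement ratio, dissenting labels)."""
    if not labels:
        raise ValueError("need at least one sample")
    normalised = [label.strip().lower() for label in labels]
    counts = Counter(normalised)
    winner, votes = counts.most_common(1)[0]
    dissent = sorted(label for label in counts if label != winner)
    return winner, votes / len(normalised), dissent


winner, agreement, dissent = majority_label(["High", "high", "medium", "High ", "high"])
print(winner, agreement, dissent)
```

A low agreement ratio is the cue to send the disagreements to a human rather than trusting the winning label.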

2.6

Structured output prompting

Reliably extracting JSON, tables, and typed data from Claude — the technique for integration with other tools

Key takeaway

Structured output requires schema in the prompt, validation after the response, and a repair loop when parsing fails.

Why this matters

Downstream tools need parseable data. Prose is a dead end for automation.

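The Anthropic API supports tool use, which lets you request output that matches a JSON schema; check the current API documentation for structured-output support on your model. In Claude.ai, specify the schema in the prompt and use artifacts for tables. Always validate with JSON.parse or a schema validator.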

Repair pattern: 'Your output failed validation: [error]. Return corrected JSON only.' Keep original prompt; add error message. Two repair attempts max, then escalate to human.

Workflow — do this next

01 Embed the JSON schema or table headers in the prompt.
02 Parse the response; on failure, auto-retry with the error snippet.
03 Log failure types: recurring schema issues call for a prompt fix, not more retries.
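The repair pattern (original prompt plus error message, two attempts, then escalate) can be sketched as a small loop. The model call is passed in as a function so the example runs without an API key; `fake_model`, `REQUIRED_KEYS`, and the ticket schema are assumptions for illustration.

```python
import json
from typing import Callable

REQUIRED_KEYS = ("title", "priority")  # hypothetical schema for illustration
MAX_REPAIRS = 2  # two repair attempts, then escalate


def get_valid_json(prompt: str, call_model: Callable[[str], str]) -> dict:
    current_prompt = prompt
    for _ in range(MAX_REPAIRS + 1):  # one initial try plus the repairs
        raw = call_model(current_prompt)
        try:
            data = json.loads(raw)
            if not isinstance(data, dict):
                raise ValueError("top-level value must be a JSON object")
            missing = [key for key in REQUIRED_KEYS if key not in data]
            if missing:
                raise ValueError(f"missing keys: {missing}")
            return data
        except ValueError as error:  # json.JSONDecodeError is a ValueError subclass
            current_prompt = (
                f"{prompt}\n\nYour output failed validation: {error}. "
                "Return corrected JSON only."
            )
    raise RuntimeError("validation failed after repairs; escalate to a human")


# Stub model: the first reply has a chatty preamble, the second is clean JSON.
replies = iter(['Sure! {"title": "Fix login"}', '{"title": "Fix login", "priority": "high"}'])
seen_prompts = []


def fake_model(prompt: str) -> str:
    seen_prompts.append(prompt)
    return next(replies)


ticket = get_valid_json("Return JSON with keys title and priority.", fake_model)
print(ticket)
```

Logging the `error` strings from the `except` branch gives you the failure-type record mentioned above.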

2.7

Constitutional prompting

Building constraints and values directly into the prompt — the safety and quality layer you control

Key takeaway

Write explicit principles Claude must follow — accuracy rules, refusal boundaries, quality checks — as a constitution prepended to tasks.

Why this matters

Product safety and brand standards can't rely on base model behaviour alone. Your constitution is enforceable policy in text.

Constitution blocks include: cite sources or say unknown; never fabricate metrics; refuse requests outside role; check output against prohibited phrases; prioritise user safety over helpfulness when they conflict.

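End prompts with: 'Before responding, verify compliance with the constitution above.' This asks for a self-check pass before the answer. It is a prompt-level nudge, separate from the Constitutional AI training method Anthropic uses, so still validate outputs rather than assuming compliance.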

Workflow — do this next

01 Draft 5–10 non-negotiable rules for your use case.
02 Prepend to system prompt or Project instructions.
03 Red-team with adversarial inputs monthly.

Ready-to-use artifacts

Complete templates — paste directly into your AI tool or automation workflow.

Mini-constitution template

CONSTITUTION:
1. If fact is not in provided context, say "not in sources."
2. Never invent names, numbers, or citations.
3. Flag uncertainty explicitly.
4. Refuse [specific prohibited requests].
5. Output must pass [format + length] rules.

Task follows below.

2.8

Conditional prompting

Prompts that instruct Claude to behave differently based on what it finds — the branching logic in language

Key takeaway

Use if/then instructions in natural language to branch behaviour — classify first, then apply the right sub-prompt without spinning up separate apps.

Why this matters

One smart prompt can replace brittle multi-tool flows when branches are few and inspectable.

Pattern: 'Read the input. If [condition A], follow procedure A. If [condition B], follow procedure B. If unclear, ask one clarifying question.' Works for triage, support routing, document classification.

Limit branches to 3–4. Beyond that, use a chain with an explicit classification step (structured label) then route in code.

Workflow — do this next

01 Write the decision tree on paper first.
02 Encode as numbered branches in one prompt.
03 Test one input per branch + one ambiguous input.
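When branches outgrow a single prompt, the "classify, then route in code" variant might look like the sketch below. The labels, procedures, and `keyword_classifier` stub are hypothetical; in practice the classifier would be a model call that returns one structured label.

```python
from typing import Callable, Dict

# Hypothetical label-to-procedure map; each value would be a sub-prompt in practice.
ROUTES: Dict[str, str] = {
    "billing": "Follow the refund policy path.",
    "bug": "Collect reproduction steps.",
    "feature_request": "Fill the feature template and tag the product area.",
}
FALLBACK = "Ask one clarifying question."


def route(ticket: str, classify: Callable[[str], str]) -> str:
    """Classify first, then pick the matching procedure; unknown labels fall back."""
    label = classify(ticket).strip().lower()
    return ROUTES.get(label, FALLBACK)


def keyword_classifier(ticket: str) -> str:
    """Stub standing in for a model call that returns one label."""
    text = ticket.lower()
    if "refund" in text or "invoice" in text:
        return "billing"
    if "error" in text or "crash" in text:
        return "bug"
    if "feature" in text or "please add" in text:
        return "feature_request"
    return "unclear"


print(route("The app crashes on login", keyword_classifier))
print(route("Hello?", keyword_classifier))
```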

Real example

Support triage — conditional prompt

Single prompt: if billing → refund policy path; if bug → collect repro steps; if feature request → template + product area tag. Haiku classifies; human reviews edge cases. Replaced three separate macros.

Concept 3

The Prompt Trap Catalogue

Every common prompting mistake — what it costs you and exactly how to fix it

3.1

The vagueness trap

"Write something good about X" — the prompt pattern that guarantees mediocre output

Key takeaway

Vague prompts outsource every decision to the model — you get average-of-the-internet output, not your output.

Why this matters

This is the most common trap and the cheapest to fix with specificity (Concept 1.2).

Trap: 'Write something good about our product.' Fix: audience, angle, length, proof points, CTA, and what to avoid. Good is not a specification.

Workflow — do this next

01 Ban the words good, better, nice, and professional unless you define them.
02 Add: who reads this, what they do after reading, max length.

Real example

Marketing — landing page trap

'Write good copy' produced generic SaaS fluff. Fixed prompt named ICP, three differentiators with proof, objection to address, and hero under 12 words. Conversion on A/B test improved — not because of magic AI, because of specifications.

3.2

The novel trap

Treating Claude like a search engine: why it hallucinates when you ask for facts without sources

Key takeaway

Claude composes plausible text — it does not verify facts. Factual questions need sources, search, or attached documents.

Why this matters

Chapter 1's mental model: reasoning engine over evidence, not oracle. This trap causes the most expensive business errors.

Trap: 'What was our competitor's Q3 revenue?' without data. Fix: attach 10-K, enable search with source rules, or ask Claude to say 'I don't have this' if not in context.

Workflow — do this next

01 Tag prompts FACT or OPINION before sending.
02 FACT prompts require an attached source, or search plus citation review.
03 Never paste unverified numbers into decks or contracts.

3.3

The confirmation trap

Asking Claude if something is correct — why it agrees and how to get honest evaluation instead

Key takeaway

Claude is biased toward agreement. 'Is this right?' gets yes. 'Find flaws under criteria X' gets useful critique.

Why this matters

Training on human feedback rewards responses that people rate as helpful, and agreement tends to feel helpful even when it is wrong.

Trap: 'My plan is solid, right?' Fix: 'Act as sceptical reviewer. List 5 ways this plan fails. Cite assumptions.' Or: 'Steel-man the opposite position.'

For factual verification, compare against source text: 'Quote the passage that supports each claim' — unsupported claims surface quickly.

Workflow — do this next

01 Replace 'is this good?' with rubric-based scoring.
02 Ask for disconfirming evidence explicitly.
03 Run red-team passes in a fresh chat, so your earlier framing in the thread does not steer the critique.

3.4

The length trap

Not specifying length — why Claude defaults to verbose and how to control it

Key takeaway

Default Claude is thorough. Without caps, you get essays when you needed a Slack message.

Why this matters

Verbose output hides weak thinking and wastes reader time — see Concept 1.7.

Trap: 'Summarise the report.' Fix: 'Summarise in 5 bullets, max 15 words each, lead with decision impact.'

Workflow — do this next

01 Set numeric caps on every outward-facing deliverable.
02 Ask for an executive summary plus an optional appendix if depth is needed.

3.5

The format trap

Getting prose when you needed a table, or a list when you needed a paragraph — the one sentence that fixes it

Key takeaway

One explicit format line prevents 80% of reformatting work: 'Return a markdown table with columns A, B, C — no intro.'

Why this matters

Format mismatch is pure friction — easy to prevent, tedious to fix.

Trap: assuming Claude will guess your team's standard format. Fix: paste empty template or schema in every recurring task.

Workflow — do this next

01 Add OUTPUT FORMAT as a mandatory section in templates.
02 If the format is wrong, don't re-ask the whole task; ask: 'convert previous answer to [format] only'.

3.6

The context dump trap

Giving Claude too much irrelevant context and watching the quality degrade — the signal-to-noise discipline

Key takeaway

More context is not better context. Irrelevant documents dilute attention and increase cost — curate what the model sees.

Why this matters

Context window is finite attention, not unlimited storage (Chapter 1, Chapter 2).

Trap: uploading entire data room for one clause review. Fix: extract relevant pages, summarise background in 200 words, attach only primary sources.

Use Projects for stable company context; use per-message attachments for task-specific evidence only.

Workflow — do this next

01 Before attaching, ask: 'What 3 facts does Claude need?'
02 Remove docs that don't change the answer.
03 Summarise long files in a separate step, then chain.

3.7

The one-shot trap

Giving up after the first response — why the second and third iteration almost always produce better output

Key takeaway

First outputs are drafts. Teams that one-shot either accept mediocrity or blame the tool instead of iterating.

Why this matters

Iteration is the workflow (Concept 1.8) — not a sign of failure.

Trap: 'Claude can't do this' after one try. Fix: critique pass, targeted revision, format lock — usually three turns.

Workflow — do this next

01 Budget 3 turns minimum for important work.
02 Turn 2 always asks for gaps and assumptions.
03 Save winning iteration sequences as templates.

3.8

The compliance trap

Mistaking Claude's agreeableness for accuracy — the verification habits that prevent expensive mistakes

Key takeaway

Polite, confident wrong answers are the danger — not refusals. Verify facts, numbers, names, and legal claims before ship.

Why this matters

The trap combines confirmation bias (3.3) with novel trap (3.2) — lethal in regulated or customer-facing work.

Habits: citation check, spot-verify 3 random facts, second-model or human review on high stakes, never auto-send customer emails without human read.

Build verification into the prompt — don't rely on post-hoc discipline alone.

Workflow — do this next

01 Define tier 1 (auto-ok), tier 2 (spot check), tier 3 (mandatory human) tasks.
02 Tier 3: no send without a named approver.
03 Log incidents where the compliance trap fired, then update prompts.

Real example

Sales — wrong stat in proposal

An AE pasted a Claude-generated market size into a proposal without checking. The stat was an outdated composite from training data. Fix: the proposal prompt requires a citation or a 'VERIFY' flag; the source PDF is attached in the CRM; a manager reviews tier-3 deals.

Concept 4

Command-Grade Prompting

The professional system — how power users structure their prompting practice for consistent, scalable output quality

4.1

Prompt templates

The reusable prompt structures that deliver consistent output for recurring task types

Key takeaway

Templates freeze the parts that shouldn't change (format, rubric, constitution) and leave variables for the task — consistency without copy-paste drift.

Why this matters

Ad hoc prompts don't scale across teams or weeks. Templates are the unit of reusable prompt IP.

Template structure: metadata (name, owner, version), variable slots [BRACKETED], fixed instructions, format skeleton, test cases. Store in Project docs, Notion, or git.

Good template candidates: weekly reports, PRD sections, code review checklists, customer reply drafts, research briefs — anything you do more than twice monthly.

Workflow — do this next

01 Identify your top 5 recurring Claude tasks.
02 Extract the last best prompt; parameterise names, dates, inputs.
03 Add 2 test inputs in the template footer.
04 Review monthly; retire templates that fail tests.

Ready-to-use artifacts

Complete templates — paste directly into your AI tool or automation workflow.

Prompt template skeleton

# Template: [NAME] v[0.1]
Owner: [team]
Use when: [trigger]

## Variables
- [INPUT_DOC]
- [AUDIENCE]
- [DEADLINE]

## Instructions
[fixed instructions]

## Format
[fixed skeleton]

## Constitution
[fixed rules]

---
INPUT:
[INPUT_DOC]
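Filling the bracketed slots by hand invites the copy-paste drift that templates are meant to prevent. The sketch below substitutes upper-case slots such as [AUDIENCE] and fails loudly when a value is missing; `fill_template` is a hypothetical helper, and treating only upper-case names as variables is an assumption so that notes like [fixed rules] are left alone.

```python
import re
from typing import Dict


def fill_template(template: str, variables: Dict[str, str]) -> str:
    """Replace [UPPER_CASE] slots; raise if any slot has no supplied value."""

    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"no value supplied for [{name}]")
        return variables[name]

    # Only upper-case names with underscores count as variable slots.
    return re.sub(r"\[([A-Z][A-Z_]*)\]", substitute, template)


template = "Audience: [AUDIENCE]\n[fixed rules]\n---\nINPUT:\n[INPUT_DOC]"
filled = fill_template(template, {"AUDIENCE": "CFO", "INPUT_DOC": "Q3 report text"})
print(filled)
```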

4.2

The RISEN framework

Role, Instructions, Steps, End goal, Narrowing — the professional prompting structure that works across task types

Key takeaway

RISEN is a checklist for complete prompts: Role (who Claude is), Instructions (task), Steps (procedure), End goal (success definition), Narrowing (constraints and exclusions).

Why this matters

Frameworks prevent omitted components better than ad hoc memory — especially under time pressure.

Role: 'You are a technical editor.' Instructions: 'Tighten this spec for engineering.' Steps: '1) Flag ambiguity 2) Propose edits 3) Output redline.' End goal: 'Eng can estimate without clarification calls.' Narrowing: 'Max 400 words; do not change scope; mark assumptions.'

Workflow — do this next

01 Draft the prompt; label R-I-S-E-N sections.
02 Fill any blank section, or delete Role if it is redundant.
03 Compare output to End goal; if it misses, fix Steps, not adjectives.

Ready-to-use artifacts

Complete templates — paste directly into your AI tool or automation workflow.

RISEN prompt block

ROLE: [optional expertise frame]
INSTRUCTIONS: [single clear task]
STEPS:
1. [step]
2. [step]
3. [step]
END GOAL: [what done enables — measurable if possible]
NARROWING: [length, tone, exclusions, must-not-change]
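To make the RISEN block hard to leave half-filled, it can be modelled as a small data structure that refuses empty required sections. This is a sketch only; the `RisenPrompt` class is hypothetical, and it treats Role as optional to match the block above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RisenPrompt:
    instructions: str
    steps: List[str]
    end_goal: str
    narrowing: str
    role: str = ""  # optional, as in the block above

    def __post_init__(self) -> None:
        # Every section except Role must be filled in.
        for name in ("instructions", "end_goal", "narrowing"):
            if not getattr(self, name).strip():
                raise ValueError(f"{name} must not be empty")
        if not self.steps:
            raise ValueError("steps must not be empty")

    def render(self) -> str:
        lines = [f"ROLE: {self.role}"] if self.role else []
        lines.append(f"INSTRUCTIONS: {self.instructions}")
        lines.append("STEPS:")
        lines.extend(f"{i}. {step}" for i, step in enumerate(self.steps, start=1))
        lines.append(f"END GOAL: {self.end_goal}")
        lines.append(f"NARROWING: {self.narrowing}")
        return "\n".join(lines)


risen = RisenPrompt(
    role="You are a technical editor.",
    instructions="Tighten this spec for engineering.",
    steps=["Flag ambiguity", "Propose edits", "Output redline"],
    end_goal="Eng can estimate without clarification calls.",
    narrowing="Max 400 words; do not change scope; mark assumptions.",
)
rendered = risen.render()
print(rendered)
```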

4.3

System prompt design

Writing the instruction layer that shapes every response in a project — the architectural skill

Key takeaway

System prompts (API) and Project instructions (Claude.ai) are persistent constitution + capability brief — design them once, benefit on every turn.

Why this matters

Repeating rules in every user message wastes tokens and drifts. System layer is the right place for stable policy.

System prompt layers: identity & scope, output defaults, tool usage rules, safety constitution, escalation ('ask if ambiguous'). Keep under ~800–1500 tokens unless cached — see Chapter 2.

Test system prompts in isolation with minimal user messages. If behaviour only works with long user prompts, policy belongs in system.

Workflow — do this next

01 List behaviours you repeat in every chat and move them to the system prompt.
02 Version the system prompt in git; tag with model ID.
03 Regression-test 10 canonical user inputs after each change.
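The regression step can be sketched as a list of canonical inputs, each paired with a check on the reply. The model call is injected as a function so the example runs offline; `fake_model`, the sample cases, and the ESCALATE marker are assumptions for illustration, not behaviour of any real system prompt.

```python
from typing import Callable, List, Tuple

Check = Callable[[str], bool]  # returns True when the reply is acceptable


def run_regression(
    system_prompt: str,
    cases: List[Tuple[str, Check]],
    call_model: Callable[[str, str], str],
) -> List[str]:
    """Return the user inputs whose replies failed their check."""
    failures = []
    for user_input, check in cases:
        reply = call_model(system_prompt, user_input)
        if not check(reply):
            failures.append(user_input)
    return failures


# Stub standing in for a real API call (assumption for the sketch).
def fake_model(system_prompt: str, user_input: str) -> str:
    if "refund" in user_input.lower():
        return "ESCALATE: billing team"
    return "Thanks for reaching out."


cases = [
    ("I want a refund", lambda reply: reply.startswith("ESCALATE")),
    ("Hello", lambda reply: "ESCALATE" not in reply),
]
failures = run_regression("Escalate refund requests.", cases, fake_model)
print(failures)
```

An empty failure list is the gate for shipping a system prompt change; any entry names the canonical input to investigate.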

Real example

API product — system prompt as product spec

Support bot system prompt: tone, escalation triggers, forbidden promises, citation rules, JSON handoff to ticketing. User message is only customer text. Policy changes ship like code — PR review, staging eval, prod.

4.4

XML tags in prompts

How to use structured tags to separate context, instructions, and examples for complex prompts

Key takeaway

XML-style tags (<context>, <instructions>, <examples>) help Claude parse long prompts — especially with multiple documents and few-shot examples.

Why this matters

Anthropic's prompting guidance recommends XML tags for separating the parts of a prompt. Clear delimiters reduce bleed between sections, such as an example being read as an instruction.

Convention: wrap each block in its own tag (<context>, <instructions>, <examples>, <input>) and close every tag. Nesting is fine for sub-documents.

Use tags when a prompt exceeds ~500 words or mixes three or more sources. Skip them for short conversational tasks.
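
When prompts are built programmatically, a helper can apply the wrap-and-close convention and catch unclosed tags before anything is sent. This is a sketch under assumptions: the function names are hypothetical, and the lowercase-with-underscores naming rule is a team-style choice, not a requirement of Claude.

```python
import re

TAG_NAME = re.compile(r"^[a-z][a-z0-9_]*$")
TAG = re.compile(r"<(/?)([a-z][a-z0-9_]*)>")


def tag(name: str, body: str) -> str:
    """Wrap body in <name>...</name>, enforcing one naming style."""
    if not TAG_NAME.match(name):
        raise ValueError(f"non-standard tag name: {name!r}")
    return f"<{name}>\n{body.strip()}\n</{name}>"


def build_tagged_prompt(sections: dict[str, str]) -> str:
    """Join tagged sections in the order given."""
    return "\n\n".join(tag(name, body) for name, body in sections.items())


def unclosed_tags(prompt: str) -> list[str]:
    """Return tags left open, or the first stray or mis-nested closing tag."""
    stack: list[str] = []
    for closing, name in TAG.findall(prompt):
        if not closing:
            stack.append(name)
        elif stack and stack[-1] == name:
            stack.pop()
        else:
            return [name]
    return stack


prompt = build_tagged_prompt({
    "instructions": "Summarise the contract for a non-lawyer PM.",
    "context": "[paste contract excerpt]",
    "format": "Markdown table: Clause | Plain English | Risk (H/M/L)",
})
print(prompt)
```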

Workflow — do this next

01 Refactor a messy prompt into tagged sections.
02 Run the same input tagged vs untagged; compare parse errors and quality.
03 Standardise tag names across team templates.

Ready-to-use artifacts

Complete templates — paste directly into your AI tool or automation workflow.

Tagged prompt example

<instructions>
Summarise the contract for a non-lawyer PM.
Flag termination, liability, and data processing.
</instructions>

<context>
[paste contract excerpt]
</context>

<format>
Markdown table: Clause | Plain English | Risk (H/M/L)
</format>

4.5

Prompt libraries

Building and organising your personal collection of tested prompts — the asset that compounds over time

Key takeaway

A prompt library is indexed templates with version, owner, use case, and test status — not a folder of one-off chats.

Why this matters

Individual skill doesn't scale. Libraries turn personal hacks into organisational capability.

Organise by: function (PM, eng, sales), task type, model tier, maturity (draft / tested / deprecated). Search by keyword and outcome, not filename.

Each entry: prompt text, variables, example output, known failures, last tested date.
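
For a library kept in a repo rather than Notion, the entry fields and the search rule above might be modelled as follows. This is one possible shape, not a standard: the `PromptEntry` class, the `search` function, and all sample values (names, tier label, dates) are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class PromptEntry:
    name: str
    function: str      # e.g. "PM", "eng", "sales"
    task_type: str
    model_tier: str
    maturity: str      # "draft", "tested", or "deprecated"
    prompt_text: str
    variables: list[str] = field(default_factory=list)
    example_output: str = ""
    known_failures: list[str] = field(default_factory=list)
    last_tested: Optional[date] = None


def search(library: list[PromptEntry], keyword: str) -> list[PromptEntry]:
    """Match on name, task type, and example output; skip deprecated entries."""
    needle = keyword.lower()
    return [
        entry for entry in library
        if entry.maturity != "deprecated"
        and needle in f"{entry.name} {entry.task_type} {entry.example_output}".lower()
    ]


library = [
    PromptEntry(
        name="Release notes", function="PM", task_type="summary",
        model_tier="standard", maturity="tested",
        prompt_text="Summarise merged changes for customers.",
        variables=["CHANGELOG"], example_output="Three-bullet release summary",
        last_tested=date(2025, 1, 15),
    ),
    PromptEntry(
        name="Old release notes", function="PM", task_type="summary",
        model_tier="standard", maturity="deprecated",
        prompt_text="Write release notes.",
    ),
]
print([entry.name for entry in search(library, "release")])
```

Searching the example output as well as the name is what makes lookup by outcome possible, as opposed to lookup by filename.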

Workflow — do this next

01 Start with 10 templates from your last month of Claude work.
02 Add a metadata row per template in Notion or the repo README.
03 Monthly: promote one draft to tested; deprecate one failure.

4.6

Prompt version control

Tracking what changed, why, and what effect it had — the discipline that turns prompting into engineering

Key takeaway

Prompts are code-adjacent assets — version them, diff them, tie changes to eval results.

Why this matters

Silent prompt edits cause mysterious quality regressions. Version control makes prompting auditable.

Store prompts in git (prompts/ folder) or Notion with version headers. Changelog line: what changed, hypothesis, eval result (pass/fail on N cases).

Semantic versioning for prompts: patch = wording; minor = new constraint; major = behaviour change requiring re-eval.
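
The versioning rule and the changelog line can be captured in two small functions. This is a sketch, not a required tool: the change labels (`wording`, `constraint`, `behaviour`) and the changelog layout are assumptions, as is treating an eval as a pass only when every case passes.

```python
def bump(version: str, change: str) -> str:
    """Bump MAJOR.MINOR.PATCH according to the kind of prompt change."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "behaviour":   # major: behaviour change, re-eval required
        return f"{major + 1}.0.0"
    if change == "constraint":  # minor: new constraint added
        return f"{major}.{minor + 1}.0"
    if change == "wording":     # patch: wording only
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change!r}")


def changelog_line(version: str, what: str, hypothesis: str,
                   passed: int, total: int) -> str:
    """One line per change: what changed, hypothesis, eval result."""
    # Assumption: the eval counts as a pass only if every case passes.
    verdict = "pass" if passed == total else "fail"
    return (f"v{version} | {what} | hypothesis: {hypothesis} "
            f"| eval: {verdict} ({passed}/{total} cases)")


new_version = bump("1.4.2", "constraint")
print(changelog_line(new_version, "added 400-word cap",
                     "shorter output reduces revision rounds", 10, 10))
```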

Workflow — do this next

01 Never edit a production prompt in place without a version bump.
02 Run the eval suite before merge.
03 Keep a rollback copy one command away.

4.7

The prompt audit

Testing your existing prompts systematically against a quality rubric — the practice that surfaces hidden failures

Key takeaway

A prompt audit scores templates against rubric + fixed test set — finding vagueness, missing format, and untested edge cases before users hit them.

Why this matters

Prompts decay as products, models, and tasks change. Audits are preventive maintenance.

Rubric dimensions: specificity, format clarity, verifiability, safety, token efficiency, examples present. Score each dimension 1–5 for every template.

Test set: 5 inputs per template — happy path, edge, adversarial, empty input, oversized input. Record pass/fail and failure mode.

Workflow — do this next

01 Audit the top 20 templates quarterly.
02 Fix anything scoring below 3 on any dimension.
03 Re-run the test set; document the result in the changelog.
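
The scoring rule and the five-input test set can be combined into one audit record per template. The sketch below uses the dimension and test-category names from this section; the function name, the result layout, and the rule that a template passes only with no weak dimension and no failed test are assumptions.

```python
DIMENSIONS = (
    "specificity", "format_clarity", "verifiability",
    "safety", "token_efficiency", "examples_present",
)
TEST_CATEGORIES = (
    "happy_path", "edge", "adversarial", "empty_input", "oversized_input",
)


def audit_template(scores: dict[str, int], test_results: dict[str, bool]) -> dict:
    """Flag dimensions scoring below 3 and test categories that did not pass."""
    for name in DIMENSIONS:
        if not 1 <= scores[name] <= 5:
            raise ValueError(f"{name} must be scored 1-5")
    weak = [name for name in DIMENSIONS if scores[name] < 3]
    # A category with no recorded result counts as a failure.
    failed = [c for c in TEST_CATEGORIES if not test_results.get(c, False)]
    return {
        "fix_dimensions": weak,
        "failed_tests": failed,
        # Assumption: passing needs no weak dimension and no failed test.
        "passes_audit": not weak and not failed,
    }


result = audit_template(
    scores={
        "specificity": 4, "format_clarity": 2, "verifiability": 3,
        "safety": 5, "token_efficiency": 3, "examples_present": 4,
    },
    test_results={
        "happy_path": True, "edge": True, "adversarial": False,
        "empty_input": True, "oversized_input": True,
    },
)
print(result)
```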

Ready-to-use artifacts

Complete templates — paste directly into your AI tool or automation workflow.

Prompt audit rubric (abbreviated)

1. Specificity — audience, length, success defined? (1-5)
2. Format — parseable output specified? (1-5)
3. Evidence — facts require sources? (1-5)
4. Safety — prohibited actions listed? (1-5)
5. Efficiency — no redundant context? (1-5)
6. Tested — 5 cases pass in last 90 days? (Y/N)

<3 on any dimension → rewrite before next use

Ready-to-use artifacts

Complete templates — paste directly into your AI tool or automation workflow.

Quick prompt quality rubric

Score any prompt in 2 minutes before production use.

□ Instructions: single clear task?
□ Context: only what's needed?
□ Examples: 1–3 if task is nuanced?
□ Format: skeleton or schema?
□ Length/tone: numeric or sample?
□ Verification: cite or flag unknown?
□ Tested on 3 inputs in last 30 days?

7/7 → promote to library | <5 → revise first

Team prompt onboarding (30 min)

Run once per new hire or quarterly refresh.

1. Tour prompt library — top 5 templates for their role (10 min)
2. Walk one RISEN example live (5 min)
3. Trap catalogue: vagueness + compliance (5 min)
4. Exercise: improve a bad prompt using rubric (10 min)
5. Homework: submit one tested template fork

Prompt anatomy checklist

INSTRUCTIONS — What should Claude do? (verbs, scope, exclusions)
CONTEXT — What must it know? (docs, audience, date, constraints)
EXAMPLES — What does good look like? (1–3 pairs if task is nuanced)
FORMAT — How should output appear? (structure, length, file type)

Score each 0–2 before sending. Below 6 total → revise first.

Format specification block

OUTPUT FORMAT:
- Type: [markdown table / JSON / bullet list]
- Sections: [ordered list]
- Max length: [words or rows]
- Rules: [no preamble | cite sources inline | use H2 only]
- Template:
[paste skeleton here]

Few-shot prompt template

<example>
<input>[sanitised input 1]</input>
<output>[ideal output 1]</output>
</example>
<example>
<input>[input 2 — edge case]</input>
<output>[output 2]</output>
</example>

Now process:
<input>[new task]</input>

Mini-constitution template

CONSTITUTION:
1. If fact is not in provided context, say "not in sources."
2. Never invent names, numbers, or citations.
3. Flag uncertainty explicitly.
4. Refuse [specific prohibited requests].
5. Output must pass [format + length] rules.

Task follows below.

Prompt template skeleton

# Template: [NAME] v[0.1]
Owner: [team]
Use when: [trigger]

## Variables
- [INPUT_DOC]
- [AUDIENCE]
- [DEADLINE]

## Instructions
[fixed instructions]

## Format
[fixed skeleton]

## Constitution
[fixed rules]

---
INPUT:
[INPUT_DOC]
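
In an automation workflow, the bracketed variables in a skeleton like this can be filled by code that refuses to send a prompt with an unfilled slot. This is a minimal sketch: `fill_template` is a hypothetical helper, and it assumes variables are written in upper case with underscores, as in the Variables list of the skeleton.

```python
import re

VARIABLE = re.compile(r"\[([A-Z][A-Z0-9_]*)\]")


def fill_template(template: str, values: dict[str, str]) -> str:
    """Replace each [VARIABLE] with its value; fail loudly on a missing one."""
    def replace(match: re.Match) -> str:
        key = match.group(1)
        if key not in values:
            raise KeyError(f"no value supplied for [{key}]")
        return values[key]
    # Lowercase placeholders such as [fixed rules] are left untouched.
    return VARIABLE.sub(replace, template)


template = "Audience: [AUDIENCE]\nDue: [DEADLINE]\n---\nINPUT:\n[INPUT_DOC]"
filled = fill_template(template, {
    "AUDIENCE": "engineering",
    "DEADLINE": "Friday",
    "INPUT_DOC": "Spec text goes here.",
})
print(filled)
```

Raising on a missing value, rather than leaving the bracket in place, keeps a half-filled template from reaching the model unnoticed.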

B2B marketing org — from prompt chaos to library governance

A 25-person marketing team had 200+ ad hoc Claude chats bookmarked. Brand voice drifted; legal flagged an unverified stat; new hires asked 'which prompt?' in Slack daily.

Before

No templates, no tests, no owners. Best prompts lived in one senior manager's inbox. Iteration was optional; one-shot was the norm.

After

Chapter 4 rollout: 12 audited templates (RISEN structure), prompt guild monthly, library in Team Project, trap poster in onboarding. API-bound workflows got XML-tagged system prompts in git with eval suite.

Draft-to-approved time → down 38% on recurring assets
Brand voice revision rounds → down from 2.1 avg to 0.8
Compliance flags on published content → 3 in Q1 to 0 in Q2
New hire time-to-first-quality-output → 2 weeks to 3 days

What goes wrong

Collecting prompts without testing: the library becomes a junk drawer.

Fix: Prompt audit rubric (4.7); nothing enters the library without 5 passing test cases.

Over-engineering prompts before understanding the task.

Fix: Start with Concept 1; add advanced techniques only when the foundation fails.

Copying 'mega prompts' from social media without context.

Fix: Trap catalogue 3.1 + 3.6; adapt with your own I/C/E/F and examples.

Treating system prompts as set-and-forget across model upgrades.

Fix: Version control plus re-eval on every model change (4.6).

Vetted by Krishna Kumar, Curator, FactorBeam
