Standalone article · part of a sequenced guide

What you'll unlock: Treat Claude Code as an implementer, not an architect — give it CLAUDE.md context, bounded TASK.md acceptance criteria, branch/PR/CI gates, and extend outward into custom agents when coding loops become products.

Tool guideChapter 8 of 10

Claude Code — End to End

~110 min read

The complete guide to Claude Code — from installation to production-grade agentic coding workflows

Chapter context

Engineering leaders hear 'AI wrote 80% of the code' but see review bottlenecks, surprise diffs, and incidents from unbounded agent runs.This chapter codifies agentic engineering discipline so velocity gains don't trade off reliability.


Is this chapter for you?

Are engineers pasting large code blocks into Claude.ai?

Yes — pilot Claude Code on bounded repo tasks (Concept 2).

Do you need autonomous cross-tool workflows (tickets + code + CI)?

Yes — Concepts 3.7 MCP + Concept 4 workflow agents.

Has an agent change ever merged without human review?

Yes — stop; implement 3.5 git policy and 3.8 production checklist immediately.

Is the team building customer-facing AI agents?

Yes — Concept 4 full path; Claude Code as dev environment.


Chapters 1 and 7 introduced Claude Code and MCP in the stack. Chapter 8 is the full engineering guide — for developers and technical leads shipping software with an autonomous coding agent.Claude Code is not smarter autocomplete. It is a collaborator with shell access — powerful when bounded by TASK specs, CLAUDE.md, git discipline, and human review.

Chapter insight

Treat Claude Code as an implementer, not an architect — give it CLAUDE.md context, bounded TASK.md acceptance criteria, branch/PR/CI gates, and extend outward into custom agents when coding loops become products.


Reference diagrams

Claude Code agent loop

Task spec → plan → read/edit/run → observe → repeat until tests pass or human stops.

TASK.mdSpec + criteriaYou
PlanFiles + stepsAgent
ExecuteEdit + shellAgent
PRHuman reviewTeam

Building agents stack

Scaffold → tools → loop → memory — same patterns Claude Code uses internally.

ScaffoldAPI loopCode
ToolsMCP/APIIntegrate
StateMemoryPersist
Multi-agentSupervisorScale

Implementation paths

Foundations → workflows → advanced → build agents.

Claude Code E2EFoundationsConcept 1 — 1.1–1.8Agent vs assistantMindsetSafetyApprovalsCore workflowsConcept 2 — 2.1–2.8Bugs + featuresDailyTests + docsQualityAdvancedConcept 3 — 3.1–3.8CLAUDE.mdContextGit + CIProductionBuild agentsConcept 4 — 4.1–4.8Scaffold + toolsFoundationMulti-agentScale

Concept 1

Claude Code Foundations

What Claude Code is, how it works, and the mental model that makes it different from every other coding tool

1.1

What Claude Code is

The agentic coding environment that lives in your terminal — not a code completion tool but an autonomous coding agent

Key takeaway

Claude Code is a terminal-native agent that reads your repo, runs commands, edits files, and iterates until a task completes — closer to a junior engineer with shell access than to autocomplete.

Why this matters

Chapter 1 placed Claude Code in the ecosystem. This is the surface where software actually ships.

Claude Code plans steps, executes them, observes output, and continues — write tests, fix failures, open PRs within guardrails you approve.

Not: inline tab completion. Is: task-level delegation with repo context and terminal execution.

Workflow — do this next

  1. 01Install in a non-production repo first.
  2. 02Run one read-only task: 'explain module X'.
  3. 03Graduate to scoped edit tasks with TASK.md.

1.2

How Claude Code differs from Claude.ai

Direct filesystem access, terminal execution, and persistent context — the capabilities that change what's possible

Key takeaway

Claude.ai chats about code; Claude Code operates on code — filesystem, shell, git, tests — in your real environment.

Why this matters

Pasting files into chat doesn't scale; agents need native repo and command access.

Claude.ai: upload snippets, artifacts, advice. Claude Code: traverse tree, run npm test, apply patches, read stderr, retry.

Use Claude.ai for design and specs; Claude Code for implementation in repo. Chapter 6 Projects can hold TASK specs Claude Code executes.

Workflow — do this next

  1. 01Write spec in Project or TASK.md.
  2. 02Invoke Claude Code in repo root.
  3. 03Reference TASK.md in first message.

1.3

Installation and setup

The complete setup process — prerequisites, installation, authentication, and first run

Key takeaway

Prerequisites: supported OS, Node or install path per Anthropic docs, API key or Claude account auth, git repo. First run in trusted directory only.

Why this matters

Skipped setup causes permission confusion and auth failures mid-task.

Follow official install for your platform. Authenticate with Anthropic credentials. Verify with version/help command. Run inside git repo with clean status or dedicated branch.

Team: document approved install channel; block unverified forks. Align API billing with Chapter 2.

Workflow — do this next

  1. 01Install from official source.
  2. 02Authenticate; confirm org policy allows.
  3. 03cd into sample repo; run help.
  4. 04Create agent-test branch.

Ready-to-use artifacts

Complete templates — paste directly into your AI tool or automation workflow.

Claude Code setup checklist

□ Official install completed
□ Auth verified (API or subscription per policy)
□ Git repo initialized / cloned
□ Working branch created (not main)
□ CLAUDE.md stub added (see 3.2)
□ Team security policy acknowledged
□ First task: read-only explain

1.4

The Claude Code interface

Commands, shortcuts, modes, and the terminal interaction model

Key takeaway

Learn core commands: start session, approve/deny tool actions, interrupt, resume context, slash commands for modes — interaction is conversational plus explicit approvals.

Why this matters

Approval model is the UX — misunderstanding it causes surprise edits or blocked progress.

Expect prompts to approve file writes and shell commands. Use deny to redirect approach. Interrupt when scope drifts. Explore built-in help for session controls — exact commands evolve with releases.

Workflow — do this next

  1. 01Run --help or docs for current command list.
  2. 02Practice approve/deny on harmless ls.
  3. 03Note keyboard interrupt behavior.

1.5

How Claude Code reads your codebase

What it indexes, how it navigates, and how to help it understand your project structure

Key takeaway

Claude Code discovers structure via directory traversal, file reads, grep-like search, and CLAUDE.md — not magic full-repo embedding on load.

Why this matters

Large monorepos need CLAUDE.md and .claudeignore-style discipline so navigation stays focused.

It reads what it needs iteratively — package manifests, entry points, symbols, tests. Help it: CLAUDE.md architecture summary, consistent module boundaries, TASK.md with paths.

Exclude generated artifacts and secrets from scope — never commit API keys; use env files in .gitignore.

Workflow — do this next

  1. 01Add CLAUDE.md with tree overview.
  2. 02Point to entry files explicitly in tasks.
  3. 03Keep tasks scoped to one package/service.

1.6

Permissions and safety

What Claude Code can and cannot do without explicit approval — the safety model and the override controls

Key takeaway

Default: ask before destructive ops. You control writes, installs, network, git push. Override sparingly; sandbox for experiments.

Why this matters

Agent with shell is production-risk — treat approvals like code review gates.

Risky: rm -rf, prod deploy, force push, curl | bash, modifying CI secrets. Policy: no auto-approve on main; branch-only; human reviews all diffs.

Workflow — do this next

  1. 01Never run on prod checkout.
  2. 02Require diff review before merge.
  3. 03Log sessions for audit if org requires.

1.7

Claude Code with your existing tools

How it integrates with your editor, your version control, and your development environment

Key takeaway

Claude Code complements IDE — terminal agent + editor for manual tweak. Git is system of record; MCP extends to external tools (Chapter 7).

Why this matters

Teams succeed when Code fits workflow, not replaces it wholesale.

Pattern: Claude Code on branch → IDE diff view → CI → PR. Some teams run Code inside VS Code terminal. Git hooks and formatters still apply to agent output.

Workflow — do this next

  1. 01Define team workflow diagram.
  2. 02Run formatter/linter post-agent.
  3. 03CI must pass before merge.

1.8

The agentic coding mindset

What changes when you have a coding agent rather than a coding assistant — the task size and type that Claude Code is built for

Key takeaway

Delegate bounded tasks with clear acceptance criteria — migrate module, fix test suite, add endpoint — not 'make app better'. You are tech lead; agent is implementer.

Why this matters

Wrong task granularity produces endless loops or reckless diffs.

Good tasks: defined I/O, existing test harness, <20 files typical. Poor tasks: greenfield architecture without design doc, vague 'clean up code everywhere'.

Chapter 4 prompting applies: TASK.md is your spec — instructions, context, format, tests.

Workflow — do this next

  1. 01Write acceptance criteria as checklist.
  2. 02Time-box agent runs; interrupt if looping.
  3. 03Retrospective: what spec was missing?

Real example

Eng — bounded migration task

TASK: migrate auth middleware on 12 endpoints; tests must pass; no unrelated refactors. Claude Code finished in 3 hours; human review caught one edge case. Vague 'improve auth' attempt last month failed after 200-file diff.

Concept 2

Core Claude Code Workflows

The daily workflows that make Claude Code genuinely productive — how to use it for real work, not just demos

2.1

Explaining a codebase

Loading an unfamiliar project and using Claude Code to understand architecture, dependencies, and logic

Key takeaway

Start with read-only: 'map architecture, entry points, data flow, test layout' — output ARCHITECTURE_NOTES.md for humans.

Why this matters

Onboarding is the highest-ROI read-only use — zero merge risk.

Prompt: scope directory, identify frameworks, diagram module deps, flag legacy areas. Cross-check with CLAUDE.md if present.

Workflow — do this next

  1. 01Run on onboarding day one.
  2. 02Ask for sequence diagram of request path.
  3. 03Human validates with senior engineer.

2.2

Bug investigation and fixing

Giving Claude Code a bug report and letting it find, explain, and fix the issue

Key takeaway

Provide repro steps, logs, expected vs actual — agent reproduces, locates root cause, proposes fix, runs tests.

Why this matters

Bug tasks are well-bounded — ideal agent shape.

TASK.md: repro command, stack trace, regression test requirement. Agent loop: reproduce → isolate → fix → test. Human verifies fix isn't band-aid.

Workflow — do this next

  1. 01Paste error + repro in TASK.
  2. 02Require new or updated test.
  3. 03Review root cause explanation before merge.

Real example

Flaky test — agent trace

Agent traced race in async cleanup; added await; flaky test stabilized. Human confirmed no broad timeout increases.

2.3

Feature implementation

Specifying a feature in plain language and having Claude Code write, test, and integrate it

Key takeaway

Feature spec: user story, API contract, files likely touched, test plan — agent implements behind feature branch.

Why this matters

Features fail when spec omits edge cases and integration points.

Include: mockups or OpenAPI snippet, error cases, logging, feature flag if needed. Agent drafts; human reviews API shape and UX edge cases.

Workflow — do this next

  1. 01Spec in TASK.md or linked ticket.
  2. 02List must-not-change modules.
  3. 03CI green + product sign-off.

2.4

Refactoring at scale

Asking Claude Code to improve code quality across a large codebase — the patterns and the guardrails

Key takeaway

Refactor in slices — one pattern, one directory, mechanical changes with tests green each slice — never 'refactor everything'.

Why this matters

Unbounded refactors create unreviewable diffs.

Pattern: rename with tests, extract module with coverage, migrate API with adapter layer. Cap files per session.

Workflow — do this next

  1. 01Define mechanical transformation precisely.
  2. 02Run tests after each commit.
  3. 03Stop at reviewable diff size (~400 lines).

2.5

Test writing

Using Claude Code to achieve comprehensive test coverage — the prompts and the review process

Key takeaway

Point at module; require unit + edge + failure cases; forbid testing implementation details — review assertions for meaning.

Why this matters

Agents generate high-coverage nonsense tests — human judges assertion quality.

TASK: coverage target for file X, list edge cases, use project test conventions from CLAUDE.md.

Workflow — do this next

  1. 01Share example test file as style reference.
  2. 02Run coverage report.
  3. 03Delete vacuous tests in review.

2.6

Documentation generation

Having Claude Code write technical documentation from code — the output quality and the review workflow

Key takeaway

Generate README sections, API docs, ADRs from source — agent reads code truth; human fixes intent and omissions.

Why this matters

Docs drift from code; agent-derived docs stay closer if regenerated on change.

Output to docs/ with PR. Include 'verify against code' step — agent lists files read as cites.

Workflow — do this next

  1. 01Scope one module per doc pass.
  2. 02Technical reviewer checks accuracy.
  3. 03Link from CLAUDE.md.

2.7

Code review assistance

Using Claude Code to review pull requests — what it catches, what it misses, and how to use it alongside human review

Key takeaway

Agent review: style, obvious bugs, missing tests, security smells — not product judgment or subtle domain bugs.

Why this matters

Complement human review; never replace required reviewer.

Prompt: review diff for security, error handling, test gaps. Human owns architecture and product fit.

Workflow — do this next

  1. 01Run on PR branch locally.
  2. 02Feed findings into PR comments.
  3. 03Human reviewer adjudicates.

2.8

Dependency and security audit

Asking Claude Code to identify outdated dependencies, security vulnerabilities, and technical debt

Key takeaway

Agent runs audit tools (npm audit, etc.), interprets output, proposes bump PRs — human approves version jumps.

Why this matters

Automates toil; risky major bumps need human judgment.

TASK: run org-approved scanners, summarise CRITICAL/HIGH, propose patches with changelog notes.

Workflow — do this next

  1. 01Read-only audit first.
  2. 02Separate PR per dependency group.
  3. 03Run full test suite on bumps.

Concept 3

Advanced Claude Code Patterns

The techniques that unlock Claude Code's full capability — multi-file operations, agent loops, and production-grade engineering

3.1

Multi-file operations

How Claude Code edits across multiple files simultaneously — the capability and the coordination discipline

Key takeaway

Multi-file edits need shared types, imports, and tests updated atomically — cap scope; require single logical commit per task.

Why this matters

Wide edits without coordination break builds mid-session.

List affected paths in TASK.md upfront. Agent should run build/tests before declaring done. Human reviews cross-file contracts.

Workflow — do this next

  1. 01Name all files expected to change.
  2. 02Reject unrelated file touches in review.
  3. 03One PR per coherent change.

3.2

The CLAUDE.md file

How to write the project context file that makes Claude Code understand your codebase the moment it loads

Key takeaway

CLAUDE.md is repo-level system prompt: architecture, commands, conventions, test instructions, don't-touch zones — loaded automatically for context.

Why this matters

Best ROI file in agentic coding — saves re-explaining every session.

Include: stack, how to run tests/lint, directory map, naming rules, env setup, common gotchas, link to ADRs.

Workflow — do this next

  1. 01Start from template artifact.
  2. 02Update when architecture changes.
  3. 03Team reviews in onboarding.

Ready-to-use artifacts

Complete templates — paste directly into your AI tool or automation workflow.

CLAUDE.md template

# Project context for Claude Code

## Stack
[language, framework, versions]

## Commands
- install: `...`
- test: `...`
- lint: `...`

## Architecture
[5–10 lines + key paths]

## Conventions
[style, branching, commit format]

## Do not modify
[generated dirs, vendor, secrets]

## Testing
[how to run, coverage expectations]

3.3

Custom commands

Building your own Claude Code commands for recurring operations — the automation layer inside the agentic tool

Key takeaway

Custom commands encode recurring prompts — release prep, migration checklist, security scan. For multi-step SOPs with scripts, prefer Agent Skills (Ch 8.5) — commands are single-shot; skills are progressive packages.

Why this matters

Repeatable eng rituals need the right layer — don't stuff scripts into slash commands when SKILL.md fits.

Define commands per `.claude/commands/` or official extensibility docs — wrap TASK templates. Version in repo. Document in CLAUDE.md index alongside skills.

Use command when: one prompt template. Use skill when: multi-step procedure + optional scripts + resources. Use plugin when: bundling MCP + skills + commands for distribution.

Workflow — do this next

  1. 01Identify third recurring team task.
  2. 02Encode as command + TASK template.
  3. 03Document in CLAUDE.md.

3.4

Agent loops in Claude Code

How to run Claude Code in extended autonomous mode — what to delegate and how to monitor

Key takeaway

Extended loops: clear stop condition, max iterations, checkpoint commits, human review at milestones — watch for thrashing on same error.

Why this matters

Unmonitored autonomy burns tokens and can spiral on wrong approach.

Set: 'stop when tests pass' or 'max 10 attempts'. Interrupt if same error repeats 3 times. Checkpoint git commits when green.

Workflow — do this next

  1. 01Define stop condition in TASK.
  2. 02Monitor terminal output.
  3. 03Interrupt and narrow scope if stuck.

3.5

Claude Code with version control

The Git workflow that integrates Claude Code safely — branch discipline, commit review, and rollback

Key takeaway

Branch per task → small commits → PR → human review → CI → merge. Never agent-commit directly to main; revert via git.

Why this matters

Git is your safety net when agent errs.

Conventional commits optional. Squash or not per team. Tag agent-generated PRs for metrics. Rollback: git revert, not manual memory.

Workflow — do this next

  1. 01feature/agent-* branch naming.
  2. 02Require PR template checklist.
  3. 03Revert script ready.

3.6

Claude Code in CI/CD

Using Claude Code as part of your deployment pipeline — automated review, test generation, and quality gates

Key takeaway

CI use cases: optional PR comment bot, test gap suggestions, doc sync checks — sandboxed, read-mostly, no prod credentials in agent.

Why this matters

Pipeline agents need stricter bounds than local sessions.

Run in ephemeral CI container with least privilege. No auto-merge from agent. Human required for production deploy.

Workflow — do this next

  1. 01Pilot on non-blocking PR comments.
  2. 02Measure false positive rate.
  3. 03Expand scope if signal high.

3.7

MCP integration in Claude Code

Connecting Claude Code to external tools — databases, APIs, and services it can call during coding tasks

Key takeaway

Claude Code + MCP (Chapter 7): query staging DB schema, read ticket from Jira, check API spec — while editing code in repo.

Why this matters

Coding tasks often need live external truth, not stale README.

Configure MCP servers per Claude Code docs. Read-only DB in dev. Never prod write without human gate.

Workflow — do this next

  1. 01Add one MCP server (e.g. GitHub issues).
  2. 02TASK references ticket ID.
  3. 03Verify tool output in PR description.

3.8

Production-grade Claude Code

The discipline, the checkpoints, and the human oversight that makes Claude Code safe for real production work

Key takeaway

Production-grade = TASK specs + CLAUDE.md + branch/PR/CI + human review + audit + no secrets in prompts + incident retros.

Why this matters

Velocity without discipline becomes production incidents.

Team policy doc: allowed tasks, forbidden paths, review requirements, logging. Measure: PR acceptance rate, revert rate, incident count.

Workflow — do this next

  1. 01Publish Claude Code eng policy.
  2. 02Monthly retro on agent PRs.
  3. 03Update CLAUDE.md from incidents.

Ready-to-use artifacts

Complete templates — paste directly into your AI tool or automation workflow.

Agent PR review checklist

□ TASK.md / ticket linked
□ Diff size reviewable
□ Tests added/updated meaningfully
□ No secrets / env committed
□ CI green
□ Human reviewer (not agent-only)
□ Security-sensitive paths double-reviewed

Concept 4

Building AI Agents with Claude Code

Using Claude Code as the foundation for building autonomous AI agents — from simple tools to multi-agent systems

4.1

What an AI agent is

The definition, the loop, and why Claude Code is a natural environment for building them

Key takeaway

An agent observes environment, reasons, acts via tools, repeats until goal or stop — Claude Code is both an agent (coding) and a dev environment for building agents.

Why this matters

Chapter 1 agentic loops apply — building agents extends Claude Code skills to products.

Agent loop. Coding agent uses file/shell tools; research agent uses search/API tools.

Workflow — do this next

  1. 01Sketch loop on whiteboard.
  2. 02List tools and stop conditions.
  3. 03Prototype in repo with tests.

4.2

The agent scaffold

The code structure that wraps Claude API calls in an observe-reason-act loop — the foundation every agent shares

Key takeaway

Scaffold: Messages API + tool definitions + executor + state + stop policy — thin loop, fat tools, logged steps.

Why this matters

Every custom agent shares this skeleton; Claude Code can generate it from spec.

Modules: prompt/system, tool registry, run_loop(max_steps), handlers per tool, structured logging. Use Anthropic SDK patterns.

Workflow — do this next

  1. 01Clone minimal agent template.
  2. 02Implement one tool end-to-end.
  3. 03Add max_steps and timeout.

Ready-to-use artifacts

Complete templates — paste directly into your AI tool or automation workflow.

Agent scaffold outline (conceptual)

run_agent(goal):
  state = init()
  while not done and steps < MAX:
    response = claude(messages, tools)
    if tool_call:
      result = execute_tool(tool_call)
      messages.append(result)
    else:
      return response
  raise MaxStepsError

4.3

Tool definition and integration

How to define tools that your agent can call — APIs, databases, file systems, and external services

Key takeaway

Tools need JSON schema, idempotent reads, explicit write confirmations, error messages agents can parse — design for machine feedback.

Why this matters

Bad tool errors cause agent hallucination of success.

Read tools: get_issue, search_docs. Write tools: create_draft — separate from publish. Return structured JSON always.

Workflow — do this next

  1. 01Document each tool schema.
  2. 02Test tool failures explicitly.
  3. 03Mock tools in unit tests.

4.4

Building a simple research agent

An agent that searches, reads, synthesises, and outputs — the complete build, step by step

Key takeaway

Research agent tools: web/search, fetch_url, write_note — output markdown report with citations; human reviews before use.

Why this matters

Simplest agent pattern — validates loop before complex writes.

Build in Claude Code: TASK to scaffold agent/, implement search + fetch tools, synthesis prompt with SOURCE RULES (Chapter 4), CLI entrypoint.

Workflow — do this next

  1. 01Define report JSON/markdown schema.
  2. 02Implement read-only tools first.
  3. 03Eval on 5 fixed research questions.

Real example

Competitive scan agent

Agent fetched 8 sources, produced comparison table with URLs. PM validated 2 citations manually — faster than manual scan, not auto-published.

4.5

Building a data processing agent

An agent that ingests, transforms, analyses, and reports — the pipeline agent pattern

Key takeaway

Pipeline agent: ingest (CSV/API) → validate schema → transform steps → aggregate → report artifact — log row counts and errors.

Why this matters

Common ops/analytics automation with clear verification metrics.

Tools: read_file, run_sql (sandbox), write_report. Checkpoint intermediate parquet. No PII in logs.

Workflow — do this next

  1. 01Sample data in tests.
  2. 02Validate row counts each stage.
  3. 03Human signs off report.

4.6

Building a multi-step workflow agent

An agent that executes a complex multi-tool workflow autonomously — the orchestration pattern

Key takeaway

Workflow agent encodes DAG: step dependencies, checkpoints, human gates on writes — Chapter 7 MCP workflows as code.

Why this matters

Cross-system automation is where agent products deliver ROI.

State machine explicit in code — not only prompt — for reliability. Resume from last checkpoint on failure.

Workflow — do this next

  1. 01Draw workflow DAG.
  2. 02Implement step handlers.
  3. 03Dry-run with mocked writes.

4.7

Memory and state in agents

How to give your agent persistent memory across runs — the storage patterns that make agents stateful

Key takeaway

Persist: run_id, messages summary, tool results hash, user prefs — SQLite/Redis/file — inject summary on next run (Chapter 5).

Why this matters

Stateless agents repeat work; unbounded state bloats cost.

Patterns: rolling summary + key facts table, episodic log for audit, vector store only if corpus huge.

Workflow — do this next

  1. 01Define what must persist vs ephemeral.
  2. 02Implement summary after each run.
  3. 03Test resume mid-workflow.

4.8

Multi-agent systems

Building a network of specialised agents that coordinate — the architecture and the coordination protocol

Key takeaway

Specialists (researcher, coder, reviewer) + orchestrator — message bus or supervisor model — clear handoff schema, human at merge.

Why this matters

Monolithic mega-agent confuses roles; separation improves eval and safety.

Supervisor routes subtasks. Each agent narrow tools. Reviewer agent blocks merge until checklist pass. Start with two agents before N.

Workflow — do this next

  1. 01Define roles and handoff JSON schema.
  2. 02Implement supervisor loop.
  3. 03Integration test on one end-to-end scenario.

Real example

Spec → code → review trio

Planner agent wrote TASK.md from ticket; coder agent implemented on branch; reviewer agent posted PR comments. Human merged after adjudicating one false positive. Multi-agent reduced single-agent scope creep.

Concept 5

Agent Skills, Plugins & Extensibility

SKILL.md in Claude Code, plugin marketplace, progressive disclosure, and production skill governance

5.1

Agent Skills in Claude Code

`.claude/skills/` project skills, `~/.claude/skills/` personal skills, and discovery at session start

Key takeaway

Claude Code discovers SKILL.md in project `.claude/skills/` and user `~/.claude/skills/` — metadata scanned at start, full skill loaded when relevant.

Why this matters

Custom commands alone cannot package scripts + resources — skills are the team standard for eng workflows.

Add skill folder with SKILL.md + optional scripts. Commit project skills to git. Personal skills for experiments only — promote to project when stable.

Workflow — do this next

  1. 01Create `.claude/skills/release-notes/SKILL.md`.
  2. 02Test invocation on sample task.
  3. 03PR skill folder like code.

5.2

Pre-built document skills

PowerPoint, Excel, Word, PDF skills — OOXML workflows, templates, and verification habits

Key takeaway

Anthropic pre-built skills encode document generation (pptx, xlsx, docx, pdf) — use for board decks and reports; human verifies numbers and branding.

Why this matters

Raw 'make me a deck' without skill produces generic slides — skills enforce structure.

Enable document skills in host. Provide template or brand constraints in Project. Output to artifact → human open in Office → verify charts and footnotes.

Workflow — do this next

  1. 01Enable pptx/xlsx skill.
  2. 02Attach brand template reference.
  3. 03QA in native Office app before send.

5.3

Claude Code plugins

Plugins bundling MCP + skills + commands + sub-agents — install, author, and submit to directory

Key takeaway

Plugins package MCP (.mcp.json), skills, slash commands, and sub-agents — install from marketplace or git; submit to plugin directory when productized.

Why this matters

Plugins are how teams ship one-click capability packs — not ad-hoc config per developer.

Install: plugin add from Anthropic/partner repos. Author: structure per plugin spec — MCP servers, skill folders, commands/. Submit for directory review when stable.

Workflow — do this next

  1. 01Install official frontend-design or doc plugin.
  2. 02Run golden task.
  3. 03Fork internal plugin for company standards.

5.4

Skills vs CLAUDE.md vs custom commands

When to use each Claude Code extensibility mechanism — decision tree with examples

Key takeaway

CLAUDE.md = always-on repo context. Skills = on-demand procedures. Commands = single slash prompt. MCP = live systems. Don't duplicate — cross-reference.

Why this matters

Teams stuff skills content into CLAUDE.md and bloat every session.

CLAUDE.md: stack, test commands, architecture. Skill: 'how to run release' with scripts. Command: `/release-check`. MCP: GitHub issues live lookup.

Workflow — do this next

  1. 01Audit CLAUDE.md length — move procedures to skills.
  2. 02One skill per SOP >3 steps.
  3. 03Link skill from CLAUDE.md index.

5.5

Skill scripts & security review

Executable scripts in skills — review for network calls, secrets, and supply chain before team adoption

Key takeaway

Skills can include scripts Claude executes — code review skills like application code; no curl-to-unknown URLs; pin dependencies.

Why this matters

Malicious or sloppy skill scripts are prompt injection with shell access.

Review checklist: network egress, file paths, env var reads, subprocess calls. Run in CI sandbox. Sign internal skills; vet third-party like any dependency.

Workflow — do this next

  1. 01PR review for new skills mandatory.
  2. 02Run scripts in isolated CI.
  3. 03Block skills from unverified sources on managed laptops.

5.6

Skill stacking & composition

Multiple skills on one task — how Claude coordinates and how to avoid conflicting instructions

Key takeaway

Claude may invoke multiple skills — ensure SKILL.md scopes don't conflict; use explicit task prompt to pick primary skill.

Why this matters

Overlapping skills cause format wars — two skills both defining output schema.

Namespace skill outputs. Primary skill in user prompt: 'Use competitive-brief skill as authority.' Test multi-skill tasks in staging.

Workflow — do this next

  1. 01List active skills per project.
  2. 02Resolve overlapping descriptions.
  3. 03Integration test multi-skill workflow.

5.7

Cowork & plugins

Desktop automation with full MCP and plugin support — multi-app workflows beyond Claude Code

Key takeaway

Cowork supports MCP + plugins for cross-app desktop automation — compare to Zapier for UI-reasoning tasks vs event triggers.

Why this matters

Ops roles need Cowork doc — not only Claude Code for engineers.

Map workflow across email, files, browser. Install plugin bundle. Human checkpoints between write steps. Document vs Chrome: Cowork multi-app; Chrome single-browser.

Workflow — do this next

  1. 01Pilot Cowork on one ops workflow.
  2. 02Install required plugin.
  3. 03Measure reliability vs manual.

5.8

Skills lifecycle & versioning

Authoring, testing, deploying, deprecating skills across Claude.ai, Code, and API

Key takeaway

Skills lifecycle: git version → test golden paths → deploy (upload/API/Code sync) → registry entry → quarterly review → deprecate with migration note.

Why this matters

Undeprecated skills become silent SOP drift across surfaces.

Semantic version in SKILL.md frontmatter. Changelog section. Deprecation: mark description 'DEPRECATED — use X'. Run regression when Anthropic updates pre-built skills.

Workflow — do this next

  1. 01Skills repo with CODEOWNERS.
  2. 02CI: lint SKILL.md frontmatter.
  3. 03Quarterly skill audit in workflow registry.

Ready-to-use artifacts

Complete templates — paste directly into your AI tool or automation workflow.

TASK.md template for Claude Code

Attach to every agent delegation.

# TASK: [title]

## Goal
[one paragraph]

## Acceptance criteria
- [ ] ...

## Scope
Files/modules IN: ...
Files/modules OUT: ...

## Commands
test: `...`
lint: `...`

## Stop condition
[e.g. all tests green]

## Notes
[links, tickets, constraints]

Engineering agent policy (summary)

ALLOWED: feature branches, tests, docs, refactors in scope
FORBIDDEN: direct main commits, prod creds, unreviewed merge
REQUIRED: TASK.md, PR review, CI green, CLAUDE.md maintained
MCP: read-only default; writes need charter (Ch 7)

Claude Code setup checklist

□ Official install completed
□ Auth verified (API or subscription per policy)
□ Git repo initialized / cloned
□ Working branch created (not main)
□ CLAUDE.md stub added (see 3.2)
□ Team security policy acknowledged
□ First task: read-only explain

CLAUDE.md template

# Project context for Claude Code

## Stack
[language, framework, versions]

## Commands
- install: `...`
- test: `...`
- lint: `...`

## Architecture
[5–10 lines + key paths]

## Conventions
[style, branching, commit format]

## Do not modify
[generated dirs, vendor, secrets]

## Testing
[how to run, coverage expectations]

Agent PR review checklist

□ TASK.md / ticket linked
□ Diff size reviewable
□ Tests added/updated meaningfully
□ No secrets / env committed
□ CI green
□ Human reviewer (not agent-only)
□ Security-sensitive paths double-reviewed

Agent scaffold outline (conceptual)

run_agent(goal):
  state = init()
  while not done and steps < MAX:
    response = claude(messages, tools)
    if tool_call:
      result = execute_tool(tool_call)
      messages.append(result)
    else:
      return response
  raise MaxStepsError

Fintech API team — Claude Code at scale

40 engineers, legacy Java services, slow onboarding. Copilot helped lines but not cross-file migrations. Ad hoc Claude.ai paste caused inconsistent fixes.

Before

No CLAUDE.md, no TASK template, agents on main, review fatigue, 2 hotfix reverts from agent PRs.

After

Claude Code rollout: CLAUDE.md per service, TASK.md mandatory, feature/agent-* branches, agent PR checklist, read-only week 1. Built internal research agent (Concept 4.4) for compliance doc scans.

  • Median bug-fix time → down 35% on bounded tasks
  • Agent PR revert rate → 8% to <2% after policy
  • New engineer time-to-first-PR → 10 days to 4
  • Custom agent compliance scan → 12 hrs/week analyst time saved

What goes wrong

Vague tasks → huge unreviewable diffs.

TASK.md with scope + acceptance criteria (1.8, artifact).

No CLAUDE.md → agent re-discovers repo every session.

3.2 template; owner per service.

Auto-approve destructive shell commands.

1.6 permissions; branch-only policy.

Building multi-agent before single-agent loop works.

4.2 scaffold + 4.4 research agent first.


Portrait of Krishna Kumar, Curator

Vetted by Krishna KumarCurator, FactorBeam


Discussion

Discussion coming soon

Shared comments for this playbook are not live yet. When they are, you'll be able to ask questions, share what worked, and see replies from other readers.