System Prompt Assembly

Not one string, an assembly

Most people’s mental model of “system prompt”: a piece of text the developer writes, passed in when calling the model.

Claude Code’s system prompt isn’t that. It’s an assembly of layers injected in order, with explicit types, explicit boundary markers, and registry-based organization in the source. It’s not a one-time “write some text” task — it’s an assembly system.

This chapter unpacks that assembly system. Every claim is tied to a source path (in parentheses).

Source entry points: three functions and a boundary constant

Locate the code first. Claude Code’s system prompt is assembled by three functions:

getSystemPrompt(tools, model, ...) (constants/prompts.ts) — returns string[], the static + dynamic sections
buildEffectiveSystemPrompt({ ... }) (utils/systemPrompt.ts) — picks between override / agent / custom / default
getSystemContext() + getUserContext() (context.ts) — fetches git status / CLAUDE.md / date

Before sending to the model, getCacheSharingParams (in commands/compact/compact.ts) combines them.

Key constant:

// constants/prompts.ts
export const SYSTEM_PROMPT_DYNAMIC_BOUNDARY = '__SYSTEM_PROMPT_DYNAMIC_BOUNDARY__'

This is a literal string marker placed in the prompt array, explicitly separating static (cacheable) from dynamic (may change) content. The Anthropic API uses this marker for global cache scope. Details below.

Five-layer priority: whose prompt wins

buildEffectiveSystemPrompt implements an explicit priority ladder (the source comment literally says this):

0. Override system prompt (loop mode, REPLACES all others)
1. Coordinator system prompt (when coordinator mode is active)
2. Agent system prompt (when mainThreadAgentDefinition is set)
   - In proactive mode: APPENDED to default (agents add domain behavior on top)
   - Otherwise: REPLACES default
3. Custom system prompt (via --system-prompt)
4. Default system prompt (the standard Claude Code prompt)

appendSystemPrompt is always appended at the end (except with override, which replaces everything).

Design decisions worth noting:

Override is nuclear: loop mode and similar scenarios can completely replace the entire system prompt. The harness itself provides a “this session isn’t conventional usage” escape hatch
Proactive / Autonomous mode is special: the agent prompt appends to default rather than replacing — autonomous mode’s default is already lean (identity + memory + env + proactive section), and the agent adds domain-specific behavior that doesn’t conflict with the autonomy layer
--system-prompt CLI flag: users can override the default prompt from the command line. This is an observable product capability, not a hidden interface

This poses a design question for your own agent: how many layers of authority does your prompt system have? If it’s one layer (default + some config), “loop mode”-class special scenarios have nowhere to land, and you’ll be forced to cram special instructions into the default, polluting regular sessions.
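
The ladder above can be sketched as a single resolver function. This is a minimal illustration, not the real buildEffectiveSystemPrompt: the PromptInputs shape and the resolvePrompt name are hypothetical, and it assumes the coordinator prompt replaces the default (the source comment quoted above does not say either way).

```typescript
// Hypothetical input shape. Field names mirror the ladder but are assumptions,
// not the real signature of buildEffectiveSystemPrompt.
interface PromptInputs {
  overridePrompt?: string;     // 0. loop mode
  coordinatorPrompt?: string;  // 1. coordinator mode
  agentPrompt?: string;        // 2. main-thread agent definition
  proactive?: boolean;         //    append instead of replace
  customPrompt?: string;       // 3. --system-prompt
  defaultPrompt: string[];     // 4. standard prompt sections
  appendPrompt?: string;       // appended last, except under override
}

function resolvePrompt(p: PromptInputs): string[] {
  // Override replaces everything, including the append.
  if (p.overridePrompt !== undefined) return [p.overridePrompt];

  let base: string[];
  if (p.coordinatorPrompt !== undefined) {
    base = [p.coordinatorPrompt]; // assumption: replaces the default
  } else if (p.agentPrompt !== undefined) {
    base = p.proactive ? [...p.defaultPrompt, p.agentPrompt] : [p.agentPrompt];
  } else if (p.customPrompt !== undefined) {
    base = [p.customPrompt];
  } else {
    base = p.defaultPrompt;
  }
  return p.appendPrompt !== undefined ? [...base, p.appendPrompt] : base;
}
```

Keeping the ladder in one function means a new special mode becomes one more branch with an obvious position, rather than a conditional buried inside the default prompt text.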

Static / Dynamic boundary: the cache line

getSystemPrompt returns a layout like this:

// constants/prompts.ts around line 560-577
return [
  // --- Static content (cacheable) ---
  getSimpleIntroSection(outputStyleConfig),
  getSimpleSystemSection(),
  getSimpleDoingTasksSection(),      // unless output style overrides
  getActionsSection(),
  getUsingYourToolsSection(enabledTools),
  getSimpleToneAndStyleSection(),
  getOutputEfficiencySection(),
  // === BOUNDARY MARKER - DO NOT MOVE OR REMOVE ===
  ...(shouldUseGlobalCacheScope() ? [SYSTEM_PROMPT_DYNAMIC_BOUNDARY] : []),
  // --- Dynamic content (registry-managed) ---
  ...resolvedDynamicSections,
].filter(s => s !== null)

The comment literally says “BOUNDARY MARKER - DO NOT MOVE OR REMOVE”. Why this boundary matters:

Before the marker: static — same every session, hits prompt cache across sessions
After the marker: dynamic — CLAUDE.md may change, MCP servers may connect / disconnect, session guidance may change every turn
Cache scope: the Anthropic API applies global cache scope to everything before the marker; hit rates are far higher than per-session cache

This is the “late binding” principle of prompt assembly expressed directly in code: content that may change is bound as late as possible so the stable prefix stays cacheable. It is a hard line drawn for cache hit rate, not an aesthetic preference.
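
To make the boundary concrete, here is a minimal sketch of how a consumer could split the assembled array at the marker. The splitAtBoundary helper and the SplitPrompt type are hypothetical, and treating a missing marker as “nothing is globally cacheable” is an assumption of this sketch, not something the source confirms.

```typescript
const SYSTEM_PROMPT_DYNAMIC_BOUNDARY = "__SYSTEM_PROMPT_DYNAMIC_BOUNDARY__";

interface SplitPrompt {
  staticPrefix: string[];  // stable sections, candidates for the widest cache scope
  dynamicSuffix: string[]; // sections that may differ per session or per turn
}

// Hypothetical helper: split the prompt array at the marker and drop the marker.
// Assumption: without a marker, no section is treated as part of the static prefix.
function splitAtBoundary(sections: string[]): SplitPrompt {
  const i = sections.indexOf(SYSTEM_PROMPT_DYNAMIC_BOUNDARY);
  if (i === -1) return { staticPrefix: [], dynamicSuffix: sections };
  return {
    staticPrefix: sections.slice(0, i),
    dynamicSuffix: sections.slice(i + 1),
  };
}
```

Because the marker is an ordinary array element, moving a section across it changes which side of the cache line that section lands on, which is why the source comment warns against moving or removing it.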

The seven static sections (the function names map onto text you can see in any Claude Code session’s system prompt):

#	Function	Content
1	`getSimpleIntroSection`	“You are Claude Code, Anthropic’s official CLI for Claude” + policy
2	`getSimpleSystemSection`	Engineering rules (tool results, hooks, prompt injection detection)
3	`getSimpleDoingTasksSection`	Style rules (when terse, when to ask)
4	`getActionsSection`	Care rules for irreversible ops (git push / rm)
5	`getUsingYourToolsSection`	Tool-use rules (prefer dedicated tools, parallel calls)
6	`getSimpleToneAndStyleSection`	Tone, no emojis, `file_path:line_number` refs
7	`getOutputEfficiencySection`	Output efficiency rules

These 7 sections almost never change — only Anthropic releases touch them. So they’re the most stable part of the cache prefix.

Dynamic sections: registry-based organization

After the marker come the dynamic sections. The source uses a registry pattern (systemPromptSection + resolveSystemPromptSections); each has a name, a lazy producer, and a cache policy:

const dynamicSections = [
  systemPromptSection('session_guidance', () =>
    getSessionSpecificGuidanceSection(enabledTools, skillToolCommands),
  ),
  systemPromptSection('memory', () => loadMemoryPrompt()),
  systemPromptSection('ant_model_override', () => getAntModelOverrideSection()),
  systemPromptSection('env_info_simple', () =>
    computeSimpleEnvInfo(model, additionalWorkingDirectories),
  ),
  systemPromptSection('language', () => getLanguageSection(settings.language)),
  systemPromptSection('output_style', () =>
    getOutputStyleSection(outputStyleConfig),
  ),
  // DANGEROUS: cache-busting. MCP servers connect/disconnect between turns.
  DANGEROUS_uncachedSystemPromptSection(
    'mcp_instructions',
    () => isMcpInstructionsDeltaEnabled() ? null : getMcpInstructionsSection(mcpClients),
    'MCP servers connect/disconnect between turns',
  ),
  systemPromptSection('scratchpad', () => getScratchpadInstructions()),
  systemPromptSection('frc', () => getFunctionResultClearingSection(model)),
  systemPromptSection(
    'summarize_tool_results',
    () => SUMMARIZE_TOOL_RESULTS_SECTION,
  ),
  // Ant-only A/B experiment — see below.
  ...(process.env.USER_TYPE === 'ant' ? [
    systemPromptSection('numeric_length_anchors', () =>
      'Length limits: keep text between tool calls to ≤25 words. Keep final responses to ≤100 words unless the task requires more detail.',
    ),
  ] : []),
  // Feature-flagged.
  ...(feature('TOKEN_BUDGET') ? [ ... ] : []),
  ...(feature('KAIROS') || feature('KAIROS_BRIEF') ? [ ... ] : []),
]

`DANGEROUS_uncachedSystemPromptSection` — explicitly “this breaks cache”

There’s a special factory function DANGEROUS_uncachedSystemPromptSection — literally “dangerous non-cacheable dynamic section”. Currently only mcp_instructions uses it. The comment explains:

MCP servers connect/disconnect between turns

Because of this, the MCP instruction section can’t be cached and is recomputed every turn. The source names the cost of this design choice in the identifier itself, so any PR that touches this section sees the “DANGEROUS” label.

Takeaway for your own agent: make cache-break points explicit. “This section breaks cache” is much better when visible in code than hidden — the latter invites teammates to silently contribute cache-breaking sections.
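
A registry with an explicit cache-break flag can be sketched in a few lines. The factory and resolver names below are taken from the source, but their bodies, the PromptSection shape, and the name-keyed sectionCache are assumptions made for illustration. Producers are synchronous here for brevity.

```typescript
type Producer = () => string | null;

// Hypothetical section record: name + lazy producer + cache policy.
interface PromptSection {
  name: string;
  produce: Producer;
  cacheBreak: boolean; // true = recompute on every resolve
  reason?: string;     // justification, required by the DANGEROUS factory
}

const systemPromptSection = (name: string, produce: Producer): PromptSection =>
  ({ name, produce, cacheBreak: false });

const DANGEROUS_uncachedSystemPromptSection = (
  name: string,
  produce: Producer,
  reason: string,
): PromptSection => ({ name, produce, cacheBreak: true, reason });

// Assumption of this sketch: one cache per process, keyed by section name.
const sectionCache = new Map<string, string | null>();

function resolveSystemPromptSections(sections: PromptSection[]): string[] {
  const out: string[] = [];
  for (const s of sections) {
    let value: string | null;
    if (!s.cacheBreak && sectionCache.has(s.name)) {
      value = sectionCache.get(s.name) ?? null;
    } else {
      value = s.produce(); // lazy: runs only when the section is resolved
      if (!s.cacheBreak) sectionCache.set(s.name, value);
    }
    if (value !== null) out.push(value); // null sections are dropped
  }
  return out;
}
```

The mandatory reason argument is the point of the design: a cache-breaking section cannot be added without writing down why, and the justification shows up in code review.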

Data-driven prompt engineering: `numeric_length_anchors`

Note the ant-only section in the dynamic list above:

// Numeric length anchors — research shows ~1.2% output token reduction vs
// qualitative "be concise". Ant-only to measure quality impact first.
systemPromptSection(
  'numeric_length_anchors',
  () => 'Length limits: keep text between tool calls to ≤25 words. Keep final responses to ≤100 words unless the task requires more detail.',
)

“Length limits: keep text between tool calls to ≤25 words. Keep final responses to ≤100 words”: per the gate in the dynamic list, this line is added to the system prompt only in ant-only sessions (process.env.USER_TYPE === 'ant'), not in every session. The source comment explicitly records:

research shows ~1.2% output token reduction vs qualitative “be concise”

That is a roughly 1.2% reduction in output tokens for numeric phrasing compared with the qualitative “be concise.” The section ships ant-only first so its quality impact can be measured before a global rollout is considered.

Takeaway for your own agent: every instruction in the prompt should have a “why this wording” answer. The source comment suggests a working practice along these lines:

New prompt instructions must point to data or experiment results
Old prompt instructions must enter A/B comparison to verify they’re still alive
Quantified wordings (“≤25 words”) often outperform qualitative ones (“be concise”) — measure first, then roll out

Git Status: snapshot in time

The env info in the dynamic sections comes from getSystemContext() (context.ts). Source worth reading:

// context.ts: getGitStatus()
const [branch, mainBranch, status, log, userName] = await Promise.all([
  getBranch(),
  getDefaultBranch(),
  execFileNoThrow(gitExe(), ['--no-optional-locks', 'status', '--short'], ...),
  execFileNoThrow(gitExe(), ['--no-optional-locks', 'log', '--oneline', '-n', '5'], ...),
  execFileNoThrow(gitExe(), ['config', 'user.name'], ...),
])

const MAX_STATUS_CHARS = 2000
const truncatedStatus = status.length > MAX_STATUS_CHARS
  ? status.substring(0, MAX_STATUS_CHARS) +
    '\n... (truncated because it exceeds 2k characters. If you need more information, run "git status" using BashTool)'
  : status

Details that matter:

5 git commands run concurrently — Promise.all, not serial
--no-optional-locks — doesn’t block other git ops (important: session startup shouldn’t lock the repo)
MAX_STATUS_CHARS = 2000 truncation — if exceeded, placeholder says “run git status using BashTool,” telling the agent this was truncated
getSystemContext is memoized — computed once at session start, doesn’t update during conversation (the system prompt even states “this status is a snapshot in time”)

Git status is skipped when either of the following holds:

CLAUDE_CODE_REMOTE env var — cloud resume skips for performance
shouldIncludeGitInstructions() returns false
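
The truncation and concurrency described in this section can be reproduced in isolation. In this sketch the RunGit type and snapshotGit function are hypothetical stand-ins (the real code calls execFileNoThrow against a git binary); only MAX_STATUS_CHARS and the truncation message come from the source.

```typescript
const MAX_STATUS_CHARS = 2000;

// Cap the status text and tell the model explicitly that it was cut.
function truncateStatus(status: string): string {
  if (status.length <= MAX_STATUS_CHARS) return status;
  return (
    status.substring(0, MAX_STATUS_CHARS) +
    '\n... (truncated because it exceeds 2k characters. If you need more information, run "git status" using BashTool)'
  );
}

// Hypothetical runner type standing in for execFileNoThrow, so the sketch
// does not depend on a real git binary.
type RunGit = (args: string[]) => Promise<string>;

async function snapshotGit(run: RunGit): Promise<{ status: string; log: string }> {
  // Both commands start immediately; latency is that of the slower one.
  const [status, log] = await Promise.all([
    run(["--no-optional-locks", "status", "--short"]),
    run(["--no-optional-locks", "log", "--oneline", "-n", "5"]),
  ]);
  return { status: truncateStatus(status), log };
}
```

The truncation message doubles as an instruction: it names the tool the agent should use to get the full output, so a clipped snapshot does not silently mislead it.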

CLAUDE.md loading: 6 memory types

The real memory type count in the source isn’t 3. It’s 6, counting the feature-flagged TeamMem (utils/memory/types.ts):

export const MEMORY_TYPE_VALUES = [
  'User',        // ~/.claude/CLAUDE.md
  'Project',     // <repo>/CLAUDE.md (git-tracked)
  'Local',       // CLAUDE.local.md (user's **private** project instructions, not in git)
  'Managed',     // policy / enterprise-managed config
  'AutoMem',     // auto-memory, persists across conversations
  ...(feature('TEAMMEM') ? (['TeamMem'] as const) : []),  // team-shared memory
] as const

Each type’s description suffix in the system prompt (source getClaudeMds lines 1169-1177):

Type	Description suffix
Project	`' (project instructions, checked into the codebase)'`
Local	`" (user's private project instructions, not checked in)"`
TeamMem	`' (shared team memory, synced across the organization)'`
AutoMem	`" (user's auto-memory, persists across conversations)"`
User	`" (user's private global instructions for all projects)"`

Load order

getMemoryFiles() (utils/claudemd.ts line 790+) order:

Managed (always loaded first, policy-level)
Managed .claude/rules/*.md
User (if userSettings is enabled)
User ~/.claude/rules/*.md
Walk from filesystem root down to CWD, checking at each level:
- CLAUDE.md (Project)
- .claude/CLAUDE.md (Project)
- .claude/rules/*.md (Project)
- CLAUDE.local.md (Local)
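
The root-to-CWD walk in the last step can be sketched as a pure function over POSIX paths. The walkCandidates name and Candidate type are hypothetical; the sketch only lists candidate paths in order and leaves out the .claude/rules/*.md glob and all filesystem access.

```typescript
type WalkedType = "Project" | "Local";

interface Candidate {
  path: string;
  type: WalkedType;
}

// Lists candidate files from the filesystem root down to cwd, so files closer
// to cwd come later in the result. POSIX-style absolute paths only.
function walkCandidates(cwd: string): Candidate[] {
  const parts = cwd.split("/").filter(p => p.length > 0);
  const dirs: string[] = [""]; // "" stands for the root; "/" is added on join
  for (const p of parts) dirs.push(`${dirs[dirs.length - 1]}/${p}`);

  const out: Candidate[] = [];
  for (const d of dirs) {
    out.push({ path: `${d}/CLAUDE.md`, type: "Project" });
    out.push({ path: `${d}/.claude/CLAUDE.md`, type: "Project" });
    // .claude/rules/*.md would be globbed here; omitted to avoid filesystem access.
    out.push({ path: `${d}/CLAUDE.local.md`, type: "Local" });
  }
  return out;
}
```

For a CWD of /repo/pkg this yields nine candidates, starting at /CLAUDE.md and ending at /repo/pkg/CLAUDE.local.md.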

Nested worktree rule

The source has an awkward-looking but sensible branch (lines 868-884):

// When running from a git worktree nested inside its main repo, the upward
// walk passes through both the worktree root AND the main repo root. Both
// contain checked-in files like CLAUDE.md... Skip Project-type files from
// directories above the worktree but within the main repo.

Translation: if you run Claude Code from a worktree nested inside the main repo, walking upward encounters the main repo root, which also has CLAUDE.md. Without special handling, the same rules load twice. The source explicitly skips.

Issue reference: github.com/anthropics/claude-code/issues/29599 — a real bug fix, not a hypothetical.
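
The skip rule reduces to a small path predicate. This is a sketch under stated assumptions: isWithin and shouldSkipProjectDir are hypothetical names, paths are POSIX-style without trailing slashes, and how the real code discovers the worktree and main repo roots is not shown.

```typescript
// True when dir equals root or lies beneath it.
function isWithin(dir: string, root: string): boolean {
  return dir === root || dir.startsWith(root + "/");
}

// Skip Project-type files from directories that are inside the main repo
// but above (outside) the nested worktree.
function shouldSkipProjectDir(
  dir: string,
  worktreeRoot: string,
  mainRepoRoot: string,
): boolean {
  const nested = worktreeRoot !== mainRepoRoot && isWithin(worktreeRoot, mainRepoRoot);
  if (!nested) return false; // ordinary checkout or sibling worktree: no special case
  return isWithin(dir, mainRepoRoot) && !isWithin(dir, worktreeRoot);
}
```

Directories above the main repo are still walked as usual; only the band between the main repo root and the worktree root is excluded.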

MEMORY_INSTRUCTION_PROMPT: the override declaration

CLAUDE.md content is prepended with a literal string instruction (utils/claudemd.ts line 89):

const MEMORY_INSTRUCTION_PROMPT =
  'Codebase and user instructions are shown below. Be sure to adhere to these instructions. ' +
  'IMPORTANT: These instructions OVERRIDE any default behavior and you MUST follow them exactly as written.'

You see this text in every Claude Code session. Design intent: override — user/project authority is higher than the default prompt. The 5-layer priority ladder from above combines with the CLAUDE.md layer here:

Default rules < CLAUDE.md / memory < session guidance < user’s current message

Later layers override earlier. The clearer the hierarchy, the fewer self-contradictions the agent hits.
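
Putting the override declaration and the per-type suffixes together, the memory block could be rendered roughly as follows. The MEMORY_INSTRUCTION_PROMPT text and the suffix strings are quoted from the source; the renderMemory function, the MemoryFile shape, and the “Contents of …” layout are assumptions of this sketch.

```typescript
const MEMORY_INSTRUCTION_PROMPT =
  "Codebase and user instructions are shown below. Be sure to adhere to these instructions. " +
  "IMPORTANT: These instructions OVERRIDE any default behavior and you MUST follow them exactly as written.";

type LabeledType = "Project" | "Local" | "TeamMem" | "AutoMem" | "User";

// Suffix strings as listed in the table above.
const TYPE_SUFFIX: Record<LabeledType, string> = {
  Project: " (project instructions, checked into the codebase)",
  Local: " (user's private project instructions, not checked in)",
  TeamMem: " (shared team memory, synced across the organization)",
  AutoMem: " (user's auto-memory, persists across conversations)",
  User: " (user's private global instructions for all projects)",
};

interface MemoryFile {
  path: string;
  type: LabeledType;
  content: string;
}

// Hypothetical layout: the override header first, then one labeled block per
// file, in load order.
function renderMemory(files: MemoryFile[]): string {
  if (files.length === 0) return "";
  const blocks = files.map(
    f => `Contents of ${f.path}${TYPE_SUFFIX[f.type]}:\n\n${f.content}`,
  );
  return [MEMORY_INSTRUCTION_PROMPT, ...blocks].join("\n\n");
}
```

Labeling each block with its origin lets the model tell a checked-in project rule from a private user preference when the two disagree.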

Per-file size cap

// utils/claudemd.ts line 92
export const MAX_MEMORY_CHARACTER_COUNT = 40000

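Any single CLAUDE.md over 40k characters gets truncated. This is a concrete engineering constraint: CLAUDE.md files in repos with very large teams grow easily, and 40k characters is roughly 10k tokens at the common rule of thumb of about 4 characters per token.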

Env killers

Claude Code exposes several environment variables and CLI flags that bypass memory loading or other parts of prompt assembly:

Variable / flag	Effect
`CLAUDE_CODE_DISABLE_CLAUDE_MDS`	Fully disable CLAUDE.md loading (hard off)
`--bare` CLI flag	Skip auto-discovery (CWD upward walk), but honor `--add-dir` explicit directives
`CLAUDE_CODE_REMOTE`	Skip git status (resume scenarios save overhead)
`CLAUDE_CODE_SIMPLE`	Use minimal system prompt (identity + CWD + date only)

The source comment on --bare semantics is worth learning from: --bare means "skip what I didn't ask for", not "ignore what I asked for", so explicit --add-dir still works. It’s a careful semantic choice: flags should express user intent rather than be applied mechanically.
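
The interaction between the hard-off variable and --bare can be sketched as one planning function. Everything except the variable and flag names is hypothetical here: the MemoryPlan shape, the isSet truthiness rule, and the assumption that the hard-off variable also disables --add-dir directories are illustrative choices, not confirmed behavior.

```typescript
type Env = Record<string, string | undefined>;

interface CliFlags {
  bare: boolean;     // --bare
  addDirs: string[]; // --add-dir values
}

interface MemoryPlan {
  loadAny: boolean;       // false = hard off
  autoDiscover: boolean;  // the CWD upward walk
  explicitDirs: string[]; // directories the user asked for explicitly
}

// Assumption: a variable counts as set when non-empty and not "0" or "false".
const isSet = (v: string | undefined): boolean =>
  v !== undefined && v !== "" && v !== "0" && v.toLowerCase() !== "false";

function planMemoryLoading(env: Env, flags: CliFlags): MemoryPlan {
  if (isSet(env["CLAUDE_CODE_DISABLE_CLAUDE_MDS"])) {
    // Hard off. Assumed here to win over every other input.
    return { loadAny: false, autoDiscover: false, explicitDirs: [] };
  }
  // --bare skips what the user did not ask for, but keeps what they did.
  return { loadAny: true, autoDiscover: !flags.bare, explicitDirs: flags.addDirs };
}
```

Passing the environment in as a parameter, instead of reading a global, keeps the decision easy to test for each combination of flags.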

Cache-break command (ant-only)

In context.ts there’s also a debugging-oriented mechanism:

// ant-only, ephemeral debugging state
let systemPromptInjection: string | null = null

export function setSystemPromptInjection(value: string | null): void {
  systemPromptInjection = value
  getUserContext.cache.clear?.()
  getSystemContext.cache.clear?.()
}

With the ant-only BREAK_CACHE_COMMAND feature enabled, calling setSystemPromptInjection("any text") inserts [CACHE_BREAKER: any text] into the system context — intentionally breaking the prompt cache. Used for debugging: what happens on cache miss?

Takeaway for your own agent: the ability to intentionally break cache should be built in for debugging. Production ops often asks “what if this prompt were cold today?” A one-click cache-bust beats editing code.
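
A self-contained version of the same pattern: a memoized getter with a clearable cache, plus a setter that invalidates it. The memoizeOnce helper is a hypothetical stand-in for whatever memoize utility the source uses, and the context object here is a placeholder, not the real system context.

```typescript
interface Memoized<T> {
  (): T;
  cache: { clear(): void };
}

// Hypothetical zero-argument memoizer exposing cache.clear().
function memoizeOnce<T>(fn: () => T): Memoized<T> {
  let cached: { value: T } | null = null;
  const wrapped = (): T => {
    if (cached === null) cached = { value: fn() };
    return cached.value;
  };
  return Object.assign(wrapped, { cache: { clear: () => { cached = null; } } });
}

let systemPromptInjection: string | null = null;
let computeCount = 0; // instrumentation for the sketch only

const getSystemContext = memoizeOnce((): Record<string, string> => {
  computeCount++;
  const ctx: Record<string, string> = { note: "placeholder context" };
  if (systemPromptInjection !== null) {
    // Any new text here changes the prompt, which forces a cache miss.
    ctx["cacheBreaker"] = `[CACHE_BREAKER: ${systemPromptInjection}]`;
  }
  return ctx;
});

function setSystemPromptInjection(value: string | null): void {
  systemPromptInjection = value;
  getSystemContext.cache.clear(); // the next call recomputes the context
}
```

Clearing the memoized getter in the setter is what makes the switch take effect mid-session. Without it, the injected text would sit unused until the next restart.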

Tool schemas + error text: also prompts

(A brief recap, with source evidence.)

Tool descriptions themselves are part of the system prompt — injected via getUsingYourToolsSection(enabledTools). Each tool’s description / schema / error text went through Anthropic’s prompt engineering iteration.

Example from the Edit tool description:

“You must use your Read tool at least once in the conversation before editing” — teaches the agent to read before editing
“This tool will error if you attempt an edit without reading the file” — error text also guides the next action
“The edit will FAIL if old_string is not unique in the file” — preemptive failure-condition warning

Takeaway for your own agent: well-written error text lets the agent self-correct; poorly written error text (“Invalid argument”) leaves it retrying in a loop.
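
As an illustration of error text that names the next action, here is a hypothetical pre-flight check for an edit tool. The checkEdit function and all three error strings are invented for this sketch. They follow the spirit of the Edit tool description quoted above, but they are not its actual messages.

```typescript
type EditCheck = { ok: true } | { ok: false; error: string };

// Hypothetical validation. Each error says what went wrong and what to do next.
function checkEdit(content: string, oldString: string, fileWasRead: boolean): EditCheck {
  if (!fileWasRead) {
    return {
      ok: false,
      error: "File has not been read yet. Use the Read tool on this file, then retry the edit.",
    };
  }
  // Count non-overlapping occurrences of oldString in the file.
  const occurrences = oldString === "" ? 0 : content.split(oldString).length - 1;
  if (occurrences === 0) {
    return {
      ok: false,
      error: "old_string was not found in the file. Re-read the file and copy the exact text, including whitespace.",
    };
  }
  if (occurrences > 1) {
    return {
      ok: false,
      error: `old_string matches ${occurrences} locations. Add surrounding lines to old_string until it is unique.`,
    };
  }
  return { ok: true };
}
```

Each message gives the agent a different recovery path (read first, re-copy the text, or widen the match), which a generic failure string cannot do.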

Full assembly order (real-session observation)

The 10 layers in the diagram are the user-observable, session-level layering. At the source level there are more pieces: 7 static sections + 13 dynamic sections + context getter injections. The diagram aggregates them for reader comprehension.

Takeaways for building your own agent

System prompt is an assembly, not a string. Static sections (cache-friendly) + dynamic sections (possibly varying per turn), separated by an explicit boundary marker: cache hit rate optimization expressed in code
Use a registry for dynamic sections. Name + lazy producer + cacheability flag per section. Stuffing strings into a list inevitably breaks
Make cache-breaking explicit. Claude Code’s DANGEROUS_uncachedSystemPromptSection name is deliberately scary — makes cache-breaking visible and reviewable
Multi-layer priority ladder (override > coordinator > agent > custom > default) + append mechanism (appendSystemPrompt). Single-layer authority breaks in edge cases
CLAUDE.md-style mechanism: give project / user / enterprise / auto-memory their own prompt insertion points, with an explicit “OVERRIDE” declaration out front, or every project has to negotiate with the defaults first
Env killers: CLAUDE_CODE_DISABLE_* env vars aren’t luxuries — they’re emergency brakes. Always have a one-flag-off option
Prompt instructions need data support. The ~1.2% figure behind numeric_length_anchors is exemplary; rules piled up on qualitative judgment alone eventually spiral out of control
Tool schemas + error text are prompt engineering tasks, not incidental schema docs — critical input for the agent’s self-correction
Memoize session-stable content. getSystemContext / getUserContext are both memoized — compute once for things that don’t change during the session