Applying to Your Agent (AI SDK)

Chapter framing

Prior chapters covered what Claude Code does. This chapter’s goal is different:

For external readers: guidance on landing CC’s principles into specific AI SDK hook lines
For the Zapvol team: using ourselves as a concrete case — current state assessment + evolution direction

Both goals served by the same structure — three parts per embedding point:

Best practice (what AI SDK + CC teach us)
Zapvol current state (Pass / Partial / Missing, cited from source)
Evolution direction (concrete deliverables + priority)

This is not a “look how great Zapvol is” showcase — it’s a roadmap that admits gaps and points forward.

Reading assumption: you are on AI SDK v6 (ai@^6) and familiar with streamText / generateText / prepareStep / experimental_onToolCallFinish / stopWhen.

Layer 1: what to copy, what to discard

Not every CC pattern is worth copying. Getting the boundary wrong leads to “I also built a git worktree thing” — mechanism-level mimicry.

CC pattern	Transferability	Why
Worktree / CCR / Bash sandbox	Coding only	Mechanism depends on git / Anthropic SaaS / POSIX
CLAUDE.md directory walk	Partial	Abstract as “layered prompt injection points” and it transfers
Git status / File tools	Coding only	Tool selection
Async generator main loop	Universal	Baseline for streaming UI
Terminal reason enum	Universal	Different terminations → different UX
System prompt static/dynamic boundary	Universal	Cache hit rate hard red line
Compaction cascade (cheap → expensive)	Universal principle	N tiers is your choice
MEMORY.md index pattern	Universal	Only scale path for unbounded storage
Multi-axis memory taxonomy	Universal	Who writes × what stores × how fast ages
Subagent as context firewall	Universal	Any agent with ReAct loops needs this
Hook vs prompt layering	Universal	Process-level can’t rely on prompts
Tombstone streaming retraction	Universal	Any streaming UI
Circuit breaker	Universal	Any system with retries
Data-driven prompt engineering	Universal (meta-principle)	Copy the culture, not the numbers

Core: mechanisms aren’t universal, principles are. Worktree’s mechanism is useless for non-coding agents; but the role it plays (“disposable isolation workspace”) transfers to any agent needing isolated trial-and-error.

The 8 AI SDK embedding points

AI SDK v6’s streamText hides the loop internally, exposing only a few hooks. All CC patterns must fit into these points:

Point	Location	Role
A	Call-site assembly	`system` / `messages` / `tools` / `providerOptions`
B	`prepareStep`	Per-step input preprocessing (compaction / budget / cache control)
C	Streaming consumption	Consume stream events
D	`tool.execute`	Where tools actually run (permission / hook / sandbox)
E	`experimental_onToolCallFinish`	After tool completes (microcompact entry)
F	`stopWhen`	Continue or stop
G	`onFinish`	Aggregate usage / persist messages
H	Post-call	Session snapshot / background tasks

Below: each point in the three-part structure.

Point A · Call-site assembly

Best practice

system byte-stable: don’t concat Date.now() / userName / gitStatus into the system string — one char change invalidates the entire prompt cache
Dynamic context in the messages layer: CC places currentDate in auto memory, not system
Explicit providerOptions.cacheControl breakpoints: AI SDK won’t add them for you
Tool description is prompt engineering: write each one teaching “what the agent should do next” (see CC’s Edit tool)
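Step ceiling that is bounded but not too small: 30-50 steps is a reasonable everyday range (since AI SDK v5 this is expressed as stopWhen: stepCountIs(N) rather than the older maxSteps option)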

Zapvol current state

Partial · one potential cache risk needs audit

Pass · Tool descriptions are generally written tutorial-style (see filesystem.compact.ts, browser.tool.ts), copying CC’s approach
Pass · Explicit providerOptions cache control: applyCacheControl(compacted, model, { extraBreakpointAt }) called every prepareStep
Partial · appendSystemContext(systemPrompt, systemContext) (agent-round.ts) concatenates dynamic systemContext to system’s tail — if systemContext contains any per-turn-changing fields (currentDate / gitStatus / timestamps), the entire system hashes differently every turn, all prior cache breakpoints invalidated. Worth auditing (not listed in .claude/design/compaction-redesign.md but significant)
Partial · BUDGET_RATIO = 0.2 (compaction/config.ts:15) vs doc stating 0.8 vs CC’s ~93% — under review (redesign §2), not Point A specifically but affects threshold discussions

Evolution direction

P0 · Audit appendSystemContext content composition

Add a logging line to verify cache stability:

import { createHash } from "node:crypto"

// Before applyCacheControl: log a short fingerprint of the full system prompt
log.info("cache.system_hash", {
  hash: createHash("sha256").update(fullSystemPrompt).digest("hex").slice(0, 16),
  step: steps.length,
})

Run a 10-step task and inspect the hash:

Stable → Pass: the current design is fine
Changes each step → Missing: a refactor is required. Move the dynamic parts (currentDate / ctx info) out of systemContext and into the head of messages as a user message
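
The refactor in the second branch can be sketched roughly as follows. This is an illustration, not Zapvol's code: `buildCallInput`, `DynamicContext`, and the small FNV-1a `fingerprint` helper are hypothetical names, and the message shape is simplified to `{ role, content }`.

```typescript
// Hypothetical sketch: keep `system` byte-stable, carry per-turn context in messages.
type Message = { role: "user" | "assistant"; content: string };

interface DynamicContext {
  currentDate: string;
  gitStatus?: string;
}

// Tiny FNV-1a fingerprint, used only to compare system strings across turns.
function fingerprint(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

function buildCallInput(
  staticSystem: string,
  dynamic: DynamicContext,
  history: Message[],
): { system: string; messages: Message[] } {
  const lines = [`Current date: ${dynamic.currentDate}`];
  if (dynamic.gitStatus) lines.push(`Git status: ${dynamic.gitStatus}`);
  // Dynamic fields travel as a leading user message, never inside `system`.
  const contextMessage: Message = { role: "user", content: lines.join("\n") };
  return { system: staticSystem, messages: [contextMessage, ...history] };
}
```

With this split, the fingerprint of `system` stays identical across turns even when the date or git status changes, which is the property the hash log is meant to verify.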

P1 · Layer systemContext contents by stability

Reference CC’s getSystemContext() + getUserContext() separation (see System Prompt Assembly). Each layer memoized — computed once at session start.

Point B · `prepareStep` (where most compaction logic lives)

Best practice

Split by step: step 0 can do heavy lifting (boundary filter + autocompact); later steps only need microcompact — running autocompact every step blows up cost
Partial return: only write { messages }, not { messages, system, tools, toolChoice } — unchanged fields shouldn’t be written to avoid accidental cache breaks
Pass AbortSignal manually to compaction LLM: prepareStep’s signature doesn’t include abortSignal; read from ctx
Blocking check at the top: on hard ceiling return synthetic error for graceful exit, don’t let API throw 413
Hysteresis: don’t trigger activateMoreCrCandidates when over budget by ≤5% — preserves cache stability
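
Two of the points above (partial return and hysteresis) are small enough to sketch directly. The helper names `shouldActivateMoreCandidates` and `toPrepareStepResult` are hypothetical, and the 5% band is taken from the bullet above rather than from any measured value.

```typescript
// Hypothetical sketch of a hysteresis band around the compaction budget.
function shouldActivateMoreCandidates(
  currentTokens: number,
  budgetTokens: number,
  tolerance = 0.05, // assumption: 5% band
): boolean {
  const overshoot = currentTokens - budgetTokens;
  // Within budget, or over by no more than the band: leave the cached prefix untouched.
  return overshoot > budgetTokens * tolerance;
}

// Partial return: only report `messages` when compaction actually produced a new array.
function toPrepareStepResult<M>(original: M[], compacted: M[]): { messages?: M[] } {
  return compacted === original ? {} : { messages: compacted };
}
```

Returning an empty object when nothing changed keeps `system`, `tools`, and `toolChoice` out of the result entirely, so they cannot be altered by accident.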

Zapvol current state

Partial · the core functions exist, but several fine-grained controls are missing

Pass · stepCompactor.apply uniformly called (agent-round.ts:183) — every step goes through the full compaction pipeline
Pass · abortSignal propagation: context.abortController.signal read from context, passed into autocompact / summarizer LLM calls
Pass · Cache breakpoints explicitly marked: applyCacheControl + extraBreakpointAt = compactedPrefixEnd
Missing · No step 0 vs step 1+ differentiation: stepCompactor.apply runs the same logic every step — may over-compact on later steps
Missing · No hysteresis: over-budget by 1 token triggers activateMoreCrCandidates (redesign §P2 #6 lists as todo)
Missing · No blocking synthesis fallback: the only remaining safety net would be reacting to a 413, and that reactive handler is itself missing (see next item)
Missing · No 413 reactive handler: redesign §P1 #4 explicitly lists as gap

Evolution direction

P0 · Add step 0 vs step 1+ differentiation

Currently stepCompactor treats first and subsequent steps identically. Suggested:

// packages/backend/src/agent/compaction/step-compactor.ts
export async function apply(
  messages: UIMessage[],
  options: StepCompactorOptions,
): Promise<CompactResult> {
  const isFirstStep = options.stepNumber === 0  // stepNumber must be passed in from prepareStep

  if (isFirstStep) {
    // First step: boundary already handled by prepareInitialMessages; only check autocompact need
    return await maybeAutocompact(messages, options)
  }

  // Later steps: prioritize reading precompact cache, avoid re-running LLM summarize
  const truncated = await truncateOldToolResults(messages, options)
  if (getLastStepTotalTokens(options) > budget) {
    return await maybeAutocompact(truncated, options)  // still over → autocompact
  }
  return { messages: truncated, compactedPrefixEnd: ... }
}

Gain: later steps skip the autocompact LLM call whenever microcompact alone brings the context back under budget.

P1 · Hysteresis + blocking synthesis fallback

Per redesign §P2 #6: no activateMoreCrCandidates when over budget by ≤5%.

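Per CC’s blocking check (see compaction chapter), on reaching the hard ceiling:

if (currentTokens >= HARD_CEILING) {
  // Pseudocode: emit a synthetic error message into the stream instead of calling the API
  yield syntheticErrorMessage  // placeholder for whatever message object the loop emits
  return { reason: 'blocking_limit' }
}

This synthetic-message pattern gives a graceful exit, which is cleaner than letting the API throw a 413.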

P1 · 413 reactive handler

Already listed in redesign §P1 #4. On a 413, force activateMoreCrCandidates and retry once.

Point C · Streaming consumption

Best practice

Tombstone retraction: already-streamed chunks voided due to streaming fallback need explicit void-markers — UI can’t retract characters from the client
Thinking block signature preservation: don’t touch signature fields on cross-turn reuse; serialization / deserialization must preserve byte-alignment
Usage tracked by component: inputTokens / outputTokens / cachedInputTokens / cacheCreationInputTokens separately — cache hit rate is core observability metric

Zapvol current state

Partial · streaming retraction missing; thinking signature partially handled

Pass · UI message stream uses AI SDK v6’s ui-message-stream protocol (agent-ui-stream.ts)
Pass · Thinking block filtering: round-precompact.ts:249 explicitly strips reasoning/thinking blocks before sending to the compaction model — thinking carries the original model’s signature which would fail when forwarded to a different model. This is done better than most agents
Partial · Cross-turn thinking reuse: no explicit signature-field protection during serialization — if messages persist to DB and come back with JSON field order changed, next submission gets API-rejected
Missing · No tombstone / streaming retraction: UI side lacks explicit “streamed but void” markers
Partial · Usage recording: onStepFinish accumulates but stores only totalTokens (agent-round.ts:247) — loses cache read / cache creation, two key components

Evolution direction

P1 · Usage component-wise recording

Refactor onStepFinish:

onStepFinish: (step) => {
  context.lastStepTotalTokens = step.usage.totalTokens  // current usage
  // New:
  session.recordStepUsage({
    input: step.usage.inputTokens,
    output: step.usage.outputTokens,
    cacheRead: step.usage.cachedInputTokens,         // ← key
    cacheCreate: step.usage.cacheCreationInputTokens,  // ← key
  })
}

Without component-level tracking, cache hit rate is unobservable — any cache optimization work can’t be verified.
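
Once the components are recorded, the hit rate itself is a short computation. The sketch below uses the same four component names as the `recordStepUsage` call above; `StepUsage` and `cacheHitRate` are hypothetical names. It assumes `input` counts only uncached input tokens, so check how your provider reports input tokens before reusing the formula.

```typescript
// Hypothetical per-step usage record, mirroring the components discussed above.
interface StepUsage {
  input: number; // assumption: uncached input tokens only
  output: number;
  cacheRead: number;
  cacheCreate: number;
}

// Share of all prompt tokens that were served from the cache, across every step.
function cacheHitRate(steps: StepUsage[]): number {
  let cacheRead = 0;
  let totalPrompt = 0;
  for (const step of steps) {
    cacheRead += step.cacheRead;
    totalPrompt += step.input + step.cacheRead + step.cacheCreate;
  }
  return totalPrompt === 0 ? 0 : cacheRead / totalPrompt;
}
```

A rate that drops sharply between two deployments is a usable signal that something started invalidating the prompt prefix.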

P2 · Streaming retraction primitive

If production encounters “streaming fallback leaving residual UI text” bugs, add tombstone:

// agent-ui-stream.ts add a data part type
context.writeTransient(DataPartEvent.TOMBSTONE, { messageId, reason: "streaming_fallback" })

Frontend removes the rendered region on receipt. Not needed if not currently an issue.

P2 · Thinking signature persistence protection

When Zapvol eventually supports resume with extended thinking enabled, message serialization must preserve the providerOptions.anthropic.signature field verbatim (no JSON field-order normalization, no base64 padding changes). This is not blocking now; handle it when extended thinking work starts.

Point D · `tool.execute` (permissions + hooks physical location)

Best practice

Permission check at top of execute: don’t filter tool list in prepareStep (model doesn’t know it was denied, just reports schema mismatch)
PreToolUse hook equivalent: wrapTool higher-order function, supporting updatedInput (modify args, not just allow/deny)
AbortSignal to leaves: fetch / execFile / DB query all need signal — half-done abort is worse than none
Output truncation + offload: single tool output can’t consume the entire context
Error text is prompt: teach the model what to do next
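
The last two points combine naturally: cap the output, offload the full text, and phrase the truncation notice so it tells the model what to do next. A minimal sketch, where `capToolOutput` and the `offload` callback are hypothetical names and the storage location is left to the caller:

```typescript
interface CappedOutput {
  text: string;
  offloadedTo?: string;
}

function capToolOutput(
  output: string,
  maxChars: number,
  offload: (full: string) => string, // hypothetical: stores the full text, returns its path
): CappedOutput {
  if (output.length <= maxChars) return { text: output };
  const path = offload(output);
  // The notice is written for the model: it says what happened and what to do next.
  const notice =
    `\n[Output truncated: showing first ${maxChars} of ${output.length} characters. ` +
    `Full output saved to ${path}; read that file if you need the rest.]`;
  return { text: output.slice(0, maxChars) + notice, offloadedTo: path };
}
```

Outputs under the cap pass through unchanged, so small tools pay no cost for the safeguard.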

Zapvol current state

Pass · nearly complete — D is Zapvol’s strongest point

Pass · Complete permission system: utils/permissions/ (mode + rule + classifier + hook levels)
Pass · abortSignal full chain: every tool execute accepts signal, passed into sandbox command execution
Pass · maxResultSizeChars concept exists (TOOL_TRUNCATE_CHARS = 1000)
Pass · ServerToolConfig wrapper: each tool has compact / requiredPermission / renderMessage extension fields, compiles to AI SDK tool (exactly the pattern recommended in “Architectural Decisions” below)
Pass · Error text: most tools have tutorial-style errors (see browser.tool.ts error codes)
Partial · PreToolUse hook equivalent: permission checks exist but no “hook modifies arguments” — updatedInput capability currently absent

Evolution direction

P2 · Add tool wrapper pre/post/error hooks

Permissions are currently handled manually, tool by tool. If future needs such as “auto-add lint / auto-inject ctx” arise, a wrapper could look like:

function wrapTool<I, O>(cfg: ServerToolConfig<I, O>, hooks: ToolHooks<I, O>): AiSdkTool {
  return toAiSdkTool({
    ...cfg,
    execute: async (input, ctx) => {
      const pre = await hooks.preToolUse?.(input, ctx)
      if (pre?.decision === "block") return { error: pre.reason }
      const actualInput = pre?.updatedInput ?? input
      // run original execute ...
    },
  })
}

Not in redesign, but CC’s PreToolUse hook with updatedInput is valuable for extensibility. Low priority (no blocking need today), but worth leaving space for architecturally.
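
For readers who want something that runs on its own, here is a reduced version of the same idea with the tool config and context stripped away. `withPreToolUse`, `PreToolUseResult`, and `ToolHooks` are hypothetical names for this sketch; they are not AI SDK or Zapvol APIs.

```typescript
interface PreToolUseResult<I> {
  decision?: "allow" | "block";
  reason?: string;
  updatedInput?: I; // lets the hook rewrite arguments, not just allow or deny
}

interface ToolHooks<I> {
  preToolUse?: (input: I) => Promise<PreToolUseResult<I> | undefined>;
}

type Execute<I, O> = (input: I) => Promise<O>;
type ToolError = { error: string };

function withPreToolUse<I, O>(
  execute: Execute<I, O>,
  hooks: ToolHooks<I>,
): Execute<I, O | ToolError> {
  return async (input: I): Promise<O | ToolError> => {
    const pre = await hooks.preToolUse?.(input);
    if (pre?.decision === "block") {
      // Return the refusal as a tool result so the model can read the reason.
      return { error: pre.reason ?? "Blocked by preToolUse hook" };
    }
    return execute(pre?.updatedInput ?? input);
  };
}
```

Because a blocked call comes back as an ordinary result rather than an exception, the model sees why it was refused and can choose a different action.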

Point E · `experimental_onToolCallFinish` (microcompact entry)

Best practice

Race abort: precompact is a separate LLM call; user ESC must stop it immediately
Idempotent: same toolCallId replay short-circuits on cache
Cheap model: main agent on Opus/Sonnet, precompact on Haiku — 1/10 cost
Offload + cache: compressed result + original text offloaded, main agent can read_file if needed
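
The first two properties (abort racing and idempotency) can be sketched together. This is a minimal in-memory illustration under stated assumptions: `raceAbort` and `precompactOnce` are written here from scratch, the cache is a plain `Map` rather than a file, and `compactor` stands in for the cheap-model LLM call.

```typescript
// Reject as soon as the signal aborts, even if the underlying work is still running.
function raceAbort<T>(work: Promise<T>, signal: AbortSignal): Promise<T> {
  if (signal.aborted) return Promise.reject(new Error("aborted"));
  return new Promise<T>((resolve, reject) => {
    const onAbort = (): void => reject(new Error("aborted"));
    signal.addEventListener("abort", onAbort, { once: true });
    work.then(
      (value) => {
        signal.removeEventListener("abort", onAbort);
        resolve(value);
      },
      (err) => {
        signal.removeEventListener("abort", onAbort);
        reject(err);
      },
    );
  });
}

// Assumption: an in-memory cache keyed by toolCallId; a real system would persist it.
const compactCache = new Map<string, string>();

async function precompactOnce(
  toolCallId: string,
  output: string,
  compactor: (text: string) => Promise<string>,
  signal: AbortSignal,
): Promise<string> {
  const cached = compactCache.get(toolCallId);
  if (cached !== undefined) return cached; // replay of the same call short-circuits
  const compacted = await raceAbort(compactor(output), signal);
  compactCache.set(toolCallId, compacted);
  return compacted;
}
```

Note that `raceAbort` only stops waiting; to actually cancel the underlying request, the same signal should also be passed into the LLM call itself.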

Zapvol current state

Pass · reference-quality — Zapvol’s tool-precompact.ts nearly fully aligns with CC’s Tier 1 microcompact

Pass · Race abort: explicit raceAbort(compactor(...), abortSignal)
Pass · Idempotent: readCompactCache(toolCallId) check first
Pass · Offload + cache: offloadToolData to disk + writeCompactCache stores result, file-based cross-step persistence
Pass · Per-tool compactor: ServerToolConfig.compact() lets each tool define its own compression strategy (more refined than CC microcompact’s uniform placeholder)
Partial · PRECOMPACT_TRIGGER_TOKENS = 2500: redesign §P1 #3 suggests lowering to 500-1000 — more small tools compressed at finish time

Evolution direction

P1 · Lower PRECOMPACT_TRIGGER_TOKENS

Per redesign §P1 #3: 2500 → 500-1000. Gain: more tools compressed at the cache-safe moment, reducing mid-round prepareStep compaction spikes.

For external readers: look at tool-precompact.ts design — it’s a production-grade reference for microcompact, worth copying.

Point F · `stopWhen`

Best practice

Combine multiple conditions: stepCountIs(N) + hasToolCall('complete') + custom
Provide an explicit complete tool: let the model “explicitly declare completion” to avoid useless iterations
Maintain business-layer TerminalReason: AI SDK’s finishReason is too coarse
Circuit breaker: stop after N consecutive failures
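
Combining conditions amounts to "stop when any of them holds". The sketch below defines simplified local stand-ins for `stepCountIs` and `hasToolCall` so that it runs without the SDK; the step shape and the `completeWithTodosDone` condition are hypothetical and much smaller than the real types.

```typescript
interface StepRecord {
  toolCalls: { toolName: string }[];
}
type LoopState = { steps: StepRecord[] };
type StopCondition = (state: LoopState) => boolean;

// Local stand-ins for the SDK helpers of the same name, reduced for illustration.
const stepCountIs = (max: number): StopCondition => ({ steps }) => steps.length >= max;

const hasToolCall = (toolName: string): StopCondition => ({ steps }) => {
  const last = steps[steps.length - 1];
  return last !== undefined && last.toolCalls.some((call) => call.toolName === toolName);
};

// Custom condition: a `complete` call only counts when no todo is still pending.
const completeWithTodosDone = (pendingTodos: () => number): StopCondition => (state) =>
  hasToolCall("complete")(state) && pendingTodos() === 0;

// Stop as soon as any condition holds.
const shouldStop = (conditions: StopCondition[], state: LoopState): boolean =>
  conditions.some((condition) => condition(state));
```

The step ceiling stays in the list as a backstop, so a model that never manages to finish its todos still terminates.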

Zapvol current state

Pass · design finer than CC — stopOnComplete’s todos-blocking is a highlight

Pass · stopOnComplete() (tools/stop-conditions.ts): not only checks complete tool call but also checks all todos are done — blocks completion if any todo is pending. More refined than CC
Pass · stepCountIs(maxSteps) hard ceiling
Pass · complete tool: Zapvol defines it, used in browser-subagent etc.
Partial · AI SDK finishReason granularity: Zapvol records “stop / length / tool-calls / error / abort” — no business-level reasons like blocking_limit / permission_denied_fatal
Missing · Circuit breaker missing: redesign §P2 #7 explicitly lists as gap — no fusing mechanism for repeated autocompact / precompact failures

Evolution direction

P1 · Define TerminalReason enum + derive in onFinish

export type TerminalReason =
  | "done"
  | "completed"              // complete tool called
  | "aborted"                // ctx.abortController.signal.aborted
  | "max_steps"              // stepCountIs triggered
  | "blocking_limit"         // hard ceiling exit (needs Point B's synthesis fallback)
  | "todos_incomplete"       // complete called but todos not done (stopOnComplete implements but doesn't surface)
  | "error"

// agent-round.ts onFinish — derive
const terminalReason = deriveTerminalReason(result, context)
await session.setTerminalReason(terminalReason)

Gain: UI can show different messages per reason; telemetry can aggregate by reason.
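
One possible shape for the derivation is sketched below. Everything in `RoundOutcome` is a hypothetical summary assembled by the caller (none of the field names come from the SDK), and the precedence order is an assumption that you may want to adjust.

```typescript
type TerminalReason =
  | "done"
  | "completed"
  | "aborted"
  | "max_steps"
  | "blocking_limit"
  | "todos_incomplete"
  | "error";

// Hypothetical summary of one finished round.
interface RoundOutcome {
  aborted: boolean;
  hitHardCeiling: boolean;
  failed: boolean;
  completeToolCalled: boolean;
  pendingTodos: number;
  stepCount: number;
  maxSteps: number;
}

function deriveTerminalReason(outcome: RoundOutcome): TerminalReason {
  // Order matters: user intent and hard failures outrank normal completion.
  if (outcome.aborted) return "aborted";
  if (outcome.hitHardCeiling) return "blocking_limit";
  if (outcome.failed) return "error";
  if (outcome.completeToolCalled) {
    return outcome.pendingTodos > 0 ? "todos_incomplete" : "completed";
  }
  if (outcome.stepCount >= outcome.maxSteps) return "max_steps";
  return "done";
}
```

Keeping the function pure (outcome in, reason out) makes the precedence rules easy to unit test independently of the agent loop.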

P1 · Circuit breaker for autocompact + precompact

Redesign §P2 #7 already listed. consecutiveFailures >= 3 → stop trying. Otherwise an irrecoverable session wastes many API calls (CC’s lesson: 250K wasted API calls per day, see compaction chapter).
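
A consecutive-failure breaker needs very little state. The class below is a minimal sketch with hypothetical names; the threshold of 3 comes from the text above, and a production version would likely add a reset timer or per-session scoping.

```typescript
class CircuitBreaker {
  private consecutiveFailures = 0;
  private readonly threshold: number;

  constructor(threshold = 3) {
    this.threshold = threshold;
  }

  // Open means: stop attempting the guarded operation.
  get isOpen(): boolean {
    return this.consecutiveFailures >= this.threshold;
  }

  recordSuccess(): void {
    this.consecutiveFailures = 0; // any success resets the streak
  }

  recordFailure(): void {
    this.consecutiveFailures += 1;
  }
}
```

The caller checks `isOpen` before each autocompact or precompact attempt and skips the LLM call once the breaker has opened.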

Point G · `onFinish`

Best practice

result.response.messages must persist: no persistence = no resume
Transactional atomicity: messages and checkpoint must update in the same transaction
Multi-step usage aggregation: don’t only record the last step
finishReason classification: derive business-layer TerminalReason

Zapvol current state

Pass · persistence is rigorous

Pass · captureCompactionCheckpoint: onFinish writes compactionCheckpoint to DB (agent-round.ts:266+)
Pass · Round summarizer triggers: onFinish async-generates RoundSummary saved to DB (for next round’s plan-phase)
Pass · Multi-step usage aggregation: stepUsages accumulated (agent-round.ts:241+)
Partial · Not yet confirmed whether messages + checkpoint are written atomically; from the code they look like separate writes (needs audit)
Partial · TerminalReason derivation: records AI SDK’s raw finishReason, missing business-layer mapping (see F’s evolution)

Evolution direction

P1 · Audit messages + checkpoint transactional atomicity

Confirm whether the following two writes are in the same DB transaction:

await db.session.appendMessages(session.id, newMessages)
await db.session.updateCheckpoint(session.id, checkpoint)

If not, a failure between the two writes leaves inconsistent state for the next resume (messages ahead of checkpoint → wasted re-compaction; messages behind checkpoint → index crash).
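
The all-or-nothing behaviour being audited can be illustrated with an in-memory stand-in. `InMemorySessionStore` and `persistRound` are hypothetical; with a real database the `transaction` method would be the transaction primitive of whatever driver or ORM is in use.

```typescript
interface SessionState {
  messages: string[];
  checkpoint: number;
}

class InMemorySessionStore {
  private state: SessionState = { messages: [], checkpoint: 0 };

  // Run `mutate` against a draft copy; publish the draft only if it does not throw.
  transaction(mutate: (draft: SessionState) => void): void {
    const draft: SessionState = {
      messages: [...this.state.messages],
      checkpoint: this.state.checkpoint,
    };
    mutate(draft);
    this.state = draft;
  }

  snapshot(): SessionState {
    return { messages: [...this.state.messages], checkpoint: this.state.checkpoint };
  }
}

function persistRound(store: InMemorySessionStore, newMessages: string[], checkpoint: number): void {
  store.transaction((draft) => {
    draft.messages.push(...newMessages);
    if (checkpoint > draft.messages.length) {
      throw new Error("checkpoint points past the end of messages");
    }
    draft.checkpoint = checkpoint;
  });
}
```

If the second half of the write fails, the appended messages are discarded along with it, so a later resume never sees messages and checkpoint out of step.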

P2 · Unified Terminal reason storage

See F’s evolution direction.

Point H · Post-call (background tasks)

Best practice

Memory extraction async: extract long-term memory entries without await — don’t delay the user
Resume pre-compact: on session exit, background-compute resume summary — next resume is instant
PostCompact hook: externally extensible point

Zapvol current state

Partial · memory extraction exists; resume pre-compact is missing

Pass · Memory extraction: memory/memory-extraction.ts complete — onFinish fire-and-forget, uses Haiku to extract memory entries, mutually exclusive with manual save_memory
Pass · Round summarizer: onFinish async-generates RoundSummary (compacted single round), saved to DB for next-round reuse — this is a lightweight resume pre-compact variant
Missing · Full resume pre-compact missing: redesign §3.3 explicitly says “we need this layer” in the CC alignment matrix. Zapvol currently has per-round summaries — missing a session-level integrated summary
Missing · PostCompact hook missing: no “compaction complete” extension point

Evolution direction

P1 · Add resume pre-compact

CC’s pattern: on session exit, start a background job compressing history into a global summary, next resume reads directly.

Zapvol already has round-level RoundSummary — infrastructure is there. Missing session-level integrated summary:

// agent-round.ts onFinish tail
if (terminalReason === "done" || terminalReason === "completed") {
  void generateSessionResumeSummary(session.id, context)  // don't await
}

// On the next round, at Point A (loading)
const summary = await session.loadResumeSummary()
if (summary) {
  // use summary + last N rounds raw messages
} else {
  // fallback to current logic: load all messages
}

P2 · PostCompact hook surface

For Zapvol this is preparation for future extensibility: it matters only if a marketplace / plugin system is eventually built that lets third parties run custom logic when compaction completes (e.g., alerting, workflow triggers). There is no current need.

Architectural decisions (not traps, but choices)

The two decisions below aren’t tactical questions inside Point A/B/…/H — they’re architectural choices. Zapvol got both right; documenting for external readers.

Decision 1: CLAUDE.md’s abstraction is “layered prompt injection points”

CC’s CLAUDE.md depends on filesystem walk. Generic agents don’t have that, but the role transfers:

Layered prompt injection points = {
  Managed:  Enterprise / policy mandatory rules (highest authority)
  Project:  Project / account-level rules (team-shared)
  Local:    User's private overrides on this project/account (not shared)
  User:     User's global preferences
  Session:  Dynamic injection this conversation
  Auto:     Agent's self-learned memory
}

Storage medium (DB / YAML / settings.json / markdown) doesn’t matter. Layered authority merge logic is the point.
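
A minimal sketch of such a merge, assuming each layer contributes a flat key/value rule set. The only ordering taken from the text is that Managed has the highest authority; the relative order of the other layers in `LOW_TO_HIGH` is an assumption made for this example, and `mergeLayers` is a hypothetical name.

```typescript
type Layer = "Managed" | "Project" | "Local" | "User" | "Session" | "Auto";
type Rules = Record<string, string>;

// Assumption: lowest authority first; Managed is applied last so it always wins.
const LOW_TO_HIGH: Layer[] = ["Auto", "Session", "User", "Project", "Local", "Managed"];

function mergeLayers(
  layers: Partial<Record<Layer, Rules>>,
  order: Layer[] = LOW_TO_HIGH,
): Rules {
  const merged: Rules = {};
  for (const layer of order) {
    // Later (higher-authority) layers overwrite earlier ones key by key.
    Object.assign(merged, layers[layer] ?? {});
  }
  return merged;
}
```

Passing a different `order` array is enough to model another precedence policy, which keeps the policy separate from wherever the rules are stored.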

Zapvol current state: the ZapvolMemory system supports three layers, Project / User / Auto (memory-service.ts). The Managed layer is missing, and multi-tenant enterprise scenarios need it (“for tenant X, no agent may write to production DB”). P2 evolution: add a Managed scope to the existing memory-service.

Decision 2: Tools should wrap in your own `ServerToolConfig`

AI SDK’s tool only has description + inputSchema + execute. CC’s Tool has a dozen fields. Insert a layer of your own config, compile to AI SDK tool:

type ServerToolConfig<I, O> = {
  name: string
  description: string
  inputSchema: ZodSchema<I>
  execute: (input: I, ctx: Ctx) => Promise<O>

  // Fields AI SDK doesn't know but your internals need
  compact?: (input: I, output: O, hint: CompactHint, ctx: Ctx) => Promise<CompactResult>
  requiredPermission?: PermissionDescriptor
  maxResultSizeChars?: number
  clientRenderer?: (part: ToolUIPart) => ReactNode
}

Zapvol current state: already exactly this pattern — ServerToolConfig interface complete (used by all tools under agent/tools/). Correct choice.

Complexity by scale

Zapvol is already past MVP, not “starting from scratch”. Below is a guide to adding features by maturity: useful for external readers, and an answer to “which feature to add next” for Zapvol.

Already done (baseline)

Zapvol current coverage:

Pass Point A basics + cache control
Pass Point B’s stepCompactor.apply uniform call
Pass Point D complete (permission + maxResultSize + ServerToolConfig)
Pass Point E complete (precompact reference-quality)
Pass Point F basics + stopOnComplete todos-blocking (finer than CC)
Pass Point G basics + round summarizer
Pass Point H’s memory extraction

Stage 2 (1-2 weeks, significant improvements)

Point A’s system prompt cache audit (P0)
Point B’s step 0 / step 1+ differentiation (P0)
Point C’s usage component-wise recording (P1)
Point E’s PRECOMPACT_TRIGGER_TOKENS lowering (P1, redesign §P1 #3)

Stage 3 (after infrastructure is solid)

Point B’s hysteresis + blocking synthesis fallback (P1)
Point F’s TerminalReason enum + circuit breaker (P1, redesign §P2 #7)
Point H’s session-level resume pre-compact (P1)
Point G’s messages + checkpoint transactional atomicity audit (P1)

Don’t build early (unless real pressure)

Point A’s enterprise Managed scope (P2, multi-tenant only)
Point C’s tombstone retraction (P2, UI needs not observed)
Point D’s pre/post tool hook extension (P2, no blocking need)
Full PostCompact hook surface (P2, for external extension systems)

Zapvol evolution roadmap (consolidated)

The evolution items scattered across the points above, aggregated by priority:

P0 (recommend within 2 weeks)

#	Point	Change	Expected gain
1	A	`appendSystemContext` cache audit + move dynamic content to messages	Preserve prompt cache hit rate
2	B	Add “step 0 vs step 1+” differentiation to stepCompactor	Save one autocompact LLM call on later steps

P1 (infrastructure, 3-4 weeks)

#	Point	Change	Related redesign item
3	C	Usage recorded by component (input/output/cacheRead/cacheCreate)	—
4	E	Lower `PRECOMPACT_TRIGGER_TOKENS` 2500 → 500-1000	§P1 #3
5	F	Define `TerminalReason` enum + derive in onFinish	—
6	F	Circuit breaker for autocompact / precompact	§P2 #7
7	G	Audit messages + checkpoint transactional atomicity	—
8	H	Session-level resume pre-compact	§3.3 (CC has, we don’t)
9	B	Hysteresis (over budget ≤5% doesn’t trigger)	§P2 #6
10	B	Blocking synthesis fallback + 413 reactive handler	§P1 #4

P2 (opportunistic, wait for pressure)

#	Point	Change	Trigger condition
11	A	Managed scope in memory-service	Multi-tenant enterprise customers
12	C	Tombstone / streaming retraction	UI feedback on streaming fallback residue
13	D	Tool wrapper pre/post/error hooks	Need project-level custom interception
14	H	PostCompact hook surface	External extension / plugin system

10-item self-audit checklist

Scan your agent codebase (Zapvol or external):

#	Check	Zapvol state
1	`system` byte-stable? No `Date.now()` / user context concatenated?	Partial (needs audit)
2	`messages` have explicit `providerOptions.cacheControl` breakpoints?	Pass
3	`prepareStep` differentiates first step vs later?	Missing (needs change)
4	Every tool `execute` has permission check?	Pass
5	Tool `description` tutorial-style?	Pass
6	Tool `execute` passes `abortSignal` to every internal long-running op?	Pass
7	Tool output has size cap + truncation + offload?	Pass
8	`stopWhen` combines multiple conditions (`stepCountIs` + `hasToolCall('complete')` + custom)?	Pass
9	`onFinish` persists `response.messages`?	Pass
10	`onFinish` has async background tasks (no await) for memory extraction / resume pre-compute?	Partial (memory yes, resume no)

Zapvol score: 7 Pass / 2 Partial / 1 Missing — infrastructure solid, key improvement points concentrated in A / B / H.

Next: connect the dots with state flow

This chapter cuts by hook — each hook discussed independently (“do this here”). But in real code, the 8 hooks share state spanning steps, rounds, sessions: E writes compactCache → next B reads; onStepFinish writes lastStepTotalTokens → B reads; G persists messages → next A reads.

The next chapter, Lifecycle State Flow, cuts by state flow — tracing how each state evolves across the 8 points in one conversation. Reading both connects the dots into a coherent pipeline.
