Execution Environment - Zapvol Docs

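Not the Same Thing as "Permissions"

The previous chapter, on the permission system, covered whether an action is allowed: the trust layer of mode / rules / hooks.

This chapter covers where the action runs: which machine, which directory, and which sandbox container the agent process runs in. The two questions are orthogonal: a command can be allowed by permissions yet run without a sandbox (the default CLI mode), or run inside a sandbox yet be rejected by a hook.

In Zapvol, "sandbox" (backend/infra/sandbox/, covering Node / Daytona / E2B) belongs to this chapter's topic, not the previous one. The two systems solve different problems, and this chapter keeps them apart.

Three Isolation Modes

Claude Code offers 3 isolation modes through the Agent tool (source: tools/AgentTool/AgentTool.tsx, line 99):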

isolation: z.enum(['worktree', 'remote']).optional()
// omitted = default mode (runs in the user's current cwd)
// 'worktree' = isolation in a temporary git worktree
// 'remote' = cloud execution on CCR (Claude Code Remote)

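Mode	Filesystem	Network	Process	Failure domain
Default (no isolation)	User cwd	User network	User machine	Shares the fate of the CLI itself
worktree	Temporary git worktree (`.claude/worktrees/<slug>`)	User network	User machine	Isolates file changes, not the process
remote	CCR cloud environment	CCR network	Separate Linux container	Fully isolated

Key insight: isolation strength increases across the three modes, and so does cost. The default is fastest but riskiest; remote is safest but takes a few seconds to bring up the cloud environment and requires the GitHub app to be set up in advance.

The 7 TaskType values in Task.ts are a separate dimension: local_agent runs locally (default or worktree mode), remote_agent runs on CCR, and in_process_teammate runs in the same process. TaskType describes what kind of task it is; isolation describes where that task runs.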
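
As a minimal sketch, the mode table can be encoded as a lookup keyed by the optional isolation value. The `IsolationProfile` type and the `profileFor` function are hypothetical names used for illustration and are not taken from the Claude Code source; only the two enum values and the table contents come from the text.

```typescript
// Hypothetical encoding of the isolation table; not part of the Claude Code source.
type Isolation = "worktree" | "remote";

interface IsolationProfile {
  filesystem: string;
  network: string;
  process: string;
  failureDomain: string;
}

// Profile used when no isolation value is passed.
const DEFAULT_PROFILE: IsolationProfile = {
  filesystem: "user cwd",
  network: "user network",
  process: "user machine",
  failureDomain: "shares the fate of the CLI itself",
};

const PROFILES: Record<Isolation, IsolationProfile> = {
  worktree: {
    filesystem: "temporary git worktree under .claude/worktrees/<slug>",
    network: "user network",
    process: "user machine",
    failureDomain: "isolates file changes, not the process",
  },
  remote: {
    filesystem: "CCR cloud environment",
    network: "CCR network",
    process: "separate Linux container",
    failureDomain: "fully isolated",
  },
};

// Omitting `isolation` selects the default mode, mirroring `.optional()` in the schema.
function profileFor(isolation?: Isolation): IsolationProfile {
  return isolation === undefined ? DEFAULT_PROFILE : PROFILES[isolation];
}
```

Keeping the default as the "no value" case rather than a third enum member matches the schema quoted above, where the field is simply optional.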

Worktree Isolation: A Temporary Git Copy

Creation

Source: createAgentWorktree in utils/worktree.ts, line 902:

export async function createAgentWorktree(slug: string): Promise<{
  worktreePath: string
  worktreeBranch?: string
  headCommit?: string
  gitRoot?: string
  hookBased?: boolean
}> {
  validateWorktreeSlug(slug)

  // 1. Preferred: hook-based creation (supports non-git VCS such as mercurial / sapling)
  if (hasWorktreeCreateHook()) {
    const hookResult = await executeWorktreeCreateHook(slug)
    return { worktreePath: hookResult.worktreePath, hookBased: true }
  }

  // 2. Fallback: native git worktree
  const gitRoot = findCanonicalGitRoot(getCwd())
  if (!gitRoot) {
    throw new Error('Cannot create agent worktree: not in a git repository...')
  }
  const { worktreePath, worktreeBranch, headCommit, existed } =
    await getOrCreateWorktree(gitRoot, slug)
  // ...
}

A few source-level details:

`findCanonicalGitRoot`: avoiding worktrees nested inside worktrees

Comment (lines 922-925):

findCanonicalGitRoot (not findGitRoot) so agent worktrees always land in the main repo’s .claude/worktrees/ even when spawned from inside a session worktree — otherwise they nest at <worktree>/.claude/worktrees/ and the periodic cleanup (which scans the canonical root) never finds them.

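In other words: if the session is already running inside a session worktree, creating a subagent worktree must go back to the main repository root. Otherwise the worktrees nest and the background cleanup never finds them.

This is another bug fix that came out of real production use, not theory.

Slug validation against path traversal

Source, lines 48-49: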

const VALID_WORKTREE_SLUG_SEGMENT = /^[a-zA-Z0-9._-]+$/
const MAX_WORKTREE_SLUG_LENGTH = 64

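validateWorktreeSlug rejects:

../../target (escape to a parent directory)
/absolute/path (absolute path)
. / .. segments (a bare dot segment)
anything longer than 64 characters

The comment explains why it is this strict: after path normalization, join('.claude/worktrees/', slug) lets .. escape the worktrees directory.

Takeaway for a self-built agent: any user-controlled string that ends up in a file path needs strict allowlist validation. That means accepting only an explicitly safe character set, not escaping or sanitizing.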
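
The rejection rules above can be sketched as a per-segment allowlist check. This is a hypothetical re-implementation, not the actual validateWorktreeSlug: the two constants are quoted from the source, while the function names `validateSlugSketch` and `isValidSlug`, the use of "/" as the segment separator, and the error messages are assumptions.

```typescript
// Constants as quoted from utils/worktree.ts.
const VALID_WORKTREE_SLUG_SEGMENT = /^[a-zA-Z0-9._-]+$/;
const MAX_WORKTREE_SLUG_LENGTH = 64;

// Hypothetical sketch; the real validateWorktreeSlug may differ in details.
function validateSlugSketch(slug: string): void {
  if (slug.length > MAX_WORKTREE_SLUG_LENGTH) {
    throw new Error(`slug longer than ${MAX_WORKTREE_SLUG_LENGTH} characters`);
  }
  // Assumption: "/" separates segments, and every segment is checked on its own.
  for (const segment of slug.split("/")) {
    // "." and ".." pass the character allowlist, so they need an explicit check.
    if (segment === "." || segment === "..") {
      throw new Error(`slug contains a dot segment: "${segment}"`);
    }
    // An empty segment (leading "/", trailing "/", "//", or an empty slug)
    // fails the `+` quantifier, which also rejects absolute paths.
    if (!VALID_WORKTREE_SLUG_SEGMENT.test(segment)) {
      throw new Error(`slug segment has disallowed characters: "${segment}"`);
    }
  }
}

// Boolean wrapper for callers that prefer a check over an exception.
function isValidSlug(slug: string): boolean {
  try {
    validateSlugSketch(slug);
    return true;
  } catch {
    return false;
  }
}
```

The explicit dot-segment check is the part that is easy to miss: the character class alone accepts "..", because a dot is a legitimate character inside names like v1.2.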

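Hook-based creation: supporting non-git VCS

When hasWorktreeCreateHook() returns true, the hook path is taken ahead of native git, which lets users plug in any VCS (mercurial / sapling / perforce) through the WorktreeCreate / WorktreeRemove hooks. The returned path is treated as "an isolated directory that behaves like a worktree".

This is protocol-style extension: rather than binding tightly to git, Claude Code defines an interface for "what a worktree should do" and users supply the implementation.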
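
One way to picture this protocol is as a small provider interface with two interchangeable implementations. The sketch below is an assumption-laden illustration: `IsolatedDirProvider`, `pickProvider`, and both example providers are hypothetical names, and the paths they return are made up. Only the create/remove pairing and the "hook first, git second" order come from the source excerpt.

```typescript
// Hypothetical interface describing "what a worktree should do".
interface IsolatedDirProvider {
  create(slug: string): Promise<{ worktreePath: string }>;
  remove(worktreePath: string): Promise<void>;
}

// Prefer the user-supplied hook provider; fall back to native git otherwise.
function pickProvider(
  hookProvider: IsolatedDirProvider | undefined,
  gitProvider: IsolatedDirProvider,
): IsolatedDirProvider {
  return hookProvider ?? gitProvider;
}

// Stand-in for the native git implementation (illustrative path only).
const gitProvider: IsolatedDirProvider = {
  create: async (slug) => ({ worktreePath: `.claude/worktrees/${slug}` }),
  remove: async () => {},
};

// Stand-in for a user hook backed by another VCS (illustrative path only).
const customVcsProvider: IsolatedDirProvider = {
  create: async (slug) => ({ worktreePath: `/tmp/custom-vcs-workspaces/${slug}` }),
  remove: async () => {},
};
```

The caller only ever sees a path, so nothing downstream needs to know which VCS produced the directory.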

Bwrap ghost dotfile cleanup

utils/Shell.ts, line 386:

On Linux, bwrap creates 0-byte mount-point files on the host to deny access to paths inside the sandbox; when bwrap exits [these remain] as ghost dotfiles in cwd. Cleanup is synchronous and a no-op on non-Linux.

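On Linux, bwrap creates 0-byte "mount-point" files in cwd to deny access. After bwrap exits, these files remain in cwd as "ghost dotfiles". Claude Code has dedicated synchronous cleanup code for them.

This level of detail shows that "running in a sandbox" in production is far from a simple exec() call: every kind of sandbox has its own leaks, cleanup, and edge cases.

Cleanup cycle

Worktrees get a periodic cleanup: it scans .claude/worktrees/ and deletes worktrees untouched for 30 days. The reuse path in the implementation carries this comment: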

Bump mtime so the periodic stale-worktree cleanup doesn’t consider this worktree stale — the fast-resume path is read-only and leaves the original creation-time mtime intact, which can be past the 30-day cutoff.

The fast-resume path is read-only and does not update mtime. Without an explicit bump, a resumed old worktree would be deleted by the next cleanup.
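
The interaction between the 30-day cutoff and the mtime bump can be sketched with Node's synchronous fs API. This is an illustration under stated assumptions, not the actual cleanup code: the function names are hypothetical, and it assumes staleness is judged from the worktree directory's own mtime.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// 30-day cutoff, as described in the text.
const STALE_CUTOFF_MS = 30 * 24 * 60 * 60 * 1000;

// Assumption: staleness is judged from the worktree directory's own mtime.
function isStale(worktreePath: string, now: Date): boolean {
  const { mtimeMs } = fs.statSync(worktreePath);
  return now.getTime() - mtimeMs > STALE_CUTOFF_MS;
}

// Called on resume so that a read-only fast path still counts as recent use.
function bumpMtime(worktreePath: string, now: Date): void {
  fs.utimesSync(worktreePath, now, now);
}

// Lists the worktree directories a cleanup pass would consider stale.
function findStaleWorktrees(worktreesDir: string, now: Date): string[] {
  return fs
    .readdirSync(worktreesDir, { withFileTypes: true })
    .filter((entry) => entry.isDirectory())
    .map((entry) => path.join(worktreesDir, entry.name))
    .filter((dir) => isStale(dir, now));
}
```

A resume path would call bumpMtime before handing the directory back to the agent, so the next findStaleWorktrees pass skips it.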

Remote Mode: CCR (Claude Code Remote)

Remote isolation uses an entirely different architecture: Claude Code Remote (CCR), an ant-only feature.

6 Preconditions

checkRemoteAgentEligibility and formatPreconditionError in tasks/RemoteAgentTask/RemoteAgentTask.tsx define 6 reasons CCR cannot be used:

Reason	User-facing message
`not_logged_in`	"Please run /login and sign in with your Claude.ai account (not Console)."
`no_remote_environment`	"No cloud environment available. Set one up at https://claude.ai/code/onboarding?magic=env-setup"
`not_in_git_repo`	"Background tasks require a git repository. Initialize git…"
`no_git_remote`	"Background tasks require a GitHub remote."
`github_app_not_installed`	"The Claude GitHub app must be installed on this repository first."
`policy_blocked`	"Remote sessions are disabled by your organization's policy. Contact your organization admin."

Inference: CCR depends architecturally on GitHub code checkout + Claude.ai account identity + organization policy approval. It is not a standalone container launch but a GitHub integration plus an Anthropic-side runtime. This is a SaaS integration, not pure technical isolation.
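
The reason table lends itself to an exhaustive lookup keyed by reason code. The sketch below is hypothetical: the reason codes and message strings are copied from the table (one of them is truncated in the excerpt and is kept truncated), while the `Eligibility` shape, the `MESSAGES` map, and `describeIneligibility` are assumed names that may not match the real checkRemoteAgentEligibility or formatPreconditionError.

```typescript
// Reason codes as listed in the table.
type RemotePrecondition =
  | "not_logged_in"
  | "no_remote_environment"
  | "not_in_git_repo"
  | "no_git_remote"
  | "github_app_not_installed"
  | "policy_blocked";

// Hypothetical result shape; the real return type may differ.
interface Eligibility {
  eligible: boolean;
  errors: RemotePrecondition[];
}

// Record<RemotePrecondition, string> makes the compiler flag a reason with no message.
const MESSAGES: Record<RemotePrecondition, string> = {
  not_logged_in: "Please run /login and sign in with your Claude.ai account (not Console).",
  no_remote_environment:
    "No cloud environment available. Set one up at https://claude.ai/code/onboarding?magic=env-setup",
  // Truncated in the excerpt; left truncated here rather than guessed.
  not_in_git_repo: "Background tasks require a git repository. Initialize git…",
  no_git_remote: "Background tasks require a GitHub remote.",
  github_app_not_installed: "The Claude GitHub app must be installed on this repository first.",
  policy_blocked:
    "Remote sessions are disabled by your organization's policy. Contact your organization admin.",
};

// Returns undefined when eligible, otherwise one message line per failed precondition.
function describeIneligibility(result: Eligibility): string | undefined {
  if (result.eligible) return undefined;
  const reasons = result.errors.map((code) => MESSAGES[code]).join("\n");
  return `Cannot launch remote agent:\n${reasons}`;
}
```

Reporting every failed precondition at once, rather than stopping at the first, saves the user a round trip per missing setup step.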

Launching a remote agent

// AgentTool.tsx, lines 433-457
if (effectiveIsolation === 'remote') {
  const eligibility = await checkRemoteAgentEligibility()
  if (!eligibility.eligible) {
    const reasons = eligibility.errors.map(formatPreconditionError).join('\n')
    throw new Error(`Cannot launch remote agent:\n${reasons}`)
  }

  const { sessionId, taskId, ... } = registerRemoteAgentTask({
    remoteTaskType: 'remote-agent',
    // ...
  })
  // ...
}

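After launch, the user can call getRemoteTaskSessionUrl(sessionId) to get a URL (on claude.ai) for watching how the agent is doing.

Three key properties of Remote

Always run_in_background: true. CCR is asynchronous and the main agent does not wait.
Separate session. The remote agent has its own sessionId, detached from the main agent's transcript.
Results come back over WebSocket (remote/SessionsWebSocket.ts + remote/RemoteSessionManager.ts). The main agent receives the subagent's progress and results over WS.

This is not an exec-and-wait model; it is a message-driven distributed agent architecture.

Process-Level Sandboxing for Bash

One more layer, orthogonal to isolation: process isolation for the Bash tool itself. It depends on the external NPM package @anthropic-ai/sandbox-runtime.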

macOS: Seatbelt (sandbox-exec)

Apple's built-in sandbox-exec (Seatbelt)
Write access limited to cwd and /tmp
Network denied by default; filtered per host once explicitly enabled
Zero install cost: it ships with macOS

Linux: bwrap + seccomp + socat

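Three separate dependencies (source: SandboxDependenciesTab.tsx):

apt install bubblewrap      # bwrap: filesystem isolation (not Docker; more lightweight)
apt install socat            # socat: network proxy for per-host allow/deny
npm install -g @anthropic-ai/sandbox-runtime   # provides the seccomp BPF filter

The seccomp BPF filter is the critical piece: without it there is a hole. The source's diagnostic says so directly: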

seccomp filter: not installed (required to block unix domain sockets)

Without seccomp, an attacker can bypass bwrap's network restrictions through unix domain sockets.

Windows: no sandbox

tools/PowerShellTool/PowerShellTool.tsx, line 208:

On Windows native, sandbox is unavailable (bwrap/sandbox-exec are POSIX-only)

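Claude Code on Windows has no process-level sandbox. The permission system + deny-list + hooks are the only protection layers.

This is a real dimension of deployment decisions:

macOS developers: a kernel-level sandbox is available, so permissive mode can be enabled with confidence
Linux developers: sandboxed once the three dependencies are installed; running bare without them
Windows developers: must rely on permission policy, because there is no sandbox to catch anything

The source has full SandboxDoctorSection / SandboxDependenciesTab UI components that walk users through installing the dependencies. This shows Anthropic is building sandbox diagnostics and onboarding as a product-level feature rather than just showing developers an error.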
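
The per-platform picture can be condensed into a small planning function that a doctor-style UI could sit on top of. This is a hypothetical sketch, not the logic of SandboxDoctorSection: the type and function names are made up, dependency detection is left to the caller, and treating a partial Linux install as unsandboxed is a conservative assumption rather than documented behavior.

```typescript
type SandboxBackend = "seatbelt" | "bwrap" | "none";

// Which Linux dependencies were detected; detection itself is out of scope here.
interface LinuxDeps {
  bwrap: boolean;
  socat: boolean;
  seccomp: boolean;
}

interface SandboxPlan {
  backend: SandboxBackend;
  missing: string[]; // what the user still has to install
}

// `platform` uses Node's process.platform values ("darwin", "linux", "win32").
function planSandbox(platform: string, deps: LinuxDeps): SandboxPlan {
  if (platform === "darwin") {
    // Seatbelt ships with macOS, so nothing can be missing.
    return { backend: "seatbelt", missing: [] };
  }
  if (platform === "linux") {
    const missing: string[] = [];
    if (!deps.bwrap) missing.push("bubblewrap");
    if (!deps.socat) missing.push("socat");
    if (!deps.seccomp) missing.push("seccomp filter");
    // Assumption: a partial install is reported as unsandboxed, not half-protected.
    return { backend: missing.length === 0 ? "bwrap" : "none", missing };
  }
  // Windows native and anything else: no process-level sandbox available.
  return { backend: "none", missing: [] };
}
```

Returning the missing list alongside the backend is what lets a UI show install guidance instead of a bare failure.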

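`@anthropic-ai/sandbox-runtime` is a standalone NPM package

This is a standalone NPM package; claude-code/utils/sandbox/sandbox-adapter.ts is only an adapter layer. The package's responsibilities:

Provide a cross-platform SandboxManager abstraction (one API for Seatbelt / bwrap)
Parse SandboxRuntimeConfig (the sandbox configuration in settings.json)
Manage SandboxViolationStore (persistence of violation events)
Unify SandboxAskCallback (the callback for prompting the user)

Takeaway for a self-built agent: sandbox capability belongs in a standalone package. Sandbox rules change quickly with OS / kernel version / policy, and keeping them in one package lets them iterate independently without forcing a release of the main product.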

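What Path Patterns Mean in the Sandbox

The permission system chapter covered 4 path patterns (//path / /path / ~/path / ./path). Their concrete effect in the sandbox:

//path → absolute filesystem path; for example //tmp allows access to /tmp
/path → relative to the settings file's directory; with settings.json in /home/me/proj/.claude/, /data expands to /home/me/proj/.claude/data
~/path → expanded from HOME
./path → relative to cwd

resolvePathPatternForSandbox (utils/sandbox/sandbox-adapter.ts) turns these into absolute paths that sandbox-runtime understands, which are then written into the bwrap / seatbelt rule file.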
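
The four prefix rules can be sketched as a single resolver. This is a hypothetical illustration of the mapping described above, not the real resolvePathPatternForSandbox: the `PathContext` type and `resolvePatternSketch` name are assumptions, POSIX paths are assumed throughout, and unrecognized patterns are rejected here even though the real function may handle them differently.

```typescript
import * as path from "node:path";

// The three base directories a pattern can be resolved against.
interface PathContext {
  settingsDir: string; // directory containing settings.json
  homeDir: string;
  cwd: string;
}

// Hypothetical sketch of the prefix rules; POSIX paths only.
function resolvePatternSketch(pattern: string, ctx: PathContext): string {
  // Order matters: "//" must be tested before the single-slash case.
  if (pattern.startsWith("//")) {
    return path.posix.normalize(pattern.slice(1)); // "//tmp" -> "/tmp"
  }
  if (pattern.startsWith("/")) {
    return path.posix.join(ctx.settingsDir, pattern.slice(1));
  }
  if (pattern.startsWith("~/")) {
    return path.posix.join(ctx.homeDir, pattern.slice(2));
  }
  if (pattern.startsWith("./")) {
    return path.posix.join(ctx.cwd, pattern.slice(2));
  }
  throw new Error(`unrecognized path pattern: ${pattern}`);
}
```

With settingsDir set to /home/me/proj/.claude, the pattern /data resolves to /home/me/proj/.claude/data, matching the example in the list.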

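Takeaway for a self-built agent: path rule syntax should distinguish "absolute" from "relative to the config file". Absolute paths do not port across machines; config-relative paths do. 80% of configuration drift across dev / CI / production comes from here.

The Process Boundary with MCP Servers

One more boundary is easy to overlook: MCP servers are separate processes and are not inside Claude Code's sandbox.

A typical MCP server (such as Playwright MCP or GitHub MCP) talks to Claude Code over stdio or SSE. This means:

An MCP server has its own permissions (those of an ordinary process on the user's machine)
External API calls made by an MCP server bypass Claude Code's sandbox network policy
File reads and writes by an MCP server bypass Claude Code's filesystem sandbox

This is a permission boundary: Claude Code's calls to MCP tools are governed by the permission system, but what the MCP server itself does is outside the permission system's scope.

Takeaway for a self-built agent: a child process is a new permission boundary. Your agent can have a flawless internal permission system, but once it spawns a child process, that process is fully independent. This is a common blind spot in compliance audits.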

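Choosing Among the Three Modes

Scenario	Recommended mode	Rationale
Everyday coding, editing the current repo	Default	Fastest; edits cwd directly
Exploratory work where you want to see the changes without polluting the current repo	worktree	Temporary git branch; diff first, then merge
Long-running tasks (1h+) that should not tie up the local machine	remote	Runs in the background; results pulled back on completion
Needs a fully isolated env (e.g. reviewing external code)	remote	Separate container
Batch-processing changes for 100 issues	N × remote	Parallel
A subagent that wants to start another subagent	Default (fork mode)	Avoids nested worktrees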

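Key Points for a Self-Built Agent

"Permissions" and "execution environment" are two orthogonal things. A design needs both, not just one. Zapvol's infra/sandbox/ corresponds to this chapter; permission-* corresponds to the previous one
Isolation should be tiered: none (fast) → worktree (file isolation) → remote (full isolation). One-size-fits-all is either too slow or too dangerous
Strict allowlist validation for worktree slugs: validate each segment separately against [a-zA-Z0-9._-]+, rejecting .. / . / absolute paths
Canonical git root: creating a worktree from inside a worktree must go back to the main repository; otherwise the worktrees nest and the cleanup job cannot find them
Hook-based worktree creation: let users plug in their own mercurial / sapling / perforce implementation instead of hard-wiring git
Periodic cleanup + mtime bump: worktrees unused for a long time should be cleaned up, but resume needs to refresh mtime to avoid accidental deletion
Sandboxing is platform-specific: built in on macOS, three dependencies on Linux, none on Windows. Deployment guides should state these differences explicitly
SandboxDoctor as a component: make sandbox dependency checks a product-level UI, not an error message
Sandbox abstracted into a standalone package: a package like @anthropic-ai/sandbox-runtime lets sandbox rules iterate independently
Cloud execution (Remote) is architecturally a SaaS integration: GitHub code checkout + identity + policy approval. It is not pure technical isolation, so business integration and technical isolation have to be designed together
MCP servers are a separate permission domain: your calls to MCP tools are governed by the permission system, but what the MCP server itself does is out of scope. This is a common blind spot in compliance audits
Details like ghost dotfile cleanup show that production sandboxing is not simple: every sandbox has its own leaks, cleanup, and edge cases, and it is not "done once exec returns"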