Delegation Patterns

The question this doc answers

After team was simplified (4-5 tools; each member’s initialPrompt IS its work), the surface APIs of task (synchronous subagents) and team (asynchronous team) look very similar:

task({ subagent_type, prompt })      ← Lead calls; gets result inline
team_create({ members: [{ name, agentType, initialPrompt }] })
                                      ← Lead calls; gets teamId; checks status later

So when an HR operator opens Zapvol and says “do this work,” which one carries the request?

The answer is in the work, not in the tool. Below: four scenarios from real HR practice, each analyzed at four levels — what triggered it, what the operator actually wants, why the right tool is right, and what would specifically fail if you picked the other one. The boundary falls out at the end.

Scenario 1: Resume screening — task

The trigger

Tuesday morning, 9:00 AM. Jonathan, recruiter at a 200-person SaaS company, opens his ATS queue: 32 new applicants overnight for the open Senior Backend Engineer role.

Daily standup is at 10:00. He needs to know who’s worth a 30-minute phone screen this afternoon before he walks into standup.

He types:

“Screen these 32 against the JD. Rank them, top 5 with rationale.”

What Jonathan actually wants

Not just a ranked list. He wants the rationale — “5 years Go experience, ex-Stripe, owned the payments rate limiter rewrite” is the kind of one-liner that lets him hit accept/reject in 30 seconds without re-reading the full resume.

He wants the answer in 3-5 minutes. He’s not leaving the screen. He’ll go straight from this answer to scheduling phone screens.

Why this is `task`

Three properties together:

Workers are independent. Each resume is evaluated against the same JD. No resume needs to know about other resumes. 32 parallel jobs that never talk to each other.
Operator is on the latency-critical path. Every second of wait is wasted human attention. The work must run in parallel even if parallel execution were strictly more expensive than serial.
The session IS the wait. Jonathan submits, the agent works, the result lands in the same chat session. No “come back later.” No external trigger to wake the operator.

This is what task × N does. Lead calls task 32 times in one assistant turn (AI SDK runs them concurrently), each subagent evaluates one resume, each returns a structured assessment, Lead synthesizes the ranking, the operator sees it.
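The fan-out described above can be sketched roughly as follows. This is a minimal illustration, not the actual Lead implementation: `RunTask` is a hypothetical stand-in for the `task` tool call, and the `Assessment` shape, the 0-100 score scale, and the `resume-screener` agent type are all assumptions made for the example. `Promise.all` mimics the concurrent execution of the 32 calls.

```typescript
// Hypothetical shapes; the real task tool's return type is not specified in this doc.
interface Assessment {
  candidate: string;
  score: number; // assumed 0..100 scale
  rationale: string; // the one-liner the operator reads
}

// Stand-in for the task tool call (assumption).
type RunTask = (args: { subagent_type: string; prompt: string }) => Promise<Assessment>;

// Synthesis step: highest score first, keep the top N. Does not mutate the input.
function rankTop(assessments: Assessment[], topN: number): Assessment[] {
  return [...assessments].sort((a, b) => b.score - a.score).slice(0, topN);
}

// Fan out one task call per resume, wait for all of them, then rank.
async function screenResumes(
  resumes: { candidate: string; text: string }[],
  jobDescription: string,
  runTask: RunTask,
  topN = 5,
): Promise<Assessment[]> {
  const assessments = await Promise.all(
    resumes.map((r) =>
      runTask({
        subagent_type: "resume-screener", // assumed agent type name
        prompt: `Evaluate this resume against the JD.\n\nJD:\n${jobDescription}\n\nResume (${r.candidate}):\n${r.text}`,
      }),
    ),
  );
  return rankTop(assessments, topN);
}
```

Each call receives only the JD and one resume, so no worker depends on another. The ranking happens once, after every call has returned, which matches the "result lands in the same chat session" behavior.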

What would fail if you used `team`

Three specific failures, none of them “would still work, just slower”:

Database write overhead for ceremony. A team creates Team / TeamMember rows in Postgres, persists task records, writes member results. For 32 candidates over 4 minutes of work, that’s tens to hundreds of DB writes whose only purpose is supporting an async lifecycle the work doesn’t need.
Push notification fires into an attended session. team’s value is pushing the operator when they’re absent. Jonathan is right there. A notification arriving while he watches the screen is an interruption, not value. The notification channel has a finite signal-to-noise budget; spending it here debases it for the cases that need it (scenarios 3, 4).
Indirection on a synchronous flow. Instead of inline return, team_status long-poll. Same data, more round-trips, more failure modes (heartbeat / timeout / reconnect). Pure overhead.

What this reveals about the design

task is the right primitive when the operator’s attention IS the wait — and only then. The defining property is not “few workers” or “short work” — those follow. It’s: the operator has not budgeted any time for being absent. Everything async has a cost; spend that cost only when the operator’s absence is real.

Scenario 2: Multi-angle take-home evaluation — task

The trigger

Thursday afternoon, 1:30 PM. A strong candidate has reached the final loop for that same Backend Engineer role. He submitted his take-home: implement a rate limiter with specific semantics (sliding window, distributed, fail-open).

Jonathan has 90 minutes before the final-loop debrief at 3:00 PM. He wants three independent reads on the take-home:

Tech lead — design choices, correctness, did he handle the edge cases
Architect — scalability assumptions, system thinking
Eng manager — code quality, problem decomposition, communication in the README

He types:

“Have three reviewers each evaluate this take-home from their angle: tech lead, architect, eng manager. Bring back three independent assessments.”

What Jonathan actually wants

Three perpendicular perspectives, not consensus. Contradiction is the signal: if the tech lead says “elegant” and the architect says “won’t scale past 1k QPS,” he’s learned something specific to probe in debrief.

For this to work, the three reviewers must not see each other’s reviews before submitting. If they talked, they’d converge — and a converged middle opinion is worth less than three sharp independent ones.

He wants all three in hand at 2:55 so he can lay them side by side, identify agreement and disagreement, and lead the debrief at 3:00.

Why this is `task`

Like scenario 1: short, parallel, operator waiting. But the analysis goes deeper — there’s a property of this scenario that actively requires task over team:

Worker independence is the deliverable. Three reviewers giving independent reads is more valuable than three reviewers coordinating. The shape of the work is “three sealed processes that produce three reads.”

task gives exactly that: three subagent invocations, each in its own scoped sandbox, no shared state, no inter-process communication. The independence is enforced by the architecture, not by convention.
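As a rough illustration of the input side of that isolation, the sketch below builds three review requests that each carry only the submission and one angle's focus. The `reviewer` agent type, the `ReviewRequest` shape, and the prompt wording are assumptions for the example. The actual isolation comes from the subagent sandboxing described above, not from this helper.

```typescript
type Angle = "tech lead" | "architect" | "eng manager";

// Mirrors the task({ subagent_type, prompt }) call shape.
interface ReviewRequest {
  subagent_type: string;
  prompt: string;
}

// Focus areas taken from the three reviewer roles in this scenario.
const FOCUS: Record<Angle, string> = {
  "tech lead": "design choices, correctness, edge cases",
  "architect": "scalability assumptions, system thinking",
  "eng manager": "code quality, problem decomposition, README communication",
};

// Each request names exactly one angle. Nothing about the other
// reviewers or their output is included in any prompt.
function buildReviewRequests(takeHome: string, angles: Angle[]): ReviewRequest[] {
  return angles.map((angle) => ({
    subagent_type: "reviewer", // assumed agent type name
    prompt: `You are reviewing as the ${angle}. Focus: ${FOCUS[angle]}.\n\nSubmission:\n${takeHome}`,
  }));
}
```

Because no prompt mentions the other reviewers, there is nothing in the input that could invite coordination.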

What would fail if you used `team`

The failure here is more interesting than scenario 1’s “overhead.” It’s a structural mismatch:

team_message is in the contract. Even if the three reviewers don’t use it, the system’s affordance for inter-member messaging changes the shape of the work. The prompt for member 2 might mention “feel free to coordinate with member 1 if you find something interesting.” It’s now a different deliverable: three opinions filtered through their awareness of each other.
Async delivery undermines the synthesis. team is async-by-default. If tech lead’s review arrives at 2:35, architect’s at 2:42, eng manager’s at 2:55, Jonathan has been reading them one at a time — he can’t lay all three side by side until 2:55, and by then he’s already shaped a narrative from the first two. Synthesis quality drops.
The persistence is meaningless. No one is going to look up this evaluation in three days. The work product is the debrief decision. team’s DB records survive forever for a one-time decision.

What this reveals about the design

Multiple workers ≠ team. The discriminator between task and team is not worker count. It’s:

Independence between workers: task
Coordination between workers: team

team’s team_message is a feature when coordination is the value (a researcher tells a writer “I found a key statistic, use it”). It’s a liability when independence is the value (three sealed reviewers of one artifact). Some scenarios are easier when the architecture forbids coordination, and task is the primitive that forbids it.

Scenario 3: Multi-req long-term sourcing — team

The trigger

Friday, 4:00 PM. Jonathan looks at his three open engineering reqs:

Senior Backend Engineer (open 2 weeks, no acceptable candidate yet)
Senior Frontend Engineer (open 3 weeks)
Platform Engineer (open 1 week)

Target close: 6 weeks each. Doing it manually means roughly 5 hours/day across sourcing, outreach, qualifying. He has six other major projects this quarter — he does not have 25 hours a week to spend on sourcing.

He types:

“Run sourcing for these three roles on LinkedIn for the next six weeks. Pre-qualify candidates against each JD. Only surface candidates above 75% match — and only after a successful initial outreach. Ping me when something good is ready for me to call.”

What Jonathan actually wants

A system that runs in the background for six weeks and shows him only the candidates worth his time. Specifically:

Continuous LinkedIn sourcing, daily.
Per-candidate qualification against the role-specific JD.
Outreach to the qualified candidates, gauge response.
Filter to ≥75% match AND successful initial response.
Notify him, via his actual notification surface (browser push, email, whatever), when a candidate clears that bar.
Survive his closing the laptop, his weekend, his trip to the offsite in week 3.
Let him adjust criteria mid-run: “be stricter on Go for Backend” should reach the workers on the next cycle without his resubmitting the whole setup.

Why this is `team`

Three properties, but in a different combination from task:

Operator session lifetime < work duration. Jonathan will close his laptop ~30 times over six weeks. He’ll be on vacation for one of those weeks. The chat session will be torn down and rebuilt many times. The work must survive every one of those events.
Push is structural, not optional. Without push, Jonathan has no way to “wait” on a six-week process. The system runs invisibly; if it never pings him, he has no idea it found a candidate. The push channel isn’t a feature add-on — it’s what makes the long duration meaningful to a human operator.
Mid-flight steering matters. After week 2, he sees three Backend candidates surface but they’re all Python-heavy when he wanted Go-first. He types: “Be stricter on Go.” team supports this via team_message to the workers; new candidates evaluated after that message use the tightened criteria.
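The steering behavior can be sketched as a worker that drains an inbox at the start of each cycle. Everything here is illustrative: `SourcingMember`, `Criteria`, and `Candidate` are hypothetical names, `receive` stands in for delivery of a team_message, and the real member loop is not documented in this file. The filter encodes the bar stated earlier: at least 75% match and a successful initial outreach.

```typescript
interface Criteria {
  minMatch: number; // e.g. 0.75 for the 75% bar
  requiredSkills: string[];
}

interface Candidate {
  name: string;
  match: number; // 0..1 match against the role's JD
  skills: string[];
  repliedToOutreach: boolean;
}

class SourcingMember {
  private criteria: Criteria;
  private inbox: Partial<Criteria>[] = [];

  constructor(initial: Criteria) {
    this.criteria = initial;
  }

  // Stand-in for receiving a team_message from the operator (assumption).
  receive(update: Partial<Criteria>): void {
    this.inbox.push(update);
  }

  // One sourcing cycle: apply pending messages first, then filter.
  // Candidates surfaced in earlier cycles are not re-evaluated.
  runCycle(found: Candidate[]): Candidate[] {
    for (const update of this.inbox) {
      this.criteria = { ...this.criteria, ...update };
    }
    this.inbox = [];
    return found.filter(
      (c) =>
        c.match >= this.criteria.minMatch &&
        c.repliedToOutreach &&
        this.criteria.requiredSkills.every((s) => c.skills.includes(s)),
    );
  }
}
```

A message like “Be stricter on Go” would arrive as `receive({ requiredSkills: ["Go"] })` and take effect on the next cycle, without restarting the member or losing its state.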

What would fail if you used `task × N`

The failures here are not “inefficient.” They are structurally impossible:

AI SDK streams can’t run for six weeks. In practice they are capped at minutes to hours by provider timeouts, connection lifetimes, and server restarts. Even if a stream could last that long, blocking the operator’s chat session for six weeks is a category error: Jonathan needs that session for everything else he does.
No mid-flight surfacing. task either returns or doesn’t. There’s no mechanism for a “halfway through, I found a great candidate, here he is” message. You’d have to return early, but then the search stops, so Jonathan would have to manually re-fire task every time he wants more candidates. He has traded manual sourcing for manually issuing sourcing requests.
No mid-flight steering. task’s prompt is set at call time and doesn’t change. Tightening criteria after week 2 means killing the current task and starting a new one with the new prompt — losing whatever state the workers had built up.

What this reveals about the design

team is the primitive when the operator’s session lifetime is bounded but the work isn’t. The decoupling between session and work is team’s structural contribution. Push delivery is what makes that decoupling valuable to a human — without push, the work runs and finishes invisibly.

If push isn’t wired (browser notification, email, Slack), team degrades to “task with worse UX.” Push is not a feature; it’s load-bearing.

Scenario 4: 5-person onboarding cohort — team

The trigger

Friday, 5:00 PM, before a long weekend. Five new hires start Monday:

2 in engineering (one frontend, one platform)
2 in product (PMs)
1 in support

Each hire has roughly six onboarding stages:

Stage	Owner	Depends on
1. Paperwork (offer letter, I-9, benefits)	HR + candidate	—
2. IT setup (laptop ship, account provisioning)	IT	paperwork complete
3. Workspace (desk, badge, parking)	Facilities	offer signed
4. Manager intro meeting	Hire’s manager	day 1
5. Role-specific training (LMS modules)	LMS	IT accounts active
6. 30-day check-in	Jonathan	calendar

Across 5 hires: 30 tracked items, four weeks. Some are sequential per-hire; others are parallel across hires. Jonathan is the HR coordinator; he has 30 other things going on this month.

He types:

“Set up onboarding for these 5 hires starting Monday. Track each through the six stages. Tell me only when something is blocked or a milestone is reached. Otherwise, you don’t need to talk to me.”

What Jonathan actually wants

Set the pipelines up once on Friday, then trust the system to run them. Each hire’s flow proceeds independently — fast on some, slow on others depending on candidate response times, IT shipping schedules, manager availability.

Alerts ONLY on:

Blockers (paperwork unsigned after 3 days)
Milestones (30-day check-in due tomorrow)
Anomalies (training assigned but not started in week 1)

Silence otherwise. He has 30 other things; he does not want a “FYI hire #3’s laptop shipped” ping that adds nothing.
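The alert policy above amounts to a small filter over onboarding events. The sketch below is one way to express it. The event shapes and the numeric thresholds (72 hours for “3 days”, 24 hours for “due tomorrow”, 168 hours for “week 1”) are assumptions chosen to match the examples, not values taken from the system.

```typescript
// Hypothetical event shapes for illustration.
type OnboardingEvent =
  | { kind: "stage_completed"; hire: string; stage: string }
  | { kind: "stage_waiting"; hire: string; stage: string; hoursWaiting: number }
  | { kind: "milestone_due"; hire: string; stage: string; hoursUntilDue: number }
  | { kind: "not_started"; hire: string; stage: string; hoursSinceAssigned: number };

const BLOCKER_THRESHOLD_HOURS = 72; // "paperwork unsigned after 3 days"
const MILESTONE_WINDOW_HOURS = 24; // "check-in due tomorrow"
const ANOMALY_THRESHOLD_HOURS = 168; // "assigned but not started in week 1"

// Returns true only for blockers, milestones, and anomalies.
function shouldNotify(e: OnboardingEvent): boolean {
  switch (e.kind) {
    case "stage_waiting":
      return e.hoursWaiting >= BLOCKER_THRESHOLD_HOURS;
    case "milestone_due":
      return e.hoursUntilDue <= MILESTONE_WINDOW_HOURS;
    case "not_started":
      return e.hoursSinceAssigned >= ANOMALY_THRESHOLD_HOURS;
    case "stage_completed":
      return false; // routine progress stays silent
  }
}
```

Under this policy a “laptop shipped” event is a `stage_completed` and never reaches the operator, while paperwork that has waited 72 hours does.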

Why this is `team`

The combination from scenario 3 plus one more property that’s actually the dominant one here:

Cross-stage dependencies. Stage 5 (training) requires stage 2 (IT accounts active) to be done first. The system must know this, hold stage 5 until stage 2 reports done, then automatically advance stage 5 — without Jonathan’s involvement.

team’s coordinator has exactly this shape: tasks declare their dependencies, status state machine progresses pending → blocked → claimable → in_progress → completed, downstream tasks auto-unblock as their prereqs complete.

Multiply by 5 hires and you have 30 tasks with a dependency graph spanning weeks. team is not optional here — it’s the only primitive that models this shape.
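A toy version of that dependency-driven state machine is sketched below. It is not the coordinator's code: the `Coordinator` class, its method names, and the in-memory `Map` are assumptions for illustration. Only the status names come from the progression described above.

```typescript
type Status = "pending" | "blocked" | "claimable" | "in_progress" | "completed";

interface StageTask {
  id: string;
  dependsOn: string[];
  status: Status;
}

class Coordinator {
  private tasks = new Map<string, StageTask>();

  // New tasks start as pending and are classified immediately.
  add(id: string, dependsOn: string[] = []): void {
    this.tasks.set(id, { id, dependsOn, status: "pending" });
    this.refresh();
  }

  status(id: string): Status | undefined {
    return this.tasks.get(id)?.status;
  }

  start(id: string): void {
    const t = this.tasks.get(id);
    if (!t || t.status !== "claimable") throw new Error(`${id} is not claimable`);
    t.status = "in_progress";
  }

  complete(id: string): void {
    const t = this.tasks.get(id);
    if (!t || t.status !== "in_progress") throw new Error(`${id} is not in progress`);
    t.status = "completed";
    this.refresh(); // downstream tasks auto-unblock here
  }

  // Re-evaluate every task that has not started yet.
  private refresh(): void {
    this.tasks.forEach((t) => {
      if (t.status === "in_progress" || t.status === "completed") return;
      const ready = t.dependsOn.every((d) => this.tasks.get(d)?.status === "completed");
      t.status = ready ? "claimable" : "blocked";
    });
  }
}
```

With `paperwork`, `it-setup` (depends on `paperwork`), and `training` (depends on `it-setup`) registered for one hire, completing `it-setup` is what makes `training` claimable. No operator action is involved in the transition.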

What would fail if you used `task × N`

Sync model is impossible. As in scenario 3, blocking a chat session for four weeks is absurd.
No cross-task dependency model. task is “run this and return.” You can chain by waiting for one return and firing the next, but the chaining lives in the operator’s flow, not the system. Jonathan has to notice “IT account provisioned for hire #2” and then manually call task to assign training. He’s the coordinator.
No threshold-based push. task either succeeds or fails. There’s no “this task is taking longer than expected, alert the operator” mechanism. Jonathan has no way to express “tell me at hour 72 if paperwork still isn’t signed.”
Per-hire stage tracking lives in the operator’s head. With 5 hires × 6 stages = 30 items, Jonathan is mentally tracking a state machine. That’s what coordination systems are FOR.

What this reveals about the design

team’s distinguishing contribution is process orchestration, not parallel execution. Scenario 3 looked like team because of duration and push. Scenario 4 looks like team because of inter-stage dependencies.

The operator’s role shifts from executor (task: I’ll wait while you work) to supervisor (team: I oversee a process; I intervene only on exceptions). When the value of the system is in coordination, that is, when X finishing must trigger Y, task is structurally inadequate. team’s coordinator was designed exactly for this shape.

The boundary, distilled

Two questions, asked in order, decide every scenario above:

Question 1: Will the operator wait on screen for the result?

Answer	Tool
Yes — he’s at the screen, won’t leave until result lands	`task`
No — he has a session lifetime shorter than the work	`team`

Scenarios 1 and 2 are yes. Scenarios 3 and 4 are no. Duration falls out as a consequence: short work permits waiting, long work usually forbids it. But duration is a symptom; presence is the cause.

Question 2: Do workers / stages need to coordinate?

Answer	Tool
No — independence is part of the value (or doesn’t matter)	`task`
Yes — when X done, trigger Y; or members share findings	`team`

Scenarios 1 and 2 are no. Scenarios 3 and 4 are yes. “Many workers” alone doesn’t separate the tools; “coordination” does.

When the two questions disagree

Usually they point the same way. When they disagree, Q2 wins:

“30 resumes screened over a week” (Q1 says team, Q2 says task) → still task. Independent parallel work doesn’t justify team’s overhead just because the operator isn’t watching every second.
“3 workers coordinating in 5 minutes” (Q1 says task, Q2 says team) → still team. Coordination needs are structural; you can’t simulate them in task × N.

The work’s shape determines the tool more than the operator’s calendar.

The borderline case

“Draft 200 personalized opener emails for these sales leads, based on their LinkedIn profiles.”

Workers are independent (Q2 says task). Duration is ~30 minutes — Jonathan could wait or could go to a meeting (Q1 ambiguous).

Default to task; the operator escalates to team by saying so.

If Jonathan’s response to “this will take 30 minutes” is “OK, I’ll wait,” task fits. If it’s “OK, ping me when done,” team fits. The submission can carry that intent (a flag, a phrase, an explicit choice) — but the default should be task. Reason: defaulting to team for any mid-length parallel batch fires the notification channel constantly. The channel has finite signal-to-noise budget. Reserve it for scenarios 3 and 4, not for “I drafted some emails.”
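Taken together, the two questions, the tie-break, and the explicit escalation can be written as one small decision function. This is a sketch of the rule as stated in this doc, not code from the system. `WorkShape`, `chooseTool`, and the `asksForNotification` flag are hypothetical names, and the flag stands in for whatever form the operator's “ping me when done” intent takes at submission.

```typescript
type Tool = "task" | "team";

interface WorkShape {
  operatorWaits: boolean | "ambiguous"; // Question 1
  needsCoordination: boolean; // Question 2
  asksForNotification?: boolean; // explicit "ping me when done" (assumed flag)
}

function chooseTool(w: WorkShape): Tool {
  // The operator can always escalate to team by saying so.
  if (w.asksForNotification) return "team";

  // An ambiguous Q1 defaults to task, per the borderline case.
  const q1: Tool = w.operatorWaits === false ? "team" : "task";
  const q2: Tool = w.needsCoordination ? "team" : "task";

  // When the questions agree, either answer works; when they disagree, Q2 wins.
  return q1 === q2 ? q1 : q2;
}
```

For the 200-email batch, `chooseTool({ operatorWaits: "ambiguous", needsCoordination: false })` returns `"task"`, and adding `asksForNotification: true` flips it to `"team"`.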

Current limits to be honest about

team exposes only the simple shape. Each member’s initialPrompt is its work; the member completes; the run ends. Scenarios 3 and 4 above need exactly this — no pool model, no claim-from-queue. The advanced “members claim dynamic tasks from a shared pool” pattern exists in the coordinator code but isn’t surfaced as a tool. None of the four scenarios above need it.

Push notification channel. team’s structural value lives or dies on push delivery — browser notification, email, Slack. Without push, scenarios 3 and 4 become silent systems that nobody finds out about. The architecture is ready for push; integration with delivery surfaces is a separate workstream and a precondition for team being useful at all.

Attention-fluid case. Operators sometimes start on screen and step away mid-run. The choice between task and team is fixed at submission today. The borderline case above is the visible cost.

Where to go from here

task.md — synchronous subagent dispatch, implementation depth
agent-team.md — asynchronous team coordination, implementation depth
