End-to-End Coordination
The full round-trip of one sendMessage from click to DOM update / the full reference of 25 UIMessageChunk types / the four hops of bidirectional abort / the end-to-end resume implementation pattern / four error path classes / SSE environment-layer traps
Why this page exists
The previous five pages are split by side — four on the send side, one on the receive side. But the most painful real-world issues are vertical, cutting across both sides — “what actually happens between click and DOM update”, “why did the client stop but the server is still burning tokens”, “how does resume really land in production”. None of the five pages answers these completely on its own.
This page stitches the five together into an end-to-end view:
- The full call chain of a single sendMessage (client + server + transport)
- Every UIMessageChunk type’s meaning and trigger (the single source of truth for the wire protocol)
- Three genuinely two-sided protocols (abort / resume / error) that cannot be debugged from one side only
Nothing from the first five pages is repeated; this page only covers “how the pieces fit together”.
The full round-trip of one sendMessage
Starting from the most common scenario: the user types in a textarea and clicks submit. Below is the complete call chain from click to DOM update.
Where each hop lives in the source:
| Hop | Location | Key call |
|---|---|---|
| UI → hook | your component | sendMessage({ text }) |
| hook → Chat | @ai-sdk/react dist/index.js | AbstractChat.sendMessage |
| Chat → transport | [email protected] AbstractChat.makeRequest | transport.sendMessages(opts) |
| transport → wire | DefaultChatTransport.sendMessages | fetch(api, { method: 'POST', body, signal }) |
| wire → server | your route handler | req.json() → messages: UIMessage[] |
| message bridge | [email protected] convertToModelMessages | UIMessage[] → ModelMessage[] |
| model call | [email protected] streamText (dist:6441) | returns StreamTextResult |
| to UI stream | StreamTextResult.toUIMessageStream (dist:7839) | fullStream.pipeThrough(TransformStream) |
| SSE wrap | toUIMessageStreamResponse / createUIMessageStreamResponse | new Response(stream, { headers: sseHeaders }) |
| server → wire | Node / Bun / Edge runtime | HTTP stream write |
| wire → client | HttpChatTransport.processResponseStream | parses SSE data: ...\n\n frames → UIMessageChunk |
| chunk → state | AbstractChat internal chunk dispatch | pushMessage / replaceMessage |
| state → DOM | React subscription + throttle | useSyncExternalStore + re-render |
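The "wire → client" hop can be pictured with a small parser. This is only a sketch of the idea, not the SDK's processResponseStream: parseSseFrames is a hypothetical helper, it assumes each frame holds a single-line JSON payload, and it assumes a terminating `[DONE]` marker may appear and should be skipped.

```typescript
// Hypothetical helper: split buffered SSE text into parsed JSON payloads.
type ParsedFrames = { chunks: unknown[]; rest: string };

function parseSseFrames(buffer: string): ParsedFrames {
  const chunks: unknown[] = [];
  // Frames are separated by a blank line; the last piece may be incomplete.
  const pieces = buffer.split("\n\n");
  const rest = pieces.pop() ?? "";
  for (const piece of pieces) {
    for (const line of piece.split("\n")) {
      if (!line.startsWith("data:")) continue; // skip comments and other fields
      const payload = line.slice(5).trim();
      if (payload === "" || payload === "[DONE]") continue; // assumed end marker
      chunks.push(JSON.parse(payload));
    }
  }
  // `rest` must be prepended to the next network read before parsing again.
  return { chunks, rest };
}

const demoFrames = parseSseFrames(
  'data: {"type":"text-start","id":"t1"}\n\n' +
    'data: {"type":"text-delta","id":"t1","delta":"Hi"}\n\n' +
    'data: {"type":"text-de',
);
console.log(demoFrames.chunks.length, JSON.stringify(demoFrames.rest));
```

The important detail is the carried-over `rest`: network reads do not align with frame boundaries, so a parser that drops the incomplete tail will lose or corrupt chunks.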
UIMessageChunk full-type reference
This is the only wire protocol between client and server — every SSE frame parses into one of these types. The
complete union is at [email protected]/dist/index.d.ts:2158-2273.
Grouped by role (7 groups):
1. Stream lifecycle control
| type | When emitted | Client handling |
|---|---|---|
| start | First chunk of the stream; carries messageId and message-level metadata | Create a new assistant message object |
| start-step | Every agent step starts | Open a new step on the current message |
| finish-step | Every agent step ends | Triggers L3 onStepFinish; step boundary marker |
| finish | Stream ends; carries finishReason | Triggers L3 onFinish; status → ready |
| abort | Server received an abort signal | onFinish({ isAbort: true, ... }) |
| error | Server caught an exception; errorText comes from the onError return value | onError(new Error(errorText)) + isError: true |
2. Text content
| type | Meaning | Client handling |
|---|---|---|
| text-start | A text part starts, id tags this part | Push a TextUIPart into message.parts, text="" |
| text-delta | Incremental chunk | Append to the corresponding id’s TextUIPart.text |
| text-end | This text part ends | Mark complete (optional: trigger tree-shake) |
Same id for start/delta/end = one text segment. Different ids = different segments (e.g. model emits a text block
→ calls a tool → emits another text block = two text parts with different ids).
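The id-matching rule can be sketched as a small reducer. The types below are simplified stand-ins rather than the SDK's definitions: TextPart carries an illustrative `id` and `done` field, and the `delta` field name on text-delta is an assumption about the chunk shape.

```typescript
type TextChunk =
  | { type: "text-start"; id: string }
  | { type: "text-delta"; id: string; delta: string }
  | { type: "text-end"; id: string };

// Simplified stand-in for TextUIPart; `id` and `done` are illustrative fields.
type TextPart = { type: "text"; id: string; text: string; done: boolean };

function applyTextChunk(parts: TextPart[], chunk: TextChunk): TextPart[] {
  if (chunk.type === "text-start") {
    // A new id always opens a new part.
    return [...parts, { type: "text", id: chunk.id, text: "", done: false }];
  }
  if (chunk.type === "text-delta") {
    const { id, delta } = chunk;
    // Deltas are appended only to the part with the matching id.
    return parts.map((p) => (p.id === id ? { ...p, text: p.text + delta } : p));
  }
  const endedId = chunk.id;
  return parts.map((p) => (p.id === endedId ? { ...p, done: true } : p));
}

const textStream: TextChunk[] = [
  { type: "text-start", id: "a" },
  { type: "text-delta", id: "a", delta: "Hel" },
  { type: "text-delta", id: "a", delta: "lo" },
  { type: "text-end", id: "a" },
  { type: "text-start", id: "b" },
  { type: "text-delta", id: "b", delta: "Bye" },
  { type: "text-end", id: "b" },
];
const textParts = textStream.reduce(applyTextChunk, [] as TextPart[]);
console.log(textParts.map((p) => `${p.id}:${p.text}`).join(" | "));
```

Because deltas are purely additive, this reducer also shows why replaying the same text-delta twice doubles the text of a part.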
3. Reasoning / Thinking
| type | Meaning | Client handling |
|---|---|---|
| reasoning-start / reasoning-delta / reasoning-end | Anthropic thinking block / OpenAI reasoning trace | Same structure as text; collapsible in UI |
4. Tool calls
| type | Meaning | Client handling (ToolUIPart.state) |
|---|---|---|
| tool-input-start | Tool call starts, params streaming in | state = 'input-streaming' |
| tool-input-delta | Params text delta (a chunk of the JSON blob) | Accumulate into the “generating” form of input |
| tool-input-available | Params complete | state = 'input-streaming' → 'input-available', input parsed |
| tool-input-error | Params parse or schema validation failed | state = 'output-error', errorText injected |
| tool-approval-request | Tool requires human approval (rare) | state = 'approval-requested' |
| tool-output-available | Tool execution complete | state = 'output-available', output injected |
| tool-output-error | Tool execution failed | state = 'output-error' |
| tool-output-denied | Approval denied | state = 'approval-responded' (denied) |
dynamic-tool variants: if the tool is runtime-discovered (MCP / dynamic registration), the chunk carries
dynamic: true, and the client should render it as a DynamicToolUIPart (type: 'dynamic-tool') rather than the
static tool-${name} form.
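As a quick reference, the table and the dynamic-tool rule can be written down as a lookup. The mapping is transcribed from the table above and has not been checked against the SDK source; nextToolState and partTypeFor are hypothetical names used only for this sketch.

```typescript
type ToolChunkType =
  | "tool-input-start"
  | "tool-input-delta"
  | "tool-input-available"
  | "tool-input-error"
  | "tool-approval-request"
  | "tool-output-available"
  | "tool-output-error"
  | "tool-output-denied";

type ToolState =
  | "input-streaming"
  | "input-available"
  | "approval-requested"
  | "approval-responded"
  | "output-available"
  | "output-error";

// State a tool part ends up in after each chunk type (copied from the table).
const nextToolState: Record<ToolChunkType, ToolState> = {
  "tool-input-start": "input-streaming",
  "tool-input-delta": "input-streaming",
  "tool-input-available": "input-available",
  "tool-input-error": "output-error",
  "tool-approval-request": "approval-requested",
  "tool-output-available": "output-available",
  "tool-output-error": "output-error",
  "tool-output-denied": "approval-responded",
};

// Runtime-discovered tools render as 'dynamic-tool'; static ones as tool-${name}.
function partTypeFor(toolName: string, dynamic: boolean): string {
  return dynamic ? "dynamic-tool" : `tool-${toolName}`;
}

console.log(nextToolState["tool-input-available"], partTypeFor("search", false));
```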
5. Sources / Files
| type | Meaning |
|---|---|
| source-url | A cited URL (RAG / search) |
| source-document | A cited document (PDF / markdown) |
| file | A file produced as part of this response (image / audio / …) |
6. Custom business events
{
type: `data-${string}`; // e.g. data-progress, data-todos-update, data-run-init
data: unknown; // business-layer-defined shape
id?: string;
transient?: boolean; // true = client only sees it in onData, not written to message.parts
}
Full detail of this protocol: UI Stream Orchestration → Custom data-* event protocol and useChat → onData.
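The effect of the transient flag can be sketched as a routing function. This is an illustration under assumptions, not the SDK's implementation: routeDataChunk is a hypothetical name, and it assumes onData is invoked for every data chunk while only non-transient chunks are persisted.

```typescript
type DataChunk = {
  type: `data-${string}`;
  data: unknown;
  id?: string;
  transient?: boolean;
};
type DataPart = { type: `data-${string}`; data: unknown; id?: string };

function routeDataChunk(
  parts: DataPart[],
  chunk: DataChunk,
  onData: (chunk: DataChunk) => void,
): DataPart[] {
  onData(chunk); // assumed: every data chunk is observable in onData
  if (chunk.transient) return parts; // transient chunks never reach message.parts
  const part: DataPart = { type: chunk.type, data: chunk.data };
  if (chunk.id !== undefined) part.id = chunk.id;
  return [...parts, part];
}

const seenTypes: string[] = [];
let dataParts: DataPart[] = [];
dataParts = routeDataChunk(
  dataParts,
  { type: "data-progress", data: { pct: 40 }, transient: true },
  (c) => seenTypes.push(c.type),
);
dataParts = routeDataChunk(
  dataParts,
  { type: "data-todos-update", data: { items: 3 }, id: "todos" },
  (c) => seenTypes.push(c.type),
);
console.log(seenTypes.length, dataParts.length);
```

A practical consequence: transient events such as progress ticks are invisible after a page reload, since they were never part of the stored message.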
7. Message-level metadata
| type | Meaning |
|---|---|
| message-metadata | Attach metadata to the current message (message-level, not part-level) |
Cheat sheet
A total of 25 types (counting data-${string} as one):
| Group | Count | Types |
|---|---|---|
| Lifecycle control | 6 | start / start-step / finish-step / finish / abort / error |
| Text | 3 | text-start / text-delta / text-end |
| Reasoning | 3 | reasoning-start / reasoning-delta / reasoning-end |
| Tools | 8 | tool-input-start / tool-input-delta / tool-input-available / tool-input-error / tool-approval-request / tool-output-available / tool-output-error / tool-output-denied |
| Sources / Files | 3 | source-url / source-document / file |
| Custom | 1 | data-${string} |
| Message metadata | 1 | message-metadata |
Bidirectional abort — the easiest hop to drop
The user clicks stop, or closes the tab, or their network drops. That signal must propagate all the way from client through server to LLM provider. Drop any single hop and you get:
- Client says “stopped”, but the server is still burning tokens
- LLM call completes and writes to DB, but the user is already gone
- Long tool operations (shell commands, HTTP requests) can’t be aborted
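The full four-hop chain: chat.stop() on the client → the fetch signal → request.signal on the server → the abortSignal passed to streamText → the abortSignal handed to tool.execute.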
Dropping each hop: symptoms
| Dropped hop | Symptom |
|---|---|
| Client’s chat.stop() not called | User pressed stop but stream continues; check the stop button handler |
| fetch signal not passed | Client aborted but HTTP connection is alive; DefaultChatTransport handles this correctly, custom transports often miss it |
| Server doesn’t read request.signal | Server keeps executing; the most common dropped hop, check the route handler |
| streamText doesn’t accept abortSignal | LLM call continues; billing keeps running |
| tool.execute ignores its abortSignal | Long tools keep running (shell commands, long fetches) |
Correct template (Hono)
import { convertToModelMessages, streamText } from "ai";
import { Hono } from "hono";

// `model` and `tools` are assumed to be defined elsewhere in your app
const app = new Hono();

app.post("/api/chat", async (c) => {
  const { messages } = await c.req.json();
  const result = streamText({
    model,
    messages: await convertToModelMessages(messages),
    tools,
    abortSignal: c.req.raw.signal, // ← Key: Hono's raw request carries the signal
    // tool execute signatures MUST accept { abortSignal } and forward to downstream fetch / subprocess
  });
  return result.toUIMessageStreamResponse();
});
Correct template (Next.js Route Handler)
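import { convertToModelMessages, streamText } from "ai";

// `model` and `tools` are assumed to be defined elsewhere in your app
export async function POST(req: Request) {
  const { messages } = await req.json();
  const result = streamText({
    model,
    messages: await convertToModelMessages(messages),
    tools,
    abortSignal: req.signal, // ← Key
  });
  return result.toUIMessageStreamResponse();
}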
Tool-internal abort convention
import { tool } from "ai";
import { z } from "zod";

const searchTool = tool({
  description: "...",
  inputSchema: z.object({ query: z.string() }),
  execute: async ({ query }, { abortSignal }) => {
    // Encode the query so spaces and "&" cannot corrupt the URL
    const url = `https://api.example.com/search?q=${encodeURIComponent(query)}`;
    const res = await fetch(url, {
      signal: abortSignal, // ← This line, so long fetches can be aborted
    });
    return res.json();
  },
});
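For tools that wrap something other than fetch (a subprocess, a job handle), the same convention means wiring the signal to a kill call by hand. A minimal sketch, assuming a hypothetical Killable interface in place of a real child-process object; bindAbort is an illustrative helper, not an SDK API.

```typescript
// Hypothetical minimal shape of a long-running job (e.g. a child process wrapper).
interface Killable {
  kill(): void;
}

// Kill the job when the signal aborts; returns a function that removes the listener.
function bindAbort(signal: AbortSignal, job: Killable): () => void {
  if (signal.aborted) {
    // The request was already cancelled before the tool started.
    job.kill();
    return () => {};
  }
  const onAbort = () => job.kill();
  signal.addEventListener("abort", onAbort, { once: true });
  return () => signal.removeEventListener("abort", onAbort);
}

const abortDemoController = new AbortController();
const fakeJob = {
  killed: false,
  kill() {
    this.killed = true;
  },
};
bindAbort(abortDemoController.signal, fakeJob);
abortDemoController.abort(); // simulates the user pressing stop
console.log(fakeJob.killed);
```

Calling the returned unbind function once the job finishes normally keeps a long-lived signal from holding a reference to a finished job.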
End-to-end resume implementation
The useChat page covered the one-line client side (resume: true); the server side is where the real work lives. This
section gives the full two-sided pattern.
Core constraints
- Chunks must be replayable: on reconnect, all previously emitted chunks must be re-sent, then new chunks continue. This requires the server to buffer as it generates.
- Chat id must be stable: the client’s useChat({ id }) on re-mount must pass the same id, or the server won’t find the buffer.
- Storage must be cross-process visible: a single-process in-memory buffer handles only single-worker reconnect; multi-worker deployments or pod restarts lose state, so production uses Redis stream / Pub-Sub.
Client behavior
useChat({
id: chatId, // stable, derived from URL / props
resume: true, // on mount auto-calls transport.reconnectToStream({ chatId })
});
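By default, DefaultChatTransport.reconnectToStream sends a GET request to ${api}/${chatId}/stream; prepareReconnectToStreamRequest can override the URL, headers, and credentials. If the server answers 204 No Content (the transport then returns null) or sends an empty stream, the client enters ready with no replay; if the server returns a non-empty stream, the client consumes it like a normal stream.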
Server implementation (Redis Stream approach)
Pseudocode, core idea:
// POST /api/chat: new conversation
app.post("/api/chat", async (c) => {
  const body = await c.req.json();
  const chatId = body.id;
  const result = streamText({...});
  const uiStream = result.toUIMessageStream();
  // Key: tee one copy into Redis, return the other to the client
  const [forClient, forBuffer] = uiStream.tee();
  // Asynchronously write each chunk of forBuffer into a Redis stream (keyed by chatId).
  // waitUntil is runtime-specific; on Cloudflare Workers, Hono exposes it as c.executionCtx.waitUntil
  c.executionCtx.waitUntil(writeStreamToRedis(chatId, forBuffer));
  // forClient carries UIMessageChunk objects, so it still needs the SSE wrapper
  return createUIMessageStreamResponse({ stream: forClient });
});
// GET /api/chat/:id/stream: reconnect (the URL DefaultChatTransport requests by default)
app.get("/api/chat/:id/stream", async (c) => {
  const reconnectId = c.req.param("id");
  // Query Redis; if the stream exists, return (replay existing + subscribe to new)
  const bufferedStream = await readStreamFromRedis(reconnectId);
  if (bufferedStream) {
    return new Response(bufferedStream, { headers: sseHeaders });
  }
  // Stream completed or doesn't exist → 204, client returns to ready
  return new Response(null, { status: 204 });
});
Idempotency
- Each chunk on the server can carry an auto-incrementing seq number (Redis XADD’s native id field works)
- On reconnect, client passes the last received seq (?chatId=xxx&lastSeq=42)
- Server replays only seq > lastSeq
- This avoids double-applying text-delta on reconnect (text-delta is incremental, so replaying twice doubles the content)
DefaultChatTransport’s default implementation does not auto-dedupe by seq — exact dedup requires custom
prepareReconnectToStreamRequest coordinated with the server route. For most cases: agent step loops produce full
text-start → text-delta → text-end blocks per step; persist after text-end, and the client’s message.parts rebuild
naturally dedupes by part id.
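The "replay only seq > lastSeq, then continue live" rule can be sketched with an in-memory buffer. This is a single-process stand-in for the Redis stream, useful for seeing the semantics only (it has exactly the cross-process limitation noted in the core constraints); ReplayBuffer is a hypothetical class, not part of the SDK.

```typescript
type Entry<T> = { seq: number; chunk: T };

class ReplayBuffer<T> {
  private entries: Entry<T>[] = [];
  private listeners = new Set<(entry: Entry<T>) => void>();

  // Store a chunk under the next sequence number and fan it out to live subscribers.
  append(chunk: T): number {
    const entry: Entry<T> = { seq: this.entries.length + 1, chunk };
    this.entries.push(entry);
    this.listeners.forEach((listener) => listener(entry));
    return entry.seq;
  }

  // Replay everything after lastSeq, then keep delivering new entries.
  subscribe(lastSeq: number, onEntry: (entry: Entry<T>) => void): () => void {
    for (const entry of this.entries) {
      if (entry.seq > lastSeq) onEntry(entry);
    }
    this.listeners.add(onEntry);
    return () => {
      this.listeners.delete(onEntry);
    };
  }
}

const replayBuffer = new ReplayBuffer<string>();
["a", "b", "c"].forEach((chunk) => replayBuffer.append(chunk));

const received: string[] = [];
// The client already holds seq 1 and 2, so only "c" is replayed.
const unsubscribe = replayBuffer.subscribe(2, (entry) => {
  received.push(`${entry.seq}:${entry.chunk}`);
});
replayBuffer.append("d"); // delivered live
unsubscribe();
replayBuffer.append("e"); // not delivered: subscriber has left
console.log(received.join(","));
```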
Common pitfalls
| Error | Reality |
|---|---|
| Using an in-memory Map for buffer | Single pod restart loses everything |
| No TTL | Redis fills up indefinitely (recommend 24-48h TTL) |
| Randomized chatId | Page refresh loses the association; resume always fails |
| On reconnect, call streamText again | Bypasses buffer, LLM runs twice — billing doubles |
| Forget tee() | Can’t both write buffer and send to client |
Four error paths
Server-side failures surface on the client through four different paths, each with its own network behavior:
| Path | When | HTTP layer | Stream layer | How client receives |
|---|---|---|---|---|
| A. Pre-flight (4xx) | Auth failure / validation / rate limit | Returns 4xx with body | Stream not started | fetch returns non-ok response → transport throws → onError(new Error("HTTP 401 ...")) |
| B. Pre-flight (5xx) | Server boot failure / dependency init failure | Returns 5xx with body | Stream not started | Same as above, but message is 5xx content |
| C. Mid-stream (agent internal) | streamText throws / tool throws | HTTP 200, stream already started | Emits error chunk | Chunk parsed → onError(new Error(errorText)) + onFinish({ isError: true }) |
| D. Connection severed | TCP drop / proxy timeout / client network died | HTTP 200, stream started then broke | No error / finish chunk | fetch rejects → onFinish({ isDisconnect: true }) |
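The table reduces to a small decision function over what the client actually observed. A sketch with hypothetical names (Observation, classifyErrorPath); it encodes the table only and makes no claim about how the SDK distinguishes these cases internally.

```typescript
type Observation =
  | { kind: "http"; status: number } // a non-ok response arrived before any stream
  | { kind: "error-chunk"; errorText: string } // parsed from an HTTP 200 stream
  | { kind: "fetch-rejected" }; // stream broke without a finish or error chunk

type ErrorPath = "A" | "B" | "C" | "D";

function classifyErrorPath(observation: Observation): ErrorPath {
  if (observation.kind === "http") {
    return observation.status >= 500 ? "B" : "A";
  }
  if (observation.kind === "error-chunk") return "C";
  return "D";
}

console.log(
  classifyErrorPath({ kind: "http", status: 401 }),
  classifyErrorPath({ kind: "http", status: 503 }),
  classifyErrorPath({ kind: "error-chunk", errorText: "Internal error, please retry." }),
  classifyErrorPath({ kind: "fetch-rejected" }),
);
```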
Client-side signals per path
useChat({
onError: (error) => {
// Fires only for paths A / B / C
// Path D does NOT go here; it goes to onFinish's isDisconnect
},
onFinish: ({ isAbort, isDisconnect, isError, finishReason }) => {
// Four states are mutually exclusive:
// - Normal completion: all three flags false, finishReason = 'stop' / 'length' / 'tool-calls' / ...
// - User stop: isAbort = true
// - Network disconnect: isDisconnect = true (path D)
// - Mid-stream error: isError = true (path C)
// - Paths A / B: onFinish does NOT fire (stream never started)
},
});
Correct server-side layering
Paths A / B (pre-flight): just throw / return HTTP error response:
app.post("/api/chat", async (c) => {
if (!c.req.header("Authorization")) {
return c.json({ error: "Unauthorized" }, 401); // ← Path A
}
// ...
});
Path C (mid-stream): go through the onError serializer:
const result = streamText({...});
return result.toUIMessageStreamResponse({
onError: (error) => {
// Key: this return value becomes the error chunk's errorText
// Don't return error.message directly — may leak internal stack info
log.error("stream.failed", error);
return "Internal error, please retry.";
},
});
If onError returns a string, the stream emits a { type: "error", errorText: "..." } chunk before ending; the
client’s onError and onFinish({ isError: true }) both fire.
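One way to keep the serializer safe while still giving users something more specific than a single generic message is an allow-list keyed by an application error code. A sketch under assumptions: the `code` property and the two codes shown are hypothetical application conventions, not something the SDK defines.

```typescript
// Hypothetical app-level error codes mapped to messages that are safe to show.
const SAFE_MESSAGES: Record<string, string> = {
  RATE_LIMITED: "Too many requests, please wait a moment.",
  UPSTREAM_TIMEOUT: "The model took too long to respond, please retry.",
};
const FALLBACK_MESSAGE = "Internal error, please retry.";

function toClientErrorText(error: unknown): string {
  if (typeof error === "object" && error !== null && "code" in error) {
    const code = (error as { code: unknown }).code;
    if (typeof code === "string") {
      const message: unknown = SAFE_MESSAGES[code];
      // Only strings from the allow-list ever leave the server.
      if (typeof message === "string") return message;
    }
  }
  return FALLBACK_MESSAGE;
}

console.log(toClientErrorText({ code: "RATE_LIMITED" }));
console.log(toClientErrorText(new Error("ECONNREFUSED 10.0.0.12:5432")));
```

A function like this could be returned from the onError callback shown above; raw messages and stack traces stay in the server log.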
Path D (connection severed): server often can’t detect this (or only indirectly via request.signal). The client
detects it from the fetch rejection.
Four client-side recovery strategies
| Case | Strategy | Implementation |
|---|---|---|
| isAbort | Do nothing, user chose this | Empty handler in onFinish |
| isDisconnect | Expose reconnect | Show “Reconnect?” button, call resumeStream() |
| isError | Show error, optionally roll back | onError + setMessages removes the last user msg |
| Paths A / B | Form-error UI | In onError, branch on error.message (or return structured JSON from the server) |
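Inside onFinish, the first three rows can be expressed as one dispatch function. A minimal sketch: recoveryFor and the Recovery labels are hypothetical, and isDisconnect is checked before isError as a defensive choice in case more than one flag is ever set.

```typescript
type FinishFlags = { isAbort: boolean; isDisconnect: boolean; isError: boolean };
type Recovery = "none" | "offer-reconnect" | "show-error";

function recoveryFor(flags: FinishFlags): Recovery {
  if (flags.isAbort) return "none"; // the user chose to stop
  if (flags.isDisconnect) return "offer-reconnect"; // e.g. show a "Reconnect?" button
  if (flags.isError) return "show-error"; // surface the error, optionally roll back
  return "none"; // normal completion
}

console.log(recoveryFor({ isAbort: false, isDisconnect: true, isError: false }));
```

Keeping the decision in a pure function makes it easy to unit-test separately from the component that renders the reconnect button or error banner.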
SSE environment-layer traps
Streaming SSE puts requirements on every layer of the network path. In production, these are the layers that commonly break:
Reverse-proxy buffering
nginx by default buffers the response and holds it until enough bytes accumulate. SSE stream buffered = client sees “a big burst all at once” instead of “small chunks as they arrive”.
location /api/chat {
proxy_pass http://app;
proxy_buffering off; # ← Key
proxy_cache off;
proxy_set_header Connection "";
proxy_http_version 1.1;
chunked_transfer_encoding on;
}
Cloudflare
Default Auto Minify and some compression features can break SSE. In Cloudflare dashboard, disable Auto Minify / Rocket
Loader / Speed features for the /api/chat path. Or add a response header:
return result.toUIMessageStreamResponse({
headers: {
"X-Accel-Buffering": "no", // Generic bypass (nginx / some CDNs recognize)
"Cache-Control": "no-cache",
},
});
Service Worker
If your app registers a PWA service worker, it may cache or replay POST requests. Explicitly exclude /api/chat:
self.addEventListener("fetch", (event) => {
if (event.request.url.includes("/api/chat")) {
return; // Let the browser handle natively, don't go through SW
}
// ...
});
Browser connection limits
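Browsers cap HTTP/1.1 at roughly 6 concurrent connections per origin. An SSE stream holds one of them for its whole lifetime, so additional requests on the same page can easily hit the cap. Deploying over HTTP/2 or HTTP/3 (Vercel / Cloudflare default) avoids this, because requests are multiplexed over a single connection.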
Load-balancer idle timeout
AWS ALB / nginx / Cloudflare’s default idle timeout is typically 60s–5min. An agent running a long task (10+ minutes) may be dropped. Either:
- Bump the LB idle timeout
- Or have the server proactively flush a no-op event every 15–30s (e.g. a transient data-heartbeat) to keep the connection alive
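The heartbeat option can be sketched as a timer that writes a transient data-heartbeat frame. This is a sketch under assumptions: sseFrame and startHeartbeat are hypothetical helpers, and `write` stands for whatever function pushes text into your response stream.

```typescript
// Serialize one chunk as an SSE data frame.
function sseFrame(chunk: object): string {
  return `data: ${JSON.stringify(chunk)}\n\n`;
}

// Emit a transient heartbeat at a fixed interval; returns a stop function.
function startHeartbeat(
  write: (frame: string) => void,
  intervalMs: number = 15000,
): () => void {
  const frame = sseFrame({ type: "data-heartbeat", data: {}, transient: true });
  const timer = setInterval(() => write(frame), intervalMs);
  return () => clearInterval(timer);
}

const heartbeatLog: string[] = [];
const stopHeartbeat = startHeartbeat((frame) => heartbeatLog.push(frame), 1000);
// Always stop the timer when the stream finishes, errors, or is aborted.
stopHeartbeat();
console.log(sseFrame({ type: "data-heartbeat", data: {}, transient: true }).trim());
```

Marking the event transient keeps heartbeats out of message.parts, so they do not pollute persisted history. Forgetting to call the stop function leaks a timer per request.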
Further reading
Other chapters in this section
- Runtime Lifecycle: the 12 callbacks on the timeline
- UI Stream Orchestration: createUIMessageStream / writer / data-* event protocol
- Message Reference Model: the internal model of UIMessage / ModelMessage
- prepareStep Semantics: the deepest per-step hook
- Client Consumption (useChat): full receive-side API
SDK source anchors
- [email protected], dist/index.d.ts:2158-2273 (UIMessageChunk full union definition)
- [email protected], dist/index.d.ts:2150-2156 (DataUIMessageChunk + transient)
- [email protected], dist/index.js:5101-5198 (createUIMessageStreamResponse SSE wrapping)
- @ai-sdk/[email protected], dist/index.js (AbstractChat stream consumption logic)
External references
- MDN — Server-Sent Events (SSE protocol spec)
- Vercel AI SDK Resumable Streams — official resume guide
- nginx SSE config — reverse-proxy buffering settings