Resume after disconnect
Reconnect to an in-flight or recently-completed SSE stream after a network blip — no duplicated work, no token re-billing.
When your client disconnects mid-stream, the agent keeps running on the server side. Events land in a per-turn buffer that you can reconnect to via Last-Event-ID, picking up exactly where you left off. The same primitive works for both the agentic /v1/sessions/.../messages path and the stateless /v1/responses path.
Where the response_id comes from
The {rid} in the resume URL is the response_id for one specific turn — distinct from the {id} (session id). You get it back from the original streaming POST in two redundant places:
- X-Response-Id HTTP response header: the easiest option. Available as soon as the response headers land, before any SSE data:

    curl -i -X POST "$POKEE_API/v1/sessions/$SID/messages" \
      -H "Authorization: Bearer $POKEE_KEY" \
      -d '{"message": "hi"}'
    # HTTP/1.1 200
    # x-response-id: msg_a1b2c3d4e5f6   ← THIS is your {rid}
    # content-type: text/event-stream
    # ...

- First SSE event payload: data.id on message.created (sessions) or response.created (/v1/responses):

    id: 0
    event: message.created
    data: {"id": "msg_a1b2c3d4e5f6", "session_id": "sess_...", "status": "in_progress"}

Persist whichever you read first. If your client died before reading either (the connection dropped during the TLS handshake or before headers arrived), use an Idempotency-Key on the original POST and retry — the server will return the original response_id instead of starting a duplicate turn.
response_id format: msg_<24 hex> for sessions, resp_<24 hex> for stateless /v1/responses.
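
As a rough illustration of "persist whichever you read first", the sketch below pulls the response_id from either source and classifies it by prefix. The function names are hypothetical helpers, not part of any SDK, and the prefix check deliberately does not validate the length of the hex suffix.

```python
import json
from typing import Mapping, Optional


def response_id_from_headers(headers: Mapping[str, str]) -> Optional[str]:
    """Return the X-Response-Id value, matching the header name case-insensitively."""
    for name, value in headers.items():
        if name.lower() == "x-response-id":
            return value.strip()
    return None


def response_id_from_first_event(event: str, data_json: str) -> Optional[str]:
    """Return data.id when the event is message.created or response.created."""
    if event not in ("message.created", "response.created"):
        return None
    return json.loads(data_json).get("id")


def turn_kind(rid: str) -> str:
    """Classify a response_id by prefix: msg_ (session turn) or resp_ (stateless)."""
    if rid.startswith("msg_"):
        return "session"
    if rid.startswith("resp_"):
        return "stateless"
    raise ValueError(f"unrecognized response_id prefix: {rid!r}")
```

Whichever helper returns a value first, store it somewhere that survives a reconnect (for example alongside the session id) before processing further events.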
How it works
- Each streaming POST returns an X-Response-Id header (and the same id on the first SSE event).
- Every SSE event carries an id: <seq> line, a per-turn monotonic integer starting at 0. The same value is also embedded in the data payload as sequence_number, so clients that don't parse the SSE id: line can read the seq directly from JSON.
- On disconnect, reconnect to the resume endpoint with Last-Event-ID: <last seq seen>. The server replays events from <last + 1> and live-tails until the turn closes.
- Closed turns are retained for 120 seconds (RESUME_GRACE_SECONDS) for late reconnects, then evicted.
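
To make the replay rule concrete, here is a toy in-memory model of a per-turn buffer. It is an illustration only, not the gateway's implementation: the TurnBuffer class and its method names are hypothetical, and both live-tailing and eviction after RESUME_GRACE_SECONDS are left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TurnBuffer:
    """Toy model: events for one turn, numbered from 0 in arrival order."""
    events: List[Dict] = field(default_factory=list)

    def append(self, event: str, payload: Dict) -> Dict:
        # The seq appears twice: as the SSE id and as data.sequence_number.
        seq = len(self.events)
        record = {"id": seq, "event": event,
                  "data": {**payload, "sequence_number": seq}}
        self.events.append(record)
        return record

    def replay_after(self, last_event_id: Optional[int] = None) -> List[Dict]:
        # No Last-Event-ID: replay from the start. Otherwise start at last + 1.
        start = 0 if last_event_id is None else last_event_id + 1
        return self.events[start:]


buf = TurnBuffer()
buf.append("message.created", {"id": "msg_example"})
buf.append("text.delta", {"content": "Hel"})
buf.append("text.delta", {"content": "lo"})

# A client that last saw id 0 receives ids 1 and 2 on resume.
resumed = [e["id"] for e in buf.replay_after(0)]
```

In this model a client that reconnects with the last id it saw never receives that event again, which is what makes resuming free of duplicates.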
Sources of the seq, in order of preference:
- SSE id: line. This is what the W3C spec defines and what EventSource reads automatically. Use this if you can.
- data.sequence_number. Same value, embedded in the JSON payload. Use this if your HTTP client doesn't expose id: lines (e.g. some HTTP/2 streaming libraries) or if you'd rather parse JSON than SSE framing.

Both always agree.
Endpoints
GET /v1/sessions/{id}/responses/{rid}/events
Resume a session-turn stream.
GET /v1/responses/{rid}/events
Resume a stateless /v1/responses stream. Auth-scoped to the original caller.
Both endpoints share the same request shape:
| Header | Purpose |
|---|---|
| Authorization: Bearer ... | Required. |
| Last-Event-ID: <int> | Optional. Resume from <int> + 1. Browser EventSource sends this automatically. |
| (Query) ?last_event_id=<int> | Fallback for clients that can't set headers. |
Status codes:
- 200 text/event-stream: streaming. Same event:/data: shape as the original POST, with monotonic id: lines preserved.
- 400: Last-Event-ID is non-integer or negative.
- 404: rid unknown for this session/caller, or buffer evicted past the 120s grace window. Start a new turn.
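
The request shape and status codes above can be sketched as a pair of small helpers. The function names and the returned action strings are hypothetical; only the URL paths, header names, query parameter, and status meanings come from this page.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import urlencode


def resume_request(api: str, key: str, rid: str, sid: Optional[str] = None,
                   last_event_id: Optional[int] = None,
                   use_query: bool = False) -> Tuple[str, Dict[str, str]]:
    """Build (url, headers) for a resume GET on either endpoint."""
    if last_event_id is not None and last_event_id < 0:
        # The server answers 400 for negative values, so fail early.
        raise ValueError("last_event_id must be a non-negative integer")
    if sid is not None:
        url = f"{api}/v1/sessions/{sid}/responses/{rid}/events"
    else:
        url = f"{api}/v1/responses/{rid}/events"
    headers = {"Authorization": f"Bearer {key}"}
    if last_event_id is not None:
        if use_query:
            # Query-string fallback instead of the Last-Event-ID header.
            url += "?" + urlencode({"last_event_id": last_event_id})
        else:
            headers["Last-Event-ID"] = str(last_event_id)
    return url, headers


def next_step(status: int) -> str:
    """Map a resume status code to a suggested client action."""
    if status == 200:
        return "stream"
    if status == 400:
        return "fix-last-event-id"
    if status == 404:
        return "start-new-turn"
    return "unexpected"
```

For example, resume_request(api, key, "msg_...", sid="sess_...", last_event_id=41) asks the server to replay from event 42 onward.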
Browser pattern (zero-config)
EventSource handles reconnection automatically — give it the URL and you're done:
const es = new EventSource(
  `${POKEE_API}/v1/sessions/${sid}/responses/${rid}/events`,
  { withCredentials: true }
);
es.addEventListener("text.delta", (e) => {
  output.append(JSON.parse(e.data).content);
});
es.addEventListener("message.completed", () => es.close());
es.addEventListener("message.cancelled", () => es.close());
es.addEventListener("message.error", () => es.close());

The browser reconnects automatically on disconnect with Last-Event-ID set to the last id: it saw, so events resume cleanly.
Python pattern
If you're using httpx, drive the resume yourself. Track the highest id: and replay on reconnect:
import json
import time

import httpx

TERMINAL = ("message.completed", "message.cancelled", "message.error")

last_id = -1   # highest id: seen so far; -1 means nothing received yet
done = False
while not done:
    headers = {"Authorization": f"Bearer {KEY}"}
    if last_id >= 0:
        headers["Last-Event-ID"] = str(last_id)
    try:
        with httpx.stream(
            "GET", f"{API}/v1/sessions/{sid}/responses/{rid}/events",
            headers=headers, timeout=httpx.Timeout(None),
        ) as r:
            # 400 or 404 means the turn is not resumable; raise instead of looping.
            r.raise_for_status()
            event = None
            for line in r.iter_lines():
                if line.startswith("id:"):
                    last_id = int(line[3:].strip())
                elif line.startswith("event:"):
                    event = line[6:].strip()
                elif line.startswith("data:"):
                    data = json.loads(line[5:].strip())
                    # Equivalent to the id: line above; pick whichever
                    # access path is more convenient. Both always agree.
                    # last_id = data.get("sequence_number", last_id)
                    handle(event, data)
                    if event in TERMINAL:
                        done = True
                        break
    except httpx.TransportError:
        # Connection dropped or could not be re-established: pause briefly,
        # then loop and reconnect with the last id we saw.
        time.sleep(1)

For /v1/responses, swap the URL to /v1/responses/{rid}/events and watch for response.completed / response.cancelled / response.failed instead of message.*.
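
The loop above handles each data: line as soon as it arrives, which works when every event carries a single data: line. As a hedged alternative, the sketch below buffers fields and dispatches on the blank line that ends each SSE event, skips comment lines, and lists the terminal events for both paths. parse_sse and TERMINAL_EVENTS are hypothetical names, and the sketch assumes every id: value is an integer, as described on this page.

```python
import json
from typing import Dict, Iterable, Iterator, List, Optional, Tuple

# Terminal events for the sessions path and the stateless path.
TERMINAL_EVENTS = {
    "message.completed", "message.cancelled", "message.error",
    "response.completed", "response.cancelled", "response.failed",
}


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[Optional[int], Optional[str], Dict]]:
    """Yield (seq, event, data) once per complete SSE event."""
    seq: Optional[int] = None
    event: Optional[str] = None
    data_parts: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the event; dispatch only if it carried data.
            if data_parts:
                yield seq, event, json.loads("\n".join(data_parts))
            seq, event, data_parts = None, None, []
            continue
        if line.startswith(":"):
            continue  # SSE comment line, often used as a keepalive
        name, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is part of the framing
        if name == "id":
            seq = int(value)
        elif name == "event":
            event = value
        elif name == "data":
            data_parts.append(value)
```

With httpx this could be driven as `for seq, event, data in parse_sse(r.iter_lines()):`, updating last_id from seq and stopping once event is in TERMINAL_EVENTS.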
What's NOT replayed
- Server restarts drop in-flight buffers (single-replica gateway today; the buffer is in-memory). After a pod restart, the original turn is gone — start a new one.
- Past-grace turns. The 120-second grace window covers typical Wi-Fi blips and proxy timeouts. If you reconnect later, you'll get 404.
Pairs well with
- Idempotency keys: close the gap when your POST landed but the client never saw the response headers (so you don't have a response_id to resume against).
- Cancel: explicitly abort an in-flight turn. The cancel terminal (message.cancelled / response.cancelled) flows through the same buffer, so resume listeners see it too.
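
A minimal sketch of the idempotency pairing, assuming the key is any client-chosen unique string (a UUID is used here as an assumption; this page does not specify a format). post_headers is a hypothetical helper.

```python
import uuid
from typing import Dict, Optional


def post_headers(key: str, idempotency_key: Optional[str] = None) -> Dict[str, str]:
    """Headers for the original streaming POST; pass the same key again on retry."""
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
        # Generate a key on the first attempt, reuse it verbatim on every retry.
        "Idempotency-Key": idempotency_key or str(uuid.uuid4()),
    }


first_attempt = post_headers("example-key")
retry = post_headers("example-key", first_attempt["Idempotency-Key"])
```

Reusing the key on the retry is what lets the server hand back the original response_id, which you can then resume against.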