Streaming messages
Send a message; receive a Server-Sent Events stream of the agent's response.
POST /v1/sessions/{id}/messages
Request body:
{"message": "Your prompt here. Plain text or markdown."}
Response: text/event-stream. Each event is an id: / event: / data: line triple followed by a blank line (see Streaming format). The stream ends with a message.completed, message.cancelled, or message.error event; the connection closes shortly after.
The response carries an X-Response-Id header, e.g. x-response-id: msg_a1b2c3d4e5f6. This is the response_id (referred to as {rid} in the resume/cancel URLs) — the same id you'll see in the first event's data.id. Capture it as soon as headers land. You'll need it to:
- Resume the stream after a disconnect (GET /v1/sessions/{id}/responses/{rid}/events).
- Cancel the turn mid-flight (POST /v1/sessions/{id}/responses/{rid}/cancel).
A session can only run one message at a time. Sending while busy returns 409 Conflict. Once a message starts, the agent runs to completion even if you disconnect — the next call will see the session as busy until the prior turn finishes.
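One way a client might cope with the 409 Conflict case is to wait and retry with a capped backoff. The sketch below is only an illustration: `send_when_idle` and the injected `post` callable (returning a status code and a response id) are hypothetical names, and the delays are arbitrary choices, not values the API prescribes.

```python
import time
from typing import Callable, Tuple


def send_when_idle(
    post: Callable[[], Tuple[int, str]],
    max_wait: float = 30.0,
    first_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call `post` until it stops answering 409 Conflict.

    `post` is a hypothetical callable that sends the message and returns
    (status_code, response_id). Any non-409 result is handed back to the
    caller unchanged, including other error statuses.
    """
    waited, delay = 0.0, first_delay
    while True:
        status, rid = post()
        if status != 409:
            return status, rid
        if waited >= max_wait:
            raise TimeoutError("session still busy after waiting")
        sleep(delay)
        waited += delay
        delay = min(delay * 2, 8.0)  # exponential backoff, capped at 8 s
```

If the previous turn is no longer wanted, calling the cancel endpoint is an alternative to waiting for it to finish.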
Optional headers:
| Header | Purpose |
|---|---|
| Idempotency-Key: <string> | Make the POST safe to retry: same key + same body returns the original response_id instead of starting a duplicate turn. See Idempotency keys. |
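The point of the header is that every retry of one logical send carries the same key. A minimal sketch of that pattern follows; `send_with_retry` and the injected `send` callable are hypothetical, using a UUID as the key is an assumption (the table above only says the key is a string), and `ConnectionError` stands in for whatever transport error your HTTP client raises.

```python
import uuid
from typing import Callable, Dict


def send_with_retry(
    send: Callable[[Dict[str, str], dict], str],
    body: dict,
    attempts: int = 3,
) -> str:
    """Call send(headers, body) up to `attempts` times with ONE key.

    `send` is a hypothetical callable that performs the POST and returns
    the response_id. The key is generated once, outside the loop, so the
    server can recognise a retry as the same request.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    headers = {"Idempotency-Key": str(uuid.uuid4())}
    last_error: Exception = ConnectionError("no attempt made")
    for _ in range(attempts):
        try:
            return send(headers, body)
        except ConnectionError as exc:  # stand-in for a transport failure
            last_error = exc
    raise last_error
```

Generating a fresh key inside the loop would defeat the purpose, since each retry would then look like a new request.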
Streaming format
Each event is three lines (the id: line is what makes resume work):
id: <seq>
event: <type>
data: <json>
Followed by a blank line. The id: is a per-turn monotonic integer starting at 0 — EventSource picks it up automatically as the Last-Event-ID for any reconnect attempt. If you're rolling your own SSE parser, store the highest id you've seen so you can pass it back on resume.
The same value is also embedded in the data payload as sequence_number — pick whichever is easier to read in your client (the SSE-standard id: line, or the JSON field).
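For clients that roll their own parser, the framing described above can be sketched as a small generator over decoded lines. This is an illustrative sketch, not the service's reference parser: `Event` and `iter_events` are names made up here, and it assumes every data: payload is JSON, as in the format shown.

```python
import json
from typing import Iterable, Iterator, NamedTuple, Optional


class Event(NamedTuple):
    seq: Optional[int]  # value of the id: line
    type: str           # value of the event: line
    data: dict          # parsed JSON from the data: line(s)


def iter_events(lines: Iterable[str]) -> Iterator[Event]:
    """Group id:/event:/data: lines into events, dispatching on blank lines."""
    seq, etype, data_lines = None, "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line closes the current event.
            if data_lines:
                yield Event(seq, etype, json.loads("\n".join(data_lines)))
            seq, etype, data_lines = None, "message", []
            continue
        if line.startswith(":"):
            continue  # SSE comment line, often used as a keep-alive
        name, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # SSE strips one leading space after the colon
        if name == "id":
            seq = int(value)
        elif name == "event":
            etype = value
        elif name == "data":
            data_lines.append(value)


sample = [
    "id: 0", "event: text.start", "data: {}", "",
    "id: 1", "event: text.delta", 'data: {"content": "Hi"}', "",
]
for ev in iter_events(sample):
    print(ev.seq, ev.type, ev.data)
```

Keeping the largest `seq` seen so far gives you the value to send back as Last-Event-ID on resume.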
Two distinct identifiers — don't confuse them:
| Identifier | Where | Format | Cardinality | Used for |
|---|---|---|---|---|
| response_id ({rid}) | X-Response-Id header AND data.id on message-level events (message.created, message.completed, message.cancelled, message.error) | string msg_<24 hex> | One per turn | The {rid} segment in resume / cancel URLs |
| event id: / sequence_number | The id: <int> SSE line AND data.sequence_number on every event | integer 0, 1, 2, ... | One per event within a turn | The Last-Event-ID value on resume |
Examples:
| Event | Data shape | Meaning |
|---|---|---|
| message.created | {id, session_id, status:"in_progress"} | First event; gives you the response id |
| text.start | {} | Text block opened |
| text.delta | {content: "..."} | Incremental text |
| text.stop | {} | Text block closed |
| thinking.start/.delta/.stop | similar to text | Agent's reasoning trace (when extended thinking is on) |
| tool_use.start | {name, id} | Agent invoked a tool |
| tool_use.input_delta | {content: "..."} | Tool's input args streaming |
| tool_use.executing | {name, id, input} | Final input, about to execute |
| tool_use.stop | {} | Tool block closed |
| tool_result | {id, content, is_error} | Tool returned (matches tool_use.start.id) |
| tool_call.requested | {tool_call_id, name, arguments} | Agent invoked one of your registered tools; the turn is suspended until you POST /v1/sessions/{id}/tool_results |
| message.completed | {id, status:"completed", usage:{...}} | Successful end of turn |
| message.cancelled | {id, status:"cancelled"} | You called the cancel endpoint mid-turn; terminal event, no message.completed follows |
| message.error | {id, error: "..."} | Failed end of turn |
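A client usually folds these events into some per-turn state. The sketch below shows one possible shape, assuming the data fields listed in the table; `Turn` and `apply` are hypothetical names, and event types it does not recognise are ignored.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# The three events that end a turn, per the table above.
TERMINAL = {"message.completed", "message.cancelled", "message.error"}


@dataclass
class Turn:
    response_id: Optional[str] = None
    text: str = ""
    tools: List[dict] = field(default_factory=list)
    outcome: Optional[str] = None  # name of the terminal event, once seen

    @property
    def done(self) -> bool:
        return self.outcome is not None


def apply(turn: Turn, event: str, data: dict) -> Turn:
    """Update `turn` in place with one (event, data) pair and return it."""
    if event == "message.created":
        turn.response_id = data["id"]
    elif event == "text.delta":
        turn.text += data["content"]
    elif event == "tool_use.executing":
        # Carries the final tool input, so it is the easiest place to record a call.
        turn.tools.append({"id": data["id"], "name": data["name"], "input": data["input"]})
    elif event in TERMINAL:
        turn.outcome = event
    return turn
```

A read loop can then stop as soon as `turn.done` is true rather than waiting for the connection to close.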
Reading robustly: parse on \n\n boundaries. Many SSE clients (browser EventSource, Python httpx-sse, etc.) handle the framing for you.
Disconnection: if your client disconnects mid-stream, the agent keeps running and events keep landing in a per-turn buffer. Reconnect via GET /v1/sessions/{id}/responses/{rid}/events with Last-Event-ID to pick up where you left off — no duplicated work, no token re-billing.
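The reconnect described above boils down to one GET with a Last-Event-ID header. A rough sketch of building that request and guarding against replayed events follows; `resume_request` and `only_new` are hypothetical helpers, and leaving the header out when nothing was received yet is an assumption about how the server treats a missing Last-Event-ID.

```python
from typing import Dict, Iterable, Iterator, Optional, Tuple

EventTuple = Tuple[int, str, dict]  # (sequence number, event type, data)


def resume_request(
    session_id: str, rid: str, last_event_id: Optional[int]
) -> Tuple[str, Dict[str, str]]:
    """Return the (path, headers) for resuming a turn's event stream."""
    path = f"/v1/sessions/{session_id}/responses/{rid}/events"
    headers: Dict[str, str] = {}
    # Assumption: with no events seen yet, omit the header and read from the start.
    if last_event_id is not None and last_event_id >= 0:
        headers["Last-Event-ID"] = str(last_event_id)
    return path, headers


def only_new(events: Iterable[EventTuple], last_event_id: int) -> Iterator[EventTuple]:
    """Drop events at or below the last sequence number already processed."""
    for seq, etype, data in events:
        if seq > last_event_id:
            yield seq, etype, data
```

With httpx, the pair can be passed as `client.stream("GET", path, headers=headers)`. The `only_new` filter is purely defensive, in case a resumed stream replays an event the client already handled.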
Python example
import json
import os

import httpx

API = os.environ["POKEE_API"]
KEY = os.environ["POKEE_KEY"]
H = {"Authorization": f"Bearer {KEY}", "Content-Type": "application/json"}

with httpx.Client(base_url=API, headers=H, timeout=httpx.Timeout(None)) as c:
    sid = c.post("/v1/sessions", json={"persistent": True}).json()["id"]
    with c.stream("POST", f"/v1/sessions/{sid}/messages",
                  json={"message": "Write a short poem about persistent storage."}) as r:
        # Grab the response_id off the headers BEFORE iterating events.
        # This is the {rid} you'd pass to the resume / cancel endpoints.
        rid = r.headers["X-Response-Id"]
        last_event_id = -1
        event = None  # set by each "event:" line, read by the "data:" line that follows
        for line in r.iter_lines():
            if line.startswith("id:"):
                last_event_id = int(line[3:].strip())
            elif line.startswith("event:"):
                event = line[6:].strip()
            elif line.startswith("data:"):
                data = json.loads(line[5:].strip())
                if event == "text.delta":
                    print(data["content"], end="", flush=True)
                elif event == "message.completed":
                    print(f"\n\n[done] usage: {data['usage']}")
                elif event == "message.cancelled":
                    print("\n\n[cancelled]")
                elif event == "message.error":
                    print(f"\n\n[error] {data['error']}")
        # On disconnect, pass `rid` and `last_event_id` to the resume
        # endpoint to pick up where you left off (see /docs/resume).