Pokee Enterprise API

Streaming messages

Send a message; receive a Server-Sent Events stream of the agent's response.

POST /v1/sessions/{id}/messages

Send a message; the response is a Server-Sent Events stream.

Request body:

{"message": "Your prompt here. Plain text or markdown."}

Response: text/event-stream. Each event is a three-line block (id, event, data) followed by a blank line. The stream ends with a message.completed, message.cancelled, or message.error event; the connection closes shortly after.

The response carries an X-Response-Id header, e.g. x-response-id: msg_a1b2c3d4e5f6. This is the response_id (referred to as {rid} in the resume/cancel URLs) — the same id you'll see in the first event's data.id. Capture it as soon as headers land. You'll need it to:

  • Resume the stream after a disconnect (GET /v1/sessions/{id}/responses/{rid}/events).
  • Cancel the turn mid-flight (POST /v1/sessions/{id}/responses/{rid}/cancel).

A session can only run one message at a time. Sending while busy returns 409 Conflict. Once a message starts, the agent runs to completion even if you disconnect — the next call will see the session as busy until the prior turn finishes.
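One way to cope with a busy session is to retry the POST after a short pause. The sketch below assumes a caller-supplied `send` callable (hypothetical) that performs the POST and returns an object with a `status_code` attribute, as httpx responses do. The attempt count and delay are illustrative values, not documented limits.

```python
import time
from typing import Any, Callable


def send_when_idle(send: Callable[[], Any], attempts: int = 5,
                   delay_s: float = 1.0,
                   sleep: Callable[[float], None] = time.sleep) -> Any:
    """Call `send` until the response is not 409 Conflict.

    `send` is a hypothetical callable that performs
    POST /v1/sessions/{id}/messages and returns the response.
    Returns the last response seen (still a 409 if the session stayed busy).
    """
    response = None
    for attempt in range(attempts):
        response = send()
        if response.status_code != 409:
            break  # accepted, or failed for a reason other than "busy"
        if attempt < attempts - 1:
            sleep(delay_s)  # prior turn is still running; wait and retry
    return response
```

Injecting `sleep` keeps the helper easy to exercise without real delays. Whether to wait, cancel the prior turn, or surface the 409 to the user depends on the application.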

Optional headers:

| Header | Purpose |
| --- | --- |
| Idempotency-Key: <string> | Make the POST safe to retry: the same key with the same body returns the original response_id instead of starting a duplicate turn. See Idempotency keys. |
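A minimal sketch of how a client might prepare a retry-safe request: build the key and body once, then reuse both unchanged on every retry. The helper name `new_idempotent_request` is hypothetical, and using a UUID as the key is an assumption; the API only requires a string.

```python
import uuid


def new_idempotent_request(message: str) -> dict:
    """Build headers and body once; reuse the same dict for every retry."""
    return {
        # Key format is an assumption: any string unique per logical message.
        "headers": {"Idempotency-Key": str(uuid.uuid4())},
        "json": {"message": message},
    }


req = new_idempotent_request("Summarize the last report.")
# Every retry sends req["headers"] and req["json"] as-is, so the key and the
# body stay identical across attempts.
```

Generating a fresh key inside the retry loop would defeat the purpose, since each attempt would then look like a new turn.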

Streaming format

Each event is three lines (the id: line is what makes resume work):

id: <seq>
event: <type>
data: <json>

Followed by a blank line. The id: is a per-turn monotonic integer starting at 0 — EventSource picks it up automatically as the Last-Event-ID for any reconnect attempt. If you're rolling your own SSE parser, store the highest id you've seen so you can pass it back on resume.

The same value is also embedded in the data payload as sequence_number — pick whichever is easier to read in your client (the SSE-standard id: line, or the JSON field).

Two distinct identifiers — don't confuse them:

| Identifier | Where | Format | Cardinality | Used for |
| --- | --- | --- | --- | --- |
| response_id ({rid}) | X-Response-Id header and data.id on lifecycle events (message.created, message.completed, message.cancelled, message.error) | string msg_<24 hex> | One per turn | The {rid} segment in resume / cancel URLs |
| event id: / sequence_number | The id: <int> SSE line and data.sequence_number on every event | integer 0, 1, 2, ... | One per event within a turn | The Last-Event-ID value on resume |

Examples:

| Event | Data shape | Meaning |
| --- | --- | --- |
| message.created | {id, session_id, status:"in_progress"} | First event; gives you the response id |
| text.start | {} | Text block opened |
| text.delta | {content: "..."} | Incremental text |
| text.stop | {} | Text block closed |
| thinking.start / .delta / .stop | similar to text | Agent's reasoning trace (when extended thinking is on) |
| tool_use.start | {name, id} | Agent invoked a tool |
| tool_use.input_delta | {content: "..."} | Tool's input args streaming |
| tool_use.executing | {name, id, input} | Final input, about to execute |
| tool_use.stop | {} | Tool block closed |
| tool_result | {id, content, is_error} | Tool returned (matches tool_use.start.id) |
| tool_call.requested | {tool_call_id, name, arguments} | Agent invoked one of your registered tools; the turn is suspended until you POST /v1/sessions/{id}/tool_results |
| message.completed | {id, status:"completed", usage:{...}} | Successful end of turn |
| message.cancelled | {id, status:"cancelled"} | You called the cancel endpoint mid-turn; terminal event, no message.completed follows |
| message.error | {id, error: "..."} | Failed end of turn |
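To show how these events fit together, here is a small sketch that folds a sequence of already-parsed events into a summary of the turn. The function name `reduce_events` and the keys of the returned dict are illustrative choices, not part of the API; only the event names and data fields come from the table above.

```python
TERMINAL = {"message.completed", "message.cancelled", "message.error"}


def reduce_events(events):
    """Fold (event_type, data) pairs into a summary of the turn."""
    state = {"rid": None, "text": "", "outcome": None, "pending_tool_calls": []}
    for event, data in events:
        if event == "message.created":
            state["rid"] = data["id"]  # response_id for resume / cancel
        elif event == "text.delta":
            state["text"] += data["content"]
        elif event == "tool_call.requested":
            # Turn is suspended until a tool result is posted for this id.
            state["pending_tool_calls"].append(data["tool_call_id"])
        elif event in TERMINAL:
            # Outcome is derived from the event name, not from data.status.
            state["outcome"] = event.split(".", 1)[1]
            break
    return state
```

Event types the sketch does not list (thinking.*, tool_use.*, tool_result) are ignored rather than rejected, so the reducer keeps working if they appear in the stream.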

Reading robustly: parse on \n\n boundaries. Many SSE clients (browser EventSource, Python httpx-sse, etc.) handle the framing for you.
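If you do parse the stream yourself, the blank-line framing can be handled with a small buffer. The sketch below is one possible approach, not a full SSE implementation: the class name `SSEBuffer` is hypothetical, and it assumes the three-line event shape described above (a single data: line per event, with LF or CRLF line endings).

```python
import json


class SSEBuffer:
    """Accumulates text chunks and returns each event once it is complete."""

    def __init__(self) -> None:
        self._buf = ""

    def feed(self, chunk: str) -> list:
        # Normalize after concatenating so a CRLF split across chunks is handled.
        self._buf = (self._buf + chunk).replace("\r\n", "\n")
        events = []
        while "\n\n" in self._buf:
            block, self._buf = self._buf.split("\n\n", 1)
            fields = {}
            for line in block.split("\n"):
                if not line or line.startswith(":"):
                    continue  # skip empty lines and SSE comment lines
                name, _, value = line.partition(":")
                # SSE strips at most one leading space from the value.
                fields[name] = value[1:] if value.startswith(" ") else value
            if {"id", "event", "data"} <= fields.keys():
                events.append((int(fields["id"]), fields["event"],
                               json.loads(fields["data"])))
        return events
```

Feeding it network chunks as they arrive returns only complete events; a partial event stays in the buffer until its closing blank line shows up.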

Disconnection: if your client disconnects mid-stream, the agent keeps running and events keep landing in a per-turn buffer. Reconnect via GET /v1/sessions/{id}/responses/{rid}/events with Last-Event-ID to pick up where you left off — no duplicated work, no token re-billing.
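The resume call can be sketched as a plain request description. The helper name `resume_request` is hypothetical. Sending Last-Event-ID as an HTTP request header follows the SSE convention that EventSource uses, and omitting it when no event has been seen yet is a client-side choice assumed here rather than documented behavior.

```python
def resume_request(session_id: str, rid: str, last_event_id: int) -> dict:
    """Describe the GET that resumes a turn after the last event seen."""
    headers = {}
    if last_event_id >= 0:
        # Assumption: Last-Event-ID travels as a request header (SSE convention).
        headers["Last-Event-ID"] = str(last_event_id)
    return {
        "method": "GET",
        "url": f"/v1/sessions/{session_id}/responses/{rid}/events",
        "headers": headers,
    }
```

The returned dict maps directly onto most HTTP clients, for example `client.stream(req["method"], req["url"], headers=req["headers"])` with httpx.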

Python example

import json
import os
import httpx

API = os.environ["POKEE_API"]
KEY = os.environ["POKEE_KEY"]
H   = {"Authorization": f"Bearer {KEY}", "Content-Type": "application/json"}

with httpx.Client(base_url=API, headers=H, timeout=httpx.Timeout(None)) as c:
    sid = c.post("/v1/sessions", json={"persistent": True}).json()["id"]

    with c.stream("POST", f"/v1/sessions/{sid}/messages",
                  json={"message": "Write a short poem about persistent storage."}) as r:
        # Grab the response_id off the headers BEFORE iterating events.
        # This is the {rid} you'd pass to the resume / cancel endpoints.
        rid = r.headers["X-Response-Id"]
        last_event_id = -1
        event = None  # set by each "event:" line before its "data:" line
        for line in r.iter_lines():
            if line.startswith("id:"):
                last_event_id = int(line[3:].strip())
            elif line.startswith("event:"):
                event = line[6:].strip()
            elif line.startswith("data:"):
                data = json.loads(line[5:].strip())
                if event == "text.delta":
                    print(data["content"], end="", flush=True)
                elif event == "message.completed":
                    print(f"\n\n[done] usage: {data['usage']}")
                elif event == "message.error":
                    print(f"\n\n[error] {data['error']}")
        # On disconnect, pass `rid` and `last_event_id` to the resume
        # endpoint to pick up where you left off — see /docs/resume.
