Tool calling

Let the agent call your functions. Two surfaces — stateless OpenAI-compatible /v1/responses and session-mode tool egress over SSE.

Define functions in your code; let the agent decide when to call them. The model returns the call (name + JSON arguments); you execute the function on your side; you feed the result back. Identical pattern to OpenAI's function calling — the wire shape matches OpenAI's /v1/responses.

Two surfaces, same primitive:

| | Stateless `/v1/responses` | Session tools |
| --- | --- | --- |
| Agent loop | None (pure inference) | Full agentic session (skills, files, tools) |
| State | You send the full message history on each call | Server holds the conversation |
| Tool call surfaces as | `output[type="function_call"]` in the response body | `tool_call.requested` SSE event mid-stream |
| Continuation | New call with a `function_call_output` item in `input` | `POST /v1/sessions/{id}/tool_results` to resume the open SSE stream |

Use stateless when you're swapping in for OpenAI's API or the model just needs to pick a tool. Use session tools when the agent should reason, run skills, AND occasionally call your tools in the same turn.

Stateless `/v1/responses`

A single inference round-trip with tools attached. No session, no agent loop. OpenAI Responses-compatible.

Request

curl -X POST "$POKEE_API/v1/responses" \
  -H "Authorization: Bearer $POKEE_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "What is the weather in Paris?",
    "tools": [
      {
        "type": "function",
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
          "type": "object",
          "properties": {"location": {"type": "string"}},
          "required": ["location"]
        }
      }
    ]
  }'

Body fields:

| Field | Type | Description |
| --- | --- | --- |
| `input` | string \| array | Required. A string is treated as one user message. An array uses OpenAI Responses items (see Continuation). |
| `model` | string | Optional. Defaults to your tenant's configured model. |
| `tools` | array | OpenAI Responses tool shape. The flat form (`{type, name, description, parameters}`) and the nested Chat Completions form (`{type, function: {name, description, parameters}}`) are both accepted. |
| `tool_choice` | `"auto"` \| `"none"` \| `"required"` \| `{type:"function", name}` | Defaults to `"auto"`. `"none"` strips tools entirely. |
| `instructions` | string | Optional system prompt. |
| `max_output_tokens` (or `max_tokens`) | integer | Default 4096, ceiling 32000. |
| `temperature`, `top_p`, `stop` / `stop_sequences` | - | Pass-through. |
| `metadata` | object | Echoed back on the response. |

`stream: true` is not yet supported. Omit the field or set it to `false`; a request with `stream: true` returns 400.
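Since the `tools` field accepts both the flat and the nested form, a client that collects tool definitions from mixed sources may prefer to normalize them before sending. The sketch below shows one way to do that on the client side; `to_flat_tool` is a hypothetical helper name, not part of the API, and the server does not require this step.

```python
from typing import Any, Dict


def to_flat_tool(tool: Dict[str, Any]) -> Dict[str, Any]:
    """Return the flat Responses form of a function tool definition.

    Accepts either the flat form or the nested Chat Completions form.
    Hypothetical client-side helper; not part of the API.
    """
    if tool.get("type") != "function":
        raise ValueError(f"unsupported tool type: {tool.get('type')!r}")
    # The nested form keeps name/description/parameters under "function".
    fields = tool["function"] if "function" in tool else tool
    flat: Dict[str, Any] = {"type": "function", "name": fields["name"]}
    for key in ("description", "parameters"):
        if key in fields:
            flat[key] = fields[key]
    return flat


nested = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    },
}
flat = to_flat_tool(nested)
```

Passing an already flat definition through the helper returns an equivalent dictionary, so it can be applied to every tool unconditionally.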

Response

{
  "id": "resp_922b14f0c88f4ca1aff9a4d9",
  "object": "response",
  "created_at": 1777469641,
  "status": "completed",
  "model": "claude-opus-4-7",
  "output": [
    {
      "type": "function_call",
      "id": "fc_ef8ee26e79474ba4af9d134c",
      "call_id": "toolu_bdrk_01MMvD8B4QyyX4EuqAMy7ogD",
      "name": "get_weather",
      "arguments": "{\"location\": \"Paris\"}",
      "status": "completed"
    }
  ],
  "usage": {
    "input_tokens": 565,
    "input_tokens_details": {"cached_tokens": 0, "cache_creation_tokens": 0},
    "output_tokens": 54,
    "total_tokens": 619
  },
  "tool_choice": "auto",
  "parallel_tool_calls": true,
  "stop_reason": "tool_use"
}

output[] may contain a mix of message items (assistant text) and function_call items (tool calls). Iterate the array in order.
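As a minimal sketch of that iteration, the helper below walks `output` in order and separates assistant text from tool calls. The `function_call` fields come from the response above; the `message` item shape (a `content` list with `output_text` parts) is assumed to follow the OpenAI Responses format, and `split_output` is a hypothetical name.

```python
import json
from typing import Any, Dict, List, Tuple


def split_output(response: Dict[str, Any]) -> Tuple[str, List[Dict[str, Any]]]:
    """Walk output[] in order; return (assistant_text, parsed_tool_calls)."""
    text_parts: List[str] = []
    calls: List[Dict[str, Any]] = []
    for item in response.get("output", []):
        if item["type"] == "message":
            # Assumed shape: content is a list of parts; keep only output_text.
            for part in item.get("content", []):
                if part.get("type") == "output_text":
                    text_parts.append(part["text"])
        elif item["type"] == "function_call":
            calls.append({
                "call_id": item["call_id"],
                "name": item["name"],
                # arguments arrives as a JSON-encoded string.
                "arguments": json.loads(item["arguments"]),
            })
    return "".join(text_parts), calls
```

Decoding `arguments` at this point means a malformed argument string fails early, before any tool code runs.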

Continuation

Execute the function on your side, then call again with the original messages plus the model's function_call item and a matching function_call_output item:

curl -X POST "$POKEE_API/v1/responses" \
  -H "Authorization: Bearer $POKEE_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": [
      {"role": "user", "content": "What is the weather in Paris?"},
      {
        "type": "function_call",
        "call_id": "toolu_bdrk_01MMvD8B4QyyX4EuqAMy7ogD",
        "name": "get_weather",
        "arguments": "{\"location\": \"Paris\"}"
      },
      {
        "type": "function_call_output",
        "call_id": "toolu_bdrk_01MMvD8B4QyyX4EuqAMy7ogD",
        "output": "{\"temperature_c\": 14, \"condition\": \"cloudy\"}"
      }
    ],
    "tools": [
      {"type": "function", "name": "get_weather",
       "description": "Look up current weather for a city.",
       "parameters": {"type": "object", "properties": {"location": {"type": "string"}}, "required": ["location"]}}
    ]
  }'

The server is stateless across calls — always pass the full message history. The call_id in function_call_output must match the one in the prior function_call.
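The pairing rule can be sketched as a small history helper: it appends the model's `function_call` item followed by a `function_call_output` item that reuses the same `call_id`. `append_tool_result` is a hypothetical name, and serializing non-string results with `json.dumps` is a client-side choice that mirrors the example above rather than a documented requirement.

```python
import json
from typing import Any, Dict, List


def append_tool_result(history: List[Dict[str, Any]],
                       call: Dict[str, Any],
                       result: Any) -> List[Dict[str, Any]]:
    """Append a function_call and its function_call_output to the history."""
    history.append({
        "type": "function_call",
        "call_id": call["call_id"],
        "name": call["name"],
        "arguments": call["arguments"],  # still the JSON string from the model
    })
    history.append({
        "type": "function_call_output",
        "call_id": call["call_id"],      # must match the function_call above
        "output": result if isinstance(result, str) else json.dumps(result),
    })
    return history


history = [{"role": "user", "content": "What is the weather in Paris?"}]
call = {
    "type": "function_call",
    "call_id": "toolu_bdrk_01MMvD8B4QyyX4EuqAMy7ogD",
    "name": "get_weather",
    "arguments": "{\"location\": \"Paris\"}",
}
append_tool_result(history, call, {"temperature_c": 14, "condition": "cloudy"})
```

The resulting `history` list can be sent as `input` on the next `/v1/responses` call.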

Errors

| Status | When |
| --- | --- |
| `400` | Malformed input, unknown tool type, invalid `tool_choice`, `stream: true` |
| `402` | Insufficient credits |
| `429` | Rate limit exceeded |
| `502` | Upstream inference failure |
| `504` | Upstream timeout |
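One way a client might act on these statuses is to retry only the transient ones. The sketch below treats `429`, `502`, and `504` as retryable with exponential backoff and full jitter, and `400` and `402` as final. This retry policy is an assumption based on common HTTP practice; the API does not document which statuses are safe to retry, and `retry_delay` with its defaults is hypothetical.

```python
import random
from typing import Optional

# Assumption: rate limits and upstream failures are transient; 400/402 are not.
RETRYABLE_STATUSES = {429, 502, 504}


def retry_delay(status: int, attempt: int,
                base: float = 0.5, cap: float = 30.0) -> Optional[float]:
    """Return seconds to sleep before the next attempt, or None to give up.

    attempt is zero-based; the delay ceiling doubles each attempt up to cap.
    """
    if status not in RETRYABLE_STATUSES:
        return None
    ceiling = min(cap, base * (2 ** attempt))
    # Full jitter: pick uniformly in [0, ceiling] to spread out retries.
    return random.uniform(0, ceiling)
```

A caller would sleep for the returned delay and resend the same request body, since the stateless endpoint holds no conversation state between calls.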

Session-mode tools

Attach tools to a session at create time. The agent reasons normally — using its skills, files, and built-in tools — and additionally may decide to call one of yours. When it does, the open SSE stream emits a tool_call.requested event and waits for you to POST the result back.

1. Register tools at session create

Pass a tools array on POST /v1/sessions:

curl -X POST "$POKEE_API/v1/sessions" \
  -H "Authorization: Bearer $POKEE_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "tools": [
      {
        "type": "function",
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
          "type": "object",
          "properties": {"location": {"type": "string"}},
          "required": ["location"]
        }
      }
    ]
  }'

Same OpenAI tool shape as /v1/responses. Tools are locked at session create time — the underlying SDK can't swap MCP servers mid-session. Set them once and reuse the session across many turns. To change the tool list, destroy and recreate the session.

GET /v1/sessions/{id} echoes back a summary of registered tools under external_tools.

2. Agent decides to call your tool

Send a message normally:

curl -N -X POST "$POKEE_API/v1/sessions/$SID/messages" \
  -H "Authorization: Bearer $POKEE_KEY" \
  -H "Content-Type: application/json" \
  -d '{"message": "What is the weather in Paris right now?"}'

If the agent decides to use one of your tools, the SSE stream emits a new event:

event: tool_call.requested
data: {
  "tool_call_id": "call_a1b2c3d4e5f6...",
  "name": "get_weather",
  "arguments": "{\"location\": \"Paris\"}"
}

The stream stays open. The agent's turn is suspended server-side — no more events flow until you respond. The standard tool_use.start / tool_use.executing / tool_use.stop events still fire too, so consumers that aggregate by tool_use.id see the call there as well; the new tool_call.requested event is what carries the developer-facing tool_call_id.
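To pick `tool_call.requested` out of the stream, a client needs to group `event:` and `data:` lines into events. Below is a minimal parser sketch that follows the standard Server-Sent Events framing (a blank line ends an event, multiple `data:` lines are joined with newlines, lines starting with `:` are comments). It assumes every `data` payload is JSON, as in the events shown here; `iter_sse_events` is a hypothetical name.

```python
import json
from typing import Any, Iterable, Iterator, List, Optional, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, Any]]:
    """Yield (event_name, parsed_json_data) for each complete SSE event."""
    event: Optional[str] = None
    data_lines: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # Blank line: dispatch the buffered event, if it carried data.
            if data_lines:
                yield event or "message", json.loads("\n".join(data_lines))
            event, data_lines = None, []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # Per the SSE format, drop a single leading space if present.
            data_lines.append(value[1:] if value.startswith(" ") else value)
```

With `httpx`, the iterable could be `response.iter_lines()`; each yielded pair can then be dispatched on the event name.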

3. POST the result back

curl -X POST "$POKEE_API/v1/sessions/$SID/tool_results" \
  -H "Authorization: Bearer $POKEE_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "tool_call_id": "call_a1b2c3d4e5f6...",
    "output": "{\"temperature_c\": 14, \"condition\": \"cloudy\"}",
    "is_error": false
  }'

Returns 204 No Content. The agent immediately resumes — your output is fed back to the model, and the SSE stream continues with whatever comes next (more tool calls, text deltas, or message.completed).

output field: strings pass through unchanged. Objects/arrays are JSON-serialized server-side so the model receives a string. Set is_error: true to signal the call failed (the model sees the failure and may retry or change approach).
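A sketch of how a client might build the `tool_results` body from a local function: successful results are serialized to a string, and any exception is reported with `is_error: true` so the model can see the failure. `build_tool_result` is a hypothetical helper, and the error message format is an arbitrary choice.

```python
import json
from typing import Any, Callable, Dict


def build_tool_result(tool_call_id: str,
                      fn: Callable[..., Any],
                      arguments: str) -> Dict[str, Any]:
    """Run fn with the model's JSON arguments and build the POST body."""
    try:
        result = fn(**json.loads(arguments))
        # Strings pass through; other values are JSON-serialized client-side.
        output = result if isinstance(result, str) else json.dumps(result)
        return {"tool_call_id": tool_call_id, "output": output, "is_error": False}
    except Exception as exc:
        # Covers bad argument JSON, unexpected argument names, and tool failures.
        return {
            "tool_call_id": tool_call_id,
            "output": f"{type(exc).__name__}: {exc}",
            "is_error": True,
        }
```

The returned dictionary can be sent as the JSON body of `POST /v1/sessions/{id}/tool_results`.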

Timeout

A suspended tool call times out after 300 seconds by default. On timeout the agent receives an error result and message.error fires on the SSE stream. If your tool can take longer, surface the call to a queue and POST when ready — there's no penalty for keeping the session open as long as it stays under TTL.
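One way to stay inside the window is to enforce a client-side deadline and report an error result before the server-side timeout fires. The sketch below runs the tool on a worker thread and gives up after `deadline_s` seconds. The 240-second default is an assumption chosen to leave headroom under the 300-second default, and `run_with_deadline` is a hypothetical helper.

```python
import concurrent.futures
import json
from typing import Any, Callable, Dict


def run_with_deadline(tool_call_id: str,
                      fn: Callable[..., Any],
                      kwargs: Dict[str, Any],
                      deadline_s: float = 240.0) -> Dict[str, Any]:
    """Run fn(**kwargs) on a worker thread; return a tool_results body."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn, **kwargs)
    try:
        result = future.result(timeout=deadline_s)
        return {"tool_call_id": tool_call_id,
                "output": json.dumps(result), "is_error": False}
    except concurrent.futures.TimeoutError:
        # The worker keeps running in the background; only the wait is abandoned.
        return {"tool_call_id": tool_call_id,
                "output": f"tool did not finish within {deadline_s} seconds",
                "is_error": True}
    except Exception as exc:
        return {"tool_call_id": tool_call_id,
                "output": f"{type(exc).__name__}: {exc}", "is_error": True}
    finally:
        # Do not block on a still-running worker.
        pool.shutdown(wait=False)
```

Reporting the failure yourself lets the model see a specific error message and decide whether to retry or change approach.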

tool_choice

Not supported in session mode — the agent has its own tools (skills, Bash, Read, etc.) and forcing one of yours would fight the agent loop. Always behaves as auto. Pass tool_choice: "none" to /v1/responses if you need a model that won't call any tools.

Errors

| Status | When |
| --- | --- |
| `400` | Session has no external tools, or `tool_call_id` is missing |
| `404` | `tool_call_id` is unknown: already resolved, timed out, or never existed |

Python end-to-end

Stateless:

import json
import os

import httpx

API = os.environ["POKEE_API"]
KEY = os.environ["POKEE_KEY"]
H = {"Authorization": f"Bearer {KEY}", "Content-Type": "application/json"}
TOOLS = [{
    "type": "function", "name": "get_weather",
    "description": "Look up current weather for a city.",
    "parameters": {"type": "object",
                   "properties": {"location": {"type": "string"}},
                   "required": ["location"]},
}]

def get_weather(location):
    # Stub: replace with a real weather lookup.
    return {"temperature_c": 14, "condition": "cloudy"}

with httpx.Client(base_url=API, headers=H, timeout=60) as c:
    # The server is stateless, so the full history is resent on every call.
    history = [{"role": "user", "content": "What is the weather in Paris?"}]
    while True:
        r = c.post("/v1/responses", json={"input": history, "tools": TOOLS})
        r.raise_for_status()  # surface 4xx/5xx instead of failing on resp["output"]
        resp = r.json()
        calls = [x for x in resp["output"] if x["type"] == "function_call"]
        if not calls:
            # No tool calls left: print the assistant text and stop.
            text = "".join(p["text"] for x in resp["output"] if x["type"] == "message"
                           for p in x["content"] if p["type"] == "output_text")
            print(text)
            break
        for call in calls:
            # Echo the model's function_call, then attach the matching output.
            history.append(call)
            args = json.loads(call["arguments"])
            history.append({
                "type": "function_call_output",
                "call_id": call["call_id"],
                "output": json.dumps(get_weather(**args)),
            })

Session:

import json
import os
import threading

import httpx

API = os.environ["POKEE_API"]
KEY = os.environ["POKEE_KEY"]
H = {"Authorization": f"Bearer {KEY}", "Content-Type": "application/json"}

# No client-side timeout: the SSE stream stays open while a tool call is pending.
with httpx.Client(base_url=API, headers=H, timeout=httpx.Timeout(None)) as c:
    # Tools are locked at session create time.
    created = c.post("/v1/sessions", json={
        "tools": [{
            "type": "function", "name": "get_weather",
            "description": "Look up current weather for a city.",
            "parameters": {"type": "object",
                           "properties": {"location": {"type": "string"}},
                           "required": ["location"]},
        }]
    })
    created.raise_for_status()
    sid = created.json()["id"]

    def on_tool_call(data):
        # Stub result: replace with a real lookup using json.loads(data["arguments"]).
        out = {"temperature_c": 14, "condition": "cloudy"}
        c.post(f"/v1/sessions/{sid}/tool_results", json={
            "tool_call_id": data["tool_call_id"],
            "output": json.dumps(out),
        }).raise_for_status()

    with c.stream("POST", f"/v1/sessions/{sid}/messages",
                  json={"message": "What is the weather in Paris?"}) as r:
        r.raise_for_status()
        event = None
        for line in r.iter_lines():
            if line.startswith("event:"):
                event = line[6:].strip()
            elif line.startswith("data:"):
                data = json.loads(line[5:].strip())
                if event == "tool_call.requested":
                    # POST the result on a side thread so the SSE stream keeps draining.
                    threading.Thread(target=on_tool_call, args=(data,), daemon=True).start()
                elif event == "text.delta":
                    print(data["content"], end="", flush=True)
                elif event == "message.completed":
                    print(f"\n[done] {data['usage']}")
            elif not line:
                event = None  # a blank line ends the current SSE event
