Tool calling
Let the agent call your functions. Two surfaces — stateless OpenAI-compatible /v1/responses and session-mode tool egress over SSE.
Define functions in your code; let the agent decide when to call them. The model returns the call (name + JSON arguments); you execute the function on your side; you feed the result back. This is the same pattern as OpenAI's function calling, and the wire shape matches OpenAI's /v1/responses.
Two surfaces, same primitive:
| | Stateless /v1/responses | Session tools |
|---|---|---|
| Agent loop | None — pure inference | Full agentic session (skills, files, tools) |
| State | You send full message history each call | Server holds the conversation |
| Tool call surfaces as | output[type="function_call"] in the response body | tool_call.requested SSE event mid-stream |
| Continuation | New call with function_call_output item in input | POST /v1/sessions/{id}/tool_results to resume the open SSE |
Use stateless when you're swapping in for OpenAI's API or the model just needs to pick a tool. Use session tools when the agent should reason, run skills, AND occasionally call your tools in the same turn.
Stateless /v1/responses
A single inference round-trip with tools attached. No session, no agent loop. OpenAI Responses-compatible.
Request
curl -X POST "$POKEE_API/v1/responses" \
  -H "Authorization: Bearer $POKEE_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "What is the weather in Paris?",
    "tools": [
      {
        "type": "function",
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
          "type": "object",
          "properties": {"location": {"type": "string"}},
          "required": ["location"]
        }
      }
    ]
  }'
Body fields:
| Field | Type | Description |
|---|---|---|
| input | string \| array | Required. A string is treated as one user message. An array uses OpenAI Responses items (see Continuation). |
| model | string | Optional. Defaults to your tenant's configured model. |
| tools | array | OpenAI Responses tool shape. The flat form ({type, name, description, parameters}) and the nested Chat Completions form ({type, function: {name, description, parameters}}) are both accepted. |
| tool_choice | "auto" \| "none" \| "required" \| {type:"function", name} | Defaults to "auto". "none" strips tools entirely. |
| instructions | string | Optional system prompt. |
| max_output_tokens (or max_tokens) | integer | Default 4096, ceiling 32000. |
| temperature, top_p, stop / stop_sequences | - | Pass-through. |
| metadata | object | Echoed back on the response. |
Streaming is not yet supported: omit stream or set it to false. A request with stream: true returns 400.
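Since the tools field accepts both the flat and the nested form, no conversion is required. If you would rather send one shape consistently, the conversion could be sketched as below. The helper name to_flat_tool is hypothetical and not part of the API; it only rearranges the fields listed in the table above.

```python
def to_flat_tool(tool: dict) -> dict:
    """Return the flat form {type, name, description, parameters} of a function tool.

    Hypothetical client-side helper (an assumption, not part of the API).
    """
    if tool.get("type") != "function":
        raise ValueError(f"unsupported tool type: {tool.get('type')!r}")
    inner = tool.get("function")
    if isinstance(inner, dict):
        # Nested Chat Completions form: lift the inner fields to the top level.
        lifted = {k: inner[k] for k in ("name", "description", "parameters") if k in inner}
        return {"type": "function", **lifted}
    # Already flat: return a shallow copy so the caller's dict is not shared.
    return dict(tool)


nested = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    },
}
flat = to_flat_tool(nested)
print(flat["name"])
```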
Response
{
  "id": "resp_922b14f0c88f4ca1aff9a4d9",
  "object": "response",
  "created_at": 1777469641,
  "status": "completed",
  "model": "claude-opus-4-7",
  "output": [
    {
      "type": "function_call",
      "id": "fc_ef8ee26e79474ba4af9d134c",
      "call_id": "toolu_bdrk_01MMvD8B4QyyX4EuqAMy7ogD",
      "name": "get_weather",
      "arguments": "{\"location\": \"Paris\"}",
      "status": "completed"
    }
  ],
  "usage": {
    "input_tokens": 565,
    "input_tokens_details": {"cached_tokens": 0, "cache_creation_tokens": 0},
    "output_tokens": 54,
    "total_tokens": 619
  },
  "tool_choice": "auto",
  "parallel_tool_calls": true,
  "stop_reason": "tool_use"
}
output[] may contain a mix of message items (assistant text) and function_call items (tool calls). Iterate the array in order.
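A minimal sketch of walking output[] in order, separating assistant text from tool calls. The function name split_output is hypothetical. The shape of message items (a content list holding output_text parts) is assumed from the Python example at the end of this page rather than from a schema.

```python
import json


def split_output(response: dict):
    """Walk output[] in order and return (assistant_text, function_calls)."""
    text_parts, calls = [], []
    for item in response.get("output", []):
        if item["type"] == "message":
            # Assumed shape: message items carry a list of content parts.
            for part in item.get("content", []):
                if part.get("type") == "output_text":
                    text_parts.append(part["text"])
        elif item["type"] == "function_call":
            calls.append({
                "call_id": item["call_id"],
                "name": item["name"],
                # arguments arrives as a JSON-encoded string; decode it once here.
                "arguments": json.loads(item["arguments"]),
            })
    return "".join(text_parts), calls


sample = {
    "output": [
        {"type": "message", "content": [{"type": "output_text", "text": "Checking. "}]},
        {
            "type": "function_call",
            "call_id": "toolu_bdrk_01MMvD8B4QyyX4EuqAMy7ogD",
            "name": "get_weather",
            "arguments": "{\"location\": \"Paris\"}",
        },
    ]
}
text, calls = split_output(sample)
print(text, calls)
```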
Continuation
Execute the function on your side, then call again with the original messages, the function_call item the model returned, and a matching function_call_output item:
curl -X POST "$POKEE_API/v1/responses" \
  -H "Authorization: Bearer $POKEE_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": [
      {"role": "user", "content": "What is the weather in Paris?"},
      {
        "type": "function_call",
        "call_id": "toolu_bdrk_01MMvD8B4QyyX4EuqAMy7ogD",
        "name": "get_weather",
        "arguments": "{\"location\": \"Paris\"}"
      },
      {
        "type": "function_call_output",
        "call_id": "toolu_bdrk_01MMvD8B4QyyX4EuqAMy7ogD",
        "output": "{\"temperature_c\": 14, \"condition\": \"cloudy\"}"
      }
    ],
    "tools": [
      {"type": "function", "name": "get_weather",
       "description": "Look up current weather for a city.",
       "parameters": {"type": "object", "properties": {"location": {"type": "string"}}, "required": ["location"]}}
    ]
  }'
The server is stateless across calls, so always pass the full message history. The call_id in function_call_output must match the one in the prior function_call.
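The continuation input can be assembled mechanically, which keeps the two call_id values in sync. The sketch below assumes a hypothetical helper, continuation_input, that takes the prior input items, the function_call item from the response, and your function's result.

```python
import json


def continuation_input(history: list, call: dict, result) -> list:
    """Return a new input array: prior items, the function_call, then its output."""
    # Strings pass through; anything else is JSON-encoded so output is a string.
    output = result if isinstance(result, str) else json.dumps(result)
    return [
        *history,
        {
            "type": "function_call",
            "call_id": call["call_id"],
            "name": call["name"],
            "arguments": call["arguments"],
        },
        {
            "type": "function_call_output",
            "call_id": call["call_id"],  # reuse the same id so the pair matches
            "output": output,
        },
    ]


history = [{"role": "user", "content": "What is the weather in Paris?"}]
call = {
    "type": "function_call",
    "call_id": "toolu_bdrk_01MMvD8B4QyyX4EuqAMy7ogD",
    "name": "get_weather",
    "arguments": "{\"location\": \"Paris\"}",
}
next_input = continuation_input(history, call, {"temperature_c": 14, "condition": "cloudy"})
print(json.dumps(next_input, indent=2))
```

The helper returns a new list instead of mutating history, so a failed request can be retried with the same arguments.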
Errors
| Status | When |
|---|---|
| 400 | Malformed input, unknown tool type, invalid tool_choice, stream: true |
| 402 | Insufficient credits |
| 429 | Rate limit exceeded |
| 502 | Upstream inference failure |
| 504 | Upstream timeout |
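One way a client might act on these statuses is to retry only the transient ones with exponential backoff. This page does not specify a retry policy, so the choice of retryable statuses (429, 502, 504), the helper name retry_delay, and the base and cap values are all assumptions.

```python
# Assumption: rate limits and upstream failures are transient; 400 and 402 are not.
RETRYABLE_STATUSES = {429, 502, 504}


def retry_delay(status: int, attempt: int, base: float = 0.5, cap: float = 30.0):
    """Return seconds to wait before the next attempt, or None if retrying won't help.

    attempt is zero-based: 0 for the first retry, 1 for the second, and so on.
    """
    if status not in RETRYABLE_STATUSES:
        return None
    # Exponential backoff, capped so a long outage does not produce huge sleeps.
    return min(cap, base * (2 ** attempt))


for status in (400, 402, 429, 502, 504):
    print(status, retry_delay(status, attempt=1))
```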
Session-mode tools
Attach tools to a session at create time. The agent reasons normally — using its skills, files, and built-in tools — and additionally may decide to call one of yours. When it does, the open SSE stream emits a tool_call.requested event and waits for you to POST the result back.
1. Register tools at session create
Pass a tools array on POST /v1/sessions:
curl -X POST "$POKEE_API/v1/sessions" \
  -H "Authorization: Bearer $POKEE_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "tools": [
      {
        "type": "function",
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
          "type": "object",
          "properties": {"location": {"type": "string"}},
          "required": ["location"]
        }
      }
    ]
  }'
Same OpenAI tool shape as /v1/responses. Tools are locked at session create time because the underlying SDK can't swap MCP servers mid-session. Set them once and reuse the session across many turns. To change the tool list, destroy and recreate the session.
GET /v1/sessions/{id} echoes back a summary of registered tools under external_tools.
2. Agent decides to call your tool
Send a message normally:
curl -N -X POST "$POKEE_API/v1/sessions/$SID/messages" \
  -H "Authorization: Bearer $POKEE_KEY" \
  -H "Content-Type: application/json" \
  -d '{"message": "What is the weather in Paris right now?"}'
If the agent decides to use one of your tools, the SSE stream emits a new event:
event: tool_call.requested
data: {
  "tool_call_id": "call_a1b2c3d4e5f6...",
  "name": "get_weather",
  "arguments": "{\"location\": \"Paris\"}"
}
The stream stays open. The agent's turn is suspended server-side: no more events flow until you respond. The standard tool_use.start / tool_use.executing / tool_use.stop events still fire too, so consumers that aggregate by tool_use.id see the call there as well; the new tool_call.requested event is what carries the developer-facing tool_call_id.
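A sketch of parsing the stream into events, following the general Server-Sent Events framing: a blank line ends an event, and multiple data: lines are joined with newlines. The generator name iter_sse_events is hypothetical, and treating every data payload as JSON is an assumption based on the example above.

```python
import json


def iter_sse_events(lines):
    """Yield (event_name, parsed_data) for each complete event in an iterable of lines."""
    event, data = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # Blank line: dispatch whatever was buffered, then reset.
            if data:
                yield event or "message", json.loads("\n".join(data))
            event, data = None, []
        elif line.startswith(":"):
            continue  # comment line, commonly used as a keep-alive
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].lstrip(" "))


sample = [
    "event: tool_call.requested",
    'data: {"tool_call_id": "call_1",',
    'data:  "name": "get_weather", "arguments": "{}"}',
    "",
    ": keep-alive",
    "event: text.delta",
    'data: {"content": "hi"}',
    "",
]
events = list(iter_sse_events(sample))
print(events)
```

With httpx, the lines argument could be the iterator returned by iter_lines() on a streamed response.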
3. POST the result back
curl -X POST "$POKEE_API/v1/sessions/$SID/tool_results" \
  -H "Authorization: Bearer $POKEE_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "tool_call_id": "call_a1b2c3d4e5f6...",
    "output": "{\"temperature_c\": 14, \"condition\": \"cloudy\"}",
    "is_error": false
  }'
Returns 204 No Content. The agent immediately resumes: your output is fed back to the model, and the SSE stream continues with whatever comes next (more tool calls, text deltas, or message.completed).
output field: strings pass through unchanged. Objects/arrays are JSON-serialized server-side so the model receives a string. Set is_error: true to signal the call failed (the model sees the failure and may retry or change approach).
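A sketch of building the tool_results body so that a failing function is reported with is_error: true instead of leaving the call suspended. The helper name tool_result_payload and the error message format are assumptions; only the tool_call_id, output, and is_error fields come from this page.

```python
import json


def tool_result_payload(tool_call_id: str, fn, arguments_json: str) -> dict:
    """Run fn with the model's arguments and return a body for POST .../tool_results."""
    try:
        result = fn(**json.loads(arguments_json))
    except Exception as exc:
        # Report the failure to the model so it can retry or change approach.
        return {
            "tool_call_id": tool_call_id,
            "output": f"{type(exc).__name__}: {exc}",
            "is_error": True,
        }
    # Encode client-side so the payload is a string either way.
    output = result if isinstance(result, str) else json.dumps(result)
    return {"tool_call_id": tool_call_id, "output": output, "is_error": False}


def get_weather(location):
    # Stub used for illustration only.
    if location != "Paris":
        raise ValueError("unknown city")
    return {"temperature_c": 14, "condition": "cloudy"}


ok = tool_result_payload("call_1", get_weather, '{"location": "Paris"}')
failed = tool_result_payload("call_2", get_weather, '{"location": "Atlantis"}')
print(ok)
print(failed)
```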
Timeout
A suspended tool call times out after 300 seconds by default. On timeout, the agent receives an error result and message.error fires on the SSE stream. If your tool can take longer, hand the call off to a queue and POST the result when it is ready; there's no penalty for keeping the session open as long as it stays under its TTL.
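One client-side approach is to give each tool a time budget below the server timeout and post an error result yourself if the budget runs out. The sketch below uses a hypothetical helper, run_with_budget; the 240-second default is an assumed safety margin under the 300-second default, not a documented value.

```python
import concurrent.futures
import json
import time


def run_with_budget(fn, kwargs: dict, budget_s: float = 240.0) -> dict:
    """Run fn(**kwargs) on a worker thread; return output/is_error fields for tool_results."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn, **kwargs)
    try:
        result = future.result(timeout=budget_s)
        return {"output": json.dumps(result), "is_error": False}
    except concurrent.futures.TimeoutError:
        # The worker thread cannot be killed; it keeps running in the background.
        return {"output": f"tool exceeded {budget_s:g}s budget", "is_error": True}
    finally:
        pool.shutdown(wait=False)  # do not block the caller on a slow worker


def slow_tool():
    time.sleep(0.2)
    return {"ok": True}


def fast_tool():
    return {"ok": True}


timed_out = run_with_budget(slow_tool, {}, budget_s=0.05)
finished = run_with_budget(fast_tool, {}, budget_s=1.0)
print(timed_out, finished)
```

Exceptions raised by the function itself propagate to the caller unchanged, so pair this with your own error handling before posting.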
tool_choice
Not supported in session mode — the agent has its own tools (skills, Bash, Read, etc.) and forcing one of yours would fight the agent loop. Always behaves as auto. Pass tool_choice: "none" to /v1/responses if you need a model that won't call any tools.
Errors
| Status | When |
|---|---|
| 400 | Session has no external tools, or tool_call_id is missing |
| 404 | tool_call_id is unknown: already resolved, timed out, or never existed |
Python end-to-end
Stateless:
import json
import os

import httpx

API = os.environ["POKEE_API"]
KEY = os.environ["POKEE_KEY"]
H = {"Authorization": f"Bearer {KEY}", "Content-Type": "application/json"}

TOOLS = [{
    "type": "function", "name": "get_weather",
    "description": "Look up current weather for a city.",
    "parameters": {"type": "object",
                   "properties": {"location": {"type": "string"}},
                   "required": ["location"]},
}]

def get_weather(location):
    # Stub: replace with a real weather lookup.
    return {"temperature_c": 14, "condition": "cloudy"}

with httpx.Client(base_url=API, headers=H, timeout=60) as c:
    history = [{"role": "user", "content": "What is the weather in Paris?"}]
    while True:
        r = c.post("/v1/responses", json={"input": history, "tools": TOOLS})
        r.raise_for_status()  # surface 4xx/5xx instead of failing on a missing "output" key
        resp = r.json()
        calls = [x for x in resp["output"] if x["type"] == "function_call"]
        if not calls:
            # No tool calls left: print the assistant text and stop.
            text = "".join(p["text"] for x in resp["output"] if x["type"] == "message"
                           for p in x["content"] if p["type"] == "output_text")
            print(text)
            break
        for call in calls:
            # Replay the model's function_call, then attach the matching output.
            history.append(call)
            args = json.loads(call["arguments"])
            history.append({
                "type": "function_call_output",
                "call_id": call["call_id"],
                "output": json.dumps(get_weather(**args)),
            })
Session:
import json
import os
import threading

import httpx

API = os.environ["POKEE_API"]
KEY = os.environ["POKEE_KEY"]
H = {"Authorization": f"Bearer {KEY}", "Content-Type": "application/json"}

# No read timeout: the SSE stream stays open while a tool call is suspended.
with httpx.Client(base_url=API, headers=H, timeout=httpx.Timeout(None)) as c:
    # Tools are fixed for the lifetime of the session.
    sid = c.post("/v1/sessions", json={
        "tools": [{
            "type": "function", "name": "get_weather",
            "description": "Look up current weather for a city.",
            "parameters": {"type": "object",
                           "properties": {"location": {"type": "string"}},
                           "required": ["location"]},
        }]
    }).json()["id"]

    def on_tool_call(data):
        # Stub result: run your real function here.
        out = {"temperature_c": 14, "condition": "cloudy"}
        c.post(f"/v1/sessions/{sid}/tool_results", json={
            "tool_call_id": data["tool_call_id"],
            "output": json.dumps(out),
        }).raise_for_status()

    with c.stream("POST", f"/v1/sessions/{sid}/messages",
                  json={"message": "What is the weather in Paris?"}) as r:
        event = None
        for line in r.iter_lines():
            if line.startswith("event:"):
                event = line[6:].strip()
            elif line.startswith("data:"):
                data = json.loads(line[5:].strip())
                if event == "tool_call.requested":
                    # POST the result on a side thread so the SSE stream keeps draining.
                    threading.Thread(target=on_tool_call, args=(data,), daemon=True).start()
                elif event == "text.delta":
                    print(data["content"], end="", flush=True)
                elif event == "message.completed":
                    print(f"\n[done] {data['usage']}")