Pokee Enterprise API

Tool calling

Let the agent call your functions. Two surfaces — stateless OpenAI-compatible /v1/responses and session-mode tool egress over SSE.

Define functions in your code; let the agent decide when to call them. The model returns the call (name + JSON arguments); you execute the function on your side; you feed the result back. Identical pattern to OpenAI's function calling — the wire shape matches OpenAI's /v1/responses.
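The "you execute the function on your side" step can be sketched as a small dispatcher. This is only an illustration: REGISTRY and run_call are hypothetical names, and get_weather is a stand-in that returns fixed data rather than querying a real service.

```python
import json

def get_weather(location: str) -> dict:
    # Stand-in implementation; a real one would query a weather service.
    return {"temperature_c": 14, "condition": "cloudy"}

# Hypothetical registry mapping tool names to local callables.
REGISTRY = {"get_weather": get_weather}

def run_call(name: str, arguments: str, registry=REGISTRY) -> str:
    """Execute one model-requested call and return its result as a JSON string."""
    func = registry.get(name)
    if func is None:
        # Report unknown names as data so the caller can feed the failure back.
        return json.dumps({"error": f"unknown tool: {name}"})
    kwargs = json.loads(arguments)  # arguments arrive as a JSON-encoded string
    return json.dumps(func(**kwargs))

print(run_call("get_weather", "{\"location\": \"Paris\"}"))
```

Keeping the registry explicit means the model can only trigger functions you have deliberately exposed.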

Two surfaces, same primitive:

| | Stateless /v1/responses | Session tools |
| --- | --- | --- |
| Agent loop | None (pure inference) | Full agentic session (skills, files, tools) |
| State | You send full message history each call | Server holds the conversation |
| Tool call surfaces as | output[type="function_call"] in the response body | tool_call.requested SSE event mid-stream |
| Continuation | New call with function_call_output item in input | POST /v1/sessions/{id}/tool_results to resume the open SSE |

Use stateless when you're swapping in for OpenAI's API or the model just needs to pick a tool. Use session tools when the agent should reason, run skills, AND occasionally call your tools in the same turn.

Stateless /v1/responses

A single inference round-trip with tools attached. No session, no agent loop. OpenAI Responses-compatible.

Request

curl -X POST "$POKEE_API/v1/responses" \
  -H "Authorization: Bearer $POKEE_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "What is the weather in Paris?",
    "tools": [
      {
        "type": "function",
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
          "type": "object",
          "properties": {"location": {"type": "string"}},
          "required": ["location"]
        }
      }
    ]
  }'

Body fields:

| Field | Type | Description |
| --- | --- | --- |
| input | string or array | Required. A string is treated as one user message. An array uses OpenAI Responses items (see Continuation). |
| model | string | Optional. Defaults to your tenant's configured model. |
| tools | array | OpenAI Responses tool shape. The flat form ({type, name, description, parameters}) and the nested Chat Completions form ({type, function: {name, description, parameters}}) are both accepted. |
| tool_choice | "auto", "none", "required", or {type:"function", name} | Defaults to "auto". "none" strips tools entirely. |
| instructions | string | Optional system prompt. |
| max_output_tokens (or max_tokens) | integer | Default 4096, ceiling 32000. |
| temperature, top_p, stop / stop_sequences | | Pass-through. |
| metadata | object | Echoed back on the response. |

stream: true is not yet supported. Omit the field or set it to false; a request with stream: true returns 400.
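Because the tools field accepts both the flat and the nested form, no conversion is required. If a codebase already holds Chat Completions style definitions and you prefer one canonical shape on your side, a helper along these lines could normalize them. to_flat_tool is a hypothetical name, not part of the API.

```python
def to_flat_tool(tool: dict) -> dict:
    """Return the flat Responses form of a function tool definition."""
    fn = tool.get("function")
    if isinstance(fn, dict):
        # Nested Chat Completions form: lift the inner fields to the top level.
        return {"type": tool.get("type", "function"), **fn}
    return dict(tool)  # already flat; return a shallow copy

nested = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    },
}
flat = to_flat_tool(nested)
print(flat["name"])
```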

Response

{
  "id": "resp_922b14f0c88f4ca1aff9a4d9",
  "object": "response",
  "created_at": 1777469641,
  "status": "completed",
  "model": "claude-opus-4-7",
  "output": [
    {
      "type": "function_call",
      "id": "fc_ef8ee26e79474ba4af9d134c",
      "call_id": "toolu_bdrk_01MMvD8B4QyyX4EuqAMy7ogD",
      "name": "get_weather",
      "arguments": "{\"location\": \"Paris\"}",
      "status": "completed"
    }
  ],
  "usage": {
    "input_tokens": 565,
    "input_tokens_details": {"cached_tokens": 0, "cache_creation_tokens": 0},
    "output_tokens": 54,
    "total_tokens": 619
  },
  "tool_choice": "auto",
  "parallel_tool_calls": true,
  "stop_reason": "tool_use"
}

output[] may contain a mix of message items (assistant text) and function_call items (tool calls). Iterate the array in order.
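A minimal sketch of walking output[] in order follows. It assumes message items carry a content list of output_text parts, as in the OpenAI Responses format; split_output and the sample text are illustrative only.

```python
def split_output(output: list) -> tuple:
    """Walk output[] in order; return (assistant_text, function_call_items)."""
    texts, calls = [], []
    for item in output:
        kind = item.get("type")
        if kind == "message":
            # Assumed shape: content is a list of parts; keep only text parts.
            for part in item.get("content", []):
                if part.get("type") == "output_text":
                    texts.append(part["text"])
        elif kind == "function_call":
            calls.append(item)
    return "".join(texts), calls

sample = [
    {"type": "message",
     "content": [{"type": "output_text", "text": "Checking the weather."}]},
    {"type": "function_call",
     "call_id": "toolu_bdrk_01MMvD8B4QyyX4EuqAMy7ogD",
     "name": "get_weather",
     "arguments": "{\"location\": \"Paris\"}"},
]
text, calls = split_output(sample)
print(text, [c["name"] for c in calls])
```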

Continuation

Execute the function on your side, then call again with the original messages plus two new items: the function_call the model returned and a function_call_output carrying your result:

curl -X POST "$POKEE_API/v1/responses" \
  -H "Authorization: Bearer $POKEE_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": [
      {"role": "user", "content": "What is the weather in Paris?"},
      {
        "type": "function_call",
        "call_id": "toolu_bdrk_01MMvD8B4QyyX4EuqAMy7ogD",
        "name": "get_weather",
        "arguments": "{\"location\": \"Paris\"}"
      },
      {
        "type": "function_call_output",
        "call_id": "toolu_bdrk_01MMvD8B4QyyX4EuqAMy7ogD",
        "output": "{\"temperature_c\": 14, \"condition\": \"cloudy\"}"
      }
    ],
    "tools": [
      {"type": "function", "name": "get_weather",
       "description": "Look up current weather for a city.",
       "parameters": {"type": "object", "properties": {"location": {"type": "string"}}, "required": ["location"]}}
    ]
  }'

The server is stateless across calls — always pass the full message history. The call_id in function_call_output must match the one in the prior function_call.
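One way to keep the call_id pairing correct is to build both continuation items from the same function_call item. The helper below is a sketch; append_tool_result is a hypothetical name, and serializing non-string results with json.dumps is a choice made here to match the string output shown in the request above.

```python
import json

def append_tool_result(history: list, call: dict, result) -> list:
    """Return a new history with the function_call and its matching output appended."""
    output = result if isinstance(result, str) else json.dumps(result)
    return history + [
        {"type": "function_call",
         "call_id": call["call_id"],
         "name": call["name"],
         "arguments": call["arguments"]},
        # Reuse the same call_id so the output pairs with the call above.
        {"type": "function_call_output",
         "call_id": call["call_id"],
         "output": output},
    ]

history = [{"role": "user", "content": "What is the weather in Paris?"}]
call = {"type": "function_call",
        "call_id": "toolu_bdrk_01MMvD8B4QyyX4EuqAMy7ogD",
        "name": "get_weather",
        "arguments": "{\"location\": \"Paris\"}"}
next_input = append_tool_result(history, call, {"temperature_c": 14, "condition": "cloudy"})
print(len(next_input))
```

Returning a new list instead of mutating the old one makes it easy to retry a failed request with the previous history.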

Errors

| Status | When |
| --- | --- |
| 400 | Malformed input, unknown tool type, invalid tool_choice, stream: true |
| 402 | Insufficient credits |
| 429 | Rate limit exceeded |
| 502 | Upstream inference failure |
| 504 | Upstream timeout |
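A client might retry the transient statuses and fail fast on the rest. The sketch below assumes 429, 502, and 504 are worth retrying and 400 and 402 are not; that classification, the backoff constants, and the name backoff_delay are illustrative assumptions rather than documented guidance.

```python
import random

# Assumption: these statuses are transient; 400 and 402 need a changed request or account.
RETRYABLE = {429, 502, 504}

def backoff_delay(status: int, attempt: int, base: float = 0.5, cap: float = 30.0):
    """Return seconds to wait before retrying, or None if the status should not be retried."""
    if status not in RETRYABLE:
        return None
    ceiling = min(cap, base * (2 ** attempt))  # exponential growth, capped
    return random.uniform(0, ceiling)          # full jitter spreads out retries

for status in (400, 429):
    print(status, backoff_delay(status, attempt=2))
```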

Session-mode tools

Attach tools to a session at create time. The agent reasons normally — using its skills, files, and built-in tools — and additionally may decide to call one of yours. When it does, the open SSE stream emits a tool_call.requested event and waits for you to POST the result back.

1. Register tools at session create

Pass a tools array on POST /v1/sessions:

curl -X POST "$POKEE_API/v1/sessions" \
  -H "Authorization: Bearer $POKEE_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "tools": [
      {
        "type": "function",
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
          "type": "object",
          "properties": {"location": {"type": "string"}},
          "required": ["location"]
        }
      }
    ]
  }'

Same OpenAI tool shape as /v1/responses. Tools are locked at session create time — the underlying SDK can't swap MCP servers mid-session. Set them once and reuse the session across many turns. To change the tool list, destroy and recreate the session.

GET /v1/sessions/{id} echoes back a summary of registered tools under external_tools.

2. Agent decides to call your tool

Send a message normally:

curl -N -X POST "$POKEE_API/v1/sessions/$SID/messages" \
  -H "Authorization: Bearer $POKEE_KEY" \
  -H "Content-Type: application/json" \
  -d '{"message": "What is the weather in Paris right now?"}'

If the agent decides to use one of your tools, the SSE stream emits a new event:

event: tool_call.requested
data: {
  "tool_call_id": "call_a1b2c3d4e5f6...",
  "name": "get_weather",
  "arguments": "{\"location\": \"Paris\"}"
}

The stream stays open. The agent's turn is suspended server-side — no more events flow until you respond. The standard tool_use.start / tool_use.executing / tool_use.stop events still fire too, so consumers that aggregate by tool_use.id see the call there as well; the new tool_call.requested event is what carries the developer-facing tool_call_id.
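To react to tool_call.requested, the client has to split the stream into events. The sketch below follows the general server-sent events framing (event: and data: fields, a blank line ending each event, multiple data: lines joined with newlines) and assumes every data payload is JSON, as in the event shown above. iter_sse_events is a hypothetical helper, and the tool_call_id in the sample is a placeholder.

```python
import json

def iter_sse_events(lines):
    """Yield (event, data) pairs from an iterable of SSE text lines."""
    event, data_lines = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_lines:
                yield event, json.loads("\n".join(data_lines))
            event, data_lines = None, []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # Drop the single optional space after the colon.
            data_lines.append(value[1:] if value.startswith(" ") else value)

sample = [
    "event: tool_call.requested",
    'data: {"tool_call_id": "call_1",',
    'data:  "name": "get_weather", "arguments": "{}"}',
    "",
    ": keep-alive",
    "event: text.delta",
    'data: {"content": "hi"}',
    "",
]
events = list(iter_sse_events(sample))
print([name for name, _ in events])
```

With httpx, the lines argument could be the iterator returned by iter_lines() on a streamed response.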

3. POST the result back

curl -X POST "$POKEE_API/v1/sessions/$SID/tool_results" \
  -H "Authorization: Bearer $POKEE_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "tool_call_id": "call_a1b2c3d4e5f6...",
    "output": "{\"temperature_c\": 14, \"condition\": \"cloudy\"}",
    "is_error": false
  }'

Returns 204 No Content. The agent immediately resumes — your output is fed back to the model, and the SSE stream continues with whatever comes next (more tool calls, text deltas, or message.completed).

output field: strings pass through unchanged. Objects/arrays are JSON-serialized server-side so the model receives a string. Set is_error: true to signal the call failed (the model sees the failure and may retry or change approach).
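The payload rules above can be sketched as a small builder that runs the tool and reports exceptions through is_error. tool_result_body is a hypothetical helper; serializing objects client-side is optional, since the server does it too, and the error string format is just one reasonable choice.

```python
import json

def tool_result_body(tool_call_id: str, func, arguments: str) -> dict:
    """Run func with the JSON-decoded arguments and build the tool_results payload."""
    try:
        result = func(**json.loads(arguments))
        # Strings pass through; anything else is serialized so the model gets a string.
        output = result if isinstance(result, str) else json.dumps(result)
        return {"tool_call_id": tool_call_id, "output": output, "is_error": False}
    except Exception as exc:
        # Tell the model the call failed so it can retry or change approach.
        return {"tool_call_id": tool_call_id,
                "output": f"{type(exc).__name__}: {exc}",
                "is_error": True}

def get_weather(location):
    return {"temperature_c": 14, "condition": "cloudy"}

def failing(location):
    raise ValueError("no data for " + location)

ok = tool_result_body("call_1", get_weather, "{\"location\": \"Paris\"}")
bad = tool_result_body("call_1", failing, "{\"location\": \"Paris\"}")
print(ok["is_error"], bad["is_error"])
```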

Timeout

A suspended tool call times out after 300 seconds by default. On timeout the agent receives an error result and message.error fires on the SSE stream. If your tool can take longer, surface the call to a queue and POST when ready — there's no penalty for keeping the session open as long as it stays under TTL.
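When results come back through a queue, it can help to track how much of the timeout window is left before POSTing. The sketch below assumes the default 300-second window and starts its clock when the client receives the event, which is only an approximation of when the server suspended the turn. PendingCall is a hypothetical type.

```python
import time
from dataclasses import dataclass, field

TOOL_CALL_TIMEOUT_S = 300  # default window stated above; adjust if yours differs

@dataclass
class PendingCall:
    tool_call_id: str
    name: str
    arguments: str
    # Monotonic timestamp taken when the client saw tool_call.requested.
    requested_at: float = field(default_factory=time.monotonic)

    def remaining(self, now=None) -> float:
        """Seconds left before the call is expected to time out."""
        now = time.monotonic() if now is None else now
        return max(0.0, TOOL_CALL_TIMEOUT_S - (now - self.requested_at))

    def expired(self, now=None) -> bool:
        return self.remaining(now) == 0.0

p = PendingCall("call_1", "get_weather", "{}", requested_at=100.0)
print(p.remaining(now=160.0), p.expired(now=400.0))
```

A worker could skip the POST for expired calls, since an expired tool_call_id is reported as unknown (404).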

tool_choice

Not supported in session mode — the agent has its own tools (skills, Bash, Read, etc.) and forcing one of yours would fight the agent loop. Always behaves as auto. Pass tool_choice: "none" to /v1/responses if you need a model that won't call any tools.

Errors

| Status | When |
| --- | --- |
| 400 | Session has no external tools, or tool_call_id is missing |
| 404 | tool_call_id is unknown: already resolved, timed out, or never existed |

Python end-to-end

Stateless:

import os, json, httpx
API = os.environ["POKEE_API"]; KEY = os.environ["POKEE_KEY"]
H   = {"Authorization": f"Bearer {KEY}", "Content-Type": "application/json"}
TOOLS = [{
    "type": "function", "name": "get_weather",
    "description": "Look up current weather for a city.",
    "parameters": {"type": "object",
                   "properties": {"location": {"type": "string"}},
                   "required": ["location"]},
}]

def get_weather(location):
    return {"temperature_c": 14, "condition": "cloudy"}

with httpx.Client(base_url=API, headers=H, timeout=60) as c:
    history = [{"role": "user", "content": "What is the weather in Paris?"}]
    while True:
        r = c.post("/v1/responses", json={"input": history, "tools": TOOLS})
        r.raise_for_status()  # surface 4xx/5xx instead of failing on a missing "output" key
        resp = r.json()
        calls = [x for x in resp["output"] if x["type"] == "function_call"]
        if not calls:
            # No tool calls left: join the assistant text parts and stop.
            text = "".join(p["text"] for x in resp["output"] if x["type"] == "message"
                           for p in x["content"] if p["type"] == "output_text")
            print(text); break
        for call in calls:
            # Echo the model's function_call, then attach the matching output.
            history.append(call)
            args = json.loads(call["arguments"])
            history.append({
                "type": "function_call_output",
                "call_id": call["call_id"],
                "output": json.dumps(get_weather(**args)),
            })

Session:

import os, json, threading, httpx
API = os.environ["POKEE_API"]; KEY = os.environ["POKEE_KEY"]
H   = {"Authorization": f"Bearer {KEY}", "Content-Type": "application/json"}

with httpx.Client(base_url=API, headers=H, timeout=httpx.Timeout(None)) as c:
    sid = c.post("/v1/sessions", json={
        "tools": [{
            "type": "function", "name": "get_weather",
            "description": "Look up current weather for a city.",
            "parameters": {"type":"object",
                           "properties":{"location":{"type":"string"}},
                           "required":["location"]},
        }]
    }).json()["id"]

    def on_tool_call(data):
        out = {"temperature_c": 14, "condition": "cloudy"}
        c.post(f"/v1/sessions/{sid}/tool_results", json={
            "tool_call_id": data["tool_call_id"],
            "output": json.dumps(out),
        }).raise_for_status()

    with c.stream("POST", f"/v1/sessions/{sid}/messages",
                  json={"message": "What is the weather in Paris?"}) as r:
        event = None
        for line in r.iter_lines():
            # Remember the most recent event name; the data line that follows belongs to it.
            if line.startswith("event:"): event = line[6:].strip()
            elif line.startswith("data:"):
                data = json.loads(line[5:].strip())
                if event == "tool_call.requested":
                    # POST the result on a side thread so the SSE stream keeps draining.
                    threading.Thread(target=on_tool_call, args=(data,), daemon=True).start()
                elif event == "text.delta":
                    print(data["content"], end="", flush=True)
                elif event == "message.completed":
                    print(f"\n[done] {data['usage']}")
                elif event == "message.error":
                    # Fired, for example, when a suspended tool call times out.
                    print(f"\n[error] {data}")
