Agent Interface Contract (v0)

An agent is a stateful control loop that receives observations, decides on a single action, updates internal state, and repeats until termination.

Required Agent API

agent.py

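class Agent:
    def reset(self, task_spec: dict) -> None:
        """Prepare for a new task; called before the loop begins."""
        ...

    def observe(self, observation: dict) -> None:
        """Receive the latest observation and update internal state."""
        ...

    def act(self) -> dict:
        """Return a single JSON-serializable action."""
        ...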

All three methods are synchronous: no async, no callbacks, no streaming.
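
As an illustration, here is a minimal sketch of a class that satisfies this interface. The class name FixedReadAgent and the task_spec key "path" are hypothetical and not part of the contract; the returned action simply mirrors the action.json example.

```python
class FixedReadAgent:
    """Hypothetical agent that requests the same file on every step."""

    def reset(self, task_spec: dict) -> None:
        # "path" is an assumed task_spec key, used here only for illustration.
        self.path = task_spec.get("path", "/app/config.yaml")
        self.last_observation = None

    def observe(self, observation: dict) -> None:
        # Keep only the latest observation; a real agent would track more state.
        self.last_observation = observation

    def act(self) -> dict:
        # Return exactly one JSON-serializable action, synchronously.
        return {"type": "read_file", "args": {"path": self.path}}
```

Each method returns immediately with a plain value, which is what "no async, no callbacks, no streaming" amounts to in practice.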

Lifecycle

Initialization: the agent is instantiated once per run.
Reset: agent.reset(task_spec) is called once, before the loop begins.
Observe → Act loop: on each step the harness calls agent.observe(observation), then agent.act(), and repeats until termination.
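
The lifecycle can be sketched from the harness side as follows. This is an assumption-laden illustration, not the actual harness: run_episode and ConstantAgent are hypothetical names, and the observations are a precomputed list so the example stays self-contained, whereas a real harness would presumably derive each observation from the previous action.

```python
import json


def run_episode(agent, task_spec: dict, observations: list) -> list:
    """Hypothetical harness loop: reset once, then observe and act per step."""
    agent.reset(task_spec)  # called once, before the loop begins
    actions = []
    for observation in observations:  # the harness decides when the loop ends
        agent.observe(observation)
        action = agent.act()
        json.dumps(action)  # raises if the action is not JSON-serializable
        actions.append(action)
    return actions


class ConstantAgent:
    """Stand-in agent used only to exercise the loop."""

    def reset(self, task_spec: dict) -> None:
        self.seen = 0

    def observe(self, observation: dict) -> None:
        self.seen += 1

    def act(self) -> dict:
        return {"type": "read_file", "args": {"path": "/app/config.yaml"}}
```

Calling run_episode(ConstantAgent(), {}, [{"step": 0}, {"step": 1}]) performs one reset followed by two observe/act pairs.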

Observation Contract

Observations are fully structured. No free-form text. Fields are stable and always present.
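
One way to read "stable and always present" is that every observation in a run carries the same top-level keys. The check below is a sketch under that assumption; has_stable_fields is a hypothetical helper, and the contract itself does not name any observation fields.

```python
def has_stable_fields(observations: list) -> bool:
    """Return True if every observation has the same top-level keys as the first."""
    if not observations:
        return True
    expected = set(observations[0])
    return all(set(obs) == expected for obs in observations)
```

A check like this could run in tests for the harness, so that agents can index into observations without guarding against missing keys.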

Action Contract

Agents return a single JSON-serializable action. An invalid action causes immediate failure.

action.json

{
  "type": "read_file",
  "args": {
    "path": "/app/config.yaml"
  }
}
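
A validity check consistent with the example above might look like the following sketch. The contract does not define a full schema, so the required keys ("type" as a string, "args" as an object) are inferred from the single example, and validate_action is a hypothetical helper.

```python
import json


def validate_action(action) -> dict:
    """Raise ValueError unless the action matches the shape of the example."""
    if not isinstance(action, dict):
        raise ValueError("action must be a dict")
    if not isinstance(action.get("type"), str):
        raise ValueError("action['type'] must be a string")
    if not isinstance(action.get("args"), dict):
        raise ValueError("action['args'] must be a dict")
    try:
        json.dumps(action)  # fails on values such as sets or arbitrary objects
    except (TypeError, ValueError) as exc:
        raise ValueError(f"action is not JSON-serializable: {exc}") from exc
    return action
```

A harness could treat any ValueError raised here as the immediate failure the contract describes.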

Constraints

Agents may keep internal state. The harness does not inspect internals.
Agents should notice failures, retry conservatively, and replan when necessary.
Agents receive remaining budgets and must not exceed them.
Agents cannot terminate themselves explicitly.
Given identical inputs, agent behavior should be reproducible.
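
The constraints above can be combined in a small sketch. Everything specific here is an assumption made for illustration: the class name RetryingAgent, the task_spec keys "seed" and "paths", the observation fields "ok" and "steps_left", and the retry cap of 1. The contract only says to retry conservatively and to respect budgets; it does not prescribe this policy.

```python
import random


class RetryingAgent:
    """Hypothetical agent: bounded retries, budget-aware, seeded for reproducibility."""

    MAX_RETRIES = 1  # assumed cap; the contract only says "conservatively"

    def reset(self, task_spec: dict) -> None:
        # A seeded private RNG keeps behavior reproducible for identical inputs.
        self.rng = random.Random(task_spec["seed"])
        self.candidates = list(task_spec["paths"])
        self.rng.shuffle(self.candidates)
        self.index = 0
        self.retries = 0

    def observe(self, observation: dict) -> None:
        if observation["ok"]:
            self.retries = 0
            return
        # Only retry when the remaining step budget leaves room for it.
        budget_allows_retry = observation["steps_left"] > 1
        if self.retries < self.MAX_RETRIES and budget_allows_retry:
            self.retries += 1  # retry the same path
        else:
            # Replan: move on to the next candidate path.
            self.index = (self.index + 1) % len(self.candidates)
            self.retries = 0

    def act(self) -> dict:
        return {"type": "read_file", "args": {"path": self.candidates[self.index]}}
```

Because all randomness flows through an RNG seeded in reset, two runs with the same task_spec and the same observations produce the same sequence of actions.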