Agent Interface Contract (v0)

An agent is a stateful control loop that receives observations, decides on a single action, updates internal state, and repeats until termination.

Required Agent API

agent.py
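class Agent:
    def reset(self, task_spec: dict) -> None:
        """Called with the task spec before the loop begins."""
        ...

    def observe(self, observation: dict) -> None:
        """Receive one structured observation and update internal state."""
        ...

    def act(self) -> dict:
        """Return a single JSON-serializable action."""
        ...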

No async. No callbacks. No streaming.

Lifecycle

  • Initialization: instantiate once per run.
  • Reset: agent.reset(task_spec) called before the loop begins.
  • Observe → Act: on each step the harness calls agent.observe(observation), then agent.act(); the loop repeats until termination.
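
The lifecycle above can be sketched as a small driver loop. This is only an illustration under assumptions: `run_episode`, `step_env`, `max_steps`, the `EchoAgent` class, and the `"noop"` action type are hypothetical names, and the way a real harness signals termination is not specified by this contract.

```python
from typing import Callable, Optional


class EchoAgent:
    """Toy agent used only to exercise the loop; not part of the contract."""

    def reset(self, task_spec: dict) -> None:
        self.task_spec = task_spec
        self.last_observation: Optional[dict] = None

    def observe(self, observation: dict) -> None:
        self.last_observation = observation

    def act(self) -> dict:
        # "noop" is a made-up action type for this sketch.
        return {"type": "noop", "args": {}}


def run_episode(
    agent,
    task_spec: dict,
    step_env: Callable[[Optional[dict]], Optional[dict]],
    max_steps: int,
) -> int:
    """Call reset once, then observe -> act until termination.

    `step_env` is a hypothetical stand-in for the harness side: it takes the
    previous action (None on the first step) and returns the next observation,
    or None once the run has terminated.
    """
    agent.reset(task_spec)
    action = None
    steps = 0
    while steps < max_steps:
        observation = step_env(action)
        if observation is None:
            break  # termination is decided outside the agent
        agent.observe(observation)
        action = agent.act()
        steps += 1
    return steps
```

Each call is plain and synchronous, so the driver needs no event loop or callback registration.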

Observation Contract

Observations are fully structured. No free-form text. Fields are stable and always present.
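
Because fields are stable and always present, an agent can index into an observation directly rather than probing for optional keys. The check below is a rough sketch of that idea; the field names in `REQUIRED_FIELDS` (`step`, `status`, `remaining_steps`) are invented for illustration and are not defined by this contract.

```python
# Hypothetical schema: the contract does not name the actual fields.
REQUIRED_FIELDS = {
    "step": int,
    "status": str,
    "remaining_steps": int,
}


def check_observation(observation: dict) -> None:
    """Raise if a required field is missing or has an unexpected type."""
    missing = REQUIRED_FIELDS.keys() - observation.keys()
    if missing:
        raise KeyError(f"observation missing fields: {sorted(missing)}")
    for name, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(observation[name], expected_type):
            raise TypeError(
                f"field {name!r} should be {expected_type.__name__}, "
                f"got {type(observation[name]).__name__}"
            )
```

A check like this would most likely live in tests or a debug mode, since the harness is the party that guarantees the schema.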

Action Contract

Agents return a single JSON-serializable action per step. An invalid action causes immediate failure.

action.json
{
  "type": "read_file",
  "args": {
    "path": "/app/config.yaml"
  }
}
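
Since an invalid action fails the run immediately, an agent may want to check its own output before returning it. The following is a minimal sketch, assuming an action is valid when it is a dict with a string `type`, a dict `args`, and a body that `json.dumps` accepts. The harness may apply stricter rules, and `validate_action` is a hypothetical helper name.

```python
import json


def validate_action(action: dict) -> str:
    """Return the serialized action, or raise ValueError if it is malformed."""
    if not isinstance(action, dict):
        raise ValueError("action must be a dict")
    if not isinstance(action.get("type"), str):
        raise ValueError("action['type'] must be a string")
    if not isinstance(action.get("args"), dict):
        raise ValueError("action['args'] must be a dict")
    try:
        # sort_keys gives a stable serialization for logging and comparison.
        return json.dumps(action, sort_keys=True)
    except (TypeError, ValueError) as exc:
        raise ValueError(f"action is not JSON-serializable: {exc}") from exc
```

Calling `validate_action` as the last step of `act()` turns a silent contract violation into an error the agent's own tests can catch.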

Constraints

  • Agents may keep internal state. The harness does not inspect internals.
  • Agents should notice failures, retry conservatively, and replan when necessary.
  • Agents receive remaining budgets and must not exceed them.
  • Agents cannot terminate themselves explicitly.
  • Given identical inputs, agent behavior should be reproducible.
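
Several of these constraints can be combined in one small agent, sketched below under stated assumptions. The observation fields `remaining_steps` and `last_action_ok`, the task-spec keys `seed` and `paths`, the `"noop"` action type, and the `MAX_RETRIES` cap are all hypothetical. The contract does not say how budgets or failures are actually reported.

```python
import random


class RetryingAgent:
    """Illustrative only: field names and action types are assumptions."""

    MAX_RETRIES = 2  # hypothetical cap on retries per action

    def reset(self, task_spec: dict) -> None:
        # A private, seeded RNG keeps behavior reproducible for identical inputs.
        self.rng = random.Random(task_spec.get("seed", 0))
        self.pending = list(task_spec.get("paths", []))
        self.rng.shuffle(self.pending)  # deterministic given the seed
        self.current = None
        self.retries = 0
        self.remaining_steps = 0
        self.last_ok = True

    def observe(self, observation: dict) -> None:
        self.remaining_steps = observation["remaining_steps"]
        self.last_ok = observation["last_action_ok"]

    def act(self) -> dict:
        # Retry only if the last action failed, the retry cap is not reached,
        # and the budget still leaves room for the rest of the plan.
        can_retry = (
            not self.last_ok
            and self.current is not None
            and self.retries < self.MAX_RETRIES
            and self.remaining_steps > len(self.pending)
        )
        if can_retry:
            self.retries += 1
        else:
            # First step, success, or retries exhausted: move to the next item.
            self.current = self.pending.pop(0) if self.pending else None
            self.retries = 0
        if self.current is None:
            return {"type": "noop", "args": {}}
        return {"type": "read_file", "args": {"path": self.current}}
```

The agent never tries to end the run itself. When its plan is exhausted it keeps returning a harmless action and leaves termination to the harness.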