Agent Catalog

Reference agents that illustrate API usage and serve as baselines.

Available Agents

| Agent | Target Tasks | Highlights | File |
|---|---|---|---|
| ToyAgent | filesystem_hidden_config@1 | Cautious filesystem exploration, basic error handling | agents/toy_agent.py |
| NaiveLLMLoopAgent | filesystem_hidden_config@1 | Single-shot read/extract with one retry; minimal competence | agents/naive_llm_agent.py |
| RateLimitAgent | rate_limited_api@1 | Respects retry_after, retries transient failures, payload caching | agents/rate_limit_agent.py |
| ChainAgent | rate_limited_chain@1, deterministic_rate_service@1 | Handshake orchestration, payload resets, fatal/transient recovery | agents/chain_agent.py |
| CheaterSimAgent | All tasks (defense testing) | Intentionally probes sandbox to exfiltrate hidden state; expected to fail | agents/cheater_agent.py |
| OpsTriageAgent | log_alert_triage@1, config_drift_remediation@1, incident_recovery_chain@1 | Deterministic log/config triage, drift diffing, handoff chaining | agents/ops_triage_agent.py |
| LogStreamMonitorAgent | log_stream_monitor@1 | Patience + trigger detection across paginated log stream; record mode prototype | agents/log_stream_monitor_agent.py |

Usage

Each agent can be passed to the runner via the --agent flag. Agents are Python classes that implement the required reset/observe/act interface. The CheaterSimAgent is provided specifically to verify sandbox defenses and is expected to fail every run.
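
As a rough illustration of the reset/observe/act interface, here is a minimal sketch of an agent class. Only the three method names come from this document. The class name EchoAgent, the method signatures, the dictionary shapes for observations and actions, and the run_episode helper are all hypothetical, so check the reference agents under agents/ for the signatures the runner actually expects.

```python
from typing import Any, Dict, List, Optional


class EchoAgent:
    """Hypothetical minimal agent. Signatures are assumed, not taken from the runner."""

    def __init__(self) -> None:
        self.last_observation: Optional[Dict[str, Any]] = None
        self.steps = 0

    def reset(self) -> None:
        # Clear per-episode state so that runs do not leak into each other.
        self.last_observation = None
        self.steps = 0

    def observe(self, observation: Dict[str, Any]) -> None:
        # Store the latest observation (assumed here to be a plain dict).
        self.last_observation = observation

    def act(self) -> Dict[str, Any]:
        # Return an action (assumed here to be a plain dict).
        self.steps += 1
        if self.last_observation is None:
            return {"type": "noop"}
        return {"type": "echo", "payload": self.last_observation}


def run_episode(agent: EchoAgent, observations: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Hypothetical stand-in for the runner loop: reset once, then observe/act per step."""
    agent.reset()
    actions = []
    for obs in observations:
        agent.observe(obs)
        actions.append(agent.act())
    return actions
```

Keeping all per-episode state inside reset, as in this sketch, makes repeated runs of the same agent instance independent of one another.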