TraceCore Documentation
A deterministic, budgeted, and auditable test runner for agent control loops.
What is TraceCore?
TraceCore is best understood as a deterministic test runner for agent loops: think "pytest for agents." It does not measure broad intelligence. It validates whether an agent can operate correctly, repeatedly, under constraints.
The invariant core is a Deterministic Episode Runtime: a bounded runtime that executes agent-environment interaction with fixed inputs and emits replayable traces plus a structured verdict. TraceCore v1.0 locks this behavior behind the tracecore-spec v1.0 contract and `tracecore run --strict-spec`.
Key Concepts
- Episodes are the smallest valid execution unit.
- Each episode runs Agent + Environment + Seed + Budgets and produces a trace + outcome.
- Validation is always deterministic and mechanical, with no LLM judges.
- Artifacts are machine-readable and CI-friendly.
- Replay is a first-class property, not a convenience feature.
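The concepts above can be illustrated with a minimal sketch of a seeded, budgeted episode loop. This is not TraceCore's API: `run_episode`, `Budgets`, `EpisodeResult`, `CounterEnv`, and `increment_agent` are hypothetical names invented for illustration, and the toy environment stands in for a real one. The sketch assumes that all randomness flows from the seed, that the budget is a simple step cap, and that validation is a plain predicate.

```python
import hashlib
import json
import random
from dataclasses import dataclass, field


@dataclass
class Budgets:
    max_steps: int  # hard cap on agent-environment interactions (hypothetical budget)


@dataclass
class EpisodeResult:
    outcome: str  # "pass", "fail", or "budget_exhausted"
    trace: list = field(default_factory=list)

    def digest(self) -> str:
        # Stable hash of the trace, so two runs can be compared exactly.
        payload = json.dumps(self.trace, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


class CounterEnv:
    """Toy environment: the agent must reach a seeded target by incrementing."""

    def __init__(self, rng: random.Random):
        self.target = rng.randint(3, 6)
        self.value = 0

    def observe(self) -> dict:
        return {"value": self.value, "target": self.target}

    def step(self, action: str) -> None:
        if action == "inc":
            self.value += 1

    def solved(self) -> bool:
        return self.value == self.target


def increment_agent(observation: dict) -> str:
    """Toy agent: increments until the target is reached, then stops."""
    return "inc" if observation["value"] < observation["target"] else "stop"


def run_episode(agent, env_factory, seed: int, budgets: Budgets) -> EpisodeResult:
    rng = random.Random(seed)  # all randomness flows from the seed
    env = env_factory(rng)
    result = EpisodeResult(outcome="budget_exhausted")
    for step in range(budgets.max_steps):
        obs = env.observe()
        action = agent(obs)
        result.trace.append({"step": step, "obs": obs, "action": action})
        if action == "stop":
            # Mechanical validation: a plain predicate, no model-based judging.
            result.outcome = "pass" if env.solved() else "fail"
            return result
        env.step(action)
    return result  # budget ran out before the agent stopped
```

Running `run_episode(increment_agent, CounterEnv, seed=7, budgets=Budgets(max_steps=10))` twice yields results whose `digest()` values match, which is the property replay relies on. Lowering `max_steps` to 2 produces a `budget_exhausted` outcome instead of a pass.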
Quick Start
pip install tracecore
tracecore version
tracecore run pairing log_stream_monitor --strict-spec
Practical Value
- Regression detection with stable seeds and baseline compare workflows.
- Actionable failures via structured taxonomy and full trace context.
- CI-native gating using deterministic pass/fail and policy thresholds.
- Auditable evidence through persisted run artifacts and replayability.
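As a rough sketch of how baseline comparison and threshold gating might combine in a CI step, consider the following. It does not use TraceCore's artifact format or CLI: the outcome dictionaries, `compare_to_baseline`, and `gate` are hypothetical, and the sketch assumes each run can be reduced to a mapping from episode id to outcome string.

```python
def compare_to_baseline(baseline: dict, current: dict) -> list:
    """Return ids of episodes that passed in the baseline but not in the current run."""
    return sorted(
        episode_id
        for episode_id, outcome in baseline.items()
        if outcome == "pass" and current.get(episode_id) != "pass"
    )


def gate(current: dict, min_pass_rate: float) -> bool:
    """Policy threshold: True when the pass rate meets the required minimum."""
    if not current:
        return False  # an empty run is never accepted
    passed = sum(1 for outcome in current.values() if outcome == "pass")
    return passed / len(current) >= min_pass_rate


def ci_verdict(baseline: dict, current: dict, min_pass_rate: float) -> dict:
    """Combine both checks into one machine-readable verdict."""
    regressions = compare_to_baseline(baseline, current)
    ok = gate(current, min_pass_rate) and not regressions
    return {"regressions": regressions, "gate_passed": ok}
```

For example, with a baseline of `{"ep-001": "pass", "ep-002": "pass"}` and a current run of `{"ep-001": "pass", "ep-002": "fail"}`, `ci_verdict` reports `ep-002` as a regression and fails the gate even if the pass-rate threshold is met. Because outcomes are deterministic for a fixed seed, such a difference points at a change in the agent or environment rather than at run-to-run noise.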
One-Line Mental Model
If pytest tests functions, TraceCore executes deterministic episodes. If Docker packages applications into containers, TraceCore packages bounded agent-environment interactions.