🌰 seedling
Sprint Contracts - Negotiated Agent Agreements


Why contracts matter

High-level product specs are intentionally vague about implementation details — specifying too much upfront risks cascading errors when the spec gets something wrong. But an evaluator needs concrete criteria to grade against. Sprint contracts bridge this gap: they are specific enough to test against but written just before implementation rather than at planning time.

Source: Harness design for long-running application development (Prithvi Rajasekaran, Anthropic, March 2026)

The negotiation flow

  1. Generator proposes — what it will build in this sprint and how success will be verified
  2. Evaluator reviews — checks whether the generator is building the right thing relative to the spec
  3. Iterate — the two agents go back and forth until they agree on the contract
  4. Generator builds — implements against the agreed contract
  5. Evaluator grades — tests against the contract’s criteria

Communication happens via files: one agent writes a file, and the other reads it and responds either within that file or with a new one. No shared context window is needed.
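
The propose/review/iterate steps above can be sketched as a small loop over files on disk. This is only an illustration under stated assumptions: the file names, the JSON layout, the `negotiate` helper, and the two stub agents are all hypothetical, since the source says only that the agents exchange files.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical file names and JSON layout (not taken from the source).
PROPOSAL = "contract_proposal.json"
REVIEW = "contract_review.json"


def write_json(path: Path, payload: dict) -> None:
    path.write_text(json.dumps(payload, indent=2), encoding="utf-8")


def read_json(path: Path) -> dict:
    return json.loads(path.read_text(encoding="utf-8"))


def negotiate(workdir: Path, propose, review, max_rounds: int = 5) -> dict:
    """Alternate proposal and review files until the evaluator approves.

    `propose(feedback)` and `review(proposal)` stand in for the two agents.
    Each side sees only what it reads from disk, never the other's context.
    """
    feedback = None
    for round_no in range(1, max_rounds + 1):
        # Generator side: write a proposal that accounts for earlier feedback.
        write_json(workdir / PROPOSAL, {"round": round_no, **propose(feedback)})

        # Evaluator side: read the proposal from disk and write a review.
        proposal = read_json(workdir / PROPOSAL)
        write_json(workdir / REVIEW, review(proposal))

        # Generator side: read the review back from disk.
        verdict = read_json(workdir / REVIEW)
        if verdict["approved"]:
            return proposal
        feedback = verdict["feedback"]
    raise RuntimeError(f"no agreement after {max_rounds} rounds")


# Stub agents for demonstration only; real agents would be model calls.
def stub_generator(feedback):
    criteria = ["Rectangle fill tool allows click-drag to fill area"]
    if feedback:
        criteria.append("User can select and delete entity spawn points")
    return {"sprint": 3, "criteria": criteria}


def stub_evaluator(proposal):
    if len(proposal["criteria"]) < 2:
        return {"approved": False, "feedback": "Entity deletion is not covered."}
    return {"approved": True, "feedback": ""}


def run_demo() -> dict:
    with tempfile.TemporaryDirectory() as tmp:
        return negotiate(Path(tmp), stub_generator, stub_evaluator)


if __name__ == "__main__":
    contract = run_demo()
    print(contract["round"], contract["criteria"])
```

The round cap is a design choice of this sketch rather than something the source describes; without some limit, two agents that never agree would loop indefinitely.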

Granularity in practice

Contracts can be very granular. In one example, Sprint 3 alone had 27 criteria covering the level editor. The evaluator’s findings were specific enough to act on without extra investigation:

| Contract criterion | Evaluator finding |
| --- | --- |
| Rectangle fill tool allows click-drag to fill area | FAIL: Tool only places tiles at drag start/end points instead of filling the region |
| User can select and delete entity spawn points | FAIL: Delete key handler requires both selection and selectedEntityId to be set, but clicking an entity only sets one |
| User can reorder animation frames via API | FAIL: PUT /frames/reorder route defined after /{frame_id} routes; FastAPI matches “reorder” as a frame_id integer |

These are implementation-level bugs caught by contract-level criteria — the contract translated high-level user stories into testable behaviors that the evaluator could exercise through browser automation.
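
One way to picture a graded contract is as a list of criteria, each carrying a verdict and a finding. The sketch below is a hypothetical data model, not the harness's actual format: the `Criterion` and `SprintContract` classes and their fields are assumptions, and only the three criteria and findings come from the example above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Criterion:
    description: str
    passed: Optional[bool] = None  # None until the evaluator grades it
    finding: str = ""


@dataclass
class SprintContract:
    sprint: int
    criteria: List[Criterion] = field(default_factory=list)

    def record(self, index: int, passed: bool, finding: str = "") -> None:
        """Store the evaluator's verdict for one criterion."""
        self.criteria[index].passed = passed
        self.criteria[index].finding = finding

    def is_graded(self) -> bool:
        """True once every criterion has a verdict."""
        return all(c.passed is not None for c in self.criteria)

    def failures(self) -> List[Criterion]:
        return [c for c in self.criteria if c.passed is False]

    def report(self) -> str:
        """One line per criterion, suitable for writing to a feedback file."""
        lines = []
        for c in self.criteria:
            status = "UNGRADED" if c.passed is None else "PASS" if c.passed else "FAIL"
            suffix = f": {c.finding}" if c.finding else ""
            lines.append(f"[{status}] {c.description}{suffix}")
        return "\n".join(lines)


def example_contract() -> SprintContract:
    contract = SprintContract(
        sprint=3,
        criteria=[
            Criterion("Rectangle fill tool allows click-drag to fill area"),
            Criterion("User can select and delete entity spawn points"),
            Criterion("User can reorder animation frames via API"),
        ],
    )
    contract.record(0, False, "Tool only places tiles at drag start/end points")
    contract.record(1, False, "Delete key handler requires two fields; click sets one")
    contract.record(2, False, "PUT /frames/reorder defined after /{frame_id} routes")
    return contract


if __name__ == "__main__":
    print(example_contract().report())
```

Keeping the finding next to the criterion it belongs to means the generator can read the report and act on each failure without reconstructing which behavior was being tested.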

The spec cascade problem

The planner intentionally avoids specifying granular technical details. If the planner gets implementation details wrong upfront, errors cascade into downstream work. Sprint contracts solve this by deferring implementation specifics to the moment before building, when the generator has the most context about the current state of the codebase.

This mirrors a well-known software engineering principle: defer decisions to the last responsible moment. The sprint contract is that moment for agent harnesses.

