Research
Every non-trivial optimisation in the format pipeline is grounded in a paper measured against a real corpus: 523 Claude Code sessions and 10,644 MCP tool responses from production traffic. The full source of truth (methods, datasets, reproducibility scripts) lives at docs/research/INDEX.md in the repo. This page is a quick orientation.
Papers
Corpus baselines
These numbers ground every paper above (paper 1 §B):
- get_merge_request_diffs: P90 = 35 k chars ≈ 10 k tokens; 28% of responses exceed an 8 k-token budget.
- get_epics: P90 = 43 k chars ≈ 12 k tokens; 37% exceed the budget.
- After overflow, agents always produce a text response on the next turn; they never retry or paginate.
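As a rough illustration of how figures of this kind (a P90 size and the share of responses over budget) could be derived from raw response lengths, here is a minimal sketch. The 3.5 chars-per-token ratio is an assumption inferred from the 35 k chars ≈ 10 k tokens figure above. The helper names, the nearest-rank percentile method, and the sample data are hypothetical and are not taken from the repo's reproducibility scripts.

```python
import math

CHARS_PER_TOKEN = 3.5  # assumption: implied by "35 k chars ≈ 10 k tokens"
TOKEN_BUDGET = 8_000   # the 8 k-token budget quoted in the baselines


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of the data at or below it."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def estimate_tokens(char_count):
    """Convert a character count to an approximate token count (assumed ratio)."""
    return char_count / CHARS_PER_TOKEN


def overflow_share(char_counts, budget=TOKEN_BUDGET):
    """Fraction of responses whose estimated token count exceeds the budget."""
    over = sum(1 for c in char_counts if estimate_tokens(c) > budget)
    return over / len(char_counts)


# Hypothetical response sizes in characters (illustrative only, not corpus data).
sample = [2_000, 5_000, 9_000, 12_000, 18_000,
          21_000, 26_000, 30_000, 35_000, 60_000]

p90_chars = percentile(sample, 90)
print(p90_chars, round(estimate_tokens(p90_chars)), overflow_share(sample))
```

Whenever the P90 sits above the budget, more than 10% of responses overflow, which is consistent with the 28% and 37% shares reported for the two tools.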
Status
- Paper 1 — complete draft, replicated across 3 corpora, lands in the next minor version.
- Paper 2 — complete draft, lands in the next minor version (the format-adaptive encoder is already in the codebase under feature flag).
- Paper 3 — prefetch dispatcher merged in v0.22; production telemetry pending.
- Paper 4 — concept stage, no production code yet.
Notebooks & data
Reproducibility scripts and the corpus notebook are under docs/research/ in the repo (paper-1-repro, benchmarks, data, notebooks).