8 min read
Ground-truth leakage in agentic evals
Why RAG and tool-using agent evals leak the target answer into retrieved context, the four channels the leak travels through, and two checks (a deterministic static scan and a blind baseline) for detecting and fixing it. Includes a zero-dependency open-source skill.