Golden cases for CODE_QA pipeline calibration
Each case defines:
id: unique case idquery: user query textexpected_intent: CODE_QA (or DOCS_QA for docs-only; this set is code-only)expected_sub_intent: OPEN_FILE | EXPLAIN | FIND_TESTS | FIND_ENTRYPOINTS | GENERAL_QAexpected_answer_mode: normal | degraded | insufficientexpected_target_hint: optional — path (for OPEN_FILE), symbol (for EXPLAIN), or test-likeexpected_layers: optional — list of layer ids we expect in the retrieval plannotes: optional — borderline, negative, or calibration hint
We assert routing, retrieval alignment, evidence sufficiency, and answer mode — not exact LLM wording.