Golden cases for CODE_QA pipeline calibration

Each case defines:

id: unique case id
query: user query text
expected_intent: CODE_QA (or DOCS_QA for docs-only; this set is code-only)
expected_sub_intent: OPEN_FILE | EXPLAIN | FIND_TESTS | FIND_ENTRYPOINTS | GENERAL_QA
expected_answer_mode: normal | degraded | insufficient
expected_target_hint: optional — path (for OPEN_FILE), symbol (for EXPLAIN), or test-like
expected_layers: optional — list of layer ids we expect in the retrieval plan
notes: optional — borderline, negative, or calibration hint

We assert routing, retrieval alignment, evidence sufficiency, and answer mode — not exact LLM wording.