Snapshot of the CODE_QA runtime pipeline (answer layer)

This document records the current state of the CODE_QA runtime pipeline after the refactoring, as a basis for planning answer-layer improvements. It contains no proposals for a new design and no implementation brief.

1. Entry point

HTTP: POST /api/chat/messages → ChatModule.public_router() → send_message().
File: src/app/modules/chat/module.py, lines 74–81.
Condition: when SIMPLE_CODE_EXPLAIN_ONLY=true, the request goes to CodeExplainChatService.handle_message() (a direct explain that bypasses the full CODE_QA pipeline). When false, it goes to the orchestrator.
Orchestrator: ChatOrchestrator.enqueue_message() creates a task and starts _process_task(), which calls self._runtime.run(...).
File: src/app/modules/chat/service.py, lines 47–69 and 71–132.
Runtime adapter: CodeQaRunnerAdapter implements AgentRunner; its run() calls self._executor.execute(user_query=..., rag_session_id=..., files_map=...) in a thread pool.
File: src/app/modules/agent/runtime/code_qa_runner_adapter.py, lines 21–41.
Actual CODE_QA entry point: AgentRuntimeExecutor.execute().
File: src/app/modules/agent/runtime/executor.py, line 53.
Executor creation: application.py, line 48: _executor = AgentRuntimeExecutor(llm=..., retrieval=...).
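
The adapter step above (an async run() that hands a synchronous execute() to a thread pool) can be sketched as follows. This is a minimal illustration, not the project's code: StubExecutor and RunnerAdapterSketch are hypothetical names, and the use of asyncio.to_thread is an assumption, since the source only says "thread pool".

```python
import asyncio


class StubExecutor:
    """Hypothetical stand-in for AgentRuntimeExecutor; the real one runs the full pipeline."""

    def execute(self, *, user_query: str, rag_session_id: str, files_map: dict) -> str:
        # Blocking, synchronous work would happen here.
        return f"answer for: {user_query}"


class RunnerAdapterSketch:
    """Illustrative adapter: exposes an async run() over a synchronous executor."""

    def __init__(self, executor: StubExecutor) -> None:
        self._executor = executor

    async def run(self, user_query: str, rag_session_id: str, files_map: dict) -> str:
        # asyncio.to_thread (assumed mechanism) offloads the blocking call
        # to the default thread pool so the event loop is not blocked.
        return await asyncio.to_thread(
            self._executor.execute,
            user_query=user_query,
            rag_session_id=rag_session_id,
            files_map=files_map,
        )


result = asyncio.run(
    RunnerAdapterSketch(StubExecutor()).run("what does X do?", "session-1", {})
)
```

The keyword-only call mirrors the execute(user_query=..., rag_session_id=..., files_map=...) signature quoted above.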

2. Runtime pipeline

Chain inside AgentRuntimeExecutor.execute() (file executor.py):

Step	File	Class/function	Role
1. Routing	`executor.py`	`self._router.route(user_query, ...)`	Intent + sub-intent, query_plan, retrieval_spec, symbol_resolution (pending).
2. Retrieval request assembly	`retrieval_request_builder.py`	`build_retrieval_request(router_result, rag_session_id)`	Builds a `RetrievalRequest` from `RouterResult`: query, sub_intent, path_scope, requested_layers, retrieval_spec, constraints, query_plan.
3. Retrieval	`executor.py`	`self._retrieve(state)` → `RuntimeRetrievalAdapter.retrieve_with_plan()`, or `retrieve_exact_files()` for OPEN_FILE	Retrieves by plan or by exact paths; returns `raw_rows` (list[dict]).
4. Hydration (FIND_ENTRYPOINTS only)	`executor.py`	`_hydrate_entrypoint_sources()`	Follow-up C0 query for the paths taken from C3 entrypoints.
5. Symbol resolution	`executor.py`	`_resolve_symbol(initial, raw_rows)`	Uses C1_SYMBOL_CATALOG: resolved / ambiguous / not_found; updates `state.router_result.symbol_resolution`.
6. Retrieval result	`retrieval_result_builder.py`	`build_retrieval_result(raw_rows, report, symbol_resolution)`	Normalized `RetrievalResult`: code_chunks, relations, entrypoints, test_candidates, layer_outcomes, etc. For EXPLAIN with not_found/ambiguous, the result is rebuilt with empty rows (executor.py, lines 90–91).
7. Evidence bundle	`evidence_bundle_builder.py`	`build_evidence_bundle(retrieval_result, router_result)`	`EvidenceBundle`: resolved_sub_intent, resolved_target, code_chunks, relations, entrypoints, test_evidence, retrieval_summary. sufficient/failure_reasons are not set here.
8. Pre evidence gate	`evidence_gate.py`	`evaluate_evidence(state.evidence_pack)`	Checks sufficiency per sub_intent (target, evidence_count, layers, entrypoints, tests). Sets `bundle.sufficient` and returns `EvidenceGateDecision`, which determines `state.answer_mode` (normal/degraded).
9. Answer policy	`policy.py`	`self._answer_policy.decide(router_result, gate_decision)`	Decides whether to call the LLM or return a short answer (OPEN_FILE not_found, EXPLAIN not_found/ambiguous, gate not passed). With `should_call_llm=False`, execution goes straight to `assemble_final_result` with `decision.answer`.
10. Synthesis input	`answer_synthesis.py`	`build_answer_synthesis_input(user_query, state.evidence_pack)`	Builds `AnswerSynthesisInput`: fast_context, deep_context, evidence_summary, semantic_hints, curated_facts (from answer_fact_curator).
11. Prompt selection	`prompt_selector.py`	`self._prompt_selector.select(sub_intent=..., answer_mode=...)`	System prompt name by sub_intent (and degraded mode).
12. Payload	`prompt_payload_builder.py`	`self._payload_builder.build(user_query, synthesis_input, evidence_pack, answer_mode)`	JSON payload for the LLM: user_query, resolved_scenario, fast/deep_context, evidence_summary, curated must_mention_*, layer_guide, entrypoints, scenario-specific fields.
13. Draft generation	`generator.py`	`self._generator.generate(prompt_name, prompt_payload)`	Calls `AgentLlmService.generate(prompt_name, payload)` → draft answer.
14. Post evidence gate	`post_gate.py`	`self._post_gate.validate(answer, answer_mode, ..., sub_intent, user_query, evidence_pack)`	Validates the draft per sub_intent (EXPLAIN/ARCHITECTURE/TRACE_FLOW/…) and returns `RuntimeValidationResult(passed, action, reasons)`.
15. Repair (if not passed)	`repair.py`	`self._repair.repair(draft_answer, validation, prompt_payload)`	One LLM call with the `code_qa_repair_answer` prompt; revalidation; on a second failure, the fallback answer.
16. Final result	`result_assembler.py`	`assemble_final_result(state, draft=..., final_answer=..., ...)`	Assembles `RuntimeFinalResult` and diagnostics.

The sub-intent for CODE_QA is set in the router: QueryPlanBuilder uses SubIntentDetector.detect() and _resolve_sub_intent(); the result is stored in query_plan.sub_intent. Retrieval layers per sub_intent are set in RetrievalSpecFactory._with_sub_intent_layers() (retrieval_spec_factory.py).
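
The short-circuit in step 9 (skip the LLM and return a short answer) can be sketched as below. This is only an illustration under stated assumptions: PolicyDecisionSketch, the decide() signature, the target_status parameter, and the placeholder answer strings are hypothetical; the real decide() takes router_result and gate_decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyDecisionSketch:
    """Hypothetical decision object: call the LLM, or return a short answer."""

    should_call_llm: bool
    answer: str = ""  # placeholder short answer, used only when the LLM is skipped


def decide(sub_intent: str, target_status: str, gate_passed: bool) -> PolicyDecisionSketch:
    # The three short-answer cases listed for step 9; answer texts are invented.
    if sub_intent == "OPEN_FILE" and target_status == "not_found":
        return PolicyDecisionSketch(False, "File not found.")
    if sub_intent == "EXPLAIN" and target_status in {"not_found", "ambiguous"}:
        return PolicyDecisionSketch(False, f"Target is {target_status}.")
    if not gate_passed:
        return PolicyDecisionSketch(False, "Not enough evidence to answer.")
    # Otherwise the pipeline continues with steps 10-16.
    return PolicyDecisionSketch(True)
```

With should_call_llm=False the pipeline would skip steps 10–15 and pass the short answer to the final assembly.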

3. Answer path

Prompt selection: RuntimePromptSelector.select(sub_intent, answer_mode), in src/app/modules/agent/runtime/steps/generation/prompt_selector.py, lines 18–21. When answer_mode in {"degraded","not_found","insufficient"}, it returns code_qa_degraded_answer; otherwise it looks up sub_intent in a dictionary (fallback: code_qa_explain_answer).
Payload assembly: RuntimePromptPayloadBuilder.build(), in prompt_payload_builder.py, lines 21–44. The payload contains: user_query, resolved_scenario, resolved_target, answer_mode, fast_context, deep_context, evidence_summary, semantic_hints, diagnostic_hints, retrieval_summary, confirmed_entrypoints, required_entrypoints, layer_guide, plus scenario-specific fields from _scenario_payload(synthesis_input) (must_mention_*, fact_gaps, etc.).
Draft answer: created in executor.py, lines 242–246: RuntimeDraftAnswer(prompt_name=..., prompt_payload=..., answer=self._generator.generate(...)).
Post-processing: there is no separate step; post-validation follows generation directly.
Repair: RuntimeAnswerRepairService.repair(), in repair.py, lines 16–37. It builds a JSON with draft_answer, validation_reasons, repair_focus, and prompt_payload, and calls the LLM once with code_qa_repair_answer.
Final text: in the executor, when validation passes, final_answer = draft.answer (or the repair result); when it still fails after repair, _fallback_answer(state) is used. The resulting string ends up in RuntimeFinalResult.final_answer in assemble_final_result().
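
The repair request described above (a JSON with draft_answer, validation_reasons, repair_focus, prompt_payload) might look roughly like this. The four keys come from the description; the function name, the _FOCUS_HINTS table, and its hint texts are hypothetical, because the real reasons-to-focus mapping in _repair_focus() is not shown in this document.

```python
import json

# Hypothetical reason -> focus hints. The reason names appear in the post-gate
# rules; the hint wording is invented for illustration.
_FOCUS_HINTS = {
    "too_vague_for_explain": "name concrete methods, calls, and dependencies",
    "semantic_leakage": "drop role labels that are not backed by code",
}


def build_repair_request(draft_answer: str, validation_reasons: list, prompt_payload: dict) -> str:
    # Unknown reasons fall through unchanged so that nothing is silently dropped.
    repair_focus = [_FOCUS_HINTS.get(reason, reason) for reason in validation_reasons]
    return json.dumps(
        {
            "draft_answer": draft_answer,
            "validation_reasons": validation_reasons,
            "repair_focus": repair_focus,
            "prompt_payload": prompt_payload,
        },
        ensure_ascii=False,
    )


request = json.loads(
    build_repair_request("It handles things.", ["too_vague_for_explain"], {"user_query": "q"})
)
```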

4. Prompt selection

Where: src/app/modules/agent/runtime/steps/generation/prompt_selector.py, class RuntimePromptSelector, method select(sub_intent, answer_mode).
Rules:
- answer_mode in {"degraded","not_found","insufficient"} → code_qa_degraded_answer.
- Otherwise, lookup by sub_intent.upper() in _PROMPTS; if the key is missing, code_qa_explain_answer.
Prompt names used for the target sub_intents:

sub_intent	prompt name
EXPLAIN	`code_qa_explain_answer`
EXPLAIN_LOCAL	`code_qa_explain_local_answer`
ARCHITECTURE	`code_qa_architecture_answer`
TRACE_FLOW	`code_qa_trace_flow_answer`

Templates: loaded by name from YAML in AgentLlmService.generate() → PromptLoader.load(name); the config is src/app/modules/agent/llm/prompts.yml. The YAML keys match the names above (including code_qa_explain_answer, code_qa_architecture_answer, code_qa_trace_flow_answer); the repair prompt is code_qa_repair_answer.
Selection by sub_intent: yes, only via RuntimePromptSelector.select(sub_intent=state.retrieval_request.sub_intent, ...) in the executor, line 231.
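
The selection rules in this section can be sketched as a small function. The prompt names and the degraded-mode set are taken from the text above; the standalone select_prompt() function and the _DEGRADED_MODES constant name are assumptions made for illustration (the real logic lives in RuntimePromptSelector.select()).

```python
# Answer modes that always route to the degraded prompt (from the rules above).
_DEGRADED_MODES = frozenset({"degraded", "not_found", "insufficient"})

# sub_intent -> prompt name, as listed in the table above.
_PROMPTS = {
    "EXPLAIN": "code_qa_explain_answer",
    "EXPLAIN_LOCAL": "code_qa_explain_local_answer",
    "ARCHITECTURE": "code_qa_architecture_answer",
    "TRACE_FLOW": "code_qa_trace_flow_answer",
}


def select_prompt(sub_intent: str, answer_mode: str) -> str:
    # The degraded check comes first, so it overrides any sub_intent.
    if answer_mode in _DEGRADED_MODES:
        return "code_qa_degraded_answer"
    # Unknown sub_intents fall back to the generic explain prompt.
    return _PROMPTS.get(sub_intent.upper(), "code_qa_explain_answer")
```

Note that an unknown sub_intent silently gets the EXPLAIN prompt, so a routing mistake upstream would not surface at this step.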

5. Evidence-to-answer boundary

Evidence reaches the answer layer as:
- EvidenceBundle (in state.evidence_pack), and
- AnswerSynthesisInput (state.synthesis_input), built from the bundle in build_answer_synthesis_input().
Models/DTOs:
- EvidenceBundle: contracts.py, lines 90–106: resolved_intent, resolved_sub_intent, resolved_target, target_type, code_chunks, relations, entrypoints, test_evidence, evidence_count, retrieval_summary.
- AnswerSynthesisInput: contracts.py, lines 109–121: user_question, resolved_scenario, resolved_target, fast_context, deep_context, evidence_summary, semantic_hints, curated_facts, evidence_sufficient, diagnostic_hints.
- Curated facts are built by answer_fact_curator.build_curated_answer_facts(bundle): a dictionary with the keys explain, architecture, trace_flow, plus common fields (scenario, semantic_hints, relation_count, etc.).
What actually goes into the payload (prompt_payload_builder):
- Common: user_query, resolved_scenario, resolved_target, answer_mode, fast_context, deep_context, evidence_summary, semantic_hints, diagnostic_hints, retrieval_summary, confirmed_entrypoints, required_entrypoints, layer_guide.
- EXPLAIN: must_mention_methods/fields/calls/dependencies/constructor_args/files, must_not_infer_missing_details, fact_gaps (from curated_facts["explain"]).
- ARCHITECTURE: must_mention_components/relations, must_use_relation_verbs, must_avoid_semantic_labels_as_primary_claims, must_not_use_retrieval_labels, fact_gaps (from curated_facts["architecture"]).
- TRACE_FLOW: must_mention_flow_steps/calls/sequence_edges, must_avoid_overclaiming_full_flow, fact_gaps (from curated_facts["trace_flow"]).
Curated fields (answer_fact_curator):
- explain: required_methods, required_calls, required_fields, required_dependencies, required_constructor_args, required_files, fact_gaps (and others).
- architecture: required_components, required_relations (source/verb/target/edge_type), required_relation_verbs, required_*_edges, forbidden_labels, fact_gaps.
- trace_flow: required_flow_steps (step, source, verb, target, path, line_span), required_calls, required_sequence_edges, fact_gaps.

In other words, the LLM receives not raw retrieval output but a normalized context (fast/deep_context, evidence_summary) plus explicit per-scenario must_mention_* lists and fact_gaps; dedicated curated fields already exist for methods, dependencies, relations, and flow steps.

6. Post-validation / answer quality control

Post-evidence gate (runtime): present. RuntimePostEvidenceGate.validate(), in src/app/modules/agent/runtime/steps/gates/post/post_gate.py, lines 39–65. It is called after the draft is generated (and again after repair).
Answer validator: this is the same post_gate. It checks for an empty answer, for the wording required by the answer_mode (degraded/not_found/ambiguous), and for length in degraded mode; for normal mode it then runs _normal_answer_reasons() per sub_intent.
Repair loop: a single round. If validation.passed is false and self._repair is set, repair() is called, followed by a second validate(); if that also fails, _fallback_answer() is substituted and answer_mode is changed (executor.py, lines 281–298).
Rules per sub_intent (post_gate):
- EXPLAIN (lines 93–124): target focus; vagueness (_VAGUE_PHRASES); presence of required_methods/calls/dependencies (at least one group); "too_vague_for_explain" on zero matches; semantic_leakage (roles from semantic_hints not backed by code).
- ARCHITECTURE (lines 126–150): target focus; vagueness; required_components, required_relations, relation_verbs; forbidden_labels (retrieval artifacts); methods_as_primary_components; "too_vague_for_architecture"; semantic_leakage.
- TRACE_FLOW (lines 152–171): target focus; vagueness; required_flow_steps and required_calls; _mentions_steps (the words "сначала"/"затем", i.e. "first"/"then", or numbering); overclaims (_OPTIMISTIC_TRACE_CLAIMS); "too_vague_for_trace_flow".
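
The EXPLAIN rule "at least one group mentioned, otherwise too_vague_for_explain" can be sketched with plain substring matching. The function names and the case-insensitive comparison are assumptions; the real checks use helpers such as _mentions_fact_group and also cover target focus, vague phrases, and semantic leakage, which are omitted here.

```python
def mentions_any(answer: str, names: list) -> bool:
    # Substring match, case-insensitive (the case handling is an assumption).
    lowered = answer.lower()
    return any(name.lower() in lowered for name in names)


def explain_reasons(
    answer: str,
    required_methods: list,
    required_calls: list,
    required_dependencies: list,
) -> list:
    groups = (required_methods, required_calls, required_dependencies)
    # Count how many non-empty groups have at least one item mentioned in the answer.
    matched = sum(1 for group in groups if group and mentions_any(answer, group))
    # Zero matched groups -> the answer is flagged as too vague.
    return [] if matched else ["too_vague_for_explain"]
```

One consequence of this shape: if all curated groups are empty, no answer can match and every draft is flagged as vague.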
Technical precision for EXPLAIN: checked indirectly, through mentions of methods/calls/dependencies from the curated facts; there is no explicit token-level check that the answer contains only facts from the code.
Concrete relations for ARCHITECTURE: yes, via _mentions_relations(answer, relations) and mentions of verbs.
Concrete steps and overclaim for TRACE_FLOW: yes, via _mentions_steps, _mentions_relations over the steps, and a check for phrases from _OPTIMISTIC_TRACE_CLAIMS.

7. Problem sources (what can lead to weak answers)

Payload shaping: prompt_payload_builder.py. If curated_facts are empty or sparse (few methods/calls/relations/steps), the must_mention_* lists give the model no guidance; deep_context is truncated to 30 chunks of 800 characters each, so important details can be lost.
Prompts: prompts.yml. The general instructions are long; for EXPLAIN/ARCHITECTURE/TRACE_FLOW there is no strict binding to the payload structure (for example, "always use must_mention_flow_steps in order"); the model can ignore fact_gaps.
Evidence normalization: answer_fact_curator. Methods/calls/relations are extracted heuristically (regex, C1/C2); with weak C1/C2 data or non-standard names the curated lists come out empty, the validator has nothing to anchor on, and the answer is classified as "vague".
Weak validation: post_gate. Checks rely on substring matches (aliases) and a small set of phrases; there is no completeness check (whether every must_mention_* item is mentioned) and no step-order check for TRACE_FLOW; semantic_leakage is switched off when has_concrete_support is set, which can let mixed answers through.
Repair policy: a single repair call with the generic code_qa_repair_answer prompt and a repair_focus derived from the reasons; with multiple reasons the focus can blur; a second failure after repair goes straight to the fallback, with no second repair round.
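
The deep_context limit mentioned under payload shaping (30 chunks of 800 characters) can be illustrated with a small sketch. The function name and the "keep the first N chunks, keep the first M characters" strategy are assumptions; the source states only the limits.

```python
def truncate_deep_context(chunks: list, max_chunks: int = 30, max_chars: int = 800) -> list:
    # Assumed strategy: keep the first max_chunks chunks and cut each
    # one to its first max_chars characters.
    return [chunk[:max_chars] for chunk in chunks[:max_chunks]]


# 40 chunks of 1000 characters each: 10 chunks are dropped entirely,
# and the remaining 30 each lose their last 200 characters.
chunks = ["x" * 1000 for _ in range(40)]
trimmed = truncate_deep_context(chunks)
dropped_chars = sum(len(c) for c in chunks) - sum(len(c) for c in trimmed)
```

In this example 16,000 of 40,000 characters never reach the model, which shows how a method body or a late flow step could be cut.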

8. Minimal intervention points

src/app/modules/agent/runtime/steps/generation/prompt_payload_builder.py
Class RuntimePromptPayloadBuilder, methods build() and _scenario_payload().
Controls: which fields and lists (must_mention_*, fact_gaps, layer_guide) go into the JSON for the LLM.
Why it is convenient: it is the single entry to "what the model sees"; the structure for EXPLAIN/ARCHITECTURE/TRACE_FLOW can be tightened without touching the orchestration.
src/app/modules/agent/runtime/steps/context/answer_fact_curator.py
Functions _explain_facts(), _architecture_facts(), _trace_flow_facts().
Control: the composition and quality of curated_facts (required_*, fact_gaps).
Why it is convenient: better extraction of methods/relations/steps directly improves both the payload and the validation.
src/app/modules/agent/runtime/steps/gates/post/post_gate.py
Class RuntimePostEvidenceGate, methods _validate_explain(), _validate_architecture(), _validate_trace_flow(), and helpers (_mentions_fact_group, _mentions_relations, _mentions_steps).
Controls: the pass criteria and the set of reasons passed to repair.
Why it is convenient: it is already split by scenario; rules can be tightened and new reasons added without changing the architecture.
src/app/modules/agent/llm/prompts.yml
Blocks code_qa_explain_answer, code_qa_architecture_answer, code_qa_trace_flow_answer, code_qa_repair_answer.
Control: the instructions for the draft and for the repair.
Why it is convenient: allows targeted wording changes and explicit references to payload fields (must_mention_*, fact_gaps).
src/app/modules/agent/runtime/steps/generation/prompt_selector.py
Class RuntimePromptSelector, dictionary _PROMPTS, and method select().
Controls: which system prompt is selected by sub_intent/answer_mode.
Why it is convenient: separate prompts for sub-variants (for example, TRACE_FLOW by query type) can be introduced without changing the executor.
src/app/modules/agent/runtime/steps/context/answer_synthesis.py
Function build_answer_synthesis_input(), which forms fast_context and deep_context (including the C4 filter for EXPLAIN/ARCHITECTURE).
Controls: the volume and priority of the context passed into synthesis_input.
Why it is convenient: limits, chunk order, and layer filters can be changed locally.
src/app/modules/agent/runtime/steps/finalization/repair.py
Class RuntimeAnswerRepairService, methods repair() and _repair_focus().
Controls: how validation.reasons are mapped to repair_focus and what goes into the repair prompt.
Why it is convenient: the repair focus can be narrowed to specific reasons, or prioritization added, without changing the loop in the executor.

This document describes only the current implementation, as found in the code after the refactoring.
