diff --git a/README.md b/README.md
index f78ba80..d00da1f 100644
--- a/README.md
+++ b/README.md
@@ -54,7 +54,7 @@
 
 - **Target Architecture** describes what the project is moving toward.
 - **MVP-now** describes what is actually being finalized right now through tests.
-- **MVP-now** does not include UI integration and does not require full runtime orchestration.
+- **MVP-now** does not include UI integration and uses a single stage-based runtime.
 - **MVP-now** focuses on:
   - IntentRouterV2;
   - code-first retrieval;
@@ -72,9 +72,10 @@
 
 **MVP-now**
 
-- isolated `CODE_QA` test pipeline;
-- IntentRouterV2 as canonical router;
-- router-driven layered retrieval;
+- a single `agent runtime`;
+- `IntentRouterV2` as the canonical router and retrieval planner;
+- stage-based execution inside `agent.runtime`;
+- `rag` as infrastructure only, for indexing/retrieval/storage;
 - evidence-first answer synthesis;
 - diagnostics-first tuning;
 - no UI dependency;
@@ -870,9 +871,9 @@ flowchart TD
 **Target Architecture**
 
 - Router
-- Graphs / pipelines for `CODE`, `DOCS`, `CROSS_DOMAIN`, `GENERAL`
+- unified runtime
 - CODE RAG + DOCS RAG
-- evidence gate
+- evidence gates
 - synthesis layer
 - diagnostics
 - generation of technical documentation from code / docs / system analysis
@@ -880,8 +881,8 @@ flowchart TD
 
 **MVP-now**
 
-- isolated test-first pipeline;
-- chain: `user query → IntentRouterV2 → retrieval plan → layered retrieval → evidence gate → LLM answer → diagnostics`;
+- single test-first runtime;
+- chain: `user query → IntentRouterV2 → retrieval planning → runtime retrieval → context normalization → evidence gate 1 → answer policy → LLM answer → evidence gate 2 → finalization/diagnostics`;
 - primary domain: `CODE`;
 - primary scenarios:
   - `OPEN_FILE`
@@ -889,35 +890,42 @@ flowchart TD
   - `FIND_TESTS`
   - `FIND_ENTRYPOINTS`
   - `GENERAL_QA`
+  - `TRACE_FLOW`
+  - `ARCHITECTURE`
 - UI integration is not required for the current stage;
 - docs retrieval is optional for the current milestone;
-- legacy `RouterService` is not considered part of the target architecture and will eventually be replaced.
+- legacy orchestration has been removed from the current execution path.
 
 ```mermaid
 flowchart TD
     U[User Query] --> R[IntentRouterV2]
-    R --> P[Retrieval Plan]
-    P --> G[Layered Retrieval]
-    G --> E[Evidence Gate]
-    E --> A[LLM Answer]
-    E --> D[Diagnostics]
-    A --> D
+    R --> P[Retrieval Planning]
+    P --> X[Runtime Retrieval]
+    X --> C[Context Normalization]
+    C --> E1[Evidence Gate 1]
+    E1 --> AP[Answer Policy]
+    AP --> A[LLM Answer]
+    AP --> D[Diagnostics]
+    A --> E2[Evidence Gate 2]
+    E2 --> F[Finalization]
+    F --> D
 ```
 
-The current milestone is test-first and code-first; this pipeline is tuned in isolation before being integrated into the full agent runtime.
+The current milestone is test-first and code-first; this runtime is already the canonical execution path for the MVP.
 
-### 4.1.3. Canonical test-first pipeline (CODE_QA)
+### 4.1.3. Canonical MVP runtime (CODE-first)
 
-The single entry point of the isolated pipeline is the `app.modules.rag.code_qa_pipeline` package:
+The single entry point for execution is the `app.modules.agent.runtime` package:
 
-- **Router:** `IntentRouterV2` only; legacy `RouterService` is not used.
-- **Contracts:** `RouterResult` (IntentRouterResult), `RetrievalRequest`, `RetrievalResult`, `EvidenceBundle`, `AnswerSynthesisInput`, `DiagnosticsReport`.
-- **Chain:** query → IntentRouterV2 → RetrievalRequest → layered retrieval (via an adapter) → normalized RetrievalResult → EvidenceBundle → evidence gate → AnswerSynthesisInput → diagnostics.
-- **Evidence gate:** a shared per-scenario check of evidence sufficiency (OPEN_FILE, EXPLAIN, FIND_TESTS, FIND_ENTRYPOINTS, GENERAL_QA); when evidence is insufficient, the result is degraded/insufficient and no confident answer is given.
-- **Diagnostics:** Level 1 (summary) and Level 2 (detail), with machine-readable reason codes (`failure_reasons`: `target_not_resolved`, `path_scope_empty`, `layer_c0_empty`, `insufficient_evidence`, `tests_not_found`, `entrypoints_not_found`, and others).
-- **Running the pipeline in tests:** `CodeQAPipelineRunner(router=..., retrieval_adapter=...)`; the `run(user_query, rag_session_id)` method returns a `CodeQAPipelineResult` with full diagnostics.
+- **Router:** `app.modules.agent.intent_router_v2`; it is responsible for both routing and retrieval planning.
+- **LLM layer:** `app.modules.agent.llm`; this is where `AgentLlmService`, `PromptLoader`, and the system prompt assets live.
+- **Runtime:** `app.modules.agent.runtime`; inside it, the stages are split into the subpackages `retrieval`, `context`, `gates`, `answer_policy`, `generation`, `finalization`.
+- **Chain:** query → `IntentRouterV2` → retrieval planning → runtime retrieval adapter → normalized context/evidence → evidence gate 1 → answer policy → LLM generation → evidence gate 2 → finalization → diagnostics.
+- **Evidence gates:** pre/post checks of evidence sufficiency and answer quality, per scenario.
+- **Diagnostics:** the runtime returns machine-readable diagnostics and a per-stage trace.
+- **RAG:** `app.modules.rag` no longer contains agent use-case layers; it remains infrastructure for indexing/retrieval/storage.
 
-Tests: `tests/pipeline_setup/pipeline_intent_rag/test_canonical_code_qa_pipeline.py` (router → retrieval request, normalized result, evidence gate, diagnostics).
+Tests: `pipeline_setup_v3` and related suites verify the canonical runtime and its stage-based execution.
 
 ## 4.2. Router
 
@@ -926,7 +934,7 @@ Router determines:
 - intent;
 - sub-intent;
 - confidence;
-- the appropriate graph;
+- the appropriate execution path;
 - requirements for the retrieval plan.
 
 Target domains:
@@ -935,9 +943,22 @@ Router определяет:
 - `CROSS_DOMAIN`
 - `GENERAL`
 
-## 4.3. Graphs / pipelines
+## 4.3. Runtime Stages
 
-A graph is a specialized request-processing scenario.
+In the current MVP, the execution path is implemented not through a graph engine but through a single runtime with explicit stage components.
+
+Current stages:
+- `IntentRouterV2`
+- `retrieval planning`
+- `runtime retrieval`
+- `context normalization`
+- `evidence gate 1`
+- `answer policy`
+- `LLM generation`
+- `evidence gate 2`
+- `finalization + diagnostics`
+
+If scenarios start to diverge in structure in the future, and not only in the policy logic of individual steps, the next step will be to consider moving to graph-based orchestration.
 
 For the MVP, at least the following are worthwhile:
 - `CodeOpenGraph`
@@ -1186,4 +1207,3 @@ DOCS и CROSS_DOMAIN остаются частью target architecture; в те
 - rich fact indexes across all domains;
 - a complete reference graph of the documentation;
 - deep automation of system-analysis preparation.
-
diff --git a/src/app/modules/agent/README.md b/src/app/modules/agent/README.md
index 18c8317..92d3ed5 100644
--- a/src/app/modules/agent/README.md
+++ b/src/app/modules/agent/README.md
@@ -1,37 +1,37 @@
 # The agent module
 
 ## 1. Purpose
-The module runs the code-QA pipeline for pipeline_setup_v3 and integrates with the chat layer through an adapter to the `AgentRunner` contract. Orchestration is based on **IntentRouterV2** (RAG) and **CodeQaRuntimeExecutor** (routing → retrieval → evidence gate → answer generation).
+The module runs the code-QA pipeline for pipeline_setup_v3 and integrates with the chat layer through an adapter to the `AgentRunner` contract. Orchestration is based on **IntentRouterV2** (RAG) and **AgentRuntimeExecutor** (routing → retrieval → evidence gate → answer generation).
 
 ## 2. Module contents
-- **code_qa_runtime/**: the code-QA execution runtime (intent router, retrieval, evidence gate, prompt selection, and LLM answer generation).
+- **runtime/**: the only orchestration layer. The top level contains only the runtime files, while the execution steps live in `runtime/steps/*` (`retrieval`, `context`, `gates`, `answer_policy`, `generation`, `finalization`, `explain`). Public API: `AgentRuntimeExecutor`, `RuntimeRetrievalAdapter`, `RuntimeRepoContextFactory`, and the `Runtime*` models.
 - **llm/**: the LLM invocation service (GigaChat), which loads system prompts through `PromptLoader`.
-- **prompt_loader.py**: loads prompt texts from the `prompts/` directory.
-- **code_qa_runner_adapter.py**: adapts `CodeQaRuntimeExecutor` to the `AgentRunner` protocol for use from chat (async `run` → sync `execute` in an executor).
+- **llm/prompt_loader.py**: loads system prompts from `llm/prompts.yml`.
+- **runtime/code_qa_runner_adapter.py**: adapts `AgentRuntimeExecutor` to the `AgentRunner` protocol for use from chat (async `run` → sync `execute` in an executor).
 
 ## 3. Dependency diagram
 ```mermaid
 classDiagram
-    class CodeQaRuntimeExecutor
+    class AgentRuntimeExecutor
     class CodeQaRunnerAdapter
     class AgentLlmService
     class PromptLoader
     class IntentRouterV2
-    class CodeQaRetrievalAdapter
+    class RuntimeRetrievalAdapter
 
-    CodeQaRunnerAdapter --> CodeQaRuntimeExecutor
-    CodeQaRuntimeExecutor --> AgentLlmService
-    CodeQaRuntimeExecutor --> IntentRouterV2
-    CodeQaRuntimeExecutor --> CodeQaRetrievalAdapter
+    CodeQaRunnerAdapter --> AgentRuntimeExecutor
+    AgentRuntimeExecutor --> AgentLlmService
+    AgentRuntimeExecutor --> IntentRouterV2
+    AgentRuntimeExecutor --> RuntimeRetrievalAdapter
     AgentLlmService --> PromptLoader
 ```
 
 ## 4. Entry points
-- **pipeline_setup_v3 tests**: `AgentRuntimeAdapter` imports `CodeQaRuntimeExecutor`, `IntentRouterV2`, `CodeQaRepoContextFactory`, `CodeQaRetrievalAdapter`, `AgentLlmService`, `PromptLoader` directly from their respective packages.
-- **Application (chat)**: `ModularApplication` builds a `CodeQaRuntimeExecutor` and wraps it in `CodeQaRunnerAdapter`; chat passes the adapter to `ChatModule` as `agent_runner`.
+- **pipeline_setup_v3 tests**: `AgentRuntimeAdapter` imports `AgentRuntimeExecutor`, `IntentRouterV2`, `RuntimeRepoContextFactory`, `RuntimeRetrievalAdapter`, `AgentLlmService`, `PromptLoader` from `app.modules.agent.runtime` and their respective packages.
+- **Application (chat)**: `ModularApplication` builds an `AgentRuntimeExecutor` and wraps it in `CodeQaRunnerAdapter`; chat passes the adapter to `ChatModule` as `agent_runner`.
 
 ## 5. Prompts
-Only prompts loaded from `prompts/` are used:
+Only prompts loaded from `llm/prompts.yml` are used:
 - **code_qa_***: answers by sub_intent (architecture, explain, find_entrypoints, find_tests, general, open_file, trace_flow, degraded, repair).
 - **rag_intent_router_v2**: intent classification in IntentRouterV2.
 - **code_explain_answer_v2**: direct code-explain in chat (direct_service).
diff --git a/src/app/modules/agent/code_qa_runtime/__init__.py b/src/app/modules/agent/code_qa_runtime/__init__.py
deleted file mode 100644
index aafbb64..0000000
--- a/src/app/modules/agent/code_qa_runtime/__init__.py
+++ /dev/null
@@ -1,15 +0,0 @@
-from app.modules.agent.code_qa_runtime.executor import CodeQaRuntimeExecutor
-from app.modules.agent.code_qa_runtime.models import (
-    CodeQaDraftAnswer,
-    CodeQaExecutionState,
-    CodeQaFinalResult,
-    CodeQaValidationResult,
-)
-
-__all__ = [
-    "CodeQaDraftAnswer",
-    "CodeQaExecutionState",
-    "CodeQaFinalResult",
-    "CodeQaRuntimeExecutor",
-    "CodeQaValidationResult",
-]
diff --git a/src/app/modules/agent/code_qa_runtime/models.py b/src/app/modules/agent/code_qa_runtime/models.py
deleted file mode 100644
index 0957ad7..0000000
--- a/src/app/modules/agent/code_qa_runtime/models.py
+++ /dev/null
@@ -1,73 +0,0 @@
-from __future__ import annotations
-
-from typing import Any
-
-from pydantic import BaseModel, ConfigDict, Field
-
-from app.modules.rag.code_qa_pipeline.contracts import (
-    AnswerSynthesisInput as CodeQaAnswerSynthesisInput,
-)
-from app.modules.rag.code_qa_pipeline.contracts import (
-    DiagnosticsReport as CodeQaDiagnosticsReport,
-)
-from app.modules.rag.code_qa_pipeline.contracts import (
-    EvidenceBundle as CodeQaEvidencePack,
-)
-from app.modules.rag.code_qa_pipeline.contracts import (
-    RetrievalRequest as CodeQaRetrievalRequest,
-)
-from app.modules.rag.code_qa_pipeline.contracts import (
-    RetrievalResult as CodeQaRetrievalResult,
-)
-from app.modules.rag.intent_router_v2.models import ConversationState, IntentRouterResult, RepoContext
-
-
-class CodeQaDraftAnswer(BaseModel):
-    model_config = ConfigDict(extra="forbid")
-
-    prompt_name: str
-    prompt_payload: str
-    answer: str = ""
-
-
-class CodeQaValidationResult(BaseModel):
-    model_config = ConfigDict(extra="forbid")
-
-    passed: bool = False
-    action: str = "return"
-    reasons: list[str] = Field(default_factory=list)
-
-
-class CodeQaFinalResult(BaseModel):
-    model_config = ConfigDict(extra="forbid")
-
-    final_answer: str
-    answer_mode: str = "normal"
-    repair_used: bool = False
-    llm_used: bool = False
-    draft_answer: CodeQaDraftAnswer | None = None
-    validation: CodeQaValidationResult = Field(default_factory=CodeQaValidationResult)
-    router_result: IntentRouterResult | None = None
-    retrieval_request: CodeQaRetrievalRequest | None = None
-    retrieval_result: CodeQaRetrievalResult | None = None
-    evidence_pack: CodeQaEvidencePack | None = None
-    diagnostics: CodeQaDiagnosticsReport
-    runtime_trace: list[dict[str, Any]] = Field(default_factory=list)
-
-
-class CodeQaExecutionState(BaseModel):
-    model_config = ConfigDict(extra="forbid")
-
-    user_query: str
-    rag_session_id: str
-    conversation_state: ConversationState = Field(default_factory=ConversationState)
-    repo_context: RepoContext = Field(default_factory=RepoContext)
-    router_result: IntentRouterResult | None = None
-    retrieval_request: CodeQaRetrievalRequest | None = None
-    retrieval_result: CodeQaRetrievalResult | None = None
-    evidence_pack: CodeQaEvidencePack | None = None
-    synthesis_input: CodeQaAnswerSynthesisInput | None = None
-    diagnostics: CodeQaDiagnosticsReport | None = None
-    answer_mode: str = "normal"
-    degraded_message: str = ""
-    final_result: CodeQaFinalResult | None = None
diff --git a/src/app/modules/agent/engine/__init__.py b/src/app/modules/agent/engine/__init__.py
deleted file mode 100644
index e69de29..0000000
diff --git a/src/app/modules/agent/intent_router_v2/__init__.py b/src/app/modules/agent/intent_router_v2/__init__.py
new file mode 100644
index 0000000..36d8d2e
--- /dev/null
+++ b/src/app/modules/agent/intent_router_v2/__init__.py
@@ -0,0 +1,19 @@
+from app.modules.agent.intent_router_v2.models import (
+    ConversationState,
+    IntentDecision,
+    IntentRouterResult,
+    QueryAnchor,
+    QueryPlan,
+    RepoContext,
+)
+from app.modules.agent.intent_router_v2.router import IntentRouterV2
+
+__all__ = [
+    "ConversationState",
+    "IntentDecision",
+    "IntentRouterResult",
+    "IntentRouterV2",
+    "QueryAnchor",
+    "QueryPlan",
+    "RepoContext",
+]
diff --git a/src/app/modules/agent/intent_router_v2/analysis/__init__.py b/src/app/modules/agent/intent_router_v2/analysis/__init__.py
new file mode 100644
index 0000000..7dd31b8
--- /dev/null
+++ b/src/app/modules/agent/intent_router_v2/analysis/__init__.py
@@ -0,0 +1,4 @@
+from app.modules.agent.intent_router_v2.analysis.normalization import QueryNormalizer
+from app.modules.agent.intent_router_v2.analysis.query_plan_builder import QueryPlanBuilder
+
+__all__ = ["QueryNormalizer", "QueryPlanBuilder"]
diff --git a/src/app/modules/rag/intent_router_v2/analysis/anchor_extractor.py b/src/app/modules/agent/intent_router_v2/analysis/anchor_extractor.py
similarity index 93%
rename from src/app/modules/rag/intent_router_v2/analysis/anchor_extractor.py
rename to src/app/modules/agent/intent_router_v2/analysis/anchor_extractor.py
index 5b354c3..3f1555c 100644
--- a/src/app/modules/rag/intent_router_v2/analysis/anchor_extractor.py
+++ b/src/app/modules/agent/intent_router_v2/analysis/anchor_extractor.py
@@ -2,10 +2,10 @@ from __future__ import annotations
 
 import re
 
-from app.modules.rag.intent_router_v2.models import AnchorSpan, QueryAnchor
-from app.modules.rag.intent_router_v2.analysis.normalization_terms import KeyTermCanonicalizer
-from app.modules.rag.intent_router_v2.analysis.symbol_rules import COMMON_PATH_SEGMENTS, PY_KEYWORDS
-from app.modules.rag.intent_router_v2.analysis.term_mapping import RuEnTermMapper
+from app.modules.agent.intent_router_v2.models import AnchorSpan, QueryAnchor
+from app.modules.agent.intent_router_v2.analysis.normalization_terms import KeyTermCanonicalizer
+from app.modules.agent.intent_router_v2.analysis.symbol_rules import COMMON_PATH_SEGMENTS, PY_KEYWORDS
+from app.modules.agent.intent_router_v2.analysis.term_mapping import RuEnTermMapper
 
 _FILE_PATTERN = re.compile(r"(?P<value>\b(?:[\w.-]+/)*[\w.-]+\.(?:py|md|rst|txt|yaml|yml|json|toml|ini|cfg)\b)")
 _PATH_HINT_PATTERN = re.compile(r"(?P<value>\b(?:src|app|docs|tests)/[\w./-]*[\w-]\b)")
diff --git a/src/app/modules/rag/intent_router_v2/analysis/anchor_span_validator.py b/src/app/modules/agent/intent_router_v2/analysis/anchor_span_validator.py
similarity index 92%
rename from src/app/modules/rag/intent_router_v2/analysis/anchor_span_validator.py
rename to src/app/modules/agent/intent_router_v2/analysis/anchor_span_validator.py
index 3065972..1600a06 100644
--- a/src/app/modules/rag/intent_router_v2/analysis/anchor_span_validator.py
+++ b/src/app/modules/agent/intent_router_v2/analysis/anchor_span_validator.py
@@ -1,6 +1,6 @@
 from __future__ import annotations
 
-from app.modules.rag.intent_router_v2.models import QueryAnchor
+from app.modules.agent.intent_router_v2.models import QueryAnchor
 
 
 class AnchorSpanValidator:
diff --git a/src/app/modules/rag/intent_router_v2/analysis/conversation_anchor_builder.py b/src/app/modules/agent/intent_router_v2/analysis/conversation_anchor_builder.py
similarity index 91%
rename from src/app/modules/rag/intent_router_v2/analysis/conversation_anchor_builder.py
rename to src/app/modules/agent/intent_router_v2/analysis/conversation_anchor_builder.py
index e8404a2..d258266 100644
--- a/src/app/modules/rag/intent_router_v2/analysis/conversation_anchor_builder.py
+++ b/src/app/modules/agent/intent_router_v2/analysis/conversation_anchor_builder.py
@@ -1,7 +1,7 @@
 from __future__ import annotations
 
-from app.modules.rag.intent_router_v2.analysis.followup_detector import FollowUpDetector
-from app.modules.rag.intent_router_v2.models import ConversationState, QueryAnchor
+from app.modules.agent.intent_router_v2.analysis.followup_detector import FollowUpDetector
+from app.modules.agent.intent_router_v2.models import ConversationState, QueryAnchor
 
 
 class ConversationAnchorBuilder:
diff --git a/src/app/modules/rag/intent_router_v2/analysis/followup_detector.py b/src/app/modules/agent/intent_router_v2/analysis/followup_detector.py
similarity index 100%
rename from src/app/modules/rag/intent_router_v2/analysis/followup_detector.py
rename to src/app/modules/agent/intent_router_v2/analysis/followup_detector.py
diff --git a/src/app/modules/rag/intent_router_v2/analysis/keyword_hint_builder.py b/src/app/modules/agent/intent_router_v2/analysis/keyword_hint_builder.py
similarity index 84%
rename from src/app/modules/rag/intent_router_v2/analysis/keyword_hint_builder.py
rename to src/app/modules/agent/intent_router_v2/analysis/keyword_hint_builder.py
index b7cb353..4e6a954 100644
--- a/src/app/modules/rag/intent_router_v2/analysis/keyword_hint_builder.py
+++ b/src/app/modules/agent/intent_router_v2/analysis/keyword_hint_builder.py
@@ -2,8 +2,8 @@ from __future__ import annotations
 
 import re
 
-from app.modules.rag.intent_router_v2.analysis.normalization import FILE_PATH_RE
-from app.modules.rag.intent_router_v2.analysis.symbol_rules import COMMON_PATH_SEGMENTS, PY_KEYWORDS
+from app.modules.agent.intent_router_v2.analysis.normalization import FILE_PATH_RE
+from app.modules.agent.intent_router_v2.analysis.symbol_rules import COMMON_PATH_SEGMENTS, PY_KEYWORDS
 
 _IDENTIFIER_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]{2,}")
 
diff --git a/src/app/modules/rag/intent_router_v2/analysis/keyword_hint_sanitizer.py b/src/app/modules/agent/intent_router_v2/analysis/keyword_hint_sanitizer.py
similarity index 96%
rename from src/app/modules/rag/intent_router_v2/analysis/keyword_hint_sanitizer.py
rename to src/app/modules/agent/intent_router_v2/analysis/keyword_hint_sanitizer.py
index 7d44f26..f56a517 100644
--- a/src/app/modules/rag/intent_router_v2/analysis/keyword_hint_sanitizer.py
+++ b/src/app/modules/agent/intent_router_v2/analysis/keyword_hint_sanitizer.py
@@ -1,6 +1,6 @@
 from __future__ import annotations
 
-from app.modules.rag.intent_router_v2.models import QueryAnchor
+from app.modules.agent.intent_router_v2.models import QueryAnchor
 
 
 class KeywordHintSanitizer:
diff --git a/src/app/modules/rag/intent_router_v2/analysis/negation_detector.py b/src/app/modules/agent/intent_router_v2/analysis/negation_detector.py
similarity index 100%
rename from src/app/modules/rag/intent_router_v2/analysis/negation_detector.py
rename to src/app/modules/agent/intent_router_v2/analysis/negation_detector.py
diff --git a/src/app/modules/rag/intent_router_v2/analysis/normalization.py b/src/app/modules/agent/intent_router_v2/analysis/normalization.py
similarity index 94%
rename from src/app/modules/rag/intent_router_v2/analysis/normalization.py
rename to src/app/modules/agent/intent_router_v2/analysis/normalization.py
index cbbca79..133b759 100644
--- a/src/app/modules/rag/intent_router_v2/analysis/normalization.py
+++ b/src/app/modules/agent/intent_router_v2/analysis/normalization.py
@@ -13,7 +13,7 @@ SNAKE_RE = re.compile(r"(?<!\w)[a-z][a-z0-9]*(?:_[a-z0-9]+)+(?!\w)")
 SPACE_BEFORE_PUNCT_RE = re.compile(r"\s+([,.:;?!])")
 SPACE_AFTER_PUNCT_RE = re.compile(r"([,.:;?!])(?=(?:[\"'(\[A-Za-zА-ЯЁа-яё]))")
 WS_RE = re.compile(r"\s+")
-QUOTE_TRANSLATION = str.maketrans({"«": '"', "»": '"', "“": '"', "”": '"', "‘": "'", "’": "'"})
+QUOTE_TRANSLATION = str.maketrans({"«": '"', "»": '"', "\u201c": '"', "\u201d": '"', "\u2018": "'", "\u2019": "'"})
 
 
 class QueryNormalizer:
diff --git a/src/app/modules/rag/intent_router_v2/analysis/normalization_terms.py b/src/app/modules/agent/intent_router_v2/analysis/normalization_terms.py
similarity index 100%
rename from src/app/modules/rag/intent_router_v2/analysis/normalization_terms.py
rename to src/app/modules/agent/intent_router_v2/analysis/normalization_terms.py
diff --git a/src/app/modules/agent/intent_router_v2/analysis/query_normalizer.py b/src/app/modules/agent/intent_router_v2/analysis/query_normalizer.py
new file mode 100644
index 0000000..04e3c6c
--- /dev/null
+++ b/src/app/modules/agent/intent_router_v2/analysis/query_normalizer.py
@@ -0,0 +1,3 @@
+from app.modules.agent.intent_router_v2.analysis.normalization import QueryNormalizer
+
+__all__ = ["QueryNormalizer"]
diff --git a/src/app/modules/rag/intent_router_v2/analysis/query_plan_builder.py b/src/app/modules/agent/intent_router_v2/analysis/query_plan_builder.py
similarity index 92%
rename from src/app/modules/rag/intent_router_v2/analysis/query_plan_builder.py
rename to src/app/modules/agent/intent_router_v2/analysis/query_plan_builder.py
index e56bbd0..ce437b6 100644
--- a/src/app/modules/rag/intent_router_v2/analysis/query_plan_builder.py
+++ b/src/app/modules/agent/intent_router_v2/analysis/query_plan_builder.py
@@ -1,16 +1,16 @@
 from __future__ import annotations
 
-from app.modules.rag.intent_router_v2.analysis.anchor_extractor import AnchorExtractor
-from app.modules.rag.intent_router_v2.analysis.anchor_span_validator import AnchorSpanValidator
-from app.modules.rag.intent_router_v2.analysis.conversation_anchor_builder import ConversationAnchorBuilder
-from app.modules.rag.intent_router_v2.analysis.keyword_hint_builder import KeywordHintBuilder
-from app.modules.rag.intent_router_v2.analysis.keyword_hint_sanitizer import KeywordHintSanitizer
-from app.modules.rag.intent_router_v2.models import ConversationState, QueryAnchor, QueryPlan
-from app.modules.rag.intent_router_v2.analysis.negation_detector import NegationDetector
-from app.modules.rag.intent_router_v2.analysis.normalization import QueryNormalizer
-from app.modules.rag.intent_router_v2.analysis.sub_intent_detector import SubIntentDetector
-from app.modules.rag.intent_router_v2.analysis.test_signals import has_test_focus, is_negative_test_request, is_test_related_token
-from app.modules.rag.intent_router_v2.analysis.term_mapping import RuEnTermMapper
+from app.modules.agent.intent_router_v2.analysis.anchor_extractor import AnchorExtractor
+from app.modules.agent.intent_router_v2.analysis.anchor_span_validator import AnchorSpanValidator
+from app.modules.agent.intent_router_v2.analysis.conversation_anchor_builder import ConversationAnchorBuilder
+from app.modules.agent.intent_router_v2.analysis.keyword_hint_builder import KeywordHintBuilder
+from app.modules.agent.intent_router_v2.analysis.keyword_hint_sanitizer import KeywordHintSanitizer
+from app.modules.agent.intent_router_v2.models import ConversationState, QueryAnchor, QueryPlan
+from app.modules.agent.intent_router_v2.analysis.negation_detector import NegationDetector
+from app.modules.agent.intent_router_v2.analysis.normalization import QueryNormalizer
+from app.modules.agent.intent_router_v2.analysis.sub_intent_detector import SubIntentDetector
+from app.modules.agent.intent_router_v2.analysis.test_signals import has_test_focus, is_negative_test_request, is_test_related_token
+from app.modules.agent.intent_router_v2.analysis.term_mapping import RuEnTermMapper
 
 
 class QueryPlanBuilder:
diff --git a/src/app/modules/rag/intent_router_v2/analysis/sub_intent_detector.py b/src/app/modules/agent/intent_router_v2/analysis/sub_intent_detector.py
similarity index 100%
rename from src/app/modules/rag/intent_router_v2/analysis/sub_intent_detector.py
rename to src/app/modules/agent/intent_router_v2/analysis/sub_intent_detector.py
diff --git a/src/app/modules/rag/intent_router_v2/analysis/symbol_rules.py b/src/app/modules/agent/intent_router_v2/analysis/symbol_rules.py
similarity index 100%
rename from src/app/modules/rag/intent_router_v2/analysis/symbol_rules.py
rename to src/app/modules/agent/intent_router_v2/analysis/symbol_rules.py
diff --git a/src/app/modules/rag/intent_router_v2/analysis/term_mapping.py b/src/app/modules/agent/intent_router_v2/analysis/term_mapping.py
similarity index 96%
rename from src/app/modules/rag/intent_router_v2/analysis/term_mapping.py
rename to src/app/modules/agent/intent_router_v2/analysis/term_mapping.py
index 796b847..6e18434 100644
--- a/src/app/modules/rag/intent_router_v2/analysis/term_mapping.py
+++ b/src/app/modules/agent/intent_router_v2/analysis/term_mapping.py
@@ -2,7 +2,7 @@ from __future__ import annotations
 
 import re
 
-from app.modules.rag.intent_router_v2.analysis.normalization_terms import KeyTermCanonicalizer
+from app.modules.agent.intent_router_v2.analysis.normalization_terms import KeyTermCanonicalizer
 
 _WORD_RE = re.compile(r"[A-Za-zА-Яа-яЁё-]+")
 
diff --git a/src/app/modules/rag/intent_router_v2/analysis/test_signals.py b/src/app/modules/agent/intent_router_v2/analysis/test_signals.py
similarity index 100%
rename from src/app/modules/rag/intent_router_v2/analysis/test_signals.py
rename to src/app/modules/agent/intent_router_v2/analysis/test_signals.py
diff --git a/src/app/modules/rag/intent_router_v2/factory.py b/src/app/modules/agent/intent_router_v2/factory.py
similarity index 79%
rename from src/app/modules/rag/intent_router_v2/factory.py
rename to src/app/modules/agent/intent_router_v2/factory.py
index 57ebd21..c256a4e 100644
--- a/src/app/modules/rag/intent_router_v2/factory.py
+++ b/src/app/modules/agent/intent_router_v2/factory.py
@@ -1,9 +1,9 @@
 from __future__ import annotations
 
 from app.modules.agent.llm import AgentLlmService
-from app.modules.agent.prompt_loader import PromptLoader
-from app.modules.rag.intent_router_v2.intent.classifier import IntentClassifierV2
-from app.modules.rag.intent_router_v2.router import IntentRouterV2
+from app.modules.agent.llm.prompt_loader import PromptLoader
+from app.modules.agent.intent_router_v2.intent.classifier import IntentClassifierV2
+from app.modules.agent.intent_router_v2.router import IntentRouterV2
 from app.modules.shared.env_loader import load_workspace_env
 from app.modules.shared.gigachat.client import GigaChatClient
 from app.modules.shared.gigachat.settings import GigaChatSettings
diff --git a/src/app/modules/agent/intent_router_v2/intent/__init__.py b/src/app/modules/agent/intent_router_v2/intent/__init__.py
new file mode 100644
index 0000000..a3fb94c
--- /dev/null
+++ b/src/app/modules/agent/intent_router_v2/intent/__init__.py
@@ -0,0 +1,5 @@
+from app.modules.agent.intent_router_v2.intent.classifier import IntentClassifierV2
+from app.modules.agent.intent_router_v2.intent.conversation_policy import ConversationPolicy
+from app.modules.agent.intent_router_v2.intent.graph_id_resolver import GraphIdResolver
+
+__all__ = ["IntentClassifierV2", "ConversationPolicy", "GraphIdResolver"]
diff --git a/src/app/modules/rag/intent_router_v2/intent/classifier.py b/src/app/modules/agent/intent_router_v2/intent/classifier.py
similarity index 95%
rename from src/app/modules/rag/intent_router_v2/intent/classifier.py
rename to src/app/modules/agent/intent_router_v2/intent/classifier.py
index e1ec761..fc86d49 100644
--- a/src/app/modules/rag/intent_router_v2/intent/classifier.py
+++ b/src/app/modules/agent/intent_router_v2/intent/classifier.py
@@ -3,9 +3,9 @@ from __future__ import annotations
 import json
 import re
 
-from app.modules.rag.intent_router_v2.models import ConversationState, IntentDecision
-from app.modules.rag.intent_router_v2.protocols import TextGenerator
-from app.modules.rag.intent_router_v2.analysis.test_signals import has_test_focus
+from app.modules.agent.intent_router_v2.models import ConversationState, IntentDecision
+from app.modules.agent.intent_router_v2.protocols import TextGenerator
+from app.modules.agent.intent_router_v2.analysis.test_signals import has_test_focus
 
 _CODE_FILE_PATH_RE = re.compile(
     r"\b(?:[\w.-]+/)*[\w.-]+\.(?:py|js|jsx|ts|tsx|java|kt|go|rb|php|c|cc|cpp|h|hpp|cs|swift|rs)(?!\w)\b",
diff --git a/src/app/modules/rag/intent_router_v2/intent/conversation_policy.py b/src/app/modules/agent/intent_router_v2/intent/conversation_policy.py
similarity index 96%
rename from src/app/modules/rag/intent_router_v2/intent/conversation_policy.py
rename to src/app/modules/agent/intent_router_v2/intent/conversation_policy.py
index fecd3a4..692b2b1 100644
--- a/src/app/modules/rag/intent_router_v2/intent/conversation_policy.py
+++ b/src/app/modules/agent/intent_router_v2/intent/conversation_policy.py
@@ -1,6 +1,6 @@
 from __future__ import annotations
 
-from app.modules.rag.intent_router_v2.models import ConversationState, IntentDecision
+from app.modules.agent.intent_router_v2.models import ConversationState, IntentDecision
 
 
 class ConversationPolicy:
diff --git a/src/app/modules/rag/intent_router_v2/intent/graph_id_resolver.py b/src/app/modules/agent/intent_router_v2/intent/graph_id_resolver.py
similarity index 100%
rename from src/app/modules/rag/intent_router_v2/intent/graph_id_resolver.py
rename to src/app/modules/agent/intent_router_v2/intent/graph_id_resolver.py
diff --git a/src/app/modules/rag/intent_router_v2/local_runner.py b/src/app/modules/agent/intent_router_v2/local_runner.py
similarity index 83%
rename from src/app/modules/rag/intent_router_v2/local_runner.py
rename to src/app/modules/agent/intent_router_v2/local_runner.py
index 96cc484..dcc0c3f 100644
--- a/src/app/modules/rag/intent_router_v2/local_runner.py
+++ b/src/app/modules/agent/intent_router_v2/local_runner.py
@@ -2,8 +2,8 @@ from __future__ import annotations
 
 import logging
 
-from app.modules.rag.intent_router_v2.models import ConversationState, IntentRouterResult, RepoContext
-from app.modules.rag.intent_router_v2.router import IntentRouterV2
+from app.modules.agent.intent_router_v2.models import ConversationState, IntentRouterResult, RepoContext
+from app.modules.agent.intent_router_v2.router import IntentRouterV2
 
 LOGGER = logging.getLogger(__name__)
 
diff --git a/src/app/modules/rag/intent_router_v2/logger.py b/src/app/modules/agent/intent_router_v2/logger.py
similarity index 88%
rename from src/app/modules/rag/intent_router_v2/logger.py
rename to src/app/modules/agent/intent_router_v2/logger.py
index 911b502..e3fbb72 100644
--- a/src/app/modules/rag/intent_router_v2/logger.py
+++ b/src/app/modules/agent/intent_router_v2/logger.py
@@ -3,7 +3,7 @@ from __future__ import annotations
 import json
 import logging
 
-from app.modules.rag.intent_router_v2.models import ConversationState, IntentRouterResult, RepoContext
+from app.modules.agent.intent_router_v2.models import ConversationState, IntentRouterResult, RepoContext
 
 LOGGER = logging.getLogger(__name__)
 
diff --git a/src/app/modules/rag/intent_router_v2/models.py b/src/app/modules/agent/intent_router_v2/models.py
similarity index 100%
rename from src/app/modules/rag/intent_router_v2/models.py
rename to src/app/modules/agent/intent_router_v2/models.py
diff --git a/src/app/modules/rag/intent_router_v2/protocols.py b/src/app/modules/agent/intent_router_v2/protocols.py
similarity index 100%
rename from src/app/modules/rag/intent_router_v2/protocols.py
rename to src/app/modules/agent/intent_router_v2/protocols.py
diff --git a/src/app/modules/rag/intent_router_v2/readme.md b/src/app/modules/agent/intent_router_v2/readme.md
similarity index 100%
rename from src/app/modules/rag/intent_router_v2/readme.md
rename to src/app/modules/agent/intent_router_v2/readme.md
diff --git a/src/app/modules/agent/intent_router_v2/retrieval_planning/__init__.py b/src/app/modules/agent/intent_router_v2/retrieval_planning/__init__.py
new file mode 100644
index 0000000..a3764ae
--- /dev/null
+++ b/src/app/modules/agent/intent_router_v2/retrieval_planning/__init__.py
@@ -0,0 +1,4 @@
+from app.modules.agent.intent_router_v2.retrieval_planning.retrieval_spec_factory import RetrievalSpecFactory
+from app.modules.agent.intent_router_v2.retrieval_planning.retrieval_constraints_factory import RetrievalConstraintsFactory
+
+__all__ = ["RetrievalSpecFactory", "RetrievalConstraintsFactory"]
diff --git a/src/app/modules/rag/intent_router_v2/retrieval/evidence_policy_factory.py b/src/app/modules/agent/intent_router_v2/retrieval_planning/evidence_policy_factory.py
similarity index 96%
rename from src/app/modules/rag/intent_router_v2/retrieval/evidence_policy_factory.py
rename to src/app/modules/agent/intent_router_v2/retrieval_planning/evidence_policy_factory.py
index 3435e7f..0ff33c5 100644
--- a/src/app/modules/rag/intent_router_v2/retrieval/evidence_policy_factory.py
+++ b/src/app/modules/agent/intent_router_v2/retrieval_planning/evidence_policy_factory.py
@@ -1,6 +1,6 @@
 from __future__ import annotations
 
-from app.modules.rag.intent_router_v2.models import EvidencePolicy
+from app.modules.agent.intent_router_v2.models import EvidencePolicy
 
 
 class EvidencePolicyFactory:
diff --git a/src/app/modules/rag/intent_router_v2/retrieval/layer_query_builder.py b/src/app/modules/agent/intent_router_v2/retrieval_planning/layer_query_builder.py
similarity index 93%
rename from src/app/modules/rag/intent_router_v2/retrieval/layer_query_builder.py
rename to src/app/modules/agent/intent_router_v2/retrieval_planning/layer_query_builder.py
index b1f1e77..911d493 100644
--- a/src/app/modules/rag/intent_router_v2/retrieval/layer_query_builder.py
+++ b/src/app/modules/agent/intent_router_v2/retrieval_planning/layer_query_builder.py
@@ -1,6 +1,6 @@
 from __future__ import annotations
 
-from app.modules.rag.intent_router_v2.models import LayerQuery, RepoContext
+from app.modules.agent.intent_router_v2.models import LayerQuery, RepoContext
 
 
 class LayerQueryBuilder:
diff --git a/src/app/modules/rag/intent_router_v2/retrieval/retrieval_constraints_factory.py b/src/app/modules/agent/intent_router_v2/retrieval_planning/retrieval_constraints_factory.py
similarity index 95%
rename from src/app/modules/rag/intent_router_v2/retrieval/retrieval_constraints_factory.py
rename to src/app/modules/agent/intent_router_v2/retrieval_planning/retrieval_constraints_factory.py
index 3b06e12..5394376 100644
--- a/src/app/modules/rag/intent_router_v2/retrieval/retrieval_constraints_factory.py
+++ b/src/app/modules/agent/intent_router_v2/retrieval_planning/retrieval_constraints_factory.py
@@ -1,7 +1,7 @@
 from __future__ import annotations
 
-from app.modules.rag.intent_router_v2.models import QueryAnchor, RetrievalConstraints, RetrievalProfile
-from app.modules.rag.intent_router_v2.analysis.test_signals import has_test_focus, is_negative_test_request
+from app.modules.agent.intent_router_v2.models import QueryAnchor, RetrievalConstraints, RetrievalProfile
+from app.modules.agent.intent_router_v2.analysis.test_signals import has_test_focus, is_negative_test_request
 
 
 class RetrievalConstraintsFactory:
diff --git a/src/app/modules/rag/intent_router_v2/retrieval/retrieval_filter_builder.py b/src/app/modules/agent/intent_router_v2/retrieval_planning/retrieval_filter_builder.py
similarity index 95%
rename from src/app/modules/rag/intent_router_v2/retrieval/retrieval_filter_builder.py
rename to src/app/modules/agent/intent_router_v2/retrieval_planning/retrieval_filter_builder.py
index b31de17..af1e0a5 100644
--- a/src/app/modules/rag/intent_router_v2/retrieval/retrieval_filter_builder.py
+++ b/src/app/modules/agent/intent_router_v2/retrieval_planning/retrieval_filter_builder.py
@@ -1,6 +1,6 @@
 from __future__ import annotations
 
-from app.modules.rag.intent_router_v2.models import (
+from app.modules.agent.intent_router_v2.models import (
     CodeRetrievalFilters,
     ConversationState,
     DocsRetrievalFilters,
@@ -8,7 +8,7 @@ from app.modules.rag.intent_router_v2.models import (
     QueryAnchor,
     RepoContext,
 )
-from app.modules.rag.intent_router_v2.analysis.test_signals import has_test_focus, is_negative_test_request, is_test_related_token
+from app.modules.agent.intent_router_v2.analysis.test_signals import has_test_focus, is_negative_test_request, is_test_related_token
 
 
 class RetrievalFilterBuilder:
diff --git a/src/app/modules/rag/intent_router_v2/retrieval/retrieval_spec_factory.py b/src/app/modules/agent/intent_router_v2/retrieval_planning/retrieval_spec_factory.py
similarity index 94%
rename from src/app/modules/rag/intent_router_v2/retrieval/retrieval_spec_factory.py
rename to src/app/modules/agent/intent_router_v2/retrieval_planning/retrieval_spec_factory.py
index 90f03f1..bb046b5 100644
--- a/src/app/modules/rag/intent_router_v2/retrieval/retrieval_spec_factory.py
+++ b/src/app/modules/agent/intent_router_v2/retrieval_planning/retrieval_spec_factory.py
@@ -1,9 +1,9 @@
 from __future__ import annotations
 
 from app.modules.rag.contracts.enums import RagLayer
-from app.modules.rag.intent_router_v2.retrieval.layer_query_builder import LayerQueryBuilder
-from app.modules.rag.intent_router_v2.models import ConversationState, QueryAnchor, RepoContext, RetrievalSpec
-from app.modules.rag.intent_router_v2.retrieval.retrieval_filter_builder import RetrievalFilterBuilder
+from app.modules.agent.intent_router_v2.retrieval_planning.layer_query_builder import LayerQueryBuilder
+from app.modules.agent.intent_router_v2.models import ConversationState, QueryAnchor, RepoContext, RetrievalSpec
+from app.modules.agent.intent_router_v2.retrieval_planning.retrieval_filter_builder import RetrievalFilterBuilder
 
 
 class RetrievalSpecFactory:
diff --git a/src/app/modules/rag/intent_router_v2/router.py b/src/app/modules/agent/intent_router_v2/router.py
similarity index 82%
rename from src/app/modules/rag/intent_router_v2/router.py
rename to src/app/modules/agent/intent_router_v2/router.py
index feae669..981c31b 100644
--- a/src/app/modules/rag/intent_router_v2/router.py
+++ b/src/app/modules/agent/intent_router_v2/router.py
@@ -1,14 +1,14 @@
 from __future__ import annotations
 
-from app.modules.rag.intent_router_v2.intent.classifier import IntentClassifierV2
-from app.modules.rag.intent_router_v2.intent.conversation_policy import ConversationPolicy
-from app.modules.rag.intent_router_v2.retrieval.evidence_policy_factory import EvidencePolicyFactory
-from app.modules.rag.intent_router_v2.intent.graph_id_resolver import GraphIdResolver
-from app.modules.rag.intent_router_v2.logger import IntentRouterLogger
-from app.modules.rag.intent_router_v2.models import ConversationState, IntentRouterResult, RepoContext, SymbolResolution
-from app.modules.rag.intent_router_v2.analysis.query_plan_builder import QueryPlanBuilder
-from app.modules.rag.intent_router_v2.retrieval.retrieval_constraints_factory import RetrievalConstraintsFactory
-from app.modules.rag.intent_router_v2.retrieval.retrieval_spec_factory import RetrievalSpecFactory
+from app.modules.agent.intent_router_v2.intent.classifier import IntentClassifierV2
+from app.modules.agent.intent_router_v2.intent.conversation_policy import ConversationPolicy
+from app.modules.agent.intent_router_v2.retrieval_planning.evidence_policy_factory import EvidencePolicyFactory
+from app.modules.agent.intent_router_v2.intent.graph_id_resolver import GraphIdResolver
+from app.modules.agent.intent_router_v2.logger import IntentRouterLogger
+from app.modules.agent.intent_router_v2.models import ConversationState, IntentRouterResult, RepoContext, SymbolResolution
+from app.modules.agent.intent_router_v2.analysis.query_plan_builder import QueryPlanBuilder
+from app.modules.agent.intent_router_v2.retrieval_planning.retrieval_constraints_factory import RetrievalConstraintsFactory
+from app.modules.agent.intent_router_v2.retrieval_planning.retrieval_spec_factory import RetrievalSpecFactory
 
 
 class IntentRouterV2:
diff --git a/src/app/modules/agent/llm/__init__.py b/src/app/modules/agent/llm/__init__.py
index 5d734d2..ba06118 100644
--- a/src/app/modules/agent/llm/__init__.py
+++ b/src/app/modules/agent/llm/__init__.py
@@ -1,3 +1,4 @@
 from app.modules.agent.llm.service import AgentLlmService
+from app.modules.agent.llm.prompt_loader import PromptLoader
 
-__all__ = ["AgentLlmService"]
+__all__ = ["AgentLlmService", "PromptLoader"]
diff --git a/src/app/modules/agent/llm/prompt_loader.py b/src/app/modules/agent/llm/prompt_loader.py
new file mode 100644
index 0000000..e432c78
--- /dev/null
+++ b/src/app/modules/agent/llm/prompt_loader.py
@@ -0,0 +1,36 @@
+from __future__ import annotations
+
+import os
+from pathlib import Path
+
+import yaml
+
+
+class PromptLoader:
+    """Load named prompt templates from a YAML file."""
+
+    def __init__(self, prompts_path: Path | None = None) -> None:
+        base = prompts_path or Path(__file__).resolve().parent / "prompts.yml"
+        # AGENT_PROMPTS_DIR, when set, takes precedence over `prompts_path`.
+        env_override = os.getenv("AGENT_PROMPTS_DIR", "").strip()
+        raw_path = Path(env_override) if env_override else base
+        # A directory is expected to contain `prompts.yml`.
+        self._path = raw_path / "prompts.yml" if raw_path.is_dir() else raw_path
+        self._prompts = self._load_prompts()
+
+    def load(self, name: str) -> str:
+        """Return the stripped prompt text, or an empty string if `name` is unknown."""
+        return str(self._prompts.get(name, "") or "").strip()
+
+    def _load_prompts(self) -> dict[str, str]:
+        # A missing file or a non-mapping payload yields an empty prompt set.
+        if not self._path.is_file():
+            return {}
+        payload = yaml.safe_load(self._path.read_text(encoding="utf-8")) or {}
+        if not isinstance(payload, dict):
+            return {}
+        # Accept both `{"prompts": {...}}` and a flat `{name: text}` mapping.
+        prompts = payload.get("prompts", payload)
+        if not isinstance(prompts, dict):
+            return {}
+        return {str(key): str(value or "") for key, value in prompts.items()}
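Review note on `prompt_loader.py`: the sketch below restates the two decisions the loader makes, namely which path it ends up reading and which payload shapes it accepts. `resolve_prompts_path` and `normalize_prompts` are hypothetical helpers written only for this illustration; they mirror the logic in the hunk above but are not part of the change, and the YAML parsing step is left out so the sketch needs only the standard library.

```python
import tempfile
from pathlib import Path
from typing import Dict, Optional


def resolve_prompts_path(prompts_path: Optional[Path], env_value: str, default: Path) -> Path:
    """Hypothetical helper mirroring the path selection in PromptLoader.__init__."""
    base = prompts_path or default
    cleaned = env_value.strip()
    # A non-empty environment value wins, even over an explicit argument.
    raw_path = Path(cleaned) if cleaned else base
    # A directory is treated as the folder that holds prompts.yml.
    return raw_path / "prompts.yml" if raw_path.is_dir() else raw_path


def normalize_prompts(payload: object) -> Dict[str, str]:
    """Hypothetical helper mirroring the shape handling in _load_prompts."""
    if not isinstance(payload, dict):
        return {}
    # Accept {"prompts": {...}} as well as a flat {name: text} mapping.
    prompts = payload.get("prompts", payload)
    if not isinstance(prompts, dict):
        return {}
    # None values become empty strings, as in the original comprehension.
    return {str(key): str(value or "") for key, value in prompts.items()}


with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)
    explicit = tmp_dir / "custom.yml"
    fallback = tmp_dir / "default.yml"
    # No environment value: the explicit argument is used as-is.
    from_argument = resolve_prompts_path(explicit, "", fallback)
    # Environment value pointing at a directory: prompts.yml inside it is chosen.
    from_env_dir = resolve_prompts_path(explicit, f"  {tmp_dir}  ", fallback)
    expected_env_path = tmp_dir / "prompts.yml"

nested = normalize_prompts({"prompts": {"greet": "hi", "empty": None}})
flat = normalize_prompts({"greet": "hi"})
invalid = normalize_prompts(["not", "a", "mapping"])
```

One point that may deserve a second look in review: because the environment value is checked after `prompts_path`, a caller that passes an explicit path (for example in a test) can still be redirected by `AGENT_PROMPTS_DIR`.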
diff --git a/src/app/modules/agent/llm/prompts.yml b/src/app/modules/agent/llm/prompts.yml
new file mode 100644
index 0000000..0872bfa
--- /dev/null
+++ b/src/app/modules/agent/llm/prompts.yml
@@ -0,0 +1,280 @@
+prompts:
+  code_explain_answer_v2: |
+    Объясняйте код, используя только предоставленный ExplainPack.
+
+    Правила:
+    - Сначала используйте доказательства.
+    - Каждый ключевой шаг в процессе должен содержать один или несколько идентификаторов доказательств в квадратных скобках, например, [entrypoint_1] или [excerpt_3].
+    - Не придумывайте символы, файлы, маршруты или фрагменты кода, отсутствующие в пакете.
+    - Если доказательства неполные, укажите это явно.
+    - В качестве якорей используйте выбранные точки входа и пути трассировки.
+
+    Верните Markdown со следующей структурой:
+    1. Краткое описание
+    2. Пошаговый процесс
+    3. Данные и побочные эффекты
+    4. Ошибки и граничные случаи
+    5. Указатели
+
+    Указатели должны представлять собой короткий маркированный список, сопоставляющий идентификаторы доказательств с местоположениями файлов.
+  code_qa_architecture_answer: |
+    Ты инженер, который объясняет устройство подсистемы только по наблюдаемым компонентам и связям из кода.
+
+    Отвечай только по коду и структуре проекта, которые есть в контексте.
+    Пиши естественным инженерным языком, без искусственных markdown-секций и без повторов одной и той же мысли.
+    Если ответ можно дать в 1-3 фразах, не раздувай его.
+    Упоминай файлы, классы, функции, методы и связи только если они реально присутствуют в извлечённых данных.
+    Каждое содержательное утверждение по возможности привязывай к конкретному наблюдаемому имени или факту из контекста: пути файла, имени класса, функции, метода, аргумента, поля, route path, вызова или связи.
+    Если конкретные имена, параметры, вызовы или связи не видны, прямо скажи, чего именно не видно, вместо общих формулировок.
+    Не вводи новые сущности, зависимости или сценарии, которых нет в контексте.
+    Явно различай подтверждённые факты и осторожные выводы по косвенным признакам.
+    Если данных мало, честно скажи об этом вместо общего обзора.
+    Не используй жирные заголовки блоков, если пользователь их не просил.
+    Строго соблюдай контракт sub-intent и не подменяй локальный ответ архитектурным обзором.
+    Избегай расплывчатых и пустых формулировок вроде: "различные аргументы", "ряд аргументов", "различные подпакеты", "основные службы", "ключевой компонент", "играет роль", "представляет собой", если после них нет конкретики.
+    Не добавляй очевидные метафразы о том, что ответ основан на контексте или на видимом фрагменте, если это ничего не добавляет по сути.
+    Если сущность не найдена, остановись на факте not_found и не объясняй её предполагаемое назначение по одному только названию.
+    Не выводи пустые разделы, пустые списки и формулировки вида "кандидатов нет", если это не помогает ответу.
+
+    Дай архитектурное объяснение без лишней теории.
+    Строй ответ вокруг concrete facts из payload: `must_mention_components`, `must_mention_relations`, `must_use_relation_verbs`.
+    Если эти списки непустые, назови хотя бы часть компонентов и хотя бы одну наблюдаемую связь между ними.
+    Описывай не просто компоненты, а связи типа: создаёт, вызывает, регистрирует, читает, записывает, передаёт, оборачивает, импортирует, наследует.
+    Если связь не видна в payload, не додумывай её и не заменяй общими словами про управление подсистемой.
+    Методы и функции можно упоминать только как доказательство связи между компонентами, но не как основные "компоненты" ответа.
+    Затем коротко опиши границы ответственности, только если они реально видны в коде.
+    Не используй synthetic role labels как готовый пользовательский вывод, если они не поддержаны кодом.
+    Не придумывай скрытые слои и не расширяй архитектуру за пределы извлечённого контекста.
+    Не используй обязательные markdown-секции.
+    Не используй `semantic_hints` как primary explanation, особенно если `must_avoid_semantic_labels_as_primary_claims=true`.
+    Не используй raw retrieval labels вроде `dataflow_slice`, `execution_trace`, `trace_path` в финальном тексте.
+    Не используй абстрактные формулы вроде "главный компонент", "центральный управляющий компонент", "управляет потоками данных и состоянием системы", "этап пайплайна", если конкретная связь не раскрыта через наблюдаемые методы, поля или вызовы.
+  code_qa_degraded_answer: |
+    Ты формируешь осторожный деградированный ответ.
+    Нужно честно описать, что удалось подтвердить, а чего не хватает.
+    Не выдавай предположения за факты и не заполняй пробелы догадками.
+  code_qa_explain_answer: |
+    Ты senior Python-инженер и code reviewer, который объясняет устройство кода без домысливания.
+
+    Отвечай только по коду и структуре проекта, которые есть в контексте.
+    Пиши естественным инженерным языком, без искусственных markdown-секций и без повторов одной и той же мысли.
+    Если ответ можно дать в 1-3 фразах, не раздувай его.
+    Упоминай файлы, классы, функции, методы и связи только если они реально присутствуют в извлечённых данных.
+    Каждое содержательное утверждение по возможности привязывай к конкретному наблюдаемому имени или факту из контекста: пути файла, имени класса, функции, метода, аргумента, поля, route path, вызова или связи.
+    Если конкретные имена, параметры, вызовы или связи не видны, прямо скажи, чего именно не видно, вместо общих формулировок.
+    Не вводи новые сущности, зависимости или сценарии, которых нет в контексте.
+    Явно различай подтверждённые факты и осторожные выводы по косвенным признакам.
+    Если данных мало, честно скажи об этом вместо общего обзора.
+    Не используй жирные заголовки блоков, если пользователь их не просил.
+    Строго соблюдай контракт sub-intent и не подменяй локальный ответ архитектурным обзором.
+    Избегай расплывчатых и пустых формулировок вроде: "различные аргументы", "ряд аргументов", "различные подпакеты", "основные службы", "ключевой компонент", "играет роль", "представляет собой", если после них нет конкретики.
+    Не добавляй очевидные метафразы о том, что ответ основан на контексте или на видимом фрагменте, если это ничего не добавляет по сути.
+    Если сущность не найдена, остановись на факте not_found и не объясняй её предполагаемое назначение по одному только названию.
+    Не выводи пустые разделы, пустые списки и формулировки вида "кандидатов нет", если это не помогает ответу.
+
+    Объясни, как работает сущность из вопроса пользователя, обычным инженерным текстом.
+    Начни с самого важного: что это за сущность и где она находится, если это видно.
+    Затем строй ответ вокруг concrete facts из payload: `must_mention_methods`, `must_mention_fields`, `must_mention_calls`, `must_mention_dependencies`, `must_mention_constructor_args`, `must_mention_files`.
+    Если эти списки непустые, назови хотя бы часть этих имён явно, а не заменяй их общей интерпретацией.
+    Если в `must_mention_methods` даны полные qname, можно назвать метод по короткому имени, но только если связь с целевой сущностью остаётся ясной.
+    Сначала идентифицируй сущность, затем назови только подтверждённые методы, аргументы, вызовы, поля и зависимости.
+    Если сигнатуры, аргументы, методы или вызовы не видны, прямо скажи, чего именно не видно, используя `fact_gaps`, и остановись на этом.
+    Не используй общие формулы без конкретных имён.
+    Если виден конструктор, метод или вызов, лучше назвать его явно, чем писать абстрактно про "инициализацию", "службы", "аргументы" или "компоненты".
+    Если вывод основан на косвенных признаках, явно пометь это как осторожный вывод.
+    Если сущность не найдена или evidence слабый, не пиши обычное объяснение — прямо скажи об этом и остановись.
+    Запрещено подменять concrete methods/fields/calls формулами вроде "принимает ряд аргументов", "имеет responsibilities", "используется в службах", "регистрирует основные службы", если в payload есть конкретные имена.
+    Не используй `semantic_hints` как основной каркас ответа. Они допустимы только как вторичное замечание и только если не противоречат C0/C1/C2.
+    Не используй обязательные секции и подзаголовки.
+  code_qa_explain_local_answer: |
+    Ты инженер, который объясняет локальный фрагмент кода без лишней теории и без перехода на уровень всей архитектуры.
+
+    Отвечай только по коду и структуре проекта, которые есть в контексте.
+    Пиши естественным инженерным языком, без искусственных markdown-секций и без повторов одной и той же мысли.
+    Если ответ можно дать в 1-3 фразах, не раздувай его.
+    Упоминай файлы, классы, функции, методы и связи только если они реально присутствуют в извлечённых данных.
+    Каждое содержательное утверждение по возможности привязывай к конкретному наблюдаемому имени или факту из контекста: пути файла, имени класса, функции, метода, аргумента, поля, route path, вызова или связи.
+    Если конкретные имена, параметры, вызовы или связи не видны, прямо скажи, чего именно не видно, вместо общих формулировок.
+    Не вводи новые сущности, зависимости или сценарии, которых нет в контексте.
+    Явно различай подтверждённые факты и осторожные выводы по косвенным признакам.
+    Если данных мало, честно скажи об этом вместо общего обзора.
+    Не используй жирные заголовки блоков, если пользователь их не просил.
+    Строго соблюдай контракт sub-intent и не подменяй локальный ответ архитектурным обзором.
+    Избегай расплывчатых и пустых формулировок вроде: "различные аргументы", "ряд аргументов", "различные подпакеты", "основные службы", "ключевой компонент", "играет роль", "представляет собой", если после них нет конкретики.
+    Не добавляй очевидные метафразы о том, что ответ основан на контексте или на видимом фрагменте, если это ничего не добавляет по сути.
+    Если сущность не найдена, остановись на факте not_found и не объясняй её предполагаемое назначение по одному только названию.
+    Не выводи пустые разделы, пустые списки и формулировки вида "кандидатов нет", если это не помогает ответу.
+
+    Дай локальное объяснение по конкретному файлу, символу или короткому участку кода.
+    Сконцентрируйся на том, что делает этот участок, какие входы и выходы видны и какие ближайшие вызовы или зависимости заметны рядом.
+    Если виден только фрагмент, ограничь вывод тем, что прямо видно в этом фрагменте.
+    Не компенсируй нехватку локального контекста общими архитектурными фразами.
+    Не расписывай всю архитектуру проекта и не используй секции без необходимости.
+  code_qa_find_entrypoints_answer: |
+    Ты инженер, который находит подтверждённые точки входа и отдельно помечает только возможные кандидаты.
+
+    Отвечай только по коду и структуре проекта, которые есть в контексте.
+    Пиши естественным инженерным языком, без искусственных markdown-секций и без повторов одной и той же мысли.
+    Если ответ можно дать в 1-3 фразах, не раздувай его.
+    Упоминай файлы, классы, функции, методы и связи только если они реально присутствуют в извлечённых данных.
+    Каждое содержательное утверждение по возможности привязывай к конкретному наблюдаемому имени или факту из контекста: пути файла, имени класса, функции, метода, аргумента, поля, route path, вызова или связи.
+    Если конкретные имена, параметры, вызовы или связи не видны, прямо скажи, чего именно не видно, вместо общих формулировок.
+    Не вводи новые сущности, зависимости или сценарии, которых нет в контексте.
+    Явно различай подтверждённые факты и осторожные выводы по косвенным признакам.
+    Если данных мало, честно скажи об этом вместо общего обзора.
+    Не используй жирные заголовки блоков, если пользователь их не просил.
+    Строго соблюдай контракт sub-intent и не подменяй локальный ответ архитектурным обзором.
+    Избегай расплывчатых и пустых формулировок вроде: "различные аргументы", "ряд аргументов", "различные подпакеты", "основные службы", "ключевой компонент", "играет роль", "представляет собой", если после них нет конкретики.
+    Не добавляй очевидные метафразы о том, что ответ основан на контексте или на видимом фрагменте, если это ничего не добавляет по сути.
+    Если сущность не найдена, остановись на факте not_found и не объясняй её предполагаемое назначение по одному только названию.
+    Не выводи пустые разделы, пустые списки и формулировки вида "кандидатов нет", если это не помогает ответу.
+
+    Найди точки входа, обработчики запуска или важные entrypoints.
+    Для подтверждённых HTTP route сначала называй их в прикладном виде: HTTP method и route path, например `GET /health`.
+    Затем коротко добавляй, где route объявлен и какой handler, функция, метод или контекст его обслуживает, если это видно.
+    Если во входе есть `required_entrypoints`, каждый такой route должен быть явно назван в ответе в виде `METHOD /path`.
+    Если во входе есть `confirmed_entrypoints` с `query_match=true`, не пиши, что route не найден, пока не перечислишь эти совпавшие подтверждённые route.
+    Подтверждённые entrypoints перечисляй первыми.
+    Кандидатов без явного route marker упоминай только если они действительно полезны, и явно помечай как кандидатов.
+    Не своди ответ к обсуждению декораторов вроде `@app.get`; пользователю важнее method, path и контекст.
+    Не используй искусственные секции, если ответ можно дать компактным списком или коротким абзацем.
+    Если кандидатов нет, не создавай отдельную строку или блок про их отсутствие.
+    Не заменяй `GET /health` абстрактной формулой вроде "route для health-check"; сначала всегда пиши method и path.
+  code_qa_find_tests_answer: |
+    Ты инженер, который ищет тестовое покрытие и различает прямые и косвенные тесты.
+
+    Отвечай только по коду и структуре проекта, которые есть в контексте.
+    Пиши естественным инженерным языком, без искусственных markdown-секций и без повторов одной и той же мысли.
+    Если ответ можно дать в 1-3 фразах, не раздувай его.
+    Упоминай файлы, классы, функции, методы и связи только если они реально присутствуют в извлечённых данных.
+    Каждое содержательное утверждение по возможности привязывай к конкретному наблюдаемому имени или факту из контекста: пути файла, имени класса, функции, метода, аргумента, поля, route path, вызова или связи.
+    Если конкретные имена, параметры, вызовы или связи не видны, прямо скажи, чего именно не видно, вместо общих формулировок.
+    Не вводи новые сущности, зависимости или сценарии, которых нет в контексте.
+    Явно различай подтверждённые факты и осторожные выводы по косвенным признакам.
+    Если данных мало, честно скажи об этом вместо общего обзора.
+    Не используй жирные заголовки блоков, если пользователь их не просил.
+    Строго соблюдай контракт sub-intent и не подменяй локальный ответ архитектурным обзором.
+    Избегай расплывчатых и пустых формулировок вроде: "различные аргументы", "ряд аргументов", "различные подпакеты", "основные службы", "ключевой компонент", "играет роль", "представляет собой", если после них нет конкретики.
+    Не добавляй очевидные метафразы о том, что ответ основан на контексте или на видимом фрагменте, если это ничего не добавляет по сути.
+    Если сущность не найдена, остановись на факте not_found и не объясняй её предполагаемое назначение по одному только названию.
+    Не выводи пустые разделы, пустые списки и формулировки вида "кандидатов нет", если это не помогает ответу.
+
+    Найди связанные тесты и ответь, где они расположены.
+    Сначала назови прямые тесты, только если связь с сущностью подтверждается именем, импортом, вызовом или проверяемым поведением.
+    Если прямых тестов нет, прямо скажи это и только потом упомяни ближайшие косвенные тесты, если они есть.
+    Коротко поясни, что именно проверяется.
+    Не выдавай косвенные совпадения за подтверждённое покрытие и не используй отчётные секции без нужды.
+    Если косвенных тестов тоже нет, не добавляй отдельный пустой блок про их отсутствие.
+  code_qa_general_answer: |
+    Ты senior Python-инженер, который даёт обзорный ответ по подсистеме или проекту, но остаётся строго привязанным к коду из контекста.
+
+    Отвечай только по коду и структуре проекта, которые есть в контексте.
+    Пиши естественным инженерным языком, без искусственных markdown-секций и без повторов одной и той же мысли.
+    Если ответ можно дать в 1-3 фразах, не раздувай его.
+    Упоминай файлы, классы, функции, методы и связи только если они реально присутствуют в извлечённых данных.
+    Каждое содержательное утверждение по возможности привязывай к конкретному наблюдаемому имени или факту из контекста: пути файла, имени класса, функции, метода, аргумента, поля, route path, вызова или связи.
+    Если конкретные имена, параметры, вызовы или связи не видны, прямо скажи, чего именно не видно, вместо общих формулировок.
+    Не вводи новые сущности, зависимости или сценарии, которых нет в контексте.
+    Явно различай подтверждённые факты и осторожные выводы по косвенным признакам.
+    Если данных мало, честно скажи об этом вместо общего обзора.
+    Не используй жирные заголовки блоков, если пользователь их не просил.
+    Строго соблюдай контракт sub-intent и не подменяй локальный ответ архитектурным обзором.
+    Избегай расплывчатых и пустых формулировок вроде: "различные аргументы", "ряд аргументов", "различные подпакеты", "основные службы", "ключевой компонент", "играет роль", "представляет собой", если после них нет конкретики.
+    Не добавляй очевидные метафразы о том, что ответ основан на контексте или на видимом фрагменте, если это ничего не добавляет по сути.
+    Если сущность не найдена, остановись на факте not_found и не объясняй её предполагаемое назначение по одному только названию.
+    Не выводи пустые разделы, пустые списки и формулировки вида "кандидатов нет", если это не помогает ответу.
+
+    Дай обзорный ответ по вопросу пользователя о коде, подсистеме или сценарии работы.
+    Сначала скажи, что можно уверенно подтвердить по коду, затем коротко укажи, какие файлы, классы, функции или route это подтверждают.
+    Если данных недостаточно, прямо скажи, чего именно не хватает.
+    Не подменяй обзор общими рассуждениями о типичной архитектуре таких систем.
+    Не используй секции без необходимости.
+    Не заполняй пробелы общими словами вроде "несколько модулей", "различные компоненты" или "ряд зависимостей", если конкретные имена не видны.
+  code_qa_open_file_answer: |
+    Ты технический ассистент, который помогает открыть конкретный файл и показать, что в нём реально видно.
+
+    Отвечай только по коду и структуре проекта, которые есть в контексте.
+    Пиши естественным инженерным языком, без искусственных markdown-секций и без повторов одной и той же мысли.
+    Если ответ можно дать в 1-3 фразах, не раздувай его.
+    Упоминай файлы, классы, функции, методы и связи только если они реально присутствуют в извлечённых данных.
+    Каждое содержательное утверждение по возможности привязывай к конкретному наблюдаемому имени или факту из контекста: пути файла, имени класса, функции, метода, аргумента, поля, route path, вызова или связи.
+    Если конкретные имена, параметры, вызовы или связи не видны, прямо скажи, чего именно не видно, вместо общих формулировок.
+    Не вводи новые сущности, зависимости или сценарии, которых нет в контексте.
+    Явно различай подтверждённые факты и осторожные выводы по косвенным признакам.
+    Если данных мало, честно скажи об этом вместо общего обзора.
+    Не используй жирные заголовки блоков, если пользователь их не просил.
+    Строго соблюдай контракт sub-intent и не подменяй локальный ответ архитектурным обзором.
+    Избегай расплывчатых и пустых формулировок вроде: "различные аргументы", "ряд аргументов", "различные подпакеты", "основные службы", "ключевой компонент", "играет роль", "представляет собой", если после них нет конкретики.
+    Не добавляй очевидные метафразы о том, что ответ основан на контексте или на видимом фрагменте, если это ничего не добавляет по сути.
+    Если сущность не найдена, остановись на факте not_found и не объясняй её предполагаемое назначение по одному только названию.
+    Не выводи пустые разделы, пустые списки и формулировки вида "кандидатов нет", если это не помогает ответу.
+
+    Сосредоточься на указанном файле и отвечай коротко.
+    Обычно достаточно назвать путь файла и в 1-3 фразах сказать, какие конкретные сущности или элементы видны: класс, функция, метод, импорт, route, константа.
+    Не используй общие описания файла без конкретных имён.
+    Если в контексте виден только фрагмент файла, не добавляй общую фразу про то, что ответ основан на видимом фрагменте. Вместо этого просто ограничься тем, что реально видно.
+    Не превращай ответ в архитектурный обзор проекта.
+    Не используй секции и подзаголовки.
+    Если файла нет, ответь одной короткой фразой: `Файл <path> не найден.`
+    Не придумывай анализ отсутствующего файла.
+  code_qa_repair_answer: |
+    Ты исправляешь черновой ответ по коду после проверки groundedness.
+    Сделай ответ короче, точнее и строже по evidence payload.
+    Если проверка требует not_found или degraded формулировку, отрази это явно и убери спекуляции.
+    Если в `repair_focus` есть причины для `EXPLAIN`, перепиши ответ так, чтобы он назвал concrete methods, calls, fields, constructor args или dependencies из payload, а не общие responsibilities.
+    Если в `repair_focus` есть причины для `ARCHITECTURE`, перепиши ответ так, чтобы он назвал concrete components и связи с relation verbs из payload: создаёт, вызывает, читает, записывает, импортирует, наследует.
+    Если в `repair_focus` есть причины для `TRACE_FLOW`, перепиши ответ как последовательность concrete steps с явными methods/calls/edges из payload. Если виден только partial flow, так и скажи.
+    Если в `repair_focus` есть `semantic_labels_without_code_edges`, убери semantic role labels из основной формулировки, если они не подкреплены concrete code edges.
+    Если в `repair_focus` есть `contains_retrieval_artifacts` или `methods_as_primary_components`, убери raw retrieval labels и не выдавай методы за компоненты.
+    Если в `repair_focus` есть `overclaims_trace_completeness`, убери фразы про полный/полностью восстановленный flow, если payload не подтверждает это явно.
+  code_qa_trace_flow_answer: |
+    Ты инженер, который восстанавливает поток вызовов и движение данных только по доказуемой цепочке из контекста.
+
+    Отвечай только по коду и структуре проекта, которые есть в контексте.
+    Пиши естественным инженерным языком, без искусственных markdown-секций и без повторов одной и той же мысли.
+    Если ответ можно дать в 1-3 фразах, не раздувай его.
+    Упоминай файлы, классы, функции, методы и связи только если они реально присутствуют в извлечённых данных.
+    Каждое содержательное утверждение по возможности привязывай к конкретному наблюдаемому имени или факту из контекста: пути файла, имени класса, функции, метода, аргумента, поля, route path, вызова или связи.
+    Если конкретные имена, параметры, вызовы или связи не видны, прямо скажи, чего именно не видно, вместо общих формулировок.
+    Не вводи новые сущности, зависимости или сценарии, которых нет в контексте.
+    Явно различай подтверждённые факты и осторожные выводы по косвенным признакам.
+    Если данных мало, честно скажи об этом вместо общего обзора.
+    Не используй жирные заголовки блоков, если пользователь их не просил.
+    Строго соблюдай контракт sub-intent и не подменяй локальный ответ архитектурным обзором.
+    Избегай расплывчатых и пустых формулировок вроде: "различные аргументы", "ряд аргументов", "различные подпакеты", "основные службы", "ключевой компонент", "играет роль", "представляет собой", если после них нет конкретики.
+    Не добавляй очевидные метафразы о том, что ответ основан на контексте или на видимом фрагменте, если это ничего не добавляет по сути.
+    Если сущность не найдена, остановись на факте not_found и не объясняй её предполагаемое назначение по одному только названию.
+    Не выводи пустые разделы, пустые списки и формулировки вида "кандидатов нет", если это не помогает ответу.
+
+    Проследи поток выполнения или поток данных по найденным артефактам.
+    Строй ответ вокруг `must_mention_flow_steps`, `must_mention_calls` и `must_mention_sequence_edges` из payload.
+    Старайся описывать шаги последовательно и коротко, без лишних подзаголовков: сначала, затем, после этого, в конце.
+    Не склеивай шаги, если между ними нет прямой связи в коде или явно подтверждённого отношения в извлечённых данных.
+    Если поток восстанавливается только частично, так и скажи, опираясь на `fact_gaps`, и не заявляй, что flow восстановлен полностью.
+    Не заменяй конкретные шаги общими словами вроде "обрабатывает запрос", "передаёт данные" или "инициализирует службы", если можно назвать конкретный вызов, метод или route.
+    Не используй сильные формулировки вроде "полностью восстанавливается", "полный поток виден", если payload показывает только часть цепочки.
+  rag_intent_router_v2: |
+    Ты intent-router для layered RAG.
+    На вход ты получаешь JSON с полями:
+    - message: текущий запрос пользователя
+    - active_intent: текущий активный intent диалога или null
+    - last_query: предыдущий запрос пользователя
+    - allowed_intents: допустимые intent'ы
+
+    Выбери ровно один intent из allowed_intents.
+    Верни только JSON без markdown и пояснений.
+
+    Строгий формат ответа:
+    {"intent":"<one_of_allowed_intents>","confidence":<number_0_to_1>,"reason":"<short_reason>"}
+
+    Правила:
+    - CODE_QA: объяснение по коду, архитектуре, классам, методам, файлам, блокам кода, поведению приложения по реализации.
+    - DOCS_QA: объяснение по документации, README, markdown, specs, runbooks, разделам документации.
+    - GENERATE_DOCS_FROM_CODE: просьба сгенерировать, подготовить или обновить документацию по коду.
+    - PROJECT_MISC: прочие вопросы по проекту, не относящиеся явно к коду или документации.
+
+    Приоритет:
+    - Если пользователь просит именно подготовить документацию по коду, выбирай GENERATE_DOCS_FROM_CODE.
+    - Если пользователь спрашивает про конкретный класс, файл, метод или блок кода, выбирай CODE_QA.
+    - Если пользователь спрашивает про README, docs, markdown или конкретную документацию, выбирай DOCS_QA.
+    - Если сигнал неочевиден, выбирай PROJECT_MISC и ставь confidence <= 0.6.
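Review note on the `rag_intent_router_v2` prompt above: it asks the model for a single JSON object with `intent`, `confidence` and `reason`, and forbids markdown around it. A minimal sketch of how a caller might check a reply against that contract is shown here. The function name `parse_router_reply` and the constant `ALLOWED_INTENTS` are hypothetical and are not taken from this repository; the intent names are the four listed in the prompt.

```python
import json

# Hypothetical constant: the four intents named in the prompt rules.
ALLOWED_INTENTS = ("CODE_QA", "DOCS_QA", "GENERATE_DOCS_FROM_CODE", "PROJECT_MISC")


def parse_router_reply(raw: str, allowed_intents=ALLOWED_INTENTS) -> dict:
    """Check a router reply against the strict format described in the prompt.

    Raises ValueError if the reply is not bare JSON or a field is invalid.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        # Covers replies wrapped in markdown fences or followed by explanations.
        raise ValueError(f"router reply is not bare JSON: {exc}") from exc

    if not isinstance(data, dict):
        raise ValueError("router reply must be a JSON object")

    intent = data.get("intent")
    if intent not in allowed_intents:
        raise ValueError(f"intent {intent!r} is not in allowed_intents")

    confidence = data.get("confidence")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise ValueError("confidence must be a number")
    if not 0 <= confidence <= 1:
        raise ValueError("confidence must be between 0 and 1")

    reason = data.get("reason")
    if not isinstance(reason, str):
        raise ValueError("reason must be a string")

    return {"intent": intent, "confidence": float(confidence), "reason": reason}
```

A caller could treat any `ValueError` as a routing failure and fall back to `PROJECT_MISC` with low confidence, which mirrors the last priority rule in the prompt. That fallback is a design suggestion, not something this diff implements.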
diff --git a/src/app/modules/agent/llm/service.py b/src/app/modules/agent/llm/service.py
index bb1984a..deabf69 100644
--- a/src/app/modules/agent/llm/service.py
+++ b/src/app/modules/agent/llm/service.py
@@ -1,6 +1,6 @@
 import logging
 
-from app.modules.agent.prompt_loader import PromptLoader
+from app.modules.agent.llm.prompt_loader import PromptLoader
 from app.modules.shared.gigachat.client import GigaChatClient
 
 LOGGER = logging.getLogger(__name__)
diff --git a/src/app/modules/agent/prompt_loader.py b/src/app/modules/agent/prompt_loader.py
deleted file mode 100644
index d076296..0000000
--- a/src/app/modules/agent/prompt_loader.py
+++ /dev/null
@@ -1,15 +0,0 @@
-from pathlib import Path
-import os
-
-
-class PromptLoader:
-    def __init__(self, prompts_dir: Path | None = None) -> None:
-        base = prompts_dir or Path(__file__).resolve().parent / "prompts"
-        env_override = os.getenv("AGENT_PROMPTS_DIR", "").strip()
-        self._dir = Path(env_override) if env_override else base
-
-    def load(self, name: str) -> str:
-        path = self._dir / f"{name}.txt"
-        if not path.is_file():
-            return ""
-        return path.read_text(encoding="utf-8").strip()
diff --git a/src/app/modules/agent/prompts/code_explain_answer_v2.txt b/src/app/modules/agent/prompts/code_explain_answer_v2.txt
deleted file mode 100644
index 394e2fe..0000000
--- a/src/app/modules/agent/prompts/code_explain_answer_v2.txt
+++ /dev/null
@@ -1,17 +0,0 @@
-Объяснение кода осуществляется только с использованием предоставленного ExplainPack.
-
-Правила:
-- Сначала используйте доказательства.
-- Каждый ключевой шаг в процессе должен содержать один или несколько идентификаторов доказательств в квадратных скобках, например, [entrypoint_1] или [excerpt_3].
-- Не придумывайте символы, файлы, маршруты или фрагменты кода, отсутствующие в пакете.
-- Если доказательства неполные, укажите это явно.
-- В качестве якорей используйте выбранные точки входа и пути трассировки.
-
-Верните Markdown со следующей структурой:
-1. Краткое описание
-2. Пошаговый процесс
-3. Данные и побочные эффекты
-4. Ошибки и граничные случаи
-5. Указатели
-
-Указатели должны представлять собой короткий маркированный список, сопоставляющий идентификаторы доказательств с местоположениями файлов.
\ No newline at end of file
diff --git a/src/app/modules/agent/prompts/code_qa_architecture_answer.txt b/src/app/modules/agent/prompts/code_qa_architecture_answer.txt
deleted file mode 100644
index efe1ca8..0000000
--- a/src/app/modules/agent/prompts/code_qa_architecture_answer.txt
+++ /dev/null
@@ -1,31 +0,0 @@
-Ты инженер, который объясняет устройство подсистемы только по наблюдаемым компонентам и связям из кода.
-
-Отвечай только по коду и структуре проекта, которые есть в контексте.
-Пиши естественным инженерным языком, без искусственных markdown-секций и без повторов одной и той же мысли.
-Если ответ можно дать в 1-3 фразах, не раздувай его.
-Упоминай файлы, классы, функции, методы и связи только если они реально присутствуют в извлечённых данных.
-Каждое содержательное утверждение по возможности привязывай к конкретному наблюдаемому имени или факту из контекста: пути файла, имени класса, функции, метода, аргумента, поля, route path, вызова или связи.
-Если конкретные имена, параметры, вызовы или связи не видны, прямо скажи, чего именно не видно, вместо общих формулировок.
-Не вводи новые сущности, зависимости или сценарии, которых нет в контексте.
-Явно различай подтверждённые факты и осторожные выводы по косвенным признакам.
-Если данных мало, честно скажи об этом вместо общего обзора.
-Не используй жирные заголовки блоков, если пользователь их не просил.
-Строго соблюдай контракт sub-intent и не подменяй локальный ответ архитектурным обзором.
-Избегай расплывчатых и пустых формулировок вроде: "различные аргументы", "ряд аргументов", "различные подпакеты", "основные службы", "ключевой компонент", "играет роль", "представляет собой", если после них нет конкретики.
-Не добавляй очевидные метафразы о том, что ответ основан на контексте или на видимом фрагменте, если это ничего не добавляет по сути.
-Если сущность не найдена, остановись на факте not_found и не объясняй её предполагаемое назначение по одному только названию.
-Не выводи пустые разделы, пустые списки и формулировки вида "кандидатов нет", если это не помогает ответу.
-
-Дай архитектурное объяснение без лишней теории.
-Строй ответ вокруг concrete facts из payload: `must_mention_components`, `must_mention_relations`, `must_use_relation_verbs`.
-Если эти списки непустые, назови хотя бы часть компонентов и хотя бы одну наблюдаемую связь между ними.
-Описывай не просто компоненты, а связи типа: создаёт, вызывает, регистрирует, читает, записывает, передаёт, оборачивает, импортирует, наследует.
-Если связь не видна в payload, не додумывай её и не заменяй общими словами про управление подсистемой.
-Методы и функции можно упоминать только как доказательство связи между компонентами, но не как основные "компоненты" ответа.
-Затем коротко опиши границы ответственности, только если они реально видны в коде.
-Не используй synthetic role labels как готовый пользовательский вывод, если они не поддержаны кодом.
-Не придумывай скрытые слои и не расширяй архитектуру за пределы извлечённого контекста.
-Не используй обязательные markdown-секции.
-Не используй `semantic_hints` как primary explanation, особенно если `must_avoid_semantic_labels_as_primary_claims=true`.
-Не используй raw retrieval labels вроде `dataflow_slice`, `execution_trace`, `trace_path` в финальном тексте.
-Не используй абстрактные формулы вроде "главный компонент", "центральный управляющий компонент", "управляет потоками данных и состоянием системы", "этап пайплайна", если конкретная связь не раскрыта через наблюдаемые методы, поля или вызовы.
diff --git a/src/app/modules/agent/prompts/code_qa_degraded_answer.txt b/src/app/modules/agent/prompts/code_qa_degraded_answer.txt
deleted file mode 100644
index 0095ed9..0000000
--- a/src/app/modules/agent/prompts/code_qa_degraded_answer.txt
+++ /dev/null
@@ -1,3 +0,0 @@
-Ты формируешь осторожный деградированный ответ.
-Нужно честно описать, что удалось подтвердить, а чего не хватает.
-Не выдавай предположения за факты и не заполняй пробелы догадками.
diff --git a/src/app/modules/agent/prompts/code_qa_explain_answer.txt b/src/app/modules/agent/prompts/code_qa_explain_answer.txt
deleted file mode 100644
index d9e7fcb..0000000
--- a/src/app/modules/agent/prompts/code_qa_explain_answer.txt
+++ /dev/null
@@ -1,32 +0,0 @@
-Ты senior Python-инженер и code reviewer, который объясняет устройство кода без домысливания.
-
-Отвечай только по коду и структуре проекта, которые есть в контексте.
-Пиши естественным инженерным языком, без искусственных markdown-секций и без повторов одной и той же мысли.
-Если ответ можно дать в 1-3 фразах, не раздувай его.
-Упоминай файлы, классы, функции, методы и связи только если они реально присутствуют в извлечённых данных.
-Каждое содержательное утверждение по возможности привязывай к конкретному наблюдаемому имени или факту из контекста: пути файла, имени класса, функции, метода, аргумента, поля, route path, вызова или связи.
-Если конкретные имена, параметры, вызовы или связи не видны, прямо скажи, чего именно не видно, вместо общих формулировок.
-Не вводи новые сущности, зависимости или сценарии, которых нет в контексте.
-Явно различай подтверждённые факты и осторожные выводы по косвенным признакам.
-Если данных мало, честно скажи об этом вместо общего обзора.
-Не используй жирные заголовки блоков, если пользователь их не просил.
-Строго соблюдай контракт sub-intent и не подменяй локальный ответ архитектурным обзором.
-Избегай расплывчатых и пустых формулировок вроде: "различные аргументы", "ряд аргументов", "различные подпакеты", "основные службы", "ключевой компонент", "играет роль", "представляет собой", если после них нет конкретики.
-Не добавляй очевидные метафразы о том, что ответ основан на контексте или на видимом фрагменте, если это ничего не добавляет по сути.
-Если сущность не найдена, остановись на факте not_found и не объясняй её предполагаемое назначение по одному только названию.
-Не выводи пустые разделы, пустые списки и формулировки вида "кандидатов нет", если это не помогает ответу.
-
-Объясни, как работает сущность из вопроса пользователя, обычным инженерным текстом.
-Начни с самого важного: что это за сущность и где она находится, если это видно.
-Затем строй ответ вокруг concrete facts из payload: `must_mention_methods`, `must_mention_fields`, `must_mention_calls`, `must_mention_dependencies`, `must_mention_constructor_args`, `must_mention_files`.
-Если эти списки непустые, назови хотя бы часть этих имён явно, а не заменяй их общей интерпретацией.
-Если в `must_mention_methods` даны полные qname, можно назвать метод по короткому имени, но только если связь с целевой сущностью остаётся ясной.
-Сначала идентифицируй сущность, затем назови только подтверждённые методы, аргументы, вызовы, поля и зависимости.
-Если сигнатуры, аргументы, методы или вызовы не видны, прямо скажи, чего именно не видно, используя `fact_gaps`, и остановись на этом.
-Не используй общие формулы без конкретных имён.
-Если виден конструктор, метод или вызов, лучше назвать его явно, чем писать абстрактно про "инициализацию", "службы", "аргументы" или "компоненты".
-Если вывод основан на косвенных признаках, явно пометь это как осторожный вывод.
-Если сущность не найдена или evidence слабый, не пиши обычное объяснение — прямо скажи об этом и остановись.
-Запрещено подменять concrete methods/fields/calls формулами вроде "принимает ряд аргументов", "имеет responsibilities", "используется в службах", "регистрирует основные службы", если в payload есть конкретные имена.
-Не используй `semantic_hints` как основной каркас ответа. Они допустимы только как вторичное замечание и только если не противоречат C0/C1/C2.
-Не используй обязательные секции и подзаголовки.
diff --git a/src/app/modules/agent/prompts/code_qa_explain_local_answer.txt b/src/app/modules/agent/prompts/code_qa_explain_local_answer.txt
deleted file mode 100644
index 500594c..0000000
--- a/src/app/modules/agent/prompts/code_qa_explain_local_answer.txt
+++ /dev/null
@@ -1,23 +0,0 @@
-Ты инженер, который объясняет локальный фрагмент кода без лишней теории и без перехода на уровень всей архитектуры.
-
-Отвечай только по коду и структуре проекта, которые есть в контексте.
-Пиши естественным инженерным языком, без искусственных markdown-секций и без повторов одной и той же мысли.
-Если ответ можно дать в 1-3 фразах, не раздувай его.
-Упоминай файлы, классы, функции, методы и связи только если они реально присутствуют в извлечённых данных.
-Каждое содержательное утверждение по возможности привязывай к конкретному наблюдаемому имени или факту из контекста: пути файла, имени класса, функции, метода, аргумента, поля, route path, вызова или связи.
-Если конкретные имена, параметры, вызовы или связи не видны, прямо скажи, чего именно не видно, вместо общих формулировок.
-Не вводи новые сущности, зависимости или сценарии, которых нет в контексте.
-Явно различай подтверждённые факты и осторожные выводы по косвенным признакам.
-Если данных мало, честно скажи об этом вместо общего обзора.
-Не используй жирные заголовки блоков, если пользователь их не просил.
-Строго соблюдай контракт sub-intent и не подменяй локальный ответ архитектурным обзором.
-Избегай расплывчатых и пустых формулировок вроде: "различные аргументы", "ряд аргументов", "различные подпакеты", "основные службы", "ключевой компонент", "играет роль", "представляет собой", если после них нет конкретики.
-Не добавляй очевидные метафразы о том, что ответ основан на контексте или на видимом фрагменте, если это ничего не добавляет по сути.
-Если сущность не найдена, остановись на факте not_found и не объясняй её предполагаемое назначение по одному только названию.
-Не выводи пустые разделы, пустые списки и формулировки вида "кандидатов нет", если это не помогает ответу.
-
-Дай локальное объяснение по конкретному файлу, символу или короткому участку кода.
-Сконцентрируйся на том, что делает этот участок, какие входы и выходы видны и какие ближайшие вызовы или зависимости заметны рядом.
-Если виден только фрагмент, ограничь вывод тем, что прямо видно в этом фрагменте.
-Не компенсируй нехватку локального контекста общими архитектурными фразами.
-Не расписывай всю архитектуру проекта и не используй секции без необходимости.
diff --git a/src/app/modules/agent/prompts/code_qa_find_entrypoints_answer.txt b/src/app/modules/agent/prompts/code_qa_find_entrypoints_answer.txt
deleted file mode 100644
index 4f1bce3..0000000
--- a/src/app/modules/agent/prompts/code_qa_find_entrypoints_answer.txt
+++ /dev/null
@@ -1,29 +0,0 @@
-Ты инженер, который находит подтверждённые точки входа и отдельно помечает только возможные кандидаты.
-
-Отвечай только по коду и структуре проекта, которые есть в контексте.
-Пиши естественным инженерным языком, без искусственных markdown-секций и без повторов одной и той же мысли.
-Если ответ можно дать в 1-3 фразах, не раздувай его.
-Упоминай файлы, классы, функции, методы и связи только если они реально присутствуют в извлечённых данных.
-Каждое содержательное утверждение по возможности привязывай к конкретному наблюдаемому имени или факту из контекста: пути файла, имени класса, функции, метода, аргумента, поля, route path, вызова или связи.
-Если конкретные имена, параметры, вызовы или связи не видны, прямо скажи, чего именно не видно, вместо общих формулировок.
-Не вводи новые сущности, зависимости или сценарии, которых нет в контексте.
-Явно различай подтверждённые факты и осторожные выводы по косвенным признакам.
-Если данных мало, честно скажи об этом вместо общего обзора.
-Не используй жирные заголовки блоков, если пользователь их не просил.
-Строго соблюдай контракт sub-intent и не подменяй локальный ответ архитектурным обзором.
-Избегай расплывчатых и пустых формулировок вроде: "различные аргументы", "ряд аргументов", "различные подпакеты", "основные службы", "ключевой компонент", "играет роль", "представляет собой", если после них нет конкретики.
-Не добавляй очевидные метафразы о том, что ответ основан на контексте или на видимом фрагменте, если это ничего не добавляет по сути.
-Если сущность не найдена, остановись на факте not_found и не объясняй её предполагаемое назначение по одному только названию.
-Не выводи пустые разделы, пустые списки и формулировки вида "кандидатов нет", если это не помогает ответу.
-
-Найди точки входа, обработчики запуска или важные entrypoints.
-Для подтверждённых HTTP route сначала называй их в прикладном виде: HTTP method и route path, например `GET /health`.
-Затем коротко добавляй, где route объявлен и какой handler, функция, метод или контекст его обслуживает, если это видно.
-Если во входе есть `required_entrypoints`, каждый такой route должен быть явно назван в ответе в виде `METHOD /path`.
-Если во входе есть `confirmed_entrypoints` с `query_match=true`, не пиши, что route не найден, пока не перечислишь эти совпавшие подтверждённые route.
-Подтверждённые entrypoints перечисляй первыми.
-Кандидатов без явного route marker упоминай только если они действительно полезны, и явно помечай как кандидатов.
-Не своди ответ к обсуждению декораторов вроде `@app.get`; пользователю важнее method, path и контекст.
-Не используй искусственные секции, если ответ можно дать компактным списком или коротким абзацем.
-Если кандидатов нет, не создавай отдельную строку или блок про их отсутствие.
-Не заменяй `GET /health` абстрактной формулой вроде "route для health-check"; сначала всегда пиши method и path.
diff --git a/src/app/modules/agent/prompts/code_qa_find_tests_answer.txt b/src/app/modules/agent/prompts/code_qa_find_tests_answer.txt
deleted file mode 100644
index 9b467a2..0000000
--- a/src/app/modules/agent/prompts/code_qa_find_tests_answer.txt
+++ /dev/null
@@ -1,24 +0,0 @@
-Ты инженер, который ищет тестовое покрытие и различает прямые и косвенные тесты.
-
-Отвечай только по коду и структуре проекта, которые есть в контексте.
-Пиши естественным инженерным языком, без искусственных markdown-секций и без повторов одной и той же мысли.
-Если ответ можно дать в 1-3 фразах, не раздувай его.
-Упоминай файлы, классы, функции, методы и связи только если они реально присутствуют в извлечённых данных.
-Каждое содержательное утверждение по возможности привязывай к конкретному наблюдаемому имени или факту из контекста: пути файла, имени класса, функции, метода, аргумента, поля, route path, вызова или связи.
-Если конкретные имена, параметры, вызовы или связи не видны, прямо скажи, чего именно не видно, вместо общих формулировок.
-Не вводи новые сущности, зависимости или сценарии, которых нет в контексте.
-Явно различай подтверждённые факты и осторожные выводы по косвенным признакам.
-Если данных мало, честно скажи об этом вместо общего обзора.
-Не используй жирные заголовки блоков, если пользователь их не просил.
-Строго соблюдай контракт sub-intent и не подменяй локальный ответ архитектурным обзором.
-Избегай расплывчатых и пустых формулировок вроде: "различные аргументы", "ряд аргументов", "различные подпакеты", "основные службы", "ключевой компонент", "играет роль", "представляет собой", если после них нет конкретики.
-Не добавляй очевидные метафразы о том, что ответ основан на контексте или на видимом фрагменте, если это ничего не добавляет по сути.
-Если сущность не найдена, остановись на факте not_found и не объясняй её предполагаемое назначение по одному только названию.
-Не выводи пустые разделы, пустые списки и формулировки вида "кандидатов нет", если это не помогает ответу.
-
-Найди связанные тесты и ответь, где они расположены.
-Сначала назови прямые тесты, только если связь с сущностью подтверждается именем, импортом, вызовом или проверяемым поведением.
-Если прямых тестов нет, прямо скажи это и только потом упомяни ближайшие косвенные тесты, если они есть.
-Коротко поясни, что именно проверяется.
-Не выдавай косвенные совпадения за подтверждённое покрытие и не используй отчётные секции без нужды.
-Если косвенных тестов тоже нет, не добавляй отдельный пустой блок про их отсутствие.
diff --git a/src/app/modules/agent/prompts/code_qa_general_answer.txt b/src/app/modules/agent/prompts/code_qa_general_answer.txt
deleted file mode 100644
index 6984c44..0000000
--- a/src/app/modules/agent/prompts/code_qa_general_answer.txt
+++ /dev/null
@@ -1,24 +0,0 @@
-Ты senior Python-инженер, который даёт обзорный ответ по подсистеме или проекту, но остаётся строго привязанным к коду из контекста.
-
-Отвечай только по коду и структуре проекта, которые есть в контексте.
-Пиши естественным инженерным языком, без искусственных markdown-секций и без повторов одной и той же мысли.
-Если ответ можно дать в 1-3 фразах, не раздувай его.
-Упоминай файлы, классы, функции, методы и связи только если они реально присутствуют в извлечённых данных.
-Каждое содержательное утверждение по возможности привязывай к конкретному наблюдаемому имени или факту из контекста: пути файла, имени класса, функции, метода, аргумента, поля, route path, вызова или связи.
-Если конкретные имена, параметры, вызовы или связи не видны, прямо скажи, чего именно не видно, вместо общих формулировок.
-Не вводи новые сущности, зависимости или сценарии, которых нет в контексте.
-Явно различай подтверждённые факты и осторожные выводы по косвенным признакам.
-Если данных мало, честно скажи об этом вместо общего обзора.
-Не используй жирные заголовки блоков, если пользователь их не просил.
-Строго соблюдай контракт sub-intent и не подменяй локальный ответ архитектурным обзором.
-Избегай расплывчатых и пустых формулировок вроде: "различные аргументы", "ряд аргументов", "различные подпакеты", "основные службы", "ключевой компонент", "играет роль", "представляет собой", если после них нет конкретики.
-Не добавляй очевидные метафразы о том, что ответ основан на контексте или на видимом фрагменте, если это ничего не добавляет по сути.
-Если сущность не найдена, остановись на факте not_found и не объясняй её предполагаемое назначение по одному только названию.
-Не выводи пустые разделы, пустые списки и формулировки вида "кандидатов нет", если это не помогает ответу.
-
-Дай обзорный ответ по вопросу пользователя о коде, подсистеме или сценарии работы.
-Сначала скажи, что можно уверенно подтвердить по коду, затем коротко укажи, какие файлы, классы, функции или route это подтверждают.
-Если данных недостаточно, прямо скажи, чего именно не хватает.
-Не подменяй обзор общими рассуждениями о типичной архитектуре таких систем.
-Не используй секции без необходимости.
-Не заполняй пробелы общими словами вроде "несколько модулей", "различные компоненты" или "ряд зависимостей", если конкретные имена не видны.
diff --git a/src/app/modules/agent/prompts/code_qa_open_file_answer.txt b/src/app/modules/agent/prompts/code_qa_open_file_answer.txt
deleted file mode 100644
index d9f86c6..0000000
--- a/src/app/modules/agent/prompts/code_qa_open_file_answer.txt
+++ /dev/null
@@ -1,26 +0,0 @@
-Ты технический ассистент, который помогает открыть конкретный файл и показать, что в нём реально видно.
-
-Отвечай только по коду и структуре проекта, которые есть в контексте.
-Пиши естественным инженерным языком, без искусственных markdown-секций и без повторов одной и той же мысли.
-Если ответ можно дать в 1-3 фразах, не раздувай его.
-Упоминай файлы, классы, функции, методы и связи только если они реально присутствуют в извлечённых данных.
-Каждое содержательное утверждение по возможности привязывай к конкретному наблюдаемому имени или факту из контекста: пути файла, имени класса, функции, метода, аргумента, поля, route path, вызова или связи.
-Если конкретные имена, параметры, вызовы или связи не видны, прямо скажи, чего именно не видно, вместо общих формулировок.
-Не вводи новые сущности, зависимости или сценарии, которых нет в контексте.
-Явно различай подтверждённые факты и осторожные выводы по косвенным признакам.
-Если данных мало, честно скажи об этом вместо общего обзора.
-Не используй жирные заголовки блоков, если пользователь их не просил.
-Строго соблюдай контракт sub-intent и не подменяй локальный ответ архитектурным обзором.
-Избегай расплывчатых и пустых формулировок вроде: "различные аргументы", "ряд аргументов", "различные подпакеты", "основные службы", "ключевой компонент", "играет роль", "представляет собой", если после них нет конкретики.
-Не добавляй очевидные метафразы о том, что ответ основан на контексте или на видимом фрагменте, если это ничего не добавляет по сути.
-Если сущность не найдена, остановись на факте not_found и не объясняй её предполагаемое назначение по одному только названию.
-Не выводи пустые разделы, пустые списки и формулировки вида "кандидатов нет", если это не помогает ответу.
-
-Сосредоточься на указанном файле и отвечай коротко.
-Обычно достаточно назвать путь файла и в 1-3 фразах сказать, какие конкретные сущности или элементы видны: класс, функция, метод, импорт, route, константа.
-Не используй общие описания файла без конкретных имён.
-Если в контексте виден только фрагмент файла, не добавляй общую фразу про то, что ответ основан на видимом фрагменте. Вместо этого просто ограничься тем, что реально видно.
-Не превращай ответ в архитектурный обзор проекта.
-Не используй секции и подзаголовки.
-Если файла нет, ответь одной короткой фразой: `Файл <path> не найден.`
-Не придумывай анализ отсутствующего файла.
diff --git a/src/app/modules/agent/prompts/code_qa_repair_answer.txt b/src/app/modules/agent/prompts/code_qa_repair_answer.txt
deleted file mode 100644
index ca3b60f..0000000
--- a/src/app/modules/agent/prompts/code_qa_repair_answer.txt
+++ /dev/null
@@ -1,9 +0,0 @@
-Ты исправляешь черновой ответ по коду после проверки groundedness.
-Сделай ответ короче, точнее и строже по evidence payload.
-Если проверка требует not_found или degraded формулировку, отрази это явно и убери спекуляции.
-Если в `repair_focus` есть причины для `EXPLAIN`, перепиши ответ так, чтобы он назвал concrete methods, calls, fields, constructor args или dependencies из payload, а не общие responsibilities.
-Если в `repair_focus` есть причины для `ARCHITECTURE`, перепиши ответ так, чтобы он назвал concrete components и связи с relation verbs из payload: создает, вызывает, читает, записывает, импортирует, наследует.
-Если в `repair_focus` есть причины для `TRACE_FLOW`, перепиши ответ как последовательность concrete steps с явными methods/calls/edges из payload. Если виден только partial flow, так и скажи.
-Если в `repair_focus` есть `semantic_labels_without_code_edges`, убери semantic role labels из основной формулировки, если они не подкреплены concrete code edges.
-Если в `repair_focus` есть `contains_retrieval_artifacts` или `methods_as_primary_components`, убери raw retrieval labels и не выдавай методы за компоненты.
-Если в `repair_focus` есть `overclaims_trace_completeness`, убери фразы про полный/полностью восстановленный flow, если payload не подтверждает это явно.
diff --git a/src/app/modules/agent/prompts/code_qa_trace_flow_answer.txt b/src/app/modules/agent/prompts/code_qa_trace_flow_answer.txt
deleted file mode 100644
index 88c201c..0000000
--- a/src/app/modules/agent/prompts/code_qa_trace_flow_answer.txt
+++ /dev/null
@@ -1,25 +0,0 @@
-You are an engineer who reconstructs call flow and data movement strictly from a provable chain in the context.
-
-Answer only from the code and project structure present in the context.
-Write in natural engineering language, without artificial markdown sections and without repeating the same point.
-If the answer fits in 1-3 sentences, do not pad it.
-Mention files, classes, functions, methods, and relations only if they are actually present in the retrieved data.
-Where possible, tie every substantive claim to a specific observable name or fact from the context: a file path, class name, function, method, argument, field, route path, call, or relation.
-If specific names, parameters, calls, or relations are not visible, say explicitly what is not visible instead of using general wording.
-Do not introduce new entities, dependencies, or scenarios that are not in the context.
-Clearly distinguish confirmed facts from cautious inferences based on indirect evidence.
-If there is little data, say so honestly instead of giving a general overview.
-Do not use bold block headings unless the user asked for them.
-Strictly follow the sub-intent contract and do not replace a local answer with an architectural overview.
-Avoid vague and empty wording such as "various arguments", "a number of arguments", "various subpackages", "core services", "key component", "plays the role of", "is a", when no specifics follow.
-Do not add obvious meta-phrases saying that the answer is based on the context or on the visible fragment if they add nothing of substance.
-If an entity is not found, stop at the not_found fact and do not explain its presumed purpose from its name alone.
-Do not output empty sections, empty lists, or wording like "no candidates" if it does not help the answer.
-
-Trace the execution flow or data flow through the artifacts found.
-Build the answer around `must_mention_flow_steps`, `must_mention_calls`, and `must_mention_sequence_edges` from the payload.
-Try to describe the steps sequentially and briefly, without extra subheadings: first, then, after that, finally.
-Do not join steps if there is no direct link between them in the code or no explicitly confirmed relation in the retrieved data.
-If the flow can be reconstructed only partially, say so, relying on `fact_gaps`, and do not claim that the flow is fully reconstructed.
-Do not replace concrete steps with general phrases such as "handles the request", "passes data", or "initializes services" when a specific call, method, or route can be named.
-Do not use strong wording such as "is fully reconstructed" or "the full flow is visible" if the payload shows only part of the chain.
diff --git a/src/app/modules/agent/prompts/rag_intent_router_v2.txt b/src/app/modules/agent/prompts/rag_intent_router_v2.txt
deleted file mode 100644
index aee7599..0000000
--- a/src/app/modules/agent/prompts/rag_intent_router_v2.txt
+++ /dev/null
@@ -1,24 +0,0 @@
-You are an intent router for layered RAG.
-As input you receive a JSON object with the fields:
-- message: the user's current request
-- active_intent: the currently active dialogue intent, or null
-- last_query: the user's previous request
-- allowed_intents: the permitted intents
-
-Choose exactly one intent from allowed_intents.
-Return only JSON, without markdown or explanations.
-
-Strict response format:
-{"intent":"<one_of_allowed_intents>","confidence":<number_0_to_1>,"reason":"<short_reason>"}
-
-Rules:
-- CODE_QA: explanations of code, architecture, classes, methods, files, code blocks, and application behavior as implemented.
-- DOCS_QA: explanations of documentation, README, markdown, specs, runbooks, and documentation sections.
-- GENERATE_DOCS_FROM_CODE: a request to generate, prepare, or update documentation from code.
-- PROJECT_MISC: other project questions not clearly related to code or documentation.
-
-Priority:
-- If the user asks specifically to prepare documentation from code, choose GENERATE_DOCS_FROM_CODE.
-- If the user asks about a specific class, file, method, or code block, choose CODE_QA.
-- If the user asks about README, docs, markdown, or specific documentation, choose DOCS_QA.
-- If the signal is unclear, choose PROJECT_MISC with confidence <= 0.6.
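Reviewer note: the deleted router prompt fixed a strict JSON reply contract (`intent`, `confidence`, `reason`). The sketch below shows one way a caller might validate such a reply. The helper name `parse_router_reply`, the `ALLOWED_INTENTS` constant, and the choice to fall back to `PROJECT_MISC` with zero confidence on malformed input are assumptions made for illustration; none of them appear in this diff.

```python
import json

# Intent names taken from the prompt's rules; the tuple itself is illustrative.
ALLOWED_INTENTS = ("CODE_QA", "DOCS_QA", "GENERATE_DOCS_FROM_CODE", "PROJECT_MISC")


def parse_router_reply(raw: str, allowed_intents=ALLOWED_INTENTS) -> dict:
    """Hypothetical helper: validate a reply shaped like the prompt's strict format."""
    # Assumed fallback policy, not taken from the repository.
    fallback = {"intent": "PROJECT_MISC", "confidence": 0.0, "reason": "invalid_reply"}
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return fallback
    if not isinstance(data, dict) or data.get("intent") not in allowed_intents:
        return fallback
    confidence = data.get("confidence")
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        return fallback
    return {
        "intent": data["intent"],
        # Clamp to the 0..1 range that the prompt asks for.
        "confidence": min(1.0, max(0.0, float(confidence))),
        "reason": str(data.get("reason", "")),
    }
```

Clamping rather than rejecting out-of-range confidence is a design choice in this sketch; a stricter caller could treat it as an invalid reply instead.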
diff --git a/src/app/modules/agent/runtime/__init__.py b/src/app/modules/agent/runtime/__init__.py
new file mode 100644
index 0000000..e94a99f
--- /dev/null
+++ b/src/app/modules/agent/runtime/__init__.py
@@ -0,0 +1,20 @@
+"""Public runtime API: orchestrates routing → retrieval → evidence gate → answer generation."""
+
+from app.modules.agent.runtime.executor import AgentRuntimeExecutor
+from app.modules.agent.runtime.models import (
+    RuntimeDraftAnswer,
+    RuntimeExecutionState,
+    RuntimeFinalResult,
+)
+from app.modules.agent.runtime.steps.gates.post.post_gate import RuntimeValidationResult
+from app.modules.agent.runtime.steps.retrieval import RuntimeRepoContextFactory, RuntimeRetrievalAdapter
+
+__all__ = [
+    "AgentRuntimeExecutor",
+    "RuntimeDraftAnswer",
+    "RuntimeExecutionState",
+    "RuntimeFinalResult",
+    "RuntimeRepoContextFactory",
+    "RuntimeRetrievalAdapter",
+    "RuntimeValidationResult",
+]
diff --git a/src/app/modules/agent/code_qa_runner_adapter.py b/src/app/modules/agent/runtime/code_qa_runner_adapter.py
similarity index 82%
rename from src/app/modules/agent/code_qa_runner_adapter.py
rename to src/app/modules/agent/runtime/code_qa_runner_adapter.py
index c48e8e3..8c2f5e3 100644
--- a/src/app/modules/agent/code_qa_runner_adapter.py
+++ b/src/app/modules/agent/runtime/code_qa_runner_adapter.py
@@ -1,11 +1,11 @@
-"""Adapter from CodeQaRuntimeExecutor to the AgentRunner protocol for integration with the chat layer."""
+"""Adapter from AgentRuntimeExecutor to the AgentRunner protocol for integration with the chat layer."""
 
 from __future__ import annotations
 
 import asyncio
 import logging
 
-from app.modules.agent.code_qa_runtime import CodeQaRuntimeExecutor
+from app.modules.agent.runtime import AgentRuntimeExecutor
 from app.modules.contracts import AgentRunner
 from app.schemas.chat import TaskResultType
 
@@ -13,9 +13,9 @@ LOGGER = logging.getLogger(__name__)
 
 
 class CodeQaRunnerAdapter:
-    """AgentRunner implementation backed by CodeQaRuntimeExecutor (sync execute in an executor)."""
+    """AgentRunner implementation backed by AgentRuntimeExecutor (sync execute in an executor)."""
 
-    def __init__(self, executor: CodeQaRuntimeExecutor) -> None:
+    def __init__(self, executor: AgentRuntimeExecutor) -> None:
         self._executor = executor
 
     async def run(
diff --git a/src/app/modules/agent/code_qa_runtime/executor.py b/src/app/modules/agent/runtime/executor.py
similarity index 79%
rename from src/app/modules/agent/code_qa_runtime/executor.py
rename to src/app/modules/agent/runtime/executor.py
index 626464e..854ab42 100644
--- a/src/app/modules/agent/code_qa_runtime/executor.py
+++ b/src/app/modules/agent/runtime/executor.py
@@ -1,60 +1,62 @@
+"""Main runtime orchestrator: routing → retrieval → evidence gate → answer generation (LLM)."""
+
 from __future__ import annotations
 
 import logging
 from difflib import SequenceMatcher
 from time import perf_counter
 
-from app.modules.agent.code_qa_runtime.answer_policy import CodeQaAnswerPolicy
-from app.modules.agent.code_qa_runtime.models import CodeQaDraftAnswer, CodeQaExecutionState, CodeQaFinalResult
-from app.modules.agent.code_qa_runtime.post_gate import CodeQaPostEvidenceGate
-from app.modules.agent.code_qa_runtime.prompt_payload_builder import CodeQaPromptPayloadBuilder
-from app.modules.agent.code_qa_runtime.prompt_selector import CodeQaPromptSelector
-from app.modules.agent.code_qa_runtime.repair import CodeQaAnswerRepairService
-from app.modules.agent.code_qa_runtime.repo_context import CodeQaRepoContextFactory
-from app.modules.agent.code_qa_runtime.retrieval_adapter import CodeQaRetrievalAdapter
+from app.modules.agent.runtime.models import RuntimeDraftAnswer, RuntimeExecutionState, RuntimeFinalResult
+from app.modules.agent.intent_router_v2 import ConversationState, IntentRouterV2
+from app.modules.agent.intent_router_v2.models import SymbolResolution
+from app.modules.agent.runtime.steps.retrieval import RuntimeRetrievalAdapter, RuntimeRepoContextFactory
+from app.modules.agent.runtime.steps.context import (
+    build_answer_synthesis_input,
+    build_evidence_bundle,
+    build_retrieval_request,
+    build_retrieval_result,
+)
+from app.modules.agent.runtime.steps.gates.pre.evidence_gate import evaluate_evidence
+from app.modules.agent.runtime.steps.gates.post.post_gate import RuntimePostEvidenceGate
+from app.modules.agent.runtime.steps.answer_policy import RuntimeAnswerPolicy
+from app.modules.agent.runtime.steps.generation import RuntimePromptPayloadBuilder, RuntimePromptSelector, RuntimeAnswerGenerator
+from app.modules.agent.runtime.steps.finalization import RuntimeAnswerRepairService, assemble_final_result
 from app.modules.agent.llm import AgentLlmService
-from app.modules.rag.code_qa_pipeline.answer_synthesis import build_answer_synthesis_input
-from app.modules.rag.code_qa_pipeline.diagnostics import build_diagnostics_report
-from app.modules.rag.code_qa_pipeline.evidence_bundle_builder import build_evidence_bundle
-from app.modules.rag.code_qa_pipeline.evidence_gate import evaluate_evidence
-from app.modules.rag.code_qa_pipeline.retrieval_request_builder import build_retrieval_request
-from app.modules.rag.code_qa_pipeline.retrieval_result_builder import build_retrieval_result
-from app.modules.rag.intent_router_v2 import ConversationState, IntentRouterV2
-from app.modules.rag.intent_router_v2.models import SymbolResolution
 
 LOGGER = logging.getLogger(__name__)
 
 
-class CodeQaRuntimeExecutor:
+class AgentRuntimeExecutor:
     def __init__(
         self,
         llm: AgentLlmService | None,
         *,
         router: IntentRouterV2 | None = None,
-        retrieval: CodeQaRetrievalAdapter | None = None,
-        repo_context_factory: CodeQaRepoContextFactory | None = None,
-        prompt_selector: CodeQaPromptSelector | None = None,
-        payload_builder: CodeQaPromptPayloadBuilder | None = None,
-        answer_policy: CodeQaAnswerPolicy | None = None,
-        post_gate: CodeQaPostEvidenceGate | None = None,
+        retrieval: RuntimeRetrievalAdapter | None = None,
+        repo_context_factory: RuntimeRepoContextFactory | None = None,
+        prompt_selector: RuntimePromptSelector | None = None,
+        payload_builder: RuntimePromptPayloadBuilder | None = None,
+        answer_policy: RuntimeAnswerPolicy | None = None,
+        post_gate: RuntimePostEvidenceGate | None = None,
     ) -> None:
         self._llm = llm
         self._router = router or IntentRouterV2()
-        self._retrieval = retrieval or CodeQaRetrievalAdapter()
-        self._repo_context_factory = repo_context_factory or CodeQaRepoContextFactory()
-        self._prompt_selector = prompt_selector or CodeQaPromptSelector()
-        self._payload_builder = payload_builder or CodeQaPromptPayloadBuilder()
-        self._answer_policy = answer_policy or CodeQaAnswerPolicy()
-        self._post_gate = post_gate or CodeQaPostEvidenceGate()
-        self._repair = CodeQaAnswerRepairService(llm) if llm is not None else None
+        self._retrieval = retrieval or RuntimeRetrievalAdapter()
+        self._repo_context_factory = repo_context_factory or RuntimeRepoContextFactory()
+        self._prompt_selector = prompt_selector or RuntimePromptSelector()
+        self._payload_builder = payload_builder or RuntimePromptPayloadBuilder()
+        self._answer_policy = answer_policy or RuntimeAnswerPolicy()
+        self._post_gate = post_gate or RuntimePostEvidenceGate()
+        self._repair = RuntimeAnswerRepairService(llm) if llm is not None else None
+        self._generator = RuntimeAnswerGenerator(llm) if llm is not None else None
 
-    def execute(self, *, user_query: str, rag_session_id: str, files_map: dict[str, dict] | None = None) -> CodeQaFinalResult:
+    def execute(self, *, user_query: str, rag_session_id: str, files_map: dict[str, dict] | None = None) -> RuntimeFinalResult:
         timings_ms: dict[str, int] = {}
         runtime_trace: list[dict] = []
         answer_policy_branch = ""
         decision_reason = ""
         post_gate_snapshot: dict = {}
-        state = CodeQaExecutionState(
+        state = RuntimeExecutionState(
             user_query=user_query,
             rag_session_id=rag_session_id,
             conversation_state=ConversationState(),
@@ -164,7 +166,7 @@ class CodeQaRuntimeExecutor:
                     "output": post_gate_snapshot["output"],
                 }
             )
-            return self._finalize(
+            return assemble_final_result(
                 state,
                 draft=None,
                 final_answer=decision.answer,
@@ -175,8 +177,9 @@ class CodeQaRuntimeExecutor:
                 answer_policy_branch=answer_policy_branch,
                 decision_reason=decision_reason,
                 pre_gate_input=pre_gate_input,
-                gate_decision=gate_decision,
                 post_gate_snapshot=post_gate_snapshot,
+                resolved_target=self._resolved_target(state),
+                post_gate=self._post_gate,
             )
         if self._llm is None:
             answer_policy_branch = "llm_unavailable"
@@ -209,7 +212,7 @@ class CodeQaRuntimeExecutor:
                     "output": post_gate_snapshot["output"],
                 }
             )
-            return self._finalize(
+            return assemble_final_result(
                 state,
                 draft=None,
                 final_answer="",
@@ -220,8 +223,9 @@ class CodeQaRuntimeExecutor:
                 answer_policy_branch=answer_policy_branch,
                 decision_reason=decision_reason,
                 pre_gate_input=pre_gate_input,
-                gate_decision=gate_decision,
                 post_gate_snapshot=post_gate_snapshot,
+                resolved_target=self._resolved_target(state),
+                post_gate=self._post_gate,
             )
         state.synthesis_input = build_answer_synthesis_input(user_query, state.evidence_pack)
         prompt_name = self._prompt_selector.select(sub_intent=state.retrieval_request.sub_intent, answer_mode=state.answer_mode)
@@ -232,10 +236,10 @@ class CodeQaRuntimeExecutor:
             answer_mode=state.answer_mode,
         )
         started = perf_counter()
-        draft = CodeQaDraftAnswer(
+        draft = RuntimeDraftAnswer(
             prompt_name=prompt_name,
             prompt_payload=prompt_payload,
-            answer=self._llm.generate(prompt_name, prompt_payload, log_context="graph.project_qa.code_qa.answer").strip(),
+            answer=self._generator.generate(prompt_name, prompt_payload),
         )
         timings_ms["llm"] = self._elapsed_ms(started)
         runtime_trace.append(
@@ -312,7 +316,7 @@ class CodeQaRuntimeExecutor:
                 "output": post_gate_snapshot["output"],
             }
         )
-        return self._finalize(
+        return assemble_final_result(
             state,
             draft=draft,
             final_answer=final_answer,
@@ -324,11 +328,12 @@ class CodeQaRuntimeExecutor:
             answer_policy_branch=answer_policy_branch,
             decision_reason=decision_reason,
             pre_gate_input=pre_gate_input,
-            gate_decision=gate_decision,
             post_gate_snapshot=post_gate_snapshot,
+            resolved_target=self._resolved_target(state),
+            post_gate=self._post_gate,
         )
 
-    def _retrieve(self, state: CodeQaExecutionState) -> list[dict]:
+    def _retrieve(self, state: RuntimeExecutionState) -> list[dict]:
         assert state.retrieval_request is not None
         if state.retrieval_request.sub_intent == "OPEN_FILE" and state.retrieval_request.path_scope:
             return self._retrieval.retrieve_exact_files(
@@ -364,72 +369,10 @@ class CodeQaRuntimeExecutor:
             return {"status": "ambiguous", "resolved_symbol": None, "alternatives": close[:5], "confidence": 0.55}
         return {"status": "not_found", "resolved_symbol": None, "alternatives": close[:5], "confidence": 0.0}
 
-    def _finalize(
-        self,
-        state: CodeQaExecutionState,
-        *,
-        draft: CodeQaDraftAnswer | None,
-        final_answer: str,
-        repair_used: bool,
-        llm_used: bool,
-        validation=None,
-        timings_ms: dict[str, int] | None = None,
-        runtime_trace: list[dict] | None = None,
-        answer_policy_branch: str = "",
-        decision_reason: str = "",
-        pre_gate_input: dict | None = None,
-        gate_decision=None,
-        post_gate_snapshot: dict | None = None,
-    ) -> CodeQaFinalResult:
-        diagnostics = build_diagnostics_report(
-            router_result=state.router_result,
-            retrieval_request=state.retrieval_request,
-            retrieval_result=state.retrieval_result,
-            evidence_bundle=state.evidence_pack,
-            answer_mode=state.answer_mode,
-            timings_ms=timings_ms or {},
-            resolved_target=self._resolved_target(state),
-            answer_policy_branch=answer_policy_branch,
-            decision_reason=decision_reason,
-            evidence_gate_input=pre_gate_input or {},
-            post_evidence_gate=post_gate_snapshot or {},
-        )
-        result = CodeQaFinalResult(
-            final_answer=final_answer.strip(),
-            answer_mode=state.answer_mode,
-            repair_used=repair_used,
-            llm_used=llm_used,
-            draft_answer=draft,
-            validation=validation
-            or self._post_gate.validate(
-                answer=final_answer,
-                answer_mode=state.answer_mode,
-                degraded_message=state.degraded_message,
-                sub_intent=state.retrieval_request.sub_intent if state.retrieval_request else "",
-                user_query=state.user_query,
-                evidence_pack=state.evidence_pack,
-            ),
-            router_result=state.router_result,
-            retrieval_request=state.retrieval_request,
-            retrieval_result=state.retrieval_result,
-            evidence_pack=state.evidence_pack,
-            diagnostics=diagnostics,
-            runtime_trace=list(runtime_trace or []),
-        )
-        LOGGER.warning(
-            "code qa runtime executed: intent=%s sub_intent=%s answer_mode=%s repair_used=%s llm_used=%s",
-            state.router_result.intent,
-            state.router_result.query_plan.sub_intent,
-            result.answer_mode,
-            result.repair_used,
-            result.llm_used,
-        )
-        return result
-
     def _elapsed_ms(self, started: float) -> int:
         return max(1, round((perf_counter() - started) * 1000))
 
-    def _build_pre_gate_input(self, state: CodeQaExecutionState) -> dict:
+    def _build_pre_gate_input(self, state: RuntimeExecutionState) -> dict:
         evidence = state.evidence_pack
         retrieval = state.retrieval_result
         return {
@@ -445,7 +388,7 @@ class CodeQaRuntimeExecutor:
             "path_scope": list(state.retrieval_request.path_scope) if state.retrieval_request else [],
         }
 
-    def _resolved_target(self, state: CodeQaExecutionState) -> str | None:
+    def _resolved_target(self, state: RuntimeExecutionState) -> str | None:
         if state.evidence_pack and state.evidence_pack.resolved_target:
             return state.evidence_pack.resolved_target
         if state.retrieval_result and state.retrieval_result.resolved_symbol:
@@ -470,7 +413,7 @@ class CodeQaRuntimeExecutor:
 
     def _hydrate_entrypoint_sources(
         self,
-        state: CodeQaExecutionState,
+        state: RuntimeExecutionState,
         raw_rows: list[dict],
         retrieval_report: dict,
     ) -> tuple[list[dict], dict]:
@@ -524,7 +467,7 @@ class CodeQaRuntimeExecutor:
         merged["supplemental_requests"] = [*(base.get("supplemental_requests") or []), *(extra.get("requests") or [])]
         return merged
 
-    def _fallback_mode(self, state: CodeQaExecutionState) -> str:
+    def _fallback_mode(self, state: RuntimeExecutionState) -> str:
         status = str(state.router_result.symbol_resolution.status if state.router_result and state.router_result.symbol_resolution else "")
         if status == "ambiguous":
             return "ambiguous"
@@ -532,7 +475,7 @@ class CodeQaRuntimeExecutor:
             return "not_found"
         return "degraded"
 
-    def _fallback_answer(self, state: CodeQaExecutionState) -> str:
+    def _fallback_answer(self, state: RuntimeExecutionState) -> str:
         symbol_resolution = state.router_result.symbol_resolution if state.router_result else None
         query_plan = state.router_result.query_plan if state.router_result else None
         target = next((item for item in list(query_plan.symbol_candidates or []) if item), "запрошенная сущность") if query_plan else "запрошенная сущность"
diff --git a/src/app/modules/rag/code_qa_pipeline/pipeline.py b/src/app/modules/agent/runtime/legacy_pipeline.py
similarity index 86%
rename from src/app/modules/rag/code_qa_pipeline/pipeline.py
rename to src/app/modules/agent/runtime/legacy_pipeline.py
index e22d96f..1eb4d72 100644
--- a/src/app/modules/rag/code_qa_pipeline/pipeline.py
+++ b/src/app/modules/agent/runtime/legacy_pipeline.py
@@ -1,4 +1,4 @@
-"""Canonical test-first CODE_QA pipeline: router -> retrieval -> evidence -> synthesis -> diagnostics."""
+"""Legacy test-first CODE_QA pipeline: router → retrieval → evidence → synthesis → diagnostics. Prefer agent.runtime.executor."""
 
 from __future__ import annotations
 
@@ -7,23 +7,21 @@ from difflib import get_close_matches
 from time import perf_counter
 from typing import Any, Protocol
 
-from app.modules.rag.code_qa_pipeline.answer_synthesis import build_answer_synthesis_input
-from app.modules.rag.code_qa_pipeline.contracts import (
+from app.modules.agent.runtime.steps.context import (
+    build_answer_synthesis_input,
+    build_diagnostics_report,
+    build_evidence_bundle,
+    build_retrieval_request,
+    build_retrieval_result,
     EvidenceBundle,
     RetrievalRequest,
     RetrievalResult,
     RouterResult,
 )
-from app.modules.rag.code_qa_pipeline.diagnostics import build_diagnostics_report
-from app.modules.rag.code_qa_pipeline.evidence_bundle_builder import build_evidence_bundle
-from app.modules.rag.code_qa_pipeline.evidence_gate import evaluate_evidence
-from app.modules.rag.code_qa_pipeline.retrieval_request_builder import build_retrieval_request
-from app.modules.rag.code_qa_pipeline.retrieval_result_builder import build_retrieval_result
+from app.modules.agent.runtime.steps.gates.pre.evidence_gate import evaluate_evidence
 
 
 class RetrievalAdapter(Protocol):
-    """Protocol for retrieval in the CODE_QA pipeline; satisfied by RagDbAdapter."""
-
     def retrieve_with_plan(
         self,
         rag_session_id: str,
@@ -70,8 +68,6 @@ class RetrievalAdapter(Protocol):
 
 @dataclass(slots=True)
 class CodeQAPipelineResult:
-    """Result of one run of the canonical CODE_QA pipeline."""
-
     user_query: str
     rag_session_id: str
     router_result: RouterResult
@@ -88,7 +84,7 @@ class CodeQAPipelineResult:
 
 
 class CodeQAPipelineRunner:
-    """Single entrypoint for the test-first CODE_QA pipeline; uses IntentRouterV2 only."""
+    """Legacy test-first CODE_QA pipeline entrypoint. Prefer AgentRuntimeExecutor."""
 
     def __init__(
         self,
@@ -109,7 +105,6 @@ class CodeQAPipelineRunner:
         run_retrieval: bool = True,
         run_hydrate: bool = True,
     ) -> CodeQAPipelineResult:
-        """Run the full pipeline: route -> retrieval -> evidence bundle -> gate -> synthesis -> diagnostics."""
         timings: dict[str, int] = {}
         t0 = perf_counter()
         router_result = self._router.route(
@@ -228,10 +223,10 @@ def _ms(started: float) -> int:
 
 
 def _default_conversation_state() -> Any:
-    from app.modules.rag.intent_router_v2 import ConversationState
+    from app.modules.agent.intent_router_v2 import ConversationState
     return ConversationState()
 
 
 def _default_repo_context() -> Any:
-    from app.modules.rag.intent_router_v2 import RepoContext
+    from app.modules.agent.intent_router_v2 import RepoContext
     return RepoContext()
diff --git a/src/app/modules/agent/runtime/models.py b/src/app/modules/agent/runtime/models.py
new file mode 100644
index 0000000..daf2644
--- /dev/null
+++ b/src/app/modules/agent/runtime/models.py
@@ -0,0 +1,60 @@
+"""State and result models for the runtime orchestrator."""
+
+from __future__ import annotations
+
+from typing import Any
+
+from pydantic import BaseModel, ConfigDict, Field
+
+from app.modules.agent.runtime.steps.context.contracts import (
+    AnswerSynthesisInput,
+    DiagnosticsReport,
+    EvidenceBundle,
+    RetrievalRequest,
+    RetrievalResult,
+)
+from app.modules.agent.runtime.steps.gates.post.post_gate import RuntimeValidationResult
+from app.modules.agent.intent_router_v2.models import ConversationState, IntentRouterResult, RepoContext
+
+
+class RuntimeDraftAnswer(BaseModel):
+    model_config = ConfigDict(extra="forbid")
+
+    prompt_name: str
+    prompt_payload: str
+    answer: str = ""
+
+
+class RuntimeFinalResult(BaseModel):
+    model_config = ConfigDict(extra="forbid")
+
+    final_answer: str
+    answer_mode: str = "normal"
+    repair_used: bool = False
+    llm_used: bool = False
+    draft_answer: RuntimeDraftAnswer | None = None
+    validation: RuntimeValidationResult = Field(default_factory=RuntimeValidationResult)
+    router_result: IntentRouterResult | None = None
+    retrieval_request: RetrievalRequest | None = None
+    retrieval_result: RetrievalResult | None = None
+    evidence_pack: EvidenceBundle | None = None
+    diagnostics: DiagnosticsReport
+    runtime_trace: list[dict[str, Any]] = Field(default_factory=list)
+
+
+class RuntimeExecutionState(BaseModel):
+    model_config = ConfigDict(extra="forbid")
+
+    user_query: str
+    rag_session_id: str
+    conversation_state: ConversationState = Field(default_factory=ConversationState)
+    repo_context: RepoContext = Field(default_factory=RepoContext)
+    router_result: IntentRouterResult | None = None
+    retrieval_request: RetrievalRequest | None = None
+    retrieval_result: RetrievalResult | None = None
+    evidence_pack: EvidenceBundle | None = None
+    synthesis_input: AnswerSynthesisInput | None = None
+    diagnostics: DiagnosticsReport | None = None
+    answer_mode: str = "normal"
+    degraded_message: str = ""
+    final_result: RuntimeFinalResult | None = None
diff --git a/src/app/modules/agent/runtime/steps/answer_policy/__init__.py b/src/app/modules/agent/runtime/steps/answer_policy/__init__.py
new file mode 100644
index 0000000..6e4b61a
--- /dev/null
+++ b/src/app/modules/agent/runtime/steps/answer_policy/__init__.py
@@ -0,0 +1,6 @@
+"""Answer policy: call the LLM or return a short answer based on the evidence gate result."""
+
+from app.modules.agent.runtime.steps.answer_policy.policy import RuntimeAnswerPolicy, RuntimePolicyDecision
+from app.modules.agent.runtime.steps.answer_policy.short_answer_formatter import RuntimeShortAnswerFormatter
+
+__all__ = ["RuntimeAnswerPolicy", "RuntimePolicyDecision", "RuntimeShortAnswerFormatter"]
diff --git a/src/app/modules/agent/code_qa_runtime/answer_policy.py b/src/app/modules/agent/runtime/steps/answer_policy/policy.py
similarity index 73%
rename from src/app/modules/agent/code_qa_runtime/answer_policy.py
rename to src/app/modules/agent/runtime/steps/answer_policy/policy.py
index f4f6c7e..c2c1931 100644
--- a/src/app/modules/agent/code_qa_runtime/answer_policy.py
+++ b/src/app/modules/agent/runtime/steps/answer_policy/policy.py
@@ -1,14 +1,16 @@
+"""Decision policy: call the LLM or return a short answer based on the evidence gate."""
+
 from __future__ import annotations
 
 from dataclasses import dataclass
 
-from app.modules.rag.code_qa_pipeline.evidence_gate import EvidenceGateDecision
-from app.modules.rag.intent_router_v2.models import IntentRouterResult
-from app.modules.agent.code_qa_runtime.short_answer_formatter import CodeQaShortAnswerFormatter
+from app.modules.agent.runtime.steps.gates.pre.evidence_gate import EvidenceGateDecision
+from app.modules.agent.intent_router_v2.models import IntentRouterResult
+from app.modules.agent.runtime.steps.answer_policy.short_answer_formatter import RuntimeShortAnswerFormatter
 
 
 @dataclass(slots=True, frozen=True)
-class CodeQaPolicyDecision:
+class RuntimePolicyDecision:
     answer_mode: str
     answer: str = ""
     should_call_llm: bool = True
@@ -16,22 +18,22 @@ class CodeQaPolicyDecision:
     reason: str = "evidence_sufficient"
 
 
-class CodeQaAnswerPolicy:
-    def __init__(self, formatter: CodeQaShortAnswerFormatter | None = None) -> None:
-        self._formatter = formatter or CodeQaShortAnswerFormatter()
+class RuntimeAnswerPolicy:
+    def __init__(self, formatter: RuntimeShortAnswerFormatter | None = None) -> None:
+        self._formatter = formatter or RuntimeShortAnswerFormatter()
 
     def decide(
         self,
         *,
         router_result: IntentRouterResult,
         gate_decision: EvidenceGateDecision,
-    ) -> CodeQaPolicyDecision:
+    ) -> RuntimePolicyDecision:
         sub_intent = router_result.query_plan.sub_intent.upper()
         symbol_resolution = router_result.symbol_resolution
         if sub_intent == "OPEN_FILE" and "path_scope_empty" in gate_decision.failure_reasons:
             path_scope = list(getattr(router_result.retrieval_spec.filters, "path_scope", []) or [])
             target = path_scope[0] if path_scope else "запрошенный файл"
-            return CodeQaPolicyDecision(
+            return RuntimePolicyDecision(
                 answer_mode="not_found",
                 answer=self._formatter.open_file_not_found(target),
                 should_call_llm=False,
@@ -39,7 +41,7 @@ class CodeQaAnswerPolicy:
                 reason="path_scope_empty",
             )
         if sub_intent == "EXPLAIN" and symbol_resolution.status == "not_found":
-            return CodeQaPolicyDecision(
+            return RuntimePolicyDecision(
                 answer_mode="not_found",
                 answer=self._formatter.entity_not_found(self._target_label(router_result), symbol_resolution.alternatives),
                 should_call_llm=False,
@@ -47,7 +49,7 @@ class CodeQaAnswerPolicy:
                 reason="symbol_resolution_not_found",
             )
         if sub_intent == "EXPLAIN" and symbol_resolution.status == "ambiguous":
-            return CodeQaPolicyDecision(
+            return RuntimePolicyDecision(
                 answer_mode="ambiguous",
                 answer=self._formatter.entity_ambiguous(self._target_label(router_result), symbol_resolution.alternatives),
                 should_call_llm=False,
@@ -57,14 +59,14 @@ class CodeQaAnswerPolicy:
         if not gate_decision.passed:
             answer_mode = "insufficient" if "insufficient_evidence" in gate_decision.failure_reasons else "degraded"
             reason = gate_decision.failure_reasons[0] if gate_decision.failure_reasons else "evidence_gate_failed"
-            return CodeQaPolicyDecision(
+            return RuntimePolicyDecision(
                 answer_mode=answer_mode,
                 answer=self._formatter.insufficient(gate_decision.degraded_message),
                 should_call_llm=False,
                 branch="evidence_gate_short_circuit",
                 reason=reason,
             )
-        return CodeQaPolicyDecision(answer_mode="normal", branch="normal_answer", reason="evidence_sufficient")
+        return RuntimePolicyDecision(answer_mode="normal", branch="normal_answer", reason="evidence_sufficient")
 
     def _target_label(self, router_result: IntentRouterResult) -> str:
         candidates = [item.strip() for item in list(router_result.query_plan.symbol_candidates or []) if item and item.strip()]
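Reviewer note: the hunks above show `RuntimeAnswerPolicy.decide` checking its conditions in a fixed order. The sketch below restates that order in a self-contained form. `GateSketch` and `decide_sketch` are hypothetical stand-ins for `EvidenceGateDecision` and `decide`; they model only the fields visible in this diff and return just the answer mode and the LLM flag, not the full `RuntimePolicyDecision`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateSketch:
    """Hypothetical stand-in for EvidenceGateDecision (only the fields used here)."""

    passed: bool
    failure_reasons: tuple = ()


def decide_sketch(sub_intent: str, symbol_status: str, gate: GateSketch) -> tuple:
    """Return (answer_mode, should_call_llm), checking conditions in the diff's order."""
    sub_intent = sub_intent.upper()
    # 1. OPEN_FILE with an empty path scope short-circuits to not_found.
    if sub_intent == "OPEN_FILE" and "path_scope_empty" in gate.failure_reasons:
        return ("not_found", False)
    # 2-3. EXPLAIN with an unresolved or ambiguous symbol never reaches the LLM.
    if sub_intent == "EXPLAIN" and symbol_status in ("not_found", "ambiguous"):
        return (symbol_status, False)
    # 4. Any other gate failure yields a short insufficient/degraded answer.
    if not gate.passed:
        mode = "insufficient" if "insufficient_evidence" in gate.failure_reasons else "degraded"
        return (mode, False)
    # 5. Evidence is sufficient: proceed to LLM generation.
    return ("normal", True)
```

Note that the symbol checks run before the generic gate check, so an `EXPLAIN` query with an ambiguous symbol returns `ambiguous` even when the gate itself passed.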
diff --git a/src/app/modules/agent/code_qa_runtime/short_answer_formatter.py b/src/app/modules/agent/runtime/steps/answer_policy/short_answer_formatter.py
similarity index 90%
rename from src/app/modules/agent/code_qa_runtime/short_answer_formatter.py
rename to src/app/modules/agent/runtime/steps/answer_policy/short_answer_formatter.py
index 7cdb83a..73233d0 100644
--- a/src/app/modules/agent/code_qa_runtime/short_answer_formatter.py
+++ b/src/app/modules/agent/runtime/steps/answer_policy/short_answer_formatter.py
@@ -1,7 +1,9 @@
+"""Formatting of short answers for the not_found, ambiguous, and insufficient modes."""
+
 from __future__ import annotations
 
 
-class CodeQaShortAnswerFormatter:
+class RuntimeShortAnswerFormatter:
     def open_file_not_found(self, target: str) -> str:
         return f"Файл {target} не найден."
 
diff --git a/src/app/modules/agent/runtime/steps/context/__init__.py b/src/app/modules/agent/runtime/steps/context/__init__.py
new file mode 100644
index 0000000..c3fa187
--- /dev/null
+++ b/src/app/modules/agent/runtime/steps/context/__init__.py
@@ -0,0 +1,35 @@
+"""Pipeline context contracts and builders: retrieval request/result, evidence bundle, diagnostics, synthesis."""
+
+from app.modules.agent.runtime.steps.context.contracts import (
+    AnswerSynthesisInput,
+    CodeChunkItem,
+    DiagnosticsReport,
+    EvidenceBundle,
+    FailureReason,
+    RetrievalRequest,
+    RetrievalResult,
+    RouterResult,
+)
+from app.modules.agent.runtime.steps.context.retrieval_request_builder import build_retrieval_request
+from app.modules.agent.runtime.steps.context.retrieval_result_builder import build_retrieval_result
+from app.modules.agent.runtime.steps.context.evidence_bundle_builder import build_evidence_bundle
+from app.modules.agent.runtime.steps.context.diagnostics import build_diagnostics_report
+from app.modules.agent.runtime.steps.context.answer_synthesis import build_answer_synthesis_input
+from app.modules.agent.runtime.steps.context.answer_fact_curator import build_curated_answer_facts
+
+__all__ = [
+    "AnswerSynthesisInput",
+    "CodeChunkItem",
+    "DiagnosticsReport",
+    "EvidenceBundle",
+    "FailureReason",
+    "RetrievalRequest",
+    "RetrievalResult",
+    "RouterResult",
+    "build_retrieval_request",
+    "build_retrieval_result",
+    "build_evidence_bundle",
+    "build_diagnostics_report",
+    "build_answer_synthesis_input",
+    "build_curated_answer_facts",
+]
diff --git a/src/app/modules/rag/code_qa_pipeline/answer_fact_curator.py b/src/app/modules/agent/runtime/steps/context/answer_fact_curator.py
similarity index 98%
rename from src/app/modules/rag/code_qa_pipeline/answer_fact_curator.py
rename to src/app/modules/agent/runtime/steps/context/answer_fact_curator.py
index 59038db..a006770 100644
--- a/src/app/modules/rag/code_qa_pipeline/answer_fact_curator.py
+++ b/src/app/modules/agent/runtime/steps/context/answer_fact_curator.py
@@ -1,9 +1,11 @@
+"""Curation of facts from EvidenceBundle for the explain, architecture, and trace_flow scenarios."""
+
 from __future__ import annotations
 
 import re
 from typing import Any
 
-from app.modules.rag.code_qa_pipeline.contracts import CodeChunkItem, EvidenceBundle
+from app.modules.agent.runtime.steps.context.contracts import CodeChunkItem, EvidenceBundle
 
 _CALL_RE = re.compile(r"([A-Za-z_][\w\.]*)\s*\(")
 _FIELD_RE = re.compile(r"self\.(\w+)")
diff --git a/src/app/modules/rag/code_qa_pipeline/answer_synthesis.py b/src/app/modules/agent/runtime/steps/context/answer_synthesis.py
similarity index 88%
rename from src/app/modules/rag/code_qa_pipeline/answer_synthesis.py
rename to src/app/modules/agent/runtime/steps/context/answer_synthesis.py
index 0ce308a..c566b58 100644
--- a/src/app/modules/rag/code_qa_pipeline/answer_synthesis.py
+++ b/src/app/modules/agent/runtime/steps/context/answer_synthesis.py
@@ -1,16 +1,15 @@
-"""Builds AnswerSynthesisInput from EvidenceBundle for LLM stage."""
+"""Build AnswerSynthesisInput from EvidenceBundle for the LLM stage."""
 
 from __future__ import annotations
 
-from app.modules.rag.code_qa_pipeline.answer_fact_curator import build_curated_answer_facts
-from app.modules.rag.code_qa_pipeline.contracts import AnswerSynthesisInput, EvidenceBundle
+from app.modules.agent.runtime.steps.context.answer_fact_curator import build_curated_answer_facts
+from app.modules.agent.runtime.steps.context.contracts import AnswerSynthesisInput, EvidenceBundle
 
 
 def build_answer_synthesis_input(
     user_question: str,
     bundle: EvidenceBundle,
 ) -> AnswerSynthesisInput:
-    """Build LLM input from EvidenceBundle; fast context (summary) + deep context (payload)."""
     scenario = bundle.resolved_sub_intent or "EXPLAIN"
     target = bundle.resolved_target
     sufficient = bundle.sufficient
diff --git a/src/app/modules/rag/code_qa_pipeline/contracts.py b/src/app/modules/agent/runtime/steps/context/contracts.py
similarity index 80%
rename from src/app/modules/rag/code_qa_pipeline/contracts.py
rename to src/app/modules/agent/runtime/steps/context/contracts.py
index 0fa3253..e8a5972 100644
--- a/src/app/modules/rag/code_qa_pipeline/contracts.py
+++ b/src/app/modules/agent/runtime/steps/context/contracts.py
@@ -1,9 +1,4 @@
-"""Typed contracts for the canonical CODE_QA test-first pipeline.
-
-Defines RouterResult, RetrievalRequest, RetrievalResult, EvidenceBundle,
-AnswerSynthesisInput, DiagnosticsReport and machine-readable failure reasons.
-Used only by the test-first pipeline; legacy runtime is unchanged.
-"""
+"""Typed pipeline contracts: RouterResult, RetrievalRequest, RetrievalResult, EvidenceBundle, AnswerSynthesisInput, DiagnosticsReport."""
 
 from __future__ import annotations
 
@@ -11,14 +6,10 @@ from typing import Any, Literal
 
 from pydantic import BaseModel, ConfigDict, Field
 
-# Re-export for pipeline: router output is IntentRouterResult
-from app.modules.rag.intent_router_v2.models import IntentRouterResult
+from app.modules.agent.intent_router_v2.models import IntentRouterResult
 
-# Type alias: router output is the source of truth for the test pipeline
 RouterResult = IntentRouterResult
 
-
-# --- Machine-readable failure reasons for diagnostics ---
 FailureReason = Literal[
     "router_low_confidence",
     "target_not_resolved",
@@ -45,8 +36,6 @@ FAILURE_REASONS: tuple[FailureReason, ...] = (
 
 
 class RetrievalRequest(BaseModel):
-    """Request for retrieval stage; built from RouterResult."""
-
     model_config = ConfigDict(extra="forbid")
 
     rag_session_id: str
@@ -56,15 +45,12 @@ class RetrievalRequest(BaseModel):
     keyword_hints: list[str] = Field(default_factory=list)
     symbol_candidates: list[str] = Field(default_factory=list)
     requested_layers: list[str] = Field(default_factory=list)
-    # Pass-through for existing adapter (retrieval_spec, retrieval_constraints, query_plan)
     retrieval_spec: Any = None
     retrieval_constraints: Any = None
     query_plan: Any = None
 
 
 class CodeChunkItem(BaseModel):
-    """Single code chunk in normalized retrieval output."""
-
     model_config = ConfigDict(extra="forbid")
 
     layer: str
@@ -77,8 +63,6 @@ class CodeChunkItem(BaseModel):
 
 
 class LayerOutcome(BaseModel):
-    """Per-layer retrieval outcome for diagnostics."""
-
     model_config = ConfigDict(extra="forbid")
 
     layer_id: str
@@ -88,8 +72,6 @@ class LayerOutcome(BaseModel):
 
 
 class RetrievalResult(BaseModel):
-    """Normalized retrieval result; single structure for all scenarios."""
-
     model_config = ConfigDict(extra="forbid")
 
     target_symbol_candidates: list[str] = Field(default_factory=list)
@@ -108,8 +90,6 @@ class RetrievalResult(BaseModel):
 
 
 class EvidenceBundle(BaseModel):
-    """Canonical evidence bundle for answer synthesis; only source of truth for LLM stage."""
-
     model_config = ConfigDict(extra="forbid")
 
     resolved_intent: str = ""
@@ -129,8 +109,6 @@ class EvidenceBundle(BaseModel):
 
 
 class AnswerSynthesisInput(BaseModel):
-    """Input for LLM answer synthesis; derived from EvidenceBundle."""
-
     model_config = ConfigDict(extra="forbid")
 
     user_question: str = ""
@@ -146,11 +124,8 @@ class AnswerSynthesisInput(BaseModel):
 
 
 class DiagnosticsReport(BaseModel):
-    """Full diagnostics for the pipeline; Level 1 summary + Level 2 detail."""
-
     model_config = ConfigDict(extra="forbid")
 
-    # Level 1 — human-readable summary
     intent_correct: bool | None = None
     target_found: bool = False
     layers_used: list[str] = Field(default_factory=list)
@@ -160,7 +135,6 @@ class DiagnosticsReport(BaseModel):
     answer_policy_branch: str = ""
     decision_reason: str = ""
 
-    # Level 2 — detailed
     router_result: dict[str, Any] = Field(default_factory=dict)
     retrieval_request: dict[str, Any] = Field(default_factory=dict)
     per_layer_outcome: list[dict[str, Any]] = Field(default_factory=list)
diff --git a/src/app/modules/rag/code_qa_pipeline/diagnostics.py b/src/app/modules/agent/runtime/steps/context/diagnostics.py
similarity index 93%
rename from src/app/modules/rag/code_qa_pipeline/diagnostics.py
rename to src/app/modules/agent/runtime/steps/context/diagnostics.py
index 74fe501..b33cb37 100644
--- a/src/app/modules/rag/code_qa_pipeline/diagnostics.py
+++ b/src/app/modules/agent/runtime/steps/context/diagnostics.py
@@ -1,10 +1,10 @@
-"""Diagnostics for the CODE_QA pipeline: Level 1 summary and Level 2 detail."""
+"""CODE_QA pipeline diagnostics: level 1 summary and level 2 details."""
 
 from __future__ import annotations
 
 from typing import Any
 
-from app.modules.rag.code_qa_pipeline.contracts import (
+from app.modules.agent.runtime.steps.context.contracts import (
     DiagnosticsReport,
     EvidenceBundle,
     RetrievalRequest,
@@ -27,7 +27,6 @@ def build_diagnostics_report(
     evidence_gate_input: dict[str, Any] | None = None,
     post_evidence_gate: dict[str, Any] | None = None,
 ) -> DiagnosticsReport:
-    """Build full diagnostics: Level 1 summary + Level 2 detail + failure reasons."""
     timings = dict(timings_ms or {})
     req = retrieval_request
     res = retrieval_result
@@ -83,7 +82,6 @@ def build_diagnostics_report(
 
 
 def build_level1_summary(report: DiagnosticsReport) -> dict[str, Any]:
-    """Human-readable summary: intent, target, layers, sufficiency, answer mode."""
     return {
         "intent_correct": report.intent_correct,
         "target_found": report.target_found,
@@ -98,7 +96,6 @@ def build_level1_summary(report: DiagnosticsReport) -> dict[str, Any]:
 
 
 def build_level2_detail(report: DiagnosticsReport) -> dict[str, Any]:
-    """Detailed diagnostics for tuning and tests."""
     return {
         "router_result": report.router_result,
         "retrieval_request": report.retrieval_request,
diff --git a/src/app/modules/rag/code_qa_pipeline/evidence_bundle_builder.py b/src/app/modules/agent/runtime/steps/context/evidence_bundle_builder.py
similarity index 89%
rename from src/app/modules/rag/code_qa_pipeline/evidence_bundle_builder.py
rename to src/app/modules/agent/runtime/steps/context/evidence_bundle_builder.py
index 7c8841d..8edeff7 100644
--- a/src/app/modules/rag/code_qa_pipeline/evidence_bundle_builder.py
+++ b/src/app/modules/agent/runtime/steps/context/evidence_bundle_builder.py
@@ -1,15 +1,14 @@
-"""Builds EvidenceBundle from RetrievalResult and router context."""
+"""Build EvidenceBundle from RetrievalResult and the router result."""
 
 from __future__ import annotations
 
-from app.modules.rag.code_qa_pipeline.contracts import EvidenceBundle, RetrievalResult, RouterResult
+from app.modules.agent.runtime.steps.context.contracts import EvidenceBundle, RetrievalResult, RouterResult
 
 
 def build_evidence_bundle(
     retrieval_result: RetrievalResult,
     router_result: RouterResult,
 ) -> EvidenceBundle:
-    """Build EvidenceBundle from normalized retrieval and router result."""
     intent = router_result.intent or "CODE_QA"
     sub_intent = (router_result.query_plan and router_result.query_plan.sub_intent) or "EXPLAIN"
     resolved_target: str | None = None
diff --git a/src/app/modules/rag/code_qa_pipeline/retrieval_request_builder.py b/src/app/modules/agent/runtime/steps/context/retrieval_request_builder.py
similarity index 78%
rename from src/app/modules/rag/code_qa_pipeline/retrieval_request_builder.py
rename to src/app/modules/agent/runtime/steps/context/retrieval_request_builder.py
index b8bbdc7..05b8835 100644
--- a/src/app/modules/rag/code_qa_pipeline/retrieval_request_builder.py
+++ b/src/app/modules/agent/runtime/steps/context/retrieval_request_builder.py
@@ -1,12 +1,11 @@
-"""Builds RetrievalRequest from RouterResult for the CODE_QA pipeline."""
+"""Build RetrievalRequest from RouterResult for the CODE_QA pipeline."""
 
 from __future__ import annotations
 
-from app.modules.rag.code_qa_pipeline.contracts import RetrievalRequest, RouterResult
+from app.modules.agent.runtime.steps.context.contracts import RetrievalRequest, RouterResult
 
 
 def build_retrieval_request(router_result: RouterResult, rag_session_id: str) -> RetrievalRequest:
-    """Convert router output to RetrievalRequest; router is source of truth."""
     query_plan = router_result.query_plan
     spec = router_result.retrieval_spec
     path_scope = list(getattr(spec.filters, "path_scope", []) or [])
diff --git a/src/app/modules/rag/code_qa_pipeline/retrieval_result_builder.py b/src/app/modules/agent/runtime/steps/context/retrieval_result_builder.py
similarity index 97%
rename from src/app/modules/rag/code_qa_pipeline/retrieval_result_builder.py
rename to src/app/modules/agent/runtime/steps/context/retrieval_result_builder.py
index fb19c1f..7a4e1cd 100644
--- a/src/app/modules/rag/code_qa_pipeline/retrieval_result_builder.py
+++ b/src/app/modules/agent/runtime/steps/context/retrieval_result_builder.py
@@ -1,10 +1,10 @@
-"""Builds normalized RetrievalResult from raw retrieval rows and report."""
+"""Build a normalized RetrievalResult from raw retrieval rows and the report."""
 
 from __future__ import annotations
 
 import re
 
-from app.modules.rag.code_qa_pipeline.contracts import CodeChunkItem, LayerOutcome, RetrievalResult
+from app.modules.agent.runtime.steps.context.contracts import CodeChunkItem, LayerOutcome, RetrievalResult
 from app.modules.rag.retrieval.test_filter import is_test_path
 
 _ROUTE_RE = re.compile(r'@[\w\.]+\.(get|post|put|delete|patch|options|head)\(\s*["\']([^"\']+)["\']')
@@ -16,7 +16,6 @@ def build_retrieval_result(
     retrieval_report: dict | None,
     symbol_resolution: dict | None,
 ) -> RetrievalResult:
-    """Convert raw adapter rows and optional report into normalized RetrievalResult."""
     report = retrieval_report or {}
     sym = symbol_resolution or {}
     layers_seen: set[str] = set()
diff --git a/src/app/modules/agent/runtime/steps/explain/__init__.py b/src/app/modules/agent/runtime/steps/explain/__init__.py
new file mode 100644
index 0000000..59bf6a9
--- /dev/null
+++ b/src/app/modules/agent/runtime/steps/explain/__init__.py
@@ -0,0 +1,36 @@
+from __future__ import annotations
+
+from importlib import import_module
+
+__all__ = [
+    "CodeExcerpt",
+    "CodeExplainRetrieverV2",
+    "CodeGraphRepository",
+    "EvidenceItem",
+    "ExplainIntent",
+    "ExplainIntentBuilder",
+    "ExplainPack",
+    "LayeredRetrievalGateway",
+    "PromptBudgeter",
+    "TracePath",
+]
+
+
+def __getattr__(name: str):
+    module_map = {
+        "CodeExcerpt": "app.modules.agent.runtime.steps.explain.models",
+        "EvidenceItem": "app.modules.agent.runtime.steps.explain.models",
+        "ExplainIntent": "app.modules.agent.runtime.steps.explain.models",
+        "ExplainPack": "app.modules.agent.runtime.steps.explain.models",
+        "TracePath": "app.modules.agent.runtime.steps.explain.models",
+        "ExplainIntentBuilder": "app.modules.agent.runtime.steps.explain.intent_builder",
+        "PromptBudgeter": "app.modules.agent.runtime.steps.explain.budgeter",
+        "LayeredRetrievalGateway": "app.modules.agent.runtime.steps.explain.layered_gateway",
+        "CodeGraphRepository": "app.modules.agent.runtime.steps.explain.graph_repository",
+        "CodeExplainRetrieverV2": "app.modules.agent.runtime.steps.explain.retriever_v2",
+    }
+    module_name = module_map.get(name)
+    if module_name is None:
+        raise AttributeError(name)
+    module = import_module(module_name)
+    return getattr(module, name)
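
Review note: the new `explain/__init__.py` relies on module-level `__getattr__` (PEP 562) to defer submodule imports until a name is first accessed. The sketch below reproduces the same mechanism with standard-library targets so it can be run on its own. `lazy_facade`, `_MODULE_MAP`, and `_lazy_getattr` are hypothetical names, and building the module with `types.ModuleType` is only a device to keep the example in one file.

```python
from __future__ import annotations

import types
from importlib import import_module

# Public name -> module that actually defines it (stdlib targets for illustration).
_MODULE_MAP = {
    "sqrt": "math",
    "dumps": "json",
}


def _lazy_getattr(name: str):
    """Resolve a public name on first access by importing its defining module."""
    module_name = _MODULE_MAP.get(name)
    if module_name is None:
        # Unknown names must raise AttributeError so hasattr() and
        # "from package import name" keep their usual semantics.
        raise AttributeError(name)
    return getattr(import_module(module_name), name)


# Stand-in for a package __init__: a module object whose __dict__ holds __getattr__.
lazy_facade = types.ModuleType("lazy_facade")
lazy_facade.__all__ = sorted(_MODULE_MAP)
lazy_facade.__getattr__ = _lazy_getattr
```

Because nothing is cached on the module, every access goes through `__getattr__` again. `import_module` itself is cheap after the first import, so this is usually acceptable, though storing the resolved object in the module namespace is a common variation.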
diff --git a/src/app/modules/rag/explain/budgeter.py b/src/app/modules/agent/runtime/steps/explain/budgeter.py
similarity index 97%
rename from src/app/modules/rag/explain/budgeter.py
rename to src/app/modules/agent/runtime/steps/explain/budgeter.py
index adcddfd..4f925b5 100644
--- a/src/app/modules/rag/explain/budgeter.py
+++ b/src/app/modules/agent/runtime/steps/explain/budgeter.py
@@ -2,7 +2,7 @@ from __future__ import annotations
 
 import json
 
-from app.modules.rag.explain.models import ExplainPack
+from app.modules.agent.runtime.steps.explain.models import ExplainPack
 
 
 class PromptBudgeter:
diff --git a/src/app/modules/rag/explain/excerpt_planner.py b/src/app/modules/agent/runtime/steps/explain/excerpt_planner.py
similarity index 95%
rename from src/app/modules/rag/explain/excerpt_planner.py
rename to src/app/modules/agent/runtime/steps/explain/excerpt_planner.py
index 04f98ba..406073e 100644
--- a/src/app/modules/rag/explain/excerpt_planner.py
+++ b/src/app/modules/agent/runtime/steps/explain/excerpt_planner.py
@@ -1,6 +1,6 @@
 from __future__ import annotations
 
-from app.modules.rag.explain.models import CodeExcerpt, LayeredRetrievalItem
+from app.modules.agent.runtime.steps.explain.models import CodeExcerpt, LayeredRetrievalItem
 
 
 class ExcerptPlanner:
diff --git a/src/app/modules/rag/explain/graph_repository.py b/src/app/modules/agent/runtime/steps/explain/graph_repository.py
similarity index 98%
rename from src/app/modules/rag/explain/graph_repository.py
rename to src/app/modules/agent/runtime/steps/explain/graph_repository.py
index 65d40db..6622e7e 100644
--- a/src/app/modules/rag/explain/graph_repository.py
+++ b/src/app/modules/agent/runtime/steps/explain/graph_repository.py
@@ -4,7 +4,7 @@ import json
 
 from sqlalchemy import text
 
-from app.modules.rag.explain.models import CodeLocation, LayeredRetrievalItem
+from app.modules.agent.runtime.steps.explain.models import CodeLocation, LayeredRetrievalItem
 from app.modules.shared.db import get_engine
 
 
diff --git a/src/app/modules/rag/explain/intent_builder.py b/src/app/modules/agent/runtime/steps/explain/intent_builder.py
similarity index 97%
rename from src/app/modules/rag/explain/intent_builder.py
rename to src/app/modules/agent/runtime/steps/explain/intent_builder.py
index cd4cc3b..79fb59a 100644
--- a/src/app/modules/rag/explain/intent_builder.py
+++ b/src/app/modules/agent/runtime/steps/explain/intent_builder.py
@@ -2,7 +2,7 @@ from __future__ import annotations
 
 import re
 
-from app.modules.rag.explain.models import ExplainHints, ExplainIntent
+from app.modules.agent.runtime.steps.explain.models import ExplainHints, ExplainIntent
 from app.modules.rag.retrieval.query_terms import extract_query_terms
 
 
diff --git a/src/app/modules/rag/explain/layered_gateway.py b/src/app/modules/agent/runtime/steps/explain/layered_gateway.py
similarity index 99%
rename from src/app/modules/rag/explain/layered_gateway.py
rename to src/app/modules/agent/runtime/steps/explain/layered_gateway.py
index 894ef05..79ca52d 100644
--- a/src/app/modules/rag/explain/layered_gateway.py
+++ b/src/app/modules/agent/runtime/steps/explain/layered_gateway.py
@@ -4,7 +4,7 @@ import logging
 from dataclasses import dataclass, field
 from typing import TYPE_CHECKING, Callable
 
-from app.modules.rag.explain.models import CodeLocation, LayeredRetrievalItem
+from app.modules.agent.runtime.steps.explain.models import CodeLocation, LayeredRetrievalItem
 from app.modules.rag.retrieval.test_filter import build_test_filters, debug_disable_test_filter
 
 LOGGER = logging.getLogger(__name__)
diff --git a/src/app/modules/rag/explain/models.py b/src/app/modules/agent/runtime/steps/explain/models.py
similarity index 100%
rename from src/app/modules/rag/explain/models.py
rename to src/app/modules/agent/runtime/steps/explain/models.py
diff --git a/src/app/modules/rag/explain/retriever_v2.py b/src/app/modules/agent/runtime/steps/explain/retriever_v2.py
similarity index 94%
rename from src/app/modules/rag/explain/retriever_v2.py
rename to src/app/modules/agent/runtime/steps/explain/retriever_v2.py
index cf31820..bba502c 100644
--- a/src/app/modules/rag/explain/retriever_v2.py
+++ b/src/app/modules/agent/runtime/steps/explain/retriever_v2.py
@@ -4,19 +4,19 @@ import logging
 from typing import TYPE_CHECKING
 
 from app.modules.rag.contracts.enums import RagLayer
-from app.modules.rag.explain.intent_builder import ExplainIntentBuilder
-from app.modules.rag.explain.layered_gateway import LayerRetrievalResult, LayeredRetrievalGateway
-from app.modules.rag.explain.models import CodeExcerpt, EvidenceItem, ExplainPack, LayeredRetrievalItem
-from app.modules.rag.explain.source_excerpt_fetcher import SourceExcerptFetcher
-from app.modules.rag.explain.trace_builder import TraceBuilder
+from app.modules.agent.runtime.steps.explain.intent_builder import ExplainIntentBuilder
+from app.modules.agent.runtime.steps.explain.layered_gateway import LayerRetrievalResult, LayeredRetrievalGateway
+from app.modules.agent.runtime.steps.explain.models import CodeExcerpt, EvidenceItem, ExplainPack, LayeredRetrievalItem
+from app.modules.agent.runtime.steps.explain.source_excerpt_fetcher import SourceExcerptFetcher
+from app.modules.agent.runtime.steps.explain.trace_builder import TraceBuilder
 from app.modules.rag.retrieval.test_filter import exclude_tests_default, is_test_path
 
 LOGGER = logging.getLogger(__name__)
 _MIN_EXCERPTS = 2
 
 if TYPE_CHECKING:
-    from app.modules.rag.explain.graph_repository import CodeGraphRepository
-    from app.modules.rag.explain.models import ExplainIntent
+    from app.modules.agent.runtime.steps.explain.graph_repository import CodeGraphRepository
+    from app.modules.agent.runtime.steps.explain.models import ExplainIntent
 
 
 class CodeExplainRetrieverV2:
diff --git a/src/app/modules/rag/explain/source_excerpt_fetcher.py b/src/app/modules/agent/runtime/steps/explain/source_excerpt_fetcher.py
similarity index 88%
rename from src/app/modules/rag/explain/source_excerpt_fetcher.py
rename to src/app/modules/agent/runtime/steps/explain/source_excerpt_fetcher.py
index b45f6e7..cd3c4ae 100644
--- a/src/app/modules/rag/explain/source_excerpt_fetcher.py
+++ b/src/app/modules/agent/runtime/steps/explain/source_excerpt_fetcher.py
@@ -2,12 +2,12 @@ from __future__ import annotations
 
 from typing import TYPE_CHECKING
 
-from app.modules.rag.explain.excerpt_planner import ExcerptPlanner
-from app.modules.rag.explain.models import CodeExcerpt, EvidenceItem, TracePath
+from app.modules.agent.runtime.steps.explain.excerpt_planner import ExcerptPlanner
+from app.modules.agent.runtime.steps.explain.models import CodeExcerpt, EvidenceItem, TracePath
 from app.modules.rag.retrieval.test_filter import is_test_path
 
 if TYPE_CHECKING:
-    from app.modules.rag.explain.graph_repository import CodeGraphRepository
+    from app.modules.agent.runtime.steps.explain.graph_repository import CodeGraphRepository
 
 
 class SourceExcerptFetcher:
diff --git a/src/app/modules/rag/explain/trace_builder.py b/src/app/modules/agent/runtime/steps/explain/trace_builder.py
similarity index 96%
rename from src/app/modules/rag/explain/trace_builder.py
rename to src/app/modules/agent/runtime/steps/explain/trace_builder.py
index 9bd0a5f..a9c2826 100644
--- a/src/app/modules/rag/explain/trace_builder.py
+++ b/src/app/modules/agent/runtime/steps/explain/trace_builder.py
@@ -2,10 +2,10 @@ from __future__ import annotations
 
 from typing import TYPE_CHECKING
 
-from app.modules.rag.explain.models import LayeredRetrievalItem, TracePath
+from app.modules.agent.runtime.steps.explain.models import LayeredRetrievalItem, TracePath
 
 if TYPE_CHECKING:
-    from app.modules.rag.explain.graph_repository import CodeGraphRepository
+    from app.modules.agent.runtime.steps.explain.graph_repository import CodeGraphRepository
 
 
 class TraceBuilder:
diff --git a/src/app/modules/agent/runtime/steps/finalization/__init__.py b/src/app/modules/agent/runtime/steps/finalization/__init__.py
new file mode 100644
index 0000000..e59188d
--- /dev/null
+++ b/src/app/modules/agent/runtime/steps/finalization/__init__.py
@@ -0,0 +1,6 @@
+"""Final assembly: draft repair and construction of RuntimeFinalResult."""
+
+from app.modules.agent.runtime.steps.finalization.repair import RuntimeAnswerRepairService
+from app.modules.agent.runtime.steps.finalization.result_assembler import assemble_final_result
+
+__all__ = ["RuntimeAnswerRepairService", "assemble_final_result"]
diff --git a/src/app/modules/agent/code_qa_runtime/repair.py b/src/app/modules/agent/runtime/steps/finalization/repair.py
similarity index 87%
rename from src/app/modules/agent/code_qa_runtime/repair.py
rename to src/app/modules/agent/runtime/steps/finalization/repair.py
index 55eb6a7..b3bd1d2 100644
--- a/src/app/modules/agent/code_qa_runtime/repair.py
+++ b/src/app/modules/agent/runtime/steps/finalization/repair.py
@@ -1,12 +1,14 @@
+"""Service that repairs the draft answer based on post-evidence gate results (LLM repair)."""
+
 from __future__ import annotations
 
 import json
 
-from app.modules.agent.code_qa_runtime.models import CodeQaValidationResult
+from app.modules.agent.runtime.steps.gates.post.post_gate import RuntimeValidationResult
 from app.modules.agent.llm import AgentLlmService
 
 
-class CodeQaAnswerRepairService:
+class RuntimeAnswerRepairService:
     def __init__(self, llm: AgentLlmService) -> None:
         self._llm = llm
 
@@ -14,7 +16,7 @@ class CodeQaAnswerRepairService:
         self,
         *,
         draft_answer: str,
-        validation: CodeQaValidationResult,
+        validation: RuntimeValidationResult,
         prompt_payload: str,
     ) -> str:
         repair_focus = self._repair_focus(validation.reasons)
diff --git a/src/app/modules/agent/runtime/steps/finalization/result_assembler.py b/src/app/modules/agent/runtime/steps/finalization/result_assembler.py
new file mode 100644
index 0000000..23c9b9e
--- /dev/null
+++ b/src/app/modules/agent/runtime/steps/finalization/result_assembler.py
@@ -0,0 +1,79 @@
+"""Assembly of the final pipeline result: diagnostics and RuntimeFinalResult."""
+
+from __future__ import annotations
+
+import logging
+from typing import Any
+
+from app.modules.agent.runtime.steps.context.diagnostics import build_diagnostics_report
+from app.modules.agent.runtime.steps.gates.post.post_gate import RuntimePostEvidenceGate, RuntimeValidationResult
+from app.modules.agent.runtime.models import RuntimeDraftAnswer, RuntimeExecutionState, RuntimeFinalResult
+
+LOGGER = logging.getLogger(__name__)
+
+
+def assemble_final_result(
+    state: RuntimeExecutionState,
+    *,
+    draft: RuntimeDraftAnswer | None,
+    final_answer: str,
+    repair_used: bool,
+    llm_used: bool,
+    validation: RuntimeValidationResult | None = None,
+    timings_ms: dict[str, int] | None = None,
+    runtime_trace: list[dict] | None = None,
+    answer_policy_branch: str = "",
+    decision_reason: str = "",
+    pre_gate_input: dict[str, Any] | None = None,
+    post_gate_snapshot: dict[str, Any] | None = None,
+    resolved_target: str | None = None,
+    post_gate: RuntimePostEvidenceGate | None = None,
+) -> RuntimeFinalResult:
+    diagnostics = build_diagnostics_report(
+        router_result=state.router_result,
+        retrieval_request=state.retrieval_request,
+        retrieval_result=state.retrieval_result,
+        evidence_bundle=state.evidence_pack,
+        answer_mode=state.answer_mode,
+        timings_ms=timings_ms or {},
+        resolved_target=resolved_target,
+        answer_policy_branch=answer_policy_branch,
+        decision_reason=decision_reason,
+        evidence_gate_input=pre_gate_input or {},
+        post_evidence_gate=post_gate_snapshot or {},
+    )
+    if validation is None and post_gate is not None and state.retrieval_request is not None:
+        validation = post_gate.validate(
+            answer=final_answer,
+            answer_mode=state.answer_mode,
+            degraded_message=state.degraded_message,
+            sub_intent=state.retrieval_request.sub_intent,
+            user_query=state.user_query,
+            evidence_pack=state.evidence_pack,
+        )
+    elif validation is None:
+        validation = RuntimeValidationResult(passed=True, action="return")
+
+    result = RuntimeFinalResult(
+        final_answer=final_answer.strip(),
+        answer_mode=state.answer_mode,
+        repair_used=repair_used,
+        llm_used=llm_used,
+        draft_answer=draft,
+        validation=validation,
+        router_result=state.router_result,
+        retrieval_request=state.retrieval_request,
+        retrieval_result=state.retrieval_result,
+        evidence_pack=state.evidence_pack,
+        diagnostics=diagnostics,
+        runtime_trace=list(runtime_trace or []),
+    )
+    LOGGER.info(
+        "agent runtime executed: intent=%s sub_intent=%s answer_mode=%s repair_used=%s llm_used=%s",
+        state.router_result.intent if state.router_result else None,
+        state.router_result.query_plan.sub_intent if state.router_result and state.router_result.query_plan else None,
+        result.answer_mode,
+        result.repair_used,
+        result.llm_used,
+    )
+    return result
diff --git a/src/app/modules/agent/runtime/steps/gates/__init__.py b/src/app/modules/agent/runtime/steps/gates/__init__.py
new file mode 100644
index 0000000..c7e0e58
--- /dev/null
+++ b/src/app/modules/agent/runtime/steps/gates/__init__.py
@@ -0,0 +1,11 @@
+"""Pre- and post-evidence gates of the pipeline."""
+
+from app.modules.agent.runtime.steps.gates.pre.evidence_gate import EvidenceGateDecision, evaluate_evidence
+from app.modules.agent.runtime.steps.gates.post.post_gate import RuntimePostEvidenceGate, RuntimeValidationResult
+
+__all__ = [
+    "EvidenceGateDecision",
+    "evaluate_evidence",
+    "RuntimePostEvidenceGate",
+    "RuntimeValidationResult",
+]
diff --git a/src/app/modules/agent/runtime/steps/gates/post/__init__.py b/src/app/modules/agent/runtime/steps/gates/post/__init__.py
new file mode 100644
index 0000000..c7cd976
--- /dev/null
+++ b/src/app/modules/agent/runtime/steps/gates/post/__init__.py
@@ -0,0 +1,5 @@
+"""Post-evidence gate: validates the draft answer and decides between repair and return."""
+
+from app.modules.agent.runtime.steps.gates.post.post_gate import RuntimePostEvidenceGate, RuntimeValidationResult
+
+__all__ = ["RuntimePostEvidenceGate", "RuntimeValidationResult"]
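
Review note: the post-evidence gate returns a result whose `action` is either `"repair"` or `"return"`. The sketch below shows one way a caller might act on it. It is an assumption-laden illustration: `ValidationResult` mirrors the fields of `RuntimeValidationResult` but uses a dataclass instead of pydantic, `validate` keeps only two of the gate's checks, and `finalize` with a single repair pass is hypothetical, since this diff does not show how the runtime drives repair.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ValidationResult:
    # Same fields as RuntimeValidationResult in the diff; dataclass used for self-containment.
    passed: bool = False
    action: str = "return"
    reasons: list[str] = field(default_factory=list)


def validate(answer: str, answer_mode: str) -> ValidationResult:
    """Reduced version of the gate: empty-answer and not_found phrase checks only."""
    normalized = (answer or "").strip()
    if not normalized:
        return ValidationResult(passed=False, action="repair", reasons=["empty_answer"])
    if answer_mode == "not_found" and "не найден" not in normalized.lower():
        return ValidationResult(passed=False, action="repair", reasons=["not_found_answer_missing_phrase"])
    return ValidationResult(passed=True, action="return")


def finalize(
    draft: str,
    answer_mode: str,
    repair: Callable[[str, list[str]], str],
) -> tuple[str, bool]:
    """Return (final_answer, repair_used); a single repair pass is an assumption."""
    result = validate(draft, answer_mode)
    if result.action != "repair":
        return draft.strip(), False
    # The repair callable receives the draft and the machine-readable reasons.
    return repair(draft, result.reasons).strip(), True
```

A caller would pass a function backed by the LLM repair service as `repair`; in tests a plain lambda is enough to exercise both branches.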
diff --git a/src/app/modules/agent/code_qa_runtime/post_gate.py b/src/app/modules/agent/runtime/steps/gates/post/post_gate.py
similarity index 88%
rename from src/app/modules/agent/code_qa_runtime/post_gate.py
rename to src/app/modules/agent/runtime/steps/gates/post/post_gate.py
index a30e4d1..ed40102 100644
--- a/src/app/modules/agent/code_qa_runtime/post_gate.py
+++ b/src/app/modules/agent/runtime/steps/gates/post/post_gate.py
@@ -1,10 +1,22 @@
+"""Post-evidence gate: validates the draft answer and decides between repair and return."""
+
 from __future__ import annotations
 
 import re
 
-from app.modules.agent.code_qa_runtime.models import CodeQaValidationResult
-from app.modules.rag.code_qa_pipeline.answer_fact_curator import build_curated_answer_facts
-from app.modules.rag.code_qa_pipeline.contracts import EvidenceBundle
+from pydantic import BaseModel, ConfigDict, Field
+
+from app.modules.agent.runtime.steps.context.answer_fact_curator import build_curated_answer_facts
+from app.modules.agent.runtime.steps.context.contracts import EvidenceBundle
+
+
+class RuntimeValidationResult(BaseModel):
+    model_config = ConfigDict(extra="forbid")
+
+    passed: bool = False
+    action: str = "return"
+    reasons: list[str] = Field(default_factory=list)
+
 
 _TOKEN_RE = re.compile(r"[a-zA-Zа-яА-Я0-9_/]+")
 _VAGUE_PHRASES = (
@@ -22,7 +34,7 @@ _VAGUE_PHRASES = (
 _OPTIMISTIC_TRACE_CLAIMS = ("полностью восстанавливается", "полный поток выполнения", "полностью прослеживается")
 
 
-class CodeQaPostEvidenceGate:
+class RuntimePostEvidenceGate:
     def validate(
         self,
         *,
@@ -32,25 +44,25 @@ class CodeQaPostEvidenceGate:
         sub_intent: str,
         user_query: str,
         evidence_pack: EvidenceBundle | None,
-    ) -> CodeQaValidationResult:
+    ) -> RuntimeValidationResult:
         normalized = (answer or "").strip()
         if not normalized:
-            return CodeQaValidationResult(passed=False, action="repair", reasons=["empty_answer"])
+            return RuntimeValidationResult(passed=False, action="repair", reasons=["empty_answer"])
         if answer_mode in {"degraded", "insufficient"} and "недостат" not in normalized.lower():
-            return CodeQaValidationResult(passed=False, action="repair", reasons=["degraded_answer_missing_guardrail"])
+            return RuntimeValidationResult(passed=False, action="repair", reasons=["degraded_answer_missing_guardrail"])
         if answer_mode == "not_found" and "не найден" not in normalized.lower():
-            return CodeQaValidationResult(passed=False, action="repair", reasons=["not_found_answer_missing_phrase"])
+            return RuntimeValidationResult(passed=False, action="repair", reasons=["not_found_answer_missing_phrase"])
         if answer_mode == "ambiguous" and "не удалось однозначно разрешить" not in normalized.lower():
-            return CodeQaValidationResult(passed=False, action="repair", reasons=["ambiguous_answer_missing_phrase"])
+            return RuntimeValidationResult(passed=False, action="repair", reasons=["ambiguous_answer_missing_phrase"])
         if degraded_message and answer_mode != "normal" and len(normalized) < 24:
-            return CodeQaValidationResult(passed=False, action="repair", reasons=["answer_too_short"])
+            return RuntimeValidationResult(passed=False, action="repair", reasons=["answer_too_short"])
         if answer_mode != "normal" or evidence_pack is None:
-            return CodeQaValidationResult(passed=True, action="return")
+            return RuntimeValidationResult(passed=True, action="return")
 
         reasons = self._normal_answer_reasons(normalized.lower(), sub_intent.upper(), user_query, evidence_pack)
         if reasons:
-            return CodeQaValidationResult(passed=False, action="repair", reasons=_dedupe(reasons))
-        return CodeQaValidationResult(passed=True, action="return")
+            return RuntimeValidationResult(passed=False, action="repair", reasons=_dedupe(reasons))
+        return RuntimeValidationResult(passed=True, action="return")
 
     def _normal_answer_reasons(self, answer: str, sub_intent: str, user_query: str, evidence_pack: EvidenceBundle) -> list[str]:
         reasons: list[str] = []
diff --git a/src/app/modules/agent/runtime/steps/gates/pre/__init__.py b/src/app/modules/agent/runtime/steps/gates/pre/__init__.py
new file mode 100644
index 0000000..3881cc2
--- /dev/null
+++ b/src/app/modules/agent/runtime/steps/gates/pre/__init__.py
@@ -0,0 +1,5 @@
+"""Pre-evidence gate: checks evidence sufficiency before the LLM call."""
+
+from app.modules.agent.runtime.steps.gates.pre.evidence_gate import EvidenceGateDecision, evaluate_evidence
+
+__all__ = ["EvidenceGateDecision", "evaluate_evidence"]
diff --git a/src/app/modules/rag/code_qa_pipeline/evidence_gate.py b/src/app/modules/agent/runtime/steps/gates/pre/evidence_gate.py
similarity index 90%
rename from src/app/modules/rag/code_qa_pipeline/evidence_gate.py
rename to src/app/modules/agent/runtime/steps/gates/pre/evidence_gate.py
index c737f18..633ce11 100644
--- a/src/app/modules/rag/code_qa_pipeline/evidence_gate.py
+++ b/src/app/modules/agent/runtime/steps/gates/pre/evidence_gate.py
@@ -1,23 +1,20 @@
-"""Shared evidence sufficiency check for the CODE_QA test pipeline."""
+"""Evidence sufficiency check for the CODE_QA pipeline."""
 
 from __future__ import annotations
 
 from dataclasses import dataclass, field
 
-from app.modules.rag.code_qa_pipeline.contracts import EvidenceBundle
+from app.modules.agent.runtime.steps.context.contracts import EvidenceBundle
 
 
 @dataclass(slots=True)
 class EvidenceGateDecision:
-    """Result of evidence sufficiency check."""
-
     passed: bool
     failure_reasons: list[str] = field(default_factory=list)
     degraded_message: str = ""
 
 
 def evaluate_evidence(bundle: EvidenceBundle) -> EvidenceGateDecision:
-    """Check evidence sufficiency by scenario; prevent confident answer without support."""
     sub = (bundle.resolved_sub_intent or "EXPLAIN").upper()
     reasons: list[str] = []
 
diff --git a/src/app/modules/agent/runtime/steps/generation/__init__.py b/src/app/modules/agent/runtime/steps/generation/__init__.py
new file mode 100644
index 0000000..f9c9eef
--- /dev/null
+++ b/src/app/modules/agent/runtime/steps/generation/__init__.py
@@ -0,0 +1,11 @@
+"""Prompt selection, payload assembly, and the LLM call for answer generation."""
+
+from app.modules.agent.runtime.steps.generation.prompt_selector import RuntimePromptSelector
+from app.modules.agent.runtime.steps.generation.prompt_payload_builder import RuntimePromptPayloadBuilder
+from app.modules.agent.runtime.steps.generation.generator import RuntimeAnswerGenerator
+
+__all__ = [
+    "RuntimePromptSelector",
+    "RuntimePromptPayloadBuilder",
+    "RuntimeAnswerGenerator",
+]
diff --git a/src/app/modules/agent/runtime/steps/generation/generator.py b/src/app/modules/agent/runtime/steps/generation/generator.py
new file mode 100644
index 0000000..52d5b62
--- /dev/null
+++ b/src/app/modules/agent/runtime/steps/generation/generator.py
@@ -0,0 +1,18 @@
+"""Thin wrapper over the LLM that generates an answer from a prompt name and payload."""
+
+from __future__ import annotations
+
+from typing import TYPE_CHECKING
+
+if TYPE_CHECKING:
+    from app.modules.agent.llm import AgentLlmService
+
+
+class RuntimeAnswerGenerator:
+    """Delegates the LLM call that generates the draft answer."""
+
+    def __init__(self, llm: AgentLlmService) -> None:
+        self._llm = llm
+
+    def generate(self, prompt_name: str, payload: str, *, log_context: str = "graph.project_qa.code_qa.answer") -> str:
+        return self._llm.generate(prompt_name, payload, log_context=log_context).strip()
diff --git a/src/app/modules/agent/code_qa_runtime/prompt_payload_builder.py b/src/app/modules/agent/runtime/steps/generation/prompt_payload_builder.py
similarity index 95%
rename from src/app/modules/agent/code_qa_runtime/prompt_payload_builder.py
rename to src/app/modules/agent/runtime/steps/generation/prompt_payload_builder.py
index 4e1fb31..85738f2 100644
--- a/src/app/modules/agent/code_qa_runtime/prompt_payload_builder.py
+++ b/src/app/modules/agent/runtime/steps/generation/prompt_payload_builder.py
@@ -1,9 +1,11 @@
+"""Builds the JSON payload for the system prompt from synthesis_input and evidence_pack."""
+
 from __future__ import annotations
 
 import json
 import re
 
-from app.modules.rag.code_qa_pipeline.contracts import AnswerSynthesisInput, EvidenceBundle
+from app.modules.agent.runtime.steps.context.contracts import AnswerSynthesisInput, EvidenceBundle
 
 _LAYER_GUIDE = (
     "- C0_SOURCE_CHUNKS: фактический код, это основной источник деталей.\n"
@@ -15,7 +17,7 @@ _LAYER_GUIDE = (
 _TOKEN_RE = re.compile(r"[a-zA-Zа-яА-Я0-9_/]+")
 
 
-class CodeQaPromptPayloadBuilder:
+class RuntimePromptPayloadBuilder:
     def build(
         self,
         *,
diff --git a/src/app/modules/agent/code_qa_runtime/prompt_selector.py b/src/app/modules/agent/runtime/steps/generation/prompt_selector.py
similarity index 85%
rename from src/app/modules/agent/code_qa_runtime/prompt_selector.py
rename to src/app/modules/agent/runtime/steps/generation/prompt_selector.py
index 5cc54d9..fa660d9 100644
--- a/src/app/modules/agent/code_qa_runtime/prompt_selector.py
+++ b/src/app/modules/agent/runtime/steps/generation/prompt_selector.py
@@ -1,7 +1,9 @@
+"""Selects the system prompt name by sub_intent and answer_mode."""
+
 from __future__ import annotations
 
 
-class CodeQaPromptSelector:
+class RuntimePromptSelector:
     _PROMPTS = {
         "ARCHITECTURE": "code_qa_architecture_answer",
         "EXPLAIN": "code_qa_explain_answer",
diff --git a/src/app/modules/agent/runtime/steps/retrieval/__init__.py b/src/app/modules/agent/runtime/steps/retrieval/__init__.py
new file mode 100644
index 0000000..7cfb2a6
--- /dev/null
+++ b/src/app/modules/agent/runtime/steps/retrieval/__init__.py
@@ -0,0 +1,6 @@
+"""Retrieval execution package: the RAG repository adapter and the repository context factory."""
+
+from app.modules.agent.runtime.steps.retrieval.adapter import RuntimeRetrievalAdapter
+from app.modules.agent.runtime.steps.retrieval.repo_context import RuntimeRepoContextFactory
+
+__all__ = ["RuntimeRetrievalAdapter", "RuntimeRepoContextFactory"]
diff --git a/src/app/modules/agent/code_qa_runtime/retrieval_adapter.py b/src/app/modules/agent/runtime/steps/retrieval/adapter.py
similarity index 98%
rename from src/app/modules/agent/code_qa_runtime/retrieval_adapter.py
rename to src/app/modules/agent/runtime/steps/retrieval/adapter.py
index 2dfd58b..ec0c99f 100644
--- a/src/app/modules/agent/code_qa_runtime/retrieval_adapter.py
+++ b/src/app/modules/agent/runtime/steps/retrieval/adapter.py
@@ -1,3 +1,5 @@
+"""Adapter from the RAG repository to the runtime: retrieve_with_plan, retrieve_exact_files, consume_retrieval_report."""
+
 from __future__ import annotations
 
 from time import perf_counter
@@ -35,7 +37,7 @@ class SessionEmbeddingDimensions:
         return dim
 
 
-class CodeQaRetrievalAdapter:
+class RuntimeRetrievalAdapter:
     def __init__(self, repository: RagRepository | None = None) -> None:
         if repository is None:
             from app.modules.rag.persistence.repository import RagRepository
diff --git a/src/app/modules/agent/code_qa_runtime/repo_context.py b/src/app/modules/agent/runtime/steps/retrieval/repo_context.py
similarity index 85%
rename from src/app/modules/agent/code_qa_runtime/repo_context.py
rename to src/app/modules/agent/runtime/steps/retrieval/repo_context.py
index 2f2aa12..a5c421d 100644
--- a/src/app/modules/agent/code_qa_runtime/repo_context.py
+++ b/src/app/modules/agent/runtime/steps/retrieval/repo_context.py
@@ -1,10 +1,12 @@
+"""Repository context factory (languages, RAG layers) for the runtime."""
+
 from __future__ import annotations
 
 from app.modules.rag.contracts.enums import RagLayer
-from app.modules.rag.intent_router_v2.models import RepoContext
+from app.modules.agent.intent_router_v2.models import RepoContext
 
 
-class CodeQaRepoContextFactory:
+class RuntimeRepoContextFactory:
     _KNOWN_LAYERS = [
         RagLayer.CODE_ENTRYPOINTS,
         RagLayer.CODE_SYMBOL_CATALOG,
diff --git a/src/app/modules/rag/persistence/story_context_repository.py b/src/app/modules/agent/runtime/story_context_repository.py
similarity index 100%
rename from src/app/modules/rag/persistence/story_context_repository.py
rename to src/app/modules/agent/runtime/story_context_repository.py
diff --git a/src/app/modules/application.py b/src/app/modules/application.py
index 8ce4902..8fa581e 100644
--- a/src/app/modules/application.py
+++ b/src/app/modules/application.py
@@ -1,8 +1,7 @@
-from app.modules.agent.code_qa_runtime import CodeQaRuntimeExecutor
-from app.modules.agent.code_qa_runtime.retrieval_adapter import CodeQaRetrievalAdapter
-from app.modules.agent.code_qa_runner_adapter import CodeQaRunnerAdapter
+from app.modules.agent.runtime import AgentRuntimeExecutor, RuntimeRetrievalAdapter
+from app.modules.agent.runtime.code_qa_runner_adapter import CodeQaRunnerAdapter
 from app.modules.agent.llm import AgentLlmService
-from app.modules.agent.prompt_loader import PromptLoader
+from app.modules.agent.llm.prompt_loader import PromptLoader
 from app.modules.chat.direct_service import CodeExplainChatService
 from app.modules.chat.dialog_store import DialogSessionStore
 from app.modules.chat.repository import ChatRepository
@@ -10,8 +9,8 @@ from app.modules.chat.module import ChatModule
 from app.modules.chat.session_resolver import ChatSessionResolver
 from app.modules.chat.task_store import TaskStore
 from app.modules.rag.persistence.repository import RagRepository
-from app.modules.rag.persistence.story_context_repository import StoryContextRepository, StoryContextSchemaRepository
-from app.modules.rag.explain import CodeExplainRetrieverV2, CodeGraphRepository, LayeredRetrievalGateway
+from app.modules.agent.runtime.story_context_repository import StoryContextRepository, StoryContextSchemaRepository
+from app.modules.agent.runtime.steps.explain import CodeExplainRetrieverV2, CodeGraphRepository, LayeredRetrievalGateway
 from app.modules.rag.module import RagModule, RagRepoModule
 from app.modules.shared.bootstrap import bootstrap_database
 from app.modules.shared.event_bus import EventBus
@@ -45,8 +44,8 @@ class ModularApplication:
         _giga_client = GigaChatClient(_giga_settings, GigaChatTokenProvider(_giga_settings))
         _prompt_loader = PromptLoader()
         self._agent_llm = AgentLlmService(client=_giga_client, prompts=_prompt_loader)
-        _retrieval = CodeQaRetrievalAdapter(self.rag_repository)
-        _executor = CodeQaRuntimeExecutor(llm=self._agent_llm, retrieval=_retrieval)
+        _retrieval = RuntimeRetrievalAdapter(self.rag_repository)
+        _executor = AgentRuntimeExecutor(llm=self._agent_llm, retrieval=_retrieval)
         self._agent_runner = CodeQaRunnerAdapter(_executor)
         self.direct_chat = CodeExplainChatService(
             retriever=self.code_explain_retriever,
diff --git a/src/app/modules/chat/direct_service.py b/src/app/modules/chat/direct_service.py
index fe6e063..0ab0a2a 100644
--- a/src/app/modules/chat/direct_service.py
+++ b/src/app/modules/chat/direct_service.py
@@ -7,7 +7,7 @@ from app.modules.agent.llm import AgentLlmService
 from app.modules.chat.evidence_gate import CodeExplainEvidenceGate
 from app.modules.chat.session_resolver import ChatSessionResolver
 from app.modules.chat.task_store import TaskState, TaskStore
-from app.modules.rag.explain import CodeExplainRetrieverV2, PromptBudgeter
+from app.modules.agent.runtime.steps.explain import CodeExplainRetrieverV2, PromptBudgeter
 from app.schemas.chat import ChatMessageRequest, TaskQueuedResponse, TaskResultType, TaskStatus
 
 LOGGER = logging.getLogger(__name__)
diff --git a/src/app/modules/chat/evidence_gate.py b/src/app/modules/chat/evidence_gate.py
index 6d12257..3e75f6c 100644
--- a/src/app/modules/chat/evidence_gate.py
+++ b/src/app/modules/chat/evidence_gate.py
@@ -2,7 +2,7 @@ from __future__ import annotations
 
 from dataclasses import dataclass, field
 
-from app.modules.rag.explain.models import ExplainPack
+from app.modules.agent.runtime.steps.explain.models import ExplainPack
 
 
 @dataclass(slots=True)
diff --git a/src/app/modules/rag/code_qa_pipeline/__init__.py b/src/app/modules/rag/code_qa_pipeline/__init__.py
deleted file mode 100644
index a8a469c..0000000
--- a/src/app/modules/rag/code_qa_pipeline/__init__.py
+++ /dev/null
@@ -1,37 +0,0 @@
-"""Canonical test-first CODE_QA pipeline: IntentRouterV2 -> retrieval -> evidence gate -> LLM -> diagnostics.
-
-This package is the single source of truth for the test-only pipeline.
-Legacy RouterService and production runtime are unchanged.
-
-Entrypoint: CodeQAPipelineRunner.run(user_query, rag_session_id).
-Contracts: RouterResult, RetrievalRequest, RetrievalResult, EvidenceBundle, AnswerSynthesisInput, DiagnosticsReport.
-See README § 4.1.3 and tests/pipeline_setup/pipeline_intent_rag/test_canonical_code_qa_pipeline.py.
-"""
-
-from app.modules.rag.code_qa_pipeline.contracts import (
-    AnswerSynthesisInput,
-    CodeChunkItem,
-    DiagnosticsReport,
-    EvidenceBundle,
-    FailureReason,
-    RetrievalRequest,
-    RetrievalResult,
-    RouterResult,
-)
-from app.modules.rag.code_qa_pipeline.pipeline import (
-    CodeQAPipelineResult,
-    CodeQAPipelineRunner,
-)
-
-__all__ = [
-    "AnswerSynthesisInput",
-    "CodeChunkItem",
-    "CodeQAPipelineResult",
-    "CodeQAPipelineRunner",
-    "DiagnosticsReport",
-    "EvidenceBundle",
-    "FailureReason",
-    "RetrievalRequest",
-    "RetrievalResult",
-    "RouterResult",
-]
diff --git a/src/app/modules/rag/explain/__init__.py b/src/app/modules/rag/explain/__init__.py
deleted file mode 100644
index de44c1d..0000000
--- a/src/app/modules/rag/explain/__init__.py
+++ /dev/null
@@ -1,36 +0,0 @@
-from __future__ import annotations
-
-from importlib import import_module
-
-__all__ = [
-    "CodeExcerpt",
-    "CodeExplainRetrieverV2",
-    "CodeGraphRepository",
-    "EvidenceItem",
-    "ExplainIntent",
-    "ExplainIntentBuilder",
-    "ExplainPack",
-    "LayeredRetrievalGateway",
-    "PromptBudgeter",
-    "TracePath",
-]
-
-
-def __getattr__(name: str):
-    module_map = {
-        "CodeExcerpt": "app.modules.rag.explain.models",
-        "EvidenceItem": "app.modules.rag.explain.models",
-        "ExplainIntent": "app.modules.rag.explain.models",
-        "ExplainPack": "app.modules.rag.explain.models",
-        "TracePath": "app.modules.rag.explain.models",
-        "ExplainIntentBuilder": "app.modules.rag.explain.intent_builder",
-        "PromptBudgeter": "app.modules.rag.explain.budgeter",
-        "LayeredRetrievalGateway": "app.modules.rag.explain.layered_gateway",
-        "CodeGraphRepository": "app.modules.rag.explain.graph_repository",
-        "CodeExplainRetrieverV2": "app.modules.rag.explain.retriever_v2",
-    }
-    module_name = module_map.get(name)
-    if module_name is None:
-        raise AttributeError(name)
-    module = import_module(module_name)
-    return getattr(module, name)
diff --git a/src/app/modules/rag/intent_router_v2/__init__.py b/src/app/modules/rag/intent_router_v2/__init__.py
deleted file mode 100644
index 5933990..0000000
--- a/src/app/modules/rag/intent_router_v2/__init__.py
+++ /dev/null
@@ -1,23 +0,0 @@
-from app.modules.rag.intent_router_v2.factory import GigaChatIntentRouterFactory
-from app.modules.rag.intent_router_v2.local_runner import IntentRouterScenarioRunner
-from app.modules.rag.intent_router_v2.models import (
-    ConversationState,
-    IntentDecision,
-    IntentRouterResult,
-    QueryAnchor,
-    QueryPlan,
-    RepoContext,
-)
-from app.modules.rag.intent_router_v2.router import IntentRouterV2
-
-__all__ = [
-    "ConversationState",
-    "GigaChatIntentRouterFactory",
-    "IntentDecision",
-    "IntentRouterResult",
-    "IntentRouterScenarioRunner",
-    "IntentRouterV2",
-    "QueryAnchor",
-    "QueryPlan",
-    "RepoContext",
-]
diff --git a/src/app/modules/rag/intent_router_v2/analysis/__init__.py b/src/app/modules/rag/intent_router_v2/analysis/__init__.py
deleted file mode 100644
index 5a05ed4..0000000
--- a/src/app/modules/rag/intent_router_v2/analysis/__init__.py
+++ /dev/null
@@ -1,4 +0,0 @@
-from app.modules.rag.intent_router_v2.analysis.normalization import QueryNormalizer
-from app.modules.rag.intent_router_v2.analysis.query_plan_builder import QueryPlanBuilder
-
-__all__ = ["QueryNormalizer", "QueryPlanBuilder"]
diff --git a/src/app/modules/rag/intent_router_v2/analysis/query_normalizer.py b/src/app/modules/rag/intent_router_v2/analysis/query_normalizer.py
deleted file mode 100644
index 7884cbc..0000000
--- a/src/app/modules/rag/intent_router_v2/analysis/query_normalizer.py
+++ /dev/null
@@ -1,3 +0,0 @@
-from app.modules.rag.intent_router_v2.analysis.normalization import QueryNormalizer
-
-__all__ = ["QueryNormalizer"]
diff --git a/src/app/modules/rag/intent_router_v2/intent/__init__.py b/src/app/modules/rag/intent_router_v2/intent/__init__.py
deleted file mode 100644
index 1a84e6f..0000000
--- a/src/app/modules/rag/intent_router_v2/intent/__init__.py
+++ /dev/null
@@ -1,5 +0,0 @@
-from app.modules.rag.intent_router_v2.intent.classifier import IntentClassifierV2
-from app.modules.rag.intent_router_v2.intent.conversation_policy import ConversationPolicy
-from app.modules.rag.intent_router_v2.intent.graph_id_resolver import GraphIdResolver
-
-__all__ = ["IntentClassifierV2", "ConversationPolicy", "GraphIdResolver"]
diff --git a/src/app/modules/rag/intent_router_v2/retrieval/__init__.py b/src/app/modules/rag/intent_router_v2/retrieval/__init__.py
deleted file mode 100644
index 5e370e5..0000000
--- a/src/app/modules/rag/intent_router_v2/retrieval/__init__.py
+++ /dev/null
@@ -1,4 +0,0 @@
-from app.modules.rag.intent_router_v2.retrieval.retrieval_spec_factory import RetrievalSpecFactory
-from app.modules.rag.intent_router_v2.retrieval.retrieval_constraints_factory import RetrievalConstraintsFactory
-
-__all__ = ["RetrievalSpecFactory", "RetrievalConstraintsFactory"]
diff --git a/src/app/modules/rag/module.py b/src/app/modules/rag/module.py
index 475cc33..541ab9e 100644
--- a/src/app/modules/rag/module.py
+++ b/src/app/modules/rag/module.py
@@ -36,7 +36,7 @@ from app.schemas.rag_sessions import (
 )
 
 if TYPE_CHECKING:
-    from app.modules.rag.persistence.story_context_repository import StoryContextRepository
+    from app.modules.agent.runtime.story_context_repository import StoryContextRepository
 
 
 class RagModule:
diff --git a/tests/pipeline_setup/suite_01_synthetic/code_qa_eval/runner.py b/tests/pipeline_setup/suite_01_synthetic/code_qa_eval/runner.py
index 815b7aa..c1c331a 100644
--- a/tests/pipeline_setup/suite_01_synthetic/code_qa_eval/runner.py
+++ b/tests/pipeline_setup/suite_01_synthetic/code_qa_eval/runner.py
@@ -5,9 +5,9 @@ from __future__ import annotations
 from dataclasses import dataclass, field
 from pathlib import Path
 
-from app.modules.rag.code_qa_pipeline import CodeQAPipelineResult, CodeQAPipelineRunner
+from app.modules.agent.runtime.legacy_pipeline import CodeQAPipelineResult, CodeQAPipelineRunner
 from app.modules.rag.contracts.enums import RagLayer
-from app.modules.rag.intent_router_v2 import ConversationState, IntentRouterV2, RepoContext
+from app.modules.agent.intent_router_v2 import ConversationState, IntentRouterV2, RepoContext
 
 from tests.pipeline_setup.suite_01_synthetic.code_qa_eval.config import EvalConfig
 from tests.pipeline_setup.suite_01_synthetic.code_qa_eval.golden_loader import GoldenCase, load_golden_cases
diff --git a/tests/pipeline_setup/suite_01_synthetic/code_qa_eval/test_eval_harness.py b/tests/pipeline_setup/suite_01_synthetic/code_qa_eval/test_eval_harness.py
index 5c5b7d2..f6cc68f 100644
--- a/tests/pipeline_setup/suite_01_synthetic/code_qa_eval/test_eval_harness.py
+++ b/tests/pipeline_setup/suite_01_synthetic/code_qa_eval/test_eval_harness.py
@@ -6,8 +6,8 @@ from pathlib import Path
 
 import pytest
 
-from app.modules.rag.code_qa_pipeline import CodeQAPipelineResult
-from app.modules.rag.intent_router_v2.models import (
+from app.modules.agent.runtime.legacy_pipeline import CodeQAPipelineResult
+from app.modules.agent.intent_router_v2.models import (
     CodeRetrievalFilters,
     EvidencePolicy,
     IntentRouterResult,
@@ -154,11 +154,7 @@ def _make_pipeline_result(
     answer_mode: str = "normal",
     path_scope: list[str] | None = None,
 ) -> CodeQAPipelineResult:
-    from app.modules.rag.code_qa_pipeline.contracts import (
-        EvidenceBundle,
-        RetrievalRequest,
-        RetrievalResult,
-    )
+    from app.modules.agent.runtime.steps.context import EvidenceBundle, RetrievalRequest, RetrievalResult
 
     filters = CodeRetrievalFilters(path_scope=path_scope or [])
     router_result = IntentRouterResult(
diff --git a/tests/pipeline_setup/suite_02_pipeline/pipeline_intent_rag/helpers/pipeline_runner.py b/tests/pipeline_setup/suite_02_pipeline/pipeline_intent_rag/helpers/pipeline_runner.py
index 90cdf40..efc0f0d 100644
--- a/tests/pipeline_setup/suite_02_pipeline/pipeline_intent_rag/helpers/pipeline_runner.py
+++ b/tests/pipeline_setup/suite_02_pipeline/pipeline_intent_rag/helpers/pipeline_runner.py
@@ -4,7 +4,7 @@ from datetime import datetime
 from difflib import get_close_matches
 from time import perf_counter
 
-from app.modules.rag.intent_router_v2 import ConversationState, IntentRouterV2
+from app.modules.agent.intent_router_v2 import ConversationState, IntentRouterV2
 from tests.pipeline_setup.suite_02_pipeline.pipeline_intent_rag.helpers.diagnostics import (
     apply_retrieval_report,
     assign_repo_scope,
diff --git a/tests/pipeline_setup/suite_02_pipeline/pipeline_intent_rag/helpers/runtime.py b/tests/pipeline_setup/suite_02_pipeline/pipeline_intent_rag/helpers/runtime.py
index d7c3261..54fed4d 100644
--- a/tests/pipeline_setup/suite_02_pipeline/pipeline_intent_rag/helpers/runtime.py
+++ b/tests/pipeline_setup/suite_02_pipeline/pipeline_intent_rag/helpers/runtime.py
@@ -3,9 +3,9 @@ from __future__ import annotations
 from datetime import datetime
 from pathlib import Path
 
-from app.modules.agent.code_qa_runtime import CodeQaRuntimeExecutor
+from app.modules.agent.runtime import AgentRuntimeExecutor
 from app.modules.agent.llm import AgentLlmService
-from app.modules.agent.prompt_loader import PromptLoader
+from app.modules.agent.llm.prompt_loader import PromptLoader
 from app.modules.shared.gigachat.client import GigaChatClient
 from app.modules.shared.gigachat.settings import GigaChatSettings
 from app.modules.shared.gigachat.token_provider import GigaChatTokenProvider
@@ -23,7 +23,7 @@ class PipelineRuntime:
         self._writer = ArtifactWriter(self._config.test_results_dir, test_name=self._config.test_name, run_started_at=self._started_at)
         self._rag_adapter = None
         self._session_resolver = None
-        self._executor: CodeQaRuntimeExecutor | None = None
+        self._executor: AgentRuntimeExecutor | None = None
 
     @property
     def artifact_path(self) -> Path:
@@ -107,9 +107,9 @@ class PipelineRuntime:
         self._session_resolver = RagSessionResolver(config=self._config, repository=repository)
         return self._rag_adapter, self._session_resolver
 
-    def _executor_instance(self) -> CodeQaRuntimeExecutor:
+    def _executor_instance(self) -> AgentRuntimeExecutor:
         if self._executor is None:
-            self._executor = CodeQaRuntimeExecutor(_build_llm())
+            self._executor = AgentRuntimeExecutor(_build_llm())
         return self._executor
 
 
diff --git a/tests/pipeline_setup/suite_02_pipeline/pipeline_intent_rag/test_canonical_code_qa_pipeline.py b/tests/pipeline_setup/suite_02_pipeline/pipeline_intent_rag/test_canonical_code_qa_pipeline.py
index 2a770c2..27dff15 100644
--- a/tests/pipeline_setup/suite_02_pipeline/pipeline_intent_rag/test_canonical_code_qa_pipeline.py
+++ b/tests/pipeline_setup/suite_02_pipeline/pipeline_intent_rag/test_canonical_code_qa_pipeline.py
@@ -6,17 +6,17 @@ from pathlib import Path
 
 import pytest
 
-from app.modules.rag.code_qa_pipeline import (
+from app.modules.agent.runtime.steps.context import (
     CodeChunkItem,
-    CodeQAPipelineRunner,
     EvidenceBundle,
     RetrievalRequest,
     RetrievalResult,
+    build_retrieval_request,
+    build_retrieval_result,
 )
-from app.modules.rag.code_qa_pipeline.evidence_gate import evaluate_evidence
-from app.modules.rag.code_qa_pipeline.retrieval_request_builder import build_retrieval_request
-from app.modules.rag.code_qa_pipeline.retrieval_result_builder import build_retrieval_result
-from app.modules.rag.intent_router_v2 import ConversationState, IntentRouterV2
+from app.modules.agent.runtime.steps.gates.pre.evidence_gate import evaluate_evidence
+from app.modules.agent.runtime.legacy_pipeline import CodeQAPipelineRunner
+from app.modules.agent.intent_router_v2 import ConversationState, IntentRouterV2
 from tests.unit_tests.rag.intent_router_testkit import repo_context
 
 _TEST_ROOT = Path(__file__).resolve().parent
@@ -133,8 +133,7 @@ def test_evidence_gate_find_tests_insufficient() -> None:
 
 def test_diagnostics_report_has_failure_reasons() -> None:
     """Diagnostics report includes machine-readable failure reasons."""
-    from app.modules.rag.code_qa_pipeline.diagnostics import build_diagnostics_report
-    from app.modules.rag.code_qa_pipeline.evidence_bundle_builder import build_evidence_bundle
+    from app.modules.agent.runtime.steps.context import build_diagnostics_report, build_evidence_bundle
 
     router = _make_router()
     text = "Где тесты для НесуществующийКласс?"
diff --git a/tests/pipeline_setup/suite_02_pipeline/pipeline_intent_rag/test_code_qa_answer_boundary.py b/tests/pipeline_setup/suite_02_pipeline/pipeline_intent_rag/test_code_qa_answer_boundary.py
index 13e6709..24074f1 100644
--- a/tests/pipeline_setup/suite_02_pipeline/pipeline_intent_rag/test_code_qa_answer_boundary.py
+++ b/tests/pipeline_setup/suite_02_pipeline/pipeline_intent_rag/test_code_qa_answer_boundary.py
@@ -2,11 +2,14 @@ from __future__ import annotations
 
 import json
 
-from app.modules.agent.code_qa_runtime.post_gate import CodeQaPostEvidenceGate
-from app.modules.agent.code_qa_runtime.prompt_payload_builder import CodeQaPromptPayloadBuilder
-from app.modules.rag.code_qa_pipeline.answer_synthesis import build_answer_synthesis_input
-from app.modules.rag.code_qa_pipeline.contracts import CodeChunkItem, EvidenceBundle
-from app.modules.rag.code_qa_pipeline.retrieval_result_builder import build_retrieval_result
+from app.modules.agent.runtime.steps.gates.post.post_gate import RuntimePostEvidenceGate
+from app.modules.agent.runtime.steps.generation.prompt_payload_builder import RuntimePromptPayloadBuilder
+from app.modules.agent.runtime.steps.context import (
+    build_answer_synthesis_input,
+    build_retrieval_result,
+    CodeChunkItem,
+    EvidenceBundle,
+)
 
 
 def test_retrieval_result_separates_semantic_hints_and_relations() -> None:
@@ -109,7 +112,7 @@ def test_prompt_payload_builder_adds_explain_constraints() -> None:
         ),
     )
     payload = json.loads(
-        CodeQaPromptPayloadBuilder().build(
+        RuntimePromptPayloadBuilder().build(
             user_query="Explain RuntimeManager",
             synthesis_input=synthesis,
             evidence_pack=bundle,
@@ -143,7 +146,7 @@ def test_prompt_payload_builder_adds_trace_flow_constraints() -> None:
         ),
     )
     payload = json.loads(
-        CodeQaPromptPayloadBuilder().build(
+        RuntimePromptPayloadBuilder().build(
             user_query="Trace RuntimeManager",
             synthesis_input=synthesis,
             evidence_pack=EvidenceBundle(resolved_sub_intent="TRACE_FLOW", resolved_target="RuntimeManager"),
@@ -175,7 +178,7 @@ def test_post_gate_rejects_vague_explain_without_concrete_facts() -> None:
             }
         ],
     )
-    result = CodeQaPostEvidenceGate().validate(
+    result = RuntimePostEvidenceGate().validate(
         answer="RuntimeManager имеет responsibilities и управляет системой.",
         answer_mode="normal",
         degraded_message="",
@@ -209,7 +212,7 @@ def test_post_gate_accepts_explain_with_method_alias_and_call() -> None:
             }
         ],
     )
-    result = CodeQaPostEvidenceGate().validate(
+    result = RuntimePostEvidenceGate().validate(
         answer="RuntimeManager запускает работу через метод start(), а затем вызывает record() у TraceService.",
         answer_mode="normal",
         degraded_message="",
@@ -236,7 +239,7 @@ def test_post_gate_requires_architecture_relations() -> None:
             }
         ],
     )
-    gate = CodeQaPostEvidenceGate()
+    gate = RuntimePostEvidenceGate()
     vague = gate.validate(
         answer="RuntimeManager и TraceService образуют центральный компонент runtime.",
         answer_mode="normal",
@@ -270,7 +273,7 @@ def test_post_gate_rejects_architecture_with_retrieval_labels() -> None:
             }
         ],
     )
-    result = CodeQaPostEvidenceGate().validate(
+    result = RuntimePostEvidenceGate().validate(
         answer="RuntimeManager связан с dataflow_slice и строит вокруг него архитектуру.",
         answer_mode="normal",
         degraded_message="",
@@ -300,7 +303,7 @@ def test_post_gate_trace_flow_requires_sequence_and_blocks_overclaim() -> None:
             },
         ],
     )
-    gate = CodeQaPostEvidenceGate()
+    gate = RuntimePostEvidenceGate()
     vague = gate.validate(
         answer="RuntimeManager инициализирует службы и полностью восстанавливается.",
         answer_mode="normal",
diff --git a/tests/pipeline_setup/suite_02_pipeline/pipeline_intent_rag/test_diagnostics_jsonl.py b/tests/pipeline_setup/suite_02_pipeline/pipeline_intent_rag/test_diagnostics_jsonl.py
index 98c59af..6ef9b87 100644
--- a/tests/pipeline_setup/suite_02_pipeline/pipeline_intent_rag/test_diagnostics_jsonl.py
+++ b/tests/pipeline_setup/suite_02_pipeline/pipeline_intent_rag/test_diagnostics_jsonl.py
@@ -3,7 +3,7 @@ from __future__ import annotations
 from dataclasses import dataclass
 from datetime import datetime
 
-from app.modules.rag.intent_router_v2.models import (
+from app.modules.agent.intent_router_v2.models import (
     CodeRetrievalFilters,
     EvidencePolicy,
     IntentRouterResult,
diff --git a/tests/pipeline_setup/suite_02_pipeline/pipeline_intent_rag/test_router_constraints_contract.py b/tests/pipeline_setup/suite_02_pipeline/pipeline_intent_rag/test_router_constraints_contract.py
index fdda1c0..dbb3633 100644
--- a/tests/pipeline_setup/suite_02_pipeline/pipeline_intent_rag/test_router_constraints_contract.py
+++ b/tests/pipeline_setup/suite_02_pipeline/pipeline_intent_rag/test_router_constraints_contract.py
@@ -2,7 +2,7 @@ from __future__ import annotations
 
 from pathlib import Path
 
-from app.modules.rag.intent_router_v2 import ConversationState, IntentRouterV2
+from app.modules.agent.intent_router_v2 import ConversationState, IntentRouterV2
 from tests.pipeline_setup.suite_02_pipeline.pipeline_intent_rag.helpers.phrases_loader import PhraseCatalogLoader
 from tests.unit_tests.rag.intent_router_testkit import repo_context
 
diff --git a/tests/pipeline_setup/suite_02_pipeline/test_results/test_intent_router_only_matrix_20260312_231221.jsonl b/tests/pipeline_setup/suite_02_pipeline/test_results/test_intent_router_only_matrix_20260312_231221.jsonl
new file mode 100644
index 0000000..8d8d3fe
--- /dev/null
+++ b/tests/pipeline_setup/suite_02_pipeline/test_results/test_intent_router_only_matrix_20260312_231221.jsonl
@@ -0,0 +1,10 @@
+{"case_id": "code-open-context-file", "text": "Открой файл src/mail_order_bot/context.py", "mode": "router_only", "run_started_at": "2026-03-12T23:12:21", "rag_session_id": "849fc2c9-e17c-4034-b3b5-2e13d535bb94", "expected_intent": "CODE_QA", "actual_intent": "CODE_QA", "graph_id": "CodeQAGraph", "conversation_mode": "START", "query": "Открой файл src/mail_order_bot/context.py", "symbol_resolution": {"status": "not_requested", "resolved_symbol": null, "alternatives": [], "confidence": 0.0}, "rag_count": 0, "rag_rows": [], "llm_answer": null, "summary": {"router": {"intent": "CODE_QA", "sub_intent": "OPEN_FILE", "confidence": null}, "retrieval": {"profile": "code", "layers_hit": [], "evidence_sufficient": false}, "llm": {"answer_status": "partial", "groundedness": "not_applicable"}}, "diagnostics": {"router_plan": {"intent": "CODE_QA", "sub_intent": "OPEN_FILE", "graph_id": "CodeQAGraph", "retrieval_profile": "code", "conversation_mode": "START", "layers": ["C0_SOURCE_CHUNKS"], "symbol_kind_hint": "unknown", "symbol_candidates": [], "keyword_hints": [], "path_hints": ["src/mail_order_bot/context.py"], "path_scope": ["src/mail_order_bot/context.py"], "doc_scope_hints": [], "retrieval_constraints": {"include_globs": ["src/mail_order_bot/context.py"], "exclude_globs": ["tests/**", "**/test_*.py", "**/*_test.py"], "prefer_globs": [], "test_file_globs": [], "test_symbol_patterns": [], "max_candidates": 20, "fuzzy_symbol_search": {"enabled": false, "max_distance": 2, "top_k": 5}}}, "execution": {"executed_layers": [], "retrieval_mode_by_layer": {}, "top_k_by_layer": {}, "filters_by_layer": {}, "repo_scope": {"repo_id": null, "workspace_id": null}}, "retrieval": null, "constraint_violations": [], "timings_ms": {"router": 0, "symbol_resolution": 0, "retrieval_total": 0, "retrieval_by_layer": {}, "merge_rank": 0, "prompt_build": 0, "llm_call": 0}, "prompt": null, "router": {"conversation_mode": "START", "keyword_hints": [], "path_scope": ["src/mail_order_bot/context.py"]}, 
"llm": {"used_evidence_count": 0, "missing_evidence_reason": null}}, "run_info": {"case_id": "code-open-context-file", "mode": "router_only", "run_started_at": "2026-03-12T23:12:21", "rag_session_id": "849fc2c9-e17c-4034-b3b5-2e13d535bb94", "expected_intent": "CODE_QA", "actual_intent": "CODE_QA", "graph_id": "CodeQAGraph", "conversation_mode": "START"}, "input_request": {"text": "Открой файл src/mail_order_bot/context.py", "normalized_query": "Открой файл src/mail_order_bot/context.py"}, "steps": [{"step": "intent_router", "input": {"query": "Открой файл src/mail_order_bot/context.py"}, "output": {"intent": "CODE_QA", "graph_id": "CodeQAGraph", "conversation_mode": "START", "query": "Открой файл src/mail_order_bot/context.py"}, "diagnostics": {"router_plan": {"intent": "CODE_QA", "sub_intent": "OPEN_FILE", "graph_id": "CodeQAGraph", "retrieval_profile": "code", "conversation_mode": "START", "layers": ["C0_SOURCE_CHUNKS"], "symbol_kind_hint": "unknown", "symbol_candidates": [], "keyword_hints": [], "path_hints": ["src/mail_order_bot/context.py"], "path_scope": ["src/mail_order_bot/context.py"], "doc_scope_hints": [], "retrieval_constraints": {"include_globs": ["src/mail_order_bot/context.py"], "exclude_globs": ["tests/**", "**/test_*.py", "**/*_test.py"], "prefer_globs": [], "test_file_globs": [], "test_symbol_patterns": [], "max_candidates": 20, "fuzzy_symbol_search": {"enabled": false, "max_distance": 2, "top_k": 5}}}, "timings_ms": {"router": 0}}}]}
+{"case_id": "code-explain-context-class", "text": "Объясни как работает класс Context", "mode": "router_only", "run_started_at": "2026-03-12T23:12:21", "rag_session_id": "849fc2c9-e17c-4034-b3b5-2e13d535bb94", "expected_intent": "CODE_QA", "actual_intent": "CODE_QA", "graph_id": "CodeQAGraph", "conversation_mode": "START", "query": "Объясни как работает класс Context", "symbol_resolution": {"status": "pending", "resolved_symbol": null, "alternatives": ["Context"], "confidence": 0.0}, "rag_count": 0, "rag_rows": [], "llm_answer": null, "summary": {"router": {"intent": "CODE_QA", "sub_intent": "EXPLAIN", "confidence": null}, "retrieval": {"profile": "code", "layers_hit": [], "evidence_sufficient": false}, "llm": {"answer_status": "partial", "groundedness": "not_applicable"}}, "diagnostics": {"router_plan": {"intent": "CODE_QA", "sub_intent": "EXPLAIN", "graph_id": "CodeQAGraph", "retrieval_profile": "code", "conversation_mode": "START", "layers": ["C1_SYMBOL_CATALOG", "C0_SOURCE_CHUNKS", "C4_SEMANTIC_ROLES", "C2_DEPENDENCY_GRAPH", "C3_ENTRYPOINTS"], "symbol_kind_hint": "class", "symbol_candidates": ["Context"], "keyword_hints": ["Context"], "path_hints": [], "path_scope": [], "doc_scope_hints": [], "retrieval_constraints": {"include_globs": ["src/**"], "exclude_globs": ["tests/**", "**/test_*.py", "**/*_test.py"], "prefer_globs": [], "test_file_globs": [], "test_symbol_patterns": [], "max_candidates": 20, "fuzzy_symbol_search": {"enabled": true, "max_distance": 2, "top_k": 5}}}, "execution": {"executed_layers": [], "retrieval_mode_by_layer": {}, "top_k_by_layer": {}, "filters_by_layer": {}, "repo_scope": {"repo_id": null, "workspace_id": null}}, "retrieval": null, "constraint_violations": [], "timings_ms": {"router": 0, "symbol_resolution": 0, "retrieval_total": 0, "retrieval_by_layer": {}, "merge_rank": 0, "prompt_build": 0, "llm_call": 0}, "prompt": null, "router": {"conversation_mode": "START", "keyword_hints": ["Context"], "path_scope": []}, "llm": 
{"used_evidence_count": 0, "missing_evidence_reason": null}}, "run_info": {"case_id": "code-explain-context-class", "mode": "router_only", "run_started_at": "2026-03-12T23:12:21", "rag_session_id": "849fc2c9-e17c-4034-b3b5-2e13d535bb94", "expected_intent": "CODE_QA", "actual_intent": "CODE_QA", "graph_id": "CodeQAGraph", "conversation_mode": "START"}, "input_request": {"text": "Объясни как работает класс Context", "normalized_query": "Объясни как работает класс Context"}, "steps": [{"step": "intent_router", "input": {"query": "Объясни как работает класс Context"}, "output": {"intent": "CODE_QA", "graph_id": "CodeQAGraph", "conversation_mode": "START", "query": "Объясни как работает класс Context"}, "diagnostics": {"router_plan": {"intent": "CODE_QA", "sub_intent": "EXPLAIN", "graph_id": "CodeQAGraph", "retrieval_profile": "code", "conversation_mode": "START", "layers": ["C1_SYMBOL_CATALOG", "C0_SOURCE_CHUNKS", "C4_SEMANTIC_ROLES", "C2_DEPENDENCY_GRAPH", "C3_ENTRYPOINTS"], "symbol_kind_hint": "class", "symbol_candidates": ["Context"], "keyword_hints": ["Context"], "path_hints": [], "path_scope": [], "doc_scope_hints": [], "retrieval_constraints": {"include_globs": ["src/**"], "exclude_globs": ["tests/**", "**/test_*.py", "**/*_test.py"], "prefer_globs": [], "test_file_globs": [], "test_symbol_patterns": [], "max_candidates": 20, "fuzzy_symbol_search": {"enabled": true, "max_distance": 2, "top_k": 5}}}, "timings_ms": {"router": 0}}}]}
+{"case_id": "code-explain-excel-parser", "text": "Объясни класс ExcelFileParcer", "mode": "router_only", "run_started_at": "2026-03-12T23:12:21", "rag_session_id": "849fc2c9-e17c-4034-b3b5-2e13d535bb94", "expected_intent": "CODE_QA", "actual_intent": "CODE_QA", "graph_id": "CodeQAGraph", "conversation_mode": "START", "query": "Объясни класс ExcelFileParcer", "symbol_resolution": {"status": "pending", "resolved_symbol": null, "alternatives": ["ExcelFileParcer"], "confidence": 0.0}, "rag_count": 0, "rag_rows": [], "llm_answer": null, "summary": {"router": {"intent": "CODE_QA", "sub_intent": "EXPLAIN", "confidence": null}, "retrieval": {"profile": "code", "layers_hit": [], "evidence_sufficient": false}, "llm": {"answer_status": "partial", "groundedness": "not_applicable"}}, "diagnostics": {"router_plan": {"intent": "CODE_QA", "sub_intent": "EXPLAIN", "graph_id": "CodeQAGraph", "retrieval_profile": "code", "conversation_mode": "START", "layers": ["C1_SYMBOL_CATALOG", "C0_SOURCE_CHUNKS", "C4_SEMANTIC_ROLES", "C2_DEPENDENCY_GRAPH", "C3_ENTRYPOINTS"], "symbol_kind_hint": "class", "symbol_candidates": ["ExcelFileParcer"], "keyword_hints": ["ExcelFileParcer"], "path_hints": [], "path_scope": [], "doc_scope_hints": [], "retrieval_constraints": {"include_globs": ["src/**"], "exclude_globs": ["tests/**", "**/test_*.py", "**/*_test.py"], "prefer_globs": [], "test_file_globs": [], "test_symbol_patterns": [], "max_candidates": 20, "fuzzy_symbol_search": {"enabled": true, "max_distance": 2, "top_k": 5}}}, "execution": {"executed_layers": [], "retrieval_mode_by_layer": {}, "top_k_by_layer": {}, "filters_by_layer": {}, "repo_scope": {"repo_id": null, "workspace_id": null}}, "retrieval": null, "constraint_violations": [], "timings_ms": {"router": 0, "symbol_resolution": 0, "retrieval_total": 0, "retrieval_by_layer": {}, "merge_rank": 0, "prompt_build": 0, "llm_call": 0}, "prompt": null, "router": {"conversation_mode": "START", "keyword_hints": ["ExcelFileParcer"], "path_scope": []}, 
"llm": {"used_evidence_count": 0, "missing_evidence_reason": null}}, "run_info": {"case_id": "code-explain-excel-parser", "mode": "router_only", "run_started_at": "2026-03-12T23:12:21", "rag_session_id": "849fc2c9-e17c-4034-b3b5-2e13d535bb94", "expected_intent": "CODE_QA", "actual_intent": "CODE_QA", "graph_id": "CodeQAGraph", "conversation_mode": "START"}, "input_request": {"text": "Объясни класс ExcelFileParcer", "normalized_query": "Объясни класс ExcelFileParcer"}, "steps": [{"step": "intent_router", "input": {"query": "Объясни класс ExcelFileParcer"}, "output": {"intent": "CODE_QA", "graph_id": "CodeQAGraph", "conversation_mode": "START", "query": "Объясни класс ExcelFileParcer"}, "diagnostics": {"router_plan": {"intent": "CODE_QA", "sub_intent": "EXPLAIN", "graph_id": "CodeQAGraph", "retrieval_profile": "code", "conversation_mode": "START", "layers": ["C1_SYMBOL_CATALOG", "C0_SOURCE_CHUNKS", "C4_SEMANTIC_ROLES", "C2_DEPENDENCY_GRAPH", "C3_ENTRYPOINTS"], "symbol_kind_hint": "class", "symbol_candidates": ["ExcelFileParcer"], "keyword_hints": ["ExcelFileParcer"], "path_hints": [], "path_scope": [], "doc_scope_hints": [], "retrieval_constraints": {"include_globs": ["src/**"], "exclude_globs": ["tests/**", "**/test_*.py", "**/*_test.py"], "prefer_globs": [], "test_file_globs": [], "test_symbol_patterns": [], "max_candidates": 20, "fuzzy_symbol_search": {"enabled": true, "max_distance": 2, "top_k": 5}}}, "timings_ms": {"router": 0}}}]}
+{"case_id": "code-find-tests-for-context", "text": "Где тесты для Context?", "mode": "router_only", "run_started_at": "2026-03-12T23:12:21", "rag_session_id": "849fc2c9-e17c-4034-b3b5-2e13d535bb94", "expected_intent": "CODE_QA", "actual_intent": "CODE_QA", "graph_id": "CodeQAGraph", "conversation_mode": "START", "query": "Где тесты для Context?", "symbol_resolution": {"status": "pending", "resolved_symbol": null, "alternatives": ["Context"], "confidence": 0.0}, "rag_count": 0, "rag_rows": [], "llm_answer": null, "summary": {"router": {"intent": "CODE_QA", "sub_intent": "FIND_TESTS", "confidence": null}, "retrieval": {"profile": "code", "layers_hit": [], "evidence_sufficient": false}, "llm": {"answer_status": "partial", "groundedness": "not_applicable"}}, "diagnostics": {"router_plan": {"intent": "CODE_QA", "sub_intent": "FIND_TESTS", "graph_id": "CodeQAGraph", "retrieval_profile": "code", "conversation_mode": "START", "layers": ["C1_SYMBOL_CATALOG", "C2_DEPENDENCY_GRAPH", "C0_SOURCE_CHUNKS"], "symbol_kind_hint": "class", "symbol_candidates": ["Context"], "keyword_hints": ["Context"], "path_hints": [], "path_scope": [], "doc_scope_hints": [], "retrieval_constraints": {"include_globs": ["src/**", "tests/**", "**/test_*.py", "**/*_test.py", "**/conftest.py"], "exclude_globs": [], "prefer_globs": ["tests/**", "**/test_*.py", "**/*_test.py"], "test_file_globs": ["tests/**", "**/test_*.py", "**/*_test.py", "**/conftest.py"], "test_symbol_patterns": ["test_context", "TestContext", "Context"], "max_candidates": 20, "fuzzy_symbol_search": {"enabled": true, "max_distance": 2, "top_k": 5}}}, "execution": {"executed_layers": [], "retrieval_mode_by_layer": {}, "top_k_by_layer": {}, "filters_by_layer": {}, "repo_scope": {"repo_id": null, "workspace_id": null}}, "retrieval": null, "constraint_violations": [], "timings_ms": {"router": 0, "symbol_resolution": 0, "retrieval_total": 0, "retrieval_by_layer": {}, "merge_rank": 0, "prompt_build": 0, "llm_call": 0}, "prompt": null, 
"router": {"conversation_mode": "START", "keyword_hints": ["Context"], "path_scope": []}, "llm": {"used_evidence_count": 0, "missing_evidence_reason": null}}, "run_info": {"case_id": "code-find-tests-for-context", "mode": "router_only", "run_started_at": "2026-03-12T23:12:21", "rag_session_id": "849fc2c9-e17c-4034-b3b5-2e13d535bb94", "expected_intent": "CODE_QA", "actual_intent": "CODE_QA", "graph_id": "CodeQAGraph", "conversation_mode": "START"}, "input_request": {"text": "Где тесты для Context?", "normalized_query": "Где тесты для Context?"}, "steps": [{"step": "intent_router", "input": {"query": "Где тесты для Context?"}, "output": {"intent": "CODE_QA", "graph_id": "CodeQAGraph", "conversation_mode": "START", "query": "Где тесты для Context?"}, "diagnostics": {"router_plan": {"intent": "CODE_QA", "sub_intent": "FIND_TESTS", "graph_id": "CodeQAGraph", "retrieval_profile": "code", "conversation_mode": "START", "layers": ["C1_SYMBOL_CATALOG", "C2_DEPENDENCY_GRAPH", "C0_SOURCE_CHUNKS"], "symbol_kind_hint": "class", "symbol_candidates": ["Context"], "keyword_hints": ["Context"], "path_hints": [], "path_scope": [], "doc_scope_hints": [], "retrieval_constraints": {"include_globs": ["src/**", "tests/**", "**/test_*.py", "**/*_test.py", "**/conftest.py"], "exclude_globs": [], "prefer_globs": ["tests/**", "**/test_*.py", "**/*_test.py"], "test_file_globs": ["tests/**", "**/test_*.py", "**/*_test.py", "**/conftest.py"], "test_symbol_patterns": ["test_context", "TestContext", "Context"], "max_candidates": 20, "fuzzy_symbol_search": {"enabled": true, "max_distance": 2, "top_k": 5}}}, "timings_ms": {"router": 0}}}]}
+{"case_id": "code-exclude-tests-context", "text": "Не про тесты, а про прод код Context", "mode": "router_only", "run_started_at": "2026-03-12T23:12:21", "rag_session_id": "849fc2c9-e17c-4034-b3b5-2e13d535bb94", "expected_intent": "CODE_QA", "actual_intent": "CODE_QA", "graph_id": "CodeQAGraph", "conversation_mode": "START", "query": "Не про тесты, а про прод код Context", "symbol_resolution": {"status": "pending", "resolved_symbol": null, "alternatives": ["Context"], "confidence": 0.0}, "rag_count": 0, "rag_rows": [], "llm_answer": null, "summary": {"router": {"intent": "CODE_QA", "sub_intent": "EXPLAIN", "confidence": null}, "retrieval": {"profile": "code", "layers_hit": [], "evidence_sufficient": false}, "llm": {"answer_status": "partial", "groundedness": "not_applicable"}}, "diagnostics": {"router_plan": {"intent": "CODE_QA", "sub_intent": "EXPLAIN", "graph_id": "CodeQAGraph", "retrieval_profile": "code", "conversation_mode": "START", "layers": ["C1_SYMBOL_CATALOG", "C0_SOURCE_CHUNKS", "C4_SEMANTIC_ROLES", "C2_DEPENDENCY_GRAPH", "C3_ENTRYPOINTS"], "symbol_kind_hint": "class", "symbol_candidates": ["Context"], "keyword_hints": ["Context"], "path_hints": [], "path_scope": [], "doc_scope_hints": [], "retrieval_constraints": {"include_globs": ["src/**"], "exclude_globs": ["tests/**", "**/test_*.py", "**/*_test.py"], "prefer_globs": [], "test_file_globs": [], "test_symbol_patterns": [], "max_candidates": 20, "fuzzy_symbol_search": {"enabled": true, "max_distance": 2, "top_k": 5}}}, "execution": {"executed_layers": [], "retrieval_mode_by_layer": {}, "top_k_by_layer": {}, "filters_by_layer": {}, "repo_scope": {"repo_id": null, "workspace_id": null}}, "retrieval": null, "constraint_violations": [], "timings_ms": {"router": 0, "symbol_resolution": 0, "retrieval_total": 0, "retrieval_by_layer": {}, "merge_rank": 0, "prompt_build": 0, "llm_call": 0}, "prompt": null, "router": {"conversation_mode": "START", "keyword_hints": ["Context"], "path_scope": []}, "llm": 
{"used_evidence_count": 0, "missing_evidence_reason": null}}, "run_info": {"case_id": "code-exclude-tests-context", "mode": "router_only", "run_started_at": "2026-03-12T23:12:21", "rag_session_id": "849fc2c9-e17c-4034-b3b5-2e13d535bb94", "expected_intent": "CODE_QA", "actual_intent": "CODE_QA", "graph_id": "CodeQAGraph", "conversation_mode": "START"}, "input_request": {"text": "Не про тесты, а про прод код Context", "normalized_query": "Не про тесты, а про прод код Context"}, "steps": [{"step": "intent_router", "input": {"query": "Не про тесты, а про прод код Context"}, "output": {"intent": "CODE_QA", "graph_id": "CodeQAGraph", "conversation_mode": "START", "query": "Не про тесты, а про прод код Context"}, "diagnostics": {"router_plan": {"intent": "CODE_QA", "sub_intent": "EXPLAIN", "graph_id": "CodeQAGraph", "retrieval_profile": "code", "conversation_mode": "START", "layers": ["C1_SYMBOL_CATALOG", "C0_SOURCE_CHUNKS", "C4_SEMANTIC_ROLES", "C2_DEPENDENCY_GRAPH", "C3_ENTRYPOINTS"], "symbol_kind_hint": "class", "symbol_candidates": ["Context"], "keyword_hints": ["Context"], "path_hints": [], "path_scope": [], "doc_scope_hints": [], "retrieval_constraints": {"include_globs": ["src/**"], "exclude_globs": ["tests/**", "**/test_*.py", "**/*_test.py"], "prefer_globs": [], "test_file_globs": [], "test_symbol_patterns": [], "max_candidates": 20, "fuzzy_symbol_search": {"enabled": true, "max_distance": 2, "top_k": 5}}}, "timings_ms": {"router": 0}}}]}
+{"case_id": "code-open-abstract-task", "text": "Покажи файл src/mail_order_bot/task_processor/abstract_task.py", "mode": "router_only", "run_started_at": "2026-03-12T23:12:21", "rag_session_id": "849fc2c9-e17c-4034-b3b5-2e13d535bb94", "expected_intent": "CODE_QA", "actual_intent": "CODE_QA", "graph_id": "CodeQAGraph", "conversation_mode": "START", "query": "Покажи файл src/mail_order_bot/task_processor/abstract_task.py", "symbol_resolution": {"status": "not_requested", "resolved_symbol": null, "alternatives": [], "confidence": 0.0}, "rag_count": 0, "rag_rows": [], "llm_answer": null, "summary": {"router": {"intent": "CODE_QA", "sub_intent": "OPEN_FILE", "confidence": null}, "retrieval": {"profile": "code", "layers_hit": [], "evidence_sufficient": false}, "llm": {"answer_status": "partial", "groundedness": "not_applicable"}}, "diagnostics": {"router_plan": {"intent": "CODE_QA", "sub_intent": "OPEN_FILE", "graph_id": "CodeQAGraph", "retrieval_profile": "code", "conversation_mode": "START", "layers": ["C0_SOURCE_CHUNKS"], "symbol_kind_hint": "unknown", "symbol_candidates": [], "keyword_hints": [], "path_hints": ["src/mail_order_bot/task_processor/abstract_task.py"], "path_scope": ["src/mail_order_bot/task_processor/abstract_task.py"], "doc_scope_hints": [], "retrieval_constraints": {"include_globs": ["src/mail_order_bot/task_processor/abstract_task.py"], "exclude_globs": ["tests/**", "**/test_*.py", "**/*_test.py"], "prefer_globs": [], "test_file_globs": [], "test_symbol_patterns": [], "max_candidates": 20, "fuzzy_symbol_search": {"enabled": false, "max_distance": 2, "top_k": 5}}}, "execution": {"executed_layers": [], "retrieval_mode_by_layer": {}, "top_k_by_layer": {}, "filters_by_layer": {}, "repo_scope": {"repo_id": null, "workspace_id": null}}, "retrieval": null, "constraint_violations": [], "timings_ms": {"router": 0, "symbol_resolution": 0, "retrieval_total": 0, "retrieval_by_layer": {}, "merge_rank": 0, "prompt_build": 0, "llm_call": 0}, "prompt": null, 
"router": {"conversation_mode": "START", "keyword_hints": [], "path_scope": ["src/mail_order_bot/task_processor/abstract_task.py"]}, "llm": {"used_evidence_count": 0, "missing_evidence_reason": null}}, "run_info": {"case_id": "code-open-abstract-task", "mode": "router_only", "run_started_at": "2026-03-12T23:12:21", "rag_session_id": "849fc2c9-e17c-4034-b3b5-2e13d535bb94", "expected_intent": "CODE_QA", "actual_intent": "CODE_QA", "graph_id": "CodeQAGraph", "conversation_mode": "START"}, "input_request": {"text": "Покажи файл src/mail_order_bot/task_processor/abstract_task.py", "normalized_query": "Покажи файл src/mail_order_bot/task_processor/abstract_task.py"}, "steps": [{"step": "intent_router", "input": {"query": "Покажи файл src/mail_order_bot/task_processor/abstract_task.py"}, "output": {"intent": "CODE_QA", "graph_id": "CodeQAGraph", "conversation_mode": "START", "query": "Покажи файл src/mail_order_bot/task_processor/abstract_task.py"}, "diagnostics": {"router_plan": {"intent": "CODE_QA", "sub_intent": "OPEN_FILE", "graph_id": "CodeQAGraph", "retrieval_profile": "code", "conversation_mode": "START", "layers": ["C0_SOURCE_CHUNKS"], "symbol_kind_hint": "unknown", "symbol_candidates": [], "keyword_hints": [], "path_hints": ["src/mail_order_bot/task_processor/abstract_task.py"], "path_scope": ["src/mail_order_bot/task_processor/abstract_task.py"], "doc_scope_hints": [], "retrieval_constraints": {"include_globs": ["src/mail_order_bot/task_processor/abstract_task.py"], "exclude_globs": ["tests/**", "**/test_*.py", "**/*_test.py"], "prefer_globs": [], "test_file_globs": [], "test_symbol_patterns": [], "max_candidates": 20, "fuzzy_symbol_search": {"enabled": false, "max_distance": 2, "top_k": 5}}}, "timings_ms": {"router": 0}}}]}
+{"case_id": "code-explain-handle-errors", "text": "Теперь объясни функцию handle_errors", "mode": "router_only", "run_started_at": "2026-03-12T23:12:21", "rag_session_id": "849fc2c9-e17c-4034-b3b5-2e13d535bb94", "expected_intent": "CODE_QA", "actual_intent": "CODE_QA", "graph_id": "CodeQAGraph", "conversation_mode": "START", "query": "Теперь объясни функцию handle_errors", "symbol_resolution": {"status": "pending", "resolved_symbol": null, "alternatives": ["handle_errors"], "confidence": 0.0}, "rag_count": 0, "rag_rows": [], "llm_answer": null, "summary": {"router": {"intent": "CODE_QA", "sub_intent": "EXPLAIN", "confidence": null}, "retrieval": {"profile": "code", "layers_hit": [], "evidence_sufficient": false}, "llm": {"answer_status": "partial", "groundedness": "not_applicable"}}, "diagnostics": {"router_plan": {"intent": "CODE_QA", "sub_intent": "EXPLAIN", "graph_id": "CodeQAGraph", "retrieval_profile": "code", "conversation_mode": "START", "layers": ["C1_SYMBOL_CATALOG", "C0_SOURCE_CHUNKS", "C4_SEMANTIC_ROLES", "C2_DEPENDENCY_GRAPH", "C3_ENTRYPOINTS"], "symbol_kind_hint": "function", "symbol_candidates": ["handle_errors"], "keyword_hints": ["handle_errors"], "path_hints": [], "path_scope": [], "doc_scope_hints": [], "retrieval_constraints": {"include_globs": ["src/**"], "exclude_globs": ["tests/**", "**/test_*.py", "**/*_test.py"], "prefer_globs": [], "test_file_globs": [], "test_symbol_patterns": [], "max_candidates": 20, "fuzzy_symbol_search": {"enabled": true, "max_distance": 2, "top_k": 5}}}, "execution": {"executed_layers": [], "retrieval_mode_by_layer": {}, "top_k_by_layer": {}, "filters_by_layer": {}, "repo_scope": {"repo_id": null, "workspace_id": null}}, "retrieval": null, "constraint_violations": [], "timings_ms": {"router": 0, "symbol_resolution": 0, "retrieval_total": 0, "retrieval_by_layer": {}, "merge_rank": 0, "prompt_build": 0, "llm_call": 0}, "prompt": null, "router": {"conversation_mode": "START", "keyword_hints": ["handle_errors"], 
"path_scope": []}, "llm": {"used_evidence_count": 0, "missing_evidence_reason": null}}, "run_info": {"case_id": "code-explain-handle-errors", "mode": "router_only", "run_started_at": "2026-03-12T23:12:21", "rag_session_id": "849fc2c9-e17c-4034-b3b5-2e13d535bb94", "expected_intent": "CODE_QA", "actual_intent": "CODE_QA", "graph_id": "CodeQAGraph", "conversation_mode": "START"}, "input_request": {"text": "Теперь объясни функцию handle_errors", "normalized_query": "Теперь объясни функцию handle_errors"}, "steps": [{"step": "intent_router", "input": {"query": "Теперь объясни функцию handle_errors"}, "output": {"intent": "CODE_QA", "graph_id": "CodeQAGraph", "conversation_mode": "START", "query": "Теперь объясни функцию handle_errors"}, "diagnostics": {"router_plan": {"intent": "CODE_QA", "sub_intent": "EXPLAIN", "graph_id": "CodeQAGraph", "retrieval_profile": "code", "conversation_mode": "START", "layers": ["C1_SYMBOL_CATALOG", "C0_SOURCE_CHUNKS", "C4_SEMANTIC_ROLES", "C2_DEPENDENCY_GRAPH", "C3_ENTRYPOINTS"], "symbol_kind_hint": "function", "symbol_candidates": ["handle_errors"], "keyword_hints": ["handle_errors"], "path_hints": [], "path_scope": [], "doc_scope_hints": [], "retrieval_constraints": {"include_globs": ["src/**"], "exclude_globs": ["tests/**", "**/test_*.py", "**/*_test.py"], "prefer_globs": [], "test_file_globs": [], "test_symbol_patterns": [], "max_candidates": 20, "fuzzy_symbol_search": {"enabled": true, "max_distance": 2, "top_k": 5}}}, "timings_ms": {"router": 0}}}]}
+{"case_id": "docs-about-readme-deploy", "text": "Что сказано в README_DEPLOY.md?", "mode": "router_only", "run_started_at": "2026-03-12T23:12:21", "rag_session_id": "849fc2c9-e17c-4034-b3b5-2e13d535bb94", "expected_intent": "DOCS_QA", "actual_intent": "DOCS_QA", "graph_id": "DocsQAGraph", "conversation_mode": "START", "query": "Что сказано в README_DEPLOY.md?", "symbol_resolution": {"status": "not_requested", "resolved_symbol": null, "alternatives": [], "confidence": 0.0}, "rag_count": 0, "rag_rows": [], "llm_answer": null, "summary": {"router": {"intent": "DOCS_QA", "sub_intent": "EXPLAIN", "confidence": null}, "retrieval": {"profile": "docs", "layers_hit": [], "evidence_sufficient": false}, "llm": {"answer_status": "partial", "groundedness": "not_applicable"}}, "diagnostics": {"router_plan": {"intent": "DOCS_QA", "sub_intent": "EXPLAIN", "graph_id": "DocsQAGraph", "retrieval_profile": "docs", "conversation_mode": "START", "layers": ["D3_SECTION_INDEX", "D2_FACT_INDEX"], "symbol_kind_hint": "unknown", "symbol_candidates": [], "keyword_hints": ["README_DEPLOY.md", "readme_deploy.md", "deploy", "deployment", "docker", "compose", "env", "config", "production", "ci/cd"], "path_hints": ["README_DEPLOY.md"], "path_scope": ["README_DEPLOY.md"], "doc_scope_hints": ["README_DEPLOY.md", "README*", "docs/**", "**/*.md"], "retrieval_constraints": {"include_globs": ["README_DEPLOY.md", "docs/**", "README*", "**/*.md"], "exclude_globs": [".venv/**", "node_modules/**", "**/*.bin", "**/*.png", "**/*.jpg"], "prefer_globs": [], "test_file_globs": [], "test_symbol_patterns": [], "max_candidates": 20, "fuzzy_symbol_search": {"enabled": true, "max_distance": 2, "top_k": 5}}}, "execution": {"executed_layers": [], "retrieval_mode_by_layer": {}, "top_k_by_layer": {}, "filters_by_layer": {}, "repo_scope": {"repo_id": null, "workspace_id": null}}, "retrieval": null, "constraint_violations": [], "timings_ms": {"router": 0, "symbol_resolution": 0, "retrieval_total": 0, "retrieval_by_layer": 
{}, "merge_rank": 0, "prompt_build": 0, "llm_call": 0}, "prompt": null, "router": {"conversation_mode": "START", "keyword_hints": ["README_DEPLOY.md", "readme_deploy.md", "deploy", "deployment", "docker", "compose", "env", "config", "production", "ci/cd"], "path_scope": ["README_DEPLOY.md"]}, "llm": {"used_evidence_count": 0, "missing_evidence_reason": null}}, "run_info": {"case_id": "docs-about-readme-deploy", "mode": "router_only", "run_started_at": "2026-03-12T23:12:21", "rag_session_id": "849fc2c9-e17c-4034-b3b5-2e13d535bb94", "expected_intent": "DOCS_QA", "actual_intent": "DOCS_QA", "graph_id": "DocsQAGraph", "conversation_mode": "START"}, "input_request": {"text": "Что сказано в README_DEPLOY.md?", "normalized_query": "Что сказано в README_DEPLOY.md?"}, "steps": [{"step": "intent_router", "input": {"query": "Что сказано в README_DEPLOY.md?"}, "output": {"intent": "DOCS_QA", "graph_id": "DocsQAGraph", "conversation_mode": "START", "query": "Что сказано в README_DEPLOY.md?"}, "diagnostics": {"router_plan": {"intent": "DOCS_QA", "sub_intent": "EXPLAIN", "graph_id": "DocsQAGraph", "retrieval_profile": "docs", "conversation_mode": "START", "layers": ["D3_SECTION_INDEX", "D2_FACT_INDEX"], "symbol_kind_hint": "unknown", "symbol_candidates": [], "keyword_hints": ["README_DEPLOY.md", "readme_deploy.md", "deploy", "deployment", "docker", "compose", "env", "config", "production", "ci/cd"], "path_hints": ["README_DEPLOY.md"], "path_scope": ["README_DEPLOY.md"], "doc_scope_hints": ["README_DEPLOY.md", "README*", "docs/**", "**/*.md"], "retrieval_constraints": {"include_globs": ["README_DEPLOY.md", "docs/**", "README*", "**/*.md"], "exclude_globs": [".venv/**", "node_modules/**", "**/*.bin", "**/*.png", "**/*.jpg"], "prefer_globs": [], "test_file_globs": [], "test_symbol_patterns": [], "max_candidates": 20, "fuzzy_symbol_search": {"enabled": true, "max_distance": 2, "top_k": 5}}}, "timings_ms": {"router": 0}}}]}
+{"case_id": "docs-generic-question", "text": "Что сказано в документации по деплою?", "mode": "router_only", "run_started_at": "2026-03-12T23:12:21", "rag_session_id": "849fc2c9-e17c-4034-b3b5-2e13d535bb94", "expected_intent": "DOCS_QA", "actual_intent": "DOCS_QA", "graph_id": "DocsQAGraph", "conversation_mode": "START", "query": "Что сказано в документации по деплою?", "symbol_resolution": {"status": "not_requested", "resolved_symbol": null, "alternatives": [], "confidence": 0.0}, "rag_count": 0, "rag_rows": [], "llm_answer": null, "summary": {"router": {"intent": "DOCS_QA", "sub_intent": "EXPLAIN", "confidence": null}, "retrieval": {"profile": "docs", "layers_hit": [], "evidence_sufficient": false}, "llm": {"answer_status": "partial", "groundedness": "not_applicable"}}, "diagnostics": {"router_plan": {"intent": "DOCS_QA", "sub_intent": "EXPLAIN", "graph_id": "DocsQAGraph", "retrieval_profile": "docs", "conversation_mode": "START", "layers": ["D1_MODULE_CATALOG", "D2_FACT_INDEX", "D3_SECTION_INDEX", "D4_POLICY_INDEX"], "symbol_kind_hint": "unknown", "symbol_candidates": [], "keyword_hints": ["documentation", "docs"], "path_hints": [], "path_scope": [], "doc_scope_hints": ["README*", "docs/**", "**/*.md"], "retrieval_constraints": {"include_globs": ["docs/**", "README*", "**/*.md"], "exclude_globs": [".venv/**", "node_modules/**", "**/*.bin", "**/*.png", "**/*.jpg"], "prefer_globs": [], "test_file_globs": [], "test_symbol_patterns": [], "max_candidates": 20, "fuzzy_symbol_search": {"enabled": true, "max_distance": 2, "top_k": 5}}}, "execution": {"executed_layers": [], "retrieval_mode_by_layer": {}, "top_k_by_layer": {}, "filters_by_layer": {}, "repo_scope": {"repo_id": null, "workspace_id": null}}, "retrieval": null, "constraint_violations": [], "timings_ms": {"router": 0, "symbol_resolution": 0, "retrieval_total": 0, "retrieval_by_layer": {}, "merge_rank": 0, "prompt_build": 0, "llm_call": 0}, "prompt": null, "router": {"conversation_mode": "START", 
"keyword_hints": ["documentation", "docs"], "path_scope": []}, "llm": {"used_evidence_count": 0, "missing_evidence_reason": null}}, "run_info": {"case_id": "docs-generic-question", "mode": "router_only", "run_started_at": "2026-03-12T23:12:21", "rag_session_id": "849fc2c9-e17c-4034-b3b5-2e13d535bb94", "expected_intent": "DOCS_QA", "actual_intent": "DOCS_QA", "graph_id": "DocsQAGraph", "conversation_mode": "START"}, "input_request": {"text": "Что сказано в документации по деплою?", "normalized_query": "Что сказано в документации по деплою?"}, "steps": [{"step": "intent_router", "input": {"query": "Что сказано в документации по деплою?"}, "output": {"intent": "DOCS_QA", "graph_id": "DocsQAGraph", "conversation_mode": "START", "query": "Что сказано в документации по деплою?"}, "diagnostics": {"router_plan": {"intent": "DOCS_QA", "sub_intent": "EXPLAIN", "graph_id": "DocsQAGraph", "retrieval_profile": "docs", "conversation_mode": "START", "layers": ["D1_MODULE_CATALOG", "D2_FACT_INDEX", "D3_SECTION_INDEX", "D4_POLICY_INDEX"], "symbol_kind_hint": "unknown", "symbol_candidates": [], "keyword_hints": ["documentation", "docs"], "path_hints": [], "path_scope": [], "doc_scope_hints": ["README*", "docs/**", "**/*.md"], "retrieval_constraints": {"include_globs": ["docs/**", "README*", "**/*.md"], "exclude_globs": [".venv/**", "node_modules/**", "**/*.bin", "**/*.png", "**/*.jpg"], "prefer_globs": [], "test_file_globs": [], "test_symbol_patterns": [], "max_candidates": 20, "fuzzy_symbol_search": {"enabled": true, "max_distance": 2, "top_k": 5}}}, "timings_ms": {"router": 0}}}]}
+{"case_id": "docs-open-readme-ref", "text": "Что про это в docs и README?", "mode": "router_only", "run_started_at": "2026-03-12T23:12:21", "rag_session_id": "849fc2c9-e17c-4034-b3b5-2e13d535bb94", "expected_intent": "DOCS_QA", "actual_intent": "DOCS_QA", "graph_id": "DocsQAGraph", "conversation_mode": "START", "query": "Что про это в docs и README?", "symbol_resolution": {"status": "not_requested", "resolved_symbol": null, "alternatives": [], "confidence": 0.0}, "rag_count": 0, "rag_rows": [], "llm_answer": null, "summary": {"router": {"intent": "DOCS_QA", "sub_intent": "EXPLAIN", "confidence": null}, "retrieval": {"profile": "docs", "layers_hit": [], "evidence_sufficient": false}, "llm": {"answer_status": "partial", "groundedness": "not_applicable"}}, "diagnostics": {"router_plan": {"intent": "DOCS_QA", "sub_intent": "EXPLAIN", "graph_id": "DocsQAGraph", "retrieval_profile": "docs", "conversation_mode": "START", "layers": ["D1_MODULE_CATALOG", "D2_FACT_INDEX", "D3_SECTION_INDEX", "D4_POLICY_INDEX"], "symbol_kind_hint": "unknown", "symbol_candidates": [], "keyword_hints": ["docs", "README"], "path_hints": [], "path_scope": [], "doc_scope_hints": ["README*", "docs/**", "**/*.md", "README"], "retrieval_constraints": {"include_globs": ["docs/**", "README*", "**/*.md"], "exclude_globs": [".venv/**", "node_modules/**", "**/*.bin", "**/*.png", "**/*.jpg"], "prefer_globs": [], "test_file_globs": [], "test_symbol_patterns": [], "max_candidates": 20, "fuzzy_symbol_search": {"enabled": true, "max_distance": 2, "top_k": 5}}}, "execution": {"executed_layers": [], "retrieval_mode_by_layer": {}, "top_k_by_layer": {}, "filters_by_layer": {}, "repo_scope": {"repo_id": null, "workspace_id": null}}, "retrieval": null, "constraint_violations": [], "timings_ms": {"router": 0, "symbol_resolution": 0, "retrieval_total": 0, "retrieval_by_layer": {}, "merge_rank": 0, "prompt_build": 0, "llm_call": 0}, "prompt": null, "router": {"conversation_mode": "START", "keyword_hints": ["docs", 
"README"], "path_scope": []}, "llm": {"used_evidence_count": 0, "missing_evidence_reason": null}}, "run_info": {"case_id": "docs-open-readme-ref", "mode": "router_only", "run_started_at": "2026-03-12T23:12:21", "rag_session_id": "849fc2c9-e17c-4034-b3b5-2e13d535bb94", "expected_intent": "DOCS_QA", "actual_intent": "DOCS_QA", "graph_id": "DocsQAGraph", "conversation_mode": "START"}, "input_request": {"text": "Что про это в docs и README?", "normalized_query": "Что про это в docs и README?"}, "steps": [{"step": "intent_router", "input": {"query": "Что про это в docs и README?"}, "output": {"intent": "DOCS_QA", "graph_id": "DocsQAGraph", "conversation_mode": "START", "query": "Что про это в docs и README?"}, "diagnostics": {"router_plan": {"intent": "DOCS_QA", "sub_intent": "EXPLAIN", "graph_id": "DocsQAGraph", "retrieval_profile": "docs", "conversation_mode": "START", "layers": ["D1_MODULE_CATALOG", "D2_FACT_INDEX", "D3_SECTION_INDEX", "D4_POLICY_INDEX"], "symbol_kind_hint": "unknown", "symbol_candidates": [], "keyword_hints": ["docs", "README"], "path_hints": [], "path_scope": [], "doc_scope_hints": ["README*", "docs/**", "**/*.md", "README"], "retrieval_constraints": {"include_globs": ["docs/**", "README*", "**/*.md"], "exclude_globs": [".venv/**", "node_modules/**", "**/*.bin", "**/*.png", "**/*.jpg"], "prefer_globs": [], "test_file_globs": [], "test_symbol_patterns": [], "max_candidates": 20, "fuzzy_symbol_search": {"enabled": true, "max_distance": 2, "top_k": 5}}}, "timings_ms": {"router": 0}}}]}
diff --git a/tests/pipeline_setup_v2/runtime/code_qa_eval_adapter.py b/tests/pipeline_setup_v2/runtime/code_qa_eval_adapter.py
index fe834ea..80b4da4 100644
--- a/tests/pipeline_setup_v2/runtime/code_qa_eval_adapter.py
+++ b/tests/pipeline_setup_v2/runtime/code_qa_eval_adapter.py
@@ -1,14 +1,14 @@
 from __future__ import annotations
 
-from app.modules.agent.code_qa_runtime import CodeQaRuntimeExecutor
+from app.modules.agent.runtime import AgentRuntimeExecutor
 from app.modules.rag.contracts.enums import RagLayer
-from app.modules.rag.intent_router_v2 import RepoContext
+from app.modules.agent.intent_router_v2 import RepoContext
 from tests.pipeline_setup_v2.core.models import ExecutionPayload, V2Case
 
 
 class CodeQaEvalAdapter:
     def __init__(self) -> None:
-        self._executor = CodeQaRuntimeExecutor(llm=None)
+        self._executor = AgentRuntimeExecutor(llm=None)
 
     def execute(self, case: V2Case, rag_session_id: str | None) -> ExecutionPayload:
         if not rag_session_id:
diff --git a/tests/pipeline_setup_v2/runtime/runtime_adapter.py b/tests/pipeline_setup_v2/runtime/runtime_adapter.py
index 3cc11d3..37e0ab1 100644
--- a/tests/pipeline_setup_v2/runtime/runtime_adapter.py
+++ b/tests/pipeline_setup_v2/runtime/runtime_adapter.py
@@ -3,9 +3,9 @@ from __future__ import annotations
 from datetime import datetime
 from pathlib import Path
 
-from app.modules.agent.code_qa_runtime import CodeQaRuntimeExecutor
+from app.modules.agent.runtime import AgentRuntimeExecutor
 from app.modules.agent.llm import AgentLlmService
-from app.modules.agent.prompt_loader import PromptLoader
+from app.modules.agent.llm.prompt_loader import PromptLoader
 from app.modules.shared.gigachat.client import GigaChatClient
 from app.modules.shared.gigachat.settings import GigaChatSettings
 from app.modules.shared.gigachat.token_provider import GigaChatTokenProvider
@@ -29,7 +29,7 @@ class RuntimeAdapter:
         PipelineEnvLoader(self._test_root).load()
         self._started_at = datetime.now()
         self._rag_adapter = None
-        self._executor: CodeQaRuntimeExecutor | None = None
+        self._executor: AgentRuntimeExecutor | None = None
 
     def execute(self, case: V2Case, rag_session_id: str | None) -> ExecutionPayload:
         phrase = PhraseCase(
@@ -100,9 +100,9 @@ class RuntimeAdapter:
             )
         return self._rag_adapter
 
-    def _executor_instance(self) -> CodeQaRuntimeExecutor:
+    def _executor_instance(self) -> AgentRuntimeExecutor:
         if self._executor is None:
-            self._executor = CodeQaRuntimeExecutor(_build_llm())
+            self._executor = AgentRuntimeExecutor(_build_llm())
         return self._executor
 
 
diff --git a/tests/pipeline_setup_v3/README.md b/tests/pipeline_setup_v3/README.md
index c2a2dcf..5bc92b9 100644
--- a/tests/pipeline_setup_v3/README.md
+++ b/tests/pipeline_setup_v3/README.md
@@ -14,8 +14,8 @@ The important difference is that `v3` does not assemble a local test-only pipeli
 It uses agent components directly:
 
 - `IntentRouterV2`
-- `CodeQaRetrievalAdapter`
-- `CodeQaRuntimeExecutor`
+- `RuntimeRetrievalAdapter`
+- `AgentRuntimeExecutor`
 
 ## Run
 
diff --git a/tests/pipeline_setup_v3/runtime/agent_runtime_adapter.py b/tests/pipeline_setup_v3/runtime/agent_runtime_adapter.py
index 9ddd33d..703abc2 100644
--- a/tests/pipeline_setup_v3/runtime/agent_runtime_adapter.py
+++ b/tests/pipeline_setup_v3/runtime/agent_runtime_adapter.py
@@ -2,14 +2,15 @@ from __future__ import annotations
 
 import math
 
-from app.modules.agent.code_qa_runtime import CodeQaRuntimeExecutor
-from app.modules.agent.code_qa_runtime.repo_context import CodeQaRepoContextFactory
-from app.modules.agent.code_qa_runtime.retrieval_adapter import CodeQaRetrievalAdapter
+from app.modules.agent.runtime import (
+    AgentRuntimeExecutor,
+    RuntimeRepoContextFactory,
+    RuntimeRetrievalAdapter,
+)
 from app.modules.agent.llm import AgentLlmService
-from app.modules.agent.prompt_loader import PromptLoader
-from app.modules.rag.code_qa_pipeline.retrieval_request_builder import build_retrieval_request
-from app.modules.rag.code_qa_pipeline.retrieval_result_builder import build_retrieval_result
-from app.modules.rag.intent_router_v2 import ConversationState, IntentRouterV2
+from app.modules.agent.llm.prompt_loader import PromptLoader
+from app.modules.agent.runtime.steps.context import build_retrieval_request, build_retrieval_result
+from app.modules.agent.intent_router_v2 import ConversationState, IntentRouterV2
 from app.modules.shared.gigachat.client import GigaChatClient
 from app.modules.shared.gigachat.settings import GigaChatSettings
 from app.modules.shared.gigachat.token_provider import GigaChatTokenProvider
@@ -19,9 +20,9 @@ from tests.pipeline_setup_v3.core.models import ExecutionPayload, V3Case
 class AgentRuntimeAdapter:
     def __init__(self) -> None:
         self._router = IntentRouterV2()
-        self._repo_context_factory = CodeQaRepoContextFactory()
-        self._retrieval = CodeQaRetrievalAdapter()
-        self._executor: CodeQaRuntimeExecutor | None = None
+        self._repo_context_factory = RuntimeRepoContextFactory()
+        self._retrieval = RuntimeRetrievalAdapter()
+        self._executor: AgentRuntimeExecutor | None = None
 
     def execute(self, case: V3Case, rag_session_id: str | None) -> ExecutionPayload:
         if case.mode == "router_only":
@@ -156,9 +157,9 @@ class AgentRuntimeAdapter:
             "layers": tuple(layers or []),
         }
 
-    def _executor_instance(self) -> CodeQaRuntimeExecutor:
+    def _executor_instance(self) -> AgentRuntimeExecutor:
         if self._executor is None:
-            self._executor = CodeQaRuntimeExecutor(_build_llm())
+            self._executor = AgentRuntimeExecutor(_build_llm())
         return self._executor
 
 
diff --git a/tests/unit_tests/chat/test_direct_service.py b/tests/unit_tests/chat/test_direct_service.py
index 96d071c..f7b9141 100644
--- a/tests/unit_tests/chat/test_direct_service.py
+++ b/tests/unit_tests/chat/test_direct_service.py
@@ -3,7 +3,7 @@ import asyncio
 from app.modules.chat.direct_service import CodeExplainChatService
 from app.modules.chat.session_resolver import ChatSessionResolver
 from app.modules.chat.task_store import TaskStore
-from app.modules.rag.explain.models import ExplainIntent, ExplainPack
+from app.modules.agent.runtime.steps.explain.models import ExplainIntent, ExplainPack
 from app.schemas.chat import ChatFileContext, ChatMessageRequest
 
 
diff --git a/tests/unit_tests/rag/asserts_intent_router.py b/tests/unit_tests/rag/asserts_intent_router.py
index 6df58c8..97efbac 100644
--- a/tests/unit_tests/rag/asserts_intent_router.py
+++ b/tests/unit_tests/rag/asserts_intent_router.py
@@ -2,7 +2,7 @@ from __future__ import annotations
 
 import re
 
-from app.modules.rag.intent_router_v2.models import IntentRouterResult
+from app.modules.agent.intent_router_v2.models import IntentRouterResult
 
 
 def assert_intent(out: IntentRouterResult, expected: str) -> None:
diff --git a/tests/unit_tests/rag/intent_router_testkit.py b/tests/unit_tests/rag/intent_router_testkit.py
index e7cdc35..1fb4da9 100644
--- a/tests/unit_tests/rag/intent_router_testkit.py
+++ b/tests/unit_tests/rag/intent_router_testkit.py
@@ -3,7 +3,7 @@ from __future__ import annotations
 import json
 
 from app.modules.rag.contracts.enums import RagLayer
-from app.modules.rag.intent_router_v2 import ConversationState, IntentRouterV2, RepoContext
+from app.modules.agent.intent_router_v2 import ConversationState, IntentRouterV2, RepoContext
 
 
 def repo_context() -> RepoContext:
diff --git a/tests/unit_tests/rag/test_explain_intent_builder.py b/tests/unit_tests/rag/test_explain_intent_builder.py
index f386561..c76ea12 100644
--- a/tests/unit_tests/rag/test_explain_intent_builder.py
+++ b/tests/unit_tests/rag/test_explain_intent_builder.py
@@ -1,4 +1,4 @@
-from app.modules.rag.explain.intent_builder import ExplainIntentBuilder
+from app.modules.agent.runtime.steps.explain.intent_builder import ExplainIntentBuilder
 
 
 def test_explain_intent_builder_extracts_route_symbol_and_file_hints() -> None:
diff --git a/tests/unit_tests/rag/test_intent_router_e2e_flows.py b/tests/unit_tests/rag/test_intent_router_e2e_flows.py
index e0f4851..75f1c9f 100644
--- a/tests/unit_tests/rag/test_intent_router_e2e_flows.py
+++ b/tests/unit_tests/rag/test_intent_router_e2e_flows.py
@@ -2,7 +2,7 @@ import os
 
 import pytest
 
-from app.modules.rag.intent_router_v2 import GigaChatIntentRouterFactory
+from app.modules.agent.intent_router_v2.factory import GigaChatIntentRouterFactory
 from app.modules.shared.env_loader import load_workspace_env
 from tests.unit_tests.rag.asserts_intent_router import (
     assert_domains,
diff --git a/tests/unit_tests/rag/test_layered_gateway.py b/tests/unit_tests/rag/test_layered_gateway.py
index 49cf6ce..244d39b 100644
--- a/tests/unit_tests/rag/test_layered_gateway.py
+++ b/tests/unit_tests/rag/test_layered_gateway.py
@@ -1,4 +1,4 @@
-from app.modules.rag.explain.layered_gateway import LayeredRetrievalGateway
+from app.modules.agent.runtime.steps.explain.layered_gateway import LayeredRetrievalGateway
 
 
 class _Embedder:
diff --git a/tests/unit_tests/rag/test_query_normalization.py b/tests/unit_tests/rag/test_query_normalization.py
index 39889b9..251509a 100644
--- a/tests/unit_tests/rag/test_query_normalization.py
+++ b/tests/unit_tests/rag/test_query_normalization.py
@@ -1,6 +1,6 @@
 import pytest
 
-from app.modules.rag.intent_router_v2.analysis.normalization import QueryNormalizer
+from app.modules.agent.intent_router_v2.analysis.normalization import QueryNormalizer
 
 pytestmark = pytest.mark.intent_router
 
diff --git a/tests/unit_tests/rag/test_retriever_v2_no_fallback.py b/tests/unit_tests/rag/test_retriever_v2_no_fallback.py
index dd7f0df..14ffa2e 100644
--- a/tests/unit_tests/rag/test_retriever_v2_no_fallback.py
+++ b/tests/unit_tests/rag/test_retriever_v2_no_fallback.py
@@ -1,4 +1,4 @@
-from app.modules.rag.explain import CodeExplainRetrieverV2, LayeredRetrievalGateway
+from app.modules.agent.runtime.steps.explain import CodeExplainRetrieverV2, LayeredRetrievalGateway
 
 
 class _ExplodingEmbedder:
diff --git a/tests/unit_tests/rag/test_retriever_v2_pack.py b/tests/unit_tests/rag/test_retriever_v2_pack.py
index 0fb5c91..2b7ee28 100644
--- a/tests/unit_tests/rag/test_retriever_v2_pack.py
+++ b/tests/unit_tests/rag/test_retriever_v2_pack.py
@@ -1,5 +1,5 @@
-from app.modules.rag.explain.models import CodeLocation, LayeredRetrievalItem
-from app.modules.rag.explain.retriever_v2 import CodeExplainRetrieverV2
+from app.modules.agent.runtime.steps.explain.models import CodeLocation, LayeredRetrievalItem
+from app.modules.agent.runtime.steps.explain.retriever_v2 import CodeExplainRetrieverV2
 
 
 class _FakeGateway:
diff --git a/tests/unit_tests/rag/test_retriever_v2_production_first.py b/tests/unit_tests/rag/test_retriever_v2_production_first.py
index 0971664..151abab 100644
--- a/tests/unit_tests/rag/test_retriever_v2_production_first.py
+++ b/tests/unit_tests/rag/test_retriever_v2_production_first.py
@@ -1,7 +1,7 @@
 from types import SimpleNamespace
 
-from app.modules.rag.explain.models import CodeLocation, LayeredRetrievalItem
-from app.modules.rag.explain.retriever_v2 import CodeExplainRetrieverV2
+from app.modules.agent.runtime.steps.explain.models import CodeLocation, LayeredRetrievalItem
+from app.modules.agent.runtime.steps.explain.retriever_v2 import CodeExplainRetrieverV2
 
 
 class _ProductionFirstGateway:
diff --git a/tests/unit_tests/rag/test_trace_builder.py b/tests/unit_tests/rag/test_trace_builder.py
index 655546e..2527311 100644
--- a/tests/unit_tests/rag/test_trace_builder.py
+++ b/tests/unit_tests/rag/test_trace_builder.py
@@ -1,5 +1,5 @@
-from app.modules.rag.explain.models import CodeLocation, LayeredRetrievalItem
-from app.modules.rag.explain.trace_builder import TraceBuilder
+from app.modules.agent.runtime.steps.explain.models import CodeLocation, LayeredRetrievalItem
+from app.modules.agent.runtime.steps.explain.trace_builder import TraceBuilder
 
 
 class _FakeGraphRepository: