zosimovaa da8ed4fa2b docs: add healthcheck requirements and README_DEPLOY

- Add docs/HEALTHCHECK_REQUIREMENTS.md with full spec (purpose, deploy.sh
  behaviour, endpoint contract, get_health_status(), app requirements,
  infrastructure)
- Add README_DEPLOY.md with healthcheck section, link to requirements
  and HEALTHCHECK_* env vars

Co-authored-by: Cursor <cursoragent@cursor.com>

2026-02-18 23:11:53 +03:00

Healthcheck Requirements

A unified specification for implementing the healthcheck in config_manager and in applications (including MailOrderBot).

1. Purpose

The healthcheck is used by the deploy script (deploy.sh): after docker compose up -d, the deploy waits for a successful response from HEALTHCHECK_URL; on timeout it rolls back and exits with an error.
The endpoint must reflect the actual state of the application, not merely the fact that the HTTP server is running; otherwise the deploy may treat a hung or crashed worker as a successful start.

2. deploy.sh behaviour

(This logic may already be implemented in the deploy script.)

If HEALTHCHECK_URL is set, wait_for_healthcheck is called after the containers are brought up: it polls in a loop at intervals of HEALTHCHECK_INTERVAL (default 5 s) until HEALTHCHECK_TIMEOUT (default 120 s) expires.
Check: curl -fsS --max-time 5 "$HEALTHCHECK_URL". With the -f flag, any HTTP 4xx/5xx response counts as a failure, and the check is retried until the timeout.
Success: HTTP 2xx. Failure: a non-2xx response or a curl/connection timeout. If no attempt succeeds before HEALTHCHECK_TIMEOUT expires, the deploy fails with rollback.

Environment variables

| Variable | Purpose | Example/default |
| --- | --- | --- |
| `HEALTHCHECK_URL` | URL to check | `http://127.0.0.1:8000/health` |
| `HEALTHCHECK_TIMEOUT` | Maximum wait time (s) | `120` |
| `HEALTHCHECK_INTERVAL` | Interval between attempts (s) | `5` |
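The polling behaviour described above can be modelled in a few lines. deploy.sh itself is a shell script that calls curl; the Python sketch below only illustrates the loop semantics and the defaults from the table. The function names `settings_from_env` and `check_once` are hypothetical, and the `check`, `sleep` and `clock` parameters are assumptions added so the loop can be exercised without a real server.

```python
import os
import time
import urllib.request


def settings_from_env(env=os.environ):
    """Read the three HEALTHCHECK_* variables, applying the documented defaults."""
    url = env.get("HEALTHCHECK_URL")  # unset means: skip the healthcheck
    timeout = float(env.get("HEALTHCHECK_TIMEOUT", "120"))
    interval = float(env.get("HEALTHCHECK_INTERVAL", "5"))
    return url, timeout, interval


def check_once(url, max_time=5.0):
    """Approximate one `curl -fsS --max-time 5` attempt: True only on HTTP 2xx."""
    try:
        with urllib.request.urlopen(url, timeout=max_time) as resp:
            return 200 <= resp.status < 300
    except OSError:
        # URLError and HTTPError (raised for 4xx/5xx) are OSError subclasses,
        # as are socket timeouts and refused connections.
        return False


def wait_for_healthcheck(url, timeout, interval,
                         check=check_once, sleep=time.sleep, clock=time.monotonic):
    """Poll until a check succeeds or `timeout` seconds have elapsed."""
    deadline = clock() + timeout
    while clock() < deadline:
        if check(url):
            return True
        sleep(interval)
    return False  # the caller would roll back and exit with an error
```

A caller would typically skip the wait when `settings_from_env()` returns no URL, and trigger the rollback path when `wait_for_healthcheck` returns `False`.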

3. Endpoint contract

Method and path: GET /health (or another agreed path; the same one for all applications using config_manager).
Success (application healthy): HTTP 200, with an optional JSON body: {"status": "ok"}.
Application not healthy: HTTP 503, with an optional body: {"status": "unhealthy"|"degraded", "detail": "reason"}.
The response must arrive within a reasonable time (the recommended timeout for calling the check logic in config_manager is 2–5 s); otherwise curl --max-time 5 in the deploy script times out and the request is retried.
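To make the contract concrete, here is a minimal sketch of an endpoint that follows it, using only the Python standard library. It is not the config_manager implementation: `health_response`, `HealthHandler` and the `current_status` attribute are hypothetical names chosen for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler
from typing import Optional, Tuple


def health_response(status: str, detail: Optional[str] = None) -> Tuple[int, bytes]:
    """Map a status to the contract: 200 for "ok", 503 for anything else."""
    code = 200 if status == "ok" else 503
    body = {"status": status}
    if detail is not None:
        body["detail"] = detail
    return code, json.dumps(body).encode("utf-8")


class HealthHandler(BaseHTTPRequestHandler):
    # Hypothetical placeholder; a real server would ask the application instead.
    current_status = ("ok", None)

    def do_GET(self):
        if self.path != "/health":
            self.send_error(404)
            return
        code, body = health_response(*self.current_status)
        self.send_response(code)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep healthcheck polling out of the log


# To serve it (blocks forever):
#   from http.server import HTTPServer
#   HTTPServer(("0.0.0.0", 8000), HealthHandler).serve_forever()
```

Because both "unhealthy" and "degraded" map to 503, `curl -f` treats them as failures without having to parse the body.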

4. Application feedback (config_manager)

The endpoint is implemented in config_manager (optional; active only when the option is enabled).
When handling GET /health, config_manager does not decide by itself whether the application is healthy; it calls an application method, e.g. app.get_health_status().
Method contract (the application overrides the method in a subclass):
- Returns a dict: {"status": "ok" | "degraded" | "unhealthy", "detail": "..."} (the detail field is optional).
- On an exception or a call timeout, the state is treated as unhealthy and 503 is returned.

Thus, the main application provides feedback through its implementation of get_health_status() (failure flags, a DB check, a heartbeat, etc.).
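One way the calling side could enforce this contract is sketched below. It assumes the call is run in a worker thread so that a timeout can be applied; `evaluate_health` is a hypothetical helper name, the 3-second default is simply a value inside the recommended 2–5 s range, and treating a malformed return value as unhealthy is an assumption not stated in the spec.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

VALID_STATUSES = {"ok", "degraded", "unhealthy"}


def evaluate_health(app, timeout=3.0):
    """Call app.get_health_status() and return (http_code, body_dict)."""
    executor = ThreadPoolExecutor(max_workers=1)
    try:
        future = executor.submit(app.get_health_status)
        result = future.result(timeout=timeout)
    except FutureTimeout:
        return 503, {"status": "unhealthy", "detail": "health check timed out"}
    except Exception as exc:
        return 503, {"status": "unhealthy",
                     "detail": "health check raised " + type(exc).__name__}
    finally:
        # Do not block the HTTP response on a hung check; the worker thread
        # itself cannot be killed and keeps running until the call returns.
        executor.shutdown(wait=False)

    # Assumption: anything outside the contract is reported as unhealthy.
    if not isinstance(result, dict) or result.get("status") not in VALID_STATUSES:
        return 503, {"status": "unhealthy", "detail": "invalid health status"}

    body = {"status": result["status"]}
    if "detail" in result:
        body["detail"] = result["detail"]
    return (200 if body["status"] == "ok" else 503), body
```

A thread-based timeout only stops waiting for the call, so an application whose check can hang should still bound its own I/O (for example with a socket or query timeout).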

5. Application requirements (e.g. MailOrderBot)

Override get_health_status() and return:
- {"status": "ok"} when running normally;
- {"status": "unhealthy", "detail": "..."} on a critical failure (e.g. the last execute() failed or a dependency is unavailable);
- optionally, {"status": "degraded", "detail": "..."} for non-fatal degradation (for both unhealthy and degraded the endpoint returns 503, for compatibility with curl -f).
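As an illustration of these requirements, the sketch below tracks whether the last execute() failed and reports it through get_health_status(). The class names `BaseApp` and `MailOrderBotApp`, the `execute(job)` signature and the `degraded_reason` attribute are all hypothetical stand-ins; the real base class and method signatures in config_manager may differ.

```python
class BaseApp:
    """Hypothetical stand-in for the config_manager base class."""

    def get_health_status(self):
        return {"status": "ok"}


class MailOrderBotApp(BaseApp):
    """Hypothetical application that reports failures via get_health_status()."""

    def __init__(self):
        self._last_error = None      # set when the last execute() failed
        self.degraded_reason = None  # set for non-fatal problems

    def execute(self, job):
        """Run one unit of work and remember whether it failed."""
        try:
            job()
        except Exception as exc:
            self._last_error = "{}: {}".format(type(exc).__name__, exc)
            raise
        else:
            self._last_error = None  # a successful run clears the flag

    def get_health_status(self):
        if self._last_error is not None:
            return {"status": "unhealthy",
                    "detail": "last execute() failed: " + self._last_error}
        if self.degraded_reason is not None:
            return {"status": "degraded", "detail": self.degraded_reason}
        return {"status": "ok"}
```

Clearing the flag after a successful run lets the endpoint recover on its own; an application that should stay unhealthy until restarted would omit that reset.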

6. Infrastructure

For the healthcheck to work, the application must run an HTTP server (provided by config_manager when the option is enabled) and publish the port in docker-compose (e.g. 8000:8000), so that deploy.sh on the host can reach HEALTHCHECK_URL (e.g. http://127.0.0.1:8000/health).
