Compare commits
8 Commits
d07578f489...FEAT-1-age
| Author | SHA1 | Date |
|---|---|---|
| | dce020d637 | |
| | 15f8a57d3a | |
| | a990e704d9 | |
| | c8980abe2b | |
| | e210f483b7 | |
| | 20af12f47d | |
| | 5ce6335ad8 | |
| | e899f54f04 | |
3
.gitignore
vendored
@@ -1 +1,4 @@
 src/rag_agent/.env
+.env
+docker/ssh
+docker/postgres_test_data
22
Dockerfile
Normal file
@@ -0,0 +1,22 @@
+# RAG Agent app. Build from repo root (clone git@git.lesha.spb.ru:alex/RagAgent.git then docker compose build).
+FROM python:3.12-slim
+
+WORKDIR /app
+
+# Install git for optional in-image clone; app is usually COPY'd from build context
+RUN apt-get update -qq && apt-get install -y --no-install-recommends git openssh-client \
+    && rm -rf /var/lib/apt/lists/*
+
+# Copy repo (when built from cloned repo: docker compose build)
+COPY pyproject.toml ./
+COPY src ./src
+COPY README.md ./
+
+RUN pip install --no-cache-dir -e .
+
+# Default: run webhook server (override in compose or when running)
+ENV RAG_DB_DSN=""
+ENV RAG_REPO_PATH="/data"
+EXPOSE 8000
+ENTRYPOINT ["rag-agent"]
+CMD ["serve", "--host", "0.0.0.0", "--port", "8000"]
126
README.md
@@ -1,23 +1,77 @@
+
+
+
 # RAG Agent (Postgres)
 
 Custom RAG agent that indexes text files from a git repository into Postgres
-and answers queries using retrieval + LLM generation. Commits are tied to
-**stories**; indexing and retrieval can be scoped by story.
+and answers queries using retrieval + LLM generation. **Changes are always in the context of a Story**: the unit of work is the story, not individual commits. The agent indexes **all changes from all commits** in the story range (base_ref..head_ref); per-commit indexing is not used.
 
 ## Quick start
 
-1. Configure environment variables:
+1. (Optional) Run Postgres and the app via Docker (clone the repo first):
+   - `git clone git@git.lesha.spb.ru:alex/RagAgent.git && cd RagAgent`
+   - `docker compose up -d` — starts Postgres and the RAG app in one network `rag_net`; app connects to DB at host `postgres`.
+   - On first start (empty DB), scripts in `docker/postgres-init/` run automatically (extension + tables). To disable, comment out the init volume in `docker-compose.yml`.
+   - Default DSN inside the app: `postgresql://rag:rag_secret@postgres:5432/rag`. Override with `POSTGRES_*` and `RAG_REPO_PATH` (path to your knowledge-base repo, mounted into the app container).
+   - Run commands: `docker compose run --rm app index --story my-branch`, `docker compose run --rm app ask "Question?"`.
+2. Configure environment variables:
    - `RAG_REPO_PATH` — path to git repo with text files
-   - `RAG_DB_DSN` — Postgres DSN (e.g. `postgresql://user:pass@localhost:5432/rag`)
-   - `RAG_EMBEDDINGS_DIM` — embedding vector dimension (e.g. `1536`)
-2. Create DB schema:
+   - `RAG_DB_DSN` — Postgres DSN (e.g. `postgresql://rag:rag_secret@localhost:5432/rag`)
+   - `RAG_EMBEDDINGS_DIM` — embedding vector dimension: **1024** for GigaChat Embeddings (default), 1536 for OpenAI
+3. Create DB schema (only if not using Docker, or if init was disabled):
    - `python scripts/create_db.py` (or `psql "$RAG_DB_DSN" -f scripts/schema.sql`)
-3. Index files for a story (e.g. branch name as story slug):
-   - `rag-agent index --story my-branch --changed --base-ref HEAD~1 --head-ref HEAD`
-4. Ask a question (optionally scoped to a story):
+4. Index files for a story (e.g. branch name as story slug). Use the **full story range** so all commits in the story are included:
+   - `rag-agent index --story my-branch --changed --base-ref main --head-ref HEAD`
+   - Or `--base-ref auto` to use merge-base(default-branch, head-ref) as the start of the story.
+5. Ask a question (optionally scoped to a story):
    - `rag-agent ask "What is covered?"`
    - `rag-agent ask "What is covered?" --story my-branch`
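The Quick start describes `--base-ref auto` as using merge-base(default-branch, head-ref) as the start of the story. The sketch below shows one way such a base could be resolved with plain `git merge-base`. The function names and the `main` default are assumptions made for illustration; the diff does not show how `rag-agent` implements this.

```python
import subprocess
from typing import List


def merge_base_argv(default_branch: str, head_ref: str) -> List[str]:
    """Build the git command that finds the common ancestor of two refs."""
    return ["git", "merge-base", default_branch, head_ref]


def resolve_base_ref(repo_path: str, head_ref: str = "HEAD",
                     default_branch: str = "main") -> str:
    """Return the SHA of the commit where head_ref forked from default_branch.

    Hypothetical helper: "main" as the default branch is an assumption.
    """
    result = subprocess.run(
        merge_base_argv(default_branch, head_ref),
        cwd=repo_path,
        capture_output=True,
        text=True,
        check=True,  # raises CalledProcessError if git fails (e.g. unknown ref)
    )
    return result.stdout.strip()
```

The returned SHA could then be passed as `--base-ref`, so the indexed range covers every commit on the branch rather than only the latest one.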
 
+## Webhook: index on push to remote
+
+When the app runs as a service in Docker, it can start a **webhook server** so that each push to the remote repository triggers a pull and incremental indexing.
+
+1. Start the stack with the webhook server (default in Docker):
+   - `docker compose up -d` — app runs `rag-agent serve` and listens on port 8000.
+   - Repo is mounted at `RAG_REPO_PATH` (e.g. `/data`) **writable**, so the container can run `git fetch` + `git merge --ff-only` to pull changes.
+2. Clone the knowledge-base repo into the mounted directory (once), e.g. on the host: `git clone <url> ./data` so that `./data` is the worktree (or set `RAG_REPO_PATH` to that path and mount it).
+3. In GitHub (or GitLab) add a **Webhook**:
+   - URL: `http://<your-server>:8000/webhook` (use HTTPS in production and put a reverse proxy in front).
+   - Content type: `application/json`.
+   - Secret: set a shared secret and export `WEBHOOK_SECRET` in the app environment (Docker: in `docker-compose.yml` or `.env`). If `WEBHOOK_SECRET` is empty, signature is not checked.
+4. On each push to a branch, the server receives the webhook, pulls that branch into the worktree, and runs `rag-agent index --story <branch> --changed --base-ref <old_head> --head-ref <new_head>` so only changed files are re-indexed.
+
+Health check: `GET http://<host>:8000/health` → `ok`. Port is configurable via `WEBHOOK_PORT` (default 8000) in docker-compose.
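The webhook steps mention a shared secret and say the signature is not checked when `WEBHOOK_SECRET` is empty. The sketch below shows how a GitHub-style signature check is commonly written: GitHub sends an `X-Hub-Signature-256` header holding `sha256=` followed by the hex HMAC-SHA256 of the raw request body. The function name is hypothetical, and the app's real check is not shown in this diff. GitLab passes its secret token in a different header, which this sketch does not cover.

```python
import hashlib
import hmac
from typing import Optional


def signature_is_valid(secret: str, body: bytes,
                       signature_header: Optional[str]) -> bool:
    """Check a GitHub-style X-Hub-Signature-256 header against the raw body."""
    if not secret:
        # Mirrors the README: an empty WEBHOOK_SECRET disables the check.
        return True
    prefix = "sha256="
    if not signature_header or not signature_header.startswith(prefix):
        return False
    expected = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    received = signature_header[len(prefix):]
    # compare_digest runs in constant time, so timing does not leak the match.
    return hmac.compare_digest(expected.encode(), received.encode())
```

The check must run on the raw bytes of the request, before any JSON parsing, because re-serialising the payload would change the digest.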
+
+### Webhook diagnostics (202 Accepted but no new rows in DB)
+
+1. **Logs** — After a push, check app logs. Each webhook logs `pull_and_index started branch=… repo_path=…`; then one of:
+   - `not a git repo or missing` — `/data` in the container is not a git clone; clone the repo into the mounted dir.
+   - `git fetch failed` — SSH/network (see `docker/ssh/README.md`) or wrong remote.
+   - `git checkout … failed` — branch missing in the clone.
+   - `git merge --ff-only failed` — non–fast-forward (e.g. force-push); index is skipped. Use a normal push or re-clone.
+   - `no new commits for branch=…` — merge was a no-op (already up to date); nothing to index.
+   - `running index story=…` then `index completed` — index ran; check tables for that story.
+   - `index failed` — stderr shows the `rag-agent index` error (e.g. DB, embeddings, repo path).
+
+   ```bash
+   docker compose logs -f app
+   # or: docker logs -f rag-agent
+   ```
+   Trigger a push and watch for the lines above.
+
+2. **Story and tables** — Rows are per **story** (branch name). Query by story, e.g. `SELECT * FROM stories;` then `SELECT * FROM chunks WHERE story_id = (SELECT id FROM stories WHERE slug = 'main');`.
+
+3. **Manual index** — Run index inside the container to confirm DB and repo work:
+   ```bash
+   docker compose exec app rag-agent index --story main --changed --base-ref main --head-ref HEAD
+   ```
+   If this inserts rows, the issue is in the webhook path (fetch/merge/refs).
+
+4. **Allowed extensions** — Only `.md`, `.txt`, `.rst` (or `RAG_ALLOWED_EXTENSIONS`) are indexed; other files are skipped.
+
+5. **"expected 1536 dimensions, not 1024"** — GigaChat Embeddings returns 1024-dim vectors; the default is now 1024. If the DB was created earlier with vector(1536), drop and recreate the tables so the app can create them with 1024: `psql "$RAG_DB_DSN" -c "DROP TABLE IF EXISTS chunks; DROP TABLE IF EXISTS documents;"` then restart the app (ensure_schema will recreate the tables).
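The last diagnostic item describes a mismatch between the vector length returned by the embeddings model and the `vector(N)` column in the database. A minimal sketch of a guard that could surface the problem before the insert is attempted is given below. Both helper names are hypothetical; only `RAG_EMBEDDINGS_DIM` and its default of 1024 come from the README.

```python
import os
from typing import Mapping, Optional, Sequence


def expected_dim(env: Optional[Mapping[str, str]] = None) -> int:
    """Read the configured dimension; 1024 is the default named in the README."""
    source = os.environ if env is None else env
    return int(source.get("RAG_EMBEDDINGS_DIM", "1024"))


def check_embedding_dim(vector: Sequence[float], dim: int) -> None:
    """Raise early if an embedding does not match the configured dimension."""
    if len(vector) != dim:
        raise ValueError(
            f"embedding has {len(vector)} dimensions, "
            f"but RAG_EMBEDDINGS_DIM is {dim}"
        )
```

Failing in application code gives a message that names the setting to change, which is easier to act on than the raw database error.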
+
 ## Git hook (index on commit)
 
 Install the post-commit hook so changed files are indexed after each commit:
@@ -28,16 +82,62 @@ cp scripts/post-commit .git/hooks/post-commit && chmod +x .git/hooks/post-commit
 
 Story for the commit is taken from (in order): env `RAG_STORY`, file `.rag-story` in repo root (one line = slug), or current branch name.
 
+## Git hook (server-side)
+
+Use `scripts/post-receive` in the **bare repo** on the server so that pushes trigger indexing.
+
+1. On the server, create a **non-bare clone** (worktree) that the hook will update and use for indexing, e.g. `git clone /path/to/repo.git /var/rag-worktree/repo`.
+2. In the bare repo, install the hook: `cp /path/to/RagAgent/scripts/post-receive /path/to/repo.git/hooks/post-receive && chmod +x .../post-receive`.
+3. Set env for the hook (e.g. in the hook or via systemd/sshd): `RAG_REPO_PATH=/var/rag-worktree/repo`, `RAG_DB_DSN=...`, `RAG_EMBEDDINGS_DIM=...`. Optionally `RAG_AGENT_VENV` (path to venv with `rag-agent`) or `RAG_AGENT_SRC` + `RAG_AGENT_PYTHON` for `python -m rag_agent.cli`.
+4. On each push the hook updates the worktree to the new commit, then runs `rag-agent index --changed --base-ref main --head-ref newrev --story <branch>` so the story contains **all commits** on the branch (from main to newrev).
+
+Story is taken from the ref name (e.g. `refs/heads/main` → `main`).
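The server-side hook section says the story is derived from the pushed ref name and then passed to `rag-agent index`. The sketch below shows that mapping and the resulting command line. The helper names are hypothetical and the real `scripts/post-receive` is not shown in this diff; the flags are the ones quoted in step 4.

```python
from typing import List

BRANCH_PREFIX = "refs/heads/"


def story_from_ref(ref: str) -> str:
    """Map a pushed ref such as refs/heads/main to the story slug main."""
    if not ref.startswith(BRANCH_PREFIX):
        raise ValueError(f"not a branch ref: {ref!r}")
    return ref[len(BRANCH_PREFIX):]


def index_argv(ref: str, new_rev: str, base_ref: str = "main") -> List[str]:
    """Build the index command for one pushed branch (flags as in step 4)."""
    return [
        "rag-agent", "index", "--changed",
        "--base-ref", base_ref,
        "--head-ref", new_rev,
        "--story", story_from_ref(ref),
    ]
```

Rejecting refs outside `refs/heads/` keeps tag pushes from creating stories by accident; whether the real hook does this is an assumption.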
+
 ## DB structure
 
-- **stories** — story slug (e.g. branch name); documents and chunks are tied to a story.
+- **stories** — story slug (e.g. branch name); documents and chunks are tied to a story. Optional: `indexed_base_ref`, `indexed_head_ref`, `indexed_at` record the git range that was indexed (all commits in that range belong to the story).
 - **documents** — path + version per story; unique `(story_id, path)`.
-- **chunks** — text chunks with embeddings (pgvector); updated when documents are re-indexed.
+- **chunks** — text chunks with embeddings (pgvector), plus:
+  - `start_line`, `end_line` — position in the source file (for requirements/use-case files).
+  - `change_type` — `added` | `modified` | `unchanged` (relative to base ref when indexing with `--changed`).
+  - `previous_content` — for `modified` chunks, the content before the change (for test-case generation).
+
+Indexing is **always per-story**: `base_ref..head_ref` defines the set of commits that belong to the story. Use `--base-ref main` (or `auto`) and `--head-ref HEAD` so the story contains all commits on the branch, not a single commit. When you run `index --changed`, the base ref is compared to head; each chunk is marked as added, modified, or unchanged.
+
+### What changed in a story (for test cases)
+
+To get only the chunks that were added or modified in a story (e.g. to generate test cases for the changed part):
+
+```python
+from rag_agent.index import fetch_changed_chunks
+
+changed = fetch_changed_chunks(conn, story_id)
+for r in changed:
+    # r.path, r.content, r.change_type, r.start_line, r.end_line, r.previous_content
+    ...
+```
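One way the rows from `fetch_changed_chunks` might be prepared for test-case generation is to group them per file and order them by position. The sketch below uses a hypothetical `ChangedChunk` dataclass as a stand-in for the returned rows. Its attribute names come from the comment in the README snippet above, but the real return type is an assumption.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass
class ChangedChunk:
    """Hypothetical stand-in for one row returned by fetch_changed_chunks."""
    path: str
    content: str
    change_type: str  # "added" or "modified"
    start_line: Optional[int] = None
    end_line: Optional[int] = None
    previous_content: Optional[str] = None


def group_by_path(chunks: Iterable[ChangedChunk]) -> Dict[str, List[ChangedChunk]]:
    """Group changed chunks per file, ordered by their starting line."""
    grouped: Dict[str, List[ChangedChunk]] = defaultdict(list)
    for chunk in chunks:
        grouped[chunk.path].append(chunk)
    for items in grouped.values():
        # Chunks without line info sort first rather than raising on None.
        items.sort(key=lambda c: c.start_line if c.start_line is not None else 0)
    return dict(grouped)
```

For `modified` chunks, `previous_content` and `content` can then be compared side by side within each file.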
 
 Scripts: `scripts/create_db.py` (Python, uses `ensure_schema` and `RAG_*` env), `scripts/schema.sql` (raw SQL).
 
+## Embeddings (GigaChat)
+
+If `GIGACHAT_CREDENTIALS` is set (e.g. in `.env` for local runs), embeddings use GigaChat API; otherwise the stub client is used. Optional env: `GIGACHAT_EMBEDDINGS_MODEL` (default `Embeddings`), `GIGACHAT_VERIFY_SSL` (`true`/`false`). Ensure `RAG_EMBEDDINGS_DIM` matches the model output (see GigaChat docs).
+
+## Agent (GigaChat)
+
+Answers to questions are produced by a GigaChat-based agent: knowledge-base search (RAG) plus text generation. If `GIGACHAT_CREDENTIALS` is set, `GigaChatLLMClient` in `src/rag_agent/agent/pipeline.py` is used; otherwise a stub is used. The chat model is set via `RAG_LLM_MODEL` (default `GigaChat`).
+
+## Telegram bot
+
+Users talk to the agent through a Telegram bot: the bot answers text messages using knowledge from the database (RAG + GigaChat).
+
+1. Create a bot via [@BotFather](https://t.me/BotFather) and obtain a token.
+2. Add to `.env`: `TELEGRAM_BOT_TOKEN=<token>`.
+3. Run: `rag-agent bot` (or `python -m rag_agent.telegram_bot`).
+4. With Docker: `docker compose up -d` starts the DB, the webhook server, and the bot in separate containers; `TELEGRAM_BOT_TOKEN` must be set in `.env`.
+
+Required: `RAG_DB_DSN`, `RAG_REPO_PATH`, `GIGACHAT_CREDENTIALS`, `TELEGRAM_BOT_TOKEN`. Verbose logging (incoming messages, number of embeddings, number of chunks from the DB, LLM response): `RAG_BOT_VERBOSE_LOGGING=true|false` (default `true`, for debugging).
+
 ## Notes
 
-- The default embedding/LLM clients are stubs. Replace them in
-  `src/rag_agent/index/embeddings.py` and `src/rag_agent/agent/pipeline.py`.
 - This project requires Postgres with the `pgvector` extension.
91
docker-compose.yml
Normal file
@@ -0,0 +1,91 @@
+# Postgres with pgvector + RAG Agent app (from repo git@git.lesha.spb.ru:alex/RagAgent.git).
+# Clone the repo, then: docker compose up -d
+# App and DB share network "rag_net"; app uses RAG_DB_DSN with host=postgres.
+# DB init: scripts in docker/postgres-init/ run on first start (empty volume); to disable, comment out the init volume.
+
+services:
+  postgres:
+    image: pgvector/pgvector:pg16
+    container_name: rag-postgres
+    environment:
+      POSTGRES_USER: ${POSTGRES_USER:-rag}
+      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-rag_secret}
+      POSTGRES_DB: ${POSTGRES_DB:-rag}
+    ports:
+      - "${POSTGRES_PORT:-5432}:5432"
+    volumes:
+      # PG 18+: mount at /var/lib/postgresql (data goes in versioned subdir). For pg16 use /var/lib/postgresql/data.
+      - rag_pgdata:/var/lib/postgresql
+      # Init scripts run once on first start (create extension, tables). Optional: comment out to skip.
+      - ./docker/postgres-init:/docker-entrypoint-initdb.d:ro
+    healthcheck:
+      test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER:-rag} -d ${POSTGRES_DB:-rag}"]
+      interval: 5s
+      timeout: 5s
+      retries: 5
+    networks:
+      - rag_net
+
+  app:
+    build:
+      context: .
+      dockerfile: Dockerfile
+    image: rag-agent:latest
+    container_name: rag-agent
+    restart: unless-stopped
+    depends_on:
+      postgres:
+        condition: service_healthy
+    ports:
+      - "${WEBHOOK_PORT:-8000}:8000"
+    environment:
+      RAG_DB_DSN: "postgresql://${POSTGRES_USER:-rag}:${POSTGRES_PASSWORD:-rag_secret}@postgres:5432/${POSTGRES_DB:-rag}"
+      # In container repo is always at /data (mounted below). Use RAG_REPO_HOST in .env for host path.
+      RAG_REPO_PATH: "/data"
+      # Accept host key on first connect; git fetch uses SSH from /root/.ssh (mounted below).
+      GIT_SSH_COMMAND: "ssh -o StrictHostKeyChecking=accept-new"
+      RAG_EMBEDDINGS_DIM: ${RAG_EMBEDDINGS_DIM:-1024}
+      GIGACHAT_CREDENTIALS: ${GIGACHAT_CREDENTIALS:-}
+      GIGACHAT_EMBEDDINGS_MODEL: ${GIGACHAT_EMBEDDINGS_MODEL:-Embeddings}
+      WEBHOOK_SECRET: ${WEBHOOK_SECRET:-}
+    volumes:
+      # Host path: set RAG_REPO_HOST in .env (e.g. /Users/you/repo). Falls back to RAG_REPO_PATH then ./data.
+      - ${RAG_REPO_HOST:-${RAG_REPO_PATH:-./data}}:/data
+      # SSH for git fetch (webhook): put deploy key and known_hosts in RAG_SSH_DIR. See docker/ssh/README.md.
+      - ${RAG_SSH_DIR:-./docker/ssh}:/root/.ssh:ro
+    entrypoint: ["rag-agent"]
+    command: ["serve", "--host", "0.0.0.0", "--port", "8000"]
+    networks:
+      - rag_net
+
+  bot:
+    build:
+      context: .
+      dockerfile: Dockerfile
+    image: rag-agent:latest
+    container_name: rag-bot
+    restart: unless-stopped
+    depends_on:
+      postgres:
+        condition: service_healthy
+    environment:
+      RAG_DB_DSN: "postgresql://${POSTGRES_USER:-rag}:${POSTGRES_PASSWORD:-rag_secret}@postgres:5432/${POSTGRES_DB:-rag}"
+      RAG_REPO_PATH: "/data"
+      RAG_EMBEDDINGS_DIM: ${RAG_EMBEDDINGS_DIM:-1024}
+      GIGACHAT_CREDENTIALS: ${GIGACHAT_CREDENTIALS:-}
+      GIGACHAT_EMBEDDINGS_MODEL: ${GIGACHAT_EMBEDDINGS_MODEL:-Embeddings}
+      TELEGRAM_BOT_TOKEN: ${TELEGRAM_BOT_TOKEN:-}
+      RAG_BOT_VERBOSE_LOGGING: ${RAG_BOT_VERBOSE_LOGGING:-true}
+    volumes:
+      - ${RAG_REPO_HOST:-${RAG_REPO_PATH:-./data}}:/data
+    entrypoint: ["rag-agent"]
+    command: ["bot"]
+    networks:
+      - rag_net
+
+networks:
+  rag_net:
+    driver: bridge
+
+volumes:
+  rag_pgdata:
7
docker/postgres-init/00-example-extra-user.sql.example
Normal file
@@ -0,0 +1,7 @@
+-- Example: create an extra DB user (e.g. read-only). Not executed — rename to 00-create-extra-user.sql to enable.
+-- Scripts in this folder run in alphabetical order; 00-* runs before 01-schema.sql.
+
+-- CREATE USER rag_readonly WITH PASSWORD 'change_me';
+-- GRANT CONNECT ON DATABASE rag TO rag_readonly;
+-- GRANT USAGE ON SCHEMA public TO rag_readonly;
+-- GRANT SELECT ON ALL TABLES IN SCHEMA public TO rag_readonly;
41
docker/postgres-init/01-schema.sql
Normal file
@@ -0,0 +1,41 @@
+-- RAG vector DB schema (runs automatically on first Postgres init).
+-- GigaChat Embeddings = 1024; for OpenAI use vector(1536).
+
+CREATE EXTENSION IF NOT EXISTS vector;
+
+CREATE TABLE IF NOT EXISTS stories (
+    id SERIAL PRIMARY KEY,
+    slug TEXT UNIQUE NOT NULL,
+    created_at TIMESTAMPTZ NOT NULL DEFAULT (NOW() AT TIME ZONE 'utc'),
+    indexed_base_ref TEXT,
+    indexed_head_ref TEXT,
+    indexed_at TIMESTAMPTZ
+);
+
+CREATE TABLE IF NOT EXISTS documents (
+    id SERIAL PRIMARY KEY,
+    story_id INTEGER NOT NULL REFERENCES stories(id) ON DELETE CASCADE,
+    path TEXT NOT NULL,
+    version TEXT NOT NULL,
+    updated_at TIMESTAMPTZ NOT NULL,
+    UNIQUE(story_id, path)
+);
+
+CREATE TABLE IF NOT EXISTS chunks (
+    id SERIAL PRIMARY KEY,
+    document_id INTEGER NOT NULL REFERENCES documents(id) ON DELETE CASCADE,
+    chunk_index INTEGER NOT NULL,
+    hash TEXT NOT NULL,
+    content TEXT NOT NULL,
+    embedding vector(1024) NOT NULL,
+    start_line INTEGER,
+    end_line INTEGER,
+    change_type TEXT NOT NULL DEFAULT 'added'
+        CHECK (change_type IN ('added', 'modified', 'unchanged')),
+    previous_content TEXT
+);
+
+CREATE INDEX IF NOT EXISTS idx_documents_story_id ON documents(story_id);
+CREATE INDEX IF NOT EXISTS idx_chunks_document_id ON chunks(document_id);
+CREATE INDEX IF NOT EXISTS idx_chunks_embedding ON chunks USING ivfflat (embedding vector_cosine_ops);
+CREATE INDEX IF NOT EXISTS idx_chunks_change_type ON chunks(change_type);
9
docker/postgres-init/README.md
Normal file
@@ -0,0 +1,9 @@
+# Postgres init scripts (optional)
+
+Files here are mounted into the Postgres container at `/docker-entrypoint-initdb.d/` and run **only on first startup** (when the data volume is empty), in alphabetical order.
+
+- `01-schema.sql` — creates pgvector extension and RAG tables (stories, documents, chunks).
+- To add more users or other setup, add scripts with names like `00-create-user.sql` (they run before `01-schema.sql`).
+- To disable init: in `docker-compose.yml`, comment out the postgres volume that mounts this folder, or remove/rename the `.sql` files.
+
+After the first run, these scripts are not executed again. To re-run them, remove the volume: `docker compose down -v` (this deletes DB data), then `docker compose up -d`.
1
docker/postgres_test_data/18/docker/PG_VERSION
Normal file
@@ -0,0 +1 @@
+18
BIN
docker/postgres_test_data/18/docker/base/1/112
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/112
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/113
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/113
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/1247
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/1247
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/1247_fsm
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/1247_fsm
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/1247_vm
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/1247_vm
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/1249
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/1249
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/1249_fsm
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/1249_fsm
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/1249_vm
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/1249_vm
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/1255
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/1255
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/1255_fsm
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/1255_fsm
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/1255_vm
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/1255_vm
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/1259
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/1259
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/1259_fsm
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/1259_fsm
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/1259_vm
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/1259_vm
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/13476
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/13476
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/13476_fsm
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/13476_fsm
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/13476_vm
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/13476_vm
Normal file
Binary file not shown.
0
docker/postgres_test_data/18/docker/base/1/13479
Normal file
0
docker/postgres_test_data/18/docker/base/1/13479
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/13480
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/13480
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/13481
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/13481
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/13481_fsm
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/13481_fsm
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/13481_vm
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/13481_vm
Normal file
Binary file not shown.
0
docker/postgres_test_data/18/docker/base/1/13484
Normal file
0
docker/postgres_test_data/18/docker/base/1/13484
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/13485
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/13485
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/13486
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/13486
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/13486_fsm
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/13486_fsm
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/13486_vm
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/13486_vm
Normal file
Binary file not shown.
0
docker/postgres_test_data/18/docker/base/1/13489
Normal file
0
docker/postgres_test_data/18/docker/base/1/13489
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/13490
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/13490
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/13491
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/13491
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/13491_fsm
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/13491_fsm
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/13491_vm
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/13491_vm
Normal file
Binary file not shown.
0
docker/postgres_test_data/18/docker/base/1/13494
Normal file
0
docker/postgres_test_data/18/docker/base/1/13494
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/13495
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/13495
Normal file
Binary file not shown.
0
docker/postgres_test_data/18/docker/base/1/1417
Normal file
0
docker/postgres_test_data/18/docker/base/1/1417
Normal file
0
docker/postgres_test_data/18/docker/base/1/1418
Normal file
0
docker/postgres_test_data/18/docker/base/1/1418
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/174
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/174
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/175
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/175
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2187
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/2187
Normal file
Binary file not shown.
0
docker/postgres_test_data/18/docker/base/1/2224
Normal file
0
docker/postgres_test_data/18/docker/base/1/2224
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/2228
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/2228
Normal file
Binary file not shown.
0
docker/postgres_test_data/18/docker/base/1/2328
Normal file
0
docker/postgres_test_data/18/docker/base/1/2328
Normal file
0
docker/postgres_test_data/18/docker/base/1/2336
Normal file
0
docker/postgres_test_data/18/docker/base/1/2336
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/2337
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/2337
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2579
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/2579
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2600
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/2600
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2600_fsm
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/2600_fsm
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2600_vm
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/2600_vm
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2601
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/2601
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2601_fsm
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/2601_fsm
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2601_vm
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/2601_vm
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2602
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/2602
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2602_fsm
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/2602_fsm
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2602_vm
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/2602_vm
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2603
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/2603
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2603_fsm
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/2603_fsm
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2603_vm
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/2603_vm
Normal file
Binary file not shown.
0
docker/postgres_test_data/18/docker/base/1/2604
Normal file
0
docker/postgres_test_data/18/docker/base/1/2604
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/2605
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/2605
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2605_fsm
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/2605_fsm
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2605_vm
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/2605_vm
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2606
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/2606
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2606_fsm
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/2606_fsm
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2606_vm
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/2606_vm
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2607
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2607_fsm
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2607_vm
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2608
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2608_fsm
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2608_vm
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2609
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2609_fsm
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2609_vm
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2610
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2610_fsm
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2610_vm
Normal file
Binary file not shown.
0
docker/postgres_test_data/18/docker/base/1/2611
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/2612
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2612_fsm
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2612_vm
Normal file
Binary file not shown.
0
docker/postgres_test_data/18/docker/base/1/2613
Normal file
BIN
docker/postgres_test_data/18/docker/base/1/2615
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2615_fsm
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2615_vm
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2616
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2616_fsm
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2616_vm
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2617
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2617_fsm
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2617_vm
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2618
Normal file
Binary file not shown.
BIN
docker/postgres_test_data/18/docker/base/1/2618_fsm
Normal file
Binary file not shown.
Some files were not shown because too many files have changed in this diff.