RAG Agent (Postgres)
Custom RAG agent that indexes text files from a git repository into Postgres and answers queries using retrieval + LLM generation. Commits are tied to stories; indexing and retrieval can be scoped by story.
Quick start
- Configure environment variables:
  - RAG_REPO_PATH: path to git repo with text files
  - RAG_DB_DSN: Postgres DSN (e.g. postgresql://user:pass@localhost:5432/rag)
  - RAG_EMBEDDINGS_DIM: embedding vector dimension (e.g. 1536)
- Create DB schema:
python scripts/create_db.py (or psql "$RAG_DB_DSN" -f scripts/schema.sql)
- Index files for a story (e.g. branch name as story slug):
rag-agent index --story my-branch --changed --base-ref HEAD~1 --head-ref HEAD
- Ask a question (optionally scoped to a story):
rag-agent ask "What is covered?"
rag-agent ask "What is covered?" --story my-branch
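The environment variables from the first step can be checked before running any command. The following is a minimal sketch, not the project's own loader: the names RagSettings and load_settings are hypothetical, and it assumes all three variables are required and that RAG_EMBEDDINGS_DIM must be a positive integer.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class RagSettings:
    """Hypothetical container for the three RAG_* settings."""
    repo_path: str
    db_dsn: str
    embeddings_dim: int


def load_settings(env=None):
    """Read and validate the RAG_* variables from a mapping (default: os.environ)."""
    env = os.environ if env is None else env
    required = ("RAG_REPO_PATH", "RAG_DB_DSN", "RAG_EMBEDDINGS_DIM")
    # Assumption: every variable is mandatory; an empty value counts as missing.
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError("missing environment variables: " + ", ".join(missing))
    dim = int(env["RAG_EMBEDDINGS_DIM"])  # raises ValueError if not an integer
    if dim <= 0:
        raise ValueError("RAG_EMBEDDINGS_DIM must be a positive integer")
    return RagSettings(env["RAG_REPO_PATH"], env["RAG_DB_DSN"], dim)
```

Calling load_settings() with no arguments reads the real environment. Passing a dict instead makes the check easy to exercise in tests.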
Git hook (index on commit)
Install the post-commit hook so changed files are indexed after each commit:
cp scripts/post-commit .git/hooks/post-commit && chmod +x .git/hooks/post-commit
The story for the commit is taken from the first of these that is available: the RAG_STORY environment variable, a .rag-story file in the repo root (a single line containing the slug), or the current branch name.
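The lookup order can be sketched in Python as follows. This is an illustration only, not the hook's actual code: resolve_story is a hypothetical name, and treating an empty RAG_STORY or an empty .rag-story file as "not set" is an assumption.

```python
import os
import subprocess
from pathlib import Path


def resolve_story(repo_root, env=None):
    """Return the story slug using the order: RAG_STORY, .rag-story, branch name."""
    env = os.environ if env is None else env

    # 1. Explicit override from the environment.
    slug = env.get("RAG_STORY", "").strip()
    if slug:
        return slug

    # 2. First line of the .rag-story file in the repo root, if present.
    story_file = Path(repo_root) / ".rag-story"
    if story_file.is_file():
        lines = story_file.read_text(encoding="utf-8").splitlines()
        if lines and lines[0].strip():
            return lines[0].strip()

    # 3. Fall back to the current branch name.
    #    Note: on a detached HEAD this git command prints "HEAD".
    result = subprocess.run(
        ["git", "rev-parse", "--abbrev-ref", "HEAD"],
        cwd=repo_root, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

For example, RAG_STORY=hotfix-42 git commit -m "..." would index the commit under hotfix-42 regardless of the branch or the .rag-story file.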
DB structure
- stories: story slug (e.g. branch name); documents and chunks are tied to a story.
- documents: path + version per story; unique (story_id, path).
- chunks: text chunks with embeddings (pgvector); updated when documents are re-indexed.
Scripts: scripts/create_db.py (Python, uses ensure_schema and RAG_* env), scripts/schema.sql (raw SQL).
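The relationships between the three tables can be modelled in memory to show the intended behaviour. This is a rough sketch under stated assumptions, not the real schema (which lives in scripts/schema.sql): the class names, the integer story ids, and the replace-all-chunks-on-re-index behaviour are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    story_id: int
    path: str
    version: str
    # Each chunk is a (text, embedding) pair; the real table stores pgvector values.
    chunks: list = field(default_factory=list)


class InMemoryIndex:
    """Toy stand-in for the stories/documents/chunks tables."""

    def __init__(self):
        self.stories = {}    # slug -> story_id
        self.documents = {}  # (story_id, path) -> Document, mirrors the unique key

    def story_id(self, slug):
        # Create the story on first use, otherwise return the existing id.
        return self.stories.setdefault(slug, len(self.stories) + 1)

    def upsert_document(self, slug, path, version, chunks):
        # Re-indexing the same (story, path) replaces the version and its chunks.
        key = (self.story_id(slug), path)
        self.documents[key] = Document(key[0], path, version, list(chunks))
        return self.documents[key]
```

The same path indexed under two stories yields two independent documents, while re-indexing within one story overwrites the earlier version and its chunks.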
Notes
- The default embedding/LLM clients are stubs. Replace them in src/rag_agent/index/embeddings.py and src/rag_agent/agent/pipeline.py.
- This project requires Postgres with the pgvector extension.
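To illustrate what a stub embedding client might look like, here is a deterministic placeholder that returns a fixed-length, unit-norm vector. It is a sketch only: stub_embed is a hypothetical name, the actual interface expected by src/rag_agent/index/embeddings.py is not shown here, and the vectors carry no semantic meaning.

```python
import hashlib
import math


def stub_embed(text, dim=1536):
    """Deterministic placeholder embedding: hash-derived values, L2-normalised."""
    values = []
    counter = 0
    # Chain SHA-256 digests until there are enough bytes for `dim` components.
    while len(values) < dim:
        digest = hashlib.sha256(f"{counter}:{text}".encode("utf-8")).digest()
        values.extend(b / 255.0 - 0.5 for b in digest)
        counter += 1
    values = values[:dim]
    norm = math.sqrt(sum(v * v for v in values))
    return [v / norm for v in values]
```

A real replacement would call an embedding model instead, and its output length has to match RAG_EMBEDDINGS_DIM.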