codebase-chatbot
Codebase Q&A via Qwen RAG. Self-host so source never leaves your network. $50/dev/mo vs Cody's $19.
Launch kit
1-liner
Codebase Q&A via Qwen RAG. Self-host so source never leaves your network. $50/dev/mo vs Cody's $19.
Tweet hook
Cody / Sourcegraph / Cursor chat: cloud LLMs, so your code leaves your network.
Fintech / gov-contractor teams can't ship source to those. Built one on local Qwen. Source never leaves the network.
$50/dev/mo for the privacy-tier 🧵
- Reddit: r/programming, r/devops — "Self-host codebase chat — privacy tier"
Cold-email ICP
Engineering managers at fintech, gov-contractor, healthcare, and biotech companies. Privacy-sensitive orgs whose policies bar sending source code to cloud LLMs.
Cold-email template
Subject: codebase chat without sending source to OpenAI
Hi {first} — quick note for {fintech / gov / healthcare} eng teams.
Cursor / Cody / Sourcegraph all route code through cloud LLMs, so your
security team likely blocks them. Built a self-hosted alternative on Qwen 30B:
source never leaves the network. $50/dev/mo managed; $0 self-host.
Free 30-day pilot on your top repo. Reply for setup.
SEO content
- "Self-host codebase chat for compliance-bound teams"
- "Cursor / Sourcegraph / Cody privacy posture compared"
- "On-prem Qwen RAG: full setup guide"
Documentation
codebase-chatbot
Codebase-aware chatbot for engineering teams. Qwen RAG over a private repo, with keyword retrieval in v1 (semantic embeddings are on the roadmap). Self-hostable so source never leaves the org.
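Keyword retrieval here means ranking repo chunks by term overlap with the question, no model required. A minimal sketch of the idea — `keyword_score` and `retrieve` are hypothetical names for illustration, not codechat's actual internals:

```python
import re
from collections import Counter

def keyword_score(query: str, chunk: str) -> int:
    # Tokenize on alphanumeric runs so identifiers like verify_token
    # split into "verify" and "token", then count matching terms.
    terms = set(re.findall(r"[a-z0-9]+", query.lower()))
    words = Counter(re.findall(r"[a-z0-9]+", chunk.lower()))
    return sum(words[t] for t in terms)

def retrieve(query: str, chunks: list[str], k: int = 3) -> list[str]:
    # Rank chunks by keyword overlap with the query; keep the top k.
    ranked = sorted(chunks, key=lambda c: keyword_score(query, c), reverse=True)
    return ranked[:k]
```

A real index would also weight rare terms higher (BM25-style) and chunk files along function boundaries, but term overlap alone is enough to surface the right file for most "where is X handled" questions.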
Pricing
- $50/dev/mo (up to 1k repos indexed)
- $199/dev/mo for advanced features (semantic embeddings, custom prompts)
- $2,499/mo team-tier (50 devs)
- DIY $0 (self-host)
vs Cursor's chat-over-codebase ($20/mo bundled), Cody (Sourcegraph) ($9-19/mo), Tabnine Pro ($12/mo). We compete on:
- Self-host privacy — source never leaves the network
- Keyword + future semantic retrieval
- Repo-team focus (not individual-dev)
Run
cd C:\openclaw-products\codebase-chatbot
python -m venv .venv
.\.venv\Scripts\activate
pip install -e .
codechat build C:\path\to\repo --out repo.idx
codechat ask --index repo.idx "How does our auth middleware verify tokens?"
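The `ask` step is ordinary RAG: retrieve the best-matching chunks from the index, prepend them to the question, and send the combined prompt to the local Qwen model. A minimal sketch of the prompt assembly — `build_prompt` is a hypothetical helper, not codechat's actual API:

```python
def build_prompt(question: str, chunks: list[str]) -> str:
    # Prepend retrieved repo chunks as context before the question,
    # so the local model answers from the code rather than from memory.
    context = "\n\n".join(
        f"--- chunk {i + 1} ---\n{c}" for i, c in enumerate(chunks)
    )
    return (
        "Answer using only the repository context below.\n\n"
        f"{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

Because the whole pipeline is retrieval + a local model call, nothing in this flow needs a network connection once the model weights are downloaded.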
Roadmap
- Embedding-based retrieval (BGE-small / nomic-embed-text — both run locally)
- Per-PR diff-aware mode (answer "what does this PR change?")
- Slack bot delivery
- CodeOwners integration (route ownership questions correctly)
- Diagram generation (mermaid)
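Embedding-based retrieval replaces keyword overlap with cosine similarity between vectors from a local embedding model (e.g. BGE-small or nomic-embed-text). A minimal sketch of the ranking step, assuming vectors have already been computed — `cosine_top_k` is a hypothetical name for illustration:

```python
import numpy as np

def cosine_top_k(query_vec: np.ndarray, chunk_vecs: np.ndarray, k: int = 3) -> list[int]:
    # Normalize query and chunk vectors, then rank chunks by
    # cosine similarity; return indices of the k best matches.
    q = query_vec / np.linalg.norm(query_vec)
    m = chunk_vecs / np.linalg.norm(chunk_vecs, axis=1, keepdims=True)
    sims = m @ q
    return np.argsort(sims)[::-1][:k].tolist()
```

Unlike keyword matching, this catches paraphrases ("check credentials" finds `verify_token`), and both candidate models run on CPU, so the self-host privacy guarantee holds.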