To build an AI chatbot that actually answers from your content, use retrieval-augmented generation (RAG): index your website and docs into a vector database, retrieve the most relevant passages for each question, and have the LLM answer using only those passages, with citations. Grounding every answer in retrieved text is what keeps the bot from making things up. A custom RAG chatbot typically takes a few weeks to build and costs from the low four figures to the mid five figures depending on scope, plus modest monthly run costs.
A generic chatbot that improvises is a liability. A grounded one that answers from your real content — and says "I don't know" when it should — is an asset. Here's how to build the second kind.
Why a grounded (RAG) chatbot, not a generic one
Out of the box, an LLM only knows its training data. Ask it about your pricing, your policies, or yesterday's changelog and it will guess. RAG fixes this by retrieving your actual content at answer time, so the bot responds from facts you control and can cite. If you're weighing this against fine-tuning, we break down the trade-offs in RAG vs fine-tuning — for a knowledge chatbot, RAG is almost always the right call.
The architecture
A website RAG chatbot has five parts:
- Ingestion — crawl your site/docs, clean and chunk the text.
- Embedding + vector store — turn chunks into vectors and store them.
- Retrieval — for each question, fetch the most relevant chunks.
- Generation — the LLM answers using those chunks, with citations.
- Chat UI — a streaming widget on your site, with a human-handoff path.
The model call and retrieval run on your backend, never in the client. That keeps API keys out of the browser and lets you rate-limit usage, the same security and cost reasons that apply to any AI feature.
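To make the ingestion, vector store, and retrieval parts concrete, here is a minimal in-memory sketch. Everything in it is an illustrative assumption: `Chunk`, `embed`, `cosine`, and `VectorStore` are hypothetical names, the bag-of-words `embed` is a toy stand-in for a real embedding model, and the Python list stands in for pgvector, Pinecone, or a similar store.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    url: str   # source page, kept so answers can cite it
    text: str  # the passage itself


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorStore:
    """In-memory stand-in for a real vector database (assumption)."""

    def __init__(self) -> None:
        self._rows: list[tuple[Counter, Chunk]] = []

    def add(self, chunk: Chunk) -> None:
        # Embed once at ingestion time and keep the vector next to the chunk.
        self._rows.append((embed(chunk.text), chunk))

    def search(self, question: str, k: int = 2) -> list[tuple[float, Chunk]]:
        # Embed the question, score every stored chunk, return the top k.
        q = embed(question)
        scored = [(cosine(q, vec), chunk) for vec, chunk in self._rows]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


store = VectorStore()
store.add(Chunk("/pricing", "The Pro plan costs 49 dollars per month."))
store.add(Chunk("/refunds", "Refunds are available within 30 days of purchase."))

best_score, best_chunk = store.search("How much does the Pro plan cost?", k=1)[0]
print(best_chunk.url)  # prints "/pricing"
```

In a real build the scoring would happen inside the vector database rather than in a Python loop, but the shape of the flow stays the same: embed once at ingestion, embed the question at answer time, and return the closest chunks along with their source URLs.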
How to build it, step by step
- Gather your sources. Pages, help center, PDFs, product docs — whatever the bot should know.
- Chunk thoughtfully. Split content into passages that are big enough to be meaningful but small enough to retrieve precisely. Poor chunking is one of the most common causes of bad answers.
- Embed and index. Store vectors in a database such as pgvector (Postgres), Pinecone, or similar.
- Wire retrieval. On each message, embed the question, pull the top matches, and pass them to the model.
- Prompt for honesty. Instruct the model to answer only from the provided context and to say it doesn't know otherwise. This is the main defense against hallucination.
- Add citations and handoff. Show sources so users can verify, and route to a human or a form when the bot is unsure.
- Stream the reply. Send tokens to the widget as they are generated so the chat feels responsive.
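Three of the steps above (chunking, prompting for honesty, and handoff) can be sketched in a few lines. This is a rough illustration under stated assumptions, not a definitive implementation: `chunk_words`, `build_prompt`, and `needs_handoff` are hypothetical helpers, the window sizes and the `min_score` threshold of 0.25 are placeholder values to tune on your own content, and the prompt wording is only one way to phrase the rule.

```python
def chunk_words(text: str, size: int = 120, overlap: int = 20) -> list[str]:
    """Split text into overlapping word windows so context survives the cut."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


# Hypothetical instruction text; adjust the wording to your own bot.
SYSTEM_RULES = (
    "Answer using only the numbered sources below. "
    "Cite sources like [1]. "
    "If the sources do not contain the answer, reply exactly: I don't know."
)


def build_prompt(question: str, passages: list[tuple[str, str]]) -> str:
    """Build a grounded prompt from (url, text) pairs, best match first."""
    sources = "\n".join(
        f"[{i}] ({url}) {text}" for i, (url, text) in enumerate(passages, start=1)
    )
    return f"{SYSTEM_RULES}\n\nSources:\n{sources}\n\nQuestion: {question}"


def needs_handoff(top_score: float, answer: str, min_score: float = 0.25) -> bool:
    """Route to a human when retrieval was weak or the model declined to answer."""
    return top_score < min_score or "i don't know" in answer.lower()


sample = " ".join(str(n) for n in range(10))
print(chunk_words(sample, size=4, overlap=1))
# prints ['0 1 2 3', '3 4 5 6', '6 7 8 9']

prompt = build_prompt(
    "What is the refund window?",
    [("/refunds", "Refunds are available within 30 days of purchase.")],
)
```

The numbered sources do double duty: the model can cite them as `[1]`, and the UI can turn those markers into links back to the original pages.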
What it costs
| Cost component | Typical range |
|---|---|
| Build (custom RAG chatbot) | ~$3,000 – $40,000 depending on scope |
| Monthly model/API usage | Scales with traffic; often modest |
| Vector DB / hosting | Low monthly cost; no added hosting cost if pgvector runs on an existing Postgres instance |
| Maintenance & content refresh | Small ongoing effort |
A focused chatbot over a defined doc set can start as a Quick Dive, with scopes from $350; a broader, multi-source assistant with analytics and handoff is a larger build.
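The "scales with traffic" line in the table is simple arithmetic, and a rough estimator helps when budgeting. The sketch below is only a formula: `monthly_model_cost` is a hypothetical helper, and every number in the example call (traffic, token counts, per-million-token prices) is a placeholder, not a quote from any provider. Substitute your own traffic and your provider's current price list.

```python
def monthly_model_cost(
    conversations: int,
    messages_per_conversation: int,
    input_tokens_per_message: int,
    output_tokens_per_message: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Estimate monthly model spend from traffic and per-token prices."""
    messages = conversations * messages_per_conversation
    # Input tokens include the retrieved passages, so they usually dominate.
    input_cost = messages * input_tokens_per_message * input_price_per_million / 1_000_000
    output_cost = messages * output_tokens_per_message * output_price_per_million / 1_000_000
    return round(input_cost + output_cost, 2)


# Placeholder inputs for illustration only.
estimate = monthly_model_cost(
    conversations=1000,
    messages_per_conversation=4,
    input_tokens_per_message=2000,
    output_tokens_per_message=300,
    input_price_per_million=1.0,
    output_price_per_million=4.0,
)
print(estimate)  # 12.8 with these placeholder inputs
```

Because retrieved passages are sent with every message, the number and size of chunks you pass to the model is the main lever on this cost.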
Keep answers accurate over time
A chatbot is only as current as its index. Re-ingest content on a schedule (or on publish), monitor real conversations for wrong or low-confidence answers, and feed those gaps back into your content. The maintenance is light, but skipping it lets the bot drift out of date.
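One common way to keep re-ingestion cheap is to store a content hash per page and only re-embed what changed. The sketch below assumes you keep a `url -> hash` mapping alongside the index; `content_hash` and `plan_reindex` are hypothetical names, and the crawl that produces the page text is left out.

```python
import hashlib


def content_hash(text: str) -> str:
    """Stable fingerprint of a page's cleaned text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_reindex(indexed: dict[str, str], current_pages: dict[str, str]) -> dict[str, list[str]]:
    """Compare stored hashes (url -> hash) with freshly crawled pages (url -> text)."""
    current = {url: content_hash(text) for url, text in current_pages.items()}
    return {
        # New pages: embed and insert.
        "add": sorted(url for url in current if url not in indexed),
        # Changed pages: delete old chunks, then re-chunk and re-embed.
        "update": sorted(
            url for url in current if url in indexed and indexed[url] != current[url]
        ),
        # Pages that disappeared: remove their chunks so the bot stops citing them.
        "delete": sorted(url for url in indexed if url not in current),
    }


indexed = {
    "/pricing": content_hash("Pro costs 49 dollars."),
    "/refunds": content_hash("Refunds within 30 days."),
    "/old-promo": content_hash("Spring sale."),
}
crawled = {
    "/pricing": "Pro costs 59 dollars.",       # changed
    "/refunds": "Refunds within 30 days.",     # unchanged
    "/changelog": "Added streaming replies.",  # new
}
plan = plan_reindex(indexed, crawled)
print(plan)
# prints {'add': ['/changelog'], 'update': ['/pricing'], 'delete': ['/old-promo']}
```

Running this on a schedule or from a publish hook keeps embedding costs proportional to what actually changed, and the delete list is what stops the bot from citing pages that no longer exist.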
Frequently asked questions
How do I make an AI chatbot answer from my own website content?
Use RAG: index your content into a vector database, retrieve the relevant passages for each question, and instruct the model to answer only from those passages with citations.
How do I stop the chatbot from hallucinating?
Ground it in retrieved content and prompt it to answer only from that context, and to say "I don't know" otherwise. Showing citations lets users verify, and a human-handoff path covers the gaps.
How much does a custom AI chatbot cost in 2026?
A focused RAG chatbot typically costs from a few thousand dollars to the mid five figures to build depending on scope, plus modest monthly usage and hosting. A narrow scope can start much smaller.
Which is better: a no-code chatbot or a custom one?
No-code tools are fast for simple FAQ bots. A custom RAG chatbot wins when you need accurate answers over your real content, citations, deeper integrations, or control over data and behavior.
Want a chatbot that actually knows your business? Our AI team builds grounded, cited assistants. Book a free call.