
RAG

Retrieval-Augmented Generation (also known as knowledge-augmented generation or context-injection generation)

Definition

Retrieval-Augmented Generation (RAG) is a technique that enhances LLM responses by first retrieving relevant information from external sources, then including that information in the prompt for the model to use when generating its response.

RAG solves key LLM limitations: knowledge cutoff dates, inability to access private data, and hallucination of facts. It grounds responses in actual documents.

Key Concepts

  • Retrieval: Finding relevant documents from a knowledge base
  • Augmentation: Adding retrieved context to the prompt
  • Generation: LLM produces response using the context
  • Grounding: Responses are anchored in real documents
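
The four concepts above can be sketched end to end. The following is a minimal, illustrative retriever (hypothetical document names and bag-of-words cosine scoring; a production system would use embedding-based vector search instead):

```python
import math
import re
from collections import Counter

def tokenize(text: str) -> Counter:
    """Bag-of-words vector: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, docs: dict[str, str], k: int = 2) -> list[tuple[str, float]]:
    """Retrieval step: return the top-k (doc_name, score) pairs for a query."""
    qv = tokenize(query)
    scored = [(name, cosine(qv, tokenize(text))) for name, text in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Hypothetical knowledge base for illustration.
docs = {
    "refund-policy.pdf": "The refund policy: refunds are issued within 30 days of purchase, with a receipt.",
    "shipping-info.md": "Orders ship within 2 business days via standard carriers.",
}

print(retrieve("What is the refund policy?", docs))
```

The retrieved chunks would then be injected into the prompt (augmentation) and passed to the LLM (generation).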

Examples

RAG Flow
How RAG Works Step by Step
1. USER QUERY
   "What's our company's refund policy?"

2. RETRIEVAL
   → Search knowledge base
   → Find: "refund-policy.pdf" (relevance: 0.94)
   → Find: "customer-faq.md" (relevance: 0.87)

3. AUGMENTATION
   Construct prompt:
   """
   Context:
   [Contents of refund-policy.pdf]
   [Relevant section from customer-faq.md]

   Question: What's our company's refund policy?
   Answer based on the context above:
   """

4. GENERATION
   LLM generates response using retrieved context
RAG ensures answers come from your actual documentation, not the model's training data.
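
The augmentation step above amounts to string assembly. A minimal sketch (the template wording and the citation instruction are illustrative choices, not a fixed standard):

```python
def build_prompt(question: str, retrieved: list[tuple[str, str]]) -> str:
    """Assemble an augmented prompt from retrieved (source_name, text) chunks."""
    context = "\n\n".join(f"[Source: {name}]\n{text}" for name, text in retrieved)
    return (
        "Context:\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer based on the context above, citing sources:\n"
    )

prompt = build_prompt(
    "What's our company's refund policy?",
    [("refund-policy.pdf", "Refunds are issued within 30 days of purchase.")],
)
print(prompt)
```

Keeping source names attached to each chunk is what makes citation possible in the generated answer.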
RAG vs Fine-Tuning
When to Use Each Approach
Use RAG when:
  ✓ Knowledge changes frequently
  ✓ You need to cite sources
  ✓ You have lots of documents
  ✓ You want to avoid retraining

Use Fine-Tuning when:
  ✓ You need specific behavior/style
  ✓ Knowledge is stable
  ✓ You need faster inference
  ✓ RAG context would be too large
RAG is often preferred for knowledge-heavy applications because it is more flexible: updating the document store updates the system's knowledge, with no retraining required.

Interactive Exercise

Design a RAG System

You're building a RAG-powered assistant for a hospital. Consider:

1. What types of documents would you include in the knowledge base?

2. How would you handle sensitive patient information?

3. What happens if no relevant documents are found?
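
For question 3, one common pattern is a relevance threshold with an explicit fallback, so the model refuses rather than guesses when retrieval comes up empty. A sketch (the threshold value and messages are hypothetical and must be tuned to your retriever's score scale):

```python
MIN_RELEVANCE = 0.5  # hypothetical cutoff; calibrate against real retrieval scores

def answer_or_fallback(hits: list[tuple[str, float]]) -> str:
    """Refuse to answer when no retrieved document clears the relevance bar."""
    confident = [(doc, score) for doc, score in hits if score >= MIN_RELEVANCE]
    if not confident:
        return "I couldn't find relevant documentation for this question. Please contact staff."
    sources = ", ".join(doc for doc, _ in confident)
    return f"Answering from: {sources}"

print(answer_or_fallback([("intake-form.pdf", 0.31)]))   # weak retrieval: fall back
print(answer_or_fallback([("visiting-hours.md", 0.92)])) # strong retrieval: proceed
```

This matters doubly in a hospital setting, where an ungrounded guess is worse than no answer.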

Pro Tips
  • Quality of retrieval directly impacts answer quality
  • Chunk documents appropriately for your use case
  • Always include source citations in responses
  • Monitor retrieval quality and improve over time
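
On the chunking tip: a simple approach is fixed-size chunks with overlap, so sentences that straddle a boundary appear in both neighboring chunks. A word-based sketch (chunk and overlap sizes are illustrative; many systems chunk by tokens, sentences, or document sections instead):

```python
def chunk_text(text: str, chunk_size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into word-based chunks of `chunk_size` words,
    each sharing `overlap` words with the previous chunk."""
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # last chunk already covers the tail
    return chunks

# 120 words with chunk_size=50 and overlap=10 yields 3 chunks.
chunks = chunk_text(" ".join(str(i) for i in range(120)))
print(len(chunks))
```

Chunks that are too small lose context; chunks that are too large dilute relevance scores and waste prompt budget, so the right size depends on your documents and queries.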
