
RAG

Retrieval-Augmented Generation (also known as knowledge-augmented generation or context-injection generation)

Definition

Retrieval-Augmented Generation (RAG) is a technique that enhances LLM responses by first retrieving relevant information from external sources, then including that information in the prompt for the model to use when generating its response.

RAG solves key LLM limitations: knowledge cutoff dates, inability to access private data, and hallucination of facts. It grounds responses in actual documents.

Key Concepts

  • Retrieval: Finding relevant documents from a knowledge base
  • Augmentation: Adding retrieved context to the prompt
  • Generation: LLM produces response using the context
  • Grounding: Responses are anchored in real documents
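
The four concepts above can be sketched end to end. The following is a minimal, illustrative retriever (hypothetical document names and bag-of-words cosine scoring; a production system would use embedding-based vector search instead):

```python
import math
import re
from collections import Counter

def tokenize(text: str) -> Counter:
    """Bag-of-words vector: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, docs: dict[str, str], k: int = 2) -> list[tuple[str, float]]:
    """Retrieval step: return the top-k (doc_name, score) pairs for a query."""
    qv = tokenize(query)
    scored = [(name, cosine(qv, tokenize(text))) for name, text in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Hypothetical knowledge base for illustration.
docs = {
    "refund-policy.pdf": "The refund policy: refunds are issued within 30 days of purchase, with a receipt.",
    "shipping-info.md": "Orders ship within 2 business days via standard carriers.",
}

print(retrieve("What is the refund policy?", docs))
```

The retrieved chunks would then be injected into the prompt (augmentation) and passed to the LLM (generation).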

Examples

RAG Flow
How RAG Works Step by Step
1. USER QUERY
   "What's our company's refund policy?"

2. RETRIEVAL
   → Search knowledge base
   → Find: "refund-policy.pdf" (relevance: 0.94)
   → Find: "customer-faq.md" (relevance: 0.87)

3. AUGMENTATION
   Construct prompt:
   """
   Context:
   [Contents of refund-policy.pdf]
   [Relevant section from customer-faq.md]

   Question: What's our company's refund policy?
   Answer based on the context above:
   """

4. GENERATION
   LLM generates response using retrieved context
RAG ensures answers come from your actual documentation, not the model's training data.
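
The augmentation step above amounts to string assembly. A minimal sketch (the template wording and the citation instruction are illustrative choices, not a fixed standard):

```python
def build_prompt(question: str, retrieved: list[tuple[str, str]]) -> str:
    """Assemble an augmented prompt from retrieved (source_name, text) chunks."""
    context = "\n\n".join(f"[Source: {name}]\n{text}" for name, text in retrieved)
    return (
        "Context:\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer based on the context above, citing sources:\n"
    )

prompt = build_prompt(
    "What's our company's refund policy?",
    [("refund-policy.pdf", "Refunds are issued within 30 days of purchase.")],
)
print(prompt)
```

Keeping source names attached to each chunk is what makes citation possible in the generated answer.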
RAG vs Fine-Tuning
When to Use Each Approach
Use RAG when:
  ✓ Knowledge changes frequently
  ✓ You need to cite sources
  ✓ You have lots of documents
  ✓ You want to avoid retraining

Use Fine-Tuning when:
  ✓ You need specific behavior/style
  ✓ Knowledge is stable
  ✓ You need faster inference
  ✓ RAG context would be too large
RAG is often preferred for knowledge-heavy applications because it is more flexible: updating the document store updates the system's knowledge, with no retraining required.

Interactive Exercise

Design a RAG System

You're building a RAG-powered assistant for a hospital. Consider:

1. What types of documents would you include in the knowledge base?

2. How would you handle sensitive patient information?

3. What happens if no relevant documents are found?
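
For question 3, one common pattern is a relevance threshold with an explicit fallback, so the model refuses rather than guesses when retrieval comes up empty. A sketch (the threshold value and messages are hypothetical and must be tuned to your retriever's score scale):

```python
MIN_RELEVANCE = 0.5  # hypothetical cutoff; calibrate against real retrieval scores

def answer_or_fallback(hits: list[tuple[str, float]]) -> str:
    """Refuse to answer when no retrieved document clears the relevance bar."""
    confident = [(doc, score) for doc, score in hits if score >= MIN_RELEVANCE]
    if not confident:
        return "I couldn't find relevant documentation for this question. Please contact staff."
    sources = ", ".join(doc for doc, _ in confident)
    return f"Answering from: {sources}"

print(answer_or_fallback([("intake-form.pdf", 0.31)]))   # weak retrieval: fall back
print(answer_or_fallback([("visiting-hours.md", 0.92)])) # strong retrieval: proceed
```

This matters doubly in a hospital setting, where an ungrounded guess is worse than no answer.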

Pro Tips
  • Quality of retrieval directly impacts answer quality
  • Chunk documents appropriately for your use case
  • Always include source citations in responses
  • Monitor retrieval quality and improve over time
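
On the chunking tip: a simple approach is fixed-size chunks with overlap, so sentences that straddle a boundary appear in both neighboring chunks. A word-based sketch (chunk and overlap sizes are illustrative; many systems chunk by tokens, sentences, or document sections instead):

```python
def chunk_text(text: str, chunk_size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into word-based chunks of `chunk_size` words,
    each sharing `overlap` words with the previous chunk."""
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # last chunk already covers the tail
    return chunks

# 120 words with chunk_size=50 and overlap=10 yields 3 chunks.
chunks = chunk_text(" ".join(str(i) for i in range(120)))
print(len(chunks))
```

Chunks that are too small lose context; chunks that are too large dilute relevance scores and waste prompt budget, so the right size depends on your documents and queries.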
