Advanced Reasoning / Self-Improvement

Chain-of-Verification

Advanced [4/5]
CoVe · Verification questioning · Self-fact-checking

Definition

Chain-of-Verification (CoVe) is a technique where the model generates verification questions about its own response, answers those questions independently, and uses the results to revise the original response. This systematic self-checking reduces hallucinations and factual errors.

Developed by Meta AI (Dhuliawala et al., 2023), CoVe shows significant improvements in factual accuracy across a range of tasks.

Key Concepts

  • Baseline response: the model's initial answer to the question
  • Verification questions: targeted questions that fact-check specific claims in the baseline
  • Independent verification: each question is answered without showing the model its original response
  • Revised response: an updated answer that incorporates the verified facts
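The four stages above can be sketched end-to-end in a few lines. The example below is a minimal, runnable illustration only: `mock_llm` is a stand-in with canned answers (a demonstration assumption, not a real model API), and the prompts are reduced to bare strings.

```python
# Minimal sketch of the four CoVe stages with a mock model.
# `mock_llm` and its canned answers are illustrative assumptions.

def mock_llm(prompt: str) -> str:
    """Stand-in for a real model call; routes on the prompt text."""
    if "Generate verification questions" in prompt:
        return "Was John Adams born in Boston?"
    if "Was John Adams born in Boston?" in prompt:
        return "No, he was born in Braintree, Massachusetts."
    if "Revise:" in prompt:
        return "Politicians born in Boston include Samuel Adams."
    return "Politicians born in Boston include John Adams and Samuel Adams."

def chain_of_verification(question: str) -> str:
    # Stage 1: baseline answer
    baseline = mock_llm(f"Question: {question}\nAnswer:")
    # Stage 2: verification questions about the baseline
    vq = mock_llm(f"Response: {baseline}\nGenerate verification questions:")
    # Stage 3: answer each question WITHOUT showing the baseline
    findings = [mock_llm(q) for q in vq.splitlines()]
    # Stage 4: revise the baseline using the verification findings
    return mock_llm(f"Revise: {baseline}\nFindings: {findings}")

print(chain_of_verification("Name some politicians born in Boston"))
```

Swapping `mock_llm` for a real model call preserves the structure; the key design point is that Stage 3 sees only the verification question, never the baseline response.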

Examples

Process
Chain-of-Verification Steps
CHAIN-OF-VERIFICATION PROCESS:

QUESTION: "Name some politicians born in Boston"

STEP 1 - BASELINE RESPONSE:
"Politicians born in Boston include John F. Kennedy, John Adams,
Samuel Adams, and Paul Revere."

STEP 2 - GENERATE VERIFICATION QUESTIONS:
1. "Was John F. Kennedy born in Boston?"
2. "Was John Adams born in Boston?"
3. "Was Samuel Adams born in Boston?"
4. "Was Paul Revere born in Boston?"
5. "Was Paul Revere a politician?"

STEP 3 - INDEPENDENT VERIFICATION:
(Answer each without seeing the original response)
1. JFK born in Boston?          → NO (Brookline, MA) ✗
2. John Adams born in Boston?   → NO (Braintree, MA) ✗
3. Samuel Adams born in Boston? → YES ✓
4. Paul Revere born in Boston?  → YES ✓
5. Paul Revere a politician?    → NO (silversmith/patriot) ✗

STEP 4 - REVISED RESPONSE:
"Politicians born in Boston include Samuel Adams. (John F. Kennedy
was born in Brookline and John Adams in Braintree; Paul Revere,
while born in Boston, was a silversmith and patriot rather than
a politician.)"

HALLUCINATION REDUCTION:
Before CoVe: 3/4 claims had issues (75% error)
After CoVe:  0/1 claims have issues (0% error)
Implementation
CoVe Prompt Pattern
CHAIN-OF-VERIFICATION IMPLEMENTATION:

# Stage 1: Generate baseline
baseline_prompt = """
Question: {question}
Answer:"""

# Stage 2: Generate verification questions
verification_prompt = """
Your response: {baseline}

Generate a list of specific, factual questions that would verify
the accuracy of your response. Focus on:
- Dates, numbers, names that could be wrong
- Claims that need fact-checking
- Relationships or categorizations

Verification questions:
1."""

# Stage 3: Answer verifications independently
verify_prompt = """
Answer this factual question with a brief response:
{verification_question}
Answer:"""

# Stage 4: Revise based on verification
revise_prompt = """
Original question: {question}
Your initial response: {baseline}
Verification results:
{verification_results}

Based on the verification, provide a revised response that
corrects any errors found:"""

# Pipeline (assumes `llm` wraps a model call and `parse_questions`
# extracts the numbered questions from the model's output)
def chain_of_verification(question):
    # Step 1: baseline response
    baseline = llm(baseline_prompt.format(question=question))

    # Step 2: verification questions
    verifications = llm(verification_prompt.format(baseline=baseline))
    questions = parse_questions(verifications)

    # Step 3: independent verification (key: no context from baseline)
    results = []
    for q in questions:
        answer = llm(verify_prompt.format(verification_question=q))
        results.append(f"Q: {q}\nA: {answer}")

    # Step 4: revise using the verification results
    revised = llm(revise_prompt.format(
        question=question,
        baseline=baseline,
        verification_results="\n".join(results),
    ))
    return revised

PERFORMANCE (from the Meta AI paper):
┌─────────────────────┬──────────┬─────────────┐
│ Task                │ Baseline │ + CoVe      │
├─────────────────────┼──────────┼─────────────┤
│ Wiki bio facts      │ 55%      │ 77% (+22%)  │
│ Question answering  │ 62%      │ 81% (+19%)  │
│ List generation     │ 48%      │ 71% (+23%)  │
└─────────────────────┴──────────┴─────────────┘

Interactive Exercise

Create Verification Questions

Response to verify: "The Mona Lisa was painted by Leonardo da Vinci in 1503 and is currently displayed at the British Museum in London."

Generate verification questions for each factual claim.

Pro Tips
  • Keep verification independent - never show the model its original response when it answers verification questions
  • Generate specific, answerable questions, not vague ones
  • Focus on concrete facts: dates, locations, names, numbers
  • Works best for factual, knowledge-based responses

Related Terms