
Self-Calibration

Advanced [4/5]
Confidence calibration · Uncertainty estimation · Know-what-you-know

Definition

Self-calibration is a technique where LLMs assess their own confidence in their answers, aiming to align stated confidence with actual accuracy. A well-calibrated model should be correct 80% of the time when it says it's "80% confident."

This is crucial for building reliable AI systems that know when to seek human input or additional verification.
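To make this concrete, here is a minimal sketch (in Python, using made-up evaluation records) of how calibration is measured in practice: group answers by their stated confidence and compare each group's observed accuracy against that confidence.

from collections import defaultdict

# Hypothetical evaluation records: (stated confidence, was the answer correct?).
# In practice these come from running the model on a labeled eval set.
records = [
    (0.8, True), (0.8, True), (0.8, True), (0.8, True), (0.8, False),
    (0.6, True), (0.6, False), (0.6, False), (0.6, False), (0.6, True),
]

# Group answers by stated confidence and compare to observed accuracy.
buckets = defaultdict(list)
for confidence, correct in records:
    buckets[confidence].append(correct)

for confidence in sorted(buckets):
    outcomes = buckets[confidence]
    accuracy = sum(outcomes) / len(outcomes)
    gap = confidence - accuracy  # positive gap means overconfidence
    label = ("overconfident" if gap > 0.05
             else "underconfident" if gap < -0.05
             else "well calibrated")
    print(f"stated {confidence:.0%} -> actual {accuracy:.0%} ({label})")

With this toy data, the 80% bucket is well calibrated (4 of 5 correct) while the 60% bucket is overconfident (2 of 5 correct).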

Key Concepts

  • Calibration: Alignment between confidence and accuracy
  • Overconfidence: Stating higher confidence than warranted
  • Underconfidence: Stating lower confidence than warranted
  • Epistemic uncertainty: Uncertainty from lack of knowledge

Examples

Problem
The Calibration Challenge
CALIBRATION PROBLEM:

UNCALIBRATED LLM:

Q: "What year was the Eiffel Tower built?"
A: "1889" (Confidence: 95%)  ✓ Correct

Q: "What year was the Burj Khalifa completed?"
A: "2010" (Confidence: 95%)  ✓ Correct

Q: "What year was Building X completed?"
A: "1987" (Confidence: 95%)  ✗ Wrong (it was 1992)

All answers have the same confidence, but accuracy varies!

IDEAL CALIBRATION:

┌─────────────────────┬──────────┬──────────┐
│ Stated Confidence   │ Expected │ Actual   │
│                     │ Accuracy │ Accuracy │
├─────────────────────┼──────────┼──────────┤
│ 50%                 │ 50%      │ 48%      │ ✓ Good
│ 70%                 │ 70%      │ 72%      │ ✓ Good
│ 90%                 │ 90%      │ 91%      │ ✓ Good
│ 99%                 │ 99%      │ 85%      │ ✗ Overconfident!
└─────────────────────┴──────────┴──────────┘

TYPICAL LLM PROBLEM:

LLMs are often overconfident, especially on:
- Rare or obscure facts
- Recent events (after training cutoff)
- Questions requiring precise numbers
- Multi-step reasoning

CALIBRATION CURVE:

Accuracy
100% │                 ╱   Perfect calibration
     │               ╱
 80% │             ╱     ← Actual LLM
     │           ╱         (overconfident)
 60% │         ╱
     │       ╱
 40% └─────────────────────
      40%   60%   80%  100%
         Stated Confidence
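The miscalibration in the table above can be summarized as a single number. The sketch below computes the Expected Calibration Error (ECE) over the table's four bins, assuming for simplicity that each bin holds the same number of answers; real evaluations weight each bin by its sample count.

# Bins from the IDEAL CALIBRATION table: (stated confidence, actual accuracy).
bins = [(0.50, 0.48), (0.70, 0.72), (0.90, 0.91), (0.99, 0.85)]

# ECE: the average |confidence - accuracy| gap across bins.
# Equal bin sizes are assumed here to keep the example simple.
ece = sum(abs(conf - acc) for conf, acc in bins) / len(bins)
print(f"ECE = {ece:.4f}")  # (0.02 + 0.02 + 0.01 + 0.14) / 4 = 0.0475

# The 99% bin dominates the error: that is the overconfident region.
worst = max(bins, key=lambda b: abs(b[0] - b[1]))
print(f"worst bin: stated {worst[0]:.0%}, actual {worst[1]:.0%}")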
Technique
Self-Calibration Prompting
SELF-CALIBRATION PROMPT:

Question: {question}

Please answer and assess your confidence:
1. Provide your answer
2. Rate your confidence (0-100%)
3. Explain what could make you wrong
4. Identify what would increase your confidence

Format:
Answer: [your answer]
Confidence: [X]%
Uncertainty factors:
- [what could be wrong]
- [knowledge gaps]
Would need to verify: [what to check]

---

EXAMPLE:

Q: "What is the population of Tokyo?"

Answer: Approximately 14 million (city proper) or 37 million (greater metro area)
Confidence: 75%
Uncertainty factors:
- Population changes yearly (my data may be outdated)
- Different definitions (city vs metro vs prefecture)
- Last training data from 2024
Would need to verify:
- Current census data
- Which definition is being asked about

---

VERBALIZED CONFIDENCE IMPROVES CALIBRATION:

Without verbalization:
"The answer is X."
→ Often overconfident

With verbalization:
"The answer is X because [reasoning]. However, I'm uncertain because [factors]. Confidence: 70%"
→ Better calibrated!

CALIBRATION TECHNIQUES:
1. Ask for confidence explicitly
2. Request uncertainty factors
3. Use "what would change your mind?"
4. Sample multiple times, check agreement
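A sketch of how this prompt might be driven programmatically: build the prompt from a template like the one above, then parse the structured reply. The template, parser, and regular expressions below are illustrative assumptions; they presume the model followed the requested "Answer / Confidence / Uncertainty factors" format.

import re

# Illustrative template; swap in your own LLM client to get real replies.
SELF_CALIBRATION_TEMPLATE = """Question: {question}

Please answer and assess your confidence:
1. Provide your answer
2. Rate your confidence (0-100%)
3. Explain what could make you wrong

Format:
Answer: [your answer]
Confidence: [X]%
Uncertainty factors:
- [what could be wrong]"""

def parse_calibrated_response(text: str) -> dict:
    """Extract the answer, verbalized confidence, and uncertainty factors."""
    answer = re.search(r"Answer:\s*(.+)", text)
    confidence = re.search(r"Confidence:\s*(\d+)\s*%", text)
    factors = re.findall(r"^-\s*(.+)$", text, flags=re.MULTILINE)
    return {
        "answer": answer.group(1).strip() if answer else None,
        "confidence": int(confidence.group(1)) / 100 if confidence else None,
        "uncertainty_factors": factors,
    }

prompt = SELF_CALIBRATION_TEMPLATE.format(question="What is the population of Tokyo?")

# A canned reply, standing in for an actual model call.
reply = """Answer: Approximately 14 million (city proper)
Confidence: 75%
Uncertainty factors:
- Population changes yearly
- Different definitions (city vs metro)"""
print(parse_calibrated_response(reply))

The parsed confidence can then gate downstream behavior, e.g. routing answers below some threshold to external verification, as the Pro Tips below suggest.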

Interactive Exercise

Calibrate Your Answer

Question: "How many moons does Saturn have?"

Provide an answer with calibrated confidence. What factors affect your certainty?

Pro Tips
  • Explicitly asking for confidence scores improves calibration
  • Having the model explain its uncertainty factors helps identify weak points
  • Use calibrated confidence to decide when to seek external verification
  • Aggregate confidence across multiple samples for better estimates (see the sketch after this list)
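For the last tip, a minimal aggregation sketch: sample the same question several times, measure how often the majority answer appears, and blend that agreement rate with the verbalized confidences. The sample data and the simple 50/50 blend are assumptions for illustration, not a prescribed formula.

from collections import Counter

# Hypothetical (answer, verbalized confidence) pairs from 5 independent
# samples of the same question at temperature > 0; replace with real calls.
samples = [
    ("1992", 0.9), ("1992", 0.8), ("1992", 0.85), ("1987", 0.95), ("1992", 0.7),
]

# Agreement rate: how often the majority answer appears across samples.
counts = Counter(answer for answer, _ in samples)
majority, votes = counts.most_common(1)[0]
agreement = votes / len(samples)

# Blend verbalized confidence (for the majority answer) with agreement.
verbalized = [conf for ans, conf in samples if ans == majority]
mean_verbalized = sum(verbalized) / len(verbalized)
combined = (agreement + mean_verbalized) / 2  # simple average; tune as needed

print(f"majority answer: {majority}")
print(f"agreement {agreement:.0%}, verbalized {mean_verbalized:.0%}, combined {combined:.0%}")

Note how the one dissenting sample had the highest verbalized confidence (95%); checking agreement across samples catches exactly this kind of confidently wrong answer.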

Related Terms