
Self-Calibration

Advanced [4/5]
Confidence calibration · Uncertainty estimation · Know-what-you-know

Definition

Self-calibration is a technique where LLMs assess their own confidence in their answers, aiming to align stated confidence with actual accuracy. A well-calibrated model should be correct 80% of the time when it says it's "80% confident."

This is crucial for building reliable AI systems that know when to seek human input or additional verification.
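To make this concrete, here is a minimal sketch (in Python, using made-up evaluation records) of how calibration is measured in practice: group answers by their stated confidence and compare each group's observed accuracy against that confidence.

from collections import defaultdict

# Hypothetical evaluation records: (stated confidence, was the answer correct?).
# In practice these come from running the model on a labeled eval set.
records = [
    (0.8, True), (0.8, True), (0.8, True), (0.8, True), (0.8, False),
    (0.6, True), (0.6, False), (0.6, False), (0.6, False), (0.6, True),
]

# Group answers by stated confidence and compare to observed accuracy.
buckets = defaultdict(list)
for confidence, correct in records:
    buckets[confidence].append(correct)

for confidence in sorted(buckets):
    outcomes = buckets[confidence]
    accuracy = sum(outcomes) / len(outcomes)
    gap = confidence - accuracy  # positive gap means overconfidence
    label = ("overconfident" if gap > 0.05
             else "underconfident" if gap < -0.05
             else "well calibrated")
    print(f"stated {confidence:.0%} -> actual {accuracy:.0%} ({label})")

With this toy data, the 80% bucket is well calibrated (4 of 5 correct) while the 60% bucket is overconfident (2 of 5 correct).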

Key Concepts

  • Calibration: Alignment between confidence and accuracy
  • Overconfidence: Stating higher confidence than warranted
  • Underconfidence: Stating lower confidence than warranted
  • Epistemic uncertainty: Uncertainty from lack of knowledge

Examples

Problem
The Calibration Challenge
CALIBRATION PROBLEM:

UNCALIBRATED LLM:

Q: "What year was the Eiffel Tower built?"
A: "1889" (Confidence: 95%)  ✓ Correct

Q: "What year was the Burj Khalifa completed?"
A: "2010" (Confidence: 95%)  ✓ Correct

Q: "What year was Building X completed?"
A: "1987" (Confidence: 95%)  ✗ Wrong (it was 1992)

All answers have the same confidence, but accuracy varies!

IDEAL CALIBRATION:

┌─────────────────────┬──────────┬──────────┐
│ Stated Confidence   │ Expected │ Actual   │
│                     │ Accuracy │ Accuracy │
├─────────────────────┼──────────┼──────────┤
│ 50%                 │ 50%      │ 48%      │ ✓ Good
│ 70%                 │ 70%      │ 72%      │ ✓ Good
│ 90%                 │ 90%      │ 91%      │ ✓ Good
│ 99%                 │ 99%      │ 85%      │ ✗ Overconfident!
└─────────────────────┴──────────┴──────────┘

TYPICAL LLM PROBLEM:

LLMs are often overconfident, especially on:
- Rare or obscure facts
- Recent events (after training cutoff)
- Questions requiring precise numbers
- Multi-step reasoning

CALIBRATION CURVE:

Accuracy
100% │                 ╱   Perfect calibration
     │               ╱
 80% │             ╱     ← Actual LLM
     │           ╱         (overconfident)
 60% │         ╱
     │       ╱
 40% └─────────────────────
      40%   60%   80%  100%
         Stated Confidence
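The miscalibration in the table above can be summarized as a single number. The sketch below computes the Expected Calibration Error (ECE) over the table's four bins, assuming for simplicity that each bin holds the same number of answers; real evaluations weight each bin by its sample count.

# Bins from the IDEAL CALIBRATION table: (stated confidence, actual accuracy).
bins = [(0.50, 0.48), (0.70, 0.72), (0.90, 0.91), (0.99, 0.85)]

# ECE: the average |confidence - accuracy| gap across bins.
# Equal bin sizes are assumed here to keep the example simple.
ece = sum(abs(conf - acc) for conf, acc in bins) / len(bins)
print(f"ECE = {ece:.4f}")  # (0.02 + 0.02 + 0.01 + 0.14) / 4 = 0.0475

# The 99% bin dominates the error: that is the overconfident region.
worst = max(bins, key=lambda b: abs(b[0] - b[1]))
print(f"worst bin: stated {worst[0]:.0%}, actual {worst[1]:.0%}")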
Technique
Self-Calibration Prompting
SELF-CALIBRATION PROMPT:

Question: {question}

Please answer and assess your confidence:
1. Provide your answer
2. Rate your confidence (0-100%)
3. Explain what could make you wrong
4. Identify what would increase your confidence

Format:
Answer: [your answer]
Confidence: [X]%
Uncertainty factors:
- [what could be wrong]
- [knowledge gaps]
Would need to verify: [what to check]

---

EXAMPLE:

Q: "What is the population of Tokyo?"

Answer: Approximately 14 million (city proper) or 37 million (greater metro area)
Confidence: 75%
Uncertainty factors:
- Population changes yearly (my data may be outdated)
- Different definitions (city vs metro vs prefecture)
- Last training data from 2024
Would need to verify:
- Current census data
- Which definition is being asked about

---

VERBALIZED CONFIDENCE IMPROVES CALIBRATION:

Without verbalization:
"The answer is X."
→ Often overconfident

With verbalization:
"The answer is X because [reasoning]. However, I'm uncertain because [factors]. Confidence: 70%"
→ Better calibrated!

CALIBRATION TECHNIQUES:
1. Ask for confidence explicitly
2. Request uncertainty factors
3. Use "what would change your mind?"
4. Sample multiple times, check agreement
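A sketch of how this prompt might be driven programmatically: build the prompt from a template like the one above, then parse the structured reply. The template, parser, and regular expressions below are illustrative assumptions; they presume the model followed the requested "Answer / Confidence / Uncertainty factors" format.

import re

# Illustrative template; swap in your own LLM client to get real replies.
SELF_CALIBRATION_TEMPLATE = """Question: {question}

Please answer and assess your confidence:
1. Provide your answer
2. Rate your confidence (0-100%)
3. Explain what could make you wrong

Format:
Answer: [your answer]
Confidence: [X]%
Uncertainty factors:
- [what could be wrong]"""

def parse_calibrated_response(text: str) -> dict:
    """Extract the answer, verbalized confidence, and uncertainty factors."""
    answer = re.search(r"Answer:\s*(.+)", text)
    confidence = re.search(r"Confidence:\s*(\d+)\s*%", text)
    factors = re.findall(r"^-\s*(.+)$", text, flags=re.MULTILINE)
    return {
        "answer": answer.group(1).strip() if answer else None,
        "confidence": int(confidence.group(1)) / 100 if confidence else None,
        "uncertainty_factors": factors,
    }

prompt = SELF_CALIBRATION_TEMPLATE.format(question="What is the population of Tokyo?")

# A canned reply, standing in for an actual model call.
reply = """Answer: Approximately 14 million (city proper)
Confidence: 75%
Uncertainty factors:
- Population changes yearly
- Different definitions (city vs metro)"""
print(parse_calibrated_response(reply))

The parsed confidence can then gate downstream behavior, e.g. routing answers below some threshold to external verification, as the Pro Tips below suggest.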

Interactive Exercise

Calibrate Your Answer

Question: "How many moons does Saturn have?"

Provide an answer with calibrated confidence. What factors affect your certainty?

Pro Tips
  • Explicitly asking for confidence scores improves calibration
  • Having the model explain its uncertainty factors helps identify weak points
  • Use calibrated confidence to decide when to seek external verification
  • Aggregate confidence across multiple samples for better estimates (see the sketch after this list)
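For the last tip, a minimal aggregation sketch: sample the same question several times, measure how often the majority answer appears, and blend that agreement rate with the verbalized confidences. The sample data and the simple 50/50 blend are assumptions for illustration, not a prescribed formula.

from collections import Counter

# Hypothetical (answer, verbalized confidence) pairs from 5 independent
# samples of the same question at temperature > 0; replace with real calls.
samples = [
    ("1992", 0.9), ("1992", 0.8), ("1992", 0.85), ("1987", 0.95), ("1992", 0.7),
]

# Agreement rate: how often the majority answer appears across samples.
counts = Counter(answer for answer, _ in samples)
majority, votes = counts.most_common(1)[0]
agreement = votes / len(samples)

# Blend verbalized confidence (for the majority answer) with agreement.
verbalized = [conf for ans, conf in samples if ans == majority]
mean_verbalized = sum(verbalized) / len(verbalized)
combined = (agreement + mean_verbalized) / 2  # simple average; tune as needed

print(f"majority answer: {majority}")
print(f"agreement {agreement:.0%}, verbalized {mean_verbalized:.0%}, combined {combined:.0%}")

Note how the one dissenting sample had the highest verbalized confidence (95%); checking agreement across samples catches exactly this kind of confidently wrong answer.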

Related Terms