Self-Consistency | HyperKit.ai

Definition

Self-consistency samples multiple reasoning paths from the model and selects the most common answer through majority voting. Instead of relying on a single chain-of-thought, this technique leverages the diversity of reasoning approaches to improve accuracy.

This method tends to improve accuracy on multi-step reasoning tasks, such as arithmetic word problems, because a mistake in any single reasoning chain is outvoted as long as most chains reach the correct answer.

Key Concepts

Multiple sampling: Generate several different reasoning chains
Diversity through temperature: Higher temperature creates varied paths
Majority voting: Select the most frequent final answer
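Error averaging: Random errors tend to scatter across different wrong answers while correct paths converge on the same one, so the vote filters them out. Systematic errors shared by every sample are not corrected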
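
A rough way to see why voting helps is a simplified model, assumed here purely for illustration: every sample is correct independently with the same probability p, and all incorrect samples agree on a single wrong answer (the worst case for voting). The hypothetical helper below computes the chance that a strict majority of n samples is correct under those assumptions.

```python
from math import comb

def majority_correct_probability(p, n):
    """Probability that more than half of n independent samples are correct.

    Illustrative assumption: each sample is correct with probability p,
    independently of the others, and there is only one wrong answer.
    """
    threshold = n // 2 + 1  # smallest count that forms a strict majority
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(threshold, n + 1)
    )

single = majority_correct_probability(0.7, 1)  # one chain, no voting
voted = majority_correct_probability(0.7, 5)   # five chains, majority vote
weak = majority_correct_probability(0.4, 5)    # chains that are usually wrong
```

Under these assumptions, five samples at p = 0.7 give a correct majority about 84% of the time, while at p = 0.4 the figure drops to about 32%. In this worst case, voting only helps when individual chains are right more often than wrong, and it cannot correct an error that every chain shares.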

Examples

Concept

How Self-Consistency Works

Question: "If a store sells 3 items at $5 each with a 20% discount,
what's the total cost?"

Path 1: 3 × $5 = $15, 20% of $15 = $3, Total = $15 - $3 = $12 ✓
Path 2: $5 - 20% = $4 per item, 3 × $4 = $12 ✓
Path 3: 3 × $5 = $15, discount = $15 × 0.2 = $3, $15 - $3 = $12 ✓
Path 4: 20% off means 80% paid, 3 × $5 × 0.8 = $12 ✓
Path 5: (calculation error) 3 × $5 = $15, 20% = $2, Total = $13 ✗

Majority vote: $12 (4 out of 5 paths)
Final answer: $12

Implementation

Python Code

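from collections import Counter

def self_consistency(prompt, model, n_samples=5, temp=0.7):
    """Sample n_samples reasoning chains and return (majority answer, vote share).

    `model.generate` and `extract_answer` are placeholders: substitute your
    client's sampling call and your own answer parser.
    """
    answers = []

    for _ in range(n_samples):
        # Sample with a nonzero temperature so the reasoning paths differ
        response = model.generate(
            prompt=prompt + "\nLet's think step by step.",
            temperature=temp
        )
        # Extract the final answer from the reasoning; skip unparseable responses
        answer = extract_answer(response)
        if answer is not None:
            answers.append(answer)

    if not answers:
        return None, 0.0

    # Majority vote over the parsed answers
    vote_counts = Counter(answers)
    final_answer, votes = vote_counts.most_common(1)[0]
    # Share of all samples that agreed with the winning answer
    confidence = votes / n_samples

    return final_answer, confidence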
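
The listing above calls `extract_answer` without defining it. A minimal sketch of one possible parser follows, assuming the task has a numeric answer and that the final answer is the last number in the response. Both are assumptions for illustration; a real parser depends on the prompt format and answer type.

```python
import re
from decimal import Decimal, InvalidOperation

# Integers or decimals, optionally with thousands separators (e.g. 1,200.50).
_NUMBER = re.compile(r"-?\d[\d,]*(?:\.\d+)?")

def extract_answer(response):
    """Return the last number in `response` as a normalized string, or None."""
    matches = _NUMBER.findall(response)
    if not matches:
        return None
    raw = matches[-1].replace(",", "")
    try:
        value = Decimal(raw)
    except InvalidOperation:
        return None
    # Normalize so "12", "12.0" and "12.00" all count as the same vote.
    return format(value.normalize(), "f")

parsed = extract_answer("3 × $5 = $15, minus the $3 discount is $12.00")
missing = extract_answer("I am not sure.")
```

Normalizing before voting matters because `Counter` compares answers exactly: without it, "12" and "12.00" would split the vote. This sketch picks the wrong value whenever a response ends with a number that is not the answer, so asking the model for a fixed final line such as "Answer: ..." and parsing only that line is usually more reliable.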

Interactive Exercise

Apply Majority Voting

Given these 5 sampled answers for "What is 15% of 80?", which answer should be selected?

Paths: Path 1: 12, Path 2: 12, Path 3: 15, Path 4: 12, Path 5: 12
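
One way to check the exercise is to tally the five sampled answers directly. This is a small sketch using only the values listed above.

```python
from collections import Counter

paths = [12, 12, 15, 12, 12]  # the five sampled answers from the exercise

# The most frequent answer wins the vote.
winner, votes = Counter(paths).most_common(1)[0]
confidence = votes / len(paths)

print(winner, confidence)  # prints: 12 0.8

# Direct calculation for comparison: 15% of 80.
expected = 15 * 80 / 100
```

The vote selects 12 with four of five paths agreeing, which matches the direct calculation. Path 3 is the outlier and is discarded.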

Pro Tips

Use a temperature of 0.5-0.7 for varied reasoning paths without incoherent output
5-10 samples are usually sufficient; more samples help when the vote is close
Works best when the answer is discrete (number, category, yes/no)
Confidence = vote_count / total_samples gives a rough uncertainty estimate
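
To act on a close vote, it helps to look at the gap between the top two answers as well as the winner's share. The sketch below uses a hypothetical helper, `vote_summary`, which is not part of any library. How small a margin should trigger more sampling is left to the caller.

```python
from collections import Counter

def vote_summary(answers):
    """Return (top_answer, confidence, margin) for a list of sampled answers.

    confidence: share of samples that chose the top answer.
    margin: lead of the top answer over the runner-up, as a share of samples.
    """
    if not answers:
        raise ValueError("answers must not be empty")
    ranked = Counter(answers).most_common(2)
    top_answer, top_count = ranked[0]
    runner_up_count = ranked[1][1] if len(ranked) > 1 else 0
    total = len(answers)
    return top_answer, top_count / total, (top_count - runner_up_count) / total

clear = vote_summary(["12", "12", "15", "12", "12"])
close = vote_summary(["12", "15", "12", "15", "13"])
```

In the first call the top answer leads by three votes out of five. In the second, "12" and "15" are tied, so the margin is zero and the reported winner depends only on which answer appeared first. That is a sign that drawing more samples is worthwhile.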
