Inference / Decoding Strategies

Repetition Penalty

Foundational [2/5]
Frequency penalty · Presence penalty · No-repeat penalty

Definition

Repetition penalty reduces the probability of tokens that have already appeared in the generated text, discouraging the model from repeating words or phrases. This addresses a common failure mode where LLMs get stuck in repetitive loops.

Different APIs implement variations: frequency penalty (scales with the token's occurrence count), presence penalty (a fixed penalty once a token has appeared), and n-gram blocking (a hard constraint on repeated sequences).

Key Concepts

  • Frequency penalty: Penalty grows with the token's occurrence count (see the sketch after this list)
  • Presence penalty: Fixed penalty once a token has appeared at all
  • N-gram blocking: Hard constraint that prevents any n-token sequence from repeating
  • Context window: How far back in the generated text to check for repetition
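
To make the first two mechanisms concrete, here is a minimal sketch of frequency and presence penalties applied to raw logits. It is illustrative only: the function name, penalty values, and toy vocabulary are assumptions, not any provider's actual implementation.

import numpy as np

def apply_penalties(logits, counts, frequency_penalty=0.5, presence_penalty=0.5):
    # counts[i] = how many times token i has appeared in the generated text so far
    logits = logits - frequency_penalty * counts       # scales with occurrence count
    logits = logits - presence_penalty * (counts > 0)  # flat penalty once a token appears
    return logits

logits = np.array([3.0, 1.5, 0.2])      # next-token logits for a toy 3-token vocabulary
counts = np.array([5, 1, 0])            # token 0 ("AI") has already appeared 5 times
print(apply_penalties(logits, counts))  # [0.  0.5 0.2]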

Examples

Problem
Without Repetition Penalty
THE REPETITION PROBLEM:

Prompt: "Write about AI"

Without penalty (can get stuck):
"AI is transforming the world. AI is changing how we work. AI is revolutionizing healthcare. AI is making things better. AI is AI is AI is AI is AI is..."

WHY THIS HAPPENS:
1. "AI" has high probability given the context
2. Each occurrence of "AI" reinforces the pattern
3. The model enters a degenerate loop
4. Especially common with:
   - Long generations
   - Greedy or low-temperature decoding
   - Beam search
   - Certain topics/patterns

TYPES OF REPETITION:
- Word-level: "the the the"
- Phrase-level: "in order to... in order to..."
- Sentence-level: repeating whole sentences
- Pattern-level: alternating A-B-A-B-A-B
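
The loop is easy to reproduce and to suppress. A minimal sketch, assuming the HuggingFace transformers library and the small gpt2 checkpoint (exact outputs will vary by model and prompt):

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = tokenizer("Write about AI:", return_tensors="pt")

# Greedy decoding, no penalty: prone to the degenerate loop shown above.
plain = model.generate(**inputs, max_new_tokens=60, do_sample=False)

# Same decoding with a mild repetition penalty on previously seen tokens.
penalized = model.generate(**inputs, max_new_tokens=60, do_sample=False,
                           repetition_penalty=1.2)

print(tokenizer.decode(plain[0], skip_special_tokens=True))
print(tokenizer.decode(penalized[0], skip_special_tokens=True))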
Implementation
Penalty Mechanisms
FREQUENCY PENALTY (OpenAI):

logit_new = logit - frequency_penalty × count(token)

Token "AI" appeared 5 times:
  Original logit: 3.0
  With freq_penalty=0.5: 3.0 - 0.5 × 5 = 0.5

PRESENCE PENALTY (OpenAI):

logit_new = logit - presence_penalty × (count > 0 ? 1 : 0)

Token "AI" appeared at all:
  Original logit: 3.0
  With pres_penalty=1.0: 3.0 - 1.0 = 2.0

REPETITION PENALTY (HuggingFace):

if token in previous_tokens:
    if logit > 0: logit = logit / repetition_penalty
    else:         logit = logit × repetition_penalty

rep_penalty=1.2: logit 3.0 → 2.5

N-GRAM BLOCKING:

no_repeat_ngram_size=3 prevents ANY 3-token sequence from repeating:
"the big dog" appeared → "the big dog" is blocked from being generated again

API PARAMETERS:

# OpenAI
response = openai.chat.completions.create(
    model="gpt-4",
    messages=[...],
    frequency_penalty=0.5,  # range: -2.0 to 2.0
    presence_penalty=0.5    # range: -2.0 to 2.0
)

# HuggingFace
output = model.generate(
    input_ids,
    repetition_penalty=1.2,   # 1.0 = off
    no_repeat_ngram_size=3    # block repeated 3-grams
)
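
For intuition, the HuggingFace-style multiplicative rule and n-gram blocking can be re-implemented in a few lines of plain Python. This is an illustrative sketch, not the transformers source, and the function names are hypothetical:

def repetition_penalty(logit, seen, penalty=1.2):
    # Multiplicative rule: divide positive logits, multiply negative ones,
    # so both directions move the token toward "less likely".
    if not seen:
        return logit
    return logit / penalty if logit > 0 else logit * penalty

def banned_next_tokens(generated, n=3):
    # no_repeat_ngram_size-style blocking: if the last n-1 tokens have been
    # seen before, ban whatever token completed that n-gram previously.
    prefix = tuple(generated[-(n - 1):])
    banned = set()
    for i in range(len(generated) - n + 1):
        if tuple(generated[i:i + n - 1]) == prefix:
            banned.add(generated[i + n - 1])
    return banned

print(repetition_penalty(3.0, seen=True))                              # 2.5, as above
print(banned_next_tokens(["the", "big", "dog", "ran", "the", "big"]))  # {'dog'}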

Interactive Exercise

Calculate Penalized Logit

Token "the" has logit 4.0 and has appeared 3 times. Calculate the new logit with frequency_penalty=0.8.

Pro Tips
  • Start with frequency_penalty=0.3-0.5 for natural text
  • Use presence_penalty for topic diversity, frequency for word diversity
  • Too high a penalty forces unnatural word choices
  • Combine with temperature for best results (see the sketch below)
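
As a sketch of the last tip, reusing model and input_ids from the HuggingFace snippet above (the parameter values are illustrative assumptions, not recommendations from any library):

output = model.generate(
    input_ids,
    do_sample=True,           # sample instead of greedy decoding
    temperature=0.8,          # mild randomness helps break loops
    repetition_penalty=1.1,   # gentle soft penalty
    no_repeat_ngram_size=3,   # hard backstop against verbatim 3-gram repeats
    max_new_tokens=100,
)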
