Greedy Decoding

Difficulty: Beginner (2/5)
Also known as: Argmax decoding, Deterministic decoding

Definition

Greedy decoding selects the token with the highest probability at every step. It is the simplest generation method and it is deterministic: given the same model and the same input, it produces the same output.

Greedy decoding is fast and consistent, but because each choice is only locally optimal, it can miss sequences that have a higher overall probability.
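The step-by-step selection described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular library implements it: `TOY_MODEL` is a hypothetical lookup table standing in for a real language model, and its probabilities are made up for the example.

```python
# Hypothetical next-token table: maps the last token to a distribution
# over possible next tokens. A real model would compute these scores.
TOY_MODEL = {
    "<start>": {"Python": 0.35, "JavaScript": 0.30, "Rust": 0.20, "depends": 0.15},
    "Python": {"because": 0.40, "is": 0.35, "for": 0.25},
    "because": {"it": 0.45, "of": 0.30, "<end>": 0.25},
    "it": {"<end>": 1.0},
}


def greedy_decode(model, start="<start>", max_steps=10):
    """Repeatedly pick the most probable next token until <end> or max_steps."""
    tokens = []
    current = start
    for _ in range(max_steps):
        dist = model.get(current)
        if not dist:
            break  # no known continuation for this token
        # Greedy step: argmax over the next-token distribution.
        current = max(dist, key=dist.get)
        if current == "<end>":
            break
        tokens.append(current)
    return tokens


print(greedy_decode(TOY_MODEL))  # ['Python', 'because', 'it']
```

Calling `greedy_decode(TOY_MODEL)` again returns the same list, because nothing in the loop involves randomness.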

Key Concepts

  • Deterministic: Same input always gives same output
  • Local optimization: Picks the best token at each step, which does not guarantee the best overall sequence
  • No randomness: Equivalent to sampling with temperature=0
  • Fast: No sampling computation needed
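To see why temperature=0 is treated as equivalent to greedy decoding, the sketch below applies a temperature-scaled softmax to some made-up logits. The function name and the logit values are assumptions for illustration only. As the temperature shrinks, nearly all probability mass moves to the highest-scoring token, so sampling behaves more and more like argmax.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate tokens.
logits = [2.0, 1.8, 0.5]

for t in (1.0, 0.5, 0.1, 0.01):
    probs = softmax_with_temperature(logits, t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")
```

Dividing by a temperature of exactly zero is undefined, so implementations typically handle temperature=0 as a special case that means "take the argmax".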

Examples

Comparison
Greedy vs Sampling
Prompt: "The best programming language is"

GREEDY DECODING (temperature=0):
  Step 1: [Python: 0.35, JavaScript: 0.30, ...] → "Python"
  Step 2: [because: 0.40, for: 0.25, ...] → "because"
  Step 3: [it: 0.45, of: 0.30, ...] → "it"
  ...
  Output: "Python because it is versatile and easy to learn."
  (Always this exact output for this prompt)

SAMPLING (temperature=0.7):
  Run 1: "JavaScript for web development and..."
  Run 2: "Python because of its simplicity..."
  Run 3: "depends on the use case..."
  (Different each time!)

───────────────────────────────────────────

Greedy can get stuck in local optima:
  "The capital of France is" → "Paris" ✓ (good)
  "Write a creative story" → repetitive, boring (bad)

Greedy works well for factual, single-answer questions but struggles with creative or open-ended tasks.
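The contrast between the two modes can be reproduced for a single step with the standard library. This is a rough sketch: the distribution is hypothetical, and the sampler draws directly from it without any temperature scaling, so it only approximates the temperature=0.7 runs shown above.

```python
import random

# Hypothetical distribution over the first generated token.
NEXT_TOKEN_PROBS = {"Python": 0.35, "JavaScript": 0.30, "depends": 0.20, "Rust": 0.15}


def pick_greedy(dist):
    """Always return the most probable token."""
    return max(dist, key=dist.get)


def pick_sampled(dist, rng):
    """Draw one token at random, weighted by its probability."""
    tokens = list(dist)
    weights = list(dist.values())
    return rng.choices(tokens, weights=weights, k=1)[0]


rng = random.Random()  # unseeded, so sampled results vary between runs
greedy_runs = [pick_greedy(NEXT_TOKEN_PROBS) for _ in range(5)]
sampled_runs = [pick_sampled(NEXT_TOKEN_PROBS, rng) for _ in range(5)]

print("greedy: ", greedy_runs)   # the same token five times
print("sampled:", sampled_runs)  # usually a mix of tokens
```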
API Usage
Enabling Greedy Decoding
# OpenAI - set temperature to 0
from openai import OpenAI

openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = openai_client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "What is 2+2?"}],
    temperature=0,  # Greedy decoding
)

# Claude - temperature 0
import anthropic

anthropic_client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
response = anthropic_client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=100,
    temperature=0,  # Greedy decoding
    messages=[...],
)

# HuggingFace - do_sample=False
from transformers import pipeline

generator = pipeline("text-generation")  # falls back to a default model when none is given
output = generator(
    "The answer is",
    do_sample=False,  # Greedy decoding
    max_length=50,
)

# When to use greedy:
# ✓ Math problems (one correct answer)
# ✓ Factual questions
# ✓ Code that must be syntactically correct
# ✓ When reproducibility is required
# ✗ Creative writing
# ✗ Brainstorming
# ✗ When diversity is needed

Interactive Exercise

Identify the Right Method

For each task, should you use greedy decoding or sampling?

1. Translating a legal document
2. Generating marketing tagline options
3. Extracting structured data from text
4. Writing a poem

Pro Tips
  • Use greedy for tasks with objectively correct answers
  • Greedy can cause repetition loops in long generations
  • For reproducible research, always record the decoding settings you used (for example, temperature=0)
  • Consider beam search for better global optimization
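The last tip can be illustrated with a toy case where greedy decoding misses the most probable sequence. The sketch below is a simplified beam search over a hypothetical two-step model. The table, its probabilities, and the function name are assumptions made for the example, and real implementations usually work with log-probabilities and length normalization. Setting `beam_width=1` reduces it to greedy decoding.

```python
# Hypothetical model: maps the sequence so far to a next-token distribution.
TOY_MODEL = {
    (): {"A": 0.6, "B": 0.4},
    ("A",): {"x": 0.55, "y": 0.45},
    ("B",): {"z": 0.9, "w": 0.1},
}


def beam_search(model, beam_width, steps):
    """Keep the beam_width most probable partial sequences at each step."""
    beams = [((), 1.0)]  # (sequence, probability of that sequence)
    for _ in range(steps):
        candidates = []
        for seq, prob in beams:
            for token, p in model.get(seq, {}).items():
                candidates.append((seq + (token,), prob * p))
        if not candidates:
            break  # nothing left to expand
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams[0]  # best sequence found and its probability


greedy_seq, greedy_prob = beam_search(TOY_MODEL, beam_width=1, steps=2)
beam_seq, beam_prob = beam_search(TOY_MODEL, beam_width=2, steps=2)

print("greedy:", greedy_seq, round(greedy_prob, 2))  # ('A', 'x') 0.33
print("beam:  ", beam_seq, round(beam_prob, 2))      # ('B', 'z') 0.36
```

Greedy commits to "A" because it is the best first token, while a beam of width 2 keeps "B" alive long enough to discover that "B z" is more probable overall.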
