Prefilling

Beginner [2/5]
Response priming, Output seeding, Partial completion

Definition

Prefilling starts the assistant's response with specific text to guide the model toward the expected format or content. By providing the opening tokens, you prime the model to continue in that direction.

This technique is particularly useful for ensuring consistent output formats like JSON, XML, or specific structural patterns.

Key Concepts

  • Response priming: Setting the initial tokens of the output
  • Format enforcement: Starting with "{" strongly steers the model toward JSON output
  • Continuation behavior: Model completes what you started
  • Reduced preamble: Skips "Sure, I'll help..." type responses

Examples

JSON Prefill
Structured Output
User: "Extract the name and age from: 'John Smith is 25 years old'"
Assistant (prefilled): {
Model continues: "name": "John Smith", "age": 25}
Prefilling with "{" makes a JSON object the most likely continuation, though the result should still be validated.
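
Because the prefill is supplied by the caller, the model's reply typically contains only the continuation, so the two pieces have to be joined before parsing. The following is a minimal sketch of that step, assuming the continuation arrives as a plain string; the helper name `parse_prefilled_json` is hypothetical and not part of any API.

```python
import json
from typing import Any


def parse_prefilled_json(prefill: str, continuation: str) -> Any:
    """Rejoin the prefill with the model's continuation and parse the JSON.

    Hypothetical helper for illustration only.
    """
    full_text = prefill + continuation
    # raw_decode parses the first JSON value and ignores any text after it,
    # which tolerates a model that adds commentary after the closing brace.
    value, _end = json.JSONDecoder().raw_decode(full_text)
    return value


# Prefill and continuation taken from the example above.
result = parse_prefilled_json("{", '"name": "John Smith", "age": 25}')
print(result)  # {'name': 'John Smith', 'age': 25}
```

If the continuation is cut off before the closing brace, `raw_decode` raises `json.JSONDecodeError`, which is a useful signal that the prefill alone did not produce valid JSON.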
Format Prefill
Skip Preamble
Without prefill: "Sure! Here's a haiku about programming: Code flows like water..."
With prefill "Code":
  Code flows like water
  Bugs emerge from the shadows
  Debug, compile, run
Prefilling skips conversational preamble and jumps to content.
API Usage
Claude API Example
messages = [
    {"role": "user", "content": "List 3 colors as JSON"},
    {"role": "assistant", "content": "["},  # Prefill
]
# Model will continue: '"red", "blue", "green"]'
Some APIs allow prefilling by accepting a final assistant message; the response then contains only the continuation, not the prefill itself.
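
The message layout above can be wrapped in a small helper. This is a sketch under stated assumptions, not a definitive client: `send` is a hypothetical callable standing in for whatever API call is actually used, and it is assumed to return only the continuation text. The whitespace trimming reflects the assumption that some APIs reject a prefill ending in whitespace; check this against the documentation of the API in use.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def build_prefilled_messages(user_prompt: str, prefill: str) -> List[Message]:
    """Build a message list whose last entry is the assistant prefill."""
    return [
        {"role": "user", "content": user_prompt},
        # Assumption: trailing whitespace in a prefill may be rejected, so trim it.
        {"role": "assistant", "content": prefill.rstrip()},
    ]


def complete_with_prefill(
    send: Callable[[List[Message]], str], user_prompt: str, prefill: str
) -> str:
    """Send a prefilled conversation and return prefill + continuation."""
    messages = build_prefilled_messages(user_prompt, prefill)
    continuation = send(messages)  # hypothetical transport, returns text only
    return messages[-1]["content"] + continuation


def fake_send(messages: List[Message]) -> str:
    """Stand-in for a real API call, returning the continuation shown above."""
    return '"red", "blue", "green"]'


print(complete_with_prefill(fake_send, "List 3 colors as JSON", "["))
# ["red", "blue", "green"]
```

Passing the transport in as a parameter keeps the prefill logic testable without network access; a real `send` would call the provider's SDK and extract the text of the reply.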

Interactive Exercise

Choose the Prefill

What prefill would ensure this output format?

Desired output:

Summary: [text]
Score: [number]
Recommendation: [text]
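
Whichever prefill is chosen, the completed response still has to be read back into fields. The sketch below assumes the full text (prefill plus continuation) follows the three-line layout above and that the score is a plain number; `parse_report` and the sample text are hypothetical and for illustration only.

```python
import re

# One labeled field per line; [ \t]* avoids swallowing a newline after the colon.
FIELD_PATTERN = re.compile(
    r"^(Summary|Score|Recommendation):[ \t]*(.*)$", re.MULTILINE
)


def parse_report(full_text: str) -> dict:
    """Parse 'Summary / Score / Recommendation' lines into a dict."""
    fields = {
        label.lower(): value.strip()
        for label, value in FIELD_PATTERN.findall(full_text)
    }
    missing = {"summary", "score", "recommendation"} - fields.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    # Assumption: the score is written as a bare number such as "7" or "7.5".
    fields["score"] = float(fields["score"])
    return fields


sample = "Summary: Clear and concise\nScore: 7\nRecommendation: Accept"
print(parse_report(sample))
```

Raising on a missing field makes format drift visible immediately instead of letting a partial record pass through.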

Pro Tips
  • Prefill with "{" or "[" to make JSON object or array output far more likely, and still validate the result
  • Use prefills to skip "Sure!" or "Here's..." preambles
  • Combine with format instructions for best results
  • Check API documentation: not all APIs support prefilling

Related Terms