Stop Sequences | HyperKit.ai

Definition

Stop sequences are specific strings or tokens that signal the model to stop generating text when encountered. They provide explicit control over where generation should end, beyond the default end-of-sequence token.

Stop sequences are essential for structured output, preventing the model from continuing past the desired response boundary.

Key Concepts

Multiple stops: Several stop sequences can be specified in one request
String matching: Generation stops as soon as any one of the stop strings is produced
Not included: The matched stop sequence is typically excluded from the output
EOS token: The built-in end-of-sequence token acts as an implicit stop sequence
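
The matching behaviour described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular API implements it; the helper name `truncate_at_stop` is hypothetical.

```python
from typing import Optional, Sequence, Tuple


def truncate_at_stop(text: str, stops: Sequence[str]) -> Tuple[str, Optional[str]]:
    """Cut text at the earliest stop sequence; return (kept_text, matched_stop)."""
    earliest_index = len(text)
    matched = None
    for stop in stops:
        index = text.find(stop)
        # Keep whichever stop sequence occurs first in the text.
        if index != -1 and index < earliest_index:
            earliest_index = index
            matched = stop
    # The stop sequence itself is excluded from the returned text.
    return text[:earliest_index], matched


raw = "AI: Hello! How can I help?\nHuman: Thanks!\nAI: You're welcome!"
reply, hit = truncate_at_stop(raw, ["Human:", "User:"])
```

Here `reply` keeps only the first AI turn and `hit` records which stop sequence ended it. When no stop sequence occurs, the text comes back unchanged and `hit` is `None`.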

Examples

Use Cases

Common Stop Sequence Patterns

CHAT/CONVERSATION:
Stop when the AI would start a new turn

stop=["Human:", "User:", "\n\nHuman"]

Without stop:
"AI: Hello! How can I help?
Human: Thanks!
AI: You're welcome!
Human: ..."  ← model generates user turns!

With stop (stops at "Human:"):
"AI: Hello! How can I help?"  ← clean response

CODE GENERATION:
Stop at function boundaries

stop=["\ndef ", "\nclass ", "```"]

Prompt: "Write a function to add numbers"
Output: "def add(a, b):\n    return a + b"
(stops before generating another function; the leading "\n" keeps the
stop from firing on the first "def " of the answer itself)

JSON EXTRACTION:
Stop at closing brace

stop=["}"]

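Output: {"name": "John", "age": 30
(the "}" itself is typically excluded, so append it before parsing;
only reliable for flat objects, since a nested "}" also triggers the stop)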
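
Because the stop sequence is typically excluded from the output, text that was stopped at "}" usually needs the brace put back before it will parse. A minimal sketch, assuming a flat JSON object; the helper name `parse_stopped_json` and the sample string are illustrative only.

```python
import json


def parse_stopped_json(output: str, stop: str = "}") -> dict:
    """Re-append the stop string that was stripped from the output, then parse."""
    return json.loads(output + stop)


# What a model might return when generation halts at "}" (illustrative text).
stopped_output = '{"name": "John", "age": 30'
record = parse_stopped_json(stopped_output)
```

This approach breaks on nested objects, because the first inner "}" would end generation early. In that case a larger max_tokens budget plus a proper JSON parser is the safer choice.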

STRUCTURED OUTPUT:
Stop at section markers

stop=["---", "###", "END"]

Q&A FORMAT:
stop=["Question:", "Q:"]

Ensures only one answer is generated

Implementation

Using Stop Sequences

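OPENAI API:
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "user", "content": prompt}
    ],
    stop=["Human:", "\n\n---", "END"],  # generation halts at the first match
    max_tokens=500  # backup limit if no stop sequence appears
)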

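ANTHROPIC API:
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
response = client.messages.create(
    model="claude-3-opus-20240229",
    messages=[...],
    stop_sequences=["Human:", "\n\nUser:"],
    max_tokens=500
)
# response.stop_reason is "stop_sequence" when one of these strings matched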

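HUGGINGFACE:
# Using eos_token_id for a single-token stop
# (add_special_tokens=False so a BOS token is not picked up by mistake)
newline_id = tokenizer.encode("\n", add_special_tokens=False)[-1]
output = model.generate(
    input_ids,
    eos_token_id=newline_id
)

# For string stops, check the decoded text during generation,
# pass a custom stopping_criteria, or (in recent transformers releases)
# use stop_strings=[...] together with tokenizer=tokenizer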
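
Checking for string stops during generation is trickier than it looks, because a stop string can be split across two streamed chunks. The sketch below is library-independent and only illustrates the buffering idea. `stream_until_stop` is a hypothetical helper, the chunks stand in for decoded model output, and it assumes `stops` is non-empty.

```python
from typing import Iterable, Iterator, Sequence


def stream_until_stop(chunks: Iterable[str], stops: Sequence[str]) -> Iterator[str]:
    """Yield text from chunks, ending just before the first stop string."""
    # Hold back enough characters that a stop split across chunks is still caught.
    holdback = max(len(s) for s in stops) - 1
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        hits = [i for i in (buffer.find(s) for s in stops) if i != -1]
        if hits:
            # Emit everything before the earliest stop, then end generation.
            if min(hits) > 0:
                yield buffer[:min(hits)]
            return
        # Everything except the last `holdback` characters is safe to emit.
        safe = len(buffer) - holdback
        if safe > 0:
            yield buffer[:safe]
            buffer = buffer[safe:]
    # No stop was found: flush whatever is still held back.
    if buffer:
        yield buffer


pieces = ["Hello", " there\nHu", "man: hi"]
text = "".join(stream_until_stop(pieces, ["\nHuman:"]))
```

In this example the stop string "\nHuman:" straddles the last two chunks, and the held-back tail is what allows it to be detected before any part of it is emitted.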

PRACTICAL PATTERNS:

# ReAct agent - stop after action
stop=["Observation:"]

# JSON mode
stop=["}"]  # simple
stop=["}\n"]  # with newline

# Code blocks
stop=["```"]

# Numbered lists (stop after one item; the leading "\n" avoids
# matching "2." inside text such as "v1.2.0")
stop=["\n2.", "\n2)"]

# Conversation
stop=["\nUser:", "\nHuman:", "\n\n"]

GOTCHAS:
- Stop sequences are CASE SENSITIVE
- Whitespace matters! "\n\n" ≠ "\n \n"
- Test thoroughly with edge cases
- Some APIs limit the number of stop sequences (OpenAI's API, for example, accepts up to 4)
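
A small pre-flight check can catch several of these gotchas before a request is sent. The sketch below is an assumption-laden illustration: `prepare_stops` is a hypothetical helper, and the default `max_stops=4` is only a placeholder that should be replaced with your provider's documented limit.

```python
from typing import List, Sequence


def prepare_stops(stops: Sequence[str], max_stops: int = 4) -> List[str]:
    """De-duplicate stop strings (order preserved) and enforce a count limit."""
    seen = set()
    cleaned = []
    for stop in stops:
        # Matching is exact, so "END" and "end" are kept as distinct entries.
        # Empty strings are dropped because they can never be a useful stop.
        if stop and stop not in seen:
            seen.add(stop)
            cleaned.append(stop)
    if len(cleaned) > max_stops:
        raise ValueError(f"{len(cleaned)} stop sequences given; limit is {max_stops}")
    return cleaned


stops = prepare_stops(["END", "end", "\n\n", "\n \n", "END", ""])
```

Note that "\n\n" and "\n \n" both survive, since whitespace differences make them different stop sequences.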

Interactive Exercise

Design Stop Sequences

You want to generate exactly ONE paragraph of text. What stop sequences would you use?

Pro Tips

Test stop sequences with actual model output, not assumptions
Include newline variations: "\n\n", "\n \n", "\r\n\r\n"
For agents, stop at "Observation:" to allow tool execution
Combine with max_tokens as a backup limit
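
The agent tip can be illustrated with a toy ReAct-style step. Everything here is a stand-in: `fake_model` is a hypothetical replacement for a real LLM call, and the `add[a, b]` action format is invented for the example.

```python
import re
from typing import Sequence


def fake_model(prompt: str, stop: Sequence[str]) -> str:
    """Stand-in for an LLM call (hypothetical); applies stops like an API would."""
    # Left unchecked, a model may invent its own (wrong) tool result.
    raw = "Thought: I should add the numbers.\nAction: add[2, 3]\nObservation: 6"
    cut = min((raw.find(s) for s in stop if s in raw), default=len(raw))
    return raw[:cut]


def run_tool(step: str) -> str:
    """Parse 'Action: add[a, b]' from the model output and run the real tool."""
    match = re.search(r"Action: add\[(\d+), (\d+)\]", step)
    if match is None:
        raise ValueError("no action found in model output")
    return str(int(match.group(1)) + int(match.group(2)))


step = fake_model("What is 2 + 3?", stop=["Observation:"])
observation = run_tool(step)
# The real tool result replaces whatever the model would have made up.
transcript = f"{step}Observation: {observation}"
```

Stopping at "Observation:" hands control back to the calling code, which runs the tool and appends the genuine result before asking the model to continue.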
