Prefilling starts the assistant's response with specific text to guide the model toward the expected format or content. By providing the opening tokens, you prime the model to continue in that direction.
This technique is particularly useful for ensuring consistent output formats like JSON, XML, or specific structural patterns.