Sampling is the process of selecting the next token from a probability distribution during text generation. Instead of always picking the most likely token (greedy decoding), sampling draws the next token at random, with each candidate chosen in proportion to its probability.
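As a rough illustration of the difference between greedy selection and sampling, here is a minimal sketch using only the Python standard library. The toy distribution `probs` and the helper names `greedy_pick` and `sample_pick` are made up for this example and do not come from any particular library.

```python
import random

# Hypothetical next-token distribution (illustrative values only).
probs = {"cat": 0.6, "dog": 0.3, "fish": 0.1}


def greedy_pick(probs):
    """Return the single most likely token (greedy decoding)."""
    return max(probs, key=probs.get)


def sample_pick(probs, rng=random):
    """Draw one token at random, weighted by its probability."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


if __name__ == "__main__":
    print("greedy: ", greedy_pick(probs))
    # Sampling can return a different token on each call.
    print("sampled:", [sample_pick(probs) for _ in range(5)])
```

Greedy selection returns the same token every time for a given distribution, while repeated sampling returns less likely tokens some of the time, roughly in proportion to their weights.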
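Sampling strategies control how much randomness is applied, trading off creativity against coherence: temperature rescales the distribution to make it sharper or flatter, top-k restricts the choice to the k most likely tokens, and top-p (nucleus sampling) restricts it to the smallest set of most likely tokens whose cumulative probability reaches p.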
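The three strategies can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not how any specific framework implements them: the function names, the example `logits`, and the parameter values are all hypothetical, and real implementations typically operate on tensors rather than lists.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw scores to probabilities. Assumes temperature > 0.
    Values below 1 sharpen the distribution; values above 1 flatten it."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_filter(probs, k):
    """Keep the k most likely tokens (k >= 1), zero the rest, renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(order[:k])
    kept = [p if i in keep else 0.0 for i, p in enumerate(probs)]
    total = sum(kept)
    return [p / total for p in kept]


def top_p_filter(probs, p):
    """Keep the smallest set of most likely tokens whose cumulative
    probability reaches p, zero the rest, renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, mass = set(), 0.0
    for i in order:
        keep.add(i)
        mass += probs[i]
        if mass >= p:
            break
    kept = [q if i in keep else 0.0 for i, q in enumerate(probs)]
    total = sum(kept)
    return [q / total for q in kept]


def sample_index(probs, rng=random):
    """Draw a token index at random, weighted by probability."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.5, -1.0]  # hypothetical scores for four tokens
    probs = softmax_with_temperature(logits, temperature=0.8)
    probs = top_k_filter(probs, k=3)
    probs = top_p_filter(probs, p=0.9)
    print("sampled token index:", sample_index(probs))
```

The order used here (temperature, then top-k, then top-p) is one common arrangement; whether and in what order the filters are combined is a design choice that varies between systems.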