The context window is the maximum amount of text (measured in tokens) that an LLM can process at once. It includes both your input prompt and the model's response. Think of it as the model's "working memory"—everything outside this window is invisible to the model.
Context windows in modern models range from roughly 4K to 200K+ tokens, with the largest windows able to hold an entire book or a sizable codebase in a single request.
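Because the prompt and the response share one window, a practical check is to count prompt tokens and reserve room for the reply before sending a request. Here is a minimal sketch using OpenAI's `tiktoken` tokenizer; the 128K window and 4K response budget are illustrative assumptions, not properties of any particular model.

```python
import tiktoken

# Assumed limits for illustration -- check your model's documentation.
CONTEXT_WINDOW = 128_000   # total tokens the model can process at once
RESPONSE_BUDGET = 4_000    # tokens reserved for the model's reply

def fits_in_context(prompt: str) -> bool:
    """Return True if the prompt leaves room for the response."""
    # cl100k_base is the encoding used by several OpenAI models.
    enc = tiktoken.get_encoding("cl100k_base")
    prompt_tokens = len(enc.encode(prompt))
    # Prompt and response share one window, so the prompt must
    # leave RESPONSE_BUDGET tokens free.
    return prompt_tokens + RESPONSE_BUDGET <= CONTEXT_WINDOW

print(fits_in_context("Summarize this document."))  # True for a short prompt
```

If the check fails, common remedies are truncating the prompt, summarizing earlier context, or splitting the work across multiple requests.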