Decoder-only is a transformer architecture in which the model generates text one token at a time, attending only to previous tokens (causal attention). This is the architecture used by GPT, Claude, LLaMA, and most modern LLMs.
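The causal-attention constraint can be sketched as a mask over the attention score matrix: positions in the future are set to negative infinity before the softmax, so each token's attention weights over future tokens come out as zero. This is a minimal NumPy illustration, not a full attention implementation (there are no queries, keys, or values here; the function name is illustrative):

```python
import numpy as np

def causal_attention_weights(scores):
    """Apply a causal mask to raw attention scores, then softmax each row.

    scores: (seq_len, seq_len) array where scores[i, j] is the raw score
    for token i attending to token j.
    """
    seq_len = scores.shape[0]
    # Mask out the strict upper triangle: token i may only attend to j <= i.
    future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    masked = np.where(future, -np.inf, scores)
    # Row-wise softmax; masked (future) positions get exactly zero weight.
    exp = np.exp(masked - masked.max(axis=-1, keepdims=True))
    return exp / exp.sum(axis=-1, keepdims=True)

weights = causal_attention_weights(np.zeros((4, 4)))
print(np.round(weights, 2))
```

With uniform scores, token 0 attends only to itself (weight 1.0), while token 3 spreads its attention evenly over all four positions (0.25 each); every row still sums to 1.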
The "decoder" name comes from the original transformer paper ("Attention Is All You Need", 2017), where the decoder was the component that generated output sequences while attending to an encoder's representation of the input.