Positional encoding adds information about each token's position to the input embeddings. Since self-attention is permutation-invariant (it treats its input as an unordered set), positional encodings are essential for the model to understand word order.
Without positional information, "dog bites man" and "man bites dog" would be indistinguishable to the model.
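As one concrete instance, here is a minimal NumPy sketch of the fixed sinusoidal encoding from the original Transformer paper; the text above does not commit to a particular scheme (learned position embeddings are another common choice), and the function name and shapes here are illustrative. Each even dimension 2i of the encoding holds sin(pos / 10000^(2i/d_model)) and the following odd dimension holds the matching cosine, so nearby positions get similar but distinguishable vectors.

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len, d_model):
    """Return a (seq_len, d_model) matrix of sinusoidal position encodings."""
    positions = np.arange(seq_len)[:, np.newaxis]   # (seq_len, 1)
    dims = np.arange(d_model)[np.newaxis, :]        # (1, d_model)
    # Dimensions 2i and 2i+1 share the frequency 1 / 10000^(2i / d_model).
    angle_rates = 1.0 / np.power(10000.0, (2 * (dims // 2)) / d_model)
    angles = positions * angle_rates                # (seq_len, d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])           # even dimensions: sine
    pe[:, 1::2] = np.cos(angles[:, 1::2])           # odd dimensions: cosine
    return pe

# Usage: add the encoding to the token embeddings before the first attention layer.
seq_len, d_model = 10, 64
token_embeddings = np.random.randn(seq_len, d_model)  # placeholder embeddings
x = token_embeddings + sinusoidal_positional_encoding(seq_len, d_model)
```

Because the encoding is simply added to the embeddings, "dog" at position 0 and "dog" at position 2 now produce different input vectors, which is what lets attention distinguish the two sentences above.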