Higher values produce more diverse outputs
The maximum number of new tokens to generate
Higher values allow more low-probability tokens to be sampled
Penalize repeated tokens
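
The descriptions above do not name the settings they belong to. Assuming they refer to a sampling temperature, a cap on new tokens, nucleus (top-p) sampling, and a repetition penalty, the sketch below shows one way the four could interact in a sampling loop. Every function and parameter name here is hypothetical, and the penalty rule is one common formulation rather than a confirmed implementation.

```python
import math
import random


def apply_repetition_penalty(logits, generated_ids, penalty):
    """Lower the score of tokens that were already generated (penalty > 1.0)."""
    out = list(logits)
    for tok in set(generated_ids):
        # Assumed rule: divide positive logits, multiply negative ones,
        # so the token becomes less likely in both cases.
        out[tok] = out[tok] / penalty if out[tok] > 0 else out[tok] * penalty
    return out


def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their mass reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / mass
    return filtered


def sample_next_token(logits, generated_ids, temperature=1.0, top_p=1.0,
                      repetition_penalty=1.0, rng=random):
    """Apply the three adjustments in order, then draw one token id."""
    logits = apply_repetition_penalty(logits, generated_ids, repetition_penalty)
    probs = softmax_with_temperature(logits, temperature)
    probs = top_p_filter(probs, top_p)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate(logits_fn, max_new_tokens, **sampling_kwargs):
    """Sample at most max_new_tokens ids; logits_fn stands in for a model."""
    ids = []
    for _ in range(max_new_tokens):
        ids.append(sample_next_token(logits_fn(ids), ids, **sampling_kwargs))
    return ids
```

With a stand-in such as `generate(lambda ids: [1.0, 0.5, 0.1], 5, temperature=0.8, top_p=0.9)`, the call returns five token ids. Raising `temperature` or `top_p` widens the set of tokens that can be drawn, and a `repetition_penalty` above 1.0 makes earlier tokens less likely to recur.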