Question about Setting max_length and max_new_tokens in Generation Configuration for CNN/DM Dataset
#50 · opened by cooper521
In the generation_config.json file, I am quite puzzled by "max_length": 142, because the official documentation says:
- max_length (int, optional, defaults to 20) – The maximum length the generated tokens can have. Corresponds to the length of the input prompt + max_new_tokens. Its effect is overridden by max_new_tokens, if also set.
- max_new_tokens (int, optional) – The maximum number of tokens to generate, ignoring the number of tokens in the prompt.
If that is how it works, this configuration would cap the total length of the article + summary at only 142 tokens. That seems obviously inappropriate for CNN/DM, since the articles usually run to hundreds of tokens. So what would be a good setting for this parameter?
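To make the question concrete, here is a small sketch (plain Python, not the actual transformers implementation) of the precedence rule the documentation describes: max_length caps prompt + generated tokens, but is ignored whenever max_new_tokens is set. The function name and numbers are illustrative, not from the library.

```python
def effective_total_length(prompt_len, max_length=20, max_new_tokens=None):
    """Cap on (prompt + generated) tokens under the documented rule.

    Per the docs: max_length bounds prompt + generation combined,
    but its effect is overridden by max_new_tokens when that is set.
    """
    if max_new_tokens is not None:
        # max_new_tokens budgets generated tokens only, on top of the prompt.
        return prompt_len + max_new_tokens
    return max_length

# With max_length=142 and no max_new_tokens, an 800-token article
# would leave no room for a summary under this reading:
print(effective_total_length(prompt_len=800, max_length=142))      # 142
# Setting max_new_tokens=142 instead reserves 142 tokens for generation:
print(effective_total_length(prompt_len=800, max_new_tokens=142))  # 942
```

This is exactly the reading that makes "max_length": 142 look too small for CNN/DM articles, which is what prompts the question.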