---
language:
  - es
tags:
  - generated_from_trainer
  - recipe-generation
widget:
  - text: >-
      <RECIPE_START> <INPUT_START> salmón <NEXT_INPUT> zumo de naranja
      <NEXT_INPUT> aceite de oliva <NEXT_INPUT> sal <NEXT_INPUT> pimienta
      <INPUT_END> <INGR_START>
  - text: >-
      <RECIPE_START> <INPUT_START> harina <NEXT_INPUT> azúcar <NEXT_INPUT>
      huevos <NEXT_INPUT> chocolate <NEXT_INPUT> levadura Royal <INPUT_END>
      <INGR_START>
inference:
  parameters:
    top_k: 50
    top_p: 0.92
    do_sample: true
    num_return_sequences: 3
    max_new_tokens: 100
---

Model description

This model is a fine-tuned version of flax-community/gpt-2-spanish on a custom dataset (not publicly available). The dataset consists of data crawled from three Spanish cooking websites and contains approximately 50,000 recipes. The model achieves the following results on the evaluation set:

  • Loss: 0.5796

Contributors

How to use it

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the fine-tuned tokenizer and model from the Hugging Face Hub
model_checkpoint = 'gastronomia-para-to2/gastronomia_para_to2'
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
model = AutoModelForCausalLM.from_pretrained(model_checkpoint)

The tokenizer makes use of the following special tokens to indicate the structure of the recipe:

special_tokens = [
    '<INPUT_START>',
    '<NEXT_INPUT>',
    '<INPUT_END>',
    '<TITLE_START>',
    '<TITLE_END>',
    '<INGR_START>',
    '<NEXT_INGR>',
    '<INGR_END>',
    '<INSTR_START>',
    '<NEXT_INSTR>',
    '<INSTR_END>',
    '<RECIPE_START>',
    '<RECIPE_END>'
]
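
As a quick sanity check (a minimal sketch, not part of the original card), you can confirm that the tokenizer maps each of these markers to a single vocabulary id instead of splitting it into sub-word pieces:

# Each special token should resolve to its own id (different from the unknown-token id)
for token in special_tokens:
    print(token, tokenizer.convert_tokens_to_ids(token))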

The input should be of the form:

<RECIPE_START> <INPUT_START> ingredient_1 <NEXT_INPUT> ingredient_2 <NEXT_INPUT> ... <NEXT_INPUT> ingredient_n <INPUT_END> <INGR_START>
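
For example, a small helper like the following (a hypothetical convenience function, not part of the model card) can build that prompt from a list of ingredients:

def build_prompt(ingredients):
    # Join the ingredients with <NEXT_INPUT> and wrap them in the recipe/input markers
    joined = ' <NEXT_INPUT> '.join(ingredients)
    return f'<RECIPE_START> <INPUT_START> {joined} <INPUT_END> <INGR_START>'

prompt = build_prompt(['salmón', 'zumo de naranja', 'aceite de oliva', 'sal', 'pimienta'])
print(prompt)  # matches the example prompt used in the generation snippet below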

We use the following configuration to generate recipes, but feel free to adjust the parameters as needed:

# 'input' is a prompt string in the format described above, e.g. the salmón example:
input = '<RECIPE_START> <INPUT_START> salmón <NEXT_INPUT> zumo de naranja <NEXT_INPUT> aceite de oliva <NEXT_INPUT> sal <NEXT_INPUT> pimienta <INPUT_END> <INGR_START>'

tokenized_input = tokenizer(input, return_tensors='pt')
output = model.generate(**tokenized_input,
                        max_length=600,
                        do_sample=True,
                        top_p=0.92,
                        top_k=50,
                        num_return_sequences=3)
pre_output = tokenizer.decode(output[0], skip_special_tokens=False)

The recipe ends where the <RECIPE_END> special token appears for the first time.
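
A minimal way to keep only that part of the decoded text (a post-processing sketch, not part of the original card) is:

# Truncate the decoded output at the first <RECIPE_END> marker, if present
end_marker = '<RECIPE_END>'
recipe = pre_output.split(end_marker)[0] + end_marker if end_marker in pre_output else pre_output
print(recipe)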

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 6
  • mixed_precision_training: Native AMP
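
For reference, these settings roughly correspond to the following transformers.TrainingArguments (a sketch assuming the standard Trainer was used; the actual training script is not published and output_dir is a placeholder; the Adam betas and epsilon listed above are the Trainer defaults):

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir='gastronomia_para_to2',  # placeholder output directory
    learning_rate=2e-05,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    gradient_accumulation_steps=8,      # effective train batch size of 8
    lr_scheduler_type='linear',
    num_train_epochs=6,
    fp16=True,                          # native AMP mixed-precision training
)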

Training results

Training Loss | Epoch | Step  | Validation Loss
------------- | ----- | ----- | ---------------
0.6213        | 1.0   |  5897 | 0.6214
0.5905        | 2.0   | 11794 | 0.5995
0.5777        | 3.0   | 17691 | 0.5893
0.5740        | 4.0   | 23588 | 0.5837
0.5553        | 5.0   | 29485 | 0.5807
0.5647        | 6.0   | 35382 | 0.5796

Framework versions

  • Transformers 4.17.0
  • Pytorch 1.11.0+cu102
  • Datasets 2.0.0
  • Tokenizers 0.11.6

References

The list of special tokens used to encode the recipe structure has been taken from RecipeNLG: A Cooking Recipes Dataset for Semi-Structured Text Generation.