ALBERT XLarge Spanish

This is an ALBERT model trained on a large Spanish corpus. The model was trained on a single TPU v3-8 with the following hyperparameters, step counts, and training time:

  • LR: 0.0003125
  • Batch Size: 128
  • Warmup ratio: 0.00078125
  • Warmup steps: 6250
  • Goal steps: 8000000
  • Total steps: 2775000
  • Total training time (approx.): 64.2 days
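
The figures above are related by simple arithmetic: the warmup ratio times the goal steps gives the warmup steps, and the total steps over the goal steps gives the fraction of the planned schedule that was actually run. The sketch below checks those relations in plain Python. The `lr_at_step` function is an illustrative assumption (linear warmup followed by linear decay to zero at the goal step), not a confirmed description of the schedule used for this model, and all function names are hypothetical.

```python
# Values copied from the hyperparameter list above.
PEAK_LR = 0.0003125
WARMUP_RATIO = 0.00078125
GOAL_STEPS = 8_000_000
TOTAL_STEPS = 2_775_000
TRAINING_DAYS = 64.2


def warmup_steps(ratio: float = WARMUP_RATIO, goal_steps: int = GOAL_STEPS) -> int:
    """Warmup steps implied by the warmup ratio and the goal step count."""
    return round(ratio * goal_steps)


def fraction_completed(total: int = TOTAL_STEPS, goal: int = GOAL_STEPS) -> float:
    """Share of the planned schedule covered by the steps actually run."""
    return total / goal


def steps_per_day(total: int = TOTAL_STEPS, days: float = TRAINING_DAYS) -> float:
    """Average training throughput over the reported training time."""
    return total / days


def lr_at_step(step: int) -> float:
    """ASSUMPTION: linear warmup to PEAK_LR, then linear decay to 0 at GOAL_STEPS.

    The model card does not state the schedule shape; this is only a sketch.
    """
    w = warmup_steps()
    if step < w:
        return PEAK_LR * step / w
    remaining = max(GOAL_STEPS - step, 0)
    return PEAK_LR * remaining / (GOAL_STEPS - w)


if __name__ == "__main__":
    print("warmup steps:", warmup_steps())
    print("fraction of goal completed:", fraction_completed())
    print("average steps per day:", round(steps_per_day()))
    print("assumed LR at last trained step:", lr_at_step(TOTAL_STEPS))
```

Under this assumed schedule, training stopped while the learning rate was still well above zero, since only about a third of the goal steps were completed.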

Training loss

https://drive.google.com/uc?export=view&id=1rw0vvqZY9LZAzRUACLjmP18Fc6D1fv7x


Dataset used to train dccuchile/albert-xlarge-spanish