Indonesian T5 Language Models
Indonesian T5 models pre-trained with nanoT5 and fine-tuned on IndoNLG tasks. GitHub: https://github.com/LazarusNLP/IndoT5/
IndoNanoT5 Base is an Indonesian sequence-to-sequence language model based on the T5 architecture. We pre-trained it on the Indonesian subset of the open-source uonlp/CulturaX corpus. On a held-out subset of the corpus, the model achieved an evaluation loss of 2.082, or a perplexity of about 8.02.
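The two numbers are consistent under the usual convention that perplexity is the exponential of the cross-entropy loss:

import math

eval_loss = 2.082
print(math.exp(eval_loss))  # ≈ 8.02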
This model was trained with the nanoT5 PyTorch framework, with all training done on an NVIDIA H100 GPU. LazarusNLP/IndoNanoT5-base is released under the Apache 2.0 license.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the tokenizer and model weights from the Hugging Face Hub
model_checkpoint = "LazarusNLP/IndoNanoT5-base"
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(model_checkpoint)
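As a quick sanity check, the checkpoint can be queried with a T5 span-infilling prompt. This is a minimal sketch, assuming the standard T5 sentinel tokens; the example sentence is ours, and the raw pre-trained model produces sentinel-delimited spans rather than fluent text (the fine-tuned IndoNLG variants in this collection are intended for downstream use).

# Hedged usage sketch: span infilling with a T5 sentinel token.
# "Ibu kota Indonesia adalah <extra_id_0>." = "The capital of Indonesia is <extra_id_0>."
input_text = "Ibu kota Indonesia adalah <extra_id_0>."
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=False))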
Around 4B tokens from the uonlp/CulturaX corpus were used during pre-training.
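For reference, the corpus can be inspected with the Hugging Face datasets library. This is a hedged sketch, not taken from the model card: the "id" config name follows CulturaX's per-language configs, and the "text" field name is an assumption.

from datasets import load_dataset

# Stream the Indonesian split of CulturaX without downloading it in full.
# Note: the dataset is gated on the Hub and may require accepting its terms.
dataset = load_dataset("uonlp/CulturaX", "id", split="train", streaming=True)
for example in dataset:
    print(example["text"][:200])  # "text" field name is an assumption
    break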
The following hyperparameters were used during training:

- total_steps: 65536
- input_length: 512
- batch_size: 128
- grad_acc: 1
- base_lr: 5e-3
- optimizer: AdamWScaled with betas=(0.9,0.999) and epsilon=1e-08
- weight_decay: 0.0
- lr_scheduler: cosine
- warmup_steps: 10000
- final_cosine: 1e-5
- grad_clip: 1.0
- precision: bf16
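To make the schedule concrete, here is a hedged sketch of the learning-rate curve these settings imply: warmup to base_lr (assumed linear), then cosine decay to the final_cosine floor. nanoT5's actual AdamWScaled optimizer applies additional per-parameter scaling, so this only illustrates the schedule's shape.

import math

# Illustrative sketch of the implied LR schedule (not nanoT5's exact code).
total_steps, warmup_steps = 65536, 10000
base_lr, final_lr = 5e-3, 1e-5

def lr_at(step: int) -> float:
    if step < warmup_steps:
        return base_lr * step / warmup_steps  # linear warmup (assumption)
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return final_lr + (base_lr - final_lr) * 0.5 * (1 + math.cos(math.pi * progress))

print(lr_at(10000))  # 0.005 at the end of warmup
print(lr_at(65536))  # 1e-05 at the final step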
We would like to acknowledge nanoT5 for inspiring this project.
BhinnekaLM is developed with love by the LazarusNLP team.