NLLB 600M TH-EN finetuned

This model is finetuned from facebook/nllb-200-distilled-600M on the SCB-1M and OPUS datasets. The finetuning script is available on GitHub, and the full finetuning logs are on Weights & Biases (wandb).

Usage

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline
import torch

MODEL_NAME = "wtarit/nllb-600M-th-en"

model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

# Use GPU 0 if available, otherwise fall back to CPU (-1)
device = 0 if torch.cuda.is_available() else -1

# NLLB uses FLORES-200 language codes: tha_Thai (Thai) -> eng_Latn (English)
translation_pipeline = pipeline(
    "translation",
    model=model,
    tokenizer=tokenizer,
    src_lang="tha_Thai",
    tgt_lang="eng_Latn",
    max_length=400,
    device=device,
)

# Run the translation pipeline (input means "Hello, we are a translation model")
result = translation_pipeline("สวัสดี เราคือโมเดลแปลภาษา")
print(result[0]["translation_text"])
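
If you prefer calling the model directly rather than going through the pipeline, the following is a minimal sketch based on the standard NLLB generation pattern, reusing the model and MODEL_NAME from above; forced_bos_token_id selects the target language:

# Re-create the tokenizer with the source language set to Thai
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, src_lang="tha_Thai")
inputs = tokenizer("สวัสดี เราคือโมเดลแปลภาษา", return_tensors="pt").to(model.device)
generated = model.generate(
    **inputs,
    # Force the decoder to start generating in English (eng_Latn)
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("eng_Latn"),
    max_length=400,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])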

Score

BLEU score (computed with sacrebleu): 27.37 on the IWSLT 2015 test set.
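
For reference, a corpus-level sacrebleu score is computed along these lines; the hypothesis and reference lists below are placeholders, not the actual IWSLT 2015 data:

import sacrebleu

# Placeholder data: substitute model translations and IWSLT 2015 references
hypotheses = ["Hello, we are a translation model."]
references = [["Hello, we are a translation model."]]  # one inner list per reference stream

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU = {bleu.score:.2f}")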
