PhiMaestra - A small model for Italian translation based on Phi-3

This model was fine-tuned on roughly 500,000 examples from the Tatoeba dataset, covering translations from English to Italian and from Italian to English. It was fine-tuned to return a translation directly, without any additional explanation. The model is based on Microsoft's Phi-3 and has about 3.8B parameters, stored in bfloat16.

Fine-tuning took about 10 hours on an NVIDIA A10G GPU.
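The exact training data format is not documented here. As a rough illustration, here is a minimal sketch of how one Tatoeba sentence pair might be rendered as a chat-style example, assuming the same system prompt that is used at inference time below (the sentences and field layout are illustrative only, not the confirmed training setup):

# Hypothetical rendering of a single Tatoeba pair as a chat-style example (assumed format)
example = [
    {"role": "system", "content": "translate English to Italian: "},
    {"role": "user", "content": "The cat is sleeping on the sofa."},
    {"role": "assistant", "content": "Il gatto sta dormendo sul divano."},
]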

Usage

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

model_name = "LeonardPuettmann/PhiMaestra-3-Translation"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16
)

tokenizer = AutoTokenizer.from_pretrained(model_name, add_bos_token=True, trust_remote_code=True)


pipe = pipeline( 
    "text-generation", # Don't use "translation" as this model is technically still decoder only meant for generating text
    model=model, 
    tokenizer=tokenizer, 
) 

generation_args = {
    "max_new_tokens": 1024,    # upper bound on the length of the generated translation
    "return_full_text": False, # return only the translation, not the prompt
    "do_sample": False,        # greedy decoding for deterministic translations
}

print("Type '/Exit' to exit.")
while True:
    user_input = input("You: ")
    if user_input.strip().lower() == "/exit":
        print("Exiting the chatbot. Goodbye!")
        break

    row_json = [
        {"role": "system", "content": "translate English to Italian: "}, # Use system promt "translate Italian to English: " for IT->EN 
        {"role": "user", "content": user_input},
    ]

    output = pipe(row_json, **generation_args)
    print(f"PhiMaestra: {output[0]['generated_text']}")