metadata
license: mit
datasets:
- brwac
- carolina-c4ai/corpus-carolina
language:
- pt
DeBERTinha XSmall (aka "debertinha-ptbr-xsmall")
Introduction
DeBERTinha is a pretrained DeBERTa model for Brazilian Portuguese.
Available models
Model | Arch. | #Params |
---|---|---|
sagui-nlp/debertinha-ptbr-xsmall | DeBERTa-V3-Xsmall | 40M |
Usage
from transformers import AutoTokenizer
from transformers import AutoModelForPreTraining
from transformers import AutoModel
model = AutoModelForPreTraining.from_pretrained('sagui-nlp/debertinha-ptbr-xsmall')
tokenizer = AutoTokenizer.from_pretrained('sagui-nlp/debertinha-ptbr-xsmall')
For embeddings
import torch
model = AutoModel.from_pretrained('sagui-nlp/debertinha-ptbr-xsmall')
input_ids = tokenizer.encode('Tinha uma pedra no meio do caminho.', return_tensors='pt')
with torch.no_grad():
    outs = model(input_ids)
    encoded = outs.last_hidden_state[0, 1:-1]  # Ignore [CLS] and [SEP] special tokens
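The snippet above yields one vector per token. If a single fixed-size vector per sentence is needed, one common option is masked mean pooling over the token vectors. The following is a minimal sketch under that assumption, not a method prescribed by the DeBERTinha authors; the helper name `mean_pool` is hypothetical, and unlike the snippet above it averages over every unmasked position, including the special tokens.

```python
import torch


def mean_pool(last_hidden_state: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token vectors per sequence, ignoring padded positions.

    Hypothetical helper for illustration only.
    last_hidden_state: (batch, seq_len, hidden) float tensor.
    attention_mask:    (batch, seq_len) tensor with 1 for real tokens, 0 for padding.
    Returns a (batch, hidden) tensor.
    """
    # Broadcast the mask over the hidden dimension: (batch, seq_len, 1)
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)
    # Sum only the vectors at unmasked positions
    summed = (last_hidden_state * mask).sum(dim=1)
    # Number of unmasked tokens per sequence; clamp avoids division by zero
    counts = mask.sum(dim=1).clamp(min=1.0)
    return summed / counts
```

To use it with the model, tokenize a batch with `tokenizer(sentences, padding=True, return_tensors='pt')`, run `model(**batch)` inside `torch.no_grad()`, and pass `outs.last_hidden_state` together with `batch['attention_mask']` to `mean_pool`.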
Citation
If you use our work, please cite:
Coming soon