---
license: mit
datasets:
- brwac
- carolina-c4ai/corpus-carolina
language:
- pt
---

# DeBERTinha XSmall (aka "debertinha-ptbr-xsmall")

## Introduction

DeBERTinha is a pretrained DeBERTa model for Brazilian Portuguese.

## Available models

| Model                              | Arch.             | #Params |
| ---------------------------------- | ----------------- | ------- |
| `sagui-nlp/debertinha-ptbr-xsmall` | DeBERTa-V3-Xsmall | 40M     |

## Usage

```python
from transformers import AutoModel, AutoModelForPreTraining, AutoTokenizer

# Load the checkpoint with its pretraining head, plus the matching tokenizer
model = AutoModelForPreTraining.from_pretrained('sagui-nlp/debertinha-ptbr-xsmall')
tokenizer = AutoTokenizer.from_pretrained('sagui-nlp/debertinha-ptbr-xsmall')
```

### For embeddings

```python
import torch

# Reuses `AutoModel` and `tokenizer` from the snippet above
model = AutoModel.from_pretrained('sagui-nlp/debertinha-ptbr-xsmall')
input_ids = tokenizer.encode('Tinha uma pedra no meio do caminho.', return_tensors='pt')

with torch.no_grad():
    outs = model(input_ids)
    encoded = outs.last_hidden_state[0, 1:-1]  # Ignore [CLS] and [SEP] special tokens
```

## Citation

If you use our work, please cite:

`Coming soon`
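
## Sentence-level embeddings (sketch)

The embeddings snippet in the Usage section yields one vector per token. If a single vector per sentence is needed, one common option is masked mean pooling over the token vectors. The sketch below is an illustration only, not a method documented for this model: `mean_pool` and `embed_sentences` are hypothetical helper names, and the checkpoint was not necessarily trained to produce good sentence-level representations, so evaluate the result on your own task.

```python
import torch


def mean_pool(last_hidden_state: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token vectors per sequence, skipping padded positions.

    Note: special tokens ([CLS], [SEP]) have attention_mask == 1, so they
    are included in the average here (an assumption of this sketch).
    """
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)  # (batch, seq, 1)
    summed = (last_hidden_state * mask).sum(dim=1)                   # (batch, hidden)
    counts = mask.sum(dim=1).clamp(min=1.0)                          # avoid division by zero
    return summed / counts


def embed_sentences(sentences, model_name="sagui-nlp/debertinha-ptbr-xsmall"):
    """Hypothetical helper: encode a list of sentences into one vector each."""
    # Imported here so that mean_pool can be used without transformers installed
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()

    # Pad to the longest sentence so the batch forms a rectangular tensor
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        outs = model(**batch)
    return mean_pool(outs.last_hidden_state, batch["attention_mask"])
```

Calling `embed_sentences(['Tinha uma pedra no meio do caminho.'])` should return a tensor with one row per input sentence. Passing the attention mask to the pooling step matters once sentences of different lengths are batched together, since padded positions would otherwise skew the average.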