Commit 81864d3 by israelcamp
1 Parent(s): fe8c52b

Update README.md

Files changed (1)
  1. README.md +44 -1
README.md CHANGED
@@ -5,4 +5,47 @@ datasets:
  - carolina-c4ai/corpus-carolina
  language:
  - pt
- ---
+ ---
+
+
+ # DeBERTinha XSmall (aka "debertinha-ptbr-xsmall")
+
+ ## Introduction
+
+ DeBERTinha is a pretrained DeBERTa model for Brazilian Portuguese.
+
+ ## Available models
+
+ | Model                              | Arch.             | #Params |
+ | ---------------------------------- | ----------------- | ------- |
+ | `sagui-nlp/debertinha-ptbr-xsmall` | DeBERTa-V3-Xsmall | 40M     |
+
+ ## Usage
+
+ ```python
+ from transformers import AutoTokenizer
+ from transformers import AutoModelForPreTraining
+ from transformers import AutoModel
+
+ model = AutoModelForPreTraining.from_pretrained('sagui-nlp/debertinha-ptbr-xsmall')
+ tokenizer = AutoTokenizer.from_pretrained('sagui-nlp/debertinha-ptbr-xsmall')
+ ```
+
+ ### For embeddings
+
+ ```python
+ import torch
+
+ model = AutoModel.from_pretrained('sagui-nlp/debertinha-ptbr-xsmall')
+ input_ids = tokenizer.encode('Tinha uma pedra no meio do caminho.', return_tensors='pt')
+
+ with torch.no_grad():
+     outs = model(input_ids)
+     encoded = outs.last_hidden_state[0, 1:-1]  # Ignore [CLS] and [SEP] special tokens
+ ```
+
+ ## Citation
+
+ If you use our work, please cite:
+
+ `Coming soon`
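
The embeddings snippet in this README yields one vector per token once `[CLS]` and `[SEP]` are sliced off. The README does not say how to combine those into a single sentence vector, so the following is only a sketch of one common option, mean pooling followed by cosine similarity. The helper names `mean_pool` and `cosine_similarity` are hypothetical, and the toy 3-dimensional rows stand in for `outs.last_hidden_state[0]` so the example runs in plain Python without downloading the model.

```python
import math


def mean_pool(token_vectors):
    """Average a non-empty list of equal-length token vectors into one vector."""
    if not token_vectors:
        raise ValueError("need at least one token vector")
    dim = len(token_vectors[0])
    count = len(token_vectors)
    return [sum(vec[i] for vec in token_vectors) / count for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy stand-in (assumption) for outs.last_hidden_state[0]:
# one row per token, with [CLS] first and [SEP] last.
hidden = [
    [9.0, 9.0, 9.0],  # [CLS]
    [1.0, 0.0, 2.0],
    [3.0, 2.0, 0.0],
    [7.0, 7.0, 7.0],  # [SEP]
]

encoded = hidden[1:-1]         # same slice as last_hidden_state[0, 1:-1]
sentence = mean_pool(encoded)  # one vector for the whole sentence
```

With the real tensor from the README snippet, `encoded.mean(dim=0)` computes the same average directly in PyTorch. Whether mean pooling suits a given downstream task is a separate question that the README does not address.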