Upload README.md
README.md
CHANGED
@@ -1,3 +1,46 @@
---
language: en
tags:
- gpt2
- spanish
license: mit
---

# GPT-2 - reviewspanish

## Model description

GPT-2 is a transformer model pretrained on a very large corpus of text in a self-supervised fashion. This
means it was pretrained on raw text only, with no human labelling of any kind (which is why it can use large amounts
of publicly available data), using an automatic process to generate inputs and labels from that text. More precisely,
it was trained to predict the next word in a sentence.

In our case, we fine-tuned [Spanish GPT-2](https://huggingface.co/DeepESP/gpt2-spanish) on
the Spanish Amazon reviews from the Hugging Face dataset [Amazon-reviews-multi](https://huggingface.co/datasets/amazon_reviews_multi).

The resulting text-generation model can create realistic product reviews, which is useful for detecting bot-written
fake reviews.
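
The next-word objective described above can be illustrated with a small sketch of how inputs and labels are derived from raw text without human annotation. It assumes whitespace tokenization purely for readability (GPT-2 itself uses a byte-level BPE tokenizer), and the function name `next_word_pairs` is hypothetical:

```python
def next_word_pairs(text):
    """Build (context, next_word) training pairs from raw text.

    Assumption: words are split on whitespace for readability; GPT-2
    itself operates on byte-level BPE tokens, not whole words.
    """
    tokens = text.split()
    # Each prefix of the sentence is an input; the word that follows it is the label.
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


pairs = next_word_pairs("Me ha gustado su color")
for context, label in pairs:
    print(" ".join(context), "->", label)
```

Every pair comes from the text itself, which is why no manual labelling is needed.
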

### How to use

You can use this model directly with a pipeline for text generation. Since the generation relies on some randomness, we
set a seed for reproducibility:

```python
from transformers import pipeline, set_seed

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub.
generator = pipeline('text-generation',
                     model='Amloii/gpt2-reviewspanish',
                     tokenizer='Amloii/gpt2-reviewspanish')

# Fix the random seed so the sampled reviews are reproducible.
set_seed(42)
generator("Me ha gustado su", max_length=30, num_return_sequences=5)

# Example output:
# [{'generated_text': 'Me ha gustado su tamaño y la flexibilidad de las correas, al ser de plastico las hebillas que lleva para sujetar las cadenas me han quitado el'},
#  {'generated_text': 'Me ha gustado su color y calidad. Lo peor de todo, es que las gafas no se pegan nada. La parte de fuera es finita'},
#  {'generated_text': 'Me ha gustado su rapidez y los ajustes de la correa, lo único que para mí, es poco manejable. Además en el bolso tiene una goma'},
#  {'generated_text': 'Me ha gustado su diseño y las dimensiones, pero el material es demasiado duro. Se nota bastante el uso pero me parece un poco caro para lo que'},
#  {'generated_text': 'Me ha gustado su aspecto aunque para lo que yo lo quería no me ha impresionado mucho. Las hojas tienen un tacto muy agradable que hace que puedas'}]
```
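
Because `max_length` caps each sample at a fixed number of tokens, the generated reviews above stop mid-sentence. A possible post-processing step, sketched below, extracts the `generated_text` strings and trims each one to its last complete sentence. The helper name `trim_to_last_sentence` and the sentence-ending rule are assumptions for illustration, not part of the model or of `transformers`:

```python
import re


def trim_to_last_sentence(text):
    """Cut text after its last '.', '!' or '?'; return it unchanged if none is found."""
    matches = list(re.finditer(r"[.!?]", text))
    return text[: matches[-1].end()] if matches else text


# Two results copied from the example output above (same structure the pipeline returns).
outputs = [
    {'generated_text': 'Me ha gustado su color y calidad. Lo peor de todo, es que las gafas no se pegan nada. La parte de fuera es finita'},
    {'generated_text': 'Me ha gustado su tamaño y la flexibilidad de las correas, al ser de plastico las hebillas que lleva para sujetar las cadenas me han quitado el'},
]

reviews = [trim_to_last_sentence(o['generated_text']) for o in outputs]
print(reviews)
```

Samples with no sentence-ending punctuation are left untouched here; a stricter pipeline might discard them instead.
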