|
--- |
|
license: apache-2.0 |
|
datasets: |
|
- IlyaGusev/rulm |
|
inference: |
|
parameters: |
|
min_length: 20 |
|
max_new_tokens: 250 |
|
top_k: 50 |
|
top_p: 0.9 |
|
early_stopping: true |
|
no_repeat_ngram_size: 2 |
|
use_cache: true |
|
repetition_penalty: 1.5 |
|
length_penalty: 0.8 |
|
num_beams: 2 |
|
language: |
|
- ru |
|
library_name: transformers |
|
pipeline_tag: text-generation |
|
tags: |
|
- finance |
|
- code |
|
--- |
|
|
|
<h1 style="font-size: 42px">WortegaLM 109m</h1>
|
|
|
|
|
|
|
# Model Summary |
|
|
|
> This is a GPT-Neo-like model trained from scratch on a 95 GB corpus of code, Habr, Pikabu, and news (roughly 12B tokens). It can handle primitive tasks; it is not suited for zero-shot or few-shot use, but it is ideal as a base model for student projects.
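
A quick way to sanity-check the advertised size is to count the parameters after loading the checkpoint (a minimal sketch; the ~109M figure comes from the model name above):

```python
from transformers import AutoModelForCausalLM

# Load the checkpoint and count its parameters;
# the total should be on the order of 109M, matching the model name.
model = AutoModelForCausalLM.from_pretrained('AlexWortega/wortegaLM')
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.0f}M parameters")
```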
|
|
|
# Quick Start |
|
|
|
|
|
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

device = 'cuda'

# Left padding so that generation continues from the end of the prompt.
tokenizer = AutoTokenizer.from_pretrained('AlexWortega/wortegaLM', padding_side='left')
model = AutoModelForCausalLM.from_pretrained('AlexWortega/wortegaLM')
model.resize_token_embeddings(len(tokenizer))
model.to(device)


def generate_seqs(q, model, k=2):
    # Beam search with sampling; returns k candidate continuations of the prompt q.
    gen_kwargs = {
        "min_length": 20,
        "max_new_tokens": 100,
        "top_k": 50,
        "top_p": 0.7,
        "do_sample": True,
        "early_stopping": True,
        "no_repeat_ngram_size": 2,
        "eos_token_id": tokenizer.eos_token_id,
        "pad_token_id": tokenizer.eos_token_id,
        "use_cache": True,
        "repetition_penalty": 1.5,
        "length_penalty": 1.2,
        "num_beams": 4,
        "num_return_sequences": k
    }

    t = tokenizer.encode(q, add_special_tokens=False, return_tensors='pt').to(device)
    g = model.generate(t, **gen_kwargs)
    generated_sequences = tokenizer.batch_decode(g, skip_special_tokens=False)

    return generated_sequences
```
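
A minimal usage sketch (the prompt below is made up for illustration; generations will vary between runs):

```python
# Hypothetical prompt; the model is Russian-language, so we prompt in Russian.
results = generate_seqs('Вопрос: что такое машинное обучение? Ответ:', model, k=2)
for r in results:
    print(r)
```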
|
|
|
|