---
inference: false
license: openrail
language:
- it
datasets:
- teelinsan/camoscio
---

# ExtremITA Camoscio 7 billion parameters

This is the base model trained on Italian instructions, a sibling of Alpaca.

It is based on the [teelinsan/camoscio-7b-llama](https://huggingface.co/teelinsan/camoscio-7b-llama) adapters applied to the original LLaMA model, and it adds nothing new to [teelinsan/camoscio-7b-llama](https://huggingface.co/teelinsan/camoscio-7b-llama). Our version merges the adapters into the base model to obtain a more stable checkpoint that can be further fine-tuned, which we did for the [EVALITA 2023](https://www.evalita.it/campaigns/evalita-2023/) challenge.
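Conceptually, merging a LoRA adapter folds the low-rank update back into the frozen base weights (W' = W + (α/r)·BA), so the merged checkpoint computes the same outputs as base-plus-adapter without the extra adapter matmuls at inference time. A minimal numerical sketch with random tensors (all names and sizes here are illustrative, not the actual model's):

```python
import torch

d, r, alpha = 8, 2, 16      # hidden size, LoRA rank, LoRA alpha (illustrative values)
W = torch.randn(d, d)       # a frozen base weight, e.g. an attention projection
A = torch.randn(r, d)       # LoRA down-projection
B = torch.randn(d, r)       # LoRA up-projection

# Fold the adapter update into the base matrix once, with standard LoRA scaling
W_merged = W + (alpha / r) * (B @ A)

x = torch.randn(d)
y_adapter = W @ x + (alpha / r) * (B @ (A @ x))  # base + adapter at runtime
y_merged = W_merged @ x                          # merged weight, single matmul
print(torch.allclose(y_adapter, y_merged, atol=1e-5))
```

The merged model is a plain LLaMA checkpoint, which is why it can be loaded and fine-tuned with the usual tooling, with no adapter machinery required.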

# Usage

Check out the GitHub repository for more insights and code: https://github.com/crux82/ExtremITA

```python
from transformers import LlamaTokenizer, LlamaForCausalLM, GenerationConfig
import torch

tokenizer = LlamaTokenizer.from_pretrained("yahma/llama-7b-hf")
# The LLaMA tokenizer defines no pad token; set one so batched padding works
tokenizer.pad_token_id = 0
model = LlamaForCausalLM.from_pretrained(
    "sag-uniroma2/extremITA-Camoscio-7b",
    load_in_8bit=True,
    device_map="auto",
)

generation_config = GenerationConfig(
    temperature=0.2,
    top_p=0.75,
    top_k=40,
    num_beams=4,
)

prompts = [
    "Riassumi la storia di Pinocchio",
    "Scrivi un programma che stampa i numeri da 1 a 100. Ma per i multipli "
    "di tre stampa 'Fizz' al posto del numero e per i multipli di cinque "
    "stampa 'Buzz'. Per i numeri che sono multipli sia di tre che di cinque "
    "stampa 'FizzBuzz'.",
]

inputs = tokenizer(
    prompts, return_tensors="pt", padding=True, truncation=True
).to(model.device)

with torch.no_grad():
    gen_outputs = model.generate(
        **inputs,
        generation_config=generation_config,
        return_dict_in_generate=True,
        output_scores=True,
    )

for sequence in gen_outputs.sequences:
    print(tokenizer.decode(sequence, skip_special_tokens=True))
```
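Since the model was instruction-tuned on Alpaca-style data, wrapping a raw instruction in the training prompt template generally yields better completions than passing it bare. A sketch of an Alpaca-style Italian template (the exact wording used during training is defined in the camoscio repository; treat this phrasing as an assumption):

```python
def build_prompt(instruction: str, input_text: str = "") -> str:
    """Wrap an instruction in an Alpaca-style Italian template (assumed wording)."""
    if input_text:
        return (
            "Di seguito è riportata un'istruzione che descrive un task, "
            "insieme ad un input che fornisce un contesto più ampio. "
            "Scrivi una risposta che completi adeguatamente la richiesta.\n\n"
            f"### Istruzione:\n{instruction}\n\n"
            f"### Input:\n{input_text}\n\n"
            "### Risposta:\n"
        )
    return (
        "Di seguito è riportata un'istruzione che descrive un task. "
        "Scrivi una risposta che completi adeguatamente la richiesta.\n\n"
        f"### Istruzione:\n{instruction}\n\n"
        "### Risposta:\n"
    )

print(build_prompt("Riassumi la storia di Pinocchio"))
```

The formatted strings would then be tokenized and passed to `model.generate` exactly as in the snippet above.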

# Citation

```
@inproceedings{hromei2023extremita,
  author    = {Claudiu Daniel Hromei and
               Danilo Croce and
               Valerio Basile and
               Roberto Basili},
  title     = {ExtremITA at EVALITA 2023: Multi-Task Sustainable Scaling to Large Language Models at its Extreme},
  booktitle = {Proceedings of the Eighth Evaluation Campaign of Natural Language
               Processing and Speech Tools for Italian. Final Workshop (EVALITA 2023)},
  publisher = {CEUR.org},
  year      = {2023},
  month     = {September},
  address   = {Parma, Italy}
}
```