---
language:
- en
- fa
---
<p align="center">
<picture>
<img alt="Hugging Face Transformers Library" src="https://i.postimg.cc/VN4F7WRC/Untitled-design-modified.png" width="1000" height="450" style="max-width: 100%;">
</picture>
</p>
<h4 align="center">
<p>
<a href="https://huggingface.co/aidal/Persian-Mistral-7B#model-description">Model description</a> |
<a href="https://huggingface.co/aidal/Persian-Mistral-7B#example-output">Example output</a> |
<a href="https://huggingface.co/aidal/Persian-Mistral-7B#benchmark-results">Benchmark results</a> |
<a href="https://huggingface.co/aidal/Persian-Mistral-7B#how-to-use">How to use</a> |
<a href="https://huggingface.co/aidal/Persian-Mistral-7B#training-and-finetuning">Training and finetuning</a>
</p>
</h4>
----
# Model description
>Jamba is a state-of-the-art, hybrid SSM-Transformer LLM. It delivers throughput gains over traditional Transformer-based models, while outperforming or matching the leading models of its size class on most common benchmarks.
>Jamba is the first production-scale Mamba implementation, which opens up interesting research and application opportunities. While this initial experimentation shows encouraging gains, we expect these to be further enhanced with future optimizations and explorations.
>This model card is for the base version of Jamba. It’s a pretrained, mixture-of-experts (MoE) generative text model, with 12B active parameters and a total of 52B parameters across all experts. It supports a 256K context length, and can fit up to 140K tokens on a single 80GB GPU.
----
# Example output:
**Example 1:**
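- Input: "سلام، خوبی؟" (English: "Hi, how are you?")
- Output: "سلام، خوشحالم که با شما صحبت می کنم. چطور می توانم به شما کمک کنم؟" (English: "Hello, I'm glad to be talking with you. How can I help you?")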
----
# Benchmark results
| model | dataset | max_token | prompt | score |
|---------------|-------------------|-----------|--------|---------|
| base-model-7b | ARC-easy-dev | 2 | en-1 | 0.41929 |
| base-model-7b | ARC-easy-dev | 80 | en-2 | 0.39122 |
| fa-model-7b | ARC-easy-dev | 80 | en-1 | 0.37894 |
| base-model-7b | ARC-challenge-dev | 80 | en-2 | 0.37123 |
| fa-model-7b | ARC-challenge-dev | 80 | en-1 | 0.39298 |
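
As a rough reading aid, the rows above can be compared with a few lines of Python. This is only a sketch: the scores are copied from the table, `score_gap` is a hypothetical helper name, and the comparison assumes the two models can be set side by side at `max_token = 80` even though the table lists different prompts for them (`en-1` vs `en-2`), so the gaps should be read as indicative at best.

```python
# Rows copied from the benchmark table: (model, dataset, max_token, prompt, score).
rows = [
    ("base-model-7b", "ARC-easy-dev", 2, "en-1", 0.41929),
    ("base-model-7b", "ARC-easy-dev", 80, "en-2", 0.39122),
    ("fa-model-7b", "ARC-easy-dev", 80, "en-1", 0.37894),
    ("base-model-7b", "ARC-challenge-dev", 80, "en-2", 0.37123),
    ("fa-model-7b", "ARC-challenge-dev", 80, "en-1", 0.39298),
]


def score_gap(rows, dataset, max_token=80):
    """Return fa-model-7b score minus base-model-7b score for one dataset."""
    scores = {
        model: score
        for model, ds, mt, _prompt, score in rows
        if ds == dataset and mt == max_token
    }
    return scores["fa-model-7b"] - scores["base-model-7b"]


# Print the signed gap per dataset (positive means fa-model-7b scored higher).
for dataset in ("ARC-easy-dev", "ARC-challenge-dev"):
    print(f"{dataset}: {score_gap(rows, dataset):+.5f}")
```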
----
# How to use
```python
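from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model weights from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("aidal/Persian-Mistral-7B")
model = AutoModelForCausalLM.from_pretrained("aidal/Persian-Mistral-7B")

# Persian prompt: "What is the capital of Iran?"
input_text = "پایتخت ایران کجاست؟"
inputs = tokenizer(input_text, return_tensors="pt")

# Generate a continuation (50 new tokens is an illustrative cap) and decode it.
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))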
```
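For a causal language model, the sequence returned by `generate` normally starts with the prompt tokens, so the decoded string tends to echo the question before the answer. The sketch below wraps the snippet above in a function and trims that echo. `strip_prompt` and `answer` are hypothetical helper names, `max_new_tokens=64` is an arbitrary illustrative value, and the string-prefix check assumes the tokenizer decodes the prompt back unchanged; when it does not, the full decoded text is returned. Calling `answer("پایتخت ایران کجاست؟")` would download the model weights on first use.

```python
def strip_prompt(decoded: str, prompt: str) -> str:
    """Return only the generated continuation, dropping an echoed prompt."""
    text = decoded.strip()
    if text.startswith(prompt):
        return text[len(prompt):].strip()
    # Fall back to the full text if the prompt was not reproduced verbatim.
    return text


def answer(prompt: str, max_new_tokens: int = 64) -> str:
    """Generate a reply for `prompt` and return it without the echoed prompt."""
    # Imported lazily so strip_prompt can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "aidal/Persian-Mistral-7B"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    decoded = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return strip_prompt(decoded, prompt)
```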
----
# Training and finetuning