---
library_name: transformers
license: mit
language:
- ja
- en
---

# stockmark/stockmark-100b-instruct-v0.1

Stockmark-100b-instruct-v0.1 is an instruction tuned version of [stockmark-100b](https://huggingface.co/stockmark/stockmark-100b), a 100 billion parameter LLM developed by [Stockmark Inc.](https://stockmark.co.jp/) 

## How to use

```python
import torch
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

# Prompt format used during instruction tuning ("指示" = instruction, "応答" = response)
prompt_template = """### 指示:
{instruction}

### 応答:
"""

tokenizer = AutoTokenizer.from_pretrained("stockmark/stockmark-100b-instruct-v0.1")
model = AutoPeftModelForCausalLM.from_pretrained("stockmark/stockmark-100b-instruct-v0.1", device_map="auto", torch_dtype=torch.bfloat16)

instruction = "生成AIとは?"  # "What is generative AI?"
prompt = prompt_template.format(instruction=instruction)
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
with torch.inference_mode():
    tokens = model.generate(
        input_ids,
        max_new_tokens = 256,
        do_sample = True,
        temperature = 0.7,
        top_p = 0.95,
        repetition_penalty = 1.08
    )
    
output = tokenizer.decode(tokens[0], skip_special_tokens=True)
print(output)
```
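
Since `AutoPeftModelForCausalLM` loads this checkpoint as a PEFT adapter on top of the [stockmark-100b](https://huggingface.co/stockmark/stockmark-100b) base model, the adapter can optionally be merged into the base weights so that inference no longer goes through the PEFT wrapper. The following is a minimal sketch, assuming the adapter is a LoRA adapter and reusing `model` and `tokenizer` from the example above; the output directory name is only an example.

```python
# Optional: merge the (assumed LoRA) adapter into the base weights so the
# result can be used as a plain transformers model.
merged_model = model.merge_and_unload()

# Save the merged weights and tokenizer; the path below is just an example.
merged_model.save_pretrained("./stockmark-100b-instruct-merged")
tokenizer.save_pretrained("./stockmark-100b-instruct-merged")
```

The merged directory can then be reloaded with `AutoModelForCausalLM.from_pretrained` without depending on `peft` at inference time.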

## Dataset (fine-tuning)
- Ichikara instruction [[Web Page](https://liat-aip.sakura.ne.jp/wp/llm%E3%81%AE%E3%81%9F%E3%82%81%E3%81%AE%E6%97%A5%E6%9C%AC%E8%AA%9E%E3%82%A4%E3%83%B3%E3%82%B9%E3%83%88%E3%83%A9%E3%82%AF%E3%82%B7%E3%83%A7%E3%83%B3%E3%83%87%E3%83%BC%E3%82%BF%E4%BD%9C%E6%88%90/llm%E3%81%AE%E3%81%9F%E3%82%81%E3%81%AE%E6%97%A5%E6%9C%AC%E8%AA%9E%E3%82%A4%E3%83%B3%E3%82%B9%E3%83%88%E3%83%A9%E3%82%AF%E3%82%B7%E3%83%A7%E3%83%B3%E3%83%87%E3%83%BC%E3%82%BF-%E5%85%AC%E9%96%8B/)], [[Paper](https://www.anlp.jp/proceedings/annual_meeting/2024/pdf_dir/A6-3.pdf)]

## Performance

**Stockmark Business Questions**

Dataset: https://huggingface.co/datasets/stockmark/business-questions

| model | accuracy |
|:---:|:---:|
|stockmark-100b-instruct| 0.90 |
|stockmark-13b-instruct| 0.80 |
|GPT-3.5-turbo[^1]| 0.42 |

[^1]: gpt-3.5-turbo-0613
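
As a rough sketch of how an accuracy number of this kind could be reproduced, the benchmark can be loaded with the `datasets` library and scored with the model from the usage example above. The split name and the `question`/`answer` column names below are assumptions, not the confirmed schema; check the dataset card before running.

```python
# Hypothetical evaluation loop over stockmark/business-questions.
# Reuses model, tokenizer, and prompt_template from the "How to use" example.
from datasets import load_dataset

dataset = load_dataset("stockmark/business-questions", split="train")  # split name is an assumption

correct = 0
for example in dataset:
    prompt = prompt_template.format(instruction=example["question"])  # column names are assumptions
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
    with torch.inference_mode():
        tokens = model.generate(input_ids, max_new_tokens=128)
    prediction = tokenizer.decode(tokens[0][input_ids.shape[1]:], skip_special_tokens=True)
    correct += int(example["answer"] in prediction)

print(f"accuracy: {correct / len(dataset):.2f}")
```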

**Japanese Vicuna QA Benchmark**

We excluded the categories that require calculation or coding, and used the remaining 60 questions for evaluation.

GitHub: https://github.com/ku-nlp/ja-vicuna-qa-benchmark

| model | average score |
|:---:|:---:|
|stockmark-100b-instruct| 5.97 |
|tokyotech-llm/Swallow-70b-instruct-hf| 5.59 |
|GPT-3.5 (text-davinci-003)| 5.08 |

**Inference speed**

| model | time [s] to generate 100 Japanese characters |
|:---:|:---:|
|stockmark-100b-instruct| 1.86 |
| gpt-3.5-turbo | 2.15 |
| gpt-4-turbo | 5.48 |
|tokyotech-llm/Swallow-70b-instruct-hf| 2.22 |

For local LLMs, we measured the inference time using AWS Inferentia2.
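
The timings above were measured on AWS Inferentia2 and are not reproduced by the snippet below; this is only a hedged sketch of how a comparable seconds-per-100-characters figure could be estimated on whatever hardware runs the usage example, reusing `model`, `tokenizer`, and `prompt_template` from above.

```python
# Rough, illustrative timing: generate once with greedy decoding and
# normalize wall-clock time by the number of generated characters.
import time

prompt = prompt_template.format(instruction="生成AIとは?")  # "What is generative AI?"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)

start = time.perf_counter()
with torch.inference_mode():
    tokens = model.generate(input_ids, max_new_tokens=256, do_sample=False)
elapsed = time.perf_counter() - start

generated = tokenizer.decode(tokens[0][input_ids.shape[1]:], skip_special_tokens=True)
print(f"{elapsed / max(len(generated), 1) * 100:.2f} s per 100 characters")
```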

## License
[MIT](https://opensource.org/licenses/MIT)

## Developed by
[Stockmark Inc.](https://stockmark.co.jp/)