
	---
	library_name: transformers
	license: mit
	language:
	- ja
	- en
	---

	# stockmark/stockmark-100b-instruct-v0.1

	Stockmark-100b-instruct-v0.1 is an instruction-tuned version of [stockmark-100b](https://huggingface.co/stockmark/stockmark-100b), a 100-billion-parameter LLM developed by [Stockmark Inc.](https://stockmark.co.jp/).

	## How to use

	```python
	import torch
	from transformers import AutoTokenizer
	from peft import AutoPeftModelForCausalLM

	# Prompt format expected by the model ("指示" = instruction, "応答" = response).
	prompt_template = """### 指示:
	{instruction}

	### 応答:
	"""

	tokenizer = AutoTokenizer.from_pretrained("stockmark/stockmark-100b-instruct-v0.1")
	model = AutoPeftModelForCausalLM.from_pretrained("stockmark/stockmark-100b-instruct-v0.1", device_map="auto", torch_dtype=torch.bfloat16)

	# Example instruction: "What is generative AI?"
	instruction = "生成AIとは？"
	prompt = prompt_template.format(instruction=instruction)
	input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
	# Sample a response; inference_mode disables gradient tracking.
	with torch.inference_mode():
	    tokens = model.generate(
	        input_ids,
	        max_new_tokens=256,
	        do_sample=True,
	        temperature=0.7,
	        top_p=0.95,
	        repetition_penalty=1.08,
	    )

	# The decoded text contains the prompt followed by the generated response.
	output = tokenizer.decode(tokens[0], skip_special_tokens=True)
	print(output)
	```
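
Because `generate` returns the prompt tokens followed by the new tokens, the decoded output above repeats the prompt. The following is a minimal sketch of two helpers for building the prompt and keeping only the response text, assuming the template shown above. `build_prompt` and `extract_response` are hypothetical names introduced for illustration, not part of the model or of `transformers`.

```python
# Template copied from the usage example above.
PROMPT_TEMPLATE = """### 指示:
{instruction}

### 応答:
"""

# Marker that precedes the model's answer in the template.
RESPONSE_MARKER = "### 応答:"


def build_prompt(instruction: str) -> str:
    """Fill an instruction into the template (hypothetical helper)."""
    return PROMPT_TEMPLATE.format(instruction=instruction.strip())


def extract_response(decoded: str) -> str:
    """Return the text after the first response marker (hypothetical helper).

    If the marker is missing, the decoded text is returned unchanged
    apart from surrounding whitespace.
    """
    _, marker, after = decoded.partition(RESPONSE_MARKER)
    return after.strip() if marker else decoded.strip()
```

With these helpers, `extract_response(output)` would drop the echoed prompt from the decoded text. An alternative is to decode only `tokens[0][input_ids.shape[1]:]`, which skips the prompt tokens before decoding.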

	## Dataset (fine-tuning)
	- Ichikara instruction [[Web Page](https://liat-aip.sakura.ne.jp/wp/llm%E3%81%AE%E3%81%9F%E3%82%81%E3%81%AE%E6%97%A5%E6%9C%AC%E8%AA%9E%E3%82%A4%E3%83%B3%E3%82%B9%E3%83%88%E3%83%A9%E3%82%AF%E3%82%B7%E3%83%A7%E3%83%B3%E3%83%87%E3%83%BC%E3%82%BF%E4%BD%9C%E6%88%90/llm%E3%81%AE%E3%81%9F%E3%82%81%E3%81%AE%E6%97%A5%E6%9C%AC%E8%AA%9E%E3%82%A4%E3%83%B3%E3%82%B9%E3%83%88%E3%83%A9%E3%82%AF%E3%82%B7%E3%83%A7%E3%83%B3%E3%83%87%E3%83%BC%E3%82%BF-%E5%85%AC%E9%96%8B/)], [[Paper](https://www.anlp.jp/proceedings/annual_meeting/2024/pdf_dir/A6-3.pdf)]

	## Performance

	### Stockmark Business Questions

	Dataset: https://huggingface.co/datasets/stockmark/business-questions

	| model | accuracy |
	|:---:|:---:|
	| stockmark-100b-instruct | 0.90 |
	| stockmark-13b-instruct | 0.80 |
	| GPT-3.5-turbo[^1] | 0.42 |

	[^1]: Version 0613 (gpt-3.5-turbo-0613).

	### Japanese Vicuna QA Benchmark

	We excluded the categories that require calculation or coding and used the remaining 60 questions for evaluation.

	GitHub: https://github.com/ku-nlp/ja-vicuna-qa-benchmark

	| model | average score |
	|:---:|:---:|
	| stockmark-100b-instruct | 5.97 |
	| tokyotech-llm/Swallow-70b-instruct-hf | 5.59 |
	| GPT-3.5 (text-davinci-003) | 5.08 |

	### Inference speed

	| model | time [s] to generate 100 Japanese characters |
	|:---:|:---:|
	| stockmark-100b-instruct | 1.86 |
	| gpt-3.5-turbo | 2.15 |
	| gpt-4-turbo | 5.48 |
	| tokyotech-llm/Swallow-70b-instruct-hf | 2.22 |

	For local LLMs, we measured the inference time using AWS Inferentia2.
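
For easier comparison, the timings in the table above can be converted into throughput. The following is a small sketch that uses only the numbers reported in the table; the variable names are illustrative.

```python
# Seconds needed to generate 100 Japanese characters, copied from the table above.
seconds_per_100_chars = {
    "stockmark-100b-instruct": 1.86,
    "gpt-3.5-turbo": 2.15,
    "gpt-4-turbo": 5.48,
    "tokyotech-llm/Swallow-70b-instruct-hf": 2.22,
}

# Throughput in characters per second: 100 characters divided by the measured time.
chars_per_second = {
    model: 100 / seconds for model, seconds in seconds_per_100_chars.items()
}

# Time relative to stockmark-100b-instruct (values above 1.0 mean slower).
baseline = seconds_per_100_chars["stockmark-100b-instruct"]
relative_time = {
    model: seconds / baseline for model, seconds in seconds_per_100_chars.items()
}

# Print the models from fastest to slowest.
for model in sorted(seconds_per_100_chars, key=seconds_per_100_chars.get):
    print(
        f"{model}: {chars_per_second[model]:.1f} chars/s, "
        f"{relative_time[model]:.2f}x the time of stockmark-100b-instruct"
    )
```

These figures only restate the table. They do not account for the hardware difference between the locally hosted models and the API-served models.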

	## License
	[MIT](https://opensource.org/licenses/MIT)

	## Developed by
	[Stockmark Inc.](https://stockmark.co.jp/)