|
--- |
|
license: cc-by-nc-4.0 |
|
library_name: transformers |
|
tags: |
|
- mergekit |
|
- merge |
|
|
|
--- |
|
![OrcaHermes](https://huggingface.co/cookinai/OrcaHermes-Mistral-70B-miqu/resolve/main/converted_image.png)
|
|
|
# OrcaHermes-Mistral-70B |
|
|
|
This model was created by SLERP-merging two Miqu-based models, each trained on a high-performing dataset.
|
|
|
|
|
Just an experiment; I have not seen many Miqu SLERP merges yet.
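For intuition, here is a minimal NumPy sketch of what a SLERP merge does to each pair of weight tensors (an illustration only, not mergekit's actual implementation); `t` is the interpolation factor that the `parameters.t` schedule in the config below varies across layers:

```python
import numpy as np

def slerp(t: float, v0: np.ndarray, v1: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation between two weight tensors.

    slerp(t; v0, v1) = sin((1-t)*omega)/sin(omega) * v0
                     + sin(t*omega)/sin(omega)     * v1,
    where omega is the angle between the flattened tensors.
    """
    v0_flat, v1_flat = v0.ravel(), v1.ravel()
    # Normalized copies are used only to measure the angle;
    # the originals keep their magnitudes.
    u0 = v0_flat / (np.linalg.norm(v0_flat) + eps)
    u1 = v1_flat / (np.linalg.norm(v1_flat) + eps)
    dot = np.clip(np.dot(u0, u1), -1.0, 1.0)
    if 1.0 - abs(dot) < eps:
        # Nearly (anti)parallel tensors: fall back to plain linear
        # interpolation, since the SLERP formula divides by sin(omega) ~ 0.
        return (1 - t) * v0 + t * v1
    omega = np.arccos(dot)
    scale0 = np.sin((1 - t) * omega) / np.sin(omega)
    scale1 = np.sin(t * omega) / np.sin(omega)
    return scale0 * v0 + scale1 * v1
```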
|
|
|
### Models Merged |
|
|
|
The following models were included in the merge: |
|
|
|
[alicecomfy/miqu-openhermes-full](https://huggingface.co/alicecomfy/miqu-openhermes-full) |
|
- Base Miqu trained on [OpenHermes 2.5](https://huggingface.co/datasets/teknium/OpenHermes-2.5)
|
|
|
[ShinojiResearch/Senku-70B-Full](https://huggingface.co/ShinojiResearch/Senku-70B-Full) |
|
- Base Miqu trained on [SlimOrca](https://huggingface.co/datasets/Open-Orca/SlimOrca)
|
|
|
|
|
### Configuration |
|
|
|
The following YAML configuration was used to produce this model: |
|
|
|
```yaml
slices:
  - sources:
      - model: local/path/to/Senku-70B-Full
        layer_range: [0, 80]
      - model: local/path/to/miqu-openhermes-full
        layer_range: [0, 80]
merge_method: slerp
base_model: local/path/to/Senku-70B-Full
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5 # fallback for rest of tensors
dtype: float16
```
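Configs in this format are normally run through mergekit's command-line tool, e.g. `mergekit-yaml config.yaml ./output-model` (the `local/path/to/...` entries above are placeholders for wherever the source models live). As I read the schedule, `t` blends from the base model (`t = 0`, Senku) toward miqu-openhermes-full (`t = 1`): the self-attention tensors shift toward the OpenHermes side with depth, the MLP tensors shift the opposite way, and all remaining tensors use an even 0.5 blend.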
|
|