---
base_model:
- mlabonne/ChimeraLlama-3-8B-v3
- johnsnowlabs/JSL-MedLlama-3-8B-v2.0
library_name: transformers
tags:
- mergekit
- merge
license: llama3
---
|
# Chimera_MedLlama-3-8B |
|
|
|
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit). |
|
|
|
## Merge Details |
|
### Merge Method |
|
|
|
This model was merged using the SLERP (spherical linear interpolation) merge method, with [mlabonne/ChimeraLlama-3-8B-v3](https://huggingface.co/mlabonne/ChimeraLlama-3-8B-v3) as the base model.
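
SLERP blends each pair of weight tensors along the arc between them rather than along a straight line, which preserves the magnitude of the merged weights better than plain linear averaging. Below is a minimal sketch of the underlying operation in PyTorch, for illustration only; it is not mergekit's exact implementation:

```python
import torch

def slerp(t: float, w1: torch.Tensor, w2: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherically interpolate between two weight tensors with factor t in [0, 1]."""
    v1, v2 = w1.flatten().float(), w2.flatten().float()
    # Normalized copies are used only to measure the angle between the tensors.
    n1, n2 = v1 / (v1.norm() + eps), v2 / (v2.norm() + eps)
    dot = torch.clamp(torch.dot(n1, n2), -1.0, 1.0)
    omega = torch.acos(dot)  # angle between the two weight vectors
    if omega.abs() < eps:
        # Nearly parallel tensors: spherical and linear interpolation coincide.
        return ((1 - t) * w1 + t * w2).to(w1.dtype)
    so = torch.sin(omega)
    # Classic slerp: sin((1-t)*omega)/sin(omega) * w1 + sin(t*omega)/sin(omega) * w2
    merged = (torch.sin((1 - t) * omega) / so) * v1 + (torch.sin(t * omega) / so) * v2
    return merged.reshape(w1.shape).to(w1.dtype)
```

mergekit applies this tensor by tensor, with the interpolation factor `t` taken from the configuration shown below.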
|
|
|
### Models Merged |
|
|
|
The following models were included in the merge: |
|
* [mlabonne/ChimeraLlama-3-8B-v3](https://huggingface.co/mlabonne/ChimeraLlama-3-8B-v3) |
|
* [johnsnowlabs/JSL-MedLlama-3-8B-v2.0](https://huggingface.co/johnsnowlabs/JSL-MedLlama-3-8B-v2.0) |
|
|
|
### Evaluation |
|
|
|
- multimedqa (0 shot)
|
|
|
| Tasks                          |Version|Filter|n-shot| Metric |Value | |Stderr|
|--------------------------------|-------|------|-----:|--------|-----:|---|-----:|
| - medmcqa                      |Yaml   |none  |     0|acc     |0.6087|± |0.0075|
|                                |       |none  |     0|acc_norm|0.6087|± |0.0075|
| - medqa_4options               |Yaml   |none  |     0|acc     |0.6269|± |0.0136|
|                                |       |none  |     0|acc_norm|0.6269|± |0.0136|
| - anatomy (mmlu)               |      0|none  |     0|acc     |0.6963|± |0.0397|
| - clinical_knowledge (mmlu)    |      0|none  |     0|acc     |0.7585|± |0.0263|
| - college_biology (mmlu)       |      0|none  |     0|acc     |0.7847|± |0.0344|
| - college_medicine (mmlu)      |      0|none  |     0|acc     |0.6936|± |0.0351|
| - medical_genetics (mmlu)      |      0|none  |     0|acc     |0.8200|± |0.0386|
| - professional_medicine (mmlu) |      0|none  |     0|acc     |0.7684|± |0.0256|
|stem                            |N/A    |none  |     0|acc_norm|0.6129|± |0.0066|
|                                |       |none  |     0|acc     |0.6440|± |0.0057|
| - pubmedqa                     |      1|none  |     0|acc     |0.7480|± |0.0194|
|
|
|
|Groups|Version|Filter|n-shot| Metric |Value | |Stderr|
|------|-------|------|-----:|--------|-----:|---|-----:|
|stem  |N/A    |none  |     0|acc_norm|0.6129|± |0.0066|
|      |       |none  |     0|acc     |0.6440|± |0.0057|
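
The tables above follow the output format of EleutherAI's lm-evaluation-harness. A run along the following lines should reproduce them; this is a sketch assuming lm-eval v0.4+, and the model path is a placeholder for wherever the merged checkpoint lives:

```python
from lm_eval import simple_evaluate
from lm_eval.models.huggingface import HFLM

# Placeholder path/repo id for the merged checkpoint.
lm = HFLM(pretrained="./Chimera_MedLlama-3-8B", dtype="bfloat16")

# 0-shot evaluation on the multimedqa task group, as reported above.
results = simple_evaluate(model=lm, tasks=["multimedqa"], num_fewshot=0)
print(results["results"])
```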
|
|
|
### Configuration |
|
|
|
The following YAML configuration was used to produce this model: |
|
|
|
```yaml
slices:
  - sources:
      - model: mlabonne/ChimeraLlama-3-8B-v3
        layer_range: [0, 32]
      - model: johnsnowlabs/JSL-MedLlama-3-8B-v2.0
        layer_range: [0, 32]
merge_method: slerp
base_model: mlabonne/ChimeraLlama-3-8B-v3
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
```
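
The `t` lists set the interpolation factor across the layer stack: by mergekit's convention, `t = 0` keeps the base model's weights and `t = 1` takes the other model's, so self-attention weights lean toward JSL-MedLlama-3-8B-v2.0 in the later layers while MLP weights lean toward it in the earlier layers, with `0.5` as the default for all other tensors.

Once published, the merge can be loaded like any other `transformers` causal LM. A minimal sketch, with a placeholder repo id:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id; point this at wherever the merge is published.
model_id = "Chimera_MedLlama-3-8B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "List the first-line treatments for type 2 diabetes."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```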