Adding Evaluation Results (#1)

0369cb8 verified 21 days ago

8.42 kB

	---
	license: apache-2.0
	library_name: transformers
	tags:
	- merge
	- mergekit
	- lazymergekit
	- bunnycore/QandoraExp-7B
	- trollek/Qwen2.5-7B-CySecButler-v0.1
	base_model:
	- bunnycore/QandoraExp-7B
	- trollek/Qwen2.5-7B-CySecButler-v0.1
	model-index:
	- name: Qwen2.5-7B-Qandora-CySec
	results:
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: IFEval (0-Shot)
	type: HuggingFaceH4/ifeval
	args:
	num_few_shot: 0
	metrics:
	- type: inst_level_strict_acc and prompt_level_strict_acc
	value: 67.73
	name: strict accuracy
	source:
	url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-Qandora-CySec
	name: Open LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: BBH (3-Shot)
	type: BBH
	args:
	num_few_shot: 3
	metrics:
	- type: acc_norm
	value: 36.26
	name: normalized accuracy
	source:
	url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-Qandora-CySec
	name: Open LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: MATH Lvl 5 (4-Shot)
	type: hendrycks/competition_math
	args:
	num_few_shot: 4
	metrics:
	- type: exact_match
	value: 22.89
	name: exact match
	source:
	url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-Qandora-CySec
	name: Open LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: GPQA (0-shot)
	type: Idavidrein/gpqa
	args:
	num_few_shot: 0
	metrics:
	- type: acc_norm
	value: 6.71
	name: acc_norm
	source:
	url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-Qandora-CySec
	name: Open LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: MuSR (0-shot)
	type: TAUR-Lab/MuSR
	args:
	num_few_shot: 0
	metrics:
	- type: acc_norm
	value: 13.41
	name: acc_norm
	source:
	url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-Qandora-CySec
	name: Open LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: MMLU-PRO (5-shot)
	type: TIGER-Lab/MMLU-Pro
	config: main
	split: test
	args:
	num_few_shot: 5
	metrics:
	- type: acc
	value: 38.72
	name: accuracy
	source:
	url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-Qandora-CySec
	name: Open LLM Leaderboard
	---
	# Qwen2.5-7B-Qandora-CySec

	ZeroXClem/Qwen2.5-7B-Qandora-CySec is an advanced model merge combining Q&A capabilities and cybersecurity expertise using the mergekit framework. This model excels in both general question-answering tasks and specialized cybersecurity domains.

	### 🔬 Quants
	ZeroXClem/Qwen2.5-7B-Qandora-CySec quantized in GGUF format can be [found here:](https://huggingface.co/models?other=base_model:quantized:ZeroXClem/Qwen2.5-7B-Qandora-CySec)

	## 🚀 Model Components

	- [bunnycore/QandoraExp-7B](https://huggingface.co/bunnycore/QandoraExp-7B): Powerful Q&A capabilities
	- [trollek/Qwen2.5-7B-CySecButler-v0.1](https://huggingface.co/trollek/Qwen2.5-7B-CySecButler-v0.1): Specialized cybersecurity knowledge

	## 🧩 Merge Configuration

	The models are merged using spherical linear interpolation (SLERP) for optimal blending:

	```yaml
	slices:
	- sources:
	- model: bunnycore/QandoraExp-7B
	layer_range: [0, 28]
	- model: trollek/Qwen2.5-7B-CySecButler-v0.1
	layer_range: [0, 28]
	merge_method: slerp
	base_model: bunnycore/QandoraExp-7B
	parameters:
	t:
	- filter: self_attn
	value: [0, 0.5, 0.3, 0.7, 1]
	- filter: mlp
	value: [1, 0.5, 0.7, 0.3, 0]
	- value: 0.5
	dtype: bfloat16
	```

	### Key Parameters

	- Self-Attention (self_attn): Controls blending across self-attention layers
	- MLP: Adjusts Multi-Layer Perceptron balance
	- Global Weight (t.value): 0.5 for equal contribution from both models
	- Data Type: bfloat16 for efficiency and precision

	## 🎯 Applications

	1. General Q&A Tasks
	2. Cybersecurity Analysis
	3. Hybrid Scenarios (general knowledge + cybersecurity)

	## Ollama Model Card

	The [GGUF quantized versions](https://huggingface.co/models?other=base_model:quantized:ZeroXClem/Qwen2.5-7B-Qandora-CySec) can be used directly in Ollama using the following model card. Simple save as Modelfile in the same directory.

	```Modelfile
	FROM ./qwen2.5-7b-qandora-cysec-q5_0.gguf # Change to your specific quant

	# set the temperature to 1 [higher is more creative, lower is more coherent]
	PARAMETER temperature 0.7
	PARAMETER top_p 0.8
	PARAMETER repeat_penalty 1.05
	PARAMETER top_k 20

	TEMPLATE """{{ if .Messages }}
	{{- if or .System .Tools }}<\|im_start\|>system
	{{ .System }}
	{{- if .Tools }}

	# Tools

	You are provided with function signatures within <tools></tools> XML tags:
	<tools>{{- range .Tools }}
	{"type": "function", "function": {{ .Function }}}{{- end }}
	</tools>

	For each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:
	<tool_call>
	{"name": <function-name>, "arguments": <args-json-object>}
	</tool_call>
	{{- end }}<\|im_end\|>
	{{ end }}
	{{- range $i, $_ := .Messages }}
	{{- $last := eq (len (slice $.Messages $i)) 1 -}}
	{{- if eq .Role "user" }}<\|im_start\|>user
	{{ .Content }}<\|im_end\|>
	{{ else if eq .Role "assistant" }}<\|im_start\|>assistant
	{{ if .Content }}{{ .Content }}
	{{- else if .ToolCalls }}<tool_call>
	{{ range .ToolCalls }}{"name": "{{ .Function.Name }}", "arguments": {{ .Function.Arguments }}}
	{{ end }}</tool_call>
	{{- end }}{{ if not $last }}<\|im_end\|>
	{{ end }}
	{{- else if eq .Role "tool" }}<\|im_start\|>user
	<tool_response>
	{{ .Content }}
	</tool_response><\|im_end\|>
	{{ end }}
	{{- if and (ne .Role "assistant") $last }}<\|im_start\|>assistant
	{{ end }}
	{{- end }}
	{{- else }}
	{{- if .System }}<\|im_start\|>system
	{{ .System }}<\|im_end\|>
	{{ end }}{{ if .Prompt }}<\|im_start\|>user
	{{ .Prompt }}<\|im_end\|>
	{{ end }}<\|im_start\|>assistant
	{{ end }}{{ .Response }}{{ if .Response }}<\|im_end\|>{{ end }}"""

	# set the system message
	SYSTEM """You are Qwen, merged by ZeroXClem. As such, you are a high quality assistant that excels in general question-answering tasks, code generation, and specialized cybersecurity domains."""
	```

	Then create the ollama model by running:

	``` bash
	ollama create qwen2.5-7B-qandora-cysec -f Modelfile
	```
	Once completed, you can run your ollama model by:

	``` bash
	ollama run qwen2.5-7B-qandora-cysec
	```

	## 🛠 Usage

	```python
	from transformers import AutoTokenizer, AutoModelForCausalLM

	model_name = "ZeroXClem/Qwen2.5-7B-Qandora-CySec"
	tokenizer = AutoTokenizer.from_pretrained(model_name)
	model = AutoModelForCausalLM.from_pretrained(model_name)

	input_text = "What are the fundamentals of python programming?"
	input_ids = tokenizer.encode(input_text, return_tensors="pt")
	output = model.generate(input_ids, max_length=100)
	response = tokenizer.decode(output[0], skip_special_tokens=True)
	print(response)
	```

	## 📜 License

	This model inherits the licenses of its base models. Refer to bunnycore/QandoraExp-7B and trollek/Qwen2.5-7B-CySecButler-v0.1 for usage terms.

	## 🙏 Acknowledgements

	- bunnycore (QandoraExp-7B)
	- trollek (Qwen2.5-7B-CySecButler-v0.1)
	- mergekit project

	## 📚 Citation

	If you use this model, please cite this repository and the original base models.

	## 💡 Tags

	merge, mergekit, lazymergekit, bunnycore/QandoraExp-7B, trollek/Qwen2.5-7B-CySecButler-v0.1, cybersecurity, Q&A
	# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
	Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_ZeroXClem__Qwen2.5-7B-Qandora-CySec)

	\| Metric \|Value\|
	\|-------------------\|----:\|
	\|Avg. \|30.95\|
	\|IFEval (0-Shot) \|67.73\|
	\|BBH (3-Shot) \|36.26\|
	\|MATH Lvl 5 (4-Shot)\|22.89\|
	\|GPQA (0-shot) \| 6.71\|
	\|MuSR (0-shot) \|13.41\|
	\|MMLU-PRO (5-shot) \|38.72\|