---
license: apache-2.0
library_name: transformers
tags:
- merge
- mergekit
- lazymergekit
- bunnycore/QandoraExp-7B
- trollek/Qwen2.5-7B-CySecButler-v0.1
base_model:
- bunnycore/QandoraExp-7B
- trollek/Qwen2.5-7B-CySecButler-v0.1
model-index:
- name: Qwen2.5-7B-Qandora-CySec
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 67.73
      name: strict accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-Qandora-CySec
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 36.26
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-Qandora-CySec
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 22.89
      name: exact match
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-Qandora-CySec
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 6.71
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-Qandora-CySec
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 13.41
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-Qandora-CySec
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 38.72
      name: accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-Qandora-CySec
      name: Open LLM Leaderboard
---

# Qwen2.5-7B-Qandora-CySec

ZeroXClem/Qwen2.5-7B-Qandora-CySec is a merge of a question-answering model and a cybersecurity model, built with the mergekit framework. It targets both general question-answering tasks and specialized cybersecurity domains.

### 🔬 Quants

GGUF quantizations of ZeroXClem/Qwen2.5-7B-Qandora-CySec can be [found here](https://huggingface.co/models?other=base_model:quantized:ZeroXClem/Qwen2.5-7B-Qandora-CySec).

## Model Components

- **[bunnycore/QandoraExp-7B](https://huggingface.co/bunnycore/QandoraExp-7B)**: Powerful Q&A capabilities
- **[trollek/Qwen2.5-7B-CySecButler-v0.1](https://huggingface.co/trollek/Qwen2.5-7B-CySecButler-v0.1)**: Specialized cybersecurity knowledge

## 🧩 Merge Configuration

The two models are merged with spherical linear interpolation (SLERP), using bunnycore/QandoraExp-7B as the base model:

```yaml
slices:
  - sources:
      - model: bunnycore/QandoraExp-7B
        layer_range: [0, 28]
      - model: trollek/Qwen2.5-7B-CySecButler-v0.1
        layer_range: [0, 28]
merge_method: slerp
base_model: bunnycore/QandoraExp-7B
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
```

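For intuition, SLERP between two weight tensors can be sketched in a few lines of Python. This is a minimal illustration rather than mergekit's implementation: the function name `slerp`, the use of flat lists as stand-ins for tensors, the assumption of non-zero vectors, and the fallback to linear interpolation for nearly parallel vectors are all choices made for this example.

```python
import math


def slerp(t, v0, v1):
    """Spherically interpolate between vectors v0 (t=0) and v1 (t=1)."""
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    # Cosine of the angle between the two vectors, clamped for acos.
    dot = sum(x * y for x, y in zip(v0, v1)) / (norm0 * norm1)
    theta = math.acos(max(-1.0, min(1.0, dot)))
    if math.sin(theta) < 1e-6:
        # Nearly parallel vectors: fall back to linear interpolation.
        w0, w1 = 1.0 - t, t
    else:
        w0 = math.sin((1.0 - t) * theta) / math.sin(theta)
        w1 = math.sin(t * theta) / math.sin(theta)
    return [w0 * x + w1 * y for x, y in zip(v0, v1)]


qa_weights = [1.0, 0.0]      # stand-in for a base-model tensor
cysec_weights = [0.0, 1.0]   # stand-in for the other model's tensor
halfway = slerp(0.5, qa_weights, cysec_weights)
print(halfway)
```

For these two orthogonal unit vectors, `t = 0.5` gives roughly `[0.707, 0.707]`, which stays on the unit circle, whereas plain averaging would give `[0.5, 0.5]`.
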
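### Key Parameters

- **Self-Attention (`self_attn`)**: Interpolation factor `t` for the self-attention tensors. The five values form a gradient across the 28 layers, moving from the base model (`t = 0`) in the earliest layers to the CySecButler model (`t = 1`) in the last.
- **MLP (`mlp`)**: Interpolation factor `t` for the MLP tensors. It uses the mirrored gradient, so the MLP blocks move from CySecButler toward the base model with depth.
- **Default `t`**: 0.5 for every tensor that matches neither filter, giving both models equal weight.
- **Data Type**: bfloat16, which halves memory use relative to float32.
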
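To see how a five-value gradient could translate into a per-layer `t`, here is a small sketch. It assumes the anchor values are spread evenly over layer depth and interpolated linearly between neighbours; the helper `layer_t` is hypothetical and is not part of mergekit.

```python
def layer_t(anchors, layer_idx, num_layers):
    """Linearly interpolate a gradient list to get t for one layer."""
    if len(anchors) == 1 or num_layers == 1:
        return float(anchors[0])
    # Relative depth of the layer in [0, 1], scaled to the anchor index range.
    pos = layer_idx / (num_layers - 1) * (len(anchors) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(anchors) - 1)
    frac = pos - lo
    return (1.0 - frac) * anchors[lo] + frac * anchors[hi]


self_attn = [0, 0.5, 0.3, 0.7, 1]
mlp = [1, 0.5, 0.7, 0.3, 0]

# Show t for the first, a middle, and the last of the 28 layers.
for i in (0, 13, 27):
    print(i, round(layer_t(self_attn, i, 28), 3), round(layer_t(mlp, i, 28), 3))
```

Because the two lists mirror each other, the self-attention and MLP factors for any given layer sum to 1 under this assumption.
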
## 🎯 Applications

1. General Q&A Tasks
2. Cybersecurity Analysis
3. Hybrid Scenarios (general knowledge + cybersecurity)

## Ollama Model Card

The [GGUF quantized versions](https://huggingface.co/models?other=base_model:quantized:ZeroXClem/Qwen2.5-7B-Qandora-CySec) can be used directly in Ollama with the following Modelfile. Simply save it as `Modelfile` in the same directory as the GGUF file.

```Modelfile
# Change to your specific quant
FROM ./qwen2.5-7b-qandora-cysec-q5_0.gguf

# set the temperature to 0.7 [higher is more creative, lower is more coherent]
PARAMETER temperature 0.7
PARAMETER top_p 0.8
PARAMETER repeat_penalty 1.05
PARAMETER top_k 20

TEMPLATE """{{ if .Messages }}
{{- if or .System .Tools }}<|im_start|>system
{{ .System }}
{{- if .Tools }}

# Tools

You are provided with function signatures within <tools></tools> XML tags:
<tools>{{- range .Tools }}
{"type": "function", "function": {{ .Function }}}{{- end }}
</tools>

For each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:
<tool_call>
{"name": <function-name>, "arguments": <args-json-object>}
</tool_call>
{{- end }}<|im_end|>
{{ end }}
{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1 -}}
{{- if eq .Role "user" }}<|im_start|>user
{{ .Content }}<|im_end|>
{{ else if eq .Role "assistant" }}<|im_start|>assistant
{{ if .Content }}{{ .Content }}
{{- else if .ToolCalls }}<tool_call>
{{ range .ToolCalls }}{"name": "{{ .Function.Name }}", "arguments": {{ .Function.Arguments }}}
{{ end }}</tool_call>
{{- end }}{{ if not $last }}<|im_end|>
{{ end }}
{{- else if eq .Role "tool" }}<|im_start|>user
<tool_response>
{{ .Content }}
</tool_response><|im_end|>
{{ end }}
{{- if and (ne .Role "assistant") $last }}<|im_start|>assistant
{{ end }}
{{- end }}
{{- else }}
{{- if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{ if .Prompt }}<|im_start|>user
{{ .Prompt }}<|im_end|>
{{ end }}<|im_start|>assistant
{{ end }}{{ .Response }}{{ if .Response }}<|im_end|>{{ end }}"""

# set the system message
SYSTEM """You are Qwen, merged by ZeroXClem. As such, you are a high quality assistant that excels in general question-answering tasks, code generation, and specialized cybersecurity domains."""
```

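The template in the Modelfile wraps each turn in `<|im_start|>` and `<|im_end|>` markers. As a rough illustration of the plain-chat path only (tool calls are left out), the same layout can be produced in Python. The helper `to_chatml` and the sample messages are hypothetical and exist only for this sketch.

```python
def to_chatml(messages, add_generation_prompt=True):
    """Render role/content messages in the ChatML layout used by the template."""
    parts = []
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Open an assistant turn so the model writes the reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = to_chatml([
    {"role": "system", "content": "You are Qwen, merged by ZeroXClem."},
    {"role": "user", "content": "What is a SYN flood?"},
])
print(prompt)
```

Printing the prompt this way can help when checking that a custom client sends the same turn structure that Ollama builds from the template.
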
Then create the Ollama model by running:

```bash
ollama create qwen2.5-7B-qandora-cysec -f Modelfile
```

Once that completes, run the model with:

```bash
ollama run qwen2.5-7B-qandora-cysec
```

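Once the model exists in Ollama, it can also be queried from code through Ollama's local REST API. The sketch below assumes a server on the default address `http://localhost:11434` and the model name used in the `ollama create` command; the helper names `build_chat_request` and `ask` and the sample question are made up for this example.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # assumed default local address


def build_chat_request(question, model="qwen2.5-7B-qandora-cysec"):
    """Build a non-streaming /api/chat request for the model created above."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "stream": False,
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(question)) as response:
        return json.loads(response.read())["message"]["content"]


if __name__ == "__main__":
    try:
        print(ask("Explain the principle of least privilege in two sentences."))
    except OSError as error:  # urllib.error.URLError subclasses OSError
        print(f"Could not reach Ollama at {OLLAMA_URL}: {error}")
```

Setting `"stream": False` returns one JSON object instead of a stream of chunks, which keeps the parsing to a single `json.loads` call.
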
## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "ZeroXClem/Qwen2.5-7B-Qandora-CySec"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# torch_dtype="auto" loads the weights in the dtype stored in the checkpoint.
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto")

input_text = "What are the fundamentals of Python programming?"
# Tokenize with an attention mask so generate() does not have to infer one.
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
# max_new_tokens limits only the generated text; max_length also counts the prompt.
output = model.generate(**inputs, max_new_tokens=100)
response = tokenizer.decode(output[0], skip_special_tokens=True)
print(response)
```

## License

This model inherits the licenses of its base models. Refer to bunnycore/QandoraExp-7B and trollek/Qwen2.5-7B-CySecButler-v0.1 for usage terms.

## Acknowledgements

- bunnycore (QandoraExp-7B)
- trollek (Qwen2.5-7B-CySecButler-v0.1)
- mergekit project

## Citation

If you use this model, please cite this repository and the original base models.

## 💡 Tags

merge, mergekit, lazymergekit, bunnycore/QandoraExp-7B, trollek/Qwen2.5-7B-CySecButler-v0.1, cybersecurity, Q&A

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_ZeroXClem__Qwen2.5-7B-Qandora-CySec).

| Metric              | Value |
|---------------------|------:|
| Avg.                | 30.95 |
| IFEval (0-Shot)     | 67.73 |
| BBH (3-Shot)        | 36.26 |
| MATH Lvl 5 (4-Shot) | 22.89 |
| GPQA (0-shot)       |  6.71 |
| MuSR (0-shot)       | 13.41 |
| MMLU-PRO (5-shot)   | 38.72 |

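The "Avg." row is consistent with an unweighted mean of the six benchmark scores. A quick check, assuming no weighting is applied:

```python
# Scores copied from the leaderboard table.
scores = {
    "IFEval (0-Shot)": 67.73,
    "BBH (3-Shot)": 36.26,
    "MATH Lvl 5 (4-Shot)": 22.89,
    "GPQA (0-shot)": 6.71,
    "MuSR (0-shot)": 13.41,
    "MMLU-PRO (5-shot)": 38.72,
}

# Unweighted mean over all benchmarks, rounded to two decimals like the table.
average = sum(scores.values()) / len(scores)
print(round(average, 2))
```
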