jth01/Rombos-LLM-V2.5-Qwen-32b-4.0bpw-exl2


	---
	library_name: transformers
	base_model:
	- Qwen/Qwen2.5-32B-Instruct
	license: apache-2.0
	---

	# Rombos-LLM-V2.5-Qwen-32b 4.0 BPW exl2

	4.0 bpw (bits per weight) EXL2 quant of https://huggingface.co/rombodawg/Rombos-LLM-V2.5-Qwen-32b
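
For a rough sense of what 4.0 bpw means on disk, here is a back-of-the-envelope sketch. The 32e9 parameter count is an assumption taken from the "32b" in the model name, not a measured figure, and `quantized_weight_size_gib` is a hypothetical helper. Real EXL2 files will differ somewhat, since the bpw figure is an average and the estimate ignores file overhead and any memory needed at inference time.

```python
def quantized_weight_size_gib(n_params, bits_per_weight):
    """Estimate the size of the quantized weights in GiB (weights only)."""
    total_bytes = n_params * bits_per_weight / 8  # 8 bits per byte
    return total_bytes / 2**30                    # bytes -> GiB


# Assumed parameter count (from the model name), at 4.0 bits per weight.
approx_gib = quantized_weight_size_gib(32e9, 4.0)
print(f"approx. {approx_gib:.1f} GiB of weights")
```

The same helper with `bits_per_weight=16` gives the approximate size of the unquantized 16-bit weights for comparison.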

	Scores 63.2 on Aider benchmarks!

	---

	# Rombos-LLM-V2.5-Qwen-32b

	![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/hXnQV6WtMKrmIQPdjECSX.jpeg)

	Rombos-LLM-V2.5-Qwen-32b is a continuously finetuned version of Qwen2.5-32B. I noticed recently that the Qwen team has not adopted my continuous finetuning method, despite its great benefits and lack of downsides. So I took it upon myself to merge the instruct model with the base model using the TIES merge method.

	This version of the model shows higher performance than the original instruct and base models.
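
To illustrate the idea behind the TIES merge mentioned above, here is a minimal pure-Python sketch of its three steps (trim, elect sign, disjoint merge) on flat lists of floats. This is not the configuration or tooling used to build this model; `ties_merge`, `density`, and `weight` are hypothetical names chosen for the example, and a real merge operates on full model tensors.

```python
def ties_merge(base, finetuned_models, density=0.5, weight=1.0):
    """Merge fine-tuned weight vectors into `base` with a TIES-style procedure."""
    n = len(base)
    # Task vectors: what each fine-tune changed relative to the base.
    deltas = [[ft[i] - base[i] for i in range(n)] for ft in finetuned_models]

    # 1. Trim: keep only the top `density` fraction of each delta by magnitude.
    keep = max(1, round(density * n))
    trimmed = []
    for delta in deltas:
        order = sorted(range(n), key=lambda i: abs(delta[i]), reverse=True)
        top = set(order[:keep])
        trimmed.append([delta[i] if i in top else 0.0 for i in range(n)])

    merged = []
    for i in range(n):
        column = [t[i] for t in trimmed]
        # 2. Elect sign: the direction with the larger total magnitude wins.
        sign = 1.0 if sum(column) >= 0 else -1.0
        # 3. Disjoint merge: average only the nonzero entries that agree with it.
        agreeing = [v for v in column if v * sign > 0]
        mean = sum(agreeing) / len(agreeing) if agreeing else 0.0
        merged.append(base[i] + weight * mean)
    return merged


# Toy example: two "fine-tunes" of a 4-parameter base model.
base = [0.0, 0.0, 0.0, 0.0]
model_a = [1.0, -2.0, 0.1, 0.0]
model_b = [3.0, 1.0, 0.0, -0.2]
print(ties_merge(base, [model_a, model_b], density=0.5))
```

With a single fine-tuned model and `density=1.0`, nothing is trimmed and no signs conflict, so the sketch simply returns the fine-tuned weights; the trimming and scaling settings are what make a one-model merge differ from the instruct model itself.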

	Quants: (Coming soon)

	GGUF: https://huggingface.co/bartowski/Replete-LLM-V2.5-Qwen-32b-GGUF

	EXL2:

	Benchmarks: (Coming soon)