-r128-LoRA
This is a LoRA adapter extracted from a language model using mergekit.
LoRA Details
This LoRA adapter was extracted from THUDM/LongReward-llama3.1-8b-DPO and uses unsloth/Meta-Llama-3.1-8B as a base.
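One plausible way to apply the adapter is with the PEFT library on top of the base model named above. This is a minimal sketch, not a tested recipe from the adapter's author: it assumes `torch`, `transformers` and `peft` are installed, that the adapter is published as `kromcomp/L3.1-LongReward-r128-LoRA`, and that the base tokenizer's vocabulary matches what the adapter expects. The helper name `load_adapter` is hypothetical.

```python
BASE_ID = "unsloth/Meta-Llama-3.1-8B"
ADAPTER_ID = "kromcomp/L3.1-LongReward-r128-LoRA"  # assumed repository id


def load_adapter(base_id=BASE_ID, adapter_id=ADAPTER_ID):
    """Load the base model and attach the LoRA adapter (hypothetical helper)."""
    # Heavy dependencies are imported lazily, so defining this function
    # does not require them to be installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
    # Wrap the base model with the adapter weights.
    model = PeftModel.from_pretrained(base, adapter_id)
    return model, tokenizer
```

Calling `load_adapter()` downloads the full 8B base weights. If a standalone model is preferred, `model.merge_and_unload()` folds the adapter into the base weights. If loading fails with a size mismatch on the embeddings, the adapter may carry an extended vocabulary, and the base model's embeddings would need resizing first.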
Parameters
The following command was used to extract this LoRA adapter:
mergekit-extract-lora THUDM/LongReward-llama3.1-8b-DPO unsloth/Meta-Llama-3.1-8B OUTPUT_PATH --no-lazy-unpickle --skip-undecomposable --rank=128 --extend-vocab --model_name=-r128-LoRA --verbose
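To give a sense of what a rank-limited extraction does, the sketch below approximates the difference between a fine-tuned weight matrix and its base with a truncated SVD. It illustrates the general technique only and is not mergekit's actual implementation; the function name `extract_lora`, the toy shapes and the random matrices are all assumptions made for the example.

```python
import numpy as np


def extract_lora(w_finetuned, w_base, rank):
    """Return (lora_b, lora_a) with lora_b @ lora_a ~= w_finetuned - w_base."""
    delta = w_finetuned - w_base
    # Truncated SVD gives the best rank-r approximation in the Frobenius norm.
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    rank = min(rank, s.shape[0])
    sqrt_s = np.sqrt(s[:rank])
    lora_b = u[:, :rank] * sqrt_s         # shape (out_features, rank)
    lora_a = sqrt_s[:, None] * vt[:rank]  # shape (rank, in_features)
    return lora_b, lora_a


# Toy example: the "fine-tuned" matrix differs from the base by a rank-4 update.
rng = np.random.default_rng(0)
w_base = rng.standard_normal((64, 48))
true_delta = rng.standard_normal((64, 4)) @ rng.standard_normal((4, 48))
w_ft = w_base + true_delta

lora_b, lora_a = extract_lora(w_ft, w_base, rank=4)
rel_err = np.linalg.norm(lora_b @ lora_a - true_delta) / np.linalg.norm(true_delta)
print(f"relative reconstruction error: {rel_err:.2e}")
```

In this toy case the update is exactly rank 4, so a rank-4 extraction recovers it up to floating-point error. A real fine-tuning delta is generally not exactly low rank, so a rank-128 adapter is an approximation of the fine-tuned model rather than an exact copy.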
Model tree for kromcomp/L3.1-LongReward-r128-LoRA
Base model
THUDM/LongReward-llama3.1-8b-DPO