---
license: llama3
library_name: transformers
tags:
- nsfw
- not-for-all-audiences
- llama-3
- text-generation-inference
- mergekit
- merge
---

Original model: https://huggingface.co/HiroseKoichi/L3-8B-Lunar-Stheno

# L3-8B-Lunar-Stheno
L3-8B-Lunaris-v1 is definitely a significant improvement over L3-8B-Stheno-v3.2 in terms of situational awareness and prose, but it's not without issues: the response length can sometimes be very long, causing it to go on a rant; it tends to not take direct action, saying that it will do something but never actually doing it; and its performance outside of roleplay took a hit.

This merge fixes all of those issues, and I'm genuinely impressed with the results. While I did use a SLERP merge to create this model, there was no blending of the models; all I did was replace L3-8B-Stheno-v3.2's weights with L3-8B-Lunaris-v1's.

# Experimental Quants Included
There's a full set of quants available, but half of them use an experimental quantization method, indicated by an `f16` prefix before the quantization level. These quants keep the embeddings and output tensors at f16 but are otherwise the same as the rest.

The experimental variants should be higher quality than their standard equivalents, but any feedback is welcome.
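For reference, this kind of mixed-precision quant can be produced with llama.cpp's `llama-quantize` tool, which accepts per-tensor type overrides for the token embeddings and output tensor. A minimal sketch (file names are illustrative, not the actual files in this repo):

```shell
# Illustrative: quantize a f16 GGUF to Q4_K_M while keeping the
# token-embedding and output tensors at f16 (the "f16" variants here).
./llama-quantize \
    --token-embedding-type f16 \
    --output-tensor-type f16 \
    model-f16.gguf model-f16-Q4_K_M.gguf Q4_K_M
```

Keeping those two tensors at full half precision costs a little extra file size but tends to preserve output quality, since they are among the most quantization-sensitive parts of the model.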

# Details
- **License**: [llama3](https://llama.meta.com/llama3/license/)
- **Instruct Format**: [llama-3](https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3/)
- **Context Size**: 8K

## Models Used
- [L3-8B-Stheno-v3.2](https://huggingface.co/Sao10K/L3-8B-Stheno-v3.2)
- [L3-8B-Lunaris-v1](https://huggingface.co/Sao10K/L3-8B-Lunaris-v1)

## Merge Config
```yaml
models:
    - model: Sao10K/L3-8B-Stheno-v3.2
    - model: Sao10K/L3-8B-Lunaris-v1
merge_method: slerp
base_model: Sao10K/L3-8B-Stheno-v3.2
parameters:
  t:
    - filter: self_attn
      value: 0
    - filter: mlp
      value: 1
    - value: 0
dtype: bfloat16
```
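With the `t` values above, the SLERP interpolation factor is 1 for the MLP tensors (take them entirely from Lunaris-v1) and 0 everywhere else (keep Stheno-v3.2), which is why no actual blending occurs. Assuming mergekit is installed, a config like this can be run with its standard CLI (file and output paths are illustrative):

```shell
# Illustrative: run the merge config with mergekit's CLI.
pip install mergekit
mergekit-yaml merge-config.yaml ./L3-8B-Lunar-Stheno
```

The resulting model directory can then be converted to GGUF and quantized as usual.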