mpasila
/

Viking-SlimSonnet-v1-7B

Text Generation

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

mpasila commited on Sep 1

Commit

f6590e6

•

1 Parent(s): 45ff5fb

Update README.md

Files changed (1) hide show

README.md +18 -0

README.md CHANGED Viewed

@@ -2,6 +2,12 @@
 base_model: LumiOpen/Viking-7B
 language:
 - en
 license: apache-2.0
 tags:
 - text-generation-inference
@@ -10,7 +16,19 @@ tags:
 - llama
 - trl
 - sft
 ---
 # Uploaded  model

 base_model: LumiOpen/Viking-7B
 language:
 - en
+- fi
+- sv
+- 'no'
+- da
+- is
+- nn
 license: apache-2.0
 tags:
 - text-generation-inference
 - llama
 - trl
 - sft
+datasets:
+- Gryphe/Sonnet3.5-SlimOrcaDedupCleaned
+- mpasila/Sonnet3.5-SlimOrcaDedupCleaned-4k-context
 ---
+This is the fully trained version (with fixed formatting!!).
+Dataset used: [Gryphe/Sonnet3.5-SlimOrcaDedupCleaned](https://huggingface.co/datasets/Gryphe/Sonnet3.5-SlimOrcaDedupCleaned) which was further [filtered](https://huggingface.co/datasets/mpasila/Sonnet3.5-SlimOrcaDedupCleaned-4k-context) to remove prompts/examples that are longer than 4076 tokens (removed about 385 examples).
+Prompt format is: ChatML
+LoRA: [mpasila/Viking-SlimSonnet-v1-LoRA-7B](https://huggingface.co/mpasila/Viking-SlimSonnet-v1-LoRA-7B)
+Trained with regular LoRA (not quantized/QLoRA) and LoRA rank was 128 and Alpha set to 32. Trained for 1 epoch using A40 for about 23 hours.
 # Uploaded  model