|
--- |
|
license: mit |
|
datasets: |
|
- Nebulous/gpt4all_pruned |
|
- sahil2801/CodeAlpaca-20k |
|
- yahma/alpaca-cleaned |
|
--- |
|
|
|
This repo contains a low-rank adapter for **LLaMA-7b**, fit on the following datasets:
|
- `Nebulous/gpt4all_pruned` |
|
- `sahil2801/CodeAlpaca-20k` |
|
- `yahma/alpaca-cleaned` |
|
- datasets that are part of the OpenAssistant project.
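
A minimal sketch of loading this adapter on top of a LLaMA-7b base model with 🤗 PEFT is shown below; the base-model id and the adapter path are placeholders, not values taken from this card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Placeholder ids -- substitute the LLaMA-7b checkpoint you use and this repo's id.
BASE_MODEL = "huggyllama/llama-7b"
ADAPTER = "path/to/this-adapter"

# Load the base model, then attach the low-rank adapter weights on top of it.
base = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL,
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, ADAPTER)
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
```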
|
|
|
|
|
This version of the weights was trained with the following hyperparameters: |
|
|
|
- Epochs: 2 |
|
- Batch size: 128 |
|
- Max length: 2048

- Learning rate: 4e-6

- LoRA _r_: 8

- LoRA alpha: 32

- LoRA target modules: q_proj, k_proj, v_proj, o_proj
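
For reference, the sketch below shows a PEFT `LoraConfig` matching the LoRA hyperparameters listed above; `lora_dropout` and `bias` are assumptions, since they are not stated in this card.

```python
from peft import LoraConfig

# LoRA settings taken from the hyperparameter list above.
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,   # assumption: not documented in this card
    bias="none",         # assumption: not documented in this card
    task_type="CAUSAL_LM",
)
```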
|
|
|
The model was trained with flash attention and gradient checkpointing. |
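
As a rough illustration (not the exact training script used for this adapter), flash attention and gradient checkpointing can be enabled with a recent 🤗 Transformers release roughly as follows; the base-model id is again a placeholder.

```python
import torch
from transformers import AutoModelForCausalLM

# Load the base model with flash attention enabled, then turn on
# gradient checkpointing to trade extra compute for lower memory use.
model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-7b",                    # placeholder base model id
    torch_dtype=torch.float16,
    attn_implementation="flash_attention_2",  # requires the flash-attn package
)
model.gradient_checkpointing_enable()
```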