Update README.md
README.md
CHANGED
@@ -9,7 +9,18 @@ language:
 
 A series of SAEs trained on embeddings from [nomic-embed-text-v1.5](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5)
 
-The SAEs were trained
+The SAEs were trained on the 100BT sample of Fineweb-EDU, see an example of the [10BT sample of Fineweb-Edu](https://huggingface.co/datasets/enjalot/fineweb-edu-sample-10BT-chunked-500).
 
 Run the models or train your own with [Latent SAE](https://github.com/enjalot/latent-sae)
 
+# Training
+
+The models were trained using Modal Labs infrastructure with the command:
+```bash
+modal run train_modal.py --batch-size 512 --grad-acc-steps 4 --k 64 --expansion-factor 32
+```
+
+Error and dead latents charts can be seen here:
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/631bce12bf1351ed2bd6bffe/GKPdI97ogF5tF709oYbbY.png)
+
+The training code is heavily copied from https://github.com/EleutherAI/sae
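The flags in the added training command can be read as the shape of a top-k sparse autoencoder. The sketch below is a generic illustration, not the code from Latent SAE or EleutherAI/sae. It assumes the 768-dimensional default output of nomic-embed-text-v1.5, that `--expansion-factor` multiplies the input width to give the number of latents, that `--k` is the number of latents kept active per embedding, and that `--grad-acc-steps` multiplies the batch size per optimizer step. The function names `derived_sizes` and `topk_sae_forward` are hypothetical.

```python
import random

# Values taken from the training command in the diff.
BATCH_SIZE = 512
GRAD_ACC_STEPS = 4
K = 64
EXPANSION_FACTOR = 32

# Assumption: default embedding width of nomic-embed-text-v1.5.
D_IN = 768


def derived_sizes(d_in, expansion_factor, batch_size, grad_acc_steps):
    """Return (number of latents, embeddings per optimizer step)."""
    return d_in * expansion_factor, batch_size * grad_acc_steps


def topk_sae_forward(x, w_enc, b_enc, w_dec, b_dec, k):
    """Encode x, keep the k largest pre-activations, then decode.

    w_enc and w_dec each hold one row of length d_in per latent.
    Returns the active latents as {index: activation} and the reconstruction.
    """
    centered = [xi - bi for xi, bi in zip(x, b_dec)]
    pre = [sum(w * c for w, c in zip(row, centered)) + b
           for row, b in zip(w_enc, b_enc)]
    top = sorted(range(len(pre)), key=lambda i: pre[i], reverse=True)[:k]
    # ReLU on the selected latents; every other latent stays exactly zero.
    acts = {i: max(pre[i], 0.0) for i in top}
    recon = list(b_dec)
    for i, a in acts.items():
        for j, w in enumerate(w_dec[i]):
            recon[j] += a * w
    return acts, recon


# Sizes implied by the command under the assumptions above.
n_latents, per_step = derived_sizes(D_IN, EXPANSION_FACTOR, BATCH_SIZE, GRAD_ACC_STEPS)
print(n_latents, per_step)  # 24576 2048

# Toy-sized forward pass (8 inputs, 32 latents, k=3) to keep the demo fast.
rng = random.Random(0)
toy_d_in, toy_k = 8, 3
toy_w_enc = [[rng.gauss(0, 1) for _ in range(toy_d_in)]
             for _ in range(toy_d_in * 4)]
toy_w_dec = [row[:] for row in toy_w_enc]  # decoder starts as a copy of the encoder, demo only
toy_x = [rng.gauss(0, 1) for _ in range(toy_d_in)]
acts, recon = topk_sae_forward(
    toy_x, toy_w_enc, [0.0] * len(toy_w_enc), toy_w_dec, [0.0] * toy_d_in, toy_k
)
print(len(acts))  # 3
```

Under these assumptions, each embedding is represented by at most 64 of 24576 latents, and each optimizer step sees 2048 embeddings.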