Cat Embeddings
A set of embedding models trained to study how embedding quality varies with model architecture (width vs. depth) under a fixed size constraint (at most 12M parameters).
- cat-emb-2-128: 2 layers / hidden size 128 / 4.4M params
- cat-emb-4-128: 4 layers / hidden size 128 / 4.8M params
- cat-emb-8-128: 8 layers / hidden size 128 / 5.6M params
- cat-emb-12-128: 12 layers / hidden size 128 / 6.4M params
- cat-emb-2-256: 2 layers / hidden size 256 / 9.7M params
- cat-emb-4-256: 4 layers / hidden size 256 / 11.3M params
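To see why width costs more of the parameter budget than depth, a rough count helps. The sketch below is an estimate only: it assumes a BERT-style encoder with a 30522-token vocabulary, 512 positions, and a feed-forward width of 4x the hidden size. None of these values are stated in this card, and `estimate_params` is a hypothetical helper. Under these assumptions the estimate lands close to the listed sizes, which suggests (but does not confirm) a similar layout.

```python
def estimate_params(layers, hidden, vocab=30522, max_pos=512, ffn_mult=4):
    """Rough parameter count for a BERT-style encoder.

    All defaults are assumptions, not values taken from the model card.
    """
    ffn = ffn_mult * hidden

    # Embeddings: token + position + 2 segment types + one LayerNorm.
    embeddings = vocab * hidden + max_pos * hidden + 2 * hidden + 2 * hidden

    # Self-attention: Q, K, V and output projections, each with a bias.
    attention = 4 * (hidden * hidden + hidden)
    # Feed-forward: up-projection and down-projection, each with a bias.
    feed_forward = (hidden * ffn + ffn) + (ffn * hidden + hidden)
    # Two LayerNorms per layer (scale and shift each).
    layer_norms = 2 * 2 * hidden

    per_layer = attention + feed_forward + layer_norms
    return embeddings + layers * per_layer


for layers, hidden in [(2, 128), (4, 128), (8, 128), (12, 128), (2, 256), (4, 256)]:
    total = estimate_params(layers, hidden)
    print(f"{layers:>2} layers, hidden {hidden}: ~{total / 1e6:.1f}M params")
```

In this estimate the embedding table dominates at these sizes, and each extra layer at hidden size 128 adds only about 0.2M parameters, whereas doubling the hidden size roughly doubles the embedding cost and quadruples the per-layer cost.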
Training
- Stage 1: sequence length 192, batch size 2048, 50k steps, sentence pairs.
- Stage 2: sequence length 512, batch size 64, 5k steps, sentence triplets.
Performance
MRL dim\Task | BIOSSES | SICK-R | STS12 | STS13 | STS14 | STS15 | STS16 | STSB | SummEval |
---|---|---|---|---|---|---|---|---|---|
128 | 0.7107 | 0.7126 | 0.6815 | 0.7343 | 0.7038 | 0.8163 | 0.7495 | 0.7652 | 0.2958 |
64 | 0.713 | 0.7123 | 0.6829 | 0.7348 | 0.7008 | 0.813 | 0.7475 | 0.7609 | 0.2861 |
32 | 0.6714 | 0.7094 | 0.6847 | 0.7345 | 0.6911 | 0.7989 | 0.7385 | 0.7545 | 0.3106 |
16 | 0.6637 | 0.697 | 0.669 | 0.7096 | 0.6665 | 0.7589 | 0.7183 | 0.7307 | 0.3164 |
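Assuming "MRL dim" refers to Matryoshka Representation Learning, each row in the table corresponds to keeping only the first N components of the embedding and comparing sentences with the shortened vectors. A minimal sketch of that truncation step follows. The vectors are toy 8-dimensional stand-ins rather than real model outputs, and the helper names are hypothetical.

```python
import math


def truncate_and_normalize(vec, dim):
    """Keep the first `dim` components and rescale them to unit length."""
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # avoid dividing by zero for an all-zero prefix
    return [x / norm for x in head]


def cosine_similarity(a, b):
    """Cosine similarity of two unit-length vectors (a plain dot product)."""
    return sum(x * y for x, y in zip(a, b))


# Toy vectors standing in for two sentence embeddings.
emb_a = [0.9, 0.1, -0.3, 0.4, 0.05, -0.2, 0.1, 0.0]
emb_b = [0.8, 0.2, -0.1, 0.5, -0.1, 0.3, 0.0, 0.2]

for dim in (8, 4, 2):
    a = truncate_and_normalize(emb_a, dim)
    b = truncate_and_normalize(emb_b, dim)
    print(dim, round(cosine_similarity(a, b), 4))
```

Re-normalizing after truncation matters because the prefix of a unit vector is generally no longer unit length, so a dot product on the raw prefix would not be a cosine similarity.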