jupyterjazz committed
Commit 30996fe
1 Parent(s): 9c9e6d9
fix: normalization after truncation (#83)
- readme: notify about the normalization change (a14d943f3d9adf8da38f9d8a5afe849550e51065)
- update readme (49df0fbe860095d87f6ecf9b313f46e7a0abc035)
README.md CHANGED
@@ -25056,6 +25056,13 @@ While the foundation model supports 100 languages, we've focused our tuning effo
 Hindi, Indonesian, Italian, Japanese, Korean, Latvian, Norwegian, Polish, Portuguese, Romanian,
 Russian, Slovak, Spanish, Swedish, Thai, Turkish, Ukrainian, Urdu,** and **Vietnamese.**
 
+
+> **⚠️ Important Notice:**
+> We fixed a bug in the `encode` function [#60](https://huggingface.co/jinaai/jina-embeddings-v3/discussions/60) where **Matryoshka embedding truncation** occurred *after normalization*, leading to non-normalized truncated embeddings. This issue has been resolved in the latest code revision.
+>
+> If you have encoded data using the previous version and wish to maintain consistency, please use the specific code revision when loading the model: `AutoModel.from_pretrained('jinaai/jina-embeddings-v3', code_revision='da863dd04a4e5dce6814c6625adfba87b83838aa', ...)`
+
+
 ## Usage
 
 **<details><summary>Apply mean pooling when integrating the model.</summary>**
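The notice added in this commit turns on the order of two operations: L2 normalization and Matryoshka truncation. The sketch below illustrates why that order matters, using plain Python lists in place of model outputs. It is not the model's actual `encode` code; the helper names (`l2_normalize`, `l2_norm`, `truncate_after_normalize`, `truncate_then_normalize`) and the toy vector are assumptions made for illustration only.

```python
import math


def l2_norm(vec):
    """Return the Euclidean (L2) length of a vector."""
    return math.sqrt(sum(x * x for x in vec))


def l2_normalize(vec):
    """Scale a vector to unit L2 length (zero vectors are returned unchanged)."""
    norm = l2_norm(vec)
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def truncate_after_normalize(vec, dim):
    """Order described as the bug: normalize the full vector, then truncate.

    The kept prefix is generally shorter than unit length.
    """
    return l2_normalize(vec)[:dim]


def truncate_then_normalize(vec, dim):
    """Order described as the fix: truncate first, then normalize the prefix."""
    return l2_normalize(vec[:dim])


# Hypothetical 3-dimensional "embedding", truncated to 2 dimensions.
embedding = [3.0, 4.0, 12.0]

old_style = truncate_after_normalize(embedding, 2)
new_style = truncate_then_normalize(embedding, 2)

print("normalize then truncate, norm:", l2_norm(old_style))  # roughly 5/13
print("truncate then normalize, norm:", l2_norm(new_style))  # roughly 1.0
```

Both orders yield vectors pointing in the same direction, so cosine similarity between them is unaffected. Dot-product scores are not, which is presumably why the notice recommends pinning `code_revision` when new embeddings must stay comparable with ones encoded before the fix.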