Commit d3c84a2 by Luke Merrick
Parent(s): 95c5e16

    Organize readme

Files changed (1): README.md (+14, -14)
README.md CHANGED
@@ -7600,21 +7600,31 @@ model-index:
 <a href=#news>News</a> |
 <a href=#this-model>This Model</a> |
 <a href=#usage>Usage</a> |
+<a href="#faq">FAQ</a> |
 <a href="#contact">Contact</a> |
-<a href="#faq">FAQ</a>
 <a href="#license">License</a> |
 <a href="#acknowledgement">Acknowledgement</a>
 <p>
 </h4>
 
+
+## News
+
+07/18/2024: Release of `snowflake-arctic-embed-m-v1.5`, capable of producing highly compressible embedding vectors that preserve quality even when squished down to 128 bytes per vector.
+
+05/10/2024: Release of the [technical report on Arctic Embed](https://arxiv.org/abs/2405.05374).
+
+04/16/2024: Original release of the `snowflake-arctic-embed` family of text embedding models.
+
+
 ## This Model
 
 This model is an incremental improvement over the original [snowflake-arctic-embed-m](https://huggingface.co/Snowflake/snowflake-arctic-embed-m/) designed to improve embedding vector compressibility. This model achieves slightly higher overall performance without compression, and it additionally retains most of its retrieval quality even down to 128-byte embedding vectors through a combination of [Matryoshka Representation Learning (MRL)](https://arxiv.org/abs/2205.13147) and uniform scalar quantization.
 
-| Model Name | MTEB Retrieval Score (NDCG @ 10) |
-| ------------------------------------------------------------------ | -------------------------------- |
+| Model Name                                                                                      | MTEB Retrieval Score (NDCG @ 10) |
+|:------------------------------------------------------------------------------------------------|:---------------------------------|
 | [snowflake-arctic-embed-m-v1.5](https://huggingface.co/Snowflake/snowflake-arctic-embed-m-v1.5) | 55.14                            |
-| [snowflake-arctic-embed-m](https://huggingface.co/Snowflake/snowflake-arctic-embed-m/) | 54.91 |
+| [snowflake-arctic-embed-m](https://huggingface.co/Snowflake/snowflake-arctic-embed-m/)          | 54.91                            |
 
 Compared to several other models trained with MRL to produce 256-dimensional embedding vectors, `snowflake-arctic-embed-m-v1.5` retains a higher degree of original model quality and delivers better retrieval quality on the MTEB Retrieval benchmark.
 
@@ -7638,16 +7648,6 @@ Additionally, this model was designed to pair well with a corpus-independent sca
 
 NOTE: A good uniform scalar quantization range to use with this model, and the one used in the eval above, is -0.18 to 0.18. For a detailed walkthrough of int4 quantization with `snowflake-arctic-embed-m-v1.5`, check out our [example notebook](compressed_embeddings_examples/score_arctic_embed_m_v1dot5_with_quantization.ipynb).
 
-
-## News
-
-07/18/2024: Release of `snowflake-arctic-embed-m-v1.5`, capable of producing highly compressible embedding vectors that preserve quality even when squished down to 128 bytes per vector.
-
-05/10/2024: Release of the [technical report on Arctic Embed](https://arxiv.org/abs/2405.05374).
-
-04/16/2024: Original release of the `snowflake-arctic-embed` family of text embedding models.
-
-
 ## Usage
 
 ### Using Sentence Transformers
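The README above describes the compression recipe only at a high level: truncate the MRL-trained embedding to 256 dimensions, then apply uniform scalar int4 quantization over the range -0.18 to 0.18, yielding 128 bytes per vector. The following is a minimal NumPy sketch of that arithmetic, not the authors' implementation (their linked example notebook is the authoritative walkthrough); the function names and the random example vector are invented for illustration.

```python
import numpy as np

def compress_embedding(vec: np.ndarray, dim: int = 256,
                       lo: float = -0.18, hi: float = 0.18) -> np.ndarray:
    """Truncate to `dim` dims, then uniformly quantize to int4 (16 levels)."""
    clipped = np.clip(vec[:dim], lo, hi)
    # Map [lo, hi] uniformly onto the 16 int4 levels 0..15.
    levels = np.round((clipped - lo) / (hi - lo) * 15).astype(np.uint8)
    # Pack two 4-bit levels per byte: 256 dims -> 128 bytes.
    return (levels[0::2] << 4) | levels[1::2]

def decompress_embedding(packed: np.ndarray, lo: float = -0.18,
                         hi: float = 0.18) -> np.ndarray:
    """Unpack the 4-bit levels and map them back into [lo, hi]."""
    levels = np.empty(packed.size * 2, dtype=np.uint8)
    levels[0::2] = packed >> 4
    levels[1::2] = packed & 0x0F
    return levels.astype(np.float32) / 15 * (hi - lo) + lo

# Illustrative 768-dim vector standing in for a real model embedding.
rng = np.random.default_rng(0)
embedding = rng.uniform(-0.15, 0.15, size=768).astype(np.float32)

packed = compress_embedding(embedding)
print(packed.nbytes)  # 128 bytes per vector, as the README claims
restored = decompress_embedding(packed)
```

With 16 levels over a width of 0.36, the quantization step is 0.024, so any in-range value is recovered to within 0.012; in practice one would score with the dequantized (or directly integer-dot-product) vectors rather than the original floats.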