---
base_model: BAAI/bge-large-en-v1.5
datasets: []
language:
- en
library_name: sentence-transformers
license: other
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:104022
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
widget:
- source_sentence: IZEA's market capitalization is $36 million, indicating potential
for raising additional funds if needed.
sentences:
- IZEA's market capitalization is $35.65 million, with a P/E ratio of -5.19, indicating
unprofitability in the last twelve months as of Q3 2023.
- NetApp sells its products and services through a direct sales force and an ecosystem
of partners.
- SAIL's expansion plans have raised concerns among investors, leading to underperformance
in its stock compared to the Nifty 500 index.
- source_sentence: Infinity Mining conducted an eight-hole reverse-circulation (RC)
drilling campaign at its Tambourah South project in Western Australia, targeting
lithium-caesium-tantalum (LCT) pegmatites.
sentences:
- The disclosure must be made to a Regulatory Information Service, as required by
Rule 8 of the Takeover Code.
- Infinity Mining plans to expand its exploration efforts at Tambourah South, including
the use of new technologies and techniques to identify and evaluate concealed
pegmatite targets.
- Russia aims to export over 65 million tons of grain during the season, a record
volume.
- source_sentence: Ukraine expects to receive about $1.5 billion from other international
financial institutions, including the World Bank, in 2024.
sentences:
- Ukraine has an ongoing cooperation with the International Monetary Fund (IMF),
with a 48-month lending program worth $15.6 billion, receiving $3.6 billion this
year and expecting $900 million in December, and $5.4 billion in 2024 subject
to reform targets and economic indicators.
- Vodacom Group could be considered a reasonable income stock despite the dividend
cut, with a solid payout ratio but a less impressive dividend track record.
- CoStar Group employees, members of the Black Excellence Network and Women's Network,
worked alongside Feed More volunteers to facilitate the giveaway.
- source_sentence: WaFd paid out 27% of its profit in dividends last year, indicating
a comfortable payout ratio.
sentences:
- USP35 knockdown in Hep3B cells inhibits tumor growth and reduces the expression
of ABHD17C, p-PI3K, and p-AKT in xenograft HCC models.
- Nasdaq will suspend trading of CohBar, Inc.'s common stock at the opening of business
on November 29, 2023, unless the company requests a hearing before a Nasdaq Hearings
Panel to appeal the determination.
- WaFd's earnings per share have grown at a rate of 9.4% per annum over the past
five years, demonstrating consistent growth.
- source_sentence: Scope Control provides a digital ledger of inspected lines, creating
a credible line history that underscores Custom Truck One Source's commitment
to operational safety.
sentences:
- China has implemented measures to address hidden debt, including extending debt
maturities, selling assets to repay debts, and replacing short-term local government
financial vehicle debts with longer-term, lower-cost refinancing bonds.
- Scope Control utilizes advanced Computer Vision and Deep Learning technologies
to accurately assess line health and categorize it as new, used, or bad based
on safety standards and residual break strength.
- The current management regulations for the national social security fund were
approved in December 2001 and have been implemented for over 20 years. The MOF
stated that parts of the content no longer address the current needs of the Chinese
financial market and the investment trend for the national social security fund,
necessitating a systematic and thorough revision.
model-index:
- name: VANTIGE_NEWS_v3_EDGE_DETECTION
results:
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 1024
type: dim_1024
metrics:
- type: cosine_accuracy@1
value: 0.828
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.978
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.986
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.992
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.828
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.32599999999999996
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.19720000000000001
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.0992
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.828
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.978
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.986
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.992
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.9261911001883877
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.9034555555555557
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.9038902618135377
name: Cosine Map@100
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 768
type: dim_768
metrics:
- type: cosine_accuracy@1
value: 0.83
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.978
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.986
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.99
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.83
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.32599999999999996
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.1972
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.099
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.83
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.978
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.986
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.99
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.9264556449878328
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.9044190476190478
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.9049635033323674
name: Cosine Map@100
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 512
type: dim_512
metrics:
- type: cosine_accuracy@1
value: 0.83
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.978
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.988
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.99
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.83
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.32599999999999996
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.19760000000000003
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.099
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.83
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.978
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.988
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.99
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.9262131769268145
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.9041
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.9046338347982871
name: Cosine Map@100
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 256
type: dim_256
metrics:
- type: cosine_accuracy@1
value: 0.828
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.978
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.984
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.99
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.828
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.32599999999999996
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.1968
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.099
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.828
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.978
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.984
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.99
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.9250967573273415
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.90265
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.9031974089635855
name: Cosine Map@100
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 128
type: dim_128
metrics:
- type: cosine_accuracy@1
value: 0.832
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.978
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.986
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.992
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.832
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.32599999999999996
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.19720000000000001
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.0992
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.832
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.978
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.986
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.992
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.9276434508354098
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.9054333333333333
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.9058527890466532
name: Cosine Map@100
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 64
type: dim_64
metrics:
- type: cosine_accuracy@1
value: 0.822
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.978
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.986
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.99
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.822
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.32599999999999996
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.19720000000000001
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.099
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.822
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.978
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.986
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.99
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.9224148281915946
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.8989999999999999
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.8995256769374417
name: Cosine Map@100
---
# VANTIGE_NEWS_v3_EDGE_DETECTION
This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-large-en-v1.5](https://huggingface.co/BAAI/bge-large-en-v1.5). It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
## Model Details
### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [BAAI/bge-large-en-v1.5](https://huggingface.co/BAAI/bge-large-en-v1.5) <!-- at revision d4aa6901d3a41ba39fb536a557fa166f842b0e09 -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 1024 dimensions
- **Similarity Function:** Cosine Similarity
- **Language:** en
- **License:** other
### Model Sources
- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
### Full Model Architecture
```
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
```
## Usage
### Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("dustyatx/news_v3_graph_edges_embeddings_setence_paragraph")
# Run inference
sentences = [
"Scope Control provides a digital ledger of inspected lines, creating a credible line history that underscores Custom Truck One Source's commitment to operational safety.",
'Scope Control utilizes advanced Computer Vision and Deep Learning technologies to accurately assess line health and categorize it as new, used, or bad based on safety standards and residual break strength.',
'China has implemented measures to address hidden debt, including extending debt maturities, selling assets to repay debts, and replacing short-term local government financial vehicle debts with longer-term, lower-cost refinancing bonds.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 1024)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
```
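Because the model was trained with `MatryoshkaLoss`, its embeddings can be truncated to any of the trained dimensions (1024, 768, 512, 256, 128, or 64) with only a small drop in retrieval quality (see the Evaluation section below). A minimal sketch, assuming the `truncate_dim` argument available in recent Sentence Transformers releases:
```python
from sentence_transformers import SentenceTransformer

# Load the model with truncated (Matryoshka) embeddings;
# 256 is one of the dimensions the model was trained on.
model = SentenceTransformer(
    "dustyatx/news_v3_graph_edges_embeddings_setence_paragraph",
    truncate_dim=256,
)
embeddings = model.encode(["IZEA's market capitalization is $36 million."])
print(embeddings.shape)
# (1, 256)
```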
## Evaluation
### Metrics
#### Information Retrieval
* Dataset: `dim_1024`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
| Metric | Value |
|:--------------------|:-----------|
| cosine_accuracy@1 | 0.828 |
| cosine_accuracy@3 | 0.978 |
| cosine_accuracy@5 | 0.986 |
| cosine_accuracy@10 | 0.992 |
| cosine_precision@1 | 0.828 |
| cosine_precision@3 | 0.326 |
| cosine_precision@5 | 0.1972 |
| cosine_precision@10 | 0.0992 |
| cosine_recall@1 | 0.828 |
| cosine_recall@3 | 0.978 |
| cosine_recall@5 | 0.986 |
| cosine_recall@10 | 0.992 |
| cosine_ndcg@10 | 0.9262 |
| cosine_mrr@10 | 0.9035 |
| **cosine_map@100** | **0.9039** |
#### Information Retrieval
* Dataset: `dim_768`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
| Metric | Value |
|:--------------------|:----------|
| cosine_accuracy@1 | 0.83 |
| cosine_accuracy@3 | 0.978 |
| cosine_accuracy@5 | 0.986 |
| cosine_accuracy@10 | 0.99 |
| cosine_precision@1 | 0.83 |
| cosine_precision@3 | 0.326 |
| cosine_precision@5 | 0.1972 |
| cosine_precision@10 | 0.099 |
| cosine_recall@1 | 0.83 |
| cosine_recall@3 | 0.978 |
| cosine_recall@5 | 0.986 |
| cosine_recall@10 | 0.99 |
| cosine_ndcg@10 | 0.9265 |
| cosine_mrr@10 | 0.9044 |
| **cosine_map@100** | **0.905** |
#### Information Retrieval
* Dataset: `dim_512`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
| Metric | Value |
|:--------------------|:-----------|
| cosine_accuracy@1 | 0.83 |
| cosine_accuracy@3 | 0.978 |
| cosine_accuracy@5 | 0.988 |
| cosine_accuracy@10 | 0.99 |
| cosine_precision@1 | 0.83 |
| cosine_precision@3 | 0.326 |
| cosine_precision@5 | 0.1976 |
| cosine_precision@10 | 0.099 |
| cosine_recall@1 | 0.83 |
| cosine_recall@3 | 0.978 |
| cosine_recall@5 | 0.988 |
| cosine_recall@10 | 0.99 |
| cosine_ndcg@10 | 0.9262 |
| cosine_mrr@10 | 0.9041 |
| **cosine_map@100** | **0.9046** |
#### Information Retrieval
* Dataset: `dim_256`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
| Metric | Value |
|:--------------------|:-----------|
| cosine_accuracy@1 | 0.828 |
| cosine_accuracy@3 | 0.978 |
| cosine_accuracy@5 | 0.984 |
| cosine_accuracy@10 | 0.99 |
| cosine_precision@1 | 0.828 |
| cosine_precision@3 | 0.326 |
| cosine_precision@5 | 0.1968 |
| cosine_precision@10 | 0.099 |
| cosine_recall@1 | 0.828 |
| cosine_recall@3 | 0.978 |
| cosine_recall@5 | 0.984 |
| cosine_recall@10 | 0.99 |
| cosine_ndcg@10 | 0.9251 |
| cosine_mrr@10 | 0.9026 |
| **cosine_map@100** | **0.9032** |
#### Information Retrieval
* Dataset: `dim_128`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
| Metric | Value |
|:--------------------|:-----------|
| cosine_accuracy@1 | 0.832 |
| cosine_accuracy@3 | 0.978 |
| cosine_accuracy@5 | 0.986 |
| cosine_accuracy@10 | 0.992 |
| cosine_precision@1 | 0.832 |
| cosine_precision@3 | 0.326 |
| cosine_precision@5 | 0.1972 |
| cosine_precision@10 | 0.0992 |
| cosine_recall@1 | 0.832 |
| cosine_recall@3 | 0.978 |
| cosine_recall@5 | 0.986 |
| cosine_recall@10 | 0.992 |
| cosine_ndcg@10 | 0.9276 |
| cosine_mrr@10 | 0.9054 |
| **cosine_map@100** | **0.9059** |
#### Information Retrieval
* Dataset: `dim_64`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
| Metric | Value |
|:--------------------|:-----------|
| cosine_accuracy@1 | 0.822 |
| cosine_accuracy@3 | 0.978 |
| cosine_accuracy@5 | 0.986 |
| cosine_accuracy@10 | 0.99 |
| cosine_precision@1 | 0.822 |
| cosine_precision@3 | 0.326 |
| cosine_precision@5 | 0.1972 |
| cosine_precision@10 | 0.099 |
| cosine_recall@1 | 0.822 |
| cosine_recall@3 | 0.978 |
| cosine_recall@5 | 0.986 |
| cosine_recall@10 | 0.99 |
| cosine_ndcg@10 | 0.9224 |
| cosine_mrr@10 | 0.899 |
| **cosine_map@100** | **0.8995** |
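All of the tables above come from the `InformationRetrievalEvaluator`, run once per Matryoshka dimension. A minimal sketch of that setup with hypothetical toy data (the actual evaluation split is not published):
```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

# Hypothetical toy data; the real evaluation queries/corpus are not published.
queries = {"q1": "WaFd paid out 27% of its profit in dividends last year."}
corpus = {
    "d1": "WaFd's earnings per share have grown at 9.4% per annum.",
    "d2": "Russia aims to export over 65 million tons of grain this season.",
}
relevant_docs = {"q1": {"d1"}}

model = SentenceTransformer(
    "dustyatx/news_v3_graph_edges_embeddings_setence_paragraph",
    truncate_dim=1024,  # repeat with 768, 512, 256, 128, 64 to reproduce each table
)
evaluator = InformationRetrievalEvaluator(
    queries=queries,
    corpus=corpus,
    relevant_docs=relevant_docs,
    name="dim_1024",
)
results = evaluator(model)  # dict of cosine accuracy@k, precision@k, recall@k, ndcg, mrr, map
```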
## Training Details
### Training Dataset
#### Unnamed Dataset
* Size: 104,022 training samples
* Columns: <code>anchor</code> and <code>positive</code>
* Approximate statistics based on the first 1000 samples:
| | anchor | positive |
|:--------|:------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|
| type | string | string |
| details | <ul><li>min: 16 tokens</li><li>mean: 36.53 tokens</li><li>max: 102 tokens</li></ul> | <ul><li>min: 13 tokens</li><li>mean: 35.17 tokens</li><li>max: 117 tokens</li></ul> |
* Samples:
| anchor | positive |
|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| <code>The general public, including retail investors, collectively own 11% of FINEOS Corporation Holdings' shares, representing a minority stake in the company.</code> | <code>Private companies, with their 50% ownership stake, have substantial influence over FINEOS Corporation Holdings' management and governance decisions.</code> |
| <code>A study by the Insurance Institute for Highway Safety (IIHS) found that SUVs and vans with hood heights exceeding 40 inches are approximately 45% more likely to cause pedestrian fatalities compared to vehicles with hood heights of 30 inches or less and a sloping profile.</code> | <code>Vehicles with front ends exceeding 35 inches in height, particularly those lacking a sloping profile, are more likely to cause severe head, torso, and hip injuries to pedestrians.</code> |
| <code>SpringWorks Therapeutics has a portfolio of small molecule targeted oncology product candidates and is conducting clinical trials for rare tumor types and genetically defined cancers.</code> | <code>SpringWorks Therapeutics operates in the biopharmaceutical industry, specializing in precision medicine for underserved patient populations.</code> |
* Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
```json
{
"loss": "MultipleNegativesRankingLoss",
"matryoshka_dims": [
1024,
768,
512,
256,
128,
64
],
"matryoshka_weights": [
1,
1,
1,
1,
1,
1
],
"n_dims_per_step": -1
}
```
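The JSON above corresponds to wrapping `MultipleNegativesRankingLoss` inside `MatryoshkaLoss`, so each (anchor, positive) pair is contrasted against in-batch negatives at every output dimension. A minimal sketch of how this pairing is constructed:
```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss

model = SentenceTransformer("BAAI/bge-large-en-v1.5")
inner_loss = MultipleNegativesRankingLoss(model)
loss = MatryoshkaLoss(
    model,
    inner_loss,
    matryoshka_dims=[1024, 768, 512, 256, 128, 64],
    matryoshka_weights=[1, 1, 1, 1, 1, 1],
    n_dims_per_step=-1,  # train on all dimensions at every step
)
```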
### Training Hyperparameters
#### Non-Default Hyperparameters
- `eval_strategy`: steps
- `per_device_train_batch_size`: 30
- `per_device_eval_batch_size`: 20
- `gradient_accumulation_steps`: 8
- `learning_rate`: 3e-05
- `num_train_epochs`: 2
- `lr_scheduler_type`: cosine
- `warmup_ratio`: 0.2
- `bf16`: True
- `tf32`: True
- `dataloader_num_workers`: 30
- `load_best_model_at_end`: True
- `optim`: adamw_torch_fused
- `batch_sampler`: no_duplicates
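As a sketch, these non-default settings map onto the Sentence Transformers v3 training API roughly as follows (`output` is a hypothetical output directory):
```python
from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="output",  # hypothetical
    eval_strategy="steps",
    per_device_train_batch_size=30,
    per_device_eval_batch_size=20,
    gradient_accumulation_steps=8,
    learning_rate=3e-5,
    num_train_epochs=2,
    lr_scheduler_type="cosine",
    warmup_ratio=0.2,
    bf16=True,
    tf32=True,
    dataloader_num_workers=30,
    load_best_model_at_end=True,
    optim="adamw_torch_fused",
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)
```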
#### All Hyperparameters
<details><summary>Click to expand</summary>
- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 30
- `per_device_eval_batch_size`: 20
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 8
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 3e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 2
- `max_steps`: -1
- `lr_scheduler_type`: cosine
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.2
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: True
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: True
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 30
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: True
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch_fused
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `eval_use_gather_object`: False
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional
</details>
### Training Logs
<details><summary>Click to expand</summary>
| Epoch | Step | Training Loss | dim_1024_cosine_map@100 | dim_128_cosine_map@100 | dim_256_cosine_map@100 | dim_512_cosine_map@100 | dim_64_cosine_map@100 | dim_768_cosine_map@100 |
|:------:|:----:|:-------------:|:-----------------------:|:----------------------:|:----------------------:|:----------------------:|:---------------------:|:----------------------:|
| 0.0023 | 1 | 1.8313 | - | - | - | - | - | - |
| 0.0046 | 2 | 1.9678 | - | - | - | - | - | - |
| 0.0069 | 3 | 0.8038 | - | - | - | - | - | - |
| 0.0092 | 4 | 0.7993 | - | - | - | - | - | - |
| 0.0115 | 5 | 0.7926 | - | - | - | - | - | - |
| 0.0138 | 6 | 0.9348 | - | - | - | - | - | - |
| 0.0161 | 7 | 0.8707 | - | - | - | - | - | - |
| 0.0185 | 8 | 0.7293 | - | - | - | - | - | - |
| 0.0208 | 9 | 0.6618 | - | - | - | - | - | - |
| 0.0231 | 10 | 0.846 | - | - | - | - | - | - |
| 0.0254 | 11 | 0.6836 | - | - | - | - | - | - |
| 0.0277 | 12 | 0.7034 | - | - | - | - | - | - |
| 0.0300 | 13 | 0.7987 | - | - | - | - | - | - |
| 0.0323 | 14 | 0.6443 | - | - | - | - | - | - |
| 0.0346 | 15 | 0.5975 | - | - | - | - | - | - |
| 0.0369 | 16 | 0.4471 | - | - | - | - | - | - |
| 0.0392 | 17 | 0.4739 | - | - | - | - | - | - |
| 0.0415 | 18 | 0.4136 | - | - | - | - | - | - |
| 0.0438 | 19 | 0.3865 | - | - | - | - | - | - |
| 0.0461 | 20 | 0.3421 | - | - | - | - | - | - |
| 0.0484 | 21 | 0.5076 | - | - | - | - | - | - |
| 0.0507 | 22 | 0.1878 | - | - | - | - | - | - |
| 0.0531 | 23 | 0.3597 | - | - | - | - | - | - |
| 0.0554 | 24 | 0.23 | - | - | - | - | - | - |
| 0.0577 | 25 | 0.1331 | - | - | - | - | - | - |
| 0.0600 | 26 | 0.1793 | - | - | - | - | - | - |
| 0.0623 | 27 | 0.1309 | - | - | - | - | - | - |
| 0.0646 | 28 | 0.1077 | - | - | - | - | - | - |
| 0.0669 | 29 | 0.1681 | - | - | - | - | - | - |
| 0.0692 | 30 | 0.055 | - | - | - | - | - | - |
| 0.0715 | 31 | 0.1062 | - | - | - | - | - | - |
| 0.0738 | 32 | 0.0672 | - | - | - | - | - | - |
| 0.0761 | 33 | 0.067 | - | - | - | - | - | - |
| 0.0784 | 34 | 0.0953 | - | - | - | - | - | - |
| 0.0807 | 35 | 0.0602 | - | - | - | - | - | - |
| 0.0830 | 36 | 0.1312 | - | - | - | - | - | - |
| 0.0854 | 37 | 0.0356 | - | - | - | - | - | - |
| 0.0877 | 38 | 0.0707 | - | - | - | - | - | - |
| 0.0900 | 39 | 0.1525 | - | - | - | - | - | - |
| 0.0923 | 40 | 0.0362 | - | - | - | - | - | - |
| 0.0946 | 41 | 0.253 | - | - | - | - | - | - |
| 0.0969 | 42 | 0.0572 | - | - | - | - | - | - |
| 0.0992 | 43 | 0.1031 | - | - | - | - | - | - |
| 0.1015 | 44 | 0.1023 | - | - | - | - | - | - |
| 0.1038 | 45 | 0.052 | - | - | - | - | - | - |
| 0.1061 | 46 | 0.0614 | - | - | - | - | - | - |
| 0.1084 | 47 | 0.1256 | - | - | - | - | - | - |
| 0.1107 | 48 | 0.1624 | - | - | - | - | - | - |
| 0.1130 | 49 | 0.0363 | - | - | - | - | - | - |
| 0.1153 | 50 | 0.2001 | 0.8949 | 0.8940 | 0.8947 | 0.8950 | 0.8864 | 0.8972 |
| 0.1176 | 51 | 0.0846 | - | - | - | - | - | - |
| 0.1200 | 52 | 0.0338 | - | - | - | - | - | - |
| 0.1223 | 53 | 0.0648 | - | - | - | - | - | - |
| 0.1246 | 54 | 0.1232 | - | - | - | - | - | - |
| 0.1269 | 55 | 0.0318 | - | - | - | - | - | - |
| 0.1292 | 56 | 0.1148 | - | - | - | - | - | - |
| 0.1315 | 57 | 0.0826 | - | - | - | - | - | - |
| 0.1338 | 58 | 0.034 | - | - | - | - | - | - |
| 0.1361 | 59 | 0.0492 | - | - | - | - | - | - |
| 0.1384 | 60 | 0.0427 | - | - | - | - | - | - |
| 0.1407 | 61 | 0.0709 | - | - | - | - | - | - |
| 0.1430 | 62 | 0.0494 | - | - | - | - | - | - |
| 0.1453 | 63 | 0.0554 | - | - | - | - | - | - |
| 0.1476 | 64 | 0.061 | - | - | - | - | - | - |
| 0.1499 | 65 | 0.1155 | - | - | - | - | - | - |
| 0.1522 | 66 | 0.0419 | - | - | - | - | - | - |
| 0.1546 | 67 | 0.0185 | - | - | - | - | - | - |
| 0.1569 | 68 | 0.0559 | - | - | - | - | - | - |
| 0.1592 | 69 | 0.0219 | - | - | - | - | - | - |
| 0.1615 | 70 | 0.0302 | - | - | - | - | - | - |
| 0.1638 | 71 | 0.0322 | - | - | - | - | - | - |
| 0.1661 | 72 | 0.0604 | - | - | - | - | - | - |
| 0.1684 | 73 | 0.038 | - | - | - | - | - | - |
| 0.1707 | 74 | 0.0971 | - | - | - | - | - | - |
| 0.1730 | 75 | 0.0384 | - | - | - | - | - | - |
| 0.1753 | 76 | 0.0887 | - | - | - | - | - | - |
| 0.1776 | 77 | 0.0495 | - | - | - | - | - | - |
| 0.1799 | 78 | 0.0203 | - | - | - | - | - | - |
| 0.1822 | 79 | 0.0669 | - | - | - | - | - | - |
| 0.1845 | 80 | 0.0319 | - | - | - | - | - | - |
| 0.1869 | 81 | 0.0177 | - | - | - | - | - | - |
| 0.1892 | 82 | 0.0303 | - | - | - | - | - | - |
| 0.1915 | 83 | 0.037 | - | - | - | - | - | - |
| 0.1938 | 84 | 0.0122 | - | - | - | - | - | - |
| 0.1961 | 85 | 0.0377 | - | - | - | - | - | - |
| 0.1984 | 86 | 0.0578 | - | - | - | - | - | - |
| 0.2007 | 87 | 0.0347 | - | - | - | - | - | - |
| 0.2030 | 88 | 0.1288 | - | - | - | - | - | - |
| 0.2053 | 89 | 0.0964 | - | - | - | - | - | - |
| 0.2076 | 90 | 0.0172 | - | - | - | - | - | - |
| 0.2099 | 91 | 0.0726 | - | - | - | - | - | - |
| 0.2122 | 92 | 0.0225 | - | - | - | - | - | - |
| 0.2145 | 93 | 0.1011 | - | - | - | - | - | - |
| 0.2168 | 94 | 0.0248 | - | - | - | - | - | - |
| 0.2191 | 95 | 0.0431 | - | - | - | - | - | - |
| 0.2215 | 96 | 0.0243 | - | - | - | - | - | - |
| 0.2238 | 97 | 0.0221 | - | - | - | - | - | - |
| 0.2261 | 98 | 0.0529 | - | - | - | - | - | - |
| 0.2284 | 99 | 0.0459 | - | - | - | - | - | - |
| 0.2307 | 100 | 0.0869 | 0.9026 | 0.8967 | 0.8950 | 0.9003 | 0.8915 | 0.9009 |
| 0.2330 | 101 | 0.0685 | - | - | - | - | - | - |
| 0.2353 | 102 | 0.0801 | - | - | - | - | - | - |
| 0.2376 | 103 | 0.025 | - | - | - | - | - | - |
| 0.2399 | 104 | 0.0556 | - | - | - | - | - | - |
| 0.2422 | 105 | 0.0146 | - | - | - | - | - | - |
| 0.2445 | 106 | 0.0335 | - | - | - | - | - | - |
| 0.2468 | 107 | 0.0441 | - | - | - | - | - | - |
| 0.2491 | 108 | 0.0187 | - | - | - | - | - | - |
| 0.2514 | 109 | 0.1027 | - | - | - | - | - | - |
| 0.2537 | 110 | 0.0189 | - | - | - | - | - | - |
| 0.2561 | 111 | 0.1262 | - | - | - | - | - | - |
| 0.2584 | 112 | 0.1193 | - | - | - | - | - | - |
| 0.2607 | 113 | 0.0285 | - | - | - | - | - | - |
| 0.2630 | 114 | 0.0226 | - | - | - | - | - | - |
| 0.2653 | 115 | 0.1209 | - | - | - | - | - | - |
| 0.2676 | 116 | 0.0765 | - | - | - | - | - | - |
| 0.2699 | 117 | 0.1405 | - | - | - | - | - | - |
| 0.2722 | 118 | 0.0629 | - | - | - | - | - | - |
| 0.2745 | 119 | 0.0413 | - | - | - | - | - | - |
| 0.2768 | 120 | 0.0572 | - | - | - | - | - | - |
| 0.2791 | 121 | 0.0192 | - | - | - | - | - | - |
| 0.2814 | 122 | 0.0949 | - | - | - | - | - | - |
| 0.2837 | 123 | 0.0398 | - | - | - | - | - | - |
| 0.2860 | 124 | 0.0596 | - | - | - | - | - | - |
| 0.2884 | 125 | 0.0243 | - | - | - | - | - | - |
| 0.2907 | 126 | 0.0636 | - | - | - | - | - | - |
| 0.2930 | 127 | 0.0367 | - | - | - | - | - | - |
| 0.2953 | 128 | 0.0542 | - | - | - | - | - | - |
| 0.2976 | 129 | 0.0149 | - | - | - | - | - | - |
| 0.2999 | 130 | 0.097 | - | - | - | - | - | - |
| 0.3022 | 131 | 0.0213 | - | - | - | - | - | - |
| 0.3045 | 132 | 0.027 | - | - | - | - | - | - |
| 0.3068 | 133 | 0.0577 | - | - | - | - | - | - |
| 0.3091 | 134 | 0.0143 | - | - | - | - | - | - |
| 0.3114 | 135 | 0.0285 | - | - | - | - | - | - |
| 0.3137 | 136 | 0.033 | - | - | - | - | - | - |
| 0.3160 | 137 | 0.0412 | - | - | - | - | - | - |
| 0.3183 | 138 | 0.0125 | - | - | - | - | - | - |
| 0.3206 | 139 | 0.0512 | - | - | - | - | - | - |
| 0.3230 | 140 | 0.0189 | - | - | - | - | - | - |
| 0.3253 | 141 | 0.124 | - | - | - | - | - | - |
| 0.3276 | 142 | 0.0118 | - | - | - | - | - | - |
| 0.3299 | 143 | 0.017 | - | - | - | - | - | - |
| 0.3322 | 144 | 0.025 | - | - | - | - | - | - |
| 0.3345 | 145 | 0.0187 | - | - | - | - | - | - |
| 0.3368 | 146 | 0.0141 | - | - | - | - | - | - |
| 0.3391 | 147 | 0.0325 | - | - | - | - | - | - |
| 0.3414 | 148 | 0.0582 | - | - | - | - | - | - |
| 0.3437 | 149 | 0.0611 | - | - | - | - | - | - |
| 0.3460 | 150 | 0.0261 | 0.9047 | 0.8995 | 0.9003 | 0.9022 | 0.8998 | 0.9032 |
| 0.3483 | 151 | 0.014 | - | - | - | - | - | - |
| 0.3506 | 152 | 0.0077 | - | - | - | - | - | - |
| 0.3529 | 153 | 0.022 | - | - | - | - | - | - |
| 0.3552 | 154 | 0.0328 | - | - | - | - | - | - |
| 0.3576 | 155 | 0.0124 | - | - | - | - | - | - |
| 0.3599 | 156 | 0.0103 | - | - | - | - | - | - |
| 0.3622 | 157 | 0.0607 | - | - | - | - | - | - |
| 0.3645 | 158 | 0.0121 | - | - | - | - | - | - |
| 0.3668 | 159 | 0.0761 | - | - | - | - | - | - |
| 0.3691 | 160 | 0.0981 | - | - | - | - | - | - |
| 0.3714 | 161 | 0.1071 | - | - | - | - | - | - |
| 0.3737 | 162 | 0.1307 | - | - | - | - | - | - |
| 0.3760 | 163 | 0.0524 | - | - | - | - | - | - |
| 0.3783 | 164 | 0.0726 | - | - | - | - | - | - |
| 0.3806 | 165 | 0.0636 | - | - | - | - | - | - |
| 0.3829 | 166 | 0.0428 | - | - | - | - | - | - |
| 0.3852 | 167 | 0.0111 | - | - | - | - | - | - |
| 0.3875 | 168 | 0.0542 | - | - | - | - | - | - |
| 0.3899 | 169 | 0.0193 | - | - | - | - | - | - |
| 0.3922 | 170 | 0.0095 | - | - | - | - | - | - |
| 0.3945 | 171 | 0.0464 | - | - | - | - | - | - |
| 0.3968 | 172 | 0.0167 | - | - | - | - | - | - |
| 0.3991 | 173 | 0.0209 | - | - | - | - | - | - |
| 0.4014 | 174 | 0.0359 | - | - | - | - | - | - |
| 0.4037 | 175 | 0.071 | - | - | - | - | - | - |
| 0.4060 | 176 | 0.0189 | - | - | - | - | - | - |
| 0.4083 | 177 | 0.0448 | - | - | - | - | - | - |
| 0.4106 | 178 | 0.0161 | - | - | - | - | - | - |
| 0.4129 | 179 | 0.0427 | - | - | - | - | - | - |
| 0.4152 | 180 | 0.0229 | - | - | - | - | - | - |
| 0.4175 | 181 | 0.0274 | - | - | - | - | - | - |
| 0.4198 | 182 | 0.0173 | - | - | - | - | - | - |
| 0.4221 | 183 | 0.0123 | - | - | - | - | - | - |
| 0.4245 | 184 | 0.0395 | - | - | - | - | - | - |
| 0.4268 | 185 | 0.015 | - | - | - | - | - | - |
| 0.4291 | 186 | 0.0168 | - | - | - | - | - | - |
| 0.4314 | 187 | 0.0165 | - | - | - | - | - | - |
| 0.4337 | 188 | 0.0412 | - | - | - | - | - | - |
| 0.4360 | 189 | 0.0961 | - | - | - | - | - | - |
| 0.4383 | 190 | 0.0551 | - | - | - | - | - | - |
| 0.4406 | 191 | 0.0685 | - | - | - | - | - | - |
| 0.4429 | 192 | 0.1561 | - | - | - | - | - | - |
| 0.4452 | 193 | 0.0333 | - | - | - | - | - | - |
| 0.4475 | 194 | 0.0567 | - | - | - | - | - | - |
| 0.4498 | 195 | 0.0081 | - | - | - | - | - | - |
| 0.4521 | 196 | 0.0297 | - | - | - | - | - | - |
| 0.4544 | 197 | 0.0131 | - | - | - | - | - | - |
| 0.4567 | 198 | 0.0322 | - | - | - | - | - | - |
| 0.4591 | 199 | 0.0224 | - | - | - | - | - | - |
| 0.4614 | 200 | 0.0068 | 0.8989 | 0.8941 | 0.8983 | 0.8985 | 0.8975 | 0.9002 |
| 0.4637 | 201 | 0.0115 | - | - | - | - | - | - |
| 0.4660 | 202 | 0.0098 | - | - | - | - | - | - |
| 0.4683 | 203 | 0.101 | - | - | - | - | - | - |
| 0.4706 | 204 | 0.0282 | - | - | - | - | - | - |
| 0.4729 | 205 | 0.0721 | - | - | - | - | - | - |
| 0.4752 | 206 | 0.0123 | - | - | - | - | - | - |
| 0.4775 | 207 | 0.1014 | - | - | - | - | - | - |
| 0.4798 | 208 | 0.0257 | - | - | - | - | - | - |
| 0.4821 | 209 | 0.1126 | - | - | - | - | - | - |
| 0.4844 | 210 | 0.0586 | - | - | - | - | - | - |
| 0.4867 | 211 | 0.0307 | - | - | - | - | - | - |
| 0.4890 | 212 | 0.0226 | - | - | - | - | - | - |
| 0.4913 | 213 | 0.0471 | - | - | - | - | - | - |
| 0.4937 | 214 | 0.025 | - | - | - | - | - | - |
| 0.4960 | 215 | 0.0799 | - | - | - | - | - | - |
| 0.4983 | 216 | 0.0173 | - | - | - | - | - | - |
| 0.5006 | 217 | 0.0208 | - | - | - | - | - | - |
| 0.5029 | 218 | 0.0461 | - | - | - | - | - | - |
| 0.5052 | 219 | 0.0592 | - | - | - | - | - | - |
| 0.5075 | 220 | 0.0076 | - | - | - | - | - | - |
| 0.5098 | 221 | 0.0156 | - | - | - | - | - | - |
| 0.5121 | 222 | 0.0149 | - | - | - | - | - | - |
| 0.5144 | 223 | 0.0138 | - | - | - | - | - | - |
| 0.5167 | 224 | 0.0526 | - | - | - | - | - | - |
| 0.5190 | 225 | 0.0689 | - | - | - | - | - | - |
| 0.5213 | 226 | 0.0191 | - | - | - | - | - | - |
| 0.5236 | 227 | 0.0094 | - | - | - | - | - | - |
| 0.5260 | 228 | 0.0125 | - | - | - | - | - | - |
| 0.5283 | 229 | 0.0632 | - | - | - | - | - | - |
| 0.5306 | 230 | 0.0773 | - | - | - | - | - | - |
| 0.5329 | 231 | 0.0147 | - | - | - | - | - | - |
| 0.5352 | 232 | 0.0145 | - | - | - | - | - | - |
| 0.5375 | 233 | 0.0068 | - | - | - | - | - | - |
| 0.5398 | 234 | 0.0673 | - | - | - | - | - | - |
| 0.5421 | 235 | 0.0131 | - | - | - | - | - | - |
| 0.5444 | 236 | 0.0217 | - | - | - | - | - | - |
| 0.5467 | 237 | 0.0126 | - | - | - | - | - | - |
| 0.5490 | 238 | 0.0172 | - | - | - | - | - | - |
| 0.5513 | 239 | 0.0122 | - | - | - | - | - | - |
| 0.5536 | 240 | 0.0175 | - | - | - | - | - | - |
| 0.5559 | 241 | 0.0184 | - | - | - | - | - | - |
| 0.5582 | 242 | 0.0422 | - | - | - | - | - | - |
| 0.5606 | 243 | 0.0106 | - | - | - | - | - | - |
| 0.5629 | 244 | 0.071 | - | - | - | - | - | - |
| 0.5652 | 245 | 0.0089 | - | - | - | - | - | - |
| 0.5675 | 246 | 0.0099 | - | - | - | - | - | - |
| 0.5698 | 247 | 0.0133 | - | - | - | - | - | - |
| 0.5721 | 248 | 0.0627 | - | - | - | - | - | - |
| 0.5744 | 249 | 0.0248 | - | - | - | - | - | - |
| 0.5767 | 250 | 0.0349 | 0.8970 | 0.8968 | 0.8961 | 0.8961 | 0.8952 | 0.8963 |
| 0.5790 | 251 | 0.0145 | - | - | - | - | - | - |
| 0.5813 | 252 | 0.0052 | - | - | - | - | - | - |
| 0.5836 | 253 | 0.0198 | - | - | - | - | - | - |
| 0.5859 | 254 | 0.0065 | - | - | - | - | - | - |
| 0.5882 | 255 | 0.007 | - | - | - | - | - | - |
| 0.5905 | 256 | 0.0072 | - | - | - | - | - | - |
| 0.5928 | 257 | 0.1878 | - | - | - | - | - | - |
| 0.5952 | 258 | 0.0091 | - | - | - | - | - | - |
| 0.5975 | 259 | 0.0421 | - | - | - | - | - | - |
| 0.5998 | 260 | 0.0166 | - | - | - | - | - | - |
| 0.6021 | 261 | 0.0909 | - | - | - | - | - | - |
| 0.6044 | 262 | 0.0107 | - | - | - | - | - | - |
| 0.6067 | 263 | 0.0191 | - | - | - | - | - | - |
| 0.6090 | 264 | 0.0168 | - | - | - | - | - | - |
| 0.6113 | 265 | 0.0814 | - | - | - | - | - | - |
| 0.6136 | 266 | 0.0736 | - | - | - | - | - | - |
| 0.6159 | 267 | 0.0297 | - | - | - | - | - | - |
| 0.6182 | 268 | 0.016 | - | - | - | - | - | - |
| 0.6205 | 269 | 0.0201 | - | - | - | - | - | - |
| 0.6228 | 270 | 0.0111 | - | - | - | - | - | - |
| 0.6251 | 271 | 0.0164 | - | - | - | - | - | - |
| 0.6275 | 272 | 0.0106 | - | - | - | - | - | - |
| 0.6298 | 273 | 0.0287 | - | - | - | - | - | - |
| 0.6321 | 274 | 0.0595 | - | - | - | - | - | - |
| 0.6344 | 275 | 0.0446 | - | - | - | - | - | - |
| 0.6367 | 276 | 0.0203 | - | - | - | - | - | - |
| 0.6390 | 277 | 0.0079 | - | - | - | - | - | - |
| 0.6413 | 278 | 0.0345 | - | - | - | - | - | - |
| 0.6436 | 279 | 0.0461 | - | - | - | - | - | - |
| 0.6459 | 280 | 0.0803 | - | - | - | - | - | - |
| 0.6482 | 281 | 0.0218 | - | - | - | - | - | - |
| 0.6505 | 282 | 0.0288 | - | - | - | - | - | - |
| 0.6528 | 283 | 0.0745 | - | - | - | - | - | - |
| 0.6551 | 284 | 0.0102 | - | - | - | - | - | - |
| 0.6574 | 285 | 0.0626 | - | - | - | - | - | - |
| 0.6597 | 286 | 0.0606 | - | - | - | - | - | - |
| 0.6621 | 287 | 0.0319 | - | - | - | - | - | - |
| 0.6644 | 288 | 0.0303 | - | - | - | - | - | - |
| 0.6667 | 289 | 0.0216 | - | - | - | - | - | - |
| 0.6690 | 290 | 0.0417 | - | - | - | - | - | - |
| 0.6713 | 291 | 0.0061 | - | - | - | - | - | - |
| 0.6736 | 292 | 0.0386 | - | - | - | - | - | - |
| 0.6759 | 293 | 0.0117 | - | - | - | - | - | - |
| 0.6782 | 294 | 0.0283 | - | - | - | - | - | - |
| 0.6805 | 295 | 0.013 | - | - | - | - | - | - |
| 0.6828 | 296 | 0.1237 | - | - | - | - | - | - |
| 0.6851 | 297 | 0.0878 | - | - | - | - | - | - |
| 0.6874 | 298 | 0.0158 | - | - | - | - | - | - |
| 0.6897 | 299 | 0.0562 | - | - | - | - | - | - |
| 0.6920 | 300 | 0.0871 | 0.9022 | 0.9027 | 0.9074 | 0.9055 | 0.8990 | 0.9027 |
| 0.6943 | 301 | 0.0657 | - | - | - | - | - | - |
| 0.6967 | 302 | 0.0239 | - | - | - | - | - | - |
| 0.6990 | 303 | 0.0053 | - | - | - | - | - | - |
| 0.7013 | 304 | 0.0237 | - | - | - | - | - | - |
| 0.7036 | 305 | 0.0182 | - | - | - | - | - | - |
| 0.7059 | 306 | 0.0135 | - | - | - | - | - | - |
| 0.7082 | 307 | 0.0059 | - | - | - | - | - | - |
| 0.7105 | 308 | 0.0061 | - | - | - | - | - | - |
| 0.7128 | 309 | 0.0072 | - | - | - | - | - | - |
| 0.7151 | 310 | 0.0319 | - | - | - | - | - | - |
| 0.7174 | 311 | 0.1183 | - | - | - | - | - | - |
| 0.7197 | 312 | 0.0447 | - | - | - | - | - | - |
| 0.7220 | 313 | 0.0369 | - | - | - | - | - | - |
| 0.7243 | 314 | 0.0462 | - | - | - | - | - | - |
| 0.7266 | 315 | 0.0233 | - | - | - | - | - | - |
| 0.7290 | 316 | 0.0114 | - | - | - | - | - | - |
| 0.7313 | 317 | 0.0179 | - | - | - | - | - | - |
| 0.7336 | 318 | 0.0203 | - | - | - | - | - | - |
| 0.7359 | 319 | 0.0071 | - | - | - | - | - | - |
| 0.7382 | 320 | 0.1297 | - | - | - | - | - | - |
| 0.7405 | 321 | 0.0249 | - | - | - | - | - | - |
| 0.7428 | 322 | 0.063 | - | - | - | - | - | - |
| 0.7451 | 323 | 0.0479 | - | - | - | - | - | - |
| 0.7474 | 324 | 0.1483 | - | - | - | - | - | - |
| 0.7497 | 325 | 0.0058 | - | - | - | - | - | - |
| 0.7520 | 326 | 0.0191 | - | - | - | - | - | - |
| 0.7543 | 327 | 0.0855 | - | - | - | - | - | - |
| 0.7566 | 328 | 0.0156 | - | - | - | - | - | - |
| 0.7589 | 329 | 0.0147 | - | - | - | - | - | - |
| 0.7612 | 330 | 0.0124 | - | - | - | - | - | - |
| 0.7636 | 331 | 0.0242 | - | - | - | - | - | - |
| 0.7659 | 332 | 0.0433 | - | - | - | - | - | - |
| 0.7682 | 333 | 0.0103 | - | - | - | - | - | - |
| 0.7705 | 334 | 0.0833 | - | - | - | - | - | - |
| 0.7728 | 335 | 0.0082 | - | - | - | - | - | - |
| 0.7751 | 336 | 0.0122 | - | - | - | - | - | - |
| 0.7774 | 337 | 0.031 | - | - | - | - | - | - |
| 0.7797 | 338 | 0.0116 | - | - | - | - | - | - |
| 0.7820 | 339 | 0.0947 | - | - | - | - | - | - |
| 0.7843 | 340 | 0.0323 | - | - | - | - | - | - |
| 0.7866 | 341 | 0.0177 | - | - | - | - | - | - |
| 0.7889 | 342 | 0.0487 | - | - | - | - | - | - |
| 0.7912 | 343 | 0.0123 | - | - | - | - | - | - |
| 0.7935 | 344 | 0.0075 | - | - | - | - | - | - |
| 0.7958 | 345 | 0.0061 | - | - | - | - | - | - |
| 0.7982 | 346 | 0.0057 | - | - | - | - | - | - |
| 0.8005 | 347 | 0.1108 | - | - | - | - | - | - |
| 0.8028 | 348 | 0.0104 | - | - | - | - | - | - |
| 0.8051 | 349 | 0.0131 | - | - | - | - | - | - |
| 0.8074 | 350 | 0.0229 | 0.9053 | 0.9041 | 0.9033 | 0.9066 | 0.8965 | 0.9052 |
| 0.8097 | 351 | 0.0478 | - | - | - | - | - | - |
| 0.8120 | 352 | 0.0127 | - | - | - | - | - | - |
| 0.8143 | 353 | 0.1143 | - | - | - | - | - | - |
| 0.8166 | 354 | 0.0365 | - | - | - | - | - | - |
| 0.8189 | 355 | 0.0418 | - | - | - | - | - | - |
| 0.8212 | 356 | 0.0494 | - | - | - | - | - | - |
| 0.8235 | 357 | 0.0082 | - | - | - | - | - | - |
| 0.8258 | 358 | 0.0212 | - | - | - | - | - | - |
| 0.8281 | 359 | 0.0106 | - | - | - | - | - | - |
| 0.8304 | 360 | 0.1009 | - | - | - | - | - | - |
| 0.8328 | 361 | 0.0316 | - | - | - | - | - | - |
| 0.8351 | 362 | 0.0313 | - | - | - | - | - | - |
| 0.8374 | 363 | 0.0108 | - | - | - | - | - | - |
| 0.8397 | 364 | 0.0198 | - | - | - | - | - | - |
| 0.8420 | 365 | 0.0112 | - | - | - | - | - | - |
| 0.8443 | 366 | 0.0197 | - | - | - | - | - | - |
| 0.8466 | 367 | 0.058 | - | - | - | - | - | - |
| 0.8489 | 368 | 0.0187 | - | - | - | - | - | - |
| 0.8512 | 369 | 0.0196 | - | - | - | - | - | - |
| 0.8535 | 370 | 0.0586 | - | - | - | - | - | - |
| 0.8558 | 371 | 0.0099 | - | - | - | - | - | - |
| 0.8581 | 372 | 0.0248 | - | - | - | - | - | - |
| 0.8604 | 373 | 0.0183 | - | - | - | - | - | - |
| 0.8627 | 374 | 0.0268 | - | - | - | - | - | - |
| 0.8651 | 375 | 0.0154 | - | - | - | - | - | - |
| 0.8674 | 376 | 0.0868 | - | - | - | - | - | - |
| 0.8697 | 377 | 0.0264 | - | - | - | - | - | - |
| 0.8720 | 378 | 0.0639 | - | - | - | - | - | - |
| 0.8743 | 379 | 0.1036 | - | - | - | - | - | - |
| 0.8766 | 380 | 0.0334 | - | - | - | - | - | - |
| 0.8789 | 381 | 0.04 | - | - | - | - | - | - |
| 0.8812 | 382 | 0.0095 | - | - | - | - | - | - |
| 0.8835 | 383 | 0.0371 | - | - | - | - | - | - |
| 0.8858 | 384 | 0.0585 | - | - | - | - | - | - |
| 0.8881 | 385 | 0.0353 | - | - | - | - | - | - |
| 0.8904 | 386 | 0.0095 | - | - | - | - | - | - |
| 0.8927 | 387 | 0.0126 | - | - | - | - | - | - |
| 0.8950 | 388 | 0.0384 | - | - | - | - | - | - |
| 0.8973 | 389 | 0.018 | - | - | - | - | - | - |
| 0.8997 | 390 | 0.057 | - | - | - | - | - | - |
| 0.9020 | 391 | 0.0371 | - | - | - | - | - | - |
| 0.9043 | 392 | 0.0475 | - | - | - | - | - | - |
| 0.9066 | 393 | 0.0972 | - | - | - | - | - | - |
| 0.9089 | 394 | 0.0189 | - | - | - | - | - | - |
| 0.9112 | 395 | 0.0993 | - | - | - | - | - | - |
| 0.9135 | 396 | 0.0527 | - | - | - | - | - | - |
| 0.9158 | 397 | 0.0466 | - | - | - | - | - | - |
| 0.9181 | 398 | 0.0383 | - | - | - | - | - | - |
| 0.9204 | 399 | 0.0322 | - | - | - | - | - | - |
| 0.9227 | 400 | 0.0651 | 0.9077 | 0.9074 | 0.9073 | 0.9077 | 0.9023 | 0.9078 |
| 0.9250 | 401 | 0.0055 | - | - | - | - | - | - |
| 0.9273 | 402 | 0.0083 | - | - | - | - | - | - |
| 0.9296 | 403 | 0.0062 | - | - | - | - | - | - |
| 0.9319 | 404 | 0.0085 | - | - | - | - | - | - |
| 0.9343 | 405 | 0.0179 | - | - | - | - | - | - |
| 0.9366 | 406 | 0.0041 | - | - | - | - | - | - |
| 0.9389 | 407 | 0.0978 | - | - | - | - | - | - |
| 0.9412 | 408 | 0.0068 | - | - | - | - | - | - |
| 0.9435 | 409 | 0.0145 | - | - | - | - | - | - |
| 0.9458 | 410 | 0.0098 | - | - | - | - | - | - |
| 0.9481 | 411 | 0.032 | - | - | - | - | - | - |
| 0.9504 | 412 | 0.0232 | - | - | - | - | - | - |
| 0.9527 | 413 | 0.0149 | - | - | - | - | - | - |
| 0.9550 | 414 | 0.0175 | - | - | - | - | - | - |
| 0.9573 | 415 | 0.0099 | - | - | - | - | - | - |
| 0.9596 | 416 | 0.0121 | - | - | - | - | - | - |
| 0.9619 | 417 | 0.108 | - | - | - | - | - | - |
| 0.9642 | 418 | 0.012 | - | - | - | - | - | - |
| 0.9666 | 419 | 0.0102 | - | - | - | - | - | - |
| 0.9689 | 420 | 0.0108 | - | - | - | - | - | - |
| 0.9712 | 421 | 0.2258 | - | - | - | - | - | - |
| 0.9735 | 422 | 0.0037 | - | - | - | - | - | - |
| 0.9758 | 423 | 0.0186 | - | - | - | - | - | - |
| 0.9781 | 424 | 0.0446 | - | - | - | - | - | - |
| 0.9804 | 425 | 0.1558 | - | - | - | - | - | - |
| 0.9827 | 426 | 0.023 | - | - | - | - | - | - |
| 0.9850 | 427 | 0.0075 | - | - | - | - | - | - |
| 0.9873 | 428 | 0.0095 | - | - | - | - | - | - |
| 0.9896 | 429 | 0.0141 | - | - | - | - | - | - |
| 0.9919 | 430 | 0.0617 | - | - | - | - | - | - |
| 0.9942 | 431 | 0.0961 | - | - | - | - | - | - |
| 0.9965 | 432 | 0.0058 | - | - | - | - | - | - |
| 0.9988 | 433 | 0.0399 | - | - | - | - | - | - |
| 1.0012 | 434 | 0.0063 | - | - | - | - | - | - |
| 1.0035 | 435 | 0.0288 | - | - | - | - | - | - |
| 1.0058 | 436 | 0.0041 | - | - | - | - | - | - |
| 1.0081 | 437 | 0.0071 | - | - | - | - | - | - |
| 1.0104 | 438 | 0.0233 | - | - | - | - | - | - |
| 1.0127 | 439 | 0.0135 | - | - | - | - | - | - |
| 1.0150 | 440 | 0.1015 | - | - | - | - | - | - |
| 1.0173 | 441 | 0.0045 | - | - | - | - | - | - |
| 1.0196 | 442 | 0.0088 | - | - | - | - | - | - |
| 1.0219 | 443 | 0.0086 | - | - | - | - | - | - |
| 1.0242 | 444 | 0.0072 | - | - | - | - | - | - |
| 1.0265 | 445 | 0.0147 | - | - | - | - | - | - |
| 1.0288 | 446 | 0.025 | - | - | - | - | - | - |
| 1.0311 | 447 | 0.0067 | - | - | - | - | - | - |
| 1.0334 | 448 | 0.0066 | - | - | - | - | - | - |
| 1.0358 | 449 | 0.0062 | - | - | - | - | - | - |
| 1.0381 | 450 | 0.0068 | 0.9091 | 0.9083 | 0.9045 | 0.9038 | 0.8983 | 0.9072 |
| 1.0404 | 451 | 0.0126 | - | - | - | - | - | - |
| 1.0427 | 452 | 0.0082 | - | - | - | - | - | - |
| 1.0450 | 453 | 0.0034 | - | - | - | - | - | - |
| 1.0473 | 454 | 0.04 | - | - | - | - | - | - |
| 1.0496 | 455 | 0.0235 | - | - | - | - | - | - |
| 1.0519 | 456 | 0.24 | - | - | - | - | - | - |
| 1.0542 | 457 | 0.0514 | - | - | - | - | - | - |
| 1.0565 | 458 | 0.0152 | - | - | - | - | - | - |
| 1.0588 | 459 | 0.0476 | - | - | - | - | - | - |
| 1.0611 | 460 | 0.0037 | - | - | - | - | - | - |
| 1.0634 | 461 | 0.0066 | - | - | - | - | - | - |
| 1.0657 | 462 | 0.0065 | - | - | - | - | - | - |
| 1.0681 | 463 | 0.0097 | - | - | - | - | - | - |
| 1.0704 | 464 | 0.0053 | - | - | - | - | - | - |
| 1.0727 | 465 | 0.0397 | - | - | - | - | - | - |
| 1.0750 | 466 | 0.0089 | - | - | - | - | - | - |
| 1.0773 | 467 | 0.0238 | - | - | - | - | - | - |
| 1.0796 | 468 | 0.0078 | - | - | - | - | - | - |
| 1.0819 | 469 | 0.0108 | - | - | - | - | - | - |
| 1.0842 | 470 | 0.0094 | - | - | - | - | - | - |
| 1.0865 | 471 | 0.0034 | - | - | - | - | - | - |
| 1.0888 | 472 | 0.0165 | - | - | - | - | - | - |
| 1.0911 | 473 | 0.0407 | - | - | - | - | - | - |
| 1.0934 | 474 | 0.0339 | - | - | - | - | - | - |
| 1.0957 | 475 | 0.0645 | - | - | - | - | - | - |
| 1.0980 | 476 | 0.0052 | - | - | - | - | - | - |
| 1.1003 | 477 | 0.0643 | - | - | - | - | - | - |
| 1.1027 | 478 | 0.0113 | - | - | - | - | - | - |
| 1.1050 | 479 | 0.007 | - | - | - | - | - | - |
| 1.1073 | 480 | 0.0062 | - | - | - | - | - | - |
| 1.1096 | 481 | 0.0232 | - | - | - | - | - | - |
| 1.1119 | 482 | 0.0374 | - | - | - | - | - | - |
| 1.1142 | 483 | 0.0582 | - | - | - | - | - | - |
| 1.1165 | 484 | 0.0396 | - | - | - | - | - | - |
| 1.1188 | 485 | 0.0041 | - | - | - | - | - | - |
| 1.1211 | 486 | 0.0064 | - | - | - | - | - | - |
| 1.1234 | 487 | 0.0248 | - | - | - | - | - | - |
| 1.1257 | 488 | 0.0052 | - | - | - | - | - | - |
| 1.1280 | 489 | 0.0095 | - | - | - | - | - | - |
| 1.1303 | 490 | 0.0681 | - | - | - | - | - | - |
| 1.1326 | 491 | 0.0082 | - | - | - | - | - | - |
| 1.1349 | 492 | 0.0279 | - | - | - | - | - | - |
| 1.1373 | 493 | 0.008 | - | - | - | - | - | - |
| 1.1396 | 494 | 0.0032 | - | - | - | - | - | - |
| 1.1419 | 495 | 0.041 | - | - | - | - | - | - |
| 1.1442 | 496 | 0.0089 | - | - | - | - | - | - |
| 1.1465 | 497 | 0.0289 | - | - | - | - | - | - |
| 1.1488 | 498 | 0.0232 | - | - | - | - | - | - |
| 1.1511 | 499 | 0.059 | - | - | - | - | - | - |
| 1.1534 | 500 | 0.0053 | 0.9039 | 0.9059 | 0.9032 | 0.9046 | 0.8995 | 0.9050 |
</details>
### Framework Versions
- Python: 3.11.9
- Sentence Transformers: 3.0.1
- Transformers: 4.44.2
- PyTorch: 2.4.0+cu121
- Accelerate: 0.33.0
- Datasets: 2.19.2
- Tokenizers: 0.19.1
## Citation
### BibTeX
#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
```
#### MatryoshkaLoss
```bibtex
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
```
#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```