diff --git "a/README.md" "b/README.md" new file mode 100644--- /dev/null +++ "b/README.md" @@ -0,0 +1,1396 @@ +--- +base_model: BAAI/bge-large-en-v1.5 +datasets: [] +language: +- en +library_name: sentence-transformers +license: other +metrics: +- cosine_accuracy@1 +- cosine_accuracy@3 +- cosine_accuracy@5 +- cosine_accuracy@10 +- cosine_precision@1 +- cosine_precision@3 +- cosine_precision@5 +- cosine_precision@10 +- cosine_recall@1 +- cosine_recall@3 +- cosine_recall@5 +- cosine_recall@10 +- cosine_ndcg@10 +- cosine_mrr@10 +- cosine_map@100 +pipeline_tag: sentence-similarity +tags: +- sentence-transformers +- sentence-similarity +- feature-extraction +- generated_from_trainer +- dataset_size:104022 +- loss:MatryoshkaLoss +- loss:MultipleNegativesRankingLoss +widget: +- source_sentence: IZEA's market capitalization is $36 million, indicating potential + for raising additional funds if needed. + sentences: + - IZEA's market capitalization is $35.65 million, with a P/E ratio of -5.19, indicating + unprofitability in the last twelve months as of Q3 2023. + - NetApp sells its products and services through a direct sales force and an ecosystem + of partners. + - SAIL's expansion plans have raised concerns among investors, leading to underperformance + in its stock compared to the Nifty 500 index. +- source_sentence: Infinity Mining conducted an eight-hole reverse-circulation (RC) + drilling campaign at its Tambourah South project in Western Australia, targeting + lithium-caesium-tantalum (LCT) pegmatites. + sentences: + - The disclosure must be made to a Regulatory Information Service, as required by + Rule 8 of the Takeover Code. + - Infinity Mining plans to expand its exploration efforts at Tambourah South, including + the use of new technologies and techniques to identify and evaluate concealed + pegmatite targets. + - Russia aims to export over 65 million tons of grain during the season, a record + volume. 
+- source_sentence: Ukraine expects to receive about $1.5 billion from other international + financial institutions, including the World Bank, in 2024. + sentences: + - Ukraine has an ongoing cooperation with the International Monetary Fund (IMF), + with a 48-month lending program worth $15.6 billion, receiving $3.6 billion this + year and expecting $900 million in December, and $5.4 billion in 2024 subject + to reform targets and economic indicators. + - Vodacom Group could be considered a reasonable income stock despite the dividend + cut, with a solid payout ratio but a less impressive dividend track record. + - CoStar Group employees, members of the Black Excellence Network and Women's Network, + worked alongside Feed More volunteers to facilitate the giveaway. +- source_sentence: WaFd paid out 27% of its profit in dividends last year, indicating + a comfortable payout ratio. + sentences: + - USP35 knockdown in Hep3B cells inhibits tumor growth and reduces the expression + of ABHD17C, p-PI3K, and p-AKT in xenograft HCC models. + - Nasdaq will suspend trading of CohBar, Inc.'s common stock at the opening of business + on November 29, 2023, unless the company requests a hearing before a Nasdaq Hearings + Panel to appeal the determination. + - WaFd's earnings per share have grown at a rate of 9.4% per annum over the past + five years, demonstrating consistent growth. +- source_sentence: Scope Control provides a digital ledger of inspected lines, creating + a credible line history that underscores Custom Truck One Source's commitment + to operational safety. + sentences: + - China has implemented measures to address hidden debt, including extending debt + maturities, selling assets to repay debts, and replacing short-term local government + financial vehicle debts with longer-term, lower-cost refinancing bonds. 
+ - Scope Control utilizes advanced Computer Vision and Deep Learning technologies + to accurately assess line health and categorize it as new, used, or bad based + on safety standards and residual break strength. + - The current management regulations for the national social security fund were + approved in December 2001 and have been implemented for over 20 years. The MOF + stated that parts of the content no longer address the current needs of the Chinese + financial market and the investment trend for the national social security fund, + necessitating a systematic and thorough revision. +model-index: +- name: VANTIGE_NEWS_v3_EDGE_DETECTION + results: + - task: + type: information-retrieval + name: Information Retrieval + dataset: + name: dim 1024 + type: dim_1024 + metrics: + - type: cosine_accuracy@1 + value: 0.828 + name: Cosine Accuracy@1 + - type: cosine_accuracy@3 + value: 0.978 + name: Cosine Accuracy@3 + - type: cosine_accuracy@5 + value: 0.986 + name: Cosine Accuracy@5 + - type: cosine_accuracy@10 + value: 0.992 + name: Cosine Accuracy@10 + - type: cosine_precision@1 + value: 0.828 + name: Cosine Precision@1 + - type: cosine_precision@3 + value: 0.32599999999999996 + name: Cosine Precision@3 + - type: cosine_precision@5 + value: 0.19720000000000001 + name: Cosine Precision@5 + - type: cosine_precision@10 + value: 0.0992 + name: Cosine Precision@10 + - type: cosine_recall@1 + value: 0.828 + name: Cosine Recall@1 + - type: cosine_recall@3 + value: 0.978 + name: Cosine Recall@3 + - type: cosine_recall@5 + value: 0.986 + name: Cosine Recall@5 + - type: cosine_recall@10 + value: 0.992 + name: Cosine Recall@10 + - type: cosine_ndcg@10 + value: 0.9261911001883877 + name: Cosine Ndcg@10 + - type: cosine_mrr@10 + value: 0.9034555555555557 + name: Cosine Mrr@10 + - type: cosine_map@100 + value: 0.9038902618135377 + name: Cosine Map@100 + - task: + type: information-retrieval + name: Information Retrieval + dataset: + name: dim 768 + type: dim_768 + metrics: + - 
type: cosine_accuracy@1 + value: 0.83 + name: Cosine Accuracy@1 + - type: cosine_accuracy@3 + value: 0.978 + name: Cosine Accuracy@3 + - type: cosine_accuracy@5 + value: 0.986 + name: Cosine Accuracy@5 + - type: cosine_accuracy@10 + value: 0.99 + name: Cosine Accuracy@10 + - type: cosine_precision@1 + value: 0.83 + name: Cosine Precision@1 + - type: cosine_precision@3 + value: 0.32599999999999996 + name: Cosine Precision@3 + - type: cosine_precision@5 + value: 0.1972 + name: Cosine Precision@5 + - type: cosine_precision@10 + value: 0.099 + name: Cosine Precision@10 + - type: cosine_recall@1 + value: 0.83 + name: Cosine Recall@1 + - type: cosine_recall@3 + value: 0.978 + name: Cosine Recall@3 + - type: cosine_recall@5 + value: 0.986 + name: Cosine Recall@5 + - type: cosine_recall@10 + value: 0.99 + name: Cosine Recall@10 + - type: cosine_ndcg@10 + value: 0.9264556449878328 + name: Cosine Ndcg@10 + - type: cosine_mrr@10 + value: 0.9044190476190478 + name: Cosine Mrr@10 + - type: cosine_map@100 + value: 0.9049635033323674 + name: Cosine Map@100 + - task: + type: information-retrieval + name: Information Retrieval + dataset: + name: dim 512 + type: dim_512 + metrics: + - type: cosine_accuracy@1 + value: 0.83 + name: Cosine Accuracy@1 + - type: cosine_accuracy@3 + value: 0.978 + name: Cosine Accuracy@3 + - type: cosine_accuracy@5 + value: 0.988 + name: Cosine Accuracy@5 + - type: cosine_accuracy@10 + value: 0.99 + name: Cosine Accuracy@10 + - type: cosine_precision@1 + value: 0.83 + name: Cosine Precision@1 + - type: cosine_precision@3 + value: 0.32599999999999996 + name: Cosine Precision@3 + - type: cosine_precision@5 + value: 0.19760000000000003 + name: Cosine Precision@5 + - type: cosine_precision@10 + value: 0.099 + name: Cosine Precision@10 + - type: cosine_recall@1 + value: 0.83 + name: Cosine Recall@1 + - type: cosine_recall@3 + value: 0.978 + name: Cosine Recall@3 + - type: cosine_recall@5 + value: 0.988 + name: Cosine Recall@5 + - type: cosine_recall@10 + 
value: 0.99 + name: Cosine Recall@10 + - type: cosine_ndcg@10 + value: 0.9262131769268145 + name: Cosine Ndcg@10 + - type: cosine_mrr@10 + value: 0.9041 + name: Cosine Mrr@10 + - type: cosine_map@100 + value: 0.9046338347982871 + name: Cosine Map@100 + - task: + type: information-retrieval + name: Information Retrieval + dataset: + name: dim 256 + type: dim_256 + metrics: + - type: cosine_accuracy@1 + value: 0.828 + name: Cosine Accuracy@1 + - type: cosine_accuracy@3 + value: 0.978 + name: Cosine Accuracy@3 + - type: cosine_accuracy@5 + value: 0.984 + name: Cosine Accuracy@5 + - type: cosine_accuracy@10 + value: 0.99 + name: Cosine Accuracy@10 + - type: cosine_precision@1 + value: 0.828 + name: Cosine Precision@1 + - type: cosine_precision@3 + value: 0.32599999999999996 + name: Cosine Precision@3 + - type: cosine_precision@5 + value: 0.1968 + name: Cosine Precision@5 + - type: cosine_precision@10 + value: 0.099 + name: Cosine Precision@10 + - type: cosine_recall@1 + value: 0.828 + name: Cosine Recall@1 + - type: cosine_recall@3 + value: 0.978 + name: Cosine Recall@3 + - type: cosine_recall@5 + value: 0.984 + name: Cosine Recall@5 + - type: cosine_recall@10 + value: 0.99 + name: Cosine Recall@10 + - type: cosine_ndcg@10 + value: 0.9250967573273415 + name: Cosine Ndcg@10 + - type: cosine_mrr@10 + value: 0.90265 + name: Cosine Mrr@10 + - type: cosine_map@100 + value: 0.9031974089635855 + name: Cosine Map@100 + - task: + type: information-retrieval + name: Information Retrieval + dataset: + name: dim 128 + type: dim_128 + metrics: + - type: cosine_accuracy@1 + value: 0.832 + name: Cosine Accuracy@1 + - type: cosine_accuracy@3 + value: 0.978 + name: Cosine Accuracy@3 + - type: cosine_accuracy@5 + value: 0.986 + name: Cosine Accuracy@5 + - type: cosine_accuracy@10 + value: 0.992 + name: Cosine Accuracy@10 + - type: cosine_precision@1 + value: 0.832 + name: Cosine Precision@1 + - type: cosine_precision@3 + value: 0.32599999999999996 + name: Cosine Precision@3 + - type: 
cosine_precision@5 + value: 0.19720000000000001 + name: Cosine Precision@5 + - type: cosine_precision@10 + value: 0.0992 + name: Cosine Precision@10 + - type: cosine_recall@1 + value: 0.832 + name: Cosine Recall@1 + - type: cosine_recall@3 + value: 0.978 + name: Cosine Recall@3 + - type: cosine_recall@5 + value: 0.986 + name: Cosine Recall@5 + - type: cosine_recall@10 + value: 0.992 + name: Cosine Recall@10 + - type: cosine_ndcg@10 + value: 0.9276434508354098 + name: Cosine Ndcg@10 + - type: cosine_mrr@10 + value: 0.9054333333333333 + name: Cosine Mrr@10 + - type: cosine_map@100 + value: 0.9058527890466532 + name: Cosine Map@100 + - task: + type: information-retrieval + name: Information Retrieval + dataset: + name: dim 64 + type: dim_64 + metrics: + - type: cosine_accuracy@1 + value: 0.822 + name: Cosine Accuracy@1 + - type: cosine_accuracy@3 + value: 0.978 + name: Cosine Accuracy@3 + - type: cosine_accuracy@5 + value: 0.986 + name: Cosine Accuracy@5 + - type: cosine_accuracy@10 + value: 0.99 + name: Cosine Accuracy@10 + - type: cosine_precision@1 + value: 0.822 + name: Cosine Precision@1 + - type: cosine_precision@3 + value: 0.32599999999999996 + name: Cosine Precision@3 + - type: cosine_precision@5 + value: 0.19720000000000001 + name: Cosine Precision@5 + - type: cosine_precision@10 + value: 0.099 + name: Cosine Precision@10 + - type: cosine_recall@1 + value: 0.822 + name: Cosine Recall@1 + - type: cosine_recall@3 + value: 0.978 + name: Cosine Recall@3 + - type: cosine_recall@5 + value: 0.986 + name: Cosine Recall@5 + - type: cosine_recall@10 + value: 0.99 + name: Cosine Recall@10 + - type: cosine_ndcg@10 + value: 0.9224148281915946 + name: Cosine Ndcg@10 + - type: cosine_mrr@10 + value: 0.8989999999999999 + name: Cosine Mrr@10 + - type: cosine_map@100 + value: 0.8995256769374417 + name: Cosine Map@100 +--- + +# VANTIGE_NEWS_v3_EDGE_DETECTION + +This is a [sentence-transformers](https://www.SBERT.net) model finetuned from 
[BAAI/bge-large-en-v1.5](https://huggingface.co/BAAI/bge-large-en-v1.5). It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+## Model Details
+
+### Model Description
+- **Model Type:** Sentence Transformer
+- **Base model:** [BAAI/bge-large-en-v1.5](https://huggingface.co/BAAI/bge-large-en-v1.5)
+- **Maximum Sequence Length:** 512 tokens
+- **Output Dimensionality:** 1024 dimensions
+- **Similarity Function:** Cosine Similarity
+- **Language:** en
+- **License:** other
+
+### Model Sources
+
+- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+### Full Model Architecture
+
+```
+SentenceTransformer(
+  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
+  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+  (2): Normalize()
+)
+```
+
+## Usage
+
+### Direct Usage (Sentence Transformers)
+
+First install the Sentence Transformers library:
+
+```bash
+pip install -U sentence-transformers
+```
+
+Then you can load this model and run inference.
+```python
+from sentence_transformers import SentenceTransformer
+
+# Download from the 🤗 Hub
+model = SentenceTransformer("dustyatx/news_v3_graph_edges_embeddings_setence_paragraph")
+# Run inference
+sentences = [
+    "Scope Control provides a digital ledger of inspected lines, creating a credible line history that underscores Custom Truck One Source's commitment to operational safety.",
+    'Scope Control utilizes advanced Computer Vision and Deep Learning technologies to accurately assess line health and categorize it as new, used, or bad based on safety standards and residual break strength.',
+    'China has implemented measures to address hidden debt, including extending debt maturities, selling assets to repay debts, and replacing short-term local government financial vehicle debts with longer-term, lower-cost refinancing bonds.',
+]
+embeddings = model.encode(sentences)
+print(embeddings.shape)
+# (3, 1024)
+
+# Get the similarity scores for the embeddings
+similarities = model.similarity(embeddings, embeddings)
+print(similarities.shape)
+# torch.Size([3, 3])
+```
+
+## Evaluation
+
+### Metrics
+
+#### Information Retrieval
+* Dataset: `dim_1024`
+* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+
+| Metric              | Value      |
+|:--------------------|:-----------|
+| cosine_accuracy@1   | 0.828      |
+| cosine_accuracy@3   | 0.978      |
+| cosine_accuracy@5   | 0.986      |
+| cosine_accuracy@10  | 0.992      |
+| cosine_precision@1  | 0.828      |
+| cosine_precision@3  | 0.326      |
+| cosine_precision@5  | 0.1972     |
+| cosine_precision@10 | 0.0992     |
+| cosine_recall@1     | 0.828      |
+| cosine_recall@3     | 0.978      |
+| cosine_recall@5     | 0.986      |
+| cosine_recall@10    | 0.992      |
+| cosine_ndcg@10      | 0.9262     |
+| cosine_mrr@10       | 0.9035     |
+| **cosine_map@100**  | **0.9039** |
+
+#### Information Retrieval
+* Dataset: `dim_768`
+* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+
+| Metric              | Value     |
+|:--------------------|:----------|
+| cosine_accuracy@1   | 0.83      |
+| cosine_accuracy@3   | 0.978     |
+| cosine_accuracy@5   | 0.986     |
+| cosine_accuracy@10  | 0.99      |
+| cosine_precision@1  | 0.83      |
+| cosine_precision@3  | 0.326     |
+| cosine_precision@5  | 0.1972    |
+| cosine_precision@10 | 0.099     |
+| cosine_recall@1     | 0.83      |
+| cosine_recall@3     | 0.978     |
+| cosine_recall@5     | 0.986     |
+| cosine_recall@10    | 0.99      |
+| cosine_ndcg@10      | 0.9265    |
+| cosine_mrr@10       | 0.9044    |
+| **cosine_map@100**  | **0.905** |
+
+#### Information Retrieval
+* Dataset: `dim_512`
+* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+
+| Metric              | Value      |
+|:--------------------|:-----------|
+| cosine_accuracy@1   | 0.83       |
+| cosine_accuracy@3   | 0.978      |
+| cosine_accuracy@5   | 0.988      |
+| cosine_accuracy@10  | 0.99       |
+| cosine_precision@1  | 0.83       |
+| cosine_precision@3  | 0.326      |
+| cosine_precision@5  | 0.1976     |
+| cosine_precision@10 | 0.099      |
+| cosine_recall@1     | 0.83       |
+| cosine_recall@3     | 0.978      |
+| cosine_recall@5     | 0.988      |
+| cosine_recall@10    | 0.99       |
+| cosine_ndcg@10      | 0.9262     |
+| cosine_mrr@10       | 0.9041     |
+| **cosine_map@100**  | **0.9046** |
+
+#### Information Retrieval
+* Dataset: `dim_256`
+* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+
+| Metric              | Value      |
+|:--------------------|:-----------|
+| cosine_accuracy@1   | 0.828      |
+| cosine_accuracy@3   | 0.978      |
+| cosine_accuracy@5   | 0.984      |
+| cosine_accuracy@10  | 0.99       |
+| cosine_precision@1  | 0.828      |
+| cosine_precision@3  | 0.326      |
+| cosine_precision@5  | 0.1968     |
+| cosine_precision@10 | 0.099      |
+| cosine_recall@1     | 0.828      |
+| cosine_recall@3     | 0.978      |
+| cosine_recall@5     | 0.984      |
+| cosine_recall@10    | 0.99       |
+| cosine_ndcg@10      | 0.9251     |
+| cosine_mrr@10       | 0.9026     |
+| **cosine_map@100**  | **0.9032** |
+
+#### Information Retrieval
+* Dataset: `dim_128`
+* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+
+| Metric              | Value      |
+|:--------------------|:-----------|
+| cosine_accuracy@1   | 0.832      |
+| cosine_accuracy@3   | 0.978      |
+| cosine_accuracy@5   | 0.986      |
+| cosine_accuracy@10  | 0.992      |
+| cosine_precision@1  | 0.832      |
+| cosine_precision@3  | 0.326      |
+| cosine_precision@5  | 0.1972     |
+| cosine_precision@10 | 0.0992     |
+| cosine_recall@1     | 0.832      |
+| cosine_recall@3     | 0.978      |
+| cosine_recall@5     | 0.986      |
+| cosine_recall@10    | 0.992      |
+| cosine_ndcg@10      | 0.9276     |
+| cosine_mrr@10       | 0.9054     |
+| **cosine_map@100**  | **0.9059** |
+
+#### Information Retrieval
+* Dataset: `dim_64`
+* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+
+| Metric              | Value      |
+|:--------------------|:-----------|
+| cosine_accuracy@1   | 0.822      |
+| cosine_accuracy@3   | 0.978      |
+| cosine_accuracy@5   | 0.986      |
+| cosine_accuracy@10  | 0.99       |
+| cosine_precision@1  | 0.822      |
+| cosine_precision@3  | 0.326      |
+| cosine_precision@5  | 0.1972     |
+| cosine_precision@10 | 0.099      |
+| cosine_recall@1     | 0.822      |
+| cosine_recall@3     | 0.978      |
+| cosine_recall@5     | 0.986      |
+| cosine_recall@10    | 0.99       |
+| cosine_ndcg@10      | 0.9224     |
+| cosine_mrr@10       | 0.899      |
+| **cosine_map@100**  | **0.8995** |
+
+## Training Details
+
+### Training Dataset
+
+#### Unnamed Dataset
+
+* Size: 104,022 training samples
+* Columns: anchor and positive
+* Approximate statistics based on the first 1000 samples:
+  |         | anchor | positive |
+  |:--------|:-------|:---------|
+  | type    | string | string   |
+  | details |        |          |
+* Samples:
+  | anchor | positive |
+  |:-------|:---------|
+  | The general public, including retail investors, collectively own 11% of FINEOS Corporation Holdings' shares, representing a minority stake in the company. | Private companies, with their 50% ownership stake, have substantial influence over FINEOS Corporation Holdings' management and governance decisions. |
+  | A study by the Insurance Institute for Highway Safety (IIHS) found that SUVs and vans with hood heights exceeding 40 inches are approximately 45% more likely to cause pedestrian fatalities compared to vehicles with hood heights of 30 inches or less and a sloping profile. | Vehicles with front ends exceeding 35 inches in height, particularly those lacking a sloping profile, are more likely to cause severe head, torso, and hip injuries to pedestrians. |
+  | SpringWorks Therapeutics has a portfolio of small molecule targeted oncology product candidates and is conducting clinical trials for rare tumor types and genetically defined cancers. | SpringWorks Therapeutics operates in the biopharmaceutical industry, specializing in precision medicine for underserved patient populations. |
+* Loss: [MatryoshkaLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
+  ```json
+  {
+      "loss": "MultipleNegativesRankingLoss",
+      "matryoshka_dims": [
+          1024,
+          768,
+          512,
+          256,
+          128,
+          64
+      ],
+      "matryoshka_weights": [
+          1,
+          1,
+          1,
+          1,
+          1,
+          1
+      ],
+      "n_dims_per_step": -1
+  }
+  ```
+
+### Training Hyperparameters
+#### Non-Default Hyperparameters
+
+- `eval_strategy`: steps
+- `per_device_train_batch_size`: 30
+- `per_device_eval_batch_size`: 20
+- `gradient_accumulation_steps`: 8
+- `learning_rate`: 3e-05
+- `num_train_epochs`: 2
+- `lr_scheduler_type`: cosine
+- `warmup_ratio`: 0.2
+- `bf16`: True
+- `tf32`: True
+- `dataloader_num_workers`: 30
+- `load_best_model_at_end`: True
+- `optim`: adamw_torch_fused
+- `batch_sampler`: no_duplicates
+
+#### All Hyperparameters
+
+- `overwrite_output_dir`: False
+- `do_predict`: False
+- `eval_strategy`: steps
+- `prediction_loss_only`: True
+- `per_device_train_batch_size`: 30
+- `per_device_eval_batch_size`: 20
+- `per_gpu_train_batch_size`: None
+- `per_gpu_eval_batch_size`: None
+- `gradient_accumulation_steps`: 8
+- `eval_accumulation_steps`: None
+- `torch_empty_cache_steps`: None
+- `learning_rate`: 3e-05
+- `weight_decay`: 0.0
+- `adam_beta1`: 0.9
+- `adam_beta2`: 0.999
+- `adam_epsilon`: 1e-08
+- `max_grad_norm`: 1.0
+- `num_train_epochs`: 2
+- `max_steps`: -1
+- `lr_scheduler_type`: cosine
+- `lr_scheduler_kwargs`: {}
+- `warmup_ratio`: 0.2
+- `warmup_steps`: 0
+- `log_level`: passive
+- `log_level_replica`: warning
+- `log_on_each_node`: True
+- `logging_nan_inf_filter`: True
+- `save_safetensors`: True
+- `save_on_each_node`: False
+- `save_only_model`: False
+- `restore_callback_states_from_checkpoint`: False
+- `no_cuda`: False
+- `use_cpu`: False
+- `use_mps_device`: False
+- `seed`: 42
+- `data_seed`: None
+- `jit_mode_eval`: False
+- `use_ipex`: False
+- `bf16`: True
+- `fp16`: False
+- `fp16_opt_level`: O1
+- `half_precision_backend`: auto
+- `bf16_full_eval`: False
+- `fp16_full_eval`: False
+- `tf32`: True
+- `local_rank`: 0
+- `ddp_backend`: None
+- `tpu_num_cores`: None
+- `tpu_metrics_debug`: False
+- `debug`: []
+- `dataloader_drop_last`: False
+- `dataloader_num_workers`: 30
+- `dataloader_prefetch_factor`: None
+- `past_index`: -1
+- `disable_tqdm`: False
+- `remove_unused_columns`: True
+- `label_names`: None
+- `load_best_model_at_end`: True
+- `ignore_data_skip`: False
+- `fsdp`: []
+- `fsdp_min_num_params`: 0
+- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+- `fsdp_transformer_layer_cls_to_wrap`: None
+- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+- `deepspeed`: None
+- `label_smoothing_factor`: 0.0
+- `optim`: adamw_torch_fused
+- `optim_args`: None
+- `adafactor`: False
+- `group_by_length`: False
+- `length_column_name`: length
+- `ddp_find_unused_parameters`: None
+- `ddp_bucket_cap_mb`: None
+- `ddp_broadcast_buffers`: False
+- `dataloader_pin_memory`: True
+- `dataloader_persistent_workers`: False
+- `skip_memory_metrics`: True
+- `use_legacy_prediction_loop`: False
+- `push_to_hub`: False
+- `resume_from_checkpoint`: None
+- `hub_model_id`: None
+- `hub_strategy`: every_save
+- `hub_private_repo`: False
+- `hub_always_push`: False
+- `gradient_checkpointing`: False
+- `gradient_checkpointing_kwargs`: None
+- `include_inputs_for_metrics`: False
+- `eval_do_concat_batches`: True
+- `fp16_backend`: auto
+- `push_to_hub_model_id`: None
+- `push_to_hub_organization`: None
+- `mp_parameters`: 
+- `auto_find_batch_size`: False
+- `full_determinism`: False
+- `torchdynamo`: None
+- `ray_scope`: last
+- `ddp_timeout`: 1800
+- `torch_compile`: False
+- `torch_compile_backend`: None
+- `torch_compile_mode`: None
+- `dispatch_batches`: None
+- `split_batches`: None
+- `include_tokens_per_second`: False
+- `include_num_input_tokens_seen`: False
+- `neftune_noise_alpha`: None
+- `optim_target_modules`: None
+- `batch_eval_metrics`: False
+- `eval_on_start`: False
+- `eval_use_gather_object`: False
+- `batch_sampler`: no_duplicates
+- `multi_dataset_batch_sampler`: proportional
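As a rough cross-check of the hyperparameters above, the sketch below works out the effective batch size and how an optimizer step maps to an epoch fraction. It assumes training ran on a single device and that the sampler yields `ceil(samples / batch_size)` batches per epoch; neither is stated in this card. The names `world_size`, `batches_per_epoch`, and `epoch_at_step` are illustrative and not part of any library API.

```python
import math

# Values reported in this card.
train_samples = 104_022
per_device_train_batch_size = 30
gradient_accumulation_steps = 8

# Assumption (not stated in the card): a single training device.
world_size = 1

# Number of samples that contribute to one optimizer step.
effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * world_size
)

# Assumption: no batches are dropped (dataloader_drop_last is False), so the
# last partial batch still counts as one batch.
batches_per_epoch = math.ceil(
    train_samples / (per_device_train_batch_size * world_size)
)


def epoch_at_step(step: int) -> float:
    """Approximate epoch fraction reached after `step` optimizer steps."""
    return step * gradient_accumulation_steps / batches_per_epoch


print(effective_batch_size)  # 240
print(batches_per_epoch)     # 3468
print(round(epoch_at_step(50), 4))
print(round(epoch_at_step(100), 4))
```

Under these assumptions the computed epoch fractions for steps 50 and 100 agree with the Epoch column of the training logs (0.1153 and 0.2307), which suggests roughly 433 optimizer steps per epoch. Treat this as a plausibility check, not as a statement of how the trainer computes its values internally.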
+
+### Training Logs
+
+| Epoch | Step | Training Loss | dim_1024_cosine_map@100 | dim_128_cosine_map@100 | dim_256_cosine_map@100 | dim_512_cosine_map@100 | dim_64_cosine_map@100 | dim_768_cosine_map@100 |
+|:------:|:----:|:-------------:|:-----------------------:|:----------------------:|:----------------------:|:----------------------:|:---------------------:|:----------------------:|
+| 0.0023 | 1 | 1.8313 | - | - | - | - | - | - |
+| 0.0046 | 2 | 1.9678 | - | - | - | - | - | - |
+| 0.0069 | 3 | 0.8038 | - | - | - | - | - | - |
+| 0.0092 | 4 | 0.7993 | - | - | - | - | - | - |
+| 0.0115 | 5 | 0.7926 | - | - | - | - | - | - |
+| 0.0138 | 6 | 0.9348 | - | - | - | - | - | - |
+| 0.0161 | 7 | 0.8707 | - | - | - | - | - | - |
+| 0.0185 | 8 | 0.7293 | - | - | - | - | - | - |
+| 0.0208 | 9 | 0.6618 | - | - | - | - | - | - |
+| 0.0231 | 10 | 0.846 | - | - | - | - | - | - |
+| 0.0254 | 11 | 0.6836 | - | - | - | - | - | - |
+| 0.0277 | 12 | 0.7034 | - | - | - | - | - | - |
+| 0.0300 | 13 | 0.7987 | - | - | - | - | - | - |
+| 0.0323 | 14 | 0.6443 | - | - | - | - | - | - |
+| 0.0346 | 15 | 0.5975 | - | - | - | - | - | - |
+| 0.0369 | 16 | 0.4471 | - | - | - | - | - | - |
+| 0.0392 | 17 | 0.4739 | - | - | - | - | - | - |
+| 0.0415 | 18 | 0.4136 | - | - | - | - | - | - |
+| 0.0438 | 19 | 0.3865 | - | - | - | - | - | - |
+| 0.0461 | 20 | 0.3421 | - | - | - | - | - | - |
+| 0.0484 | 21 | 0.5076 | - | - | - | - | - | - |
+| 0.0507 | 22 | 0.1878 | - | - | - | - | - | - |
+| 0.0531 | 23 | 0.3597 | - | - | - | - | - | - |
+| 0.0554 | 24 | 0.23 | - | - | - | - | - | - |
+| 0.0577 | 25 | 0.1331 | - | - | - | - | - | - |
+| 0.0600 | 26 | 0.1793 | - | - | - | - | - | - |
+| 0.0623 | 27 | 0.1309 | - | - | - | - | - | - |
+| 0.0646 | 28 | 0.1077 | - | - | - | - | - | - |
+| 0.0669 | 29 | 0.1681 | - | - | - | - | - | - |
+| 0.0692 | 30 | 0.055 | - | - | - | - | - | - |
+| 0.0715 | 31 | 0.1062 | - | - | - | - | - | - |
+| 0.0738 | 32 | 0.0672 | - | - | - | - | - | - |
+| 0.0761 | 33 | 0.067 | - | - | - | - | - | - |
+| 0.0784 | 34 | 0.0953 | - | - | - | - | - | - |
+| 0.0807 | 35 | 0.0602 | - | - | - | - | - | - |
+| 0.0830 | 36 | 0.1312 | - | - | - | - | - | - |
+| 0.0854 | 37 | 0.0356 | - | - | - | - | - | - |
+| 0.0877 | 38 | 0.0707 | - | - | - | - | - | - |
+| 0.0900 | 39 | 0.1525 | - | - | - | - | - | - |
+| 0.0923 | 40 | 0.0362 | - | - | - | - | - | - |
+| 0.0946 | 41 | 0.253 | - | - | - | - | - | - |
+| 0.0969 | 42 | 0.0572 | - | - | - | - | - | - |
+| 0.0992 | 43 | 0.1031 | - | - | - | - | - | - |
+| 0.1015 | 44 | 0.1023 | - | - | - | - | - | - |
+| 0.1038 | 45 | 0.052 | - | - | - | - | - | - |
+| 0.1061 | 46 | 0.0614 | - | - | - | - | - | - |
+| 0.1084 | 47 | 0.1256 | - | - | - | - | - | - |
+| 0.1107 | 48 | 0.1624 | - | - | - | - | - | - |
+| 0.1130 | 49 | 0.0363 | - | - | - | - | - | - |
+| 0.1153 | 50 | 0.2001 | 0.8949 | 0.8940 | 0.8947 | 0.8950 | 0.8864 | 0.8972 |
+| 0.1176 | 51 | 0.0846 | - | - | - | - | - | - |
+| 0.1200 | 52 | 0.0338 | - | - | - | - | - | - |
+| 0.1223 | 53 | 0.0648 | - | - | - | - | - | - |
+| 0.1246 | 54 | 0.1232 | - | - | - | - | - | - |
+| 0.1269 | 55 | 0.0318 | - | - | - | - | - | - |
+| 0.1292 | 56 | 0.1148 | - | - | - | - | - | - |
+| 0.1315 | 57 | 0.0826 | - | - | - | - | - | - |
+| 0.1338 | 58 | 0.034 | - | - | - | - | - | - |
+| 0.1361 | 59 | 0.0492 | - | - | - | - | - | - |
+| 0.1384 | 60 | 0.0427 | - | - | - | - | - | - |
+| 0.1407 | 61 | 0.0709 | - | - | - | - | - | - |
+| 0.1430 | 62 | 0.0494 | - | - | - | - | - | - |
+| 0.1453 | 63 | 0.0554 | - | - | - | - | - | - |
+| 0.1476 | 64 | 0.061 | - | - | - | - | - | - |
+| 0.1499 | 65 | 0.1155 | - | - | - | - | - | - |
+| 0.1522 | 66 | 0.0419 | - | - | - | - | - | - |
+| 0.1546 | 67 | 0.0185 | - | - | - | - | - | - |
+| 0.1569 | 68 | 0.0559 | - | - | - | - | - | - |
+| 0.1592 | 69 | 0.0219 | - | - | - | - | - | - |
+| 0.1615 | 70 | 0.0302 | - | - | - | - | - | - |
+| 0.1638 | 71 | 0.0322 | - | - | - | - | - | - |
+| 0.1661 | 72 | 0.0604 | - | - | - | - | - | - |
+| 0.1684 | 73 | 0.038 | - | - | - | - | - | - |
+| 0.1707 | 74 | 0.0971 | - | - | - | - | - | - |
+| 0.1730 | 75 | 0.0384 | - | - | - | - | - | - |
+| 0.1753 | 76 | 0.0887 | - | - | - | - | - | - |
+| 0.1776 | 77 | 0.0495 | - | - | - | - | - | - |
+| 0.1799 | 78 | 0.0203 | - | - | - | - | - | - |
+| 0.1822 | 79 | 0.0669 | - | - | - | - | - | - |
+| 0.1845 | 80 | 0.0319 | - | - | - | - | - | - |
+| 0.1869 | 81 | 0.0177 | - | - | - | - | - | - |
+| 0.1892 | 82 | 0.0303 | - | - | - | - | - | - |
+| 0.1915 | 83 | 0.037 | - | - | - | - | - | - |
+| 0.1938 | 84 | 0.0122 | - | - | - | - | - | - |
+| 0.1961 | 85 | 0.0377 | - | - | - | - | - | - |
+| 0.1984 | 86 | 0.0578 | - | - | - | - | - | - |
+| 0.2007 | 87 | 0.0347 | - | - | - | - | - | - |
+| 0.2030 | 88 | 0.1288 | - | - | - | - | - | - |
+| 0.2053 | 89 | 0.0964 | - | - | - | - | - | - |
+| 0.2076 | 90 | 0.0172 | - | - | - | - | - | - |
+| 0.2099 | 91 | 0.0726 | - | - | - | - | - | - |
+| 0.2122 | 92 | 0.0225 | - | - | - | - | - | - |
+| 0.2145 | 93 | 0.1011 | - | - | - | - | - | - |
+| 0.2168 | 94 | 0.0248 | - | - | - | - | - | - |
+| 0.2191 | 95 | 0.0431 | - | - | - | - | - | - |
+| 0.2215 | 96 | 0.0243 | - | - | - | - | - | - |
+| 0.2238 | 97 | 0.0221 | - | - | - | - | - | - |
+| 0.2261 | 98 | 0.0529 | - | - | - | - | - | - |
+| 0.2284 | 99 | 0.0459 | - | - | - | - | - | - |
+| 0.2307 | 100 | 0.0869 | 0.9026 | 0.8967 | 0.8950 | 0.9003 | 0.8915 | 0.9009 |
+| 0.2330 | 101 | 0.0685 | - | - | - | - | - | - |
+| 0.2353 | 102 | 0.0801 | - | - | - | - | - | - |
+| 0.2376 | 103 | 0.025 | - | - | - | - | - | - |
+| 0.2399 | 104 | 0.0556 | - | - | - | - | - | - |
+| 0.2422 | 105 | 0.0146 | - | - | - | - | - | - |
+| 0.2445 | 106 | 0.0335 | - | - | - | - | - | - |
+| 0.2468 | 107 | 0.0441 | - | - | - | - | - | - |
+| 0.2491 | 108 | 0.0187 | - | - | - | - | - | - |
+| 0.2514 | 109 | 0.1027 | - | - | - | - | - | - |
+| 0.2537 | 110 | 0.0189 | - | - | - | - | - | - |
+| 0.2561 | 111 | 0.1262 | - | - | - | - | - | - |
+| 0.2584 | 112 | 0.1193 | - | - | - | - | - | - |
+| 0.2607 | 113 | 0.0285 | - | - | - | - | - | - |
+| 0.2630 | 114 | 0.0226 | - | - | - | - | - | - |
+| 0.2653 | 115 | 0.1209 | - | - | - | - | - | - |
+| 0.2676 | 116 | 0.0765 | - | - | - | - | - | - |
+| 0.2699 | 117 | 0.1405 | - | - | - | - | - | - |
+| 0.2722 | 118 | 0.0629 | - | - | - | - | - | - |
+| 0.2745 | 119 | 0.0413 | - | - | - | - | - | - |
+| 0.2768 | 120 | 0.0572 | - | - | - | - | - | - |
+| 0.2791 | 121 | 0.0192 | - | - | - | - | - | - |
+| 0.2814 | 122 | 0.0949 | - | - | - | - | - | - |
+| 0.2837 | 123 | 0.0398 | - | - | - | - | - | - |
+| 0.2860 | 124 | 0.0596 | - | - | - | - | - | - |
+| 0.2884 | 125 | 0.0243 | - | - | - | - | - | - |
+| 0.2907 | 126 | 0.0636 | - | - | - | - | - | - |
+| 0.2930 | 127 | 0.0367 | - | - | - | - | - | - |
+| 0.2953 | 128 | 0.0542 | - | - | - | - | - | - |
+| 0.2976 | 129 | 0.0149 | - | - | - | - | - | - |
+| 0.2999 | 130 | 0.097 | - | - | - | - | - | - |
+| 0.3022 | 131 | 0.0213 | - | - | - | - | - | - |
+| 0.3045 | 132 | 0.027 | - | - | - | - | - | - |
+| 0.3068 | 133 | 0.0577 | - | - | - | - | - | - |
+| 0.3091 | 134 | 0.0143 | - | - | - | - | - | - |
+| 0.3114 | 135 | 0.0285 | - | - | - | - | - | - |
+| 0.3137 | 136 | 0.033 | - | - | - | - | - | - |
+| 0.3160 | 137 | 0.0412 | - | - | - | - | - | - |
+| 0.3183 | 138 | 0.0125 | - | - | - | - | - | - |
+| 0.3206 | 139 | 0.0512 | - | - | - | - | - | - |
+| 0.3230 | 140 | 0.0189 | - | - | - | - | - | - |
+| 0.3253 | 141 | 0.124 | - | - | - | - | - | - |
+| 0.3276 | 142 | 0.0118 | - | - | - | - | - | - |
+| 0.3299 | 143 | 0.017 | - | - | - | - | - | - |
+| 0.3322 | 144 | 0.025 | - | - | - | - | - | - |
+| 0.3345 | 145 | 0.0187 | - | - | - | - | - | - |
+| 0.3368 | 146 | 0.0141 | - | - | - | - | - | - |
+| 0.3391 | 147 | 0.0325 | - | - | - | - | - | - |
+| 0.3414 | 148 | 0.0582 | - | - | - | - | - | - |
+| 0.3437 | 149 | 0.0611 | - | - | - | - | - | - |
+| 0.3460 | 150 | 0.0261 | 0.9047 | 0.8995 | 0.9003 | 0.9022 | 0.8998 | 0.9032 |
+| 0.3483 | 151 | 0.014 | - | - | - | - | - | - |
+| 0.3506 | 152 | 0.0077 | - | - | - | - | - | - |
+| 0.3529 | 153 | 0.022 | - | - | - | - | - | - |
+| 0.3552 | 154 | 0.0328 | - | - | - | - | - | - |
+| 0.3576 | 155 | 0.0124 | - | - | - | - | - | - |
+| 0.3599 | 156 | 0.0103 | - | - | - | - | - | - |
+| 0.3622 | 157 | 0.0607 | - | - | - | - | - | - |
+| 0.3645 | 158 | 0.0121 | - | - | - | - | - | - |
+| 0.3668 | 159 | 0.0761 | - | - | - | - | - | - |
+| 0.3691 | 160 | 0.0981 | - | - | - | - | - | - |
+| 0.3714 | 161 | 0.1071 | - | - | - | - | - | - |
+| 0.3737 | 162 | 0.1307 | - | - | - | - | - | - |
+| 0.3760 | 163 | 0.0524 | - | - | - | - | - | - |
+| 0.3783 | 164 | 0.0726 | - | - | - | - | - | - |
+| 0.3806 | 165 | 0.0636 | - | - | - | - | - | - |
+| 0.3829 | 166 | 0.0428 | - | - | - | - | - | - |
+| 0.3852 | 167 | 0.0111 | - | - | - | - | - | - |
+| 0.3875 | 168 | 0.0542 | - | - | - | - | - | - |
+| 0.3899 | 169 | 0.0193 | - | - | - | - | - | - |
+| 0.3922 | 170 | 0.0095 | - | - | - | - | - | - |
+| 0.3945 | 171 | 0.0464 | - | - | - | - | - | - |
+| 0.3968 | 172 | 0.0167 | - | - | - | - | - | - |
+| 0.3991 | 173 | 0.0209 | - | - | - | - | - | - |
+| 0.4014 | 174 | 0.0359 | - | - | - | - | - | - |
+| 0.4037 | 175 | 0.071 | - | - | - | - | - | - |
+| 0.4060 | 176 | 0.0189 | - | - | - | - | - | - |
+| 0.4083 | 177 | 0.0448 | - | - | - | - | - | - |
+| 0.4106 | 178 | 0.0161 | - | - | - | - | - | - |
+| 0.4129 | 179 | 0.0427 | - | - | - | - | - | - |
+| 0.4152 | 180 | 0.0229 | - | - | - | - | - | - |
+| 0.4175 | 181 | 0.0274 | - | - | - | - | - | - |
+| 0.4198 | 182 | 0.0173 | - | - | - | - | - | - |
+| 0.4221 | 183 | 0.0123 | - | - | - | - | - | - |
+| 0.4245 | 184 | 0.0395 | - | - | - | - | - | - |
+| 0.4268 | 185 | 0.015 | - | - | - | - | - | - |
+| 0.4291 | 186 | 0.0168 | - | - | - | - | - | - |
+| 0.4314 | 187 | 0.0165 | - | - | - | - | - | - |
+| 0.4337 | 188 | 0.0412 | - | - | - | - | - | - |
+| 0.4360 | 189 | 0.0961 | - | - | - | - | - | - |
+| 0.4383 | 190 | 0.0551 | - | - | - | - | - | - |
+| 0.4406 | 191 | 0.0685 | - | - | - | - | - | - |
+| 0.4429 | 192 | 0.1561 | - | - | - | - | - | - |
+| 0.4452 | 193 | 0.0333 | - | - | - | - | - | - |
+| 0.4475 | 194 | 0.0567 | - | - | - | - | - | - |
+| 0.4498 | 195 | 0.0081 | - | - | - | - | - | - |
+| 0.4521 | 196 | 0.0297 | - | - | - | - | - | - |
+| 0.4544 | 197 | 0.0131 | - | - | - | - | - | - |
+| 0.4567 | 198 | 0.0322 | - | - | - | - | - | - |
+| 0.4591 | 199 | 0.0224 | - | - | - | - | - | - |
+| 0.4614 | 200 | 0.0068 | 0.8989 | 0.8941 | 0.8983 | 0.8985 | 0.8975 | 0.9002 |
+| 0.4637 | 201 | 0.0115 | - | - | - | - | - | - |
+| 0.4660 | 202 | 0.0098 | - | - | - | - | - | - |
+| 0.4683 | 203 | 0.101 | - | - | - | - | - | - |
+| 0.4706 | 204 | 0.0282 | - | - | - | - | - | - |
+| 0.4729 | 205 | 0.0721 | - | - | - | - | - | - |
+| 0.4752 | 206 | 0.0123 | - | - | - | - | - | - |
+| 0.4775 | 207 | 0.1014 | - | - | - | - | - | - |
+| 0.4798 | 208 | 0.0257 | - | - | - | - | - | - |
+| 0.4821 | 209 | 0.1126 | - | - | - | - | - | - |
+| 0.4844 | 210 | 0.0586 | - | - | - | - | - | - |
+| 0.4867 | 211 | 0.0307 | - | - | - | - | - | - |
+| 0.4890 | 212 | 0.0226 | - | - | - | - | - | - |
+| 0.4913 | 213 | 0.0471 | - | - | - | - | - | - |
+| 0.4937 | 214 | 0.025 | - | - | - | - | - | - |
+| 0.4960 | 215 | 0.0799 | - | - | - | - | - | - |
+| 0.4983 | 216 | 0.0173 | - | - | - | - | - | - |
+| 0.5006 | 217 | 0.0208 | - | - | - | - | - | - |
+| 0.5029 | 218 | 0.0461 | - | - | - | - | - | - |
+| 0.5052 | 219 | 0.0592 | - | - | - | - | - | - |
+| 0.5075 | 220 | 0.0076 | - | - | - | - | - | - |
+| 0.5098 | 221 | 0.0156 | - | - | - | - | - | - |
+| 0.5121 | 222 | 0.0149 | - | - | - | - | - | - |
+| 0.5144 | 223 | 0.0138 | - | - | - | - | - | - |
+| 0.5167 | 224 | 0.0526 | - | - | - | - | - | - |
+| 0.5190 | 225 | 0.0689 | - | - | - | - | - | - |
+| 0.5213 | 226 | 0.0191 | - | - | - | - | - | - |
+| 0.5236 | 227 | 0.0094 | - | - | - | - | - | - |
+| 0.5260 | 228 | 0.0125 | - | - | -
| - | - | - | +| 0.5283 | 229 | 0.0632 | - | - | - | - | - | - | +| 0.5306 | 230 | 0.0773 | - | - | - | - | - | - | +| 0.5329 | 231 | 0.0147 | - | - | - | - | - | - | +| 0.5352 | 232 | 0.0145 | - | - | - | - | - | - | +| 0.5375 | 233 | 0.0068 | - | - | - | - | - | - | +| 0.5398 | 234 | 0.0673 | - | - | - | - | - | - | +| 0.5421 | 235 | 0.0131 | - | - | - | - | - | - | +| 0.5444 | 236 | 0.0217 | - | - | - | - | - | - | +| 0.5467 | 237 | 0.0126 | - | - | - | - | - | - | +| 0.5490 | 238 | 0.0172 | - | - | - | - | - | - | +| 0.5513 | 239 | 0.0122 | - | - | - | - | - | - | +| 0.5536 | 240 | 0.0175 | - | - | - | - | - | - | +| 0.5559 | 241 | 0.0184 | - | - | - | - | - | - | +| 0.5582 | 242 | 0.0422 | - | - | - | - | - | - | +| 0.5606 | 243 | 0.0106 | - | - | - | - | - | - | +| 0.5629 | 244 | 0.071 | - | - | - | - | - | - | +| 0.5652 | 245 | 0.0089 | - | - | - | - | - | - | +| 0.5675 | 246 | 0.0099 | - | - | - | - | - | - | +| 0.5698 | 247 | 0.0133 | - | - | - | - | - | - | +| 0.5721 | 248 | 0.0627 | - | - | - | - | - | - | +| 0.5744 | 249 | 0.0248 | - | - | - | - | - | - | +| 0.5767 | 250 | 0.0349 | 0.8970 | 0.8968 | 0.8961 | 0.8961 | 0.8952 | 0.8963 | +| 0.5790 | 251 | 0.0145 | - | - | - | - | - | - | +| 0.5813 | 252 | 0.0052 | - | - | - | - | - | - | +| 0.5836 | 253 | 0.0198 | - | - | - | - | - | - | +| 0.5859 | 254 | 0.0065 | - | - | - | - | - | - | +| 0.5882 | 255 | 0.007 | - | - | - | - | - | - | +| 0.5905 | 256 | 0.0072 | - | - | - | - | - | - | +| 0.5928 | 257 | 0.1878 | - | - | - | - | - | - | +| 0.5952 | 258 | 0.0091 | - | - | - | - | - | - | +| 0.5975 | 259 | 0.0421 | - | - | - | - | - | - | +| 0.5998 | 260 | 0.0166 | - | - | - | - | - | - | +| 0.6021 | 261 | 0.0909 | - | - | - | - | - | - | +| 0.6044 | 262 | 0.0107 | - | - | - | - | - | - | +| 0.6067 | 263 | 0.0191 | - | - | - | - | - | - | +| 0.6090 | 264 | 0.0168 | - | - | - | - | - | - | +| 0.6113 | 265 | 0.0814 | - | - | - | - | - | - | +| 0.6136 | 266 | 0.0736 | - | - | - | - | - | - | +| 0.6159 | 267 | 
0.0297 | - | - | - | - | - | - | +| 0.6182 | 268 | 0.016 | - | - | - | - | - | - | +| 0.6205 | 269 | 0.0201 | - | - | - | - | - | - | +| 0.6228 | 270 | 0.0111 | - | - | - | - | - | - | +| 0.6251 | 271 | 0.0164 | - | - | - | - | - | - | +| 0.6275 | 272 | 0.0106 | - | - | - | - | - | - | +| 0.6298 | 273 | 0.0287 | - | - | - | - | - | - | +| 0.6321 | 274 | 0.0595 | - | - | - | - | - | - | +| 0.6344 | 275 | 0.0446 | - | - | - | - | - | - | +| 0.6367 | 276 | 0.0203 | - | - | - | - | - | - | +| 0.6390 | 277 | 0.0079 | - | - | - | - | - | - | +| 0.6413 | 278 | 0.0345 | - | - | - | - | - | - | +| 0.6436 | 279 | 0.0461 | - | - | - | - | - | - | +| 0.6459 | 280 | 0.0803 | - | - | - | - | - | - | +| 0.6482 | 281 | 0.0218 | - | - | - | - | - | - | +| 0.6505 | 282 | 0.0288 | - | - | - | - | - | - | +| 0.6528 | 283 | 0.0745 | - | - | - | - | - | - | +| 0.6551 | 284 | 0.0102 | - | - | - | - | - | - | +| 0.6574 | 285 | 0.0626 | - | - | - | - | - | - | +| 0.6597 | 286 | 0.0606 | - | - | - | - | - | - | +| 0.6621 | 287 | 0.0319 | - | - | - | - | - | - | +| 0.6644 | 288 | 0.0303 | - | - | - | - | - | - | +| 0.6667 | 289 | 0.0216 | - | - | - | - | - | - | +| 0.6690 | 290 | 0.0417 | - | - | - | - | - | - | +| 0.6713 | 291 | 0.0061 | - | - | - | - | - | - | +| 0.6736 | 292 | 0.0386 | - | - | - | - | - | - | +| 0.6759 | 293 | 0.0117 | - | - | - | - | - | - | +| 0.6782 | 294 | 0.0283 | - | - | - | - | - | - | +| 0.6805 | 295 | 0.013 | - | - | - | - | - | - | +| 0.6828 | 296 | 0.1237 | - | - | - | - | - | - | +| 0.6851 | 297 | 0.0878 | - | - | - | - | - | - | +| 0.6874 | 298 | 0.0158 | - | - | - | - | - | - | +| 0.6897 | 299 | 0.0562 | - | - | - | - | - | - | +| 0.6920 | 300 | 0.0871 | 0.9022 | 0.9027 | 0.9074 | 0.9055 | 0.8990 | 0.9027 | +| 0.6943 | 301 | 0.0657 | - | - | - | - | - | - | +| 0.6967 | 302 | 0.0239 | - | - | - | - | - | - | +| 0.6990 | 303 | 0.0053 | - | - | - | - | - | - | +| 0.7013 | 304 | 0.0237 | - | - | - | - | - | - | +| 0.7036 | 305 | 0.0182 | - | - | - | - | - | - | 
+| 0.7059 | 306 | 0.0135 | - | - | - | - | - | - | +| 0.7082 | 307 | 0.0059 | - | - | - | - | - | - | +| 0.7105 | 308 | 0.0061 | - | - | - | - | - | - | +| 0.7128 | 309 | 0.0072 | - | - | - | - | - | - | +| 0.7151 | 310 | 0.0319 | - | - | - | - | - | - | +| 0.7174 | 311 | 0.1183 | - | - | - | - | - | - | +| 0.7197 | 312 | 0.0447 | - | - | - | - | - | - | +| 0.7220 | 313 | 0.0369 | - | - | - | - | - | - | +| 0.7243 | 314 | 0.0462 | - | - | - | - | - | - | +| 0.7266 | 315 | 0.0233 | - | - | - | - | - | - | +| 0.7290 | 316 | 0.0114 | - | - | - | - | - | - | +| 0.7313 | 317 | 0.0179 | - | - | - | - | - | - | +| 0.7336 | 318 | 0.0203 | - | - | - | - | - | - | +| 0.7359 | 319 | 0.0071 | - | - | - | - | - | - | +| 0.7382 | 320 | 0.1297 | - | - | - | - | - | - | +| 0.7405 | 321 | 0.0249 | - | - | - | - | - | - | +| 0.7428 | 322 | 0.063 | - | - | - | - | - | - | +| 0.7451 | 323 | 0.0479 | - | - | - | - | - | - | +| 0.7474 | 324 | 0.1483 | - | - | - | - | - | - | +| 0.7497 | 325 | 0.0058 | - | - | - | - | - | - | +| 0.7520 | 326 | 0.0191 | - | - | - | - | - | - | +| 0.7543 | 327 | 0.0855 | - | - | - | - | - | - | +| 0.7566 | 328 | 0.0156 | - | - | - | - | - | - | +| 0.7589 | 329 | 0.0147 | - | - | - | - | - | - | +| 0.7612 | 330 | 0.0124 | - | - | - | - | - | - | +| 0.7636 | 331 | 0.0242 | - | - | - | - | - | - | +| 0.7659 | 332 | 0.0433 | - | - | - | - | - | - | +| 0.7682 | 333 | 0.0103 | - | - | - | - | - | - | +| 0.7705 | 334 | 0.0833 | - | - | - | - | - | - | +| 0.7728 | 335 | 0.0082 | - | - | - | - | - | - | +| 0.7751 | 336 | 0.0122 | - | - | - | - | - | - | +| 0.7774 | 337 | 0.031 | - | - | - | - | - | - | +| 0.7797 | 338 | 0.0116 | - | - | - | - | - | - | +| 0.7820 | 339 | 0.0947 | - | - | - | - | - | - | +| 0.7843 | 340 | 0.0323 | - | - | - | - | - | - | +| 0.7866 | 341 | 0.0177 | - | - | - | - | - | - | +| 0.7889 | 342 | 0.0487 | - | - | - | - | - | - | +| 0.7912 | 343 | 0.0123 | - | - | - | - | - | - | +| 0.7935 | 344 | 0.0075 | - | - | - | - | - | - | +| 0.7958 | 
345 | 0.0061 | - | - | - | - | - | - | +| 0.7982 | 346 | 0.0057 | - | - | - | - | - | - | +| 0.8005 | 347 | 0.1108 | - | - | - | - | - | - | +| 0.8028 | 348 | 0.0104 | - | - | - | - | - | - | +| 0.8051 | 349 | 0.0131 | - | - | - | - | - | - | +| 0.8074 | 350 | 0.0229 | 0.9053 | 0.9041 | 0.9033 | 0.9066 | 0.8965 | 0.9052 | +| 0.8097 | 351 | 0.0478 | - | - | - | - | - | - | +| 0.8120 | 352 | 0.0127 | - | - | - | - | - | - | +| 0.8143 | 353 | 0.1143 | - | - | - | - | - | - | +| 0.8166 | 354 | 0.0365 | - | - | - | - | - | - | +| 0.8189 | 355 | 0.0418 | - | - | - | - | - | - | +| 0.8212 | 356 | 0.0494 | - | - | - | - | - | - | +| 0.8235 | 357 | 0.0082 | - | - | - | - | - | - | +| 0.8258 | 358 | 0.0212 | - | - | - | - | - | - | +| 0.8281 | 359 | 0.0106 | - | - | - | - | - | - | +| 0.8304 | 360 | 0.1009 | - | - | - | - | - | - | +| 0.8328 | 361 | 0.0316 | - | - | - | - | - | - | +| 0.8351 | 362 | 0.0313 | - | - | - | - | - | - | +| 0.8374 | 363 | 0.0108 | - | - | - | - | - | - | +| 0.8397 | 364 | 0.0198 | - | - | - | - | - | - | +| 0.8420 | 365 | 0.0112 | - | - | - | - | - | - | +| 0.8443 | 366 | 0.0197 | - | - | - | - | - | - | +| 0.8466 | 367 | 0.058 | - | - | - | - | - | - | +| 0.8489 | 368 | 0.0187 | - | - | - | - | - | - | +| 0.8512 | 369 | 0.0196 | - | - | - | - | - | - | +| 0.8535 | 370 | 0.0586 | - | - | - | - | - | - | +| 0.8558 | 371 | 0.0099 | - | - | - | - | - | - | +| 0.8581 | 372 | 0.0248 | - | - | - | - | - | - | +| 0.8604 | 373 | 0.0183 | - | - | - | - | - | - | +| 0.8627 | 374 | 0.0268 | - | - | - | - | - | - | +| 0.8651 | 375 | 0.0154 | - | - | - | - | - | - | +| 0.8674 | 376 | 0.0868 | - | - | - | - | - | - | +| 0.8697 | 377 | 0.0264 | - | - | - | - | - | - | +| 0.8720 | 378 | 0.0639 | - | - | - | - | - | - | +| 0.8743 | 379 | 0.1036 | - | - | - | - | - | - | +| 0.8766 | 380 | 0.0334 | - | - | - | - | - | - | +| 0.8789 | 381 | 0.04 | - | - | - | - | - | - | +| 0.8812 | 382 | 0.0095 | - | - | - | - | - | - | +| 0.8835 | 383 | 0.0371 | - | - | - | - | - | 
- | +| 0.8858 | 384 | 0.0585 | - | - | - | - | - | - | +| 0.8881 | 385 | 0.0353 | - | - | - | - | - | - | +| 0.8904 | 386 | 0.0095 | - | - | - | - | - | - | +| 0.8927 | 387 | 0.0126 | - | - | - | - | - | - | +| 0.8950 | 388 | 0.0384 | - | - | - | - | - | - | +| 0.8973 | 389 | 0.018 | - | - | - | - | - | - | +| 0.8997 | 390 | 0.057 | - | - | - | - | - | - | +| 0.9020 | 391 | 0.0371 | - | - | - | - | - | - | +| 0.9043 | 392 | 0.0475 | - | - | - | - | - | - | +| 0.9066 | 393 | 0.0972 | - | - | - | - | - | - | +| 0.9089 | 394 | 0.0189 | - | - | - | - | - | - | +| 0.9112 | 395 | 0.0993 | - | - | - | - | - | - | +| 0.9135 | 396 | 0.0527 | - | - | - | - | - | - | +| 0.9158 | 397 | 0.0466 | - | - | - | - | - | - | +| 0.9181 | 398 | 0.0383 | - | - | - | - | - | - | +| 0.9204 | 399 | 0.0322 | - | - | - | - | - | - | +| 0.9227 | 400 | 0.0651 | 0.9077 | 0.9074 | 0.9073 | 0.9077 | 0.9023 | 0.9078 | +| 0.9250 | 401 | 0.0055 | - | - | - | - | - | - | +| 0.9273 | 402 | 0.0083 | - | - | - | - | - | - | +| 0.9296 | 403 | 0.0062 | - | - | - | - | - | - | +| 0.9319 | 404 | 0.0085 | - | - | - | - | - | - | +| 0.9343 | 405 | 0.0179 | - | - | - | - | - | - | +| 0.9366 | 406 | 0.0041 | - | - | - | - | - | - | +| 0.9389 | 407 | 0.0978 | - | - | - | - | - | - | +| 0.9412 | 408 | 0.0068 | - | - | - | - | - | - | +| 0.9435 | 409 | 0.0145 | - | - | - | - | - | - | +| 0.9458 | 410 | 0.0098 | - | - | - | - | - | - | +| 0.9481 | 411 | 0.032 | - | - | - | - | - | - | +| 0.9504 | 412 | 0.0232 | - | - | - | - | - | - | +| 0.9527 | 413 | 0.0149 | - | - | - | - | - | - | +| 0.9550 | 414 | 0.0175 | - | - | - | - | - | - | +| 0.9573 | 415 | 0.0099 | - | - | - | - | - | - | +| 0.9596 | 416 | 0.0121 | - | - | - | - | - | - | +| 0.9619 | 417 | 0.108 | - | - | - | - | - | - | +| 0.9642 | 418 | 0.012 | - | - | - | - | - | - | +| 0.9666 | 419 | 0.0102 | - | - | - | - | - | - | +| 0.9689 | 420 | 0.0108 | - | - | - | - | - | - | +| 0.9712 | 421 | 0.2258 | - | - | - | - | - | - | +| 0.9735 | 422 | 0.0037 | - | - 
| - | - | - | - | +| 0.9758 | 423 | 0.0186 | - | - | - | - | - | - | +| 0.9781 | 424 | 0.0446 | - | - | - | - | - | - | +| 0.9804 | 425 | 0.1558 | - | - | - | - | - | - | +| 0.9827 | 426 | 0.023 | - | - | - | - | - | - | +| 0.9850 | 427 | 0.0075 | - | - | - | - | - | - | +| 0.9873 | 428 | 0.0095 | - | - | - | - | - | - | +| 0.9896 | 429 | 0.0141 | - | - | - | - | - | - | +| 0.9919 | 430 | 0.0617 | - | - | - | - | - | - | +| 0.9942 | 431 | 0.0961 | - | - | - | - | - | - | +| 0.9965 | 432 | 0.0058 | - | - | - | - | - | - | +| 0.9988 | 433 | 0.0399 | - | - | - | - | - | - | +| 1.0012 | 434 | 0.0063 | - | - | - | - | - | - | +| 1.0035 | 435 | 0.0288 | - | - | - | - | - | - | +| 1.0058 | 436 | 0.0041 | - | - | - | - | - | - | +| 1.0081 | 437 | 0.0071 | - | - | - | - | - | - | +| 1.0104 | 438 | 0.0233 | - | - | - | - | - | - | +| 1.0127 | 439 | 0.0135 | - | - | - | - | - | - | +| 1.0150 | 440 | 0.1015 | - | - | - | - | - | - | +| 1.0173 | 441 | 0.0045 | - | - | - | - | - | - | +| 1.0196 | 442 | 0.0088 | - | - | - | - | - | - | +| 1.0219 | 443 | 0.0086 | - | - | - | - | - | - | +| 1.0242 | 444 | 0.0072 | - | - | - | - | - | - | +| 1.0265 | 445 | 0.0147 | - | - | - | - | - | - | +| 1.0288 | 446 | 0.025 | - | - | - | - | - | - | +| 1.0311 | 447 | 0.0067 | - | - | - | - | - | - | +| 1.0334 | 448 | 0.0066 | - | - | - | - | - | - | +| 1.0358 | 449 | 0.0062 | - | - | - | - | - | - | +| 1.0381 | 450 | 0.0068 | 0.9091 | 0.9083 | 0.9045 | 0.9038 | 0.8983 | 0.9072 | +| 1.0404 | 451 | 0.0126 | - | - | - | - | - | - | +| 1.0427 | 452 | 0.0082 | - | - | - | - | - | - | +| 1.0450 | 453 | 0.0034 | - | - | - | - | - | - | +| 1.0473 | 454 | 0.04 | - | - | - | - | - | - | +| 1.0496 | 455 | 0.0235 | - | - | - | - | - | - | +| 1.0519 | 456 | 0.24 | - | - | - | - | - | - | +| 1.0542 | 457 | 0.0514 | - | - | - | - | - | - | +| 1.0565 | 458 | 0.0152 | - | - | - | - | - | - | +| 1.0588 | 459 | 0.0476 | - | - | - | - | - | - | +| 1.0611 | 460 | 0.0037 | - | - | - | - | - | - | +| 1.0634 | 461 | 
0.0066 | - | - | - | - | - | - | +| 1.0657 | 462 | 0.0065 | - | - | - | - | - | - | +| 1.0681 | 463 | 0.0097 | - | - | - | - | - | - | +| 1.0704 | 464 | 0.0053 | - | - | - | - | - | - | +| 1.0727 | 465 | 0.0397 | - | - | - | - | - | - | +| 1.0750 | 466 | 0.0089 | - | - | - | - | - | - | +| 1.0773 | 467 | 0.0238 | - | - | - | - | - | - | +| 1.0796 | 468 | 0.0078 | - | - | - | - | - | - | +| 1.0819 | 469 | 0.0108 | - | - | - | - | - | - | +| 1.0842 | 470 | 0.0094 | - | - | - | - | - | - | +| 1.0865 | 471 | 0.0034 | - | - | - | - | - | - | +| 1.0888 | 472 | 0.0165 | - | - | - | - | - | - | +| 1.0911 | 473 | 0.0407 | - | - | - | - | - | - | +| 1.0934 | 474 | 0.0339 | - | - | - | - | - | - | +| 1.0957 | 475 | 0.0645 | - | - | - | - | - | - | +| 1.0980 | 476 | 0.0052 | - | - | - | - | - | - | +| 1.1003 | 477 | 0.0643 | - | - | - | - | - | - | +| 1.1027 | 478 | 0.0113 | - | - | - | - | - | - | +| 1.1050 | 479 | 0.007 | - | - | - | - | - | - | +| 1.1073 | 480 | 0.0062 | - | - | - | - | - | - | +| 1.1096 | 481 | 0.0232 | - | - | - | - | - | - | +| 1.1119 | 482 | 0.0374 | - | - | - | - | - | - | +| 1.1142 | 483 | 0.0582 | - | - | - | - | - | - | +| 1.1165 | 484 | 0.0396 | - | - | - | - | - | - | +| 1.1188 | 485 | 0.0041 | - | - | - | - | - | - | +| 1.1211 | 486 | 0.0064 | - | - | - | - | - | - | +| 1.1234 | 487 | 0.0248 | - | - | - | - | - | - | +| 1.1257 | 488 | 0.0052 | - | - | - | - | - | - | +| 1.1280 | 489 | 0.0095 | - | - | - | - | - | - | +| 1.1303 | 490 | 0.0681 | - | - | - | - | - | - | +| 1.1326 | 491 | 0.0082 | - | - | - | - | - | - | +| 1.1349 | 492 | 0.0279 | - | - | - | - | - | - | +| 1.1373 | 493 | 0.008 | - | - | - | - | - | - | +| 1.1396 | 494 | 0.0032 | - | - | - | - | - | - | +| 1.1419 | 495 | 0.041 | - | - | - | - | - | - | +| 1.1442 | 496 | 0.0089 | - | - | - | - | - | - | +| 1.1465 | 497 | 0.0289 | - | - | - | - | - | - | +| 1.1488 | 498 | 0.0232 | - | - | - | - | - | - | +| 1.1511 | 499 | 0.059 | - | - | - | - | - | - | +| 1.1534 | 500 | 0.0053 | 
0.9039 | 0.9059 | 0.9032 | 0.9046 | 0.8995 | 0.9050 | + +
+### Framework Versions
+- Python: 3.11.9
+- Sentence Transformers: 3.0.1
+- Transformers: 4.44.2
+- PyTorch: 2.4.0+cu121
+- Accelerate: 0.33.0
+- Datasets: 2.19.2
+- Tokenizers: 0.19.1
+
+## Citation
+
+### BibTeX
+
+#### Sentence Transformers
+```bibtex
+@inproceedings{reimers-2019-sentence-bert,
+    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+    author = "Reimers, Nils and Gurevych, Iryna",
+    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+    month = "11",
+    year = "2019",
+    publisher = "Association for Computational Linguistics",
+    url = "https://arxiv.org/abs/1908.10084",
+}
+```
+
+#### MatryoshkaLoss
+```bibtex
+@misc{kusupati2024matryoshka,
+    title={Matryoshka Representation Learning},
+    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
+    year={2024},
+    eprint={2205.13147},
+    archivePrefix={arXiv},
+    primaryClass={cs.LG}
+}
+```
+
+#### MultipleNegativesRankingLoss
+```bibtex
+@misc{henderson2017efficient,
+    title={Efficient Natural Language Response Suggestion for Smart Reply},
+    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+    year={2017},
+    eprint={1705.00652},
+    archivePrefix={arXiv},
+    primaryClass={cs.CL}
+}
+```
\ No newline at end of file