---
base_model: BAAI/bge-large-en-v1.5
datasets: []
language:
- en
library_name: sentence-transformers
license: other
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:104022
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
widget:
- source_sentence: IZEA's market capitalization is $36 million, indicating potential for raising additional funds if needed.
  sentences:
  - IZEA's market capitalization is $35.65 million, with a P/E ratio of -5.19, indicating unprofitability in the last twelve months as of Q3 2023.
  - NetApp sells its products and services through a direct sales force and an ecosystem of partners.
  - SAIL's expansion plans have raised concerns among investors, leading to underperformance in its stock compared to the Nifty 500 index.
- source_sentence: Infinity Mining conducted an eight-hole reverse-circulation (RC) drilling campaign at its Tambourah South project in Western Australia, targeting lithium-caesium-tantalum (LCT) pegmatites.
  sentences:
  - The disclosure must be made to a Regulatory Information Service, as required by Rule 8 of the Takeover Code.
  - Infinity Mining plans to expand its exploration efforts at Tambourah South, including the use of new technologies and techniques to identify and evaluate concealed pegmatite targets.
  - Russia aims to export over 65 million tons of grain during the season, a record volume.
- source_sentence: Ukraine expects to receive about $1.5 billion from other international financial institutions, including the World Bank, in 2024.
  sentences:
  - Ukraine has an ongoing cooperation with the International Monetary Fund (IMF), with a 48-month lending program worth $15.6 billion, receiving $3.6 billion this year and expecting $900 million in December, and $5.4 billion in 2024 subject to reform targets and economic indicators.
  - Vodacom Group could be considered a reasonable income stock despite the dividend cut, with a solid payout ratio but a less impressive dividend track record.
  - CoStar Group employees, members of the Black Excellence Network and Women's Network, worked alongside Feed More volunteers to facilitate the giveaway.
- source_sentence: WaFd paid out 27% of its profit in dividends last year, indicating a comfortable payout ratio.
  sentences:
  - USP35 knockdown in Hep3B cells inhibits tumor growth and reduces the expression of ABHD17C, p-PI3K, and p-AKT in xenograft HCC models.
  - Nasdaq will suspend trading of CohBar, Inc.'s common stock at the opening of business on November 29, 2023, unless the company requests a hearing before a Nasdaq Hearings Panel to appeal the determination.
  - WaFd's earnings per share have grown at a rate of 9.4% per annum over the past five years, demonstrating consistent growth.
- source_sentence: Scope Control provides a digital ledger of inspected lines, creating a credible line history that underscores Custom Truck One Source's commitment to operational safety.
  sentences:
  - China has implemented measures to address hidden debt, including extending debt maturities, selling assets to repay debts, and replacing short-term local government financial vehicle debts with longer-term, lower-cost refinancing bonds.
  - Scope Control utilizes advanced Computer Vision and Deep Learning technologies to accurately assess line health and categorize it as new, used, or bad based on safety standards and residual break strength.
  - The current management regulations for the national social security fund were approved in December 2001 and have been implemented for over 20 years. The MOF stated that parts of the content no longer address the current needs of the Chinese financial market and the investment trend for the national social security fund, necessitating a systematic and thorough revision.
model-index:
- name: VANTIGE_NEWS_v3_EDGE_DETECTION
  results:
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 1024
      type: dim_1024
    metrics:
    - type: cosine_accuracy@1
      value: 0.828
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.978
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.986
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.992
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.828
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.32599999999999996
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.19720000000000001
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.0992
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.828
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.978
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.986
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.992
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.9261911001883877
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.9034555555555557
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.9038902618135377
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 768
      type: dim_768
    metrics:
    - type: cosine_accuracy@1
      value: 0.83
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.978
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.986
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.99
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.83
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.32599999999999996
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.1972
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.099
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.83
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.978
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.986
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.99
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.9264556449878328
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.9044190476190478
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.9049635033323674
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 512
      type: dim_512
    metrics:
    - type: cosine_accuracy@1
      value: 0.83
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.978
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.988
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.99
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.83
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.32599999999999996
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.19760000000000003
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.099
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.83
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.978
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.988
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.99
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.9262131769268145
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.9041
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.9046338347982871
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 256
      type: dim_256
    metrics:
    - type: cosine_accuracy@1
      value: 0.828
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.978
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.984
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.99
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.828
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.32599999999999996
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.1968
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.099
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.828
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.978
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.984
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.99
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.9250967573273415
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.90265
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.9031974089635855
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 128
      type: dim_128
    metrics:
    - type: cosine_accuracy@1
      value: 0.832
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.978
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.986
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.992
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.832
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.32599999999999996
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.19720000000000001
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.0992
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.832
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.978
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.986
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.992
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.9276434508354098
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.9054333333333333
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.9058527890466532
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 64
      type: dim_64
    metrics:
    - type: cosine_accuracy@1
      value: 0.822
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.978
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.986
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.99
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.822
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.32599999999999996
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.19720000000000001
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.099
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.822
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.978
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.986
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.99
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.9224148281915946
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.8989999999999999
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.8995256769374417
      name: Cosine Map@100
---

# VANTIGE_NEWS_v3_EDGE_DETECTION

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-large-en-v1.5](https://huggingface.co/BAAI/bge-large-en-v1.5). It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description

- **Model Type:** Sentence Transformer
- **Base model:** [BAAI/bge-large-en-v1.5](https://huggingface.co/BAAI/bge-large-en-v1.5)
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 1024 dimensions
- **Similarity Function:** Cosine Similarity
- **Language:** en
- **License:** other

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
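For illustration, the three modules above (Transformer, CLS-token pooling, L2 normalization) correspond roughly to the following manual encoding with the plain `transformers` API. This is a sketch, not part of the original card; it assumes the Hub repo exposes the underlying BertModel weights at the repo root, as sentence-transformers repos normally do:

```python
import torch
from transformers import AutoModel, AutoTokenizer

repo = "dustyatx/news_v3_graph_edges_embeddings_setence_paragraph"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModel.from_pretrained(repo)

batch = tokenizer(["An example sentence."], padding=True, truncation=True,
                  max_length=512, return_tensors="pt")  # max_seq_length: 512
with torch.no_grad():
    token_embeddings = model(**batch).last_hidden_state  # (batch, seq, 1024)
cls_embedding = token_embeddings[:, 0]                   # pooling_mode_cls_token
embedding = torch.nn.functional.normalize(cls_embedding, p=2, dim=1)  # Normalize()
```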
## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference:

```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("dustyatx/news_v3_graph_edges_embeddings_setence_paragraph")
# Run inference
sentences = [
    "Scope Control provides a digital ledger of inspected lines, creating a credible line history that underscores Custom Truck One Source's commitment to operational safety.",
    'Scope Control utilizes advanced Computer Vision and Deep Learning technologies to accurately assess line health and categorize it as new, used, or bad based on safety standards and residual break strength.',
    'China has implemented measures to address hidden debt, including extending debt maturities, selling assets to repay debts, and replacing short-term local government financial vehicle debts with longer-term, lower-cost refinancing bonds.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
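Because the model was trained with MatryoshkaLoss at dimensions 1024/768/512/256/128/64 (see Training Details below), embeddings can also be truncated to a lower dimension with only a small quality drop, as the evaluation tables below show. A minimal sketch using the `truncate_dim` argument available in recent sentence-transformers releases:

```python
from sentence_transformers import SentenceTransformer

# Keep only the first 256 Matryoshka dimensions of each embedding.
model = SentenceTransformer(
    "dustyatx/news_v3_graph_edges_embeddings_setence_paragraph",
    truncate_dim=256,
)
embeddings = model.encode(["An example sentence."])
print(embeddings.shape)
# (1, 256)
```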
## Evaluation

### Metrics

#### Information Retrieval

* Dataset: `dim_1024`
* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.828      |
| cosine_accuracy@3   | 0.978      |
| cosine_accuracy@5   | 0.986      |
| cosine_accuracy@10  | 0.992      |
| cosine_precision@1  | 0.828      |
| cosine_precision@3  | 0.326      |
| cosine_precision@5  | 0.1972     |
| cosine_precision@10 | 0.0992     |
| cosine_recall@1     | 0.828      |
| cosine_recall@3     | 0.978      |
| cosine_recall@5     | 0.986      |
| cosine_recall@10    | 0.992      |
| cosine_ndcg@10      | 0.9262     |
| cosine_mrr@10       | 0.9035     |
| **cosine_map@100**  | **0.9039** |

#### Information Retrieval

* Dataset: `dim_768`
* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value     |
|:--------------------|:----------|
| cosine_accuracy@1   | 0.83      |
| cosine_accuracy@3   | 0.978     |
| cosine_accuracy@5   | 0.986     |
| cosine_accuracy@10  | 0.99      |
| cosine_precision@1  | 0.83      |
| cosine_precision@3  | 0.326     |
| cosine_precision@5  | 0.1972    |
| cosine_precision@10 | 0.099     |
| cosine_recall@1     | 0.83      |
| cosine_recall@3     | 0.978     |
| cosine_recall@5     | 0.986     |
| cosine_recall@10    | 0.99      |
| cosine_ndcg@10      | 0.9265    |
| cosine_mrr@10       | 0.9044    |
| **cosine_map@100**  | **0.905** |

#### Information Retrieval

* Dataset: `dim_512`
* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.83       |
| cosine_accuracy@3   | 0.978      |
| cosine_accuracy@5   | 0.988      |
| cosine_accuracy@10  | 0.99       |
| cosine_precision@1  | 0.83       |
| cosine_precision@3  | 0.326      |
| cosine_precision@5  | 0.1976     |
| cosine_precision@10 | 0.099      |
| cosine_recall@1     | 0.83       |
| cosine_recall@3     | 0.978      |
| cosine_recall@5     | 0.988      |
| cosine_recall@10    | 0.99       |
| cosine_ndcg@10      | 0.9262     |
| cosine_mrr@10       | 0.9041     |
| **cosine_map@100**  | **0.9046** |

#### Information Retrieval

* Dataset: `dim_256`
* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.828      |
| cosine_accuracy@3   | 0.978      |
| cosine_accuracy@5   | 0.984      |
| cosine_accuracy@10  | 0.99       |
| cosine_precision@1  | 0.828      |
| cosine_precision@3  | 0.326      |
| cosine_precision@5  | 0.1968     |
| cosine_precision@10 | 0.099      |
| cosine_recall@1     | 0.828      |
| cosine_recall@3     | 0.978      |
| cosine_recall@5     | 0.984      |
| cosine_recall@10    | 0.99       |
| cosine_ndcg@10      | 0.9251     |
| cosine_mrr@10       | 0.9026     |
| **cosine_map@100**  | **0.9032** |

#### Information Retrieval

* Dataset: `dim_128`
* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.832      |
| cosine_accuracy@3   | 0.978      |
| cosine_accuracy@5   | 0.986      |
| cosine_accuracy@10  | 0.992      |
| cosine_precision@1  | 0.832      |
| cosine_precision@3  | 0.326      |
| cosine_precision@5  | 0.1972     |
| cosine_precision@10 | 0.0992     |
| cosine_recall@1     | 0.832      |
| cosine_recall@3     | 0.978      |
| cosine_recall@5     | 0.986      |
| cosine_recall@10    | 0.992      |
| cosine_ndcg@10      | 0.9276     |
| cosine_mrr@10       | 0.9054     |
| **cosine_map@100**  | **0.9059** |

#### Information Retrieval

* Dataset: `dim_64`
* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.822      |
| cosine_accuracy@3   | 0.978      |
| cosine_accuracy@5   | 0.986      |
| cosine_accuracy@10  | 0.99       |
| cosine_precision@1  | 0.822      |
| cosine_precision@3  | 0.326      |
| cosine_precision@5  | 0.1972     |
| cosine_precision@10 | 0.099      |
| cosine_recall@1     | 0.822      |
| cosine_recall@3     | 0.978      |
| cosine_recall@5     | 0.986      |
| cosine_recall@10    | 0.99       |
| cosine_ndcg@10      | 0.9224     |
| cosine_mrr@10       | 0.899      |
| **cosine_map@100**  | **0.8995** |
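To run a comparable evaluation on your own data, a minimal sketch with the same evaluator follows; the queries, corpus, and relevance judgments here are illustrative placeholders, not the card's actual eval split:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("dustyatx/news_v3_graph_edges_embeddings_setence_paragraph")

# Toy setup: ids -> texts; each query maps to the ids of its relevant documents.
queries = {"q1": "WaFd paid out 27% of its profit in dividends last year."}
corpus = {
    "d1": "WaFd's earnings per share have grown at 9.4% per annum over five years.",
    "d2": "Russia aims to export over 65 million tons of grain this season.",
}
relevant_docs = {"q1": {"d1"}}

evaluator = InformationRetrievalEvaluator(queries, corpus, relevant_docs, name="demo")
results = evaluator(model)  # dict of accuracy@k / precision@k / recall@k / NDCG / MRR / MAP
print(results)
```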
## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 104,022 training samples
* Columns: `anchor` and `positive`
* Approximate statistics based on the first 1000 samples:
  |         | anchor | positive |
  |:--------|:-------|:---------|
  | type    | string | string   |
  | details |        |          |
* Samples:
  | anchor | positive |
  |:-------|:---------|
  | The general public, including retail investors, collectively own 11% of FINEOS Corporation Holdings' shares, representing a minority stake in the company. | Private companies, with their 50% ownership stake, have substantial influence over FINEOS Corporation Holdings' management and governance decisions. |
  | A study by the Insurance Institute for Highway Safety (IIHS) found that SUVs and vans with hood heights exceeding 40 inches are approximately 45% more likely to cause pedestrian fatalities compared to vehicles with hood heights of 30 inches or less and a sloping profile. | Vehicles with front ends exceeding 35 inches in height, particularly those lacking a sloping profile, are more likely to cause severe head, torso, and hip injuries to pedestrians. |
  | SpringWorks Therapeutics has a portfolio of small molecule targeted oncology product candidates and is conducting clinical trials for rare tumor types and genetically defined cancers. | SpringWorks Therapeutics operates in the biopharmaceutical industry, specializing in precision medicine for underserved patient populations. |
* Loss: [MatryoshkaLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
  ```json
  {
      "loss": "MultipleNegativesRankingLoss",
      "matryoshka_dims": [1024, 768, 512, 256, 128, 64],
      "matryoshka_weights": [1, 1, 1, 1, 1, 1],
      "n_dims_per_step": -1
  }
  ```
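For reference, this loss configuration corresponds roughly to the following sentence-transformers setup. This is a hedged sketch, not the author's published training script; `train_dataset` is a stand-in for the real 104,022 anchor/positive pairs:

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss

model = SentenceTransformer("BAAI/bge-large-en-v1.5")

# Placeholder for the real dataset with "anchor"/"positive" columns.
train_dataset = Dataset.from_dict({
    "anchor": ["query text"],
    "positive": ["matching passage"],
})

inner_loss = MultipleNegativesRankingLoss(model)  # in-batch negatives
loss = MatryoshkaLoss(model, inner_loss,
                      matryoshka_dims=[1024, 768, 512, 256, 128, 64])

trainer = SentenceTransformerTrainer(model=model, train_dataset=train_dataset, loss=loss)
trainer.train()
```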
### Training Hyperparameters

#### Non-Default Hyperparameters

- `eval_strategy`: steps
- `per_device_train_batch_size`: 30
- `per_device_eval_batch_size`: 20
- `gradient_accumulation_steps`: 8
- `learning_rate`: 3e-05
- `num_train_epochs`: 2
- `lr_scheduler_type`: cosine
- `warmup_ratio`: 0.2
- `bf16`: True
- `tf32`: True
- `dataloader_num_workers`: 30
- `load_best_model_at_end`: True
- `optim`: adamw_torch_fused
- `batch_sampler`: no_duplicates

#### All Hyperparameters

<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 30
- `per_device_eval_batch_size`: 20
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 8
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 3e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 2
- `max_steps`: -1
- `lr_scheduler_type`: cosine
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.2
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: True
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: True
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 30
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: True
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch_fused
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `eval_use_gather_object`: False
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional

</details>
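The non-default values above map onto `SentenceTransformerTrainingArguments` roughly as follows (a sketch; `output_dir` is an assumption, since it is not stated in the card):

```python
from sentence_transformers.training_args import (
    BatchSamplers,
    SentenceTransformerTrainingArguments,
)

args = SentenceTransformerTrainingArguments(
    output_dir="outputs",  # assumption: not recorded in this card
    eval_strategy="steps",
    per_device_train_batch_size=30,
    per_device_eval_batch_size=20,
    gradient_accumulation_steps=8,
    learning_rate=3e-5,
    num_train_epochs=2,
    lr_scheduler_type="cosine",
    warmup_ratio=0.2,
    bf16=True,
    tf32=True,
    dataloader_num_workers=30,
    load_best_model_at_end=True,
    optim="adamw_torch_fused",
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)
```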
### Training Logs
<details><summary>Click to expand</summary>

| Epoch | Step | Training Loss | dim_1024_cosine_map@100 | dim_128_cosine_map@100 | dim_256_cosine_map@100 | dim_512_cosine_map@100 | dim_64_cosine_map@100 | dim_768_cosine_map@100 |
|:------:|:----:|:-------------:|:-----------------------:|:----------------------:|:----------------------:|:----------------------:|:---------------------:|:----------------------:|
| 0.0023 | 1 | 1.8313 | - | - | - | - | - | - |
| 0.0046 | 2 | 1.9678 | - | - | - | - | - | - |
| 0.0069 | 3 | 0.8038 | - | - | - | - | - | - |
| 0.0092 | 4 | 0.7993 | - | - | - | - | - | - |
| 0.0115 | 5 | 0.7926 | - | - | - | - | - | - |
| 0.0138 | 6 | 0.9348 | - | - | - | - | - | - |
| 0.0161 | 7 | 0.8707 | - | - | - | - | - | - |
| 0.0185 | 8 | 0.7293 | - | - | - | - | - | - |
| 0.0208 | 9 | 0.6618 | - | - | - | - | - | - |
| 0.0231 | 10 | 0.846 | - | - | - | - | - | - |
| 0.0254 | 11 | 0.6836 | - | - | - | - | - | - |
| 0.0277 | 12 | 0.7034 | - | - | - | - | - | - |
| 0.0300 | 13 | 0.7987 | - | - | - | - | - | - |
| 0.0323 | 14 | 0.6443 | - | - | - | - | - | - |
| 0.0346 | 15 | 0.5975 | - | - | - | - | - | - |
| 0.0369 | 16 | 0.4471 | - | - | - | - | - | - |
| 0.0392 | 17 | 0.4739 | - | - | - | - | - | - |
| 0.0415 | 18 | 0.4136 | - | - | - | - | - | - |
| 0.0438 | 19 | 0.3865 | - | - | - | - | - | - |
| 0.0461 | 20 | 0.3421 | - | - | - | - | - | - |
| 0.0484 | 21 | 0.5076 | - | - | - | - | - | - |
| 0.0507 | 22 | 0.1878 | - | - | - | - | - | - |
| 0.0531 | 23 | 0.3597 | - | - | - | - | - | - |
| 0.0554 | 24 | 0.23 | - | - | - | - | - | - |
| 0.0577 | 25 | 0.1331 | - | - | - | - | - | - |
| 0.0600 | 26 | 0.1793 | - | - | - | - | - | - |
| 0.0623 | 27 | 0.1309 | - | - | - | - | - | - |
| 0.0646 | 28 | 0.1077 | - | - | - | - | - | - |
| 0.0669 | 29 | 0.1681 | - | - | - | - | - | - |
| 0.0692 | 30 | 0.055 | - | - | - | - | - | - |
| 0.0715 | 31 | 0.1062 | - | - | - | - | - | - |
| 0.0738 | 32 | 0.0672 | - | - | - | - | - | - |
| 0.0761 | 33 | 0.067 | - | - | - | - | - | - |
| 0.0784 | 34 | 0.0953 | - | - | - | - | - | - |
| 0.0807 | 35 | 0.0602 | - | - | - | - | - | - |
| 0.0830 | 36 | 0.1312 | - | - | - | - | - | - |
| 0.0854 | 37 | 0.0356 | - | - | - | - | - | - |
| 0.0877 | 38 | 0.0707 | - | - | - | - | - | - |
| 0.0900 | 39 | 0.1525 | - | - | - | - | - | - |
| 0.0923 | 40 | 0.0362 | - | - | - | - | - | - |
| 0.0946 | 41 | 0.253 | - | - | - | - | - | - |
| 0.0969 | 42 | 0.0572 | - | - | - | - | - | - |
| 0.0992 | 43 | 0.1031 | - | - | - | - | - | - |
| 0.1015 | 44 | 0.1023 | - | - | - | - | - | - |
| 0.1038 | 45 | 0.052 | - | - | - | - | - | - |
| 0.1061 | 46 | 0.0614 | - | - | - | - | - | - |
| 0.1084 | 47 | 0.1256 | - | - | - | - | - | - |
| 0.1107 | 48 | 0.1624 | - | - | - | - | - | - |
| 0.1130 | 49 | 0.0363 | - | - | - | - | - | - |
| 0.1153 | 50 | 0.2001 | 0.8949 | 0.8940 | 0.8947 | 0.8950 | 0.8864 | 0.8972 |
| 0.1176 | 51 | 0.0846 | - | - | - | - | - | - |
| 0.1200 | 52 | 0.0338 | - | - | - | - | - | - |
| 0.1223 | 53 | 0.0648 | - | - | - | - | - | - |
| 0.1246 | 54 | 0.1232 | - | - | - | - | - | - |
| 0.1269 | 55 | 0.0318 | - | - | - | - | - | - |
| 0.1292 | 56 | 0.1148 | - | - | - | - | - | - |
| 0.1315 | 57 | 0.0826 | - | - | - | - | - | - |
| 0.1338 | 58 | 0.034 | - | - | - | - | - | - |
| 0.1361 | 59 | 0.0492 | - | - | - | - | - | - |
| 0.1384 | 60 | 0.0427 | - | - | - | - | - | - |
| 0.1407 | 61 | 0.0709 | - | - | - | - | - | - |
| 0.1430 | 62 | 0.0494 | - | - | - | - | - | - |
| 0.1453 | 63 | 0.0554 | - | - | - | - | - | - |
| 0.1476 | 64 | 0.061 | - | - | - | - | - | - |
| 0.1499 | 65 | 0.1155 | - | - | - | - | - | - |
| 0.1522 | 66 | 0.0419 | - | - | - | - | - | - |
| 0.1546 | 67 | 0.0185 | - | - | - | - | - | - |
| 0.1569 | 68 | 0.0559 | - | - | - | - | - | - |
| 0.1592 | 69 | 0.0219 | - | - | - | - | - | - |
| 0.1615 | 70 | 0.0302 | - | - | - | - | - | - |
| 0.1638 | 71 | 0.0322 | - | - | - | - | - | - |
| 0.1661 | 72 | 0.0604 | - | - | - | - | - | - |
| 0.1684 | 73 | 0.038 | - | - | - | - | - | - |
| 0.1707 | 74 | 0.0971 | - | - | - | - | - | - |
| 0.1730 | 75 | 0.0384 | - | - | - | - | - | - |
| 0.1753 | 76 | 0.0887 | - | - | - | - | - | - |
| 0.1776 | 77 | 0.0495 | - | - | - | - | - | - |
| 0.1799 | 78 | 0.0203 | - | - | - | - | - | - |
| 0.1822 | 79 | 0.0669 | - | - | - | - | - | - |
| 0.1845 | 80 | 0.0319 | - | - | - | - | - | - |
| 0.1869 | 81 | 0.0177 | - | - | - | - | - | - |
| 0.1892 | 82 | 0.0303 | - | - | - | - | - | - |
| 0.1915 | 83 | 0.037 | - | - | - | - | - | - |
| 0.1938 | 84 | 0.0122 | - | - | - | - | - | - |
| 0.1961 | 85 | 0.0377 | - | - | - | - | - | - |
| 0.1984 | 86 | 0.0578 | - | - | - | - | - | - |
| 0.2007 | 87 | 0.0347 | - | - | - | - | - | - |
| 0.2030 | 88 | 0.1288 | - | - | - | - | - | - |
| 0.2053 | 89 | 0.0964 | - | - | - | - | - | - |
| 0.2076 | 90 | 0.0172 | - | - | - | - | - | - |
| 0.2099 | 91 | 0.0726 | - | - | - | - | - | - |
| 0.2122 | 92 | 0.0225 | - | - | - | - | - | - |
| 0.2145 | 93 | 0.1011 | - | - | - | - | - | - |
| 0.2168 | 94 | 0.0248 | - | - | - | - | - | - |
| 0.2191 | 95 | 0.0431 | - | - | - | - | - | - |
| 0.2215 | 96 | 0.0243 | - | - | - | - | - | - |
| 0.2238 | 97 | 0.0221 | - | - | - | - | - | - |
| 0.2261 | 98 | 0.0529 | - | - | - | - | - | - |
| 0.2284 | 99 | 0.0459 | - | - | - | - | - | - |
| 0.2307 | 100 | 0.0869 | 0.9026 | 0.8967 | 0.8950 | 0.9003 | 0.8915 | 0.9009 |
| 0.2330 | 101 | 0.0685 | - | - | - | - | - | - |
| 0.2353 | 102 | 0.0801 | - | - | - | - | - | - |
| 0.2376 | 103 | 0.025 | - | - | - | - | - | - |
| 0.2399 | 104 | 0.0556 | - | - | - | - | - | - |
| 0.2422 | 105 | 0.0146 | - | - | - | - | - | - |
| 0.2445 | 106 | 0.0335 | - | - | - | - | - | - |
| 0.2468 | 107 | 0.0441 | - | - | - | - | - | - |
| 0.2491 | 108 | 0.0187 | - | - | - | - | - | - |
| 0.2514 | 109 | 0.1027 | - | - | - | - | - | - |
| 0.2537 | 110 | 0.0189 | - | - | - | - | - | - |
| 0.2561 | 111 | 0.1262 | - | - | - | - | - | - |
| 0.2584 | 112 | 0.1193 | - | - | - | - | - | - |
| 0.2607 | 113 | 0.0285 | - | - | - | - | - | - |
| 0.2630 | 114 | 0.0226 | - | - | - | - | - | - |
| 0.2653 | 115 | 0.1209 | - | - | - | - | - | - |
| 0.2676 | 116 | 0.0765 | - | - | - | - | - | - |
| 0.2699 | 117 | 0.1405 | - | - | - | - | - | - |
| 0.2722 | 118 | 0.0629 | - | - | - | - | - | - |
| 0.2745 | 119 | 0.0413 | - | - | - | - | - | - |
| 0.2768 | 120 | 0.0572 | - | - | - | - | - | - |
| 0.2791 | 121 | 0.0192 | - | - | - | - | - | - |
| 0.2814 | 122 | 0.0949 | - | - | - | - | - | - |
| 0.2837 | 123 | 0.0398 | - | - | - | - | - | - |
| 0.2860 | 124 | 0.0596 | - | - | - | - | - | - |
| 0.2884 | 125 | 0.0243 | - | - | - | - | - | - |
| 0.2907 | 126 | 0.0636 | - | - | - | - | - | - |
| 0.2930 | 127 | 0.0367 | - | - | - | - | - | - |
| 0.2953 | 128 | 0.0542 | - | - | - | - | - | - |
| 0.2976 | 129 | 0.0149 | - | - | - | - | - | - |
| 0.2999 | 130 | 0.097 | - | - | - | - | - | - |
| 0.3022 | 131 | 0.0213 | - | - | - | - | - | - |
| 0.3045 | 132 | 0.027 | - | - | - | - | - | - |
| 0.3068 | 133 | 0.0577 | - | - | - | - | - | - |
| 0.3091 | 134 | 0.0143 | - | - | - | - | - | - |
| 0.3114 | 135 | 0.0285 | - | - | - | - | - | - |
| 0.3137 | 136 | 0.033 | - | - | - | - | - | - |
| 0.3160 | 137 | 0.0412 | - | - | - | - | - | - |
| 0.3183 | 138 | 0.0125 | - | - | - | - | - | - |
| 0.3206 | 139 | 0.0512 | - | - | - | - | - | - |
| 0.3230 | 140 | 0.0189 | - | - | - | - | - | - |
| 0.3253 | 141 | 0.124 | - | - | - | - | - | - |
| 0.3276 | 142 | 0.0118 | - | - | - | - | - | - |
| 0.3299 | 143 | 0.017 | - | - | - | - | - | - |
| 0.3322 | 144 | 0.025 | - | - | - | - | - | - |
| 0.3345 | 145 | 0.0187 | - | - | - | - | - | - |
| 0.3368 | 146 | 0.0141 | - | - | - | - | - | - |
| 0.3391 | 147 | 0.0325 | - | - | - | - | - | - |
| 0.3414 | 148 | 0.0582 | - | - | - | - | - | - |
| 0.3437 | 149 | 0.0611 | - | - | - | - | - | - |
| 0.3460 | 150 | 0.0261 | 0.9047 | 0.8995 | 0.9003 | 0.9022 | 0.8998 | 0.9032 |
| 0.3483 | 151 | 0.014 | - | - | - | - | - | - |
| 0.3506 | 152 | 0.0077 | - | - | - | - | - | - |
| 0.3529 | 153 | 0.022 | - | - | - | - | - | - |
| 0.3552 | 154 | 0.0328 | - | - | - | - | - | - |
| 0.3576 | 155 | 0.0124 | - | - | - | - | - | - |
| 0.3599 | 156 | 0.0103 | - | - | - | - | - | - |
| 0.3622 | 157 | 0.0607 | - | - | - | - | - | - |
| 0.3645 | 158 | 0.0121 | - | - | - | - | - | - |
| 0.3668 | 159 | 0.0761 | - | - | - | - | - | - |
| 0.3691 | 160 | 0.0981 | - | - | - | - | - | - |
| 0.3714 | 161 | 0.1071 | - | - | - | - | - | - |
| 0.3737 | 162 | 0.1307 | - | - | - | - | - | - |
| 0.3760 | 163 | 0.0524 | - | - | - | - | - | - |
| 0.3783 | 164 | 0.0726 | - | - | - | - | - | - |
| 0.3806 | 165 | 0.0636 | - | - | - | - | - | - |
| 0.3829 | 166 | 0.0428 | - | - | - | - | - | - |
| 0.3852 | 167 | 0.0111 | - | - | - | - | - | - |
| 0.3875 | 168 | 0.0542 | - | - | - | - | - | - |
| 0.3899 | 169 | 0.0193 | - | - | - | - | - | - |
| 0.3922 | 170 | 0.0095 | - | - | - | - | - | - |
| 0.3945 | 171 | 0.0464 | - | - | - | - | - | - |
| 0.3968 | 172 | 0.0167 | - | - | - | - | - | - |
| 0.3991 | 173 | 0.0209 | - | - | - | - | - | - |
| 0.4014 | 174 | 0.0359 | - | - | - | - | - | - |
| 0.4037 | 175 | 0.071 | - | - | - | - | - | - |
| 0.4060 | 176 | 0.0189 | - | - | - | - | - | - |
| 0.4083 | 177 | 0.0448 | - | - | - | - | - | - |
| 0.4106 | 178 | 0.0161 | - | - | - | - | - | - |
| 0.4129 | 179 | 0.0427 | - | - | - | - | - | - |
| 0.4152 | 180 | 0.0229 | - | - | - | - | - | - |
| 0.4175 | 181 | 0.0274 | - | - | - | - | - | - |
| 0.4198 | 182 | 0.0173 | - | - | - | - | - | - |
| 0.4221 | 183 | 0.0123 | - | - | - | - | - | - |
| 0.4245 | 184 | 0.0395 | - | - | - | - | - | - |
| 0.4268 | 185 | 0.015 | - | - | - | - | - | - |
| 0.4291 | 186 | 0.0168 | - | - | - | - | - | - |
| 0.4314 | 187 | 0.0165 | - | - | - | - | - | - |
| 0.4337 | 188 | 0.0412 | - | - | - | - | - | - |
| 0.4360 | 189 | 0.0961 | - | - | - | - | - | - |
| 0.4383 | 190 | 0.0551 | - | - | - | - | - | - |
| 0.4406 | 191 | 0.0685 | - | - | - | - | - | - |
| 0.4429 | 192 | 0.1561 | - | - | - | - | - | - |
| 0.4452 | 193 | 0.0333 | - | - | - | - | - | - |
| 0.4475 | 194 | 0.0567 | - | - | - | - | - | - |
| 0.4498 | 195 | 0.0081 | - | - | - | - | - | - |
| 0.4521 | 196 | 0.0297 | - | - | - | - | - | - |
| 0.4544 | 197 | 0.0131 | - | - | - | - | - | - |
| 0.4567 | 198 | 0.0322 | - | - | - | - | - | - |
| 0.4591 | 199 | 0.0224 | - | - | - | - | - | - |
| 0.4614 | 200 | 0.0068 | 0.8989 | 0.8941 | 0.8983 | 0.8985 | 0.8975 | 0.9002 |
| 0.4637 | 201 | 0.0115 | - | - | - | - | - | - |
| 0.4660 | 202 | 0.0098 | - | - | - | - | - | - |
| 0.4683 | 203 | 0.101 | - | - | - | - | - | - |
| 0.4706 | 204 | 0.0282 | - | - | - | - | - | - |
| 0.4729 | 205 | 0.0721 | - | - | - | - | - | - |
| 0.4752 | 206 | 0.0123 | - | - | - | - | - | - |
| 0.4775 | 207 | 0.1014 | - | - | - | - | - | - |
| 0.4798 | 208 | 0.0257 | - | - | - | - | - | - |
| 0.4821 | 209 | 0.1126 | - | - | - | - | - | - |
| 0.4844 | 210 | 0.0586 | - | - | - | - | - | - |
| 0.4867 | 211 | 0.0307 | - | - | - | - | - | - |
| 0.4890 | 212 | 0.0226 | - | - | - | - | - | - |
| 0.4913 | 213 | 0.0471 | - | - | - | - | - | - |
| 0.4937 | 214 | 0.025 | - | - | - | - | - | - |
| 0.4960 | 215 | 0.0799 | - | - | - | - | - | - |
| 0.4983 | 216 | 0.0173 | - | - | - | - | - | - |
| 0.5006 | 217 | 0.0208 | - | - | - | - | - | - |
| 0.5029 | 218 | 0.0461 | - | - | - | - | - | - |
| 0.5052 | 219 | 0.0592 | - | - | - | - | - | - |
| 0.5075 | 220 | 0.0076 | - | - | - | - | - | - |
| 0.5098 | 221 | 0.0156 | - | - | - | - | - | - |
| 0.5121 | 222 | 0.0149 | - | - | - | - | - | - |
| 0.5144 | 223 | 0.0138 | - | - | - | - | - | - |
| 0.5167 | 224 | 0.0526 | - | - | - | - | - | - |
| 0.5190 | 225 | 0.0689 | - | - | - | - | - | - |
| 0.5213 | 226 | 0.0191 | - | - | - | - | - | - |
| 0.5236 | 227 | 0.0094 | - | - | - | - | - | - |
| 0.5260 | 228 | 0.0125 | - | - | - | - | - | - |
| 0.5283 | 229 | 0.0632 | - | - | - | - | - | - |
| 0.5306 | 230 | 0.0773 | - | - | - | - | - | - |
| 0.5329 | 231 | 0.0147 | - | - | - | - | - | - |
| 0.5352 | 232 | 0.0145 | - | - | - | - | - | - |
| 0.5375 | 233 | 0.0068 | - | - | - | - | - | - |
| 0.5398 | 234 | 0.0673 | - | - | - | - | - | - |
| 0.5421 | 235 | 0.0131 | - | - | - | - | - | - |
| 0.5444 | 236 | 0.0217 | - | - | - | - | - | - |
| 0.5467 | 237 | 0.0126 | - | - | - | - | - | - |
| 0.5490 | 238 | 0.0172 | - | - | - | - | - | - |
| 0.5513 | 239 | 0.0122 | - | - | - | - | - | - |
| 0.5536 | 240 | 0.0175 | - | - | - | - | - | - |
| 0.5559 | 241 | 0.0184 | - | - | - | - | - | - |
| 0.5582 | 242 | 0.0422 | - | - | - | - | - | - |
| 0.5606 | 243 | 0.0106 | - | - | - | - | - | - |
| 0.5629 | 244 | 0.071 | - | - | - | - | - | - |
| 0.5652 | 245 | 0.0089 | - | - | - | - | - | - |
| 0.5675 | 246 | 0.0099 | - | - | - | - | - | - |
| 0.5698 | 247 | 0.0133 | - | - | - | - | - | - |
| 0.5721 | 248 | 0.0627 | - | - | - | - | - | - |
| 0.5744 | 249 | 0.0248 | - | - | - | - | - | - |
| 0.5767 | 250 | 0.0349 | 0.8970 | 0.8968 | 0.8961 | 0.8961 | 0.8952 | 0.8963 |
| 0.5790 | 251 | 0.0145 | - | - | - | - | - | - |
| 0.5813 | 252 | 0.0052 | - | - | - | - | - | - |
| 0.5836 | 253 | 0.0198 | - | - | - | - | - | - |
| 0.5859 | 254 | 0.0065 | - | - | - | - | - | - |
| 0.5882 | 255 | 0.007 | - | - | - | - | - | - |
| 0.5905 | 256 | 0.0072 | - | - | - | - | - | - |
| 0.5928 | 257 | 0.1878 | - | - | - | - | - | - |
| 0.5952 | 258 | 0.0091 | - | - | - | - | - | - |
| 0.5975 | 259 | 0.0421 | - | - | - | - | - | - |
| 0.5998 | 260 | 0.0166 | - | - | - | - | - | - |
| 0.6021 | 261 | 0.0909 | - | - | - | - | - | - |
| 0.6044 | 262 | 0.0107 | - | - | - | - | - | - |
| 0.6067 | 263 | 0.0191 | - | - | - | - | - | - |
| 0.6090 | 264 | 0.0168 | - | - | - | - | - | - |
| 0.6113 | 265 | 0.0814 | - | - | - | - | - | - |
| 0.6136 | 266 | 0.0736 | - | - | - | - | - | - |
| 0.6159 | 267 | 0.0297 | - | - | - | - | - | - |
| 0.6182 | 268 | 0.016 | - | - | - | - | - | - |
| 0.6205 | 269 | 0.0201 | - | - | - | - | - | - |
| 0.6228 | 270 | 0.0111 | - | - | - | - | - | - |
| 0.6251 | 271 | 0.0164 | - | - | - | - | - | - |
| 0.6275 | 272 | 0.0106 | - | - | - | - | - | - |
| 0.6298 | 273 | 0.0287 | - | - | - | - | - | - |
| 0.6321 | 274 | 0.0595 | - | - | - | - | - | - |
| 0.6344 | 275 | 0.0446 | - | - | - | - | - | - |
| 0.6367 | 276 | 0.0203 | - | - | - | - | - | - |
| 0.6390 | 277 | 0.0079 | - | - | - | - | - | - |
| 0.6413 | 278 | 0.0345 | - | - | - | - | - | - |
| 0.6436 | 279 | 0.0461 | - | - | - | - | - | - |
| 0.6459 | 280 | 0.0803 | - | - | - | - | - | - |
| 0.6482 | 281 | 0.0218 | - | - | - | - | - | - |
| 0.6505 | 282 | 0.0288 | - | - | - | - | - | - |
| 0.6528 | 283 | 0.0745 | - | - | - | - | - | - |
| 0.6551 | 284 | 0.0102 | - | - | - | - | - | - |
| 0.6574 | 285 | 0.0626 | - | - | - | - | - | - |
| 0.6597 | 286 | 0.0606 | - | - | - | - | - | - |
| 0.6621 | 287 | 0.0319 | - | - | - | - | - | - |
| 0.6644 | 288 | 0.0303 | - | - | - | - | - | - |
| 0.6667 | 289 | 0.0216 | - | - | - | - | - | - |
| 0.6690 | 290 | 0.0417 | - | - | - | - | - | - |
| 0.6713 | 291 | 0.0061 | - | - | - | - | - | - |
| 0.6736 | 292 | 0.0386 | - | - | - | - | - | - |
| 0.6759 | 293 | 0.0117 | - | - | - | - | - | - |
| 0.6782 | 294 | 0.0283 | - | - | - | - | - | - |
| 0.6805 | 295 | 0.013 | - | - | - | - | - | - |
| 0.6828 | 296 | 0.1237 | - | - | - | - | - | - |
| 0.6851 | 297 | 0.0878 | - | - | - | - | - | - |
| 0.6874 | 298 | 0.0158 | - | - | - | - | - | - |
| 0.6897 | 299 | 0.0562 | - | - | - | - | - | - |
| 0.6920 | 300 | 0.0871 | 0.9022 | 0.9027 | 0.9074 | 0.9055 | 0.8990 | 0.9027 |
| 0.6943 | 301 | 0.0657 | - | - | - | - | - | - |
| 0.6967 | 302 | 0.0239 | - | - | - | - | - | - |
| 0.6990 | 303 | 0.0053 | - | - | - | - | - | - |
| 0.7013 | 304 | 0.0237 | - | - | - | - | - | - |
| 0.7036 | 305 | 0.0182 | - | - | - | - | - | - |
| 0.7059 | 306 | 0.0135 | - | - | - | - | - | - |
| 0.7082 | 307 | 0.0059 | - | - | - | - | - | - |
| 0.7105 | 308 | 0.0061 | - | - | - | - | - | - |
| 0.7128 | 309 | 0.0072 | - | - | - | - | - | - |
| 0.7151 | 310 | 0.0319 | - | - | - | - | - | - |
| 0.7174 | 311 | 0.1183 | - | - | - | - | - | - |
| 0.7197 | 312 | 0.0447 | - | - | - | - | - | - |
| 0.7220 | 313 | 0.0369 | - | - | - | - | - | - |
| 0.7243 | 314 | 0.0462 | - | - | - | - | - | - |
| 0.7266 | 315 | 0.0233 | - | - | - | - | - | - |
| 0.7290 | 316 | 0.0114 | - | - | - | - | - | - |
| 0.7313 | 317 | 0.0179 | - | - | - | - | - | - |
| 0.7336 | 318 | 0.0203 | - | - | - | - | - | - |
| 0.7359 | 319 | 0.0071 | - | - | - | - | - | - |
| 0.7382 | 320 | 0.1297 | - | - | - | - | - | - |
| 0.7405 | 321 | 0.0249 | - | - | - | - | - | - |
| 0.7428 | 322 | 0.063 | - | - | - | - | - | - |
| 0.7451 | 323 | 0.0479 | - | - | - | - | - | - |
| 0.7474 | 324 | 0.1483 | - | - | - | - | - | - |
| 0.7497 | 325 | 0.0058 | - | - | - | - | - | - |
| 0.7520 | 326 | 0.0191 | - | - | - | - | - | - |
| 0.7543 | 327 | 0.0855 | - | - | - | - | - | - |
| 0.7566 | 328 | 0.0156 | - | - | - | - | - | - |
| 0.7589 | 329 | 0.0147 | - | - | - | - | - | - |
| 0.7612 | 330 | 0.0124 | - | - | - | - | - | - |
| 0.7636 | 331 | 0.0242 | - | - | - | - | - | - |
| 0.7659 | 332 | 0.0433 | - | - | - | - | - | - |
| 0.7682 | 333 | 0.0103 | - | - | - | - | - | - |
| 0.7705 | 334 | 0.0833 | - | - | - | - | - | - |
| 0.7728 | 335 | 0.0082 | - | - | - | - | - | - |
| 0.7751 | 336 | 0.0122 | - | - | - | - | - | - |
| 0.7774 | 337 | 0.031 | - | - | - | - | - | - |
| 0.7797 | 338 | 0.0116 | - | - | - | - | - | - |
| 0.7820 | 339 | 0.0947 | - | - | - | - | - | - |
| 0.7843 | 340 | 0.0323 | - | - | - | - | - | - |
| 0.7866 | 341 | 0.0177 | - | - | - | - | - | - |
| 0.7889 | 342 | 0.0487 | - | - | - | - | - | - |
| 0.7912 | 343 | 0.0123 | - | - | - | - | - | - |
| 0.7935 | 344 | 0.0075 | - | - | - | - | - | - |
| 0.7958 | 345 | 0.0061 | - | - | - | - | - | - |
| 0.7982 | 346 | 0.0057 | - | - | - | - | - | - |
| 0.8005 | 347 | 0.1108 | - | - | - | - | - | - |
| 0.8028 | 348 | 0.0104 | - | - | - | - | - | - |
| 0.8051 | 349 | 0.0131 | - | - | - | - | - | - |
| 0.8074 | 350 | 0.0229 | 0.9053 | 0.9041 | 0.9033 | 0.9066 | 0.8965 | 0.9052 |
| 0.8097 | 351 | 0.0478 | - | - | - | - | - | - |
| 0.8120 | 352 | 0.0127 | - | - | - | - | - | - |
| 0.8143 | 353 | 0.1143 | - | - | - | - | - | - |
| 0.8166 | 354 | 0.0365 | - | - | - | - | - | - |
| 0.8189 | 355 | 0.0418 | - | - | - | - | - | - |
| 0.8212 | 356 | 0.0494 | - | - | - | - | - | - |
| 0.8235 | 357 | 0.0082 | - | - | - | - | - | - |
| 0.8258 | 358 | 0.0212 | - | - | - | - | - | - |
| 0.8281 | 359 | 0.0106 | - | - | - | - | - | - |
| 0.8304 | 360 | 0.1009 | - | - | - | - | - | - |
| 0.8328 | 361 | 0.0316 | - | - | - | - | - | - |
| 0.8351 | 362 | 0.0313 | - | - | - | - | - | - |
| 0.8374 | 363 | 0.0108 | - | - | - | - | - | - |
| 0.8397 | 364 | 0.0198 | - | - | - | - | - | - |
| 0.8420 | 365 | 0.0112 | - | - | - | - | - | - |
| 0.8443 | 366 | 0.0197 | - | - | - | - | - | - |
| 0.8466 | 367 | 0.058 | - | - | - | - | - | - |
| 0.8489 | 368 | 0.0187 | - | - | - | - | - | - |
| 0.8512 | 369 | 0.0196 | - | - | - | - | - | - |
| 0.8535 | 370 | 0.0586 | - | - | - | - | - | - |
| 0.8558 | 371 | 0.0099 | - | - | - | - | - | - |
| 0.8581 | 372 | 0.0248 | - | - | - | - | - | - |
| 0.8604 | 373 | 0.0183 | - | - | - | - | - | - |
| 0.8627 | 374 | 0.0268 | - | - | - | - | - | - |
| 0.8651 | 375 | 0.0154 | - | - | - | - | - | - |
| 0.8674 | 376 | 0.0868 | - | - | - | - | - | - |
| 0.8697 | 377 | 0.0264 | - | - | - | - | - | - |
| 0.8720 | 378 | 0.0639 | - | - | - | - | - | - |
| 0.8743 | 379 | 0.1036 | - | - | - | - | - | - |
| 0.8766 | 380 | 0.0334 | - | - | - | - | - | - |
| 0.8789 | 381 | 0.04 | - | - | - | - | - | - |
| 0.8812 | 382 | 0.0095 | - | - | - | - | - | - |
| 0.8835 | 383 | 0.0371 | - | - | - | - | - | - |
| 0.8858 | 384 | 0.0585 | - | - | - | - | - | - |
| 0.8881 | 385 | 0.0353 | - | - | - | - | - | - |
| 0.8904 | 386 | 0.0095 | - | - | - | - | - | - |
| 0.8927 | 387 | 0.0126 | - | - | - | - | - | - |
| 0.8950 | 388 | 0.0384 | - | - | - | - | - | - |
| 0.8973 | 389 | 0.018 | - | - | - | - | - | - |
| 0.8997 | 390 | 0.057 | - | - | - | - | - | - |
| 0.9020 | 391 | 0.0371 | - | - | - | - | - | - |
| 0.9043 | 392 | 0.0475 | - | - | - | - | - | - |
| 0.9066 | 393 | 0.0972 | - | - | - | - | - | - |
| 0.9089 | 394 | 0.0189 | - | - | - | - | - | - |
| 0.9112 | 395 | 0.0993 | - | - | - | - | - | - |
| 0.9135 | 396 | 0.0527 | - | - | - | - | - | - |
| 0.9158 | 397 | 0.0466 | - | - | - | - | - | - |
| 0.9181 | 398 | 0.0383 | - | - | - | - | - | - |
| 0.9204 | 399 | 0.0322 | - | - | - | - | - | - |
| 0.9227 | 400 | 0.0651 | 0.9077 | 0.9074 | 0.9073 | 0.9077 | 0.9023 | 0.9078 |
| 0.9250 | 401 | 0.0055 | - | - | - | - | - | - |
| 0.9273 | 402 | 0.0083 | - | - | - | - | - | - |
| 0.9296 | 403 | 0.0062 | - | - | - | - | - | - |
| 0.9319 | 404 | 0.0085 | - | - | - | - | - | - |
| 0.9343 | 405 | 0.0179 | - | - | - | - | - | - |
| 0.9366 | 406 | 0.0041 | - | - | - | - | - | - |
| 0.9389 | 407 | 0.0978 | - | - | - | - | - | - |
| 0.9412 | 408 | 0.0068 | - | - | - | - | - | - |
| 0.9435 | 409 | 0.0145 | - | - | - | - | - | - |
| 0.9458 | 410 | 0.0098 | - | - | - | - | - | - |
| 0.9481 | 411 | 0.032 | - | - | - | - | - | - |
| 0.9504 | 412 | 0.0232 | - | - | - | - | - | - |
| 0.9527 | 413 | 0.0149 | - | - | - | - | - | - |
| 0.9550 | 414 | 0.0175 | - | - | - | - | - | - |
| 0.9573 | 415 | 0.0099 | - | - | - | - | - | - |
| 0.9596 | 416 | 0.0121 | - | - | - | - | - | - |
| 0.9619 | 417 | 0.108 | - | - | - | - | - | - |
| 0.9642 | 418 | 0.012 | - | - | - | - | - | - |
| 0.9666 | 419 | 0.0102 | - | - | - | - | - | - |
| 0.9689 | 420 | 0.0108 | - | - | - | - | - | - |
| 0.9712 | 421 | 0.2258 | - | - | - | - | - | - |
| 0.9735 | 422 | 0.0037 | - | - | - | - | - | - |
| 0.9758 | 423 | 0.0186 | - | - | - | - | - | - |
| 0.9781 | 424 | 0.0446 | - | - | - | - | - | - |
| 0.9804 | 425 | 0.1558 | - | - | - | - | - | - |
| 0.9827 | 426 | 0.023 | - | - | - | - | - | - |
| 0.9850 | 427 | 0.0075 | - | - | - | - | - | - |
| 0.9873 | 428 | 0.0095 | - | - | - | - | - | - |
| 0.9896 | 429 | 0.0141 | - | - | - | - | - | - |
| 0.9919 | 430 | 0.0617 | - | - | - | - | - | - |
| 0.9942 | 431 | 0.0961 | - | - | - | - | - | - |
| 0.9965 | 432 | 0.0058 | - | - | - | - | - | - |
| 0.9988 | 433 | 0.0399 | - | - | - | - | - | - |
| 1.0012 | 434 | 0.0063 | - | - | - | - | - | - |
| 1.0035 | 435 | 0.0288 | - | - | - | - | - | - |
| 1.0058 | 436 | 0.0041 | - | - | - | - | - | - |
| 1.0081 | 437 | 0.0071 | - | - | - | - | - | - |
| 1.0104 | 438 | 0.0233 | - | - | - | - | - | - |
| 1.0127 | 439 | 0.0135 | - | - | - | - | - | - |
| 1.0150 | 440 | 0.1015 | - | - | - | - | - | - |
| 1.0173 | 441 | 0.0045 | - | - | - | - | - | - |
| 1.0196 | 442 | 0.0088 | - | - | - | - | - | - |
| 1.0219 | 443 | 0.0086 | - | - | - | - | - | - |
| 1.0242 | 444 | 0.0072 | - | - | - | - | - | - |
| 1.0265 | 445 | 0.0147 | - | - | - | - | - | - |
| 1.0288 | 446 | 0.025 | - | - | - | - | - | - |
| 1.0311 | 447 | 0.0067 | - | - | - | - | - | - |
| 1.0334 | 448 | 0.0066 | - | - | - | - | - | - |
| 1.0358 | 449 | 0.0062 | - | - | - | - | - | - |
| 1.0381 | 450 | 0.0068 | 0.9091 | 0.9083 | 0.9045 | 0.9038 | 0.8983 | 0.9072 |
| 1.0404 | 451 | 0.0126 | - | - | - | - | - | - |
| 1.0427 | 452 | 0.0082 | - | - | - | - | - | - |
| 1.0450 | 453 | 0.0034 | - | - | - | - | - | - |
| 1.0473 | 454 | 0.04 | - | - | - | - | - | - |
| 1.0496 | 455 | 0.0235 | - | - | - | - | - | - |
| 1.0519 | 456 | 0.24 | - | - | - | - | - | - |
| 1.0542 | 457 | 0.0514 | - | - | - | - | - | - |
| 1.0565 | 458 | 0.0152 | - | - | - | - | - | - |
| 1.0588 | 459 | 0.0476 | - | - | - | - | - | - |
| 1.0611 | 460 | 0.0037 | - | - | - | - | - | - |
| 1.0634 | 461 | 0.0066 | - | - | - | - | - | - |
| 1.0657 | 462 | 0.0065 | - | - | - | - | - | - |
| 1.0681 | 463 | 0.0097 | - | - | - | - | - | - |
| 1.0704 | 464 | 0.0053 | - | - | - | - | - | - |
| 1.0727 | 465 | 0.0397 | - | - | - | - | - | - |
| 1.0750 | 466 | 0.0089 | - | - | - | - | - | - |
| 1.0773 | 467 | 0.0238 | - | - | - | - | - | - |
| 1.0796 | 468 | 0.0078 | - | - | - | - | - | - |
| 1.0819 | 469 | 0.0108 | - | - | - | - | - | - |
| 1.0842 | 470 | 0.0094 | - | - | - | - | - | - |
| 1.0865 | 471 | 0.0034 | - | - | - | - | - | - |
| 1.0888 | 472 | 0.0165 | - | - | - | - | - | - |
| 1.0911 | 473 | 0.0407 | - | - | - | - | - | - |
| 1.0934 | 474 | 0.0339 | - | - | - | - | - | - |
| 1.0957 | 475 | 0.0645 | - | - | - | - | - | - |
| 1.0980 | 476 | 0.0052 | - | - | - | - | - | - |
| 1.1003 | 477 | 0.0643 | - | - | - | - | - | - |
| 1.1027 | 478 | 0.0113 | - | - | - | - | - | - |
| 1.1050 | 479 | 0.007 | - | - | - | - | - | - |
| 1.1073 | 480 | 0.0062 | - | - | - | - | - | - |
| 1.1096 | 481 | 0.0232 | - | - | - | - | - | - |
| 1.1119 | 482 | 0.0374 | - | - | - | - | - | - |
| 1.1142 | 483 | 0.0582 | - | - | - | - | - | - |
| 1.1165 | 484 | 0.0396 | - | - | - | - | - | - |
| 1.1188 | 485 | 0.0041 | - | - | - | - | - | - |
| 1.1211 | 486 | 0.0064 | - | - | - | - | - | - |
| 1.1234 | 487 | 0.0248 | - | - | - | - | - | - |
| 1.1257 | 488 | 0.0052 | - | - | - | - | - | - |
| 1.1280 | 489 | 0.0095 | - | - | - | - | - | - |
| 1.1303 | 490 | 0.0681 | - | - | - | - | - | - |
| 1.1326 | 491 | 0.0082 | - | - | - | - | - | - |
| 1.1349 | 492 | 0.0279 | - | - | - | - | - | - |
| 1.1373 | 493 | 0.008 | - | - | - | - | - | - |
| 1.1396 | 494 | 0.0032 | - | - | - | - | - | - |
| 1.1419 | 495 | 0.041 | - | - | - | - | - | - |
| 1.1442 | 496 | 0.0089 | - | - | - | - | - | - |
| 1.1465 | 497 | 0.0289 | - | - | - | - | - | - |
| 1.1488 | 498 | 0.0232 | - | - | - | - | - | - |
| 1.1511 | 499 | 0.059 | - | - | - | - | - | - |
| 1.1534 | 500 | 0.0053 | 0.9039 | 0.9059 | 0.9032 | 0.9046 | 0.8995 | 0.9050 |

</details>
### Framework Versions

- Python: 3.11.9
- Sentence Transformers: 3.0.1
- Transformers: 4.44.2
- PyTorch: 2.4.0+cu121
- Accelerate: 0.33.0
- Datasets: 2.19.2
- Tokenizers: 0.19.1

## Citation

### BibTeX

#### Sentence Transformers

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MatryoshkaLoss

```bibtex
@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```

#### MultipleNegativesRankingLoss

```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```