---
base_model: Snowflake/snowflake-arctic-embed-m
library_name: sentence-transformers
metrics:
  - cosine_accuracy@1
  - cosine_accuracy@3
  - cosine_accuracy@5
  - cosine_accuracy@10
  - cosine_precision@1
  - cosine_precision@3
  - cosine_precision@5
  - cosine_precision@10
  - cosine_recall@1
  - cosine_recall@3
  - cosine_recall@5
  - cosine_recall@10
  - cosine_ndcg@10
  - cosine_mrr@10
  - cosine_map@100
  - dot_accuracy@1
  - dot_accuracy@3
  - dot_accuracy@5
  - dot_accuracy@10
  - dot_precision@1
  - dot_precision@3
  - dot_precision@5
  - dot_precision@10
  - dot_recall@1
  - dot_recall@3
  - dot_recall@5
  - dot_recall@10
  - dot_ndcg@10
  - dot_mrr@10
  - dot_map@100
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:600
  - loss:MatryoshkaLoss
  - loss:MultipleNegativesRankingLoss
widget:
  - source_sentence: >-
      How can high compute resource utilization in training GAI models affect
      ecosystems?
    sentences:
      - >-
        should not be used in education, work, housing, or in other contexts
        where the use of such surveillance 

        technologies is likely to limit rights, opportunities, or access.
        Whenever possible, you should have access to 

        reporting that confirms your data decisions have been respected and
        provides an assessment of the 

        potential impact of surveillance technologies on your rights,
        opportunities, or access. 

        NOTICE AND EXPLANATION
      - >-
        Legal Disclaimer 

        The Blueprint for an AI Bill of Rights: Making Automated Systems Work
        for the American People is a white paper 

        published by the White House Office of Science and Technology Policy. It
        is intended to support the 

        development of policies and practices that protect civil rights and
        promote democratic values in the building, 

        deployment, and governance of automated systems. 

        The Blueprint for an AI Bill of Rights is non-binding and does not
        constitute U.S. government policy. It 

        does not supersede, modify, or direct an interpretation of any existing
        statute, regulation, policy, or 

        international instrument. It does not constitute binding guidance for
        the public or Federal agencies and
      - >-
        or stereotyping content.

        4. Data Privacy: Impacts due to leakage and unauthorized use, disclosure, or de-anonymization of biometric, health, location, or other personally identifiable information or sensitive data.7

        5. Environmental Impacts: Impacts due to high compute resource utilization in training or operating GAI models, and related outcomes that may adversely impact ecosystems.

        6. Harmful Bias or Homogenization: Amplification and exacerbation of historical, societal, and systemic biases; performance disparities8 between sub-groups or languages, possibly due to non-representative training data, that result in discrimination, amplification of biases, or
  - source_sentence: >-
      What are the potential risks associated with human-AI configuration in GAI
      systems?
    sentences:
      - >-
        establish approved GAI technology and service provider lists. Value Chain and Component Integration

        GV-6.1-008 Maintain records of changes to content made by third parties to promote content provenance, including sources, timestamps, metadata. Information Integrity; Value Chain and Component Integration; Intellectual Property

        GV-6.1-009 Update and integrate due diligence processes for GAI acquisition and procurement vendor assessments to include intellectual property, data privacy, security, and other risks. For example, update processes to: Address solutions that may rely on embedded GAI technologies; Address ongoing monitoring, assessments, and alerting, dynamic risk assessments, and real-time reporting
      - >-
        could lead to homogenized outputs, including by amplifying any homogenization from the model used to generate the synthetic training data.

        Trustworthy AI Characteristics: Fair with Harmful Bias Managed, Valid and Reliable

        2.7. Human-AI Configuration

        GAI system use can involve varying risks of misconfigurations and poor interactions between a system and a human who is interacting with it. Humans bring their unique perspectives, experiences, or domain-specific expertise to interactions with AI systems but may not have detailed knowledge of AI systems and how they work. As a result, human experts may be unnecessarily “averse” to GAI systems, and thus deprive themselves or others of GAI’s beneficial uses.
      - >-
        requests image features that are inconsistent with the stereotypes. Harmful bias in GAI models, which may stem from their training data, can also cause representational harms or perpetuate or exacerbate bias based on race, gender, disability, or other protected classes.

        Harmful bias in GAI systems can also lead to harms via disparities between how a model performs for different subgroups or languages (e.g., an LLM may perform less well for non-English languages or certain dialects). Such disparities can contribute to discriminatory decision-making or amplification of existing societal biases. In addition, GAI systems may be inappropriately trusted to perform similarly
  - source_sentence: >-
      What types of content are considered harmful biases in the context of
      information security?
    sentences:
      - >-
        MS-2.5-005 Verify GAI system training data and TEVV data provenance, and that fine-tuning or retrieval-augmented generation data is grounded. Information Integrity

        MS-2.5-006 Regularly review security and safety guardrails, especially if the GAI system is being operated in novel circumstances. This includes reviewing reasons why the GAI system was initially assessed as being safe to deploy. Information Security; Dangerous, Violent, or Hateful Content

        AI Actor Tasks: Domain Experts, TEVV
      - >-
        to diminished transparency or accountability for downstream users. While this is a risk for traditional AI systems and some other digital technologies, the risk is exacerbated for GAI due to the scale of the training data, which may be too large for humans to vet; the difficulty of training foundation models, which leads to extensive reuse of limited numbers of models; and the extent to which GAI may be integrated into other devices and services. As GAI systems often involve many distinct third-party components and data sources, it may be difficult to attribute issues in a system’s behavior to any one of these sources.

        Errors in third-party GAI components can also have downstream impacts on accuracy and robustness.
      - >-
        biases in the generated content. Information Security; Harmful Bias and Homogenization

        MG-2.2-005 Engage in due diligence to analyze GAI output for harmful content, potential misinformation, and CBRN-related or NCII content. CBRN Information or Capabilities; Obscene, Degrading, and/or Abusive Content; Harmful Bias and Homogenization; Dangerous, Violent, or Hateful Content
  - source_sentence: >-
      What is the focus of the paper by Padmakumar et al (2024) regarding
      language models and content diversity?
    sentences:
      - >-
        Content

        MS-2.12-002 Document anticipated environmental impacts of model development, maintenance, and deployment in product design decisions. Environmental

        MS-2.12-003 Measure or estimate environmental impacts (e.g., energy and water consumption) for training, fine-tuning, and deploying models: Verify tradeoffs between resources used at inference time versus additional resources required at training time. Environmental

        MS-2.12-004 Verify effectiveness of carbon capture or offset programs for GAI training and applications, and address green-washing concerns. Environmental

        AI Actor Tasks: AI Deployment, AI Impact Assessment, Domain Experts, Operation and Monitoring, TEVV
      - >-
        opportunities, undermine their privacy, or pervasively track their activity—often without their knowledge or consent.

        These outcomes are deeply harmful—but they are not inevitable. Automated systems have brought about extraordinary benefits, from technology that helps farmers grow food more efficiently and computers that predict storm paths, to algorithms that can identify diseases in patients. These tools now drive important decisions across sectors, while data is helping to revolutionize global industries. Fueled by the power of American innovation, these tools hold the potential to redefine every part of our society and make life better for everyone.
      - >-
        Publishing, Paris. https://doi.org/10.1787/d1a8d965-en

        OpenAI (2023) GPT-4 System Card. https://cdn.openai.com/papers/gpt-4-system-card.pdf

        OpenAI (2024) GPT-4 Technical Report. https://arxiv.org/pdf/2303.08774

        Padmakumar, V. et al. (2024) Does writing with language models reduce content diversity? ICLR. https://arxiv.org/pdf/2309.05196

        Park, P. et al. (2024) AI deception: A survey of examples, risks, and potential solutions. Patterns, 5(5). arXiv. https://arxiv.org/pdf/2308.14752

        Partnership on AI (2023) Building a Glossary for Synthetic Media Transparency Methods, Part 1: Indirect Disclosure. https://partnershiponai.org/glossary-for-synthetic-media-transparency-methods-part-1-indirect-disclosure/
  - source_sentence: >-
      What are the key components involved in ensuring data quality and ethical
      considerations in AI systems?
    sentences:
      - >-
        (such as where significant negative impacts are imminent, severe harms are actually occurring, or large-scale risks could occur); and broad GAI negative risks, including: Immature safety or risk cultures related to AI and GAI design, development and deployment, public information integrity risks, including impacts on democratic processes, unknown long-term performance characteristics of GAI. Information Integrity; Dangerous, Violent, or Hateful Content; CBRN Information or Capabilities

        GV-1.3-007 Devise a plan to halt development or deployment of a GAI system that poses unacceptable negative risk. CBRN Information and Capability; Information Security; Information Integrity

        AI Actor Tasks: Governance and Oversight
      - >-
        30 MEASURE 2.2: Evaluations involving human subjects meet applicable requirements (including human subject protection) and are representative of the relevant population.

        Action ID  Suggested Action  GAI Risks

        MS-2.2-001 Assess and manage statistical biases related to GAI content provenance through techniques such as re-sampling, re-weighting, or adversarial training. Information Integrity; Information Security; Harmful Bias and Homogenization

        MS-2.2-002 Document how content provenance data is tracked and how that data interacts with privacy and security. Consider: Anonymizing data to protect the privacy of human subjects; Leveraging privacy output filters; Removing any personally
      - >-
        Data quality; Model architecture (e.g., convolutional neural network, transformers, etc.); Optimization objectives; Training algorithms; RLHF approaches; Fine-tuning or retrieval-augmented generation approaches; Evaluation data; Ethical considerations; Legal and regulatory requirements. Information Integrity; Harmful Bias and Homogenization

        AI Actor Tasks: AI Deployment, AI Impact Assessment, Domain Experts, End-Users, Operation and Monitoring, TEVV

        MEASURE 2.10: Privacy risk of the AI system as identified in the MAP function is examined and documented.

        Action ID  Suggested Action  GAI Risks

        MS-2.10-001 Conduct AI red-teaming to assess issues such as: Outputting of training data
model-index:
  - name: SentenceTransformer based on Snowflake/snowflake-arctic-embed-m
    results:
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: Unknown
          type: unknown
        metrics:
          - type: cosine_accuracy@1
            value: 0.8
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.99
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.99
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 1
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.8
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.33000000000000007
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.19799999999999998
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.09999999999999998
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.8
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.99
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.99
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 1
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.9195108324425135
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.8916666666666667
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.8916666666666666
            name: Cosine Map@100
          - type: dot_accuracy@1
            value: 0.8
            name: Dot Accuracy@1
          - type: dot_accuracy@3
            value: 0.99
            name: Dot Accuracy@3
          - type: dot_accuracy@5
            value: 0.99
            name: Dot Accuracy@5
          - type: dot_accuracy@10
            value: 1
            name: Dot Accuracy@10
          - type: dot_precision@1
            value: 0.8
            name: Dot Precision@1
          - type: dot_precision@3
            value: 0.33000000000000007
            name: Dot Precision@3
          - type: dot_precision@5
            value: 0.19799999999999998
            name: Dot Precision@5
          - type: dot_precision@10
            value: 0.09999999999999998
            name: Dot Precision@10
          - type: dot_recall@1
            value: 0.8
            name: Dot Recall@1
          - type: dot_recall@3
            value: 0.99
            name: Dot Recall@3
          - type: dot_recall@5
            value: 0.99
            name: Dot Recall@5
          - type: dot_recall@10
            value: 1
            name: Dot Recall@10
          - type: dot_ndcg@10
            value: 0.9195108324425135
            name: Dot Ndcg@10
          - type: dot_mrr@10
            value: 0.8916666666666667
            name: Dot Mrr@10
          - type: dot_map@100
            value: 0.8916666666666666
            name: Dot Map@100
---

SentenceTransformer based on Snowflake/snowflake-arctic-embed-m

This is a sentence-transformers model finetuned from Snowflake/snowflake-arctic-embed-m. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: Snowflake/snowflake-arctic-embed-m
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

  • Documentation: Sentence Transformers Documentation (https://sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
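
In plain terms: input text is tokenized (up to 512 tokens), encoded by a BERT model, pooled by taking the [CLS] token embedding (pooling_mode_cls_token), and L2-normalized. As a rough sketch, the equivalent computation with the raw transformers API, shown on the base model (variable names are illustrative; SentenceTransformer performs all of these steps internally):

import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Snowflake/snowflake-arctic-embed-m")
encoder = AutoModel.from_pretrained("Snowflake/snowflake-arctic-embed-m")

batch = tokenizer(["an example sentence"], padding=True, truncation=True,
                  max_length=512, return_tensors="pt")
with torch.no_grad():
    token_embeddings = encoder(**batch).last_hidden_state  # [batch, seq_len, 768]
cls_embedding = token_embeddings[:, 0]  # CLS pooling
embedding = torch.nn.functional.normalize(cls_embedding, p=2, dim=1)  # Normalize()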

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("XicoC/midterm-finetuned-arctic")
# Run inference
sentences = [
    'What are the key components involved in ensuring data quality and ethical considerations in AI systems?',
    'Data quality; Model architecture (e.g., convolutional neural network, transformers, etc.); Optimization objectives; Training algorithms; RLHF approaches; Fine-tuning or retrieval-augmented generation approaches; Evaluation data; Ethical considerations; Legal and regulatory requirements. Information Integrity; Harmful Bias and Homogenization\nAI Actor Tasks: AI Deployment, AI Impact Assessment, Domain Experts, End-Users, Operation and Monitoring, TEVV\n\nMEASURE 2.10: Privacy risk of the AI system – as identified in the MAP function – is examined and documented.\nAction ID  Suggested Action  GAI Risks\nMS-2.10-001 Conduct AI red-teaming to assess issues such as: Outputting of training data',
    '30 MEASURE 2.2: Evaluations involving human subjects meet applicable requirements (including human subject protection) and are representative of the relevant population.\nAction ID  Suggested Action  GAI Risks\nMS-2.2-001 Assess and manage statistical biases related to GAI content provenance through techniques such as re-sampling, re-weighting, or adversarial training. Information Integrity; Information Security; Harmful Bias and Homogenization\nMS-2.2-002 Document how content provenance data is tracked and how that data interacts with privacy and security. Consider: Anonymizing data to protect the privacy of human subjects; Leveraging privacy output filters; Removing any personally',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
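
Because the model was trained with MatryoshkaLoss over dimensions 768, 512, 256, 128, and 64, its embeddings can reasonably be truncated to any of those sizes with modest quality loss. A minimal sketch using the truncate_dim argument (available in recent sentence-transformers releases):

from sentence_transformers import SentenceTransformer

# Load the model with embeddings truncated to 256 dimensions, one of the
# Matryoshka dimensions used during training.
model = SentenceTransformer("XicoC/midterm-finetuned-arctic", truncate_dim=256)
embeddings = model.encode(["What risks does Human-AI Configuration pose in GAI systems?"])
print(embeddings.shape)
# (1, 256)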

Evaluation

Metrics

Information Retrieval

Metric Value
cosine_accuracy@1 0.8
cosine_accuracy@3 0.99
cosine_accuracy@5 0.99
cosine_accuracy@10 1.0
cosine_precision@1 0.8
cosine_precision@3 0.33
cosine_precision@5 0.198
cosine_precision@10 0.1
cosine_recall@1 0.8
cosine_recall@3 0.99
cosine_recall@5 0.99
cosine_recall@10 1.0
cosine_ndcg@10 0.9195
cosine_mrr@10 0.8917
cosine_map@100 0.8917
dot_accuracy@1 0.8
dot_accuracy@3 0.99
dot_accuracy@5 0.99
dot_accuracy@10 1.0
dot_precision@1 0.8
dot_precision@3 0.33
dot_precision@5 0.198
dot_precision@10 0.1
dot_recall@1 0.8
dot_recall@3 0.99
dot_recall@5 0.99
dot_recall@10 1.0
dot_ndcg@10 0.9195
dot_mrr@10 0.8917
dot_map@100 0.8917
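
These figures come from an information-retrieval evaluation. A minimal sketch of reproducing such metrics with the library's InformationRetrievalEvaluator, using illustrative placeholder data (the actual evaluation split is not published in this card):

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("XicoC/midterm-finetuned-arctic")

# Placeholder evaluation data: id -> text, and query id -> relevant corpus ids.
queries = {"q1": "How can high compute resource utilization in training GAI models affect ecosystems?"}
corpus = {"d1": "5. Environmental Impacts: Impacts due to high compute resource utilization in training or operating GAI models ..."}
relevant_docs = {"q1": {"d1"}}

evaluator = InformationRetrievalEvaluator(queries, corpus, relevant_docs)
results = evaluator(model)  # includes cosine_accuracy@k, cosine_ndcg@10, cosine_map@100, ...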

Training Details

Training Dataset

Unnamed Dataset

  • Size: 600 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 600 samples:
    • sentence_0: string; min: 13 tokens, mean: 21.67 tokens, max: 34 tokens
    • sentence_1: string; min: 3 tokens, mean: 132.86 tokens, max: 512 tokens
  • Samples:
    • sentence_0: What is the title of the NIST publication related to Artificial Intelligence Risk Management?
      sentence_1:
        NIST Trustworthy and Responsible AI
        NIST AI 600-1
        Artificial Intelligence Risk Management Framework: Generative Artificial Intelligence Profile
        This publication is available free of charge from: https://doi.org/10.6028/NIST.AI.600-1
    • sentence_0: Where can the NIST AI 600-1 publication be accessed for free?
      sentence_1:
        NIST Trustworthy and Responsible AI
        NIST AI 600-1
        Artificial Intelligence Risk Management Framework: Generative Artificial Intelligence Profile
        This publication is available free of charge from: https://doi.org/10.6028/NIST.AI.600-1
    • sentence_0: What is the title of the publication released by NIST in July 2024 regarding artificial intelligence?
      sentence_1:
        NIST Trustworthy and Responsible AI
        NIST AI 600-1
        Artificial Intelligence Risk Management Framework: Generative Artificial Intelligence Profile
        This publication is available free of charge from: https://doi.org/10.6028/NIST.AI.600-1
        July 2024
        U.S. Department of Commerce
        Gina M. Raimondo, Secretary
        National Institute of Standards and Technology
        Laurie E. Locascio, NIST Director and Under Secretary of Commerce for Standards and Technology
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
    
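
In code, this configuration corresponds to wrapping MultipleNegativesRankingLoss (in-batch negatives) in MatryoshkaLoss, so the same ranking objective is applied at every listed dimension with equal weight. A sketch (the original training script is not included in this card):

from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss

model = SentenceTransformer("Snowflake/snowflake-arctic-embed-m")

# In-batch negatives ranking loss, applied at each Matryoshka dimension;
# weights default to 1 for every dimension, matching the parameters above.
inner_loss = MultipleNegativesRankingLoss(model)
loss = MatryoshkaLoss(model, inner_loss, matryoshka_dims=[768, 512, 256, 128, 64])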

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 20
  • per_device_eval_batch_size: 20
  • num_train_epochs: 5
  • multi_dataset_batch_sampler: round_robin
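
Put together, a minimal training sketch consistent with these settings, using the sentence-transformers v3 trainer API (the dataset below is an illustrative stand-in for the 600 unpublished question/context pairs):

from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss

model = SentenceTransformer("Snowflake/snowflake-arctic-embed-m")

# Stand-in for the actual (sentence_0, sentence_1) training pairs.
train_dataset = Dataset.from_dict({
    "sentence_0": ["Where can the NIST AI 600-1 publication be accessed for free?"],
    "sentence_1": ["This publication is available free of charge from: https://doi.org/10.6028/NIST.AI.600-1"],
})

loss = MatryoshkaLoss(model, MultipleNegativesRankingLoss(model),
                      matryoshka_dims=[768, 512, 256, 128, 64])

args = SentenceTransformerTrainingArguments(
    output_dir="midterm-finetuned-arctic",  # hypothetical output path
    num_train_epochs=5,
    per_device_train_batch_size=20,
    per_device_eval_batch_size=20,
    # eval_strategy="steps" was also used; it additionally requires an
    # eval dataset or evaluator to be passed to the trainer.
)

trainer = SentenceTransformerTrainer(model=model, args=args,
                                     train_dataset=train_dataset, loss=loss)
trainer.train()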

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 20
  • per_device_eval_batch_size: 20
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 5
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • eval_use_gather_object: False
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin

Training Logs

Epoch Step cosine_map@100
1.0 30 0.8722
1.6667 50 0.8817
2.0 60 0.8867
3.0 90 0.8867
3.3333 100 0.8917

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.1.0
  • Transformers: 4.44.2
  • PyTorch: 2.4.0+cu121
  • Accelerate: 0.34.2
  • Datasets: 2.19.2
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}