metadata
base_model: sentence-transformers/all-MiniLM-L6-v2
library_name: sentence-transformers
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:555
  - loss:MultipleNegativesRankingLoss
widget:
  - source_sentence: What does this text say about unclassified?
    sentences:
      - >-
        these sources. 

        Errors in third-party GAI components can also have downstream impacts on
        accuracy and robustness. 

        For example, test datasets commonly used to benchmark or validate models
        can contain label errors. 

        Inaccuracies in these labels can impact the “stability” or robustness of
        these benchmarks, which many 

        GAI practitioners consider during the model selection process.  

        Trustworthy AI Characteristics: Accountable and Transparent, Explainable
        and Interpretable, Fair with 

        Harmful Bias Managed, Privacy Enhanced, Safe, Secure and Resilient,
        Valid and Reliable 

        3. 

        Suggested Actions to Manage GAI Risks 

        The following suggested actions target risks unique to or exacerbated by
        GAI. 

        In addition to the suggested actions below, AI risk management
        activities and actions set forth in the AI 

        RMF 1.0 and Playbook are already applicable for managing GAI risks.
        Organizations are encouraged to
      - >-
        and hardware vulnerabilities; labor practices; data privacy and
        localization 

        compliance; geopolitical alignment). 

        Data Privacy; Information Security; 

        Value Chain and Component 

        Integration; Harmful Bias and 

        Homogenization 

        MG-3.1-003 

        Re-assess model risks after fine-tuning or retrieval-augmented
        generation 

        implementation and for any third-party GAI models deployed for
        applications 

        and/or use cases that were not evaluated in initial testing. 

        Value Chain and Component 

        Integration 

        MG-3.1-004 

        Take reasonable measures to review training data for CBRN information,
        and 

        intellectual property, and where appropriate, remove it. Implement
        reasonable 

        measures to prevent, flag, or take other action in response to outputs
        that 

        reproduce particular training data (e.g., plagiarized, trademarked,
        patented, 

        licensed content or trade secret material). 

        Intellectual Property; CBRN 

        Information or Capabilities 
         
        43
      - >-


        Stage of the AI lifecycle: Risks can arise during design, development,
        deployment, operation, 

        and/or decommissioning. 

         

        Scope: Risks may exist at individual model or system levels, at the
        application or implementation 

        levels (i.e., for a specific use case), or at the ecosystem level  that
        is, beyond a single system or 

        organizational context. Examples of the latter include the expansion of
        “algorithmic 

        monocultures,3” resulting from repeated use of the same model, or
        impacts on access to 

        opportunity, labor markets, and the creative economies.4 

         

        Source of risk: Risks may emerge from factors related to the design,
        training, or operation of the 

        GAI model itself, stemming in some cases from GAI model or system
        inputs, and in other cases, 

        from GAI system outputs. Many GAI risks, however, originate from human
        behavior, including 
         
         
        3 “Algorithmic monocultures” refers to the phenomenon in which repeated
        use of the same model or algorithm in
  - source_sentence: What does this text say about unclassified?
    sentences:
      - >-
        Security; Dangerous, Violent, or 

        Hateful Content 
         
        34 

        MS-2.7-009 Regularly assess and verify that security measures remain
        effective and have not 

        been compromised. 

        Information Security 

        AI Actor Tasks: AI Deployment, AI Impact Assessment, Domain Experts,
        Operation and Monitoring, TEVV 
         
        MEASURE 2.8: Risks associated with transparency and accountability  as
        identified in the MAP function  are examined and 

        documented. 

        Action ID 

        Suggested Action 

        GAI Risks 

        MS-2.8-001 

        Compile statistics on actual policy violations, take-down requests, and
        intellectual 

        property infringement for organizational GAI systems: Analyze
        transparency 

        reports across demographic groups, languages groups. 

        Intellectual Property; Harmful Bias 

        and Homogenization 

        MS-2.8-002 Document the instructions given to data annotators or AI
        red-teamers. 

        Human-AI Configuration 

        MS-2.8-003 

        Use digital content transparency solutions to enable the documentation
        of each
      - >-
        information during GAI training and maintenance. 

        Human-AI Configuration; Obscene, 

        Degrading, and/or Abusive 

        Content; Value Chain and 

        Component Integration; 

        Dangerous, Violent, or Hateful 

        Content 

        MS-2.6-002 

        Assess existence or levels of harmful bias, intellectual property
        infringement, 

        data privacy violations, obscenity, extremism, violence, or CBRN
        information in 

        system training data. 

        Data Privacy; Intellectual Property; 

        Obscene, Degrading, and/or 

        Abusive Content; Harmful Bias and 

        Homogenization; Dangerous, 

        Violent, or Hateful Content; CBRN 

        Information or Capabilities 

        MS-2.6-003 Re-evaluate safety features of fine-tuned models when the
        negative risk exceeds 

        organizational risk tolerance. 

        Dangerous, Violent, or Hateful 

        Content 

        MS-2.6-004 Review GAI system outputs for validity and safety: Review
        generated code to 

        assess risks that may arise from unreliable downstream decision-making. 

        Value Chain and Component 

        Integration; Dangerous, Violent, or 

        Hateful Content
      - >-
        Information Integrity; Harmful Bias 

        and Homogenization 

        AI Actor Tasks: AI Deployment, AI Impact Assessment, Domain Experts,
        End-Users, Operation and Monitoring, TEVV 
         
        MEASURE 2.10: Privacy risk of the AI system  as identified in the MAP
        function  is examined and documented. 

        Action ID 

        Suggested Action 

        GAI Risks 

        MS-2.10-001 

        Conduct AI red-teaming to assess issues such as: Outputting of training
        data 

        samples, and subsequent reverse engineering, model extraction, and 

        membership inference risks; Revealing biometric, confidential,
        copyrighted, 

        licensed, patented, personal, proprietary, sensitive, or trade-marked
        information; 

        Tracking or revealing location information of users or members of
        training 

        datasets. 

        Human-AI Configuration; 

        Information Integrity; Intellectual 

        Property 

        MS-2.10-002 

        Engage directly with end-users and other stakeholders to understand
        their 

        expectations and concerns regarding content provenance. Use this
        feedback to
  - source_sentence: What does this text say about risk management?
    sentences:
      - >-
        robust watermarking techniques and corresponding detectors to identify
        the source of content or 

        metadata recording techniques and metadata management tools and
        repositories to trace content 

        origins and modifications. Further narrowing of GAI task definitions to
        include provenance data can 

        enable organizations to maximize the utility of provenance data and risk
        management efforts. 

        A.1.7. Enhancing Content Provenance through Structured Public Feedback 

        While indirect feedback methods such as automated error collection
        systems are useful, they often lack 

        the context and depth that direct input from end users can provide.
        Organizations can leverage feedback 

        approaches described in the Pre-Deployment Testing section to capture
        input from external sources such 

        as through AI red-teaming.  

        Integrating pre- and post-deployment external feedback into the
        monitoring process for GAI models and
      - >-
        tools for monitoring third-party GAI risks; Consider policy adjustments
        across GAI 

        modeling libraries, tools and APIs, fine-tuned models, and embedded
        tools; 

        Assess GAI vendors, open-source or proprietary GAI tools, or GAI
        service 

        providers against incident or vulnerability databases. 

        Data Privacy; Human-AI 

        Configuration; Information 

        Security; Intellectual Property; 

        Value Chain and Component 

        Integration; Harmful Bias and 

        Homogenization 

        GV-6.1-010 

        Update GAI acceptable use policies to address proprietary and
        open-source GAI 

        technologies and data, and contractors, consultants, and other
        third-party 

        personnel. 

        Intellectual Property; Value Chain 

        and Component Integration 

        AI Actor Tasks: Operation and Monitoring, Procurement, Third-party
        entities 
         
        GOVERN 6.2: Contingency processes are in place to handle failures or
        incidents in third-party data or AI systems deemed to be 

        high-risk. 

        Action ID 

        Suggested Action 

        GAI Risks 

        GV-6.2-001
      - >-
        MEASURE 2.3: AI system performance or assurance criteria are measured
        qualitatively or quantitatively and demonstrated for 

        conditions similar to deployment setting(s). Measures are documented. 

        Action ID 

        Suggested Action 

        GAI Risks 

        MS-2.3-001 Consider baseline model performance on suites of benchmarks
        when selecting a 

        model for fine tuning or enhancement with retrieval-augmented
        generation. 

        Information Security; 

        Confabulation 

        MS-2.3-002 Evaluate claims of model capabilities using empirically
        validated methods. 

        Confabulation; Information 

        Security 

        MS-2.3-003 Share results of pre-deployment testing with relevant GAI
        Actors, such as those 

        with system release approval authority. 

        Human-AI Configuration 
         
        31 

        MS-2.3-004 

        Utilize a purpose-built testing environment such as NIST Dioptra to
        empirically 

        evaluate GAI trustworthy characteristics. 

        CBRN Information or Capabilities; 

        Data Privacy; Confabulation; 

        Information Integrity; Information 

        Security; Dangerous, Violent, or
  - source_sentence: What does this text say about unclassified?
    sentences:
      - >-
        techniques such as re-sampling, re-ranking, or adversarial training to
        mitigate 

        biases in the generated content. 

        Information Security; Harmful Bias 

        and Homogenization 

        MG-2.2-005 

        Engage in due diligence to analyze GAI output for harmful content,
        potential 

        misinformation, and CBRN-related or NCII content. 

        CBRN Information or Capabilities; 

        Obscene, Degrading, and/or 

        Abusive Content; Harmful Bias and 

        Homogenization; Dangerous, 

        Violent, or Hateful Content 
         
        41 

        MG-2.2-006 

        Use feedback from internal and external AI Actors, users, individuals,
        and 

        communities, to assess impact of AI-generated content. 

        Human-AI Configuration 

        MG-2.2-007 

        Use real-time auditing tools where they can be demonstrated to aid in
        the 

        tracking and validation of the lineage and authenticity of AI-generated
        data. 

        Information Integrity 

        MG-2.2-008 

        Use structured feedback mechanisms to solicit and capture user input
        about AI-

        generated content to detect subtle shifts in quality or alignment with
      - >-
        Human-AI Configuration; Value 

        Chain and Component Integration 

        MP-5.2-002 

        Plan regular engagements with AI Actors responsible for inputs to GAI
        systems, 

        including third-party data and algorithms, to review and evaluate
        unanticipated 

        impacts. 

        Human-AI Configuration; Value 

        Chain and Component Integration 

        AI Actor Tasks: AI Deployment, AI Design, AI Impact Assessment, Affected
        Individuals and Communities, Domain Experts, End-

        Users, Human Factors, Operation and Monitoring  
         
        MEASURE 1.1: Approaches and metrics for measurement of AI risks
        enumerated during the MAP function are selected for 

        implementation starting with the most significant AI risks. The risks or
        trustworthiness characteristics that will not  or cannot  be 

        measured are properly documented. 

        Action ID 

        Suggested Action 

        GAI Risks 

        MS-1.1-001 Employ methods to trace the origin and modifications of
        digital content. 

        Information Integrity 

        MS-1.1-002
      - >-
        input them directly to a GAI system, with a variety of downstream
        negative consequences to 

        interconnected systems. Indirect prompt injection attacks occur when
        adversaries remotely (i.e., without 

        a direct interface) exploit LLM-integrated applications by injecting
        prompts into data likely to be 

        retrieved. Security researchers have already demonstrated how indirect
        prompt injections can exploit 

        vulnerabilities by stealing proprietary data or running malicious code
        remotely on a machine. Merely 

        querying a closed production model can elicit previously undisclosed
        information about that model. 

        Another cybersecurity risk to GAI is data poisoning, in which an
        adversary compromises a training 

        dataset used by a model to manipulate its outputs or operation.
        Malicious tampering with data or parts 

        of the model could exacerbate risks associated with GAI system outputs. 

        Trustworthy AI Characteristics: Privacy Enhanced, Safe, Secure and
        Resilient, Valid and Reliable 

        2.10.
  - source_sentence: What does this text say about data privacy?
    sentences:
      - >-
        Property. We also note that some risks are cross-cutting between these
        categories.  
         
        4 

        1. CBRN Information or Capabilities: Eased access to or synthesis of
        materially nefarious 

        information or design capabilities related to chemical, biological,
        radiological, or nuclear (CBRN) 

        weapons or other dangerous materials or agents. 

        2. Confabulation: The production of confidently stated but erroneous or
        false content (known 

        colloquially as “hallucinations” or “fabrications”) by which users may
        be misled or deceived.6 

        3. Dangerous, Violent, or Hateful Content: Eased production of and
        access to violent, inciting, 

        radicalizing, or threatening content as well as recommendations to carry
        out self-harm or 

        conduct illegal activities. Includes difficulty controlling public
        exposure to hateful and disparaging 

        or stereotyping content. 

        4. Data Privacy: Impacts due to leakage and unauthorized use,
        disclosure, or de-anonymization of
      - >-
        information during GAI training and maintenance. 

        Human-AI Configuration; Obscene, 

        Degrading, and/or Abusive 

        Content; Value Chain and 

        Component Integration; 

        Dangerous, Violent, or Hateful 

        Content 

        MS-2.6-002 

        Assess existence or levels of harmful bias, intellectual property
        infringement, 

        data privacy violations, obscenity, extremism, violence, or CBRN
        information in 

        system training data. 

        Data Privacy; Intellectual Property; 

        Obscene, Degrading, and/or 

        Abusive Content; Harmful Bias and 

        Homogenization; Dangerous, 

        Violent, or Hateful Content; CBRN 

        Information or Capabilities 

        MS-2.6-003 Re-evaluate safety features of fine-tuned models when the
        negative risk exceeds 

        organizational risk tolerance. 

        Dangerous, Violent, or Hateful 

        Content 

        MS-2.6-004 Review GAI system outputs for validity and safety: Review
        generated code to 

        assess risks that may arise from unreliable downstream decision-making. 

        Value Chain and Component 

        Integration; Dangerous, Violent, or 

        Hateful Content
      - >-
        Scheurer, J. et al. (2023) Technical report: Large language models can
        strategically deceive their users 

        when put under pressure. arXiv. https://arxiv.org/abs/2311.07590 

        Shelby, R. et al. (2023) Sociotechnical Harms of Algorithmic Systems:
        Scoping a Taxonomy for Harm 

        Reduction. arXiv. https://arxiv.org/pdf/2210.05791 

        Shevlane, T. et al. (2023) Model evaluation for extreme risks. arXiv.
        https://arxiv.org/pdf/2305.15324 

        Shumailov, I. et al. (2023) The curse of recursion: training on
        generated data makes models forget. arXiv. 

        https://arxiv.org/pdf/2305.17493v2 

        Smith, A. et al. (2023) Hallucination or Confabulation? Neuroanatomy as
        metaphor in Large Language 

        Models. PLOS Digital Health. 

        https://journals.plos.org/digitalhealth/article?id=10.1371/journal.pdig.0000388 

        Soice, E. et al. (2023) Can large language models democratize access to
        dual-use biotechnology? arXiv. 

        https://arxiv.org/abs/2306.03809

SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2

This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-MiniLM-L6-v2
  • Maximum Sequence Length: 256 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

  • Documentation: Sentence Transformers Documentation (https://sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
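
The modules above determine how inputs are handled: text is truncated to 256 tokens, token embeddings are mean-pooled, and the result is L2-normalized into a 384-dimensional vector. A minimal sketch for checking these properties on the loaded model (the model id below is the same placeholder used in the Usage section):

from sentence_transformers import SentenceTransformer

# "sentence_transformers_model_id" is a placeholder; replace it with this model's repo id.
model = SentenceTransformer("sentence_transformers_model_id")

# Inputs longer than the maximum sequence length are truncated before pooling.
print(model.max_seq_length)                      # 256
# Mean pooling followed by L2 normalization yields the sentence embedding dimension.
print(model.get_sentence_embedding_dimension())  # 384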

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub ("sentence_transformers_model_id" is a placeholder; replace it with this model's repo id)
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'What does this text say about data privacy?',
    'information during GAI training and maintenance. \nHuman-AI Configuration; Obscene, \nDegrading, and/or Abusive \nContent; Value Chain and \nComponent Integration; \nDangerous, Violent, or Hateful \nContent \nMS-2.6-002 \nAssess existence or levels of harmful bias, intellectual property infringement, \ndata privacy violations, obscenity, extremism, violence, or CBRN information in \nsystem training data. \nData Privacy; Intellectual Property; \nObscene, Degrading, and/or \nAbusive Content; Harmful Bias and \nHomogenization; Dangerous, \nViolent, or Hateful Content; CBRN \nInformation or Capabilities \nMS-2.6-003 Re-evaluate safety features of fine-tuned models when the negative risk exceeds \norganizational risk tolerance. \nDangerous, Violent, or Hateful \nContent \nMS-2.6-004 Review GAI system outputs for validity and safety: Review generated code to \nassess risks that may arise from unreliable downstream decision-making. \nValue Chain and Component \nIntegration; Dangerous, Violent, or \nHateful Content',
    'Scheurer, J. et al. (2023) Technical report: Large language models can strategically deceive their users \nwhen put under pressure. arXiv. https://arxiv.org/abs/2311.07590 \nShelby, R. et al. (2023) Sociotechnical Harms of Algorithmic Systems: Scoping a Taxonomy for Harm \nReduction. arXiv. https://arxiv.org/pdf/2210.05791 \nShevlane, T. et al. (2023) Model evaluation for extreme risks. arXiv. https://arxiv.org/pdf/2305.15324 \nShumailov, I. et al. (2023) The curse of recursion: training on generated data makes models forget. arXiv. \nhttps://arxiv.org/pdf/2305.17493v2 \nSmith, A. et al. (2023) Hallucination or Confabulation? Neuroanatomy as metaphor in Large Language \nModels. PLOS Digital Health. \nhttps://journals.plos.org/digitalhealth/article?id=10.1371/journal.pdig.0000388 \nSoice, E. et al. (2023) Can large language models democratize access to dual-use biotechnology? arXiv. \nhttps://arxiv.org/abs/2306.03809',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
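
Because the training pairs are question-and-passage pairs, a natural use of this model is retrieval: embed a set of passages once, then rank them against a query. The sketch below assumes the same placeholder model id as above; the passage strings are short illustrative snippets, not part of this repository.

from sentence_transformers import SentenceTransformer

# Placeholder id; replace with this model's repo id.
model = SentenceTransformer("sentence_transformers_model_id")

query = "What does this text say about data privacy?"
passages = [
    "Data Privacy: Impacts due to leakage and unauthorized use, disclosure, or de-anonymization of information.",
    "Information Security: Data poisoning occurs when an adversary compromises a training dataset used by a model.",
    "Intellectual Property: Review training data and remove content that reproduces licensed or trade secret material.",
]

# Encode the query and the passages, then score with the model's similarity function (cosine).
query_embedding = model.encode(query)
passage_embeddings = model.encode(passages)
scores = model.similarity(query_embedding, passage_embeddings)  # shape [1, 3]

# Rank passages by similarity to the query.
best = scores.argmax().item()
print(scores)
print("Best match:", passages[best])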

Training Details

Training Dataset

Unnamed Dataset

  • Size: 555 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 555 samples:
    • sentence_0: string; min: 10 tokens, mean: 11.2 tokens, max: 12 tokens
    • sentence_1: string; min: 156 tokens, mean: 199.37 tokens, max: 256 tokens
  • Samples:
    • sentence_0: What does this text say about trustworthiness?
      sentence_1: other systems.
      Information Integrity; Value Chain and Component Integration
      MP-2.2-002
      Observe and analyze how the GAI system interacts with external networks, and identify any potential for negative externalities, particularly where content provenance might be compromised.
      Information Integrity
      AI Actor Tasks: End Users
      MAP 2.3: Scientific integrity and TEVV considerations are identified and documented, including those related to experimental design, data collection and selection (e.g., availability, representativeness, suitability), system trustworthiness, and construct validation
      Action ID
      Suggested Action
      GAI Risks
      MP-2.3-001
      Assess the accuracy, quality, reliability, and authenticity of GAI output by comparing it to a set of known ground truth data and by using a variety of evaluation methods (e.g., human oversight and automated evaluation, proven cryptographic techniques, review of content inputs).
      Information Integrity
      25
    • sentence_0: What does this text say about unclassified?
      sentence_1: training and TEVV data; Filtering of hate speech or content in GAI system training data; Prevalence of GAI-generated data in GAI system training data.
      Harmful Bias and Homogenization
      15 Winogender Schemas is a sample set of paired sentences which differ only by gender of the pronouns used, which can be used to evaluate gender bias in natural language processing coreference resolution systems.
      37
      MS-2.11-005
      Assess the proportion of synthetic to non-synthetic training data and verify training data is not overly homogenous or GAI-produced to mitigate concerns of model collapse.
      Harmful Bias and Homogenization
      AI Actor Tasks: AI Deployment, AI Impact Assessment, Affected Individuals and Communities, Domain Experts, End-Users, Operation and Monitoring, TEVV
      MEASURE 2.12: Environmental impact and sustainability of AI model training and management activities – as identified in the MAP function – are assessed and documented.
      Action ID
      Suggested Action
      GAI Risks
    • sentence_0: What does this text say about unclassified?
      sentence_1: Padmakumar, V. et al. (2024) Does writing with language models reduce content diversity? ICLR. https://arxiv.org/pdf/2309.05196
      Park, P. et. al. (2024) AI deception: A survey of examples, risks, and potential solutions. Patterns, 5(5). arXiv. https://arxiv.org/pdf/2308.14752
      Partnership on AI (2023) Building a Glossary for Synthetic Media Transparency Methods, Part 1: Indirect Disclosure. https://partnershiponai.org/glossary-for-synthetic-media-transparency-methods-part-1-indirect-disclosure/
      Qu, Y. et al. (2023) Unsafe Diffusion: On the Generation of Unsafe Images and Hateful Memes From Text-To-Image Models. arXiv. https://arxiv.org/pdf/2305.13873
      Rafat, K. et al. (2023) Mitigating carbon footprint for knowledge distillation based deep learning model compression. PLOS One. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0285668
      Said, I. et al. (2022) Nonconsensual Distribution of Intimate Images: Exploring the Role of Legal Attitudes
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
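
For reference, a minimal training sketch consistent with the setup described in this card: 555 question-and-passage pairs in columns sentence_0 and sentence_1, MultipleNegativesRankingLoss with its default scale of 20.0 and cosine similarity, and the batch size and epoch count listed under Training Hyperparameters below. The two example pairs and the output directory are illustrative placeholders, not the actual training data.

from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
    losses,
)

# Base model that was fine-tuned.
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Two-column (question, passage) dataset; the real run used 555 such pairs.
train_dataset = Dataset.from_dict({
    "sentence_0": [
        "What does this text say about data privacy?",
        "What does this text say about risk management?",
    ],
    "sentence_1": [
        "Data Privacy: Impacts due to leakage and unauthorized use ...",
        "The following suggested actions target risks unique to or exacerbated by GAI ...",
    ],
})

# In-batch negatives: every other passage in the batch acts as a negative for a given question.
loss = losses.MultipleNegativesRankingLoss(model)  # scale=20.0, cos_sim by default

args = SentenceTransformerTrainingArguments(
    output_dir="outputs/minilm-finetuned",  # illustrative path
    num_train_epochs=3,
    per_device_train_batch_size=16,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()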
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • multi_dataset_batch_sampler: round_robin

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • eval_use_gather_object: False
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin

Framework Versions

  • Python: 3.11.5
  • Sentence Transformers: 3.1.1
  • Transformers: 4.44.2
  • PyTorch: 2.4.1+cpu
  • Accelerate: 0.34.2
  • Datasets: 3.0.0
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}