metadata
base_model: sentence-transformers/all-MiniLM-L6-v2
library_name: sentence-transformers
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:555
  - loss:MultipleNegativesRankingLoss
widget:
  - source_sentence: What does this text say about unclassified?
    sentences:
      - >-
        these sources. 

        Errors in third-party GAI components can also have downstream impacts on
        accuracy and robustness. 

        For example, test datasets commonly used to benchmark or validate models
        can contain label errors. 

        Inaccuracies in these labels can impact the “stability” or robustness of
        these benchmarks, which many 

        GAI practitioners consider during the model selection process.  

        Trustworthy AI Characteristics: Accountable and Transparent, Explainable
        and Interpretable, Fair with 

        Harmful Bias Managed, Privacy Enhanced, Safe, Secure and Resilient,
        Valid and Reliable 

        3. 

        Suggested Actions to Manage GAI Risks 

        The following suggested actions target risks unique to or exacerbated by
        GAI. 

        In addition to the suggested actions below, AI risk management
        activities and actions set forth in the AI 

        RMF 1.0 and Playbook are already applicable for managing GAI risks.
        Organizations are encouraged to
      - >-
        and hardware vulnerabilities; labor practices; data privacy and
        localization 

        compliance; geopolitical alignment). 

        Data Privacy; Information Security; 

        Value Chain and Component 

        Integration; Harmful Bias and 

        Homogenization 

        MG-3.1-003 

        Re-assess model risks after fine-tuning or retrieval-augmented
        generation 

        implementation and for any third-party GAI models deployed for
        applications 

        and/or use cases that were not evaluated in initial testing. 

        Value Chain and Component 

        Integration 

        MG-3.1-004 

        Take reasonable measures to review training data for CBRN information,
        and 

        intellectual property, and where appropriate, remove it. Implement
        reasonable 

        measures to prevent, flag, or take other action in response to outputs
        that 

        reproduce particular training data (e.g., plagiarized, trademarked,
        patented, 

        licensed content or trade secret material). 

        Intellectual Property; CBRN 

        Information or Capabilities 
         
        43
      - >-


        Stage of the AI lifecycle: Risks can arise during design, development,
        deployment, operation, 

        and/or decommissioning. 

         

        Scope: Risks may exist at individual model or system levels, at the
        application or implementation 

        levels (i.e., for a specific use case), or at the ecosystem level  that
        is, beyond a single system or 

        organizational context. Examples of the latter include the expansion of
        “algorithmic 

        monocultures,3” resulting from repeated use of the same model, or
        impacts on access to 

        opportunity, labor markets, and the creative economies.4 

         

        Source of risk: Risks may emerge from factors related to the design,
        training, or operation of the 

        GAI model itself, stemming in some cases from GAI model or system
        inputs, and in other cases, 

        from GAI system outputs. Many GAI risks, however, originate from human
        behavior, including 
         
         
        3 “Algorithmic monocultures” refers to the phenomenon in which repeated
        use of the same model or algorithm in
  - source_sentence: What does this text say about unclassified?
    sentences:
      - >-
        Security; Dangerous, Violent, or 

        Hateful Content 
         
        34 

        MS-2.7-009 Regularly assess and verify that security measures remain
        effective and have not 

        been compromised. 

        Information Security 

        AI Actor Tasks: AI Deployment, AI Impact Assessment, Domain Experts,
        Operation and Monitoring, TEVV 
         
        MEASURE 2.8: Risks associated with transparency and accountability  as
        identified in the MAP function  are examined and 

        documented. 

        Action ID 

        Suggested Action 

        GAI Risks 

        MS-2.8-001 

        Compile statistics on actual policy violations, take-down requests, and
        intellectual 

        property infringement for organizational GAI systems: Analyze
        transparency 

        reports across demographic groups, languages groups. 

        Intellectual Property; Harmful Bias 

        and Homogenization 

        MS-2.8-002 Document the instructions given to data annotators or AI
        red-teamers. 

        Human-AI Configuration 

        MS-2.8-003 

        Use digital content transparency solutions to enable the documentation
        of each
      - >-
        information during GAI training and maintenance. 

        Human-AI Configuration; Obscene, 

        Degrading, and/or Abusive 

        Content; Value Chain and 

        Component Integration; 

        Dangerous, Violent, or Hateful 

        Content 

        MS-2.6-002 

        Assess existence or levels of harmful bias, intellectual property
        infringement, 

        data privacy violations, obscenity, extremism, violence, or CBRN
        information in 

        system training data. 

        Data Privacy; Intellectual Property; 

        Obscene, Degrading, and/or 

        Abusive Content; Harmful Bias and 

        Homogenization; Dangerous, 

        Violent, or Hateful Content; CBRN 

        Information or Capabilities 

        MS-2.6-003 Re-evaluate safety features of fine-tuned models when the
        negative risk exceeds 

        organizational risk tolerance. 

        Dangerous, Violent, or Hateful 

        Content 

        MS-2.6-004 Review GAI system outputs for validity and safety: Review
        generated code to 

        assess risks that may arise from unreliable downstream decision-making. 

        Value Chain and Component 

        Integration; Dangerous, Violent, or 

        Hateful Content
      - >-
        Information Integrity; Harmful Bias 

        and Homogenization 

        AI Actor Tasks: AI Deployment, AI Impact Assessment, Domain Experts,
        End-Users, Operation and Monitoring, TEVV 
         
        MEASURE 2.10: Privacy risk of the AI system  as identified in the MAP
        function  is examined and documented. 

        Action ID 

        Suggested Action 

        GAI Risks 

        MS-2.10-001 

        Conduct AI red-teaming to assess issues such as: Outputting of training
        data 

        samples, and subsequent reverse engineering, model extraction, and 

        membership inference risks; Revealing biometric, confidential,
        copyrighted, 

        licensed, patented, personal, proprietary, sensitive, or trade-marked
        information; 

        Tracking or revealing location information of users or members of
        training 

        datasets. 

        Human-AI Configuration; 

        Information Integrity; Intellectual 

        Property 

        MS-2.10-002 

        Engage directly with end-users and other stakeholders to understand
        their 

        expectations and concerns regarding content provenance. Use this
        feedback to
  - source_sentence: What does this text say about risk management?
    sentences:
      - >-
        robust watermarking techniques and corresponding detectors to identify
        the source of content or 

        metadata recording techniques and metadata management tools and
        repositories to trace content 

        origins and modifications. Further narrowing of GAI task definitions to
        include provenance data can 

        enable organizations to maximize the utility of provenance data and risk
        management efforts. 

        A.1.7. Enhancing Content Provenance through Structured Public Feedback 

        While indirect feedback methods such as automated error collection
        systems are useful, they often lack 

        the context and depth that direct input from end users can provide.
        Organizations can leverage feedback 

        approaches described in the Pre-Deployment Testing section to capture
        input from external sources such 

        as through AI red-teaming.  

        Integrating pre- and post-deployment external feedback into the
        monitoring process for GAI models and
      - >-
        tools for monitoring third-party GAI risks; Consider policy adjustments
        across GAI 

        modeling libraries, tools and APIs, fine-tuned models, and embedded
        tools; 

        Assess GAI vendors, open-source or proprietary GAI tools, or GAI
        service 

        providers against incident or vulnerability databases. 

        Data Privacy; Human-AI 

        Configuration; Information 

        Security; Intellectual Property; 

        Value Chain and Component 

        Integration; Harmful Bias and 

        Homogenization 

        GV-6.1-010 

        Update GAI acceptable use policies to address proprietary and
        open-source GAI 

        technologies and data, and contractors, consultants, and other
        third-party 

        personnel. 

        Intellectual Property; Value Chain 

        and Component Integration 

        AI Actor Tasks: Operation and Monitoring, Procurement, Third-party
        entities 
         
        GOVERN 6.2: Contingency processes are in place to handle failures or
        incidents in third-party data or AI systems deemed to be 

        high-risk. 

        Action ID 

        Suggested Action 

        GAI Risks 

        GV-6.2-001
      - >-
        MEASURE 2.3: AI system performance or assurance criteria are measured
        qualitatively or quantitatively and demonstrated for 

        conditions similar to deployment setting(s). Measures are documented. 

        Action ID 

        Suggested Action 

        GAI Risks 

        MS-2.3-001 Consider baseline model performance on suites of benchmarks
        when selecting a 

        model for fine tuning or enhancement with retrieval-augmented
        generation. 

        Information Security; 

        Confabulation 

        MS-2.3-002 Evaluate claims of model capabilities using empirically
        validated methods. 

        Confabulation; Information 

        Security 

        MS-2.3-003 Share results of pre-deployment testing with relevant GAI
        Actors, such as those 

        with system release approval authority. 

        Human-AI Configuration 
         
        31 

        MS-2.3-004 

        Utilize a purpose-built testing environment such as NIST Dioptra to
        empirically 

        evaluate GAI trustworthy characteristics. 

        CBRN Information or Capabilities; 

        Data Privacy; Confabulation; 

        Information Integrity; Information 

        Security; Dangerous, Violent, or
  - source_sentence: What does this text say about unclassified?
    sentences:
      - >-
        techniques such as re-sampling, re-ranking, or adversarial training to
        mitigate 

        biases in the generated content. 

        Information Security; Harmful Bias 

        and Homogenization 

        MG-2.2-005 

        Engage in due diligence to analyze GAI output for harmful content,
        potential 

        misinformation, and CBRN-related or NCII content. 

        CBRN Information or Capabilities; 

        Obscene, Degrading, and/or 

        Abusive Content; Harmful Bias and 

        Homogenization; Dangerous, 

        Violent, or Hateful Content 
         
        41 

        MG-2.2-006 

        Use feedback from internal and external AI Actors, users, individuals,
        and 

        communities, to assess impact of AI-generated content. 

        Human-AI Configuration 

        MG-2.2-007 

        Use real-time auditing tools where they can be demonstrated to aid in
        the 

        tracking and validation of the lineage and authenticity of AI-generated
        data. 

        Information Integrity 

        MG-2.2-008 

        Use structured feedback mechanisms to solicit and capture user input
        about AI-

        generated content to detect subtle shifts in quality or alignment with
      - >-
        Human-AI Configuration; Value 

        Chain and Component Integration 

        MP-5.2-002 

        Plan regular engagements with AI Actors responsible for inputs to GAI
        systems, 

        including third-party data and algorithms, to review and evaluate
        unanticipated 

        impacts. 

        Human-AI Configuration; Value 

        Chain and Component Integration 

        AI Actor Tasks: AI Deployment, AI Design, AI Impact Assessment, Affected
        Individuals and Communities, Domain Experts, End-

        Users, Human Factors, Operation and Monitoring  
         
        MEASURE 1.1: Approaches and metrics for measurement of AI risks
        enumerated during the MAP function are selected for 

        implementation starting with the most significant AI risks. The risks or
        trustworthiness characteristics that will not  or cannot  be 

        measured are properly documented. 

        Action ID 

        Suggested Action 

        GAI Risks 

        MS-1.1-001 Employ methods to trace the origin and modifications of
        digital content. 

        Information Integrity 

        MS-1.1-002
      - >-
        input them directly to a GAI system, with a variety of downstream
        negative consequences to 

        interconnected systems. Indirect prompt injection attacks occur when
        adversaries remotely (i.e., without 

        a direct interface) exploit LLM-integrated applications by injecting
        prompts into data likely to be 

        retrieved. Security researchers have already demonstrated how indirect
        prompt injections can exploit 

        vulnerabilities by stealing proprietary data or running malicious code
        remotely on a machine. Merely 

        querying a closed production model can elicit previously undisclosed
        information about that model. 

        Another cybersecurity risk to GAI is data poisoning, in which an
        adversary compromises a training 

        dataset used by a model to manipulate its outputs or operation.
        Malicious tampering with data or parts 

        of the model could exacerbate risks associated with GAI system outputs. 

        Trustworthy AI Characteristics: Privacy Enhanced, Safe, Secure and
        Resilient, Valid and Reliable 

        2.10.
  - source_sentence: What does this text say about data privacy?
    sentences:
      - >-
        Property. We also note that some risks are cross-cutting between these
        categories.  
         
        4 

        1. CBRN Information or Capabilities: Eased access to or synthesis of
        materially nefarious 

        information or design capabilities related to chemical, biological,
        radiological, or nuclear (CBRN) 

        weapons or other dangerous materials or agents. 

        2. Confabulation: The production of confidently stated but erroneous or
        false content (known 

        colloquially as “hallucinations” or “fabrications”) by which users may
        be misled or deceived.6 

        3. Dangerous, Violent, or Hateful Content: Eased production of and
        access to violent, inciting, 

        radicalizing, or threatening content as well as recommendations to carry
        out self-harm or 

        conduct illegal activities. Includes difficulty controlling public
        exposure to hateful and disparaging 

        or stereotyping content. 

        4. Data Privacy: Impacts due to leakage and unauthorized use,
        disclosure, or de-anonymization of
      - >-
        information during GAI training and maintenance. 

        Human-AI Configuration; Obscene, 

        Degrading, and/or Abusive 

        Content; Value Chain and 

        Component Integration; 

        Dangerous, Violent, or Hateful 

        Content 

        MS-2.6-002 

        Assess existence or levels of harmful bias, intellectual property
        infringement, 

        data privacy violations, obscenity, extremism, violence, or CBRN
        information in 

        system training data. 

        Data Privacy; Intellectual Property; 

        Obscene, Degrading, and/or 

        Abusive Content; Harmful Bias and 

        Homogenization; Dangerous, 

        Violent, or Hateful Content; CBRN 

        Information or Capabilities 

        MS-2.6-003 Re-evaluate safety features of fine-tuned models when the
        negative risk exceeds 

        organizational risk tolerance. 

        Dangerous, Violent, or Hateful 

        Content 

        MS-2.6-004 Review GAI system outputs for validity and safety: Review
        generated code to 

        assess risks that may arise from unreliable downstream decision-making. 

        Value Chain and Component 

        Integration; Dangerous, Violent, or 

        Hateful Content
      - >-
        Scheurer, J. et al. (2023) Technical report: Large language models can
        strategically deceive their users 

        when put under pressure. arXiv. https://arxiv.org/abs/2311.07590 

        Shelby, R. et al. (2023) Sociotechnical Harms of Algorithmic Systems:
        Scoping a Taxonomy for Harm 

        Reduction. arXiv. https://arxiv.org/pdf/2210.05791 

        Shevlane, T. et al. (2023) Model evaluation for extreme risks. arXiv.
        https://arxiv.org/pdf/2305.15324 

        Shumailov, I. et al. (2023) The curse of recursion: training on
        generated data makes models forget. arXiv. 

        https://arxiv.org/pdf/2305.17493v2 

        Smith, A. et al. (2023) Hallucination or Confabulation? Neuroanatomy as
        metaphor in Large Language 

        Models. PLOS Digital Health. 

        https://journals.plos.org/digitalhealth/article?id=10.1371/journal.pdig.0000388 

        Soice, E. et al. (2023) Can large language models democratize access to
        dual-use biotechnology? arXiv. 

        https://arxiv.org/abs/2306.03809

SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2

This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-MiniLM-L6-v2
  • Maximum Sequence Length: 256 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

  • Documentation: Sentence Transformers Documentation (https://sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
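
The modules above determine how inputs are handled: text is truncated to 256 tokens, token embeddings are mean-pooled, and the result is L2-normalized into a 384-dimensional vector. A minimal sketch for checking these properties on the loaded model (the model id below is the same placeholder used in the Usage section):

from sentence_transformers import SentenceTransformer

# "sentence_transformers_model_id" is a placeholder; replace it with this model's repo id.
model = SentenceTransformer("sentence_transformers_model_id")

# Inputs longer than the maximum sequence length are truncated before pooling.
print(model.max_seq_length)                      # 256
# Mean pooling followed by L2 normalization yields the sentence embedding dimension.
print(model.get_sentence_embedding_dimension())  # 384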

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub ("sentence_transformers_model_id" is a placeholder; replace it with this model's repo id)
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'What does this text say about data privacy?',
    'information during GAI training and maintenance. \nHuman-AI Configuration; Obscene, \nDegrading, and/or Abusive \nContent; Value Chain and \nComponent Integration; \nDangerous, Violent, or Hateful \nContent \nMS-2.6-002 \nAssess existence or levels of harmful bias, intellectual property infringement, \ndata privacy violations, obscenity, extremism, violence, or CBRN information in \nsystem training data. \nData Privacy; Intellectual Property; \nObscene, Degrading, and/or \nAbusive Content; Harmful Bias and \nHomogenization; Dangerous, \nViolent, or Hateful Content; CBRN \nInformation or Capabilities \nMS-2.6-003 Re-evaluate safety features of fine-tuned models when the negative risk exceeds \norganizational risk tolerance. \nDangerous, Violent, or Hateful \nContent \nMS-2.6-004 Review GAI system outputs for validity and safety: Review generated code to \nassess risks that may arise from unreliable downstream decision-making. \nValue Chain and Component \nIntegration; Dangerous, Violent, or \nHateful Content',
    'Scheurer, J. et al. (2023) Technical report: Large language models can strategically deceive their users \nwhen put under pressure. arXiv. https://arxiv.org/abs/2311.07590 \nShelby, R. et al. (2023) Sociotechnical Harms of Algorithmic Systems: Scoping a Taxonomy for Harm \nReduction. arXiv. https://arxiv.org/pdf/2210.05791 \nShevlane, T. et al. (2023) Model evaluation for extreme risks. arXiv. https://arxiv.org/pdf/2305.15324 \nShumailov, I. et al. (2023) The curse of recursion: training on generated data makes models forget. arXiv. \nhttps://arxiv.org/pdf/2305.17493v2 \nSmith, A. et al. (2023) Hallucination or Confabulation? Neuroanatomy as metaphor in Large Language \nModels. PLOS Digital Health. \nhttps://journals.plos.org/digitalhealth/article?id=10.1371/journal.pdig.0000388 \nSoice, E. et al. (2023) Can large language models democratize access to dual-use biotechnology? arXiv. \nhttps://arxiv.org/abs/2306.03809',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
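
Because the training pairs are question-and-passage pairs, a natural use of this model is retrieval: embed a set of passages once, then rank them against a query. The sketch below assumes the same placeholder model id as above; the passage strings are short illustrative snippets, not part of this repository.

from sentence_transformers import SentenceTransformer

# Placeholder id; replace with this model's repo id.
model = SentenceTransformer("sentence_transformers_model_id")

query = "What does this text say about data privacy?"
passages = [
    "Data Privacy: Impacts due to leakage and unauthorized use, disclosure, or de-anonymization of information.",
    "Information Security: Data poisoning occurs when an adversary compromises a training dataset used by a model.",
    "Intellectual Property: Review training data and remove content that reproduces licensed or trade secret material.",
]

# Encode the query and the passages, then score with the model's similarity function (cosine).
query_embedding = model.encode(query)
passage_embeddings = model.encode(passages)
scores = model.similarity(query_embedding, passage_embeddings)  # shape [1, 3]

# Rank passages by similarity to the query.
best = scores.argmax().item()
print(scores)
print("Best match:", passages[best])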

Training Details

Training Dataset

Unnamed Dataset

  • Size: 555 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 555 samples:
    • sentence_0: string; min: 10 tokens, mean: 11.2 tokens, max: 12 tokens
    • sentence_1: string; min: 156 tokens, mean: 199.37 tokens, max: 256 tokens
  • Samples:
    • sentence_0: What does this text say about trustworthiness?
      sentence_1: other systems.
      Information Integrity; Value Chain and Component Integration
      MP-2.2-002
      Observe and analyze how the GAI system interacts with external networks, and identify any potential for negative externalities, particularly where content provenance might be compromised.
      Information Integrity
      AI Actor Tasks: End Users
      MAP 2.3: Scientific integrity and TEVV considerations are identified and documented, including those related to experimental design, data collection and selection (e.g., availability, representativeness, suitability), system trustworthiness, and construct validation
      Action ID
      Suggested Action
      GAI Risks
      MP-2.3-001
      Assess the accuracy, quality, reliability, and authenticity of GAI output by comparing it to a set of known ground truth data and by using a variety of evaluation methods (e.g., human oversight and automated evaluation, proven cryptographic techniques, review of content inputs).
      Information Integrity
      25
    • sentence_0: What does this text say about unclassified?
      sentence_1: training and TEVV data; Filtering of hate speech or content in GAI system training data; Prevalence of GAI-generated data in GAI system training data.
      Harmful Bias and Homogenization
      15 Winogender Schemas is a sample set of paired sentences which differ only by gender of the pronouns used, which can be used to evaluate gender bias in natural language processing coreference resolution systems.
      37
      MS-2.11-005
      Assess the proportion of synthetic to non-synthetic training data and verify training data is not overly homogenous or GAI-produced to mitigate concerns of model collapse.
      Harmful Bias and Homogenization
      AI Actor Tasks: AI Deployment, AI Impact Assessment, Affected Individuals and Communities, Domain Experts, End-Users, Operation and Monitoring, TEVV
      MEASURE 2.12: Environmental impact and sustainability of AI model training and management activities – as identified in the MAP function – are assessed and documented.
      Action ID
      Suggested Action
      GAI Risks
    • sentence_0: What does this text say about unclassified?
      sentence_1: Padmakumar, V. et al. (2024) Does writing with language models reduce content diversity? ICLR. https://arxiv.org/pdf/2309.05196
      Park, P. et. al. (2024) AI deception: A survey of examples, risks, and potential solutions. Patterns, 5(5). arXiv. https://arxiv.org/pdf/2308.14752
      Partnership on AI (2023) Building a Glossary for Synthetic Media Transparency Methods, Part 1: Indirect Disclosure. https://partnershiponai.org/glossary-for-synthetic-media-transparency-methods-part-1-indirect-disclosure/
      Qu, Y. et al. (2023) Unsafe Diffusion: On the Generation of Unsafe Images and Hateful Memes From Text-To-Image Models. arXiv. https://arxiv.org/pdf/2305.13873
      Rafat, K. et al. (2023) Mitigating carbon footprint for knowledge distillation based deep learning model compression. PLOS One. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0285668
      Said, I. et al. (2022) Nonconsensual Distribution of Intimate Images: Exploring the Role of Legal Attitudes
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
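
For reference, a minimal training sketch consistent with the setup described in this card: 555 question-and-passage pairs in columns sentence_0 and sentence_1, MultipleNegativesRankingLoss with its default scale of 20.0 and cosine similarity, and the batch size and epoch count listed under Training Hyperparameters below. The two example pairs and the output directory are illustrative placeholders, not the actual training data.

from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
    losses,
)

# Base model that was fine-tuned.
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Two-column (question, passage) dataset; the real run used 555 such pairs.
train_dataset = Dataset.from_dict({
    "sentence_0": [
        "What does this text say about data privacy?",
        "What does this text say about risk management?",
    ],
    "sentence_1": [
        "Data Privacy: Impacts due to leakage and unauthorized use ...",
        "The following suggested actions target risks unique to or exacerbated by GAI ...",
    ],
})

# In-batch negatives: every other passage in the batch acts as a negative for a given question.
loss = losses.MultipleNegativesRankingLoss(model)  # scale=20.0, cos_sim by default

args = SentenceTransformerTrainingArguments(
    output_dir="outputs/minilm-finetuned",  # illustrative path
    num_train_epochs=3,
    per_device_train_batch_size=16,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()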
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • multi_dataset_batch_sampler: round_robin

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • eval_use_gather_object: False
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin

Framework Versions

  • Python: 3.11.5
  • Sentence Transformers: 3.1.1
  • Transformers: 4.44.2
  • PyTorch: 2.4.1+cpu
  • Accelerate: 0.34.2
  • Datasets: 3.0.0
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}