metadata
base_model: sentence-transformers/distilbert-base-nli-mean-tokens
datasets: []
language: []
library_name: sentence-transformers
metrics:
  - cosine_accuracy
  - dot_accuracy
  - manhattan_accuracy
  - euclidean_accuracy
  - max_accuracy
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:2400
  - loss:TripletLoss
  - loss:MultipleNegativesRankingLoss
  - loss:CoSENTLoss
widget:
  - source_sentence: Flislegging av hall
    sentences:
      - 'query: tapetsering av rom med grunnflate 4x4.5 meter minus tre dører'
      - 'query: fliser i hall'
      - 'query: fornye markiseduk'
  - source_sentence: Betongskjæring av rømningsvindu
    sentences:
      - Installere ventilasjonssystem
      - Installere nytt vindu i trevegg
      - Skjære ut rømningsvindu i betongvegg
  - source_sentence: Ny garasje leddport
    sentences:
      - Installere garasjeport
      - Bygge ny garasje
      - Legge nytt tak
  - source_sentence: Legge varmefolie i gang og stue.
    sentences:
      - Strø grusveier med salt
      - Legge varmekabler
      - Installere gulvvarme
  - source_sentence: Oppgradere kjeller til boareale
    sentences:
      - Oppussing av kjeller for boligformål
      - elektriker  bolig  120kvm
      - Installere dusjkabinett
model-index:
  - name: >-
      SentenceTransformer based on
      sentence-transformers/distilbert-base-nli-mean-tokens
    results:
      - task:
          type: triplet
          name: Triplet
        dataset:
          name: test triplet evaluation
          type: test-triplet-evaluation
        metrics:
          - type: cosine_accuracy
            value: 0.8111346018322763
            name: Cosine Accuracy
          - type: dot_accuracy
            value: 0.19873150105708245
            name: Dot Accuracy
          - type: manhattan_accuracy
            value: 0.8146582100070472
            name: Manhattan Accuracy
          - type: euclidean_accuracy
            value: 0.8083157152924595
            name: Euclidean Accuracy
          - type: max_accuracy
            value: 0.8146582100070472
            name: Max Accuracy

SentenceTransformer based on sentence-transformers/distilbert-base-nli-mean-tokens

This is a sentence-transformers model fine-tuned from sentence-transformers/distilbert-base-nli-mean-tokens. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: DistilBertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
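
The Pooling module above has only `pooling_mode_mean_tokens` enabled, so a sentence vector is the average of its token vectors. The sketch below illustrates that idea in plain Python on toy 4-dimensional vectors (the real model uses 768 dimensions). The function name `mean_pool` and the explicit attention mask are assumptions made for illustration, not the library's internal code.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    # Guard against an all-padding input to avoid dividing by zero.
    count = max(count, 1)
    return [value / count for value in total]


# Toy example: three token vectors of width 4; the last one is padding.
tokens = [
    [1.0, 2.0, 3.0, 4.0],
    [3.0, 2.0, 1.0, 0.0],
    [9.0, 9.0, 9.0, 9.0],
]
mask = [1, 1, 0]
sentence_vector = mean_pool(tokens, mask)
print(sentence_vector)  # [2.0, 2.0, 2.0, 2.0]
```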

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("ostoveland/test12")
# Run inference
sentences = [
    'Oppgradere kjeller til boareale',
    'Oppussing av kjeller for boligformål',
    'Installere dusjkabinett',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]

Evaluation

Metrics

Triplet

| Metric             | Value  |
|:-------------------|:-------|
| cosine_accuracy    | 0.8111 |
| dot_accuracy       | 0.1987 |
| manhattan_accuracy | 0.8147 |
| euclidean_accuracy | 0.8083 |
| max_accuracy       | 0.8147 |
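
As a rough illustration of what these metrics measure, the sketch below counts a triplet (anchor, positive, negative) as correct when the anchor is closer to the positive than to the negative under each measure. It assumes the common convention that a larger cosine or dot score means "closer"; the evaluator that produced the numbers above may differ in detail, and the toy vectors are made up.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def manhattan(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))


def triplet_accuracy(triplets):
    """Return the fraction of correctly ordered triplets per measure."""
    # For similarities, larger means closer; for distances, smaller means closer.
    checks = {
        "cosine_accuracy": lambda a, p, n: cosine(a, p) > cosine(a, n),
        "dot_accuracy": lambda a, p, n: dot(a, p) > dot(a, n),
        "manhattan_accuracy": lambda a, p, n: manhattan(a, p) < manhattan(a, n),
        "euclidean_accuracy": lambda a, p, n: euclidean(a, p) < euclidean(a, n),
    }
    scores = {
        name: sum(check(a, p, n) for a, p, n in triplets) / len(triplets)
        for name, check in checks.items()
    }
    scores["max_accuracy"] = max(scores.values())
    return scores


# Hypothetical 2-dimensional embeddings as (anchor, positive, negative).
toy_triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),
    ([1.0, 0.0], [0.5, 0.0], [3.0, 3.0]),  # long negative wins on dot product
]
scores = triplet_accuracy(toy_triplets)
print(scores)
```

In the second toy triplet the negative has a large norm, so the dot product ranks it above the positive even though the other three measures do not. Dot-product ranking depends on vector length, so it can disagree with the other measures whenever embeddings are not normalized.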

Training Details

Training Datasets

Unnamed Dataset

  • Size: 800 training samples
  • Columns: sentence_0, sentence_1, and sentence_2
  • Approximate statistics based on the first 1000 samples:
    |         | sentence_0 | sentence_1 | sentence_2 |
    |:--------|:-----------|:-----------|:-----------|
    | type    | string | string | string |
    | details | min: 4 tokens, mean: 12.39 tokens, max: 49 tokens | min: 4 tokens, mean: 9.92 tokens, max: 21 tokens | min: 4 tokens, mean: 8.88 tokens, max: 34 tokens |
  • Samples:
    | sentence_0 | sentence_1 | sentence_2 |
    |:-----------|:-----------|:-----------|
    | Oppussing av stue | Renovere stue | Male stue |
    | Sameie søker vaktmestertjenester | Trenger vaktmester til sameie | Renholdstjenester for sameie |
    | Sprenge og klargjøre til garasje | Grave ut til garasje | Bygge garasje |
  • Loss: TripletLoss with these parameters:
    {
        "distance_metric": "TripletDistanceMetric.EUCLIDEAN",
        "triplet_margin": 5
    }
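
The parameters above correspond to the standard triplet objective max(d(a, p) - d(a, n) + margin, 0) with Euclidean distance d and a margin of 5. The following is a minimal sketch of that formula on made-up 2-dimensional vectors, not the library's batched implementation.

```python
import math


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=5.0):
    """max(d(a, p) - d(a, n) + margin, 0) with Euclidean distance d."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(gap + margin, 0.0)


# Hypothetical embeddings chosen so the distances are whole numbers.
anchor = [0.0, 0.0]
positive = [3.0, 4.0]        # distance 5 from the anchor
close_negative = [0.0, 6.0]  # distance 6: only 1 farther than the positive
far_negative = [12.0, 16.0]  # distance 20: well beyond the margin

print(triplet_loss(anchor, positive, close_negative))  # 4.0
print(triplet_loss(anchor, positive, far_negative))    # 0.0
```

The loss is zero only once the negative is at least `triplet_margin` farther from the anchor than the positive.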
    

Unnamed Dataset

  • Size: 800 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 1000 samples:
    |         | sentence_0 | sentence_1 |
    |:--------|:-----------|:-----------|
    | type    | string | string |
    | details | min: 4 tokens, mean: 13.27 tokens, max: 41 tokens | min: 7 tokens, mean: 14.34 tokens, max: 29 tokens |
  • Samples:
    | sentence_0 | sentence_1 |
    |:-----------|:-----------|
    | Helsparkle rom med totale veggflater på ca 20 m2 | query: helsparkling av rom med 20 m2 veggflater |
    | Reparere skifer tak og tak vindu | query: fikse takvindu og skifertak |
    | Pigge opp flisgulv, fjerne gips vegger og gipstak - 11 kvm | query: fjerne flisgulv, gipsvegger og gipstak på 11 kvm |
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
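
With these parameters, each `sentence_0` is scored against every `sentence_1` in the batch by cosine similarity multiplied by 20, and a cross-entropy loss rewards the matching pair; the other in-batch entries act as negatives. The sketch below shows that idea in plain Python on made-up vectors. The function name `mnr_loss` is hypothetical and this is not the library's implementation.

```python
import math


def cos_sim(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the match for anchors[i]."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, candidate) for candidate in positives]
        # Numerically stable log-sum-exp over all in-batch candidates.
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Hypothetical batch of two pairs.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.1], [0.1, 1.0]]  # each positive points the same way as its anchor
swapped = [[0.1, 1.0], [1.0, 0.1]]  # positives assigned to the wrong anchors

aligned_loss = mnr_loss(anchors, aligned)
swapped_loss = mnr_loss(anchors, swapped)
print(aligned_loss < swapped_loss)  # True
```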
    

Unnamed Dataset

  • Size: 800 training samples
  • Columns: sentence_0, sentence_1, and label
  • Approximate statistics based on the first 1000 samples:
    |         | sentence_0 | sentence_1 | label |
    |:--------|:-----------|:-----------|:------|
    | type    | string | string | float |
    | details | min: 4 tokens, mean: 13.11 tokens, max: 36 tokens | min: 4 tokens, mean: 10.54 tokens, max: 28 tokens | min: 0.1, mean: 0.51, max: 0.95 |
  • Samples:
    | sentence_0 | sentence_1 | label |
    |:-----------|:-----------|:------|
    | Legging av våtromsbelegg | Renovering av bad | 0.65 |
    | overvåkingskamera 3stk | installasjon av 3 overvåkingskameraer | 0.95 |
    | Bytte lamper i portrom | Male portrom | 0.15 |
  • Loss: CoSENTLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "pairwise_cos_sim"
    }
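
CoSENT is a ranking loss over sentence pairs: whenever one pair has a higher gold label than another, it should also receive the higher cosine similarity. A minimal sketch of the formula log(1 + Σ exp(scale · (cos_low − cos_high))) follows. The labels are taken from the three samples above, while the cosine values are invented for illustration.

```python
import math


def cosent_loss(cosines, labels, scale=20.0):
    """log(1 + sum of exp(scale * (cos_low - cos_high))) over ordered pairs.

    A term is added for every two sentence pairs where the first has the
    higher gold label, so the loss grows when the lower-labelled pair
    receives the higher cosine similarity.
    """
    total = 0.0
    for cos_high, label_high in zip(cosines, labels):
        for cos_low, label_low in zip(cosines, labels):
            if label_high > label_low:
                total += math.exp(scale * (cos_low - cos_high))
    return math.log1p(total)


labels = [0.65, 0.95, 0.15]          # gold labels from the samples above
consistent = [0.60, 0.90, 0.10]      # hypothetical cosines in the same order as the labels
inconsistent = [0.10, 0.20, 0.90]    # hypothetical cosines that rank the 0.15 pair highest

consistent_loss = cosent_loss(consistent, labels)
inconsistent_loss = cosent_loss(inconsistent, labels)
print(consistent_loss < inconsistent_loss)  # True
```

Only the ordering of the cosine values matters to this loss, not how close they are to the label values themselves.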
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • num_train_epochs: 1
  • multi_dataset_batch_sampler: round_robin
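
These settings can be related to the number of optimizer steps per epoch. The sketch below assumes that round-robin sampling takes one batch from each dataset in turn and stops as soon as a dataset has no batches left; `round_robin_steps` is a hypothetical helper written for this arithmetic, not a library function.

```python
import math
from itertools import cycle


def round_robin_steps(dataset_sizes, batch_size):
    """Count batches drawn in turn from each dataset until one runs out."""
    remaining = [math.ceil(size / batch_size) for size in dataset_sizes]
    steps = 0
    for index in cycle(range(len(remaining))):
        if remaining[index] == 0:
            break
        remaining[index] -= 1
        steps += 1
    return steps


# Three datasets of 800 samples each, batch size 32, one epoch.
steps = round_robin_steps([800, 800, 800], batch_size=32)
print(steps)  # 75
```

Each dataset yields 800 / 32 = 25 batches, so three datasets give 75 steps, which agrees with the step count reported in the training logs.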

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin

Training Logs

| Epoch | Step | test-triplet-evaluation_max_accuracy |
|:------|:-----|:-------------------------------------|
| 1.0   | 75   | 0.8147                               |

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.0.1
  • Transformers: 4.41.2
  • PyTorch: 2.3.0+cu121
  • Accelerate: 0.31.0
  • Datasets: 2.20.0
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

TripletLoss

@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification}, 
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply}, 
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

CoSENTLoss

@online{kexuefm-8847,
    title={CoSENT: A more efficient sentence vector scheme than Sentence-BERT},
    author={Su Jianlin},
    year={2022},
    month={Jan},
    url={https://kexue.fm/archives/8847},
}