metadata

base_model: BAAI/bge-base-en-v1.5
datasets: []
language: []
library_name: sentence-transformers
metrics:
  - cosine_accuracy
  - dot_accuracy
  - manhattan_accuracy
  - euclidean_accuracy
  - max_accuracy
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:317521
  - loss:TripletLoss
widget:
  - source_sentence: >-
      Write a function to extract every specified element from a given two
      dimensional list.
    sentences:
      - "def nCr_mod_p(n, r, p): \r\n\tif (r > n- r): \r\n\t\tr = n - r \r\n\tC = [0 for i in range(r + 1)] \r\n\tC[0] = 1 \r\n\tfor i in range(1, n + 1): \r\n\t\tfor j in range(min(i, r), 0, -1): \r\n\t\t\tC[j] = (C[j] + C[j-1]) % p \r\n\treturn C[r] "
      - "import cmath\r\ndef len_complex(a,b):\r\n  cn=complex(a,b)\r\n  length=abs(cn)\r\n  return length"
      - "def specified_element(nums, N):\r\n    result = [i[N] for i in nums]\r\n    return result"
  - source_sentence: >-
      Write a python function to find the kth element in an array containing odd
      elements first and then even elements.
    sentences:
      - "def get_Number(n, k): \r\n    arr = [0] * n; \r\n    i = 0; \r\n    odd = 1; \r\n    while (odd <= n):   \r\n        arr[i] = odd; \r\n        i += 1; \r\n        odd += 2;\r\n    even = 2; \r\n    while (even <= n): \r\n        arr[i] = even; \r\n        i += 1;\r\n        even += 2; \r\n    return arr[k - 1]; "
      - "def sort_matrix(M):\r\n    result = sorted(M, key=sum)\r\n    return result"
      - "INT_BITS = 32\r\ndef left_Rotate(n,d):   \r\n    return (n << d)|(n >> (INT_BITS - d))  "
  - source_sentence: >-
      Write a function to remove all the words with k length in the given
      string.
    sentences:
      - "def remove_tuples(test_list, K):\r\n  res = [ele for ele in test_list if len(ele) != K]\r\n  return (res) "
      - "def is_Sub_Array(A,B,n,m): \r\n    i = 0; j = 0; \r\n    while (i < n and j < m):  \r\n        if (A[i] == B[j]): \r\n            i += 1; \r\n            j += 1; \r\n            if (j == m): \r\n                return True;  \r\n        else: \r\n            i = i - j + 1; \r\n            j = 0;       \r\n    return False; "
      - "def remove_length(test_str, K):\r\n  temp = test_str.split()\r\n  res = [ele for ele in temp if len(ele) != K]\r\n  res = ' '.join(res)\r\n  return (res) "
  - source_sentence: >-
      Write a function to find the occurence of characters 'std' in the given
      string 1. list item 1. list item 1. list item 2. list item 2. list item 2.
      list item
    sentences:
      - "def magic_square_test(my_matrix):\r\n    iSize = len(my_matrix[0])\r\n    sum_list = []\r\n    sum_list.extend([sum (lines) for lines in my_matrix])   \r\n    for col in range(iSize):\r\n        sum_list.append(sum(row[col] for row in my_matrix))\r\n    result1 = 0\r\n    for i in range(0,iSize):\r\n        result1 +=my_matrix[i][i]\r\n    sum_list.append(result1)      \r\n    result2 = 0\r\n    for i in range(iSize-1,-1,-1):\r\n        result2 +=my_matrix[i][i]\r\n    sum_list.append(result2)\r\n    if len(set(sum_list))>1:\r\n        return False\r\n    return True"
      - "def count_occurance(s):\r\n  count=0\r\n  for i in range(len(s)):\r\n    if (s[i]== 's' and s[i+1]=='t' and s[i+2]== 'd'):\r\n      count = count + 1\r\n  return count"
      - "def power(a,b):\r\n\tif b==0:\r\n\t\treturn 1\r\n\telif a==0:\r\n\t\treturn 0\r\n\telif b==1:\r\n\t\treturn a\r\n\telse:\r\n\t\treturn a*power(a,b-1)"
  - source_sentence: Write a function to find sum and average of first n natural numbers.
    sentences:
      - "def long_words(n, str):\r\n    word_len = []\r\n    txt = str.split(\" \")\r\n    for x in txt:\r\n        if len(x) > n:\r\n            word_len.append(x)\r\n    return word_len\t"
      - "def long_words(n, str):\r\n    word_len = []\r\n    txt = str.split(\" \")\r\n    for x in txt:\r\n        if len(x) > n:\r\n            word_len.append(x)\r\n    return word_len\t"
      - "def sum_average(number):\r\n total = 0\r\n for value in range(1, number + 1):\r\n    total = total + value\r\n average = total / number\r\n return (total,average)"
model-index:
  - name: SentenceTransformer based on BAAI/bge-base-en-v1.5
    results:
      - task:
          type: triplet
          name: Triplet
        dataset:
          name: sts dev
          type: sts-dev
        metrics:
          - type: cosine_accuracy
            value: 0.997141408425864
            name: Cosine Accuracy
          - type: dot_accuracy
            value: 0.0028145001873883936
            name: Dot Accuracy
          - type: manhattan_accuracy
            value: 0.99605382088609
            name: Manhattan Accuracy
          - type: euclidean_accuracy
            value: 0.997141408425864
            name: Euclidean Accuracy
          - type: max_accuracy
            value: 0.997141408425864
            name: Max Accuracy

SentenceTransformer based on BAAI/bge-base-en-v1.5

This is a sentence-transformers model fine-tuned from BAAI/bge-base-en-v1.5. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Type: Sentence Transformer
Base model: BAAI/bge-base-en-v1.5
Maximum Sequence Length: 512 tokens
Output Dimensionality: 768 dimensions
Similarity Function: Cosine Similarity

Model Sources

Documentation: Sentence Transformers Documentation
Repository: Sentence Transformers on GitHub
Hugging Face: Sentence Transformers on Hugging Face

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
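The three modules listed above amount to: take the vector at the first ([CLS]) token position, then scale it to unit length. A minimal pure-Python sketch of the last two steps follows, assuming toy 2-dimensional token vectors in place of the real 768-dimensional BertModel outputs; the helper names are illustrative and are not part of the library.

```python
import math

def cls_pool(token_embeddings):
    """Pooling with pooling_mode_cls_token=True: keep only the first token's vector."""
    return list(token_embeddings[0])

def l2_normalize(vec):
    """Normalize(): scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Toy token embeddings (hypothetical values): one row per token, [CLS] first.
tokens_a = [[3.0, 4.0], [1.0, 0.0]]
tokens_b = [[4.0, 3.0], [0.0, 1.0]]

emb_a = l2_normalize(cls_pool(tokens_a))
emb_b = l2_normalize(cls_pool(tokens_b))

# For unit-length vectors, the dot product is the cosine similarity.
cosine = dot(emb_a, emb_b)
print(cosine)
```

Because the output is normalized, cosine similarity and dot product coincide on these embeddings, which is why cosine similarity is a natural choice of similarity function here.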

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Nutanix/bge-base-mbpp")
# Run inference
sentences = [
    'Write a function to find sum and average of first n natural numbers.',
    'def sum_average(number):\r\n total = 0\r\n for value in range(1, number + 1):\r\n    total = total + value\r\n average = total / number\r\n return (total,average)',
    'def long_words(n, str):\r\n    word_len = []\r\n    txt = str.split(" ")\r\n    for x in txt:\r\n        if len(x) > n:\r\n            word_len.append(x)\r\n    return word_len\t',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
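Once similarity scores are available, picking the best code snippet for a description is a sort by score. The sketch below assumes hypothetical scores for the two snippets from the example above; in practice they could be read from the first row of the similarity matrix (for instance `similarities[0, 1:]`). The `rank_candidates` helper is illustrative and is not part of Sentence Transformers.

```python
def rank_candidates(scores, candidates):
    """Return (candidate, score) pairs sorted from most to least similar."""
    return sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)

# Hypothetical similarity scores of the query against each candidate snippet.
candidates = ["long_words", "sum_average"]
scores = [0.11, 0.82]

ranked = rank_candidates(scores, candidates)
best_name, best_score = ranked[0]
print(best_name, best_score)
```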

Evaluation

Metrics

Triplet

Dataset: sts-dev
Evaluated with TripletEvaluator

Metric	Value
cosine_accuracy	0.9971
dot_accuracy	0.0028
manhattan_accuracy	0.9961
euclidean_accuracy	0.9971
max_accuracy	0.9971
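For reading the table: a triplet accuracy is usually the fraction of (anchor, positive, negative) triplets in which the anchor is closer to the positive than to the negative under the given distance. The sketch below illustrates that definition on toy 2-dimensional vectors; it is an assumption about the general technique, not a copy of the TripletEvaluator implementation, and the data are made up.

```python
import math

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cosine_distance(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def triplet_accuracy(triplets, distance):
    """Fraction of triplets where d(anchor, positive) < d(anchor, negative)."""
    correct = sum(1 for a, p, n in triplets if distance(a, p) < distance(a, n))
    return correct / len(triplets)

# Toy triplets (hypothetical): the first is ranked correctly, the second is not.
triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),
    ([0.0, 1.0], [1.0, 0.0], [0.1, 0.9]),
]

acc_euclidean = triplet_accuracy(triplets, euclidean)
acc_cosine = triplet_accuracy(triplets, cosine_distance)
print(acc_euclidean, acc_cosine)
```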

Training Details

Training Hyperparameters

Non-Default Hyperparameters

per_device_train_batch_size: 16
per_device_eval_batch_size: 16
num_train_epochs: 1
bf16: True
batch_sampler: no_duplicates
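The model was trained with TripletLoss (see the `loss:TripletLoss` tag). As a rough illustration of what that objective computes per triplet, the sketch below assumes Euclidean distance and a margin of 5, which are the Sentence Transformers defaults as far as I know; this card does not state the values actually used, so treat both as assumptions.

```python
import math

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=5.0):
    """max(d(anchor, positive) - d(anchor, negative) + margin, 0).

    margin=5.0 and Euclidean distance are assumed, not confirmed by this card.
    """
    return max(euclidean(anchor, positive) - euclidean(anchor, negative) + margin, 0.0)

# Toy vectors (hypothetical, not model outputs).
hard = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])   # positive far, negative near
easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [6.0, 8.0])   # positive near, negative far
print(hard, easy)
```

Under these assumptions, unit-normalized embeddings are at most a distance of 2 apart, so the per-triplet loss could not fall below 3. That would be consistent with the training losses in the logs staying above 3.6.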

All Hyperparameters

overwrite_output_dir: False
do_predict: False
prediction_loss_only: True
per_device_train_batch_size: 16
per_device_eval_batch_size: 16
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 1
eval_accumulation_steps: None
learning_rate: 5e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1.0
num_train_epochs: 1
max_steps: -1
lr_scheduler_type: linear
lr_scheduler_kwargs: {}
warmup_ratio: 0
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: True
fp16: False
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'gradient_accumulation_kwargs': None}
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: False
hub_always_push: False
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
dispatch_batches: None
split_batches: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_sampler: no_duplicates
multi_dataset_batch_sampler: proportional
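With `lr_scheduler_type: linear`, `learning_rate: 5e-05`, and no warmup, the learning rate should decay in a straight line from 5e-05 to 0 over the run. A minimal sketch of that schedule follows, assuming the 19846 total steps shown in the training logs; `linear_lr` is an illustrative helper, not a library function.

```python
def linear_lr(step, base_lr=5e-05, total_steps=19846, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

lr_start = linear_lr(0)         # full learning rate at the first step
lr_half = linear_lr(9923)       # halfway through the epoch
lr_end = linear_lr(19846)       # decayed to zero at the last step
print(lr_start, lr_half, lr_end)
```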

Training Logs

Epoch	Step	Training Loss	sts-dev_max_accuracy
0.0050	100	4.3364	-
0.0101	200	4.122	-
0.0151	300	4.0825	-
0.0202	400	4.0381	-
0.0252	500	4.015	-
0.0302	600	3.9996	-
0.0353	700	3.9567	-
0.0403	800	3.9593	-
0.0453	900	3.9456	-
0.0504	1000	3.938	-
0.0554	1100	3.933	-
0.0605	1200	3.905	-
0.0655	1300	3.906	-
0.0705	1400	3.9073	-
0.0756	1500	3.9193	-
0.0806	1600	3.9016	-
0.0857	1700	3.8899	-
0.0907	1800	3.9	-
0.0957	1900	3.8983	-
0.1008	2000	3.876	-
0.1058	2100	3.9001	-
0.1109	2200	3.8818	-
0.1159	2300	3.8788	-
0.1209	2400	3.8815	-
0.1260	2500	3.8664	-
0.1310	2600	3.854	-
0.1360	2700	3.8674	-
0.1411	2800	3.8525	-
0.1461	2900	3.8733	-
0.1512	3000	3.8538	-
0.1562	3100	3.8348	-
0.1612	3200	3.8378	-
0.1663	3300	3.8504	-
0.1713	3400	3.8409	-
0.1764	3500	3.8436	-
0.1814	3600	3.8422	-
0.1864	3700	3.8629	-
0.1915	3800	3.8589	-
0.1965	3900	3.8572	-
0.2016	4000	3.8309	-
0.2066	4100	3.8465	-
0.2116	4200	3.8311	-
0.2167	4300	3.8124	-
0.2217	4400	3.8412	-
0.2267	4500	3.8228	-
0.2318	4600	3.8012	-
0.2368	4700	3.8185	-
0.2419	4800	3.8242	-
0.2469	4900	3.7917	-
0.2519	5000	3.8022	-
0.2570	5100	3.7991	-
0.2620	5200	3.7943	-
0.2671	5300	3.7874	-
0.2721	5400	3.7987	-
0.2771	5500	3.7982	-
0.2822	5600	3.7789	-
0.2872	5700	3.7837	-
0.2923	5800	3.7762	-
0.2973	5900	3.7854	-
0.3023	6000	3.7719	-
0.3074	6100	3.7925	-
0.3124	6200	3.7795	-
0.3174	6300	3.7725	-
0.3225	6400	3.7897	-
0.3275	6500	3.773	-
0.3326	6600	3.7803	-
0.3376	6700	3.7476	-
0.3426	6800	3.7585	-
0.3477	6900	3.7426	-
0.3527	7000	3.7529	-
0.3578	7100	3.7745	-
0.3628	7200	3.7771	-
0.3678	7300	3.7598	-
0.3729	7400	3.7428	-
0.3779	7500	3.7409	-
0.3829	7600	3.7569	-
0.3880	7700	3.7517	-
0.3930	7800	3.7484	-
0.3981	7900	3.7415	-
0.4031	8000	3.7228	-
0.4081	8100	3.7569	-
0.4132	8200	3.7421	-
0.4182	8300	3.7233	-
0.4233	8400	3.72	-
0.4283	8500	3.7431	-
0.4333	8600	3.7258	-
0.4384	8700	3.73	-
0.4434	8800	3.7286	-
0.4485	8900	3.7487	-
0.4535	9000	3.7359	-
0.4585	9100	3.7387	-
0.4636	9200	3.7135	-
0.4686	9300	3.7219	-
0.4736	9400	3.7189	-
0.4787	9500	3.7234	-
0.4837	9600	3.7333	-
0.4888	9700	3.7027	-
0.4938	9800	3.7358	-
0.4988	9900	3.6959	-
0.5039	10000	3.7051	-
0.5089	10100	3.7205	-
0.5140	10200	3.711	-
0.5190	10300	3.6898	-
0.5240	10400	3.7103	-
0.5291	10500	3.695	-
0.5341	10600	3.7108	-
0.5392	10700	3.7226	-
0.5442	10800	3.7004	-
0.5492	10900	3.736	-
0.5543	11000	3.7135	-
0.5593	11100	3.7148	-
0.5643	11200	3.7285	-
0.5694	11300	3.694	-
0.5744	11400	3.6913	-
0.5795	11500	3.69	-
0.5845	11600	3.7249	-
0.5895	11700	3.6907	-
0.5946	11800	3.7135	-
0.5996	11900	3.7172	-
0.6047	12000	3.7087	-
0.6097	12100	3.7045	-
0.6147	12200	3.7043	-
0.6198	12300	3.693	-
0.6248	12400	3.6982	-
0.6298	12500	3.6922	-
0.6349	12600	3.6857	-
0.6399	12700	3.6834	-
0.6450	12800	3.7052	-
0.6500	12900	3.6935	-
0.6550	13000	3.6736	-
0.6601	13100	3.7026	-
0.6651	13200	3.6846	-
0.6702	13300	3.704	-
0.6752	13400	3.6818	-
0.6802	13500	3.7075	-
0.6853	13600	3.6688	-
0.6903	13700	3.6933	-
0.6954	13800	3.6971	-
0.7004	13900	3.6785	-
0.7054	14000	3.7088	-
0.7105	14100	3.7127	-
0.7155	14200	3.6996	-
0.7205	14300	3.6901	-
0.7256	14400	3.6914	-
0.7306	14500	3.6659	-
0.7357	14600	3.6859	-
0.7407	14700	3.68	-
0.7457	14800	3.6874	-
0.7508	14900	3.6854	-
0.7558	15000	3.671	-
0.7609	15100	3.6909	-
0.7659	15200	3.7014	-
0.7709	15300	3.6828	-
0.7760	15400	3.6773	-
0.7810	15500	3.6863	-
0.7861	15600	3.6892	-
0.7911	15700	3.6864	-
0.7961	15800	3.6586	-
0.8012	15900	3.6639	-
0.8062	16000	3.6843	-
0.8112	16100	3.6865	-
0.8163	16200	3.678	-
0.8213	16300	3.6825	-
0.8264	16400	3.7068	-
0.8314	16500	3.6886	-
0.8364	16600	3.6905	-
0.8415	16700	3.6905	-
0.8465	16800	3.6677	-
0.8516	16900	3.684	-
0.8566	17000	3.6872	-
0.8616	17100	3.6849	-
0.8667	17200	3.662	-
0.8717	17300	3.6887	-
0.8768	17400	3.6999	-
0.8818	17500	3.6916	-
0.8868	17600	3.6853	-
0.8919	17700	3.6971	-
0.8969	17800	3.6846	-
0.9019	17900	3.6701	-
0.9070	18000	3.6911	-
0.9120	18100	3.7021	-
0.9171	18200	3.6851	-
0.9221	18300	3.6924	-
0.9271	18400	3.6644	-
0.9322	18500	3.6674	-
0.9372	18600	3.6962	-
0.9423	18700	3.6759	-
0.9473	18800	3.6839	-
0.9523	18900	3.6822	-
0.9574	19000	3.6947	-
0.9624	19100	3.6589	-
0.9674	19200	3.6817	-
0.9725	19300	3.6754	-
0.9775	19400	3.6947	-
0.9826	19500	3.6785	-
0.9876	19600	3.6776	-
0.9926	19700	3.6791	-
0.9977	19800	3.6795	-
1.0	19846	-	0.9971
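The final step count can be cross-checked against the card's own numbers. The sketch below assumes a single training device, which the card does not state; the other inputs (`dataset_size:317521`, `per_device_train_batch_size: 16`, `gradient_accumulation_steps: 1`, `dataloader_drop_last: False`) are taken from the metadata and hyperparameters.

```python
import math

dataset_size = 317521   # from the dataset_size tag
batch_size = 16         # per_device_train_batch_size; one device assumed

# The last partial batch is kept (dataloader_drop_last: False), so round up.
steps_per_epoch = math.ceil(dataset_size / batch_size)

# The Epoch column is the step index divided by the steps per epoch.
epoch_at_step_100 = round(100 / steps_per_epoch, 4)
epoch_at_step_200 = round(200 / steps_per_epoch, 4)
print(steps_per_epoch, epoch_at_step_100, epoch_at_step_200)
```

The result, 19846 steps, matches the last row of the log, and the computed epoch fractions agree with the first two rows.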

Framework Versions

Python: 3.10.14
Sentence Transformers: 3.0.1
Transformers: 4.40.0
PyTorch: 2.3.0+cu121
Accelerate: 0.33.0
Datasets: 2.20.0
Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

TripletLoss

@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification}, 
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}