---
base_model: BAAI/bge-base-en-v1.5
datasets: []
language: []
library_name: sentence-transformers
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:1342
- loss:MultipleNegativesRankingLoss
widget:
- source_sentence: What is significant about New Delhi's history?
sentences:
- As of the 2011 India census, Arackal had a population of 16,739 with 7,963 males
and 8,776 females.
- Edappadi K. Palaniswami is an Indian politician. He is the current and 8th Chief
Minister of Tamil Nadu. He is the chief minister since 16 February 2017. Palaniswami
is a senior leader of All India Anna Dravida Munnetra Kazhagam.
- New Delhi () is the capital of India and a union territory of the megacity of
Delhi. It has a very old history and is home to several monuments where the city
is expensive to live in. In traditional Indian geography it falls under the North
Indian zone. The city has an area of about 42.7 km. New Delhi has a population
of about 9.4 Million people.
- source_sentence: What was the significance of the Pandyan kingdom in ancient Tamil
history?
sentences:
- Ektara (literally "one-string", also called iktar, ', yaktaro gopichand) is a
one-string instrument. It is most often used in traditional music from Bangladesh,
India , Egypt, and Pakistan.
- Polygar War or Palayakarar Wars refers to the wars fought between the Polygars
("Palayakarrars") of former Madurai Kingdom in Tamil Nadu, India and the British
colonial forces between March 1799 to May 1802. The British finally won after
carrying out long and difficult protracted jungle campaigns against the Polygar
armies and finally defeated them. Many lives were lost on both sides and the victory
over Polygars made large part of territories of Tamil Nadu coming under British
control enabling them to get a strong hold in India.
- The Pandyan kingdom பாண்டியர் was an ancient Tamil state in South India of unknown
antiquity. Pandyas were one of the three ancient Tamil kingdoms (Chola and Chera
being the other two) who ruled the Tamil country from pre-historic times until
end of the 15th century. They ruled initially from Korkai, a sea port on the southern
most tip of the Indian peninsula, and in later times moved to Madurai.
- source_sentence: Can you tell me about Louis-Frédéric Nussbaum's contributions?
sentences:
- Shipkila is a mountain pass and border post on the Republic of India-People's
Republic of China border. It is through this pass which the river Sutlej enters
India (from the Tibet Autonomous Region).
- Suvra Mukherjee (September 17, 1940 – August 18, 2015) was the First Lady of India
from 2012 until her death in 2015. She was the wife of Indian President Pranab
Mukherjee from 1957 until her death in 2015.
- Louis-Frédéric Nussbaum (1923-1996 ), also known as Louis Frédéric or Louis-Frédéric,
was a French scholar, art historian, writer, translator and editor. He was a specialist
in the cultures of Asia, especially India and Japan.
- source_sentence: What were the original goals of Dravida Kazhagam?
sentences:
- Sir Patrick Geddes (2 October 1854 – 17 April 1932) was a Scottish biologist,
sociologist, geographer, philanthropist and pioneering town planner. He developed
a new urban theory, including the second master plan of Jerusalem in 1919. He
also developed the first master plan of Tel Aviv in 1925 that included the Bauhaus
architecture in the White City of Tel Aviv. His other work was in India during
the period of British India. A small memorial board to Patrick Geddes is under
a bridge of the Heil HaShirion Street in Tel Aviv.
- Dravida Kazhagam (or Dravidar Kazhagam, "Dravidian Organization") was one of the
first Dravidian parties in India. The party was founded by E.V. Ramasamy, also
called Thanthai Periyar. Its original goals were to eradicate the ills of the
existing caste system including untouchability and to obtain a "Dravida Nadu"
(Dravidian nation) from the Madras Presidency i.e., a separate nation from India
for Dravidian people alone.
- Dheeran Chinnamalai ( born as Theerthagiri Sarkkarai Mandraadiyaar [Sarkkarai
Mandraadiyaar Refers Payiran Kulam] or Theerthagiri Gounder on April 17, 1756)
was a Kongu chieftain and Palayakkarar from Tamil Nadu who rose up in revolt against
the British East India Company in the Kongu Nadu, Southern India. He was born
in Melapalayam, near Erode in the South Indian state of Tamil Nadu.
- source_sentence: Can you tell me about the literary contributions of Chattopadhyay?
sentences:
- The Election Commission of India held indirect 2nd presidential elections of India
on May 6, 1957. Dr. Rajendra Prasad won his re-election with 459,698 votes over
his rivals Chowdhry Hari Ram who got 2,672 votes and Nagendra Narayan Das who got
2,000 votes. Rajendra Prasad, has been the only person, to have won and served
two terms, as President of India.
- S. Rajendra Babu (born 1 June 1939) is an Indian judge. He was the 34th Chief
Justice of India from May to June 2004. He also served as the chairperson of National
Human Rights Commission of India.
- Rishi Bankim Chandra Chattopadhyay (27 June 1838 – 8 April 1894) was a Bengali
writer, poet and journalist. He was the composer of India's national song "Vande
Mataram". It was originally a Bengali and Sanskrit "stotra" (hymn) portraying
India as a mother goddess. The song inspired the activists during the Indian Independence
Movement. Chattopadhyay wrote 13 novels. He also wrote several 'serious, serio-comic,
satirical, scientific and critical articles in Bengali. His works were widely
translated into other regional languages of India.
---
# SentenceTransformer based on BAAI/bge-base-en-v1.5
This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
## Model Details
### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) <!-- at revision a5beb1e3e68b9ab74eb54cfd186867f64f240e1a -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->
### Model Sources
- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
### Full Model Architecture
```
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
```
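In plain 🤗 Transformers terms, this stack is CLS-token pooling followed by L2 normalization. Below is a minimal sketch of equivalent inference, assuming the `GenAIGirl/bge-base-finetune-embedder` repository id used in the Usage section; the Sentence Transformers snippet there is the recommended path.
```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

model_id = "GenAIGirl/bge-base-finetune-embedder"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)
model.eval()

sentences = ["What is the origin of Basil?"]
inputs = tokenizer(sentences, padding=True, truncation=True, max_length=512, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# CLS pooling: take the hidden state of the first ([CLS]) token
embeddings = outputs.last_hidden_state[:, 0]
# L2-normalize, mirroring the Normalize() module above
embeddings = F.normalize(embeddings, p=2, dim=1)
print(embeddings.shape)  # torch.Size([1, 768])
```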
## Usage
### Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("GenAIGirl/bge-base-finetune-embedder")
# Run inference
sentences = [
'Can you tell me about the literary contributions of Chattopadhyay?',
'Rishi Bankim Chandra Chattopadhyay (27 June 1838 – 8 April 1894) was a Bengali writer, poet and journalist. He was the composer of India\'s national song "Vande Mataram". It was originally a Bengali and Sanskrit "stotra" (hymn) portraying India as a mother goddess. The song inspired the activists during the Indian Independence Movement. Chattopadhyay wrote 13 novels. He also wrote several \'serious, serio-comic, satirical, scientific and critical articles in Bengali. His works were widely translated into other regional languages of India.',
'S. Rajendra Babu (born 1 June 1939) is an Indian judge. He was the 34th Chief Justice of India from May to June 2004. He also served as the chairperson of National Human Rights Commission of India.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
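Since the model was trained on question/context pairs, a common pattern is to embed a question and a set of candidate passages, then rank the passages by cosine similarity. A minimal sketch (the passages are illustrative, taken from the examples above):
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("GenAIGirl/bge-base-finetune-embedder")

query = "Can you tell me about the literary contributions of Chattopadhyay?"
passages = [
    'Rishi Bankim Chandra Chattopadhyay (27 June 1838 – 8 April 1894) was a Bengali writer, poet and journalist. He was the composer of India\'s national song "Vande Mataram".',
    "S. Rajendra Babu (born 1 June 1939) is an Indian judge. He was the 34th Chief Justice of India from May to June 2004.",
]

query_embedding = model.encode(query)
passage_embeddings = model.encode(passages)

# Rank passages by cosine similarity to the query
scores = model.similarity(query_embedding, passage_embeddings)[0]
best = int(scores.argmax())
print(f"Best match ({scores[best]:.4f}): {passages[best]}")
```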
<!--
### Direct Usage (Transformers)
<details><summary>Click to see the direct usage in Transformers</summary>
</details>
-->
<!--
### Downstream Usage (Sentence Transformers)
You can finetune this model on your own dataset.
<details><summary>Click to expand</summary>
</details>
-->
<!--
### Out-of-Scope Use
*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->
<!--
## Bias, Risks and Limitations
*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->
<!--
### Recommendations
*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->
## Training Details
### Training Dataset
#### Unnamed Dataset
* Size: 1,342 training samples
* Columns: <code>question</code> and <code>context</code>
* Approximate statistics based on the first 1000 samples:
| | question | context |
|:--------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|
| type | string | string |
| details | <ul><li>min: 6 tokens</li><li>mean: 12.49 tokens</li><li>max: 27 tokens</li></ul> | <ul><li>min: 9 tokens</li><li>mean: 83.95 tokens</li><li>max: 510 tokens</li></ul> |
* Samples:
| question | context |
|:--------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| <code>What is the origin of Basil?</code> | <code>Basil ("Ocimum basilicum") ( or ) is a plant of the Family Lamiaceae. It is also known as Sweet Basil or Tulsi. It is a tender low-growing herb that is grown as a perennial in warm, tropical climates. Basil is originally native to India and other tropical regions of Asia. It has been cultivated there for more than 5,000 years. It is prominently featured in many cuisines throughout the world. Some of them are Italian, Thai, Vietnamese and Laotian cuisines. It grows to between 30–60 cm tall. It has light green, silky leaves 3–5 cm long and 1–3 cm broad. The leaves are opposite each other. The flowers are quite big. They are white in color and arranged as a spike.</code> |
| <code>In which cuisines is Basil prominently featured?</code> | <code>Basil ("Ocimum basilicum") ( or ) is a plant of the Family Lamiaceae. It is also known as Sweet Basil or Tulsi. It is a tender low-growing herb that is grown as a perennial in warm, tropical climates. Basil is originally native to India and other tropical regions of Asia. It has been cultivated there for more than 5,000 years. It is prominently featured in many cuisines throughout the world. Some of them are Italian, Thai, Vietnamese and Laotian cuisines. It grows to between 30–60 cm tall. It has light green, silky leaves 3–5 cm long and 1–3 cm broad. The leaves are opposite each other. The flowers are quite big. They are white in color and arranged as a spike.</code> |
| <code>What is the significance of the Roerich Pact?</code> | <code>The Roerich Pact is a treaty on Protection of Artistic and Scientific Institutions and Historic Monuments, signed by the representatives of 21 states in the Oval Office of the White House on 15 April 1935. As of January 1, 1990, the Roerich Pact had been ratified by ten nations: Brazil, Chile, Colombia, Cuba, the Dominican Republic, El Salvador, Guatemala, Mexico, the United States, and Venezuela. It went into effect on 26 August 1935. The Government of India approved the Treaty in 1948, but did not take any further formal action. The Roerich Pact is also known as "Pax Cultura" ("Cultural Peace" or "Peace through Culture"). The most important part of the Roerich Pact is the legal recognition that the protection of culture is always more important than any military necessity.</code> |
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
```json
{
"scale": 20.0,
"similarity_fct": "cos_sim"
}
```
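For reference, these parameters correspond to instantiating the loss as follows, where each (question, context) pair treats the other in-batch contexts as negatives (a minimal sketch; `model` is the SentenceTransformer being finetuned):
```python
from sentence_transformers import SentenceTransformer, losses, util

model = SentenceTransformer("BAAI/bge-base-en-v1.5")
loss = losses.MultipleNegativesRankingLoss(
    model=model,
    scale=20.0,                   # temperature applied to the similarity scores
    similarity_fct=util.cos_sim,  # cosine similarity between question and context embeddings
)
```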
### Evaluation Dataset
#### Unnamed Dataset
* Size: 100 evaluation samples
* Columns: <code>question</code> and <code>context</code>
* Approximate statistics based on the first 1000 samples:
| | question | context |
|:--------|:----------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|
| type | string | string |
| details | <ul><li>min: 7 tokens</li><li>mean: 12.37 tokens</li><li>max: 18 tokens</li></ul> | <ul><li>min: 18 tokens</li><li>mean: 72.93 tokens</li><li>max: 228 tokens</li></ul> |
* Samples:
| question | context |
|:------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| <code>What role did Suvra Mukherjee hold in India?</code> | <code>Suvra Mukherjee (September 17, 1940 – August 18, 2015) was the First Lady of India from 2012 until her death in 2015. She was the wife of Indian President Pranab Mukherjee from 1957 until her death in 2015.</code> |
| <code>What political party is Edappadi K. Palaniswami associated with?</code> | <code>Edappadi K. Palaniswami is an Indian politician. He is the current and 8th Chief Minister of Tamil Nadu. He is the chief minister since 16 February 2017. Palaniswami is a senior leader of All India Anna Dravida Munnetra Kazhagam.</code> |
| <code>Where are Tibetan antelopes primarily found?</code> | <code>Tibetan antelope, also known as Chiru is a medium sized antelope most closely related to wild goats and sheep of the subfamily Caprinae. Tibetan antelope are native to northwest India and Tibet. They live on the treeless Steppe above . They are an endangered species. They are a target for hunters for their fine underfur called chiru. It is used to make luxury shawls. It takes about four animals to make a single shawl. In order to collect the chiru, the animals must be killed. Because of this the Chiru are close to extinction.</code> |
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
```json
{
"scale": 20.0,
"similarity_fct": "cos_sim"
}
```
### Training Hyperparameters
#### Non-Default Hyperparameters
- `eval_strategy`: steps
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `learning_rate`: 3e-06
- `max_steps`: 166
- `warmup_ratio`: 0.1
- `fp16`: True
- `batch_sampler`: no_duplicates
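These non-default values map onto a `SentenceTransformerTrainingArguments`/`SentenceTransformerTrainer` setup roughly as sketched below. This is a reconstruction, not the exact training script: the output directory and the placeholder rows are illustrative, and the real datasets hold 1,342 training and 100 evaluation (question, context) pairs.
```python
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
    losses,
)
from sentence_transformers.training_args import BatchSamplers

model = SentenceTransformer("BAAI/bge-base-en-v1.5")
loss = losses.MultipleNegativesRankingLoss(model, scale=20.0)

# Placeholder data; the real sets contain 1,342 train and 100 eval pairs.
train_dataset = Dataset.from_dict({
    "question": ["What is the origin of Basil?"],
    "context": ["Basil is originally native to India and other tropical regions of Asia."],
})
eval_dataset = Dataset.from_dict({
    "question": ["What role did Suvra Mukherjee hold in India?"],
    "context": ["Suvra Mukherjee was the First Lady of India from 2012 until her death in 2015."],
})

args = SentenceTransformerTrainingArguments(
    output_dir="bge-base-finetune-embedder",  # illustrative
    eval_strategy="steps",
    eval_steps=20,                            # inferred from the evaluation cadence in the training logs
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    learning_rate=3e-6,
    max_steps=166,
    warmup_ratio=0.1,
    fp16=True,
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    loss=loss,
)
trainer.train()
```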
#### All Hyperparameters
<details><summary>Click to expand</summary>
- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 3e-06
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 3.0
- `max_steps`: 166
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: True
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional
</details>
### Training Logs
| Epoch | Step | Training Loss | Validation Loss |
|:------:|:----:|:-------------:|:---------------:|
| 0.2381 | 20 | 0.1734 | 0.0589 |
| 0.4762 | 40 | 0.0827 | 0.0477 |
| 0.7143 | 60 | 0.0737 | 0.0474 |
| 0.9524 | 80 | 0.0451 | 0.0465 |
| 1.1905 | 100 | 0.0569 | 0.0416 |
| 1.4286 | 120 | 0.0431 | 0.0407 |
| 1.6667 | 140 | 0.0300 | 0.0406 |
| 1.9048 | 160 | 0.0389 | 0.0405 |
### Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.0.1
- Transformers: 4.45.1
- PyTorch: 2.2.0+cu121
- Accelerate: 0.34.2
- Datasets: 2.20.0
- Tokenizers: 0.20.0
## Citation
### BibTeX
#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
```
#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```
<!--
## Glossary
*Clearly define terms in order to be accessible across audiences.*
-->
<!--
## Model Card Authors
*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->
<!--
## Model Card Contact
*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->