SentenceTransformer based on Qwen/Qwen2.5-0.5B-Instruct

This is a sentence-transformers model finetuned from Qwen/Qwen2.5-0.5B-Instruct. It maps sentences and paragraphs to an 896-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: Qwen/Qwen2.5-0.5B-Instruct
  • Maximum Sequence Length: 1024 tokens
  • Output Dimensionality: 896 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 1024, 'do_lower_case': False}) with Transformer model: Qwen2Model 
  (1): Pooling({'word_embedding_dimension': 896, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
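
The Pooling module above has only pooling_mode_mean_tokens enabled, so a sentence vector is the average of the token vectors produced by Qwen2Model. The following is a minimal sketch of that idea in plain Python, not the library's implementation; the helper name mean_pool and the toy 2-dimensional vectors are illustrative assumptions (the real model uses 896 dimensions).

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose mask entry is 1, ignoring padding."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [value / count for value in total]


# Toy input: three tokens of dimension 2; the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
sentence_vector = mean_pool(tokens, mask)
print(sentence_vector)  # [2.0, 3.0]
```

Because the padding token is masked out, its large values do not affect the result.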

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("AlexWortega/qwen7k")
# Run inference
sentences = [
    'When was ABC formed?',
    "American Broadcasting Company\nABC launched as a radio network on October 12, 1943, serving as the successor to the NBC Blue Network, which had been purchased by Edward J. Noble. It extended its operations to television in 1948, following in the footsteps of established broadcast networks CBS and NBC. In the mid-1950s, ABC merged with United Paramount Theatres, a chain of movie theaters that formerly operated as a subsidiary of Paramount Pictures. Leonard Goldenson, who had been the head of UPT, made the new television network profitable by helping develop and greenlight many successful series. In the 1980s, after purchasing an 80% interest in cable sports channel ESPN, the network's corporate parent, American Broadcasting Companies, Inc., merged with Capital Cities Communications, owner of several print publications, and television and radio stations. In 1996, most of Capital Cities/ABC's assets were purchased by The Walt Disney Company.",
    'Americans Battling Communism\nAmericans Battling Communism, Inc. (ABC) was an anti-communist organization created following an October 1947 speech by Pennsylvania Judge Blair Gunther that called for an "ABC movement" to educate America about communism. Chartered in November 1947 by Harry Alan Sherman, a local lawyer active in various anti-communist organizations, the group took part in such activities as blacklisting by disclosing the names of people suspected of being communists. Its members included local judges and lawyers active in the McCarthy-era prosecution of communists.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 896): one 896-dimensional vector per input text

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3]): pairwise cosine similarities
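
In the example above, the first text is a query and the other two are candidate passages, so the first row of the similarity matrix can be read as a retrieval ranking. The sketch below shows the underlying cosine-similarity computation and a ranking step in plain Python. The function names and the toy 2-dimensional vectors are illustrative assumptions and are not part of the sentence-transformers API.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_passages(query_vec: Sequence[float], passage_vecs: Sequence[Sequence[float]]) -> List[int]:
    """Return passage indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, p) for p in passage_vecs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Toy vectors standing in for real 896-dimensional embeddings.
query = [1.0, 0.0]
passages = [[0.0, 1.0], [2.0, 0.1], [1.0, 1.0]]
order = rank_passages(query, passages)
print(order)  # [1, 2, 0]
```

With real embeddings, the same ranking could be read from similarities[0, 1:] in the snippet above.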

Evaluation

Metrics

Semantic Similarity

Metric           sts-dev-896  sts-dev-768
pearson_cosine   0.7619       0.7599
spearman_cosine  0.7686       0.7672
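
The spearman_cosine metric is the Spearman rank correlation between the model's cosine similarity scores and the gold similarity labels. A minimal sketch of that computation in plain Python follows; the toy scores and labels are made up for illustration and are not taken from the evaluation set.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical cosine scores and gold labels for four sentence pairs.
predicted = [0.10, 0.40, 0.35, 0.90]
gold = [0.0, 2.5, 3.0, 5.0]
print(round(spearman(predicted, gold), 4))  # 0.8
```

pearson_cosine is the same comparison without the ranking step, so it is sensitive to the scale of the scores as well as their order.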

Training Details

Training Dataset

Unnamed Dataset

  • Size: 1,077,240 training samples
  • Columns: query, response, and negative
  • Approximate statistics based on the first 1000 samples:
    column    type    min tokens  mean tokens  max tokens
    query     string  4           8.76         26
    response  string  23          141.88       532
    negative  string  4           134.02       472
  • Samples:
    Sample 1
      query: Was there a year 0?
      response: Year zero
        Year zero does not exist in the anno Domini system usually used to number years in the Gregorian calendar and in its predecessor, the Julian calendar. In this system, the year 1 BC is followed by AD 1. However, there is a year zero in astronomical year numbering (where it coincides with the Julian year 1 BC) and in ISO 8601:2004 (where it coincides with the Gregorian year 1 BC) as well as in all Buddhist and Hindu calendars.
      negative: 504
        Year 504 (DIV) was a leap year starting on Thursday (link will display the full calendar) of the Julian calendar. At the time, it was known as the Year of the Consulship of Nicomachus without colleague (or, less frequently, year 1257 "Ab urbe condita"). The denomination 504 for this year has been used since the early medieval period, when the Anno Domini calendar era became the prevalent method in Europe for naming years.
    Sample 2
      query: When is the dialectical method used?
      response: Dialectic
        Dialectic or dialectics (Greek: διαλεκτική, dialektikḗ; related to dialogue), also known as the dialectical method, is at base a discourse between two or more people holding different points of view about a subject but wishing to establish the truth through reasoned arguments. Dialectic resembles debate, but the concept excludes subjective elements such as emotional appeal and the modern pejorative sense of rhetoric.[1][2] Dialectic may be contrasted with the didactic method, wherein one side of the conversation teaches the other. Dialectic is alternatively known as minor logic, as opposed to major logic or critique.
      negative: Derek Bentley case
        Another factor in the posthumous defence was that a "confession" recorded by Bentley, which was claimed by the prosecution to be a "verbatim record of dictated monologue", was shown by forensic linguistics methods to have been largely edited by policemen. Linguist Malcolm Coulthard showed that certain patterns, such as the frequency of the word "then" and the grammatical use of "then" after the grammatical subject ("I then" rather than "then I"), were not consistent with Bentley's use of language (his idiolect), as evidenced in court testimony. These patterns fit better the recorded testimony of the policemen involved. This is one of the earliest uses of forensic linguistics on record.
    Sample 3
      query: What do Grasshoppers eat?
      response: Grasshopper
        Grasshoppers are plant-eaters, with a few species at times becoming serious pests of cereals, vegetables and pasture, especially when they swarm in their millions as locusts and destroy crops over wide areas. They protect themselves from predators by camouflage; when detected, many species attempt to startle the predator with a brilliantly-coloured wing-flash while jumping and (if adult) launching themselves into the air, usually flying for only a short distance. Other species such as the rainbow grasshopper have warning coloration which deters predators. Grasshoppers are affected by parasites and various diseases, and many predatory creatures feed on both nymphs and adults. The eggs are the subject of attack by parasitoids and predators.
      negative: Groundhog
        Very often the dens of groundhogs provide homes for other animals including skunks, red foxes, and cottontail rabbits. The fox and skunk feed upon field mice, grasshoppers, beetles and other creatures that destroy farm crops. In aiding these animals, the groundhog indirectly helps the farmer. In addition to providing homes for itself and other animals, the groundhog aids in soil improvement by bringing subsoil to the surface. The groundhog is also a valuable game animal and is considered a difficult sport when hunted in a fair manner. In some parts of Appalachia, they are eaten.
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
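
As a rough illustration of what these parameters mean: MultipleNegativesRankingLoss treats each query's own response as the correct "class" among all responses and negatives in the batch, and applies cross-entropy to cosine similarities multiplied by scale. The sketch below is a plain-Python approximation under that description, not the library code; the function names and toy vectors are illustrative assumptions.

```python
import math
from typing import List, Optional, Sequence

Vector = Sequence[float]


def cos_sim(a: Vector, b: Vector) -> float:
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mnr_loss(
    queries: List[Vector],
    responses: List[Vector],
    negatives: Optional[List[Vector]] = None,
    scale: float = 20.0,
) -> float:
    """Mean cross-entropy where responses[i] is the target for queries[i].

    Every other response and every negative in the batch acts as a negative.
    """
    candidates = list(responses) + list(negatives or [])
    losses = []
    for i, query in enumerate(queries):
        logits = [scale * cos_sim(query, c) for c in candidates]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Toy batch: each query is aligned with its own response.
queries = [[1.0, 0.0], [0.0, 1.0]]
responses = [[1.0, 0.0], [0.0, 1.0]]
aligned = mnr_loss(queries, responses)
swapped = mnr_loss(queries, list(reversed(responses)))
print(aligned < swapped)  # True
```

A larger scale sharpens the softmax, so small differences in cosine similarity have a larger effect on the loss.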
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 12
  • per_device_eval_batch_size: 12
  • gradient_accumulation_steps: 4
  • num_train_epochs: 1
  • warmup_ratio: 0.3
  • bf16: True
  • batch_sampler: no_duplicates
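
These settings imply an effective batch size and a warmup length that can be worked out by hand. The sketch below does the arithmetic using the training-set size, learning_rate, and lr_scheduler_type reported elsewhere in this card. It assumes a single training device (the card does not state the device count) and ignores any samples the no_duplicates sampler may skip, so the numbers are approximate.

```python
import math

# Values copied from this card.
train_samples = 1_077_240
per_device_train_batch_size = 12
gradient_accumulation_steps = 4
warmup_ratio = 0.3
learning_rate = 5e-05
num_devices = 1  # assumption: not stated in the card

effective_batch = per_device_train_batch_size * gradient_accumulation_steps * num_devices
batches_per_epoch = math.ceil(train_samples / (per_device_train_batch_size * num_devices))
steps_per_epoch = batches_per_epoch // gradient_accumulation_steps
warmup_steps = math.ceil(steps_per_epoch * warmup_ratio)


def lr_at(step: int) -> float:
    """Linear warmup to learning_rate, then linear decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / warmup_steps
    remaining = max(0, steps_per_epoch - step)
    return learning_rate * remaining / (steps_per_epoch - warmup_steps)


print(effective_batch)                   # 48
print(steps_per_epoch)                   # 22442
print(warmup_steps)                      # 6733
print(round(7000 / steps_per_epoch, 4))  # 0.3119
```

Under these assumptions, step 7000 corresponds to epoch 0.3119, which agrees with the last row of the training log, and the learning-rate warmup would end at roughly step 6733.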

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 12
  • per_device_eval_batch_size: 12
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 4
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.3
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss sts-dev-896_spearman_cosine sts-dev-768_spearman_cosine
0.0004 10 2.2049 - -
0.0009 20 2.3168 - -
0.0013 30 2.3544 - -
0.0018 40 2.2519 - -
0.0022 50 2.1809 - -
0.0027 60 2.1572 - -
0.0031 70 2.1855 - -
0.0036 80 2.5887 - -
0.0040 90 2.883 - -
0.0045 100 2.8557 - -
0.0049 110 2.9356 - -
0.0053 120 2.8833 - -
0.0058 130 2.8394 - -
0.0062 140 2.923 - -
0.0067 150 2.8191 - -
0.0071 160 2.8658 - -
0.0076 170 2.8252 - -
0.0080 180 2.8312 - -
0.0085 190 2.7761 - -
0.0089 200 2.7193 - -
0.0094 210 2.724 - -
0.0098 220 2.7484 - -
0.0102 230 2.7262 - -
0.0107 240 2.6964 - -
0.0111 250 2.6676 - -
0.0116 260 2.6715 - -
0.0120 270 2.6145 - -
0.0125 280 2.6191 - -
0.0129 290 1.9812 - -
0.0134 300 1.6413 - -
0.0138 310 1.6126 - -
0.0143 320 1.3599 - -
0.0147 330 1.2996 - -
0.0151 340 1.2654 - -
0.0156 350 1.9409 - -
0.0160 360 2.1287 - -
0.0165 370 1.8442 - -
0.0169 380 1.6837 - -
0.0174 390 1.5489 - -
0.0178 400 1.4382 - -
0.0183 410 1.4848 - -
0.0187 420 1.3481 - -
0.0192 430 1.3467 - -
0.0196 440 1.3977 - -
0.0201 450 1.26 - -
0.0205 460 1.2412 - -
0.0209 470 1.316 - -
0.0214 480 1.3501 - -
0.0218 490 1.2246 - -
0.0223 500 1.2271 - -
0.0227 510 1.1871 - -
0.0232 520 1.1685 - -
0.0236 530 1.1624 - -
0.0241 540 1.1911 - -
0.0245 550 1.1978 - -
0.0250 560 1.1228 - -
0.0254 570 1.1091 - -
0.0258 580 1.1433 - -
0.0263 590 1.0638 - -
0.0267 600 1.0515 - -
0.0272 610 1.175 - -
0.0276 620 1.0943 - -
0.0281 630 1.1226 - -
0.0285 640 0.9871 - -
0.0290 650 1.0171 - -
0.0294 660 1.0169 - -
0.0299 670 0.9643 - -
0.0303 680 0.9563 - -
0.0307 690 0.9841 - -
0.0312 700 1.0349 - -
0.0316 710 0.8958 - -
0.0321 720 0.9225 - -
0.0325 730 0.842 - -
0.0330 740 0.9104 - -
0.0334 750 0.8927 - -
0.0339 760 0.8508 - -
0.0343 770 0.8835 - -
0.0348 780 0.9531 - -
0.0352 790 0.926 - -
0.0356 800 0.8718 - -
0.0361 810 0.8261 - -
0.0365 820 0.8169 - -
0.0370 830 0.8525 - -
0.0374 840 0.8504 - -
0.0379 850 0.7625 - -
0.0383 860 0.8259 - -
0.0388 870 0.7558 - -
0.0392 880 0.7898 - -
0.0397 890 0.7694 - -
0.0401 900 0.7429 - -
0.0405 910 0.6666 - -
0.0410 920 0.7407 - -
0.0414 930 0.6665 - -
0.0419 940 0.7597 - -
0.0423 950 0.7035 - -
0.0428 960 0.7166 - -
0.0432 970 0.6889 - -
0.0437 980 0.7541 - -
0.0441 990 0.7175 - -
0.0446 1000 0.7389 0.6420 0.6403
0.0450 1010 0.7142 - -
0.0454 1020 0.7301 - -
0.0459 1030 0.7299 - -
0.0463 1040 0.6759 - -
0.0468 1050 0.7036 - -
0.0472 1060 0.6286 - -
0.0477 1070 0.595 - -
0.0481 1080 0.6099 - -
0.0486 1090 0.6377 - -
0.0490 1100 0.6309 - -
0.0495 1110 0.6306 - -
0.0499 1120 0.557 - -
0.0504 1130 0.5898 - -
0.0508 1140 0.5896 - -
0.0512 1150 0.6399 - -
0.0517 1160 0.5923 - -
0.0521 1170 0.5787 - -
0.0526 1180 0.591 - -
0.0530 1190 0.5714 - -
0.0535 1200 0.6047 - -
0.0539 1210 0.5904 - -
0.0544 1220 0.543 - -
0.0548 1230 0.6033 - -
0.0553 1240 0.5445 - -
0.0557 1250 0.5217 - -
0.0561 1260 0.5835 - -
0.0566 1270 0.5353 - -
0.0570 1280 0.5887 - -
0.0575 1290 0.5967 - -
0.0579 1300 0.5036 - -
0.0584 1310 0.5915 - -
0.0588 1320 0.5719 - -
0.0593 1330 0.5238 - -
0.0597 1340 0.5647 - -
0.0602 1350 0.538 - -
0.0606 1360 0.5457 - -
0.0610 1370 0.5169 - -
0.0615 1380 0.4967 - -
0.0619 1390 0.4864 - -
0.0624 1400 0.5133 - -
0.0628 1410 0.5587 - -
0.0633 1420 0.4691 - -
0.0637 1430 0.5186 - -
0.0642 1440 0.4907 - -
0.0646 1450 0.5281 - -
0.0651 1460 0.4741 - -
0.0655 1470 0.4452 - -
0.0659 1480 0.4771 - -
0.0664 1490 0.4289 - -
0.0668 1500 0.4551 - -
0.0673 1510 0.4558 - -
0.0677 1520 0.5159 - -
0.0682 1530 0.4296 - -
0.0686 1540 0.4548 - -
0.0691 1550 0.4439 - -
0.0695 1560 0.4295 - -
0.0700 1570 0.4466 - -
0.0704 1580 0.4717 - -
0.0708 1590 0.492 - -
0.0713 1600 0.4566 - -
0.0717 1610 0.4451 - -
0.0722 1620 0.4715 - -
0.0726 1630 0.4573 - -
0.0731 1640 0.3972 - -
0.0735 1650 0.5212 - -
0.0740 1660 0.4381 - -
0.0744 1670 0.4552 - -
0.0749 1680 0.4767 - -
0.0753 1690 0.4398 - -
0.0757 1700 0.4801 - -
0.0762 1710 0.3751 - -
0.0766 1720 0.4407 - -
0.0771 1730 0.4305 - -
0.0775 1740 0.3938 - -
0.0780 1750 0.4748 - -
0.0784 1760 0.428 - -
0.0789 1770 0.404 - -
0.0793 1780 0.4261 - -
0.0798 1790 0.359 - -
0.0802 1800 0.4422 - -
0.0807 1810 0.4748 - -
0.0811 1820 0.4352 - -
0.0815 1830 0.4032 - -
0.0820 1840 0.4124 - -
0.0824 1850 0.4486 - -
0.0829 1860 0.429 - -
0.0833 1870 0.4189 - -
0.0838 1880 0.3658 - -
0.0842 1890 0.4297 - -
0.0847 1900 0.4215 - -
0.0851 1910 0.3726 - -
0.0856 1920 0.3736 - -
0.0860 1930 0.4287 - -
0.0864 1940 0.4402 - -
0.0869 1950 0.4353 - -
0.0873 1960 0.3622 - -
0.0878 1970 0.3557 - -
0.0882 1980 0.4107 - -
0.0887 1990 0.3982 - -
0.0891 2000 0.453 0.7292 0.7261
0.0896 2010 0.3971 - -
0.0900 2020 0.4374 - -
0.0905 2030 0.4322 - -
0.0909 2040 0.3945 - -
0.0913 2050 0.356 - -
0.0918 2060 0.4182 - -
0.0922 2070 0.3694 - -
0.0927 2080 0.3989 - -
0.0931 2090 0.4237 - -
0.0936 2100 0.3961 - -
0.0940 2110 0.4264 - -
0.0945 2120 0.3609 - -
0.0949 2130 0.4154 - -
0.0954 2140 0.3661 - -
0.0958 2150 0.3328 - -
0.0962 2160 0.3456 - -
0.0967 2170 0.3478 - -
0.0971 2180 0.3339 - -
0.0976 2190 0.3833 - -
0.0980 2200 0.3238 - -
0.0985 2210 0.3871 - -
0.0989 2220 0.4009 - -
0.0994 2230 0.4115 - -
0.0998 2240 0.4024 - -
0.1003 2250 0.35 - -
0.1007 2260 0.3649 - -
0.1011 2270 0.3615 - -
0.1016 2280 0.3898 - -
0.1020 2290 0.3866 - -
0.1025 2300 0.3904 - -
0.1029 2310 0.3321 - -
0.1034 2320 0.3803 - -
0.1038 2330 0.3831 - -
0.1043 2340 0.403 - -
0.1047 2350 0.3803 - -
0.1052 2360 0.3463 - -
0.1056 2370 0.3987 - -
0.1060 2380 0.3731 - -
0.1065 2390 0.353 - -
0.1069 2400 0.3166 - -
0.1074 2410 0.3895 - -
0.1078 2420 0.4025 - -
0.1083 2430 0.3798 - -
0.1087 2440 0.2991 - -
0.1092 2450 0.3094 - -
0.1096 2460 0.3669 - -
0.1101 2470 0.3412 - -
0.1105 2480 0.3697 - -
0.1110 2490 0.369 - -
0.1114 2500 0.3393 - -
0.1118 2510 0.4232 - -
0.1123 2520 0.3445 - -
0.1127 2530 0.4165 - -
0.1132 2540 0.3721 - -
0.1136 2550 0.3476 - -
0.1141 2560 0.2847 - -
0.1145 2570 0.3609 - -
0.1150 2580 0.3017 - -
0.1154 2590 0.374 - -
0.1159 2600 0.3365 - -
0.1163 2610 0.393 - -
0.1167 2620 0.3623 - -
0.1172 2630 0.3538 - -
0.1176 2640 0.3206 - -
0.1181 2650 0.3962 - -
0.1185 2660 0.3087 - -
0.1190 2670 0.3482 - -
0.1194 2680 0.3616 - -
0.1199 2690 0.3955 - -
0.1203 2700 0.3915 - -
0.1208 2710 0.3782 - -
0.1212 2720 0.3576 - -
0.1216 2730 0.3544 - -
0.1221 2740 0.3572 - -
0.1225 2750 0.3107 - -
0.1230 2760 0.3579 - -
0.1234 2770 0.3571 - -
0.1239 2780 0.3694 - -
0.1243 2790 0.3674 - -
0.1248 2800 0.3373 - -
0.1252 2810 0.3362 - -
0.1257 2820 0.3225 - -
0.1261 2830 0.3609 - -
0.1265 2840 0.3681 - -
0.1270 2850 0.4059 - -
0.1274 2860 0.3047 - -
0.1279 2870 0.3446 - -
0.1283 2880 0.3507 - -
0.1288 2890 0.3124 - -
0.1292 2900 0.3712 - -
0.1297 2910 0.3394 - -
0.1301 2920 0.3869 - -
0.1306 2930 0.3449 - -
0.1310 2940 0.3752 - -
0.1314 2950 0.3341 - -
0.1319 2960 0.3329 - -
0.1323 2970 0.36 - -
0.1328 2980 0.3788 - -
0.1332 2990 0.3834 - -
0.1337 3000 0.3426 0.7603 0.7590
0.1341 3010 0.3591 - -
0.1346 3020 0.2923 - -
0.1350 3030 0.332 - -
0.1355 3040 0.3867 - -
0.1359 3050 0.3778 - -
0.1363 3060 0.3389 - -
0.1368 3070 0.3069 - -
0.1372 3080 0.3833 - -
0.1377 3090 0.3497 - -
0.1381 3100 0.3698 - -
0.1386 3110 0.335 - -
0.1390 3120 0.3578 - -
0.1395 3130 0.3171 - -
0.1399 3140 0.3073 - -
0.1404 3150 0.3354 - -
0.1408 3160 0.3338 - -
0.1412 3170 0.367 - -
0.1417 3180 0.3299 - -
0.1421 3190 0.3622 - -
0.1426 3200 0.3158 - -
0.1430 3210 0.3242 - -
0.1435 3220 0.388 - -
0.1439 3230 0.3626 - -
0.1444 3240 0.3371 - -
0.1448 3250 0.3808 - -
0.1453 3260 0.3375 - -
0.1457 3270 0.352 - -
0.1462 3280 0.3466 - -
0.1466 3290 0.3355 - -
0.1470 3300 0.3432 - -
0.1475 3310 0.372 - -
0.1479 3320 0.3501 - -
0.1484 3330 0.3311 - -
0.1488 3340 0.3312 - -
0.1493 3350 0.3276 - -
0.1497 3360 0.3218 - -
0.1502 3370 0.4019 - -
0.1506 3380 0.3132 - -
0.1511 3390 0.3741 - -
0.1515 3400 0.3359 - -
0.1519 3410 0.381 - -
0.1524 3420 0.3024 - -
0.1528 3430 0.3238 - -
0.1533 3440 0.2675 - -
0.1537 3450 0.3568 - -
0.1542 3460 0.3666 - -
0.1546 3470 0.3307 - -
0.1551 3480 0.3698 - -
0.1555 3490 0.3668 - -
0.1560 3500 0.385 - -
0.1564 3510 0.3068 - -
0.1568 3520 0.3015 - -
0.1573 3530 0.3604 - -
0.1577 3540 0.3592 - -
0.1582 3550 0.3483 - -
0.1586 3560 0.3131 - -
0.1591 3570 0.3738 - -
0.1595 3580 0.3719 - -
0.1600 3590 0.3409 - -
0.1604 3600 0.4082 - -
0.1609 3610 0.2881 - -
0.1613 3620 0.3214 - -
0.1617 3630 0.4413 - -
0.1622 3640 0.3706 - -
0.1626 3650 0.3643 - -
0.1631 3660 0.3493 - -
0.1635 3670 0.3877 - -
0.1640 3680 0.3278 - -
0.1644 3690 0.3211 - -
0.1649 3700 0.4104 - -
0.1653 3710 0.4558 - -
0.1658 3720 0.3602 - -
0.1662 3730 0.3348 - -
0.1666 3740 0.2922 - -
0.1671 3750 0.329 - -
0.1675 3760 0.3507 - -
0.1680 3770 0.2853 - -
0.1684 3780 0.3556 - -
0.1689 3790 0.3138 - -
0.1693 3800 0.3536 - -
0.1698 3810 0.3762 - -
0.1702 3820 0.3262 - -
0.1707 3830 0.3571 - -
0.1711 3840 0.3455 - -
0.1715 3850 0.3283 - -
0.1720 3860 0.3317 - -
0.1724 3870 0.2984 - -
0.1729 3880 0.2659 - -
0.1733 3890 0.2844 - -
0.1738 3900 0.2999 - -
0.1742 3910 0.2991 - -
0.1747 3920 0.2667 - -
0.1751 3930 0.3529 - -
0.1756 3940 0.3767 - -
0.1760 3950 0.3909 - -
0.1765 3960 0.3393 - -
0.1769 3970 0.2918 - -
0.1773 3980 0.3363 - -
0.1778 3990 0.3694 - -
0.1782 4000 0.3 0.7572 0.7542
0.1787 4010 0.3266 - -
0.1791 4020 0.3059 - -
0.1796 4030 0.3038 - -
0.1800 4040 0.3415 - -
0.1805 4050 0.3385 - -
0.1809 4060 0.3145 - -
0.1814 4070 0.2816 - -
0.1818 4080 0.3272 - -
0.1822 4090 0.3335 - -
0.1827 4100 0.3412 - -
0.1831 4110 0.3367 - -
0.1836 4120 0.2754 - -
0.1840 4130 0.298 - -
0.1845 4140 0.3252 - -
0.1849 4150 0.3613 - -
0.1854 4160 0.3197 - -
0.1858 4170 0.3578 - -
0.1863 4180 0.3254 - -
0.1867 4190 0.2993 - -
0.1871 4200 0.3188 - -
0.1876 4210 0.3217 - -
0.1880 4220 0.2893 - -
0.1885 4230 0.3223 - -
0.1889 4240 0.3522 - -
0.1894 4250 0.3489 - -
0.1898 4260 0.3313 - -
0.1903 4270 0.3612 - -
0.1907 4280 0.3323 - -
0.1912 4290 0.2971 - -
0.1916 4300 0.3009 - -
0.1920 4310 0.3336 - -
0.1925 4320 0.3655 - -
0.1929 4330 0.3414 - -
0.1934 4340 0.2903 - -
0.1938 4350 0.3732 - -
0.1943 4360 0.3526 - -
0.1947 4370 0.3424 - -
0.1952 4380 0.3371 - -
0.1956 4390 0.3407 - -
0.1961 4400 0.3626 - -
0.1965 4410 0.3104 - -
0.1969 4420 0.3432 - -
0.1974 4430 0.2897 - -
0.1978 4440 0.2952 - -
0.1983 4450 0.3032 - -
0.1987 4460 0.3179 - -
0.1992 4470 0.3364 - -
0.1996 4480 0.2757 - -
0.2001 4490 0.3775 - -
0.2005 4500 0.2782 - -
0.2010 4510 0.2787 - -
0.2014 4520 0.3433 - -
0.2018 4530 0.3348 - -
0.2023 4540 0.295 - -
0.2027 4550 0.3076 - -
0.2032 4560 0.3489 - -
0.2036 4570 0.3741 - -
0.2041 4580 0.3121 - -
0.2045 4590 0.2682 - -
0.2050 4600 0.3106 - -
0.2054 4610 0.312 - -
0.2059 4620 0.3537 - -
0.2063 4630 0.2801 - -
0.2068 4640 0.3378 - -
0.2072 4650 0.3417 - -
0.2076 4660 0.4114 - -
0.2081 4670 0.3325 - -
0.2085 4680 0.3085 - -
0.2090 4690 0.2875 - -
0.2094 4700 0.3864 - -
0.2099 4710 0.3235 - -
0.2103 4720 0.3187 - -
0.2108 4730 0.2956 - -
0.2112 4740 0.3405 - -
0.2117 4750 0.313 - -
0.2121 4760 0.2865 - -
0.2125 4770 0.3555 - -
0.2130 4780 0.3089 - -
0.2134 4790 0.3021 - -
0.2139 4800 0.353 - -
0.2143 4810 0.3356 - -
0.2148 4820 0.338 - -
0.2152 4830 0.3362 - -
0.2157 4840 0.3152 - -
0.2161 4850 0.3321 - -
0.2166 4860 0.3087 - -
0.2170 4870 0.3503 - -
0.2174 4880 0.3841 - -
0.2179 4890 0.333 - -
0.2183 4900 0.3705 - -
0.2188 4910 0.3121 - -
0.2192 4920 0.3151 - -
0.2197 4930 0.3138 - -
0.2201 4940 0.3525 - -
0.2206 4950 0.3233 - -
0.2210 4960 0.2762 - -
0.2215 4970 0.3679 - -
0.2219 4980 0.3351 - -
0.2223 4990 0.3733 - -
0.2228 5000 0.366 0.7601 0.7577
0.2232 5010 0.2968 - -
0.2237 5020 0.3618 - -
0.2241 5030 0.3758 - -
0.2246 5040 0.2664 - -
0.2250 5050 0.3232 - -
0.2255 5060 0.3452 - -
0.2259 5070 0.4011 - -
0.2264 5080 0.3521 - -
0.2268 5090 0.3029 - -
0.2272 5100 0.3058 - -
0.2277 5110 0.3198 - -
0.2281 5120 0.2958 - -
0.2286 5130 0.3046 - -
0.2290 5140 0.3284 - -
0.2295 5150 0.333 - -
0.2299 5160 0.3385 - -
0.2304 5170 0.3359 - -
0.2308 5180 0.3572 - -
0.2313 5190 0.2992 - -
0.2317 5200 0.318 - -
0.2321 5210 0.3002 - -
0.2326 5220 0.3194 - -
0.2330 5230 0.3398 - -
0.2335 5240 0.2675 - -
0.2339 5250 0.312 - -
0.2344 5260 0.3199 - -
0.2348 5270 0.3446 - -
0.2353 5280 0.3082 - -
0.2357 5290 0.3522 - -
0.2362 5300 0.3347 - -
0.2366 5310 0.3571 - -
0.2371 5320 0.3275 - -
0.2375 5330 0.3524 - -
0.2379 5340 0.3151 - -
0.2384 5350 0.3338 - -
0.2388 5360 0.3794 - -
0.2393 5370 0.3591 - -
0.2397 5380 0.3442 - -
0.2402 5390 0.2927 - -
0.2406 5400 0.3316 - -
0.2411 5410 0.3152 - -
0.2415 5420 0.3876 - -
0.2420 5430 0.324 - -
0.2424 5440 0.3296 - -
0.2428 5450 0.3499 - -
0.2433 5460 0.3552 - -
0.2437 5470 0.3394 - -
0.2442 5480 0.3083 - -
0.2446 5490 0.3198 - -
0.2451 5500 0.2887 - -
0.2455 5510 0.2898 - -
0.2460 5520 0.3092 - -
0.2464 5530 0.3025 - -
0.2469 5540 0.3253 - -
0.2473 5550 0.3686 - -
0.2477 5560 0.3205 - -
0.2482 5570 0.3507 - -
0.2486 5580 0.2809 - -
0.2491 5590 0.3339 - -
0.2495 5600 0.3261 - -
0.2500 5610 0.2804 - -
0.2504 5620 0.2856 - -
0.2509 5630 0.3211 - -
0.2513 5640 0.3126 - -
0.2518 5650 0.3374 - -
0.2522 5660 0.2957 - -
0.2526 5670 0.3414 - -
0.2531 5680 0.3219 - -
0.2535 5690 0.3554 - -
0.2540 5700 0.2738 - -
0.2544 5710 0.361 - -
0.2549 5720 0.336 - -
0.2553 5730 0.3254 - -
0.2558 5740 0.3453 - -
0.2562 5750 0.2984 - -
0.2567 5760 0.3224 - -
0.2571 5770 0.2553 - -
0.2575 5780 0.301 - -
0.2580 5790 0.3767 - -
0.2584 5800 0.3092 - -
0.2589 5810 0.2676 - -
0.2593 5820 0.3178 - -
0.2598 5830 0.3117 - -
0.2602 5840 0.3446 - -
0.2607 5850 0.3347 - -
0.2611 5860 0.3841 - -
0.2616 5870 0.2847 - -
0.2620 5880 0.3587 - -
0.2624 5890 0.2812 - -
0.2629 5900 0.3577 - -
0.2633 5910 0.3011 - -
0.2638 5920 0.3102 - -
0.2642 5930 0.3297 - -
0.2647 5940 0.2603 - -
0.2651 5950 0.3575 - -
0.2656 5960 0.3617 - -
0.2660 5970 0.3587 - -
0.2665 5980 0.3198 - -
0.2669 5990 0.3536 - -
0.2673 6000 0.3047 0.7725 0.7699
0.2678 6010 0.3211 - -
0.2682 6020 0.392 - -
0.2687 6030 0.3359 - -
0.2691 6040 0.2903 - -
0.2696 6050 0.286 - -
0.2700 6060 0.3426 - -
0.2705 6070 0.3406 - -
0.2709 6080 0.2903 - -
0.2714 6090 0.3175 - -
0.2718 6100 0.2794 - -
0.2723 6110 0.3232 - -
0.2727 6120 0.3054 - -
0.2731 6130 0.361 - -
0.2736 6140 0.3524 - -
0.2740 6150 0.3371 - -
0.2745 6160 0.313 - -
0.2749 6170 0.2713 - -
0.2754 6180 0.3141 - -
0.2758 6190 0.3197 - -
0.2763 6200 0.2792 - -
0.2767 6210 0.3169 - -
0.2772 6220 0.307 - -
0.2776 6230 0.2737 - -
0.2780 6240 0.3348 - -
0.2785 6250 0.2885 - -
0.2789 6260 0.3416 - -
0.2794 6270 0.3422 - -
0.2798 6280 0.2758 - -
0.2803 6290 0.3736 - -
0.2807 6300 0.3036 - -
0.2812 6310 0.3704 - -
0.2816 6320 0.3312 - -
0.2821 6330 0.3431 - -
0.2825 6340 0.3502 - -
0.2829 6350 0.2821 - -
0.2834 6360 0.3097 - -
0.2838 6370 0.3444 - -
0.2843 6380 0.3349 - -
0.2847 6390 0.2999 - -
0.2852 6400 0.3149 - -
0.2856 6410 0.3462 - -
0.2861 6420 0.3337 - -
0.2865 6430 0.3329 - -
0.2870 6440 0.3294 - -
0.2874 6450 0.2917 - -
0.2878 6460 0.3007 - -
0.2883 6470 0.2809 - -
0.2887 6480 0.3745 - -
0.2892 6490 0.3625 - -
0.2896 6500 0.3123 - -
0.2901 6510 0.3209 - -
0.2905 6520 0.347 - -
0.2910 6530 0.3084 - -
0.2914 6540 0.2829 - -
0.2919 6550 0.3569 - -
0.2923 6560 0.2686 - -
0.2927 6570 0.2929 - -
0.2932 6580 0.3237 - -
0.2936 6590 0.3451 - -
0.2941 6600 0.3199 - -
0.2945 6610 0.2848 - -
0.2950 6620 0.2842 - -
0.2954 6630 0.3168 - -
0.2959 6640 0.3094 - -
0.2963 6650 0.3239 - -
0.2968 6660 0.357 - -
0.2972 6670 0.3279 - -
0.2976 6680 0.4015 - -
0.2981 6690 0.2901 - -
0.2985 6700 0.3387 - -
0.2990 6710 0.3282 - -
0.2994 6720 0.2909 - -
0.2999 6730 0.3556 - -
0.3003 6740 0.3008 - -
0.3008 6750 0.3205 - -
0.3012 6760 0.3132 - -
0.3017 6770 0.3181 - -
0.3021 6780 0.3752 - -
0.3026 6790 0.317 - -
0.3030 6800 0.3584 - -
0.3034 6810 0.3475 - -
0.3039 6820 0.2827 - -
0.3043 6830 0.2925 - -
0.3048 6840 0.2941 - -
0.3052 6850 0.3154 - -
0.3057 6860 0.3301 - -
0.3061 6870 0.3492 - -
0.3066 6880 0.3147 - -
0.3070 6890 0.348 - -
0.3075 6900 0.3577 - -
0.3079 6910 0.2893 - -
0.3083 6920 0.3298 - -
0.3088 6930 0.3071 - -
0.3092 6940 0.322 - -
0.3097 6950 0.3055 - -
0.3101 6960 0.3333 - -
0.3106 6970 0.3329 - -
0.3110 6980 0.3298 - -
0.3115 6990 0.3061 - -
0.3119 7000 0.3005 0.7686 0.7672
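
Evaluation scores appear only every 1000 steps in the log above. As a small convenience sketch, the snippet below collects those rows (values copied from the table) and picks the step with the highest sts-dev-896 Spearman score. The variable names are illustrative.

```python
# (step, sts-dev-896 spearman_cosine, sts-dev-768 spearman_cosine), copied from the log.
eval_log = [
    (1000, 0.6420, 0.6403),
    (2000, 0.7292, 0.7261),
    (3000, 0.7603, 0.7590),
    (4000, 0.7572, 0.7542),
    (5000, 0.7601, 0.7577),
    (6000, 0.7725, 0.7699),
    (7000, 0.7686, 0.7672),
]

# Select the row with the best score on the 896-dimensional evaluation.
best_step, best_896, best_768 = max(eval_log, key=lambda row: row[1])
print(best_step, best_896, best_768)  # 6000 0.7725 0.7699
```

The highest logged score is at step 6000, while the Metrics section reports the values from step 7000.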

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.3.0
  • Transformers: 4.46.2
  • PyTorch: 2.1.0+cu118
  • Accelerate: 1.1.1
  • Datasets: 3.1.0
  • Tokenizers: 0.20.3

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

Model size: 494M params (Safetensors, tensor type F32)