base_model: sentence-transformers/multi-qa-MiniLM-L6-cos-v1
datasets: []
language: []
library_name: sentence-transformers
metrics:
  - pearson_cosine
  - spearman_cosine
  - pearson_manhattan
  - spearman_manhattan
  - pearson_euclidean
  - spearman_euclidean
  - pearson_dot
  - spearman_dot
  - pearson_max
  - spearman_max
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:3192024
  - loss:CosineSimilarityLoss
widget:
  - source_sentence: Must have experience in interdisciplinary collaboration
    sentences:
      - >-
        Nurse Coordinator specializing in advanced heart failure programs at The
        Queen's Health System. Skilled in patient care coordination, clinical
        assessments, and interdisciplinary collaboration. Experienced in
        managing complex health cases and ensuring compliance with healthcare
        regulations. Proficient in utilizing advanced medical technologies and
        technologies to enhance patient outcomes. Strong background in nonprofit
        healthcare environments, contributing to optimal health and wellness
        initiatives.
      - >-
        Administrative Assistant in the judiciary with experience at the
        Minnesota Judicial Branch and Mayo Clinic. Skilled in managing
        administrative tasks, coordinating schedules, and supporting judicial
        processes. Proficient in office software and communication tools.
        Previous roles include bank teller positions, enhancing customer service
        and financial transactions. Strong organizational skills and attention
        to detail, contributing to efficient operations in high-pressure
        environments.
      - >-
        Area Manager in facilities services with expertise in managing public
        parks, campgrounds, and recreational facilities. Skilled in operational
        management, team leadership, and customer service. Proven track record
        in enhancing service delivery and operational efficiency. Previous roles
        include Management Team and Accounts Payable Manager, demonstrating
        versatility across various industries. Strong background in office
        management and office operations, contributing to a well-rounded
        understanding of facility management practices.
  - source_sentence: Must have a customer service orientation
    sentences:
      - >-
        Research Assistant in biotechnology with expertise in Molecular Biology,
        Protein Expression, Purification, and Crystallization. Currently
        employed at Seagen, contributing to innovative cancer treatments. Holds
        a B.S. in Biochemistry and minors in Chemistry and Spanish. Previous
        experience includes roles as a Manufacturing Technician at AGC Biologics
        and undergraduate research at NG Lab and Mueller Lab, focusing on
        recombinant human proteins and protein processing. Proficient in leading
        project cooperation and public speaking.
      - >-
        Instructional Developer with a Master's in Human Resource Development,
        specializing in learning solutions across various media platforms.
        Experienced in storyboarding, animation, videography, and
        post-production. Proven track record in e-learning design and
        development, team leadership, and creative problem-solving. Currently
        employed at The University of Texas Health Science Center at Houston,
        focusing on enhancing organizational value through tailored corporate
        learning. Previous roles include Learning Consultant at Strategic Ascent
        and Assistant Manager at Cicis Pizza. Strong background in healthcare
        and professional training industries.
      - >-
        Human Resource professional with expertise in hiring, compliance,
        benefits, and compensation within the hospitality and semiconductor
        industries. Currently a Talent Acquisition Specialist at MKS
        Instruments, skilled in relationship building and attention to detail.
        Previous roles include Recruitment Manager at Block by Block and Talent
        Acquisition Specialist at Manpower. Proficient in advanced computer
        skills and a customer service orientation. Experienced in staffing
        management and recruitment strategies, with a strong focus on enhancing
        workforce capabilities and fostering client relationships.
  - source_sentence: Must be proficient in graphic design software
    sentences:
      - >-
        Senior Software Engineer with expertise in developing innovative
        solutions for the aviation and defense industries. Currently at Delta
        Flight Products, specializing in aircraft cabin interiors and avionics.
        Proficient in backend ETL processes, REST API development, and software
        development life cycle. Previous experience includes roles at Cisco,
        Thales, Safran, and FatPipe Networks, focusing on enhancing operational
        efficiency and user experience. Holds multiple patents for web
        application design and deployment. Strong background in collaborating
        with cross-functional teams to deliver high-quality software solutions.
      - >-
        Client Advisor in financial services with a strong background in luxury
        goods and retail. Currently at Louis Vuitton, specializing in client
        relationship management and personalized service. Previously worked at
        Salvatore Ferragano, enhancing client engagement and driving sales.
        Experienced in marketing management from SkPros, focusing on brand
        strategy and market analysis. Proficient in leveraging data to inform
        decision-making and improve client experiences.
      - >-
        Weld Process Specialist at Airgas with expertise in industrial
        automation and chemicals. Skilled in Resistance weld gun calibration,
        schedule database management, and asset locating matrix creation.
        Previous experience as a Welding Engineer at R&E Automated, providing
        support in automation systems for manufacturing applications. Proficient
        in DCEN and various welding techniques, including Fanuc and Motoman.
        Background includes roles in drafting and welding, enhancing fabrication
        efficiency and quality. Strong foundation in mechanical design and
        engineering principles, with a focus on improving performance and
        performance in manufacturing environments.
  - source_sentence: Must have experience in pharmaceutical marketing
    sentences:
      - >-
        Brand Influencer specializing in Black Literary, Culture, and Lifestyle.
        Certified UrbanAg with over 20 years of experience in urban agriculture
        consulting and retail operations. Currently supervises community gardens
        at Chicago Botanic Garden, educating residents on organic growing
        methods and addressing nutrition, food security, and healthy lifestyle
        options. Previously served as president of Af-Am Bookstore,
        demonstrating entrepreneurial skills and community engagement. Expertise
        in marketing and advertising, with a focus on enhancing community
        engagement and promoting sustainable practices.
      - >-
        Experienced Studio Manager and Executive Producer in media production,
        specializing in immersive entertainment and virtual environments.
        Proficient in business planning, team building, fundraising, and
        management. Co-founder of Dirty Secret, focusing on brand activation and
        custom worlds. Previous roles at Wevr involved production coordination
        and project management, with a strong background in arts and design.
        Holds a degree from California State University-Los Angeles.
      - >-
        Owner and CEO of Cake N Wings, a catering company specializing in food
        and travel PR. Experienced in public relations across health,
        technology, and entertainment sectors. Proven track record in developing
        innovative urban cuisine and enhancing customer experiences. Previous
        roles include account executive at Development Counsellors International
        and public relations manager at Creole Restaurant. Skilled in brand
        development, event management, and community engagement.
  - source_sentence: Must have experience in software development
    sentences:
      - >-
        Multi-skilled Business Analytics professional with a Master’s in
        Business Analytics and a dual MBA. Experienced in data analytics,
        predictive modeling, and project management within the health and
        wellness sector. Proficient in extracting, summarizing, and analyzing
        claims databases and healthcare analytics. Skilled in statistical
        analysis, database management, and data visualization. Previous roles
        include Business Analytics Advisor at Cigna Healthcare and Informatics
        Senior Specialist at Cigna Healthcare. Strong leadership and project
        management abilities, with a solid foundation in healthcare economics
        and outcomes observational research. Familiar with Base SAS 9.2, SAS EG,
        SAS EM, SAS JMP, Tableau, and Oracle Crystal Ball.
      - >-
        Assistant Vice President in commercial real estate financing with a
        strong background in banking. Experienced in business banking and branch
        management, having held roles as Assistant Vice President and Business
        Banking Officer. Proven track record in business development and branch
        operations within a large independent bank. Skilled in building client
        relationships and driving financial growth. Holds expertise in managing
        diverse teams and enhancing operational efficiency. Previous experience
        includes branch management across multiple branches, demonstrating a
        commitment to community engagement and financial wellness.
      - >-
        CEO of IMPROVLearning, specializing in e-learning and driver education.
        Founded and managed multiple ventures in training, healthcare, and real
        estate. Proven track record of expanding product offerings and achieving
        recognition on the Inc 500/5000 list. Active board member of the LA
        Chapter of the Entrepreneur Organization, contributing to the growth of
        over 3 million students. Experienced in venture capital and
        entrepreneurship, with a focus on innovative training solutions and
        community engagement. Active member of various organizations, including
        the Entrepreneurs' Organization and the Los Angeles County Business
        Federation.
model-index:
  - name: >-
      SentenceTransformer based on
      sentence-transformers/multi-qa-MiniLM-L6-cos-v1
    results:
      - task:
          type: semantic-similarity
          name: Semantic Similarity
        dataset:
          name: validation
          type: validation
        metrics:
          - type: pearson_cosine
            value: 0.9594453206302572
            name: Pearson Cosine
          - type: spearman_cosine
            value: 0.860568334150162
            name: Spearman Cosine
          - type: pearson_manhattan
            value: 0.9436690128729379
            name: Pearson Manhattan
          - type: spearman_manhattan
            value: 0.8604275677997159
            name: Spearman Manhattan
          - type: pearson_euclidean
            value: 0.9443183012069103
            name: Pearson Euclidean
          - type: spearman_euclidean
            value: 0.8605683342374743
            name: Spearman Euclidean
          - type: pearson_dot
            value: 0.9594453207129489
            name: Pearson Dot
          - type: spearman_dot
            value: 0.8605683341225518
            name: Spearman Dot
          - type: pearson_max
            value: 0.9594453207129489
            name: Pearson Max
          - type: spearman_max
            value: 0.8605683342374743
            name: Spearman Max

SentenceTransformer based on sentence-transformers/multi-qa-MiniLM-L6-cos-v1

This is a sentence-transformers model fine-tuned from sentence-transformers/multi-qa-MiniLM-L6-cos-v1. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
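
The Pooling (mean over tokens) and Normalize modules listed above can be illustrated with a minimal NumPy sketch. It assumes token embeddings and an attention mask are already available from the Transformer module; the function names `mean_pool` and `l2_normalize` and the toy inputs are hypothetical and are not part of the library.

```python
import numpy as np

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, ignoring padded positions (mask == 0)."""
    mask = attention_mask[..., None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)  # avoid division by zero
    return summed / counts

def l2_normalize(vectors):
    """Scale each row to unit length, as a Normalize() step would."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / np.clip(norms, 1e-12, None)

# Toy batch: 2 sequences, 3 token positions, 4 dimensions (the model uses 384).
tokens = np.array([
    [[1.0, 0.0, 0.0, 0.0], [3.0, 0.0, 0.0, 0.0], [9.0, 9.0, 9.0, 9.0]],
    [[0.0, 2.0, 0.0, 0.0], [0.0, 0.0, 2.0, 0.0], [0.0, 0.0, 0.0, 2.0]],
])
mask = np.array([[1, 1, 0], [1, 1, 1]])  # last token of row 0 is padding

pooled = mean_pool(tokens, mask)
sentence_embeddings = l2_normalize(pooled)
```

Because the output vectors have unit length, their dot product equals their cosine similarity, which is consistent with the near-identical pearson_cosine and pearson_dot values reported in this card.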

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'Must have experience in software development',
    "CEO of IMPROVLearning, specializing in e-learning and driver education. Founded and managed multiple ventures in training, healthcare, and real estate. Proven track record of expanding product offerings and achieving recognition on the Inc 500/5000 list. Active board member of the LA Chapter of the Entrepreneur Organization, contributing to the growth of over 3 million students. Experienced in venture capital and entrepreneurship, with a focus on innovative training solutions and community engagement. Active member of various organizations, including the Entrepreneurs' Organization and the Los Angeles County Business Federation.",
    'Multi-skilled Business Analytics professional with a Master’s in Business Analytics and a dual MBA. Experienced in data analytics, predictive modeling, and project management within the health and wellness sector. Proficient in extracting, summarizing, and analyzing claims databases and healthcare analytics. Skilled in statistical analysis, database management, and data visualization. Previous roles include Business Analytics Advisor at Cigna Healthcare and Informatics Senior Specialist at Cigna Healthcare. Strong leadership and project management abilities, with a solid foundation in healthcare economics and outcomes observational research. Familiar with Base SAS 9.2, SAS EG, SAS EM, SAS JMP, Tableau, and Oracle Crystal Ball.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
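
One possible use of these similarity scores is ranking candidate profiles against a single requirement. The sketch below shows the idea with placeholder vectors standing in for real model output, so it runs without downloading the model; `rank_by_similarity` is a hypothetical helper, not a library function.

```python
import numpy as np

def rank_by_similarity(query_embedding, candidate_embeddings):
    """Return candidate indices sorted by cosine similarity, best first."""
    q = query_embedding / np.linalg.norm(query_embedding)
    c = candidate_embeddings / np.linalg.norm(
        candidate_embeddings, axis=1, keepdims=True
    )
    scores = c @ q                 # cosine similarity per candidate
    order = np.argsort(-scores)    # descending order
    return order, scores[order]

# Placeholder vectors standing in for model.encode(...) output (assumption).
requirement = np.array([1.0, 0.0, 0.0])
profiles = np.array([
    [0.0, 1.0, 0.0],   # unrelated profile
    [0.9, 0.1, 0.0],   # close match
    [0.5, 0.5, 0.0],   # partial match
])

order, scores = rank_by_similarity(requirement, profiles)
```

With real embeddings, `requirement` would be the encoded "Must have ..." sentence and `profiles` the encoded profile summaries.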

Evaluation

Metrics

Semantic Similarity

Metric Value
pearson_cosine 0.9594
spearman_cosine 0.8606
pearson_manhattan 0.9437
spearman_manhattan 0.8604
pearson_euclidean 0.9443
spearman_euclidean 0.8606
pearson_dot 0.9594
spearman_dot 0.8606
pearson_max 0.9594
spearman_max 0.8606
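
As a rough illustration of what the Pearson and Spearman rows measure, the sketch below computes both correlations between predicted similarity scores and gold labels using NumPy only. The scores and labels are made up for illustration, and the helper names are hypothetical; this is not the evaluator's own code.

```python
import numpy as np

def pearson(x, y):
    """Linear correlation between two score lists."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.corrcoef(x, y)[0, 1])

def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="stable")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    for v in np.unique(values):
        tied = values == v
        ranks[tied] = ranks[tied].mean()
    return ranks

def spearman(x, y):
    """Rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Made-up cosine scores and binary gold labels (illustration only).
predicted = [0.05, 0.20, 0.60, 0.95]
gold = [0.0, 0.0, 1.0, 1.0]

p = pearson(predicted, gold)
s = spearman(predicted, gold)
```

Note that tied gold labels keep the Spearman value below 1 even when the predictions separate the two groups perfectly, as in this toy data. If the labels in the validation set are mostly 0.0 and 1.0, as the training samples suggest, that may be one reason the Spearman figures sit below the Pearson ones.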

Training Details

Training Dataset

Unnamed Dataset

  • Size: 3,192,024 training samples
  • Columns: sentence_0, sentence_1, and label
  • Approximate statistics based on the first 1000 samples:
    column      type    min        mean         max
    sentence_0  string  6 tokens   9.15 tokens  17 tokens
    sentence_1  string  53 tokens  93.6 tokens  150 tokens
    label       float   0.0        0.5          1.0
  • Samples:
    • sentence_0: Must have experience in software development
      sentence_1: Executive Assistant with a strong background in real estate and financial services. Experienced in managing executive schedules, coordinating communications, and supporting investment banking operations. Proficient in office management software and adept at multitasking in fast-paced environments. Previous roles at Blackstone, Piper Sandler, and Broe Real Estate Group, where responsibilities included supporting high-level executives and enhancing operational efficiency. Skilled in fostering relationships and facilitating smooth transitions in fast-paced settings.
      label: 0.0
    • sentence_0: Must have experience in overseeing service delivery for health initiatives
      sentence_1: Director of Solution Strategy in health, wellness, and fitness, specializing in relationship building and strategy execution. Experienced in overseeing service delivery and performance management for telehealth and digital health initiatives at Blue Cross Blue Shield of Massachusetts. Proven track record in vendor lifecycle management, contract strategy, and operational leadership. Skilled in developing standardized wellness programs and enhancing client satisfaction through innovative solutions. Strong background in managing cross-functional teams and driving performance metrics in health engagement and wellness services.
      label: 1.0
    • sentence_0: Must have experience collaborating with Fortune 500 companies
      sentence_1: Senior Sales and Business Development Manager in the energy sector, specializing in increasing profitable sales for small to large companies. Proven track record in relationship building, team management, and strategy development. Experienced in collaborating with diverse stakeholders, including Fortune 500 companies and small to large privately held companies. Previous roles include Vice President of Operations at NovaStar LP and Director of Sales at NovaStar Mortgage and Athlon Solutions. Strong communicator and team player, with a focus on customer needs and operational efficiency.
      label: 1.0
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
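
To make the loss concrete: CosineSimilarityLoss with an MSE loss function compares the cosine similarity of each embedding pair with its gold label and averages the squared differences. The NumPy sketch below illustrates that computation on toy vectors; it is not the library's implementation, and `cosine_similarity_loss` is a hypothetical name.

```python
import numpy as np

def cosine_similarity_loss(emb_a, emb_b, labels):
    """Mean squared error between pairwise cosine similarity and gold labels."""
    emb_a = np.asarray(emb_a, dtype=float)
    emb_b = np.asarray(emb_b, dtype=float)
    labels = np.asarray(labels, dtype=float)
    cos = (emb_a * emb_b).sum(axis=1) / (
        np.linalg.norm(emb_a, axis=1) * np.linalg.norm(emb_b, axis=1)
    )
    return float(np.mean((cos - labels) ** 2))

# Toy embeddings: pair 0 is identical, pair 1 is orthogonal.
a = np.array([[1.0, 0.0], [1.0, 0.0]])
b = np.array([[1.0, 0.0], [0.0, 1.0]])
labels = np.array([1.0, 0.0])

matching = cosine_similarity_loss(a, b, labels)        # labels agree with cosines
swapped = cosine_similarity_loss(a, b, labels[::-1])   # labels contradict cosines
```

Labels of 1.0 therefore pull a requirement and a profile toward the same direction, while labels of 0.0 push their cosine similarity toward zero.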
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 128
  • per_device_eval_batch_size: 128
  • num_train_epochs: 1.0
  • multi_dataset_batch_sampler: round_robin
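
These settings tie in with the step counts in the Training Logs. A short arithmetic sketch, assuming a single training device (the card does not state the device count) and using the dataset size, batch size, gradient_accumulation_steps, and dataloader_drop_last values reported in this card:

```python
import math

train_samples = 3_192_024   # dataset size from this card
batch_size = 128            # per_device_train_batch_size
num_devices = 1             # assumption: not stated in the card
grad_accum_steps = 1        # gradient_accumulation_steps

# dataloader_drop_last is False, so the final partial batch counts as a step.
steps_per_epoch = math.ceil(
    train_samples / (batch_size * num_devices * grad_accum_steps)
)

def epoch_at(step):
    """Fraction of the single training epoch completed at a given step."""
    return step / steps_per_epoch
```

Under that assumption the result matches the final logged step (24938), and `epoch_at(500)` rounds to the 0.0200 shown in the first log row.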

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 128
  • per_device_eval_batch_size: 128
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 1.0
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • eval_use_gather_object: False
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
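
With lr_scheduler_type set to linear, warmup_steps at 0, and learning_rate at 5e-05, the learning rate would decay in a straight line from 5e-05 to zero over the run. The sketch below illustrates that schedule in plain Python; the total of 24938 steps is taken from the Training Logs, and `linear_lr` is a hypothetical helper rather than the trainer's own scheduler.

```python
def linear_lr(step, base_lr=5e-05, total_steps=24_938, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Learning rate at the start, the midpoint, and the final step.
lr_start = linear_lr(0)
lr_mid = linear_lr(12_469)
lr_end = linear_lr(24_938)
```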

Training Logs

Epoch Step Training Loss validation_spearman_max
0.0200 500 0.1362 -
0.0401 1000 0.0533 -
0.0601 1500 0.0433 -
0.0802 2000 0.0386 -
0.1002 2500 0.0356 -
0.1203 3000 0.0345 -
0.1403 3500 0.0326 -
0.1604 4000 0.0323 -
0.1804 4500 0.0313 -
0.2005 5000 0.0305 -
0.2205 5500 0.0298 -
0.2406 6000 0.0296 -
0.2606 6500 0.0291 -
0.2807 7000 0.0286 -
0.3007 7500 0.0286 -
0.3208 8000 0.0281 -
0.3408 8500 0.0278 -
0.3609 9000 0.0273 -
0.3809 9500 0.0276 -
0.4010 10000 0.0274 -
0.4210 10500 0.0266 -
0.4411 11000 0.0261 -
0.4611 11500 0.0264 -
0.4812 12000 0.0256 -
0.5012 12500 0.0254 -
0.5213 13000 0.0251 -
0.5413 13500 0.0249 -
0.5614 14000 0.0253 -
0.5814 14500 0.0247 -
0.6015 15000 0.0254 -
0.6215 15500 0.0246 -
0.6416 16000 0.0251 -
0.6616 16500 0.0248 -
0.6817 17000 0.0247 -
0.7017 17500 0.0246 -
0.7218 18000 0.0242 -
0.7418 18500 0.024 -
0.7619 19000 0.0247 -
0.7819 19500 0.0238 -
0.8020 20000 0.0244 0.8603
0.8220 20500 0.024 -
0.8421 21000 0.0244 -
0.8621 21500 0.0242 -
0.8822 22000 0.0239 -
0.9022 22500 0.0237 -
0.9223 23000 0.0241 -
0.9423 23500 0.0242 -
0.9624 24000 0.0238 -
0.9824 24500 0.0236 -
1.0 24938 - 0.8606

Framework Versions

  • Python: 3.11.6
  • Sentence Transformers: 3.0.1
  • Transformers: 4.44.1
  • PyTorch: 2.4.0+cu121
  • Accelerate: 0.33.0
  • Datasets: 2.21.0
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}