
SetFit with BAAI/bge-base-en-v1.5

This is a SetFit model for text classification. It uses BAAI/bge-base-en-v1.5 as the Sentence Transformer embedding model, with a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
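
A minimal sketch of the data preparation behind step 1, assuming only that positive pairs share a label and negative pairs do not (the exhaustive pairing below is illustrative; it is not SetFit's actual oversampling sampler):

```python
from itertools import combinations

def contrastive_pairs(examples):
    """Build (text_a, text_b, target_similarity) pairs for contrastive fine-tuning.

    Pairs with the same label get target similarity 1.0 (pull their embeddings
    together); pairs with different labels get 0.0 (push them apart). This
    mirrors the idea behind SetFit's pair generation, not its exact strategy.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(examples, 2):
        pairs.append((text_a, text_b, 1.0 if label_a == label_b else 0.0))
    return pairs

# Toy labeled examples in the spirit of this card's two classes
examples = [
    ("answer is grounded in the document", 1),
    ("answer cites the document correctly", 1),
    ("answer is off-topic and unsupported", 0),
]
pairs = contrastive_pairs(examples)
```

Step 2 then fits the LogisticRegression head on embeddings produced by the fine-tuned body.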

Model Details

Model Description

Model Sources

Model Labels

Label Examples
0
  • 'The answer provided does not address the question asked about the lifespan of John Augustine Zahm. It discusses the ownership of the land where the Bell and Gemmell tannery was located, which is unrelated to the specific question.\n\nReasoning:\n1. Context Grounding: The information about John Buchanan and the Bell and Gemmell tannery is grounded in the provided document but does not relate to John Augustine Zahm or his lifespan.\n2. Relevance: The answer does not address the question regarding John Augustine Zahm’s lifespan at all. It is completely off-topic.\n3. Conciseness: While the provided answer is concise, it is entirely irrelevant to the question concerning John Augustine Zahm’s lifespan. \n\nFinal Result:'
  • 'Reasoning: \n\n1. Context Grounding: The answer correctly identifies the federal funds rate as the main tool of conventional monetary policy in the USA. This statement is well-supported by the provided document, which specifies that the federal funds rate is the primary mechanism used in conventional monetary policy in the USA, and further explains its role in interbank lending.\n \n2. Relevance: The question asks specifically about the main tool of conventional monetary policy in the USA. The answer correctly states that the federal funds rate is this tool. However, there is an error in attributing the decision of the federal funds rate to Congress when in fact it is determined by the Federal Reserve.\n\n3. Conciseness: The answer is brief and directly addresses the question without unnecessary information. However, the incorrect statement about Congress deciding the rate detracts from its overall clarity and correctness.\n\nFinal Result:'
  • 'Reasoning:\n1. Context Grounding: The answer partially reflects the document's suggestions about using opponents' momentum and techniques to trip them. However, it introduces methods like "flailing wildly" and "jumping above punches," which are not grounded in the document provided.\n2. Relevance: The answer does not fully align with the document's practical and structured advice for takedowns. It diverges into suggestions that are unrealistic and not mentioned in the document.\n3. Conciseness: The answer is overly detailed in unnecessary ways, such as the emphasis on slow execution and crossing legs, which are not key points mentioned in the document provided. Rather, the document stresses quick and committed actions.\n\nFinal Result:'
1
  • "Reasoning:\n1. Context Grounding: The answer is directly supported by the content in the document. It correctly identifies the disagreement as being about the duration of the payroll tax cut and highlights the specific durations preferred by both parties.\n2. Relevance: The answer precisely addresses the question by focusing solely on the disagreement about the payroll tax cut's duration between Democrats and Republicans.\n3. Conciseness: The answer is succinct and to the point. It avoids extraneous information, sticking strictly to explaining the core disagreement.\n\nResult:"
  • 'Reasoning:\n\n1. Context Grounding: The provided answer draws directly from the supplied document, mentioning the variety and customizability of templates offered by the organization as well as their qualities, such as including sample content and different features.\n\n2. Relevance: The response is related to the inquiry about available blog post templates. However, while it accurately includes information about the variety of templates and customization options, it should focus more narrowly on templates specifically for blog posts.\n\n3. Conciseness: The answer is somewhat verbose. It includes information that isn’t strictly necessary to answer the question, such as specifics about the overall number of templates and customization options, instead of focusing solely on blog templates.\n\n4. Correct Instructions: The answer does correctly instruct the reader on how to choose a template and its customization, but it lacks specific emphasis on the blog aspect of the templates which is critical to the question.\n\nEvaluation Result:'
  • 'Reasoning:\n\n1. Context Grounding: The document clearly identifies "Father Joseph Carrier" as the person holding the professorship of Chemistry and Physics at Notre Dame, not "Father Josh Carrier."\n2. Relevance: The answer provided states that "Father Josh Carrier" held the professorship, which is incorrect based on the provided document.\n3. Conciseness: While the answer is concise, it is not factually accurate.\n\nFinal Result:'

Evaluation

Metrics

Label Accuracy
all 0.7324
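
Accuracy here is simply the fraction of evaluation examples whose predicted label matches the gold label. As a sketch (the inputs below are illustrative; the 0.7324 above comes from the card's own held-out evaluation set, which is not reproduced here):

```python
def accuracy(predictions, references):
    """Fraction of predictions that exactly match the reference labels."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must be the same length")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)

# Illustrative values only
score = accuracy([1, 0, 1, 1], [1, 0, 0, 1])
```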

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("Netta1994/setfit_baai_cybereason_gpt-4o_cot-instructions_remove_final_evaluation_e2_one_out_172")
# Run inference
preds = model("""The percentage in the response status column indicates the total amount of successful completion of response actions.

Reasoning:
1. **Context Grounding**: The answer is well-supported by the document which states, "percentage indicates the total amount of successful completion of response actions."
2. **Relevance**: The answer directly addresses the specific question asked about what the percentage in the response status column indicates.
3. **Conciseness**: The answer is succinct and to the point without unnecessary information.
4. **Specificity**: The answer is specific to what is being asked, detailing exactly what the percentage represents.
5. **Accuracy**: The answer provides the correct key/value as per the document.

Final result:""")

Training Details

Training Set Metrics

Training set   Min   Median     Max
Word count     32    103.2508   245

Label   Training Sample Count
0       312
1       322
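
The word-count statistics above can be reproduced with a simple whitespace tokeniser and the stdlib statistics module (a sketch; the card does not specify its exact tokenisation):

```python
import statistics

def word_count_stats(texts):
    """Min / median / max whitespace-token counts across a list of texts."""
    counts = [len(t.split()) for t in texts]
    return min(counts), statistics.median(counts), max(counts)

# Toy texts; the card's real training set is not reproduced here
texts = [
    "short example",
    "a slightly longer training example here",
    "mid length text sample",
]
lo, med, hi = word_count_stats(texts)
```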

Training Hyperparameters

  • batch_size: (16, 16)
  • num_epochs: (2, 2)
  • max_steps: -1
  • sampling_strategy: oversampling
  • num_iterations: 20
  • body_learning_rate: (2e-05, 2e-05)
  • head_learning_rate: 2e-05
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • l2_weight: 0.01
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False
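
The CosineSimilarityLoss listed above fine-tunes the embedding body so that same-label pairs score near 1 in cosine similarity and different-label pairs score near 0 (with cosine_distance as the distance metric and the 0.25 margin). For reference, a minimal pure-Python cosine similarity (real training code would use a vectorised library):

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

aligned = cosine_similarity([1.0, 0.0], [1.0, 0.0])      # identical direction
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])   # unrelated direction
```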

Training Results

Epoch Step Training Loss Validation Loss
0.0006 1 0.2802 -
0.0315 50 0.2661 -
0.0631 100 0.2533 -
0.0946 150 0.2551 -
0.1262 200 0.2561 -
0.1577 250 0.2516 -
0.1893 300 0.2488 -
0.2208 350 0.2216 -
0.2524 400 0.1693 -
0.2839 450 0.1131 -
0.3155 500 0.0797 -
0.3470 550 0.0429 -
0.3785 600 0.029 -
0.4101 650 0.0202 -
0.4416 700 0.0151 -
0.4732 750 0.0167 -
0.5047 800 0.02 -
0.5363 850 0.0118 -
0.5678 900 0.0027 -
0.5994 950 0.0031 -
0.6309 1000 0.0025 -
0.6625 1050 0.0028 -
0.6940 1100 0.0021 -
0.7256 1150 0.0019 -
0.7571 1200 0.0017 -
0.7886 1250 0.0013 -
0.8202 1300 0.0017 -
0.8517 1350 0.0014 -
0.8833 1400 0.0013 -
0.9148 1450 0.0011 -
0.9464 1500 0.0013 -
0.9779 1550 0.0013 -
1.0095 1600 0.0013 -
1.0410 1650 0.0011 -
1.0726 1700 0.0012 -
1.1041 1750 0.001 -
1.1356 1800 0.001 -
1.1672 1850 0.001 -
1.1987 1900 0.001 -
1.2303 1950 0.0009 -
1.2618 2000 0.001 -
1.2934 2050 0.001 -
1.3249 2100 0.001 -
1.3565 2150 0.0009 -
1.3880 2200 0.001 -
1.4196 2250 0.0009 -
1.4511 2300 0.0009 -
1.4826 2350 0.001 -
1.5142 2400 0.0018 -
1.5457 2450 0.0008 -
1.5773 2500 0.0008 -
1.6088 2550 0.0008 -
1.6404 2600 0.0009 -
1.6719 2650 0.0008 -
1.7035 2700 0.0008 -
1.7350 2750 0.0009 -
1.7666 2800 0.0009 -
1.7981 2850 0.0008 -
1.8297 2900 0.0008 -
1.8612 2950 0.0008 -
1.8927 3000 0.0008 -
1.9243 3050 0.0008 -
1.9558 3100 0.0009 -
1.9874 3150 0.0008 -

Framework Versions

  • Python: 3.10.14
  • SetFit: 1.1.0
  • Sentence Transformers: 3.1.1
  • Transformers: 4.44.0
  • PyTorch: 2.4.0+cu121
  • Datasets: 3.0.0
  • Tokenizers: 0.19.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
Model size: 109M parameters (F32, Safetensors)
