SetFit

This is a SetFit model for text classification. An SVC instance serves as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
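
Step 1 relies on sentence pairs built from the few labeled examples. The sketch below illustrates one way such pairs could be formed: same-label pairs get a target similarity of 1.0 and different-label pairs get 0.0, which is the kind of target a cosine-similarity loss expects. The function name `build_contrastive_pairs` and the toy sentences are hypothetical, and this is not the SetFit library's own implementation.

```python
from itertools import combinations


def build_contrastive_pairs(texts, labels):
    """Pair every two distinct examples; target is 1.0 if labels match, else 0.0."""
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Toy data for illustration only (not taken from the training set).
texts = ["example A", "example B", "example C"]
labels = [0, 0, 1]
pairs = build_contrastive_pairs(texts, labels)
for text_a, text_b, target in pairs:
    print(f"{target:.1f}  {text_a!r} <-> {text_b!r}")
```

Even a handful of labeled sentences yields many pairs this way, which is what makes the contrastive step workable in a few-shot setting.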

Model Details

Model Description

  • Model Type: SetFit
  • Classification head: an SVC instance
  • Maximum Sequence Length: 384 tokens
  • Number of Classes: 2

Model Labels

Label 0 examples:
  • 'ESG funds often charge many times more for investment funds that are nearly indistinguishable from those without the ESG title.'
  • 'They are California, Florida, Illinois, Nebraska, New York, and Wyoming.'
  • 'And so it goes.'

Label 1 examples:
  • 'Republicans attempted to pass a resolution that would have enabled Congress to force workers to accept a deal, which was fortunately blocked by (who else) Senator Bernie Sanders.'
  • 'No government ever surrenders power, even its emergency powers—not really.'
  • 'No citizen in a democratic society should want executives from $10trn financial institutions to play a larger role than they already do in defining and implementing social values.'

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("SOUMYADEEPSAR/Setfit_random_sample_svm_head")
# Run inference
preds = model("What could possibly go wrong?")
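
For more than one input, the model can also be called with a list of strings. The helper below is a minimal sketch that groups texts by predicted label. `group_by_prediction` is a hypothetical name, and it accepts any callable that maps a list of strings to one label per string, so it is shown here with a stand-in predictor rather than the real model. The card does not name the two classes, so the labels stay as the integers 0 and 1.

```python
from collections import defaultdict


def group_by_prediction(predict, texts):
    """Run `predict` on all texts at once and bucket the texts by predicted label."""
    predictions = predict(texts)
    grouped = defaultdict(list)
    for text, prediction in zip(texts, predictions):
        grouped[int(prediction)].append(text)
    return dict(grouped)


# Stand-in predictor for illustration only: labels a text 1 if it ends with "?".
def fake_predict(texts):
    return [1 if text.endswith("?") else 0 for text in texts]


result = group_by_prediction(
    fake_predict, ["What could possibly go wrong?", "And so it goes."]
)
print(result)
```

With the loaded model, `group_by_prediction(model, texts)` should behave the same way, assuming each prediction converts cleanly with `int()`.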

Training Details

Training Set Metrics

Training set   Min   Median    Max
Word count     3     23.4159   68

Label   Training Sample Count
0       136
1       78
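
The label counts show a moderately imbalanced training set. As a quick worked example, the arithmetic below derives the class shares and the accuracy of a trivial predictor that always outputs the majority label on data with the same distribution. That baseline is an illustrative reference point added here, not a figure reported in this card.

```python
label_counts = {0: 136, 1: 78}  # training sample counts from this card

total = sum(label_counts.values())
shares = {label: count / total for label, count in label_counts.items()}
majority_label = max(label_counts, key=label_counts.get)

print(f"total samples: {total}")
for label, share in shares.items():
    print(f"label {label}: {share:.1%}")
print(f"majority-class baseline accuracy: {shares[majority_label]:.1%}")
```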

Training Hyperparameters

  • batch_size: (8, 8)
  • num_epochs: (1, 1)
  • max_steps: -1
  • sampling_strategy: oversampling
  • body_learning_rate: (2e-05, 2e-05)
  • head_learning_rate: 2e-05
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False
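
To make `warmup_proportion` concrete: with the 3100 optimizer steps reported under Training Results, a proportion of 0.1 corresponds to 310 warmup steps. The sketch below assumes a linear warmup to `body_learning_rate` followed by a linear decay to zero. The card does not state the scheduler shape, so treat that as an assumption, and `lr_at_step` as a hypothetical helper.

```python
import math


def lr_at_step(step, total_steps=3100, peak_lr=2e-05, warmup_proportion=0.1):
    """Learning rate under linear warmup then linear decay (assumed schedule)."""
    warmup_steps = math.ceil(total_steps * warmup_proportion)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / (total_steps - warmup_steps)


warmup_steps = math.ceil(3100 * 0.1)
print(f"warmup steps: {warmup_steps}")
for step in (0, 155, 310, 1705, 3100):
    print(f"step {step:>4}: lr = {lr_at_step(step):.2e}")
```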

Training Results

Epoch Step Training Loss Validation Loss
0.0003 1 0.3597 -
0.0161 50 0.2693 -
0.0323 100 0.2501 -
0.0484 150 0.2691 -
0.0645 200 0.063 -
0.0806 250 0.0179 -
0.0968 300 0.0044 -
0.1129 350 0.0003 -
0.1290 400 0.0005 -
0.1452 450 0.0002 -
0.1613 500 0.0003 -
0.1774 550 0.0001 -
0.1935 600 0.0001 -
0.2097 650 0.0001 -
0.2258 700 0.0001 -
0.2419 750 0.0001 -
0.2581 800 0.0 -
0.2742 850 0.0001 -
0.2903 900 0.0002 -
0.3065 950 0.0 -
0.3226 1000 0.0 -
0.3387 1050 0.0002 -
0.3548 1100 0.0 -
0.3710 1150 0.0001 -
0.3871 1200 0.0001 -
0.4032 1250 0.0 -
0.4194 1300 0.0 -
0.4355 1350 0.0 -
0.4516 1400 0.0001 -
0.4677 1450 0.0 -
0.4839 1500 0.0 -
0.5 1550 0.0001 -
0.5161 1600 0.0001 -
0.5323 1650 0.0 -
0.5484 1700 0.0 -
0.5645 1750 0.0 -
0.5806 1800 0.0 -
0.5968 1850 0.0 -
0.6129 1900 0.0 -
0.6290 1950 0.0001 -
0.6452 2000 0.0 -
0.6613 2050 0.0 -
0.6774 2100 0.0 -
0.6935 2150 0.0001 -
0.7097 2200 0.0 -
0.7258 2250 0.0 -
0.7419 2300 0.0001 -
0.7581 2350 0.0001 -
0.7742 2400 0.0001 -
0.7903 2450 0.0 -
0.8065 2500 0.0 -
0.8226 2550 0.0 -
0.8387 2600 0.0 -
0.8548 2650 0.0001 -
0.8710 2700 0.0001 -
0.8871 2750 0.0 -
0.9032 2800 0.0 -
0.9194 2850 0.0 -
0.9355 2900 0.0001 -
0.9516 2950 0.0 -
0.9677 3000 0.0001 -
0.9839 3050 0.0 -
1.0 3100 0.0 -
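
The 3100 steps in one epoch can be reproduced from the label counts, under two assumptions about pair construction that this card does not spell out: pairs are unordered combinations that include pairing a sentence with itself, and with `sampling_strategy: oversampling` the smaller of the positive and negative pair sets is repeated until it matches the larger. This is a back-of-the-envelope sketch, not the library's code.

```python
import math

label_counts = {0: 136, 1: 78}  # training sample counts from this card
batch_size = 8

# Same-label pairs, counting unordered pairs plus self-pairs: n * (n + 1) / 2.
positive_pairs = sum(n * (n + 1) // 2 for n in label_counts.values())
# Different-label pairs: every label-0 sentence with every label-1 sentence.
negative_pairs = label_counts[0] * label_counts[1]

# Oversampling (assumed): the smaller set is repeated up to the size of the larger.
pairs_per_epoch = 2 * max(positive_pairs, negative_pairs)
steps_per_epoch = math.ceil(pairs_per_epoch / batch_size)

print(positive_pairs, negative_pairs, pairs_per_epoch, steps_per_epoch)
```

Under these assumptions the computed value is 3100 steps, which agrees with the last row of the table.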

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.0.3
  • Sentence Transformers: 3.0.1
  • Transformers: 4.39.0
  • PyTorch: 2.3.0+cu121
  • Datasets: 2.20.0
  • Tokenizers: 0.15.2

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}