SetFit with BAAI/bge-base-en-v1.5

This is a SetFit model for text classification. It uses BAAI/bge-base-en-v1.5 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

Fine-tuning a Sentence Transformer with contrastive learning.
Training a classification head with features from the fine-tuned Sentence Transformer.
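
The first step can be illustrated with a small sketch of how labeled examples become contrastive pairs. This is a simplified illustration in plain Python, not SetFit's actual sampler: the function name make_contrastive_pairs and the toy sentences are hypothetical. Pairs that share a label get a target similarity of 1.0 and pairs with different labels get 0.0.

```python
import random
from typing import List, Tuple

def make_contrastive_pairs(
    texts: List[str],
    labels: List[float],
    num_iterations: int,
    seed: int = 42,
) -> List[Tuple[str, str, float]]:
    """Hypothetical helper: build (text_a, text_b, target) pairs.

    For every example and every iteration, draw one partner with the
    same label (target 1.0) and one with a different label (target 0.0).
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(num_iterations):
        for i, (text, label) in enumerate(zip(texts, labels)):
            same = [j for j, other in enumerate(labels) if other == label and j != i]
            diff = [j for j, other in enumerate(labels) if other != label]
            if same:
                pairs.append((text, texts[rng.choice(same)], 1.0))
            if diff:
                pairs.append((text, texts[rng.choice(diff)], 0.0))
    return pairs

# Toy data, invented for illustration only.
toy_texts = ["I cannot answer that.", "Not enough context.", "Click Export.", "Open Settings."]
toy_labels = [1.0, 1.0, 0.0, 0.0]

pairs = make_contrastive_pairs(toy_texts, toy_labels, num_iterations=2)
print(len(pairs))  # 2 iterations * 4 examples * 2 pairs each = 16
```

The Sentence Transformer body is then fine-tuned so that embeddings of same-label pairs move closer together and embeddings of different-label pairs move apart; the classification head is fitted afterwards on the resulting embeddings.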

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: BAAI/bge-base-en-v1.5
Classification head: a LogisticRegression instance
Maximum Sequence Length: 512 tokens
Number of Classes: 2

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
1.0	'Based solely on the given details, it is not feasible to ascertain the physical impacts on the body when an individual is fatigued and subsequently encounters a surge of energy. The provided data primarily concentrates on pH and CO levels in various contexts such as conductivity, soil, water, and culture vessels. Nevertheless, it does not' 'To get a ShoutOut to pop up monthly, you would need to set the frequency options for the ShoutOut to trigger once a month. However, the provided contexts only mention frequency options such as "Once," "Once a Day," and "Always." There is no direct mention of a monthly frequency option in the provided contexts.\n\nGiven this information, the answer to your question based on the provided contexts is: "I'm sorry, I'm not sure how to answer your question. Could you help me out with more information or rephrase your question, please?' "I can see how having the credit card details of a business would provide a deeper understanding of their expenditures. Yet, releasing information such as credit card numbers is strictly against privacy policies and regulations. It's illegal, unethical, and a severe breach of trust to share such confidential details."
0.0	'pRect is an object that contains the x, y, width, and height properties. It is used to determine the index of the object in the nodes array and to insert the object into the nodes object.' 'Yes, you can search an outside knowledge base using the keywords a user searched for in the player menu. WalkMe offers a Search Provider Integration feature that allows you to supplement your WalkMe items with your existing knowledge base or support center resources. Once enabled, a search performed within the WalkMe Widget will yield results from the specified domains, showing your existing content alongside your WalkMe content. The current supported search providers for this integration are Zendesk, Desk, Bing, and Google. If your current search provider is not on the supported list, please reach out to your Account Manager for further assistance. For more information on how to set up the Search Provider Integration, please refer to our Support article. How else can I assist you today?' 'Write a precise answer to "how to export homepage to pdf" only based on "KB12345". Only when absolutely confident that If the information is not present in the "KB12345", respond with Answer Not Found.'

Evaluation

Metrics

Label	Accuracy
all	0.9794

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("Netta1994/setfit_baai_oversampling_2k")
# Run inference
preds = model("The author clearly cites it as a Reddit thread.  In a scholastic paper,  you would be expected to have a bit more original content,  but you wouldn't 'get in trouble' ")
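
To classify several texts at once, a thin wrapper can pair each input with its predicted label. The sketch below is an assumption-laden illustration: classify_batch and dummy_predict are hypothetical names, and the wrapper only assumes that the predictor returns one value per input text that can be converted with float(). A stand-in predictor is used so the example runs without downloading the model.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

def classify_batch(
    predict: Callable[[List[str]], Sequence],
    texts: Iterable[str],
) -> List[Tuple[str, float]]:
    """Hypothetical helper: return (text, label) tuples for a batch."""
    batch = list(texts)
    if not batch:
        return []
    raw = predict(batch)  # one prediction per input text
    return [(text, float(pred)) for text, pred in zip(batch, raw)]

# Stand-in predictor so the sketch runs offline.
# It is NOT the real classifier: it labels every text 0.0.
def dummy_predict(texts: List[str]) -> List[float]:
    return [0.0 for _ in texts]

results = classify_batch(dummy_predict, ["first response", "second response"])
for text, label in results:
    print(f"{label}: {text}")
```

With the loaded model, model.predict would take the place of dummy_predict, assuming it accepts a list of strings.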

Training Details

Training Set Metrics

Training set	Min	Median	Max
Word count	1	89.6623	412

Label	Training Sample Count
0.0	1454
1.0	527
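
The label counts above imply a noticeable class imbalance. The short calculation below makes it explicit; the variable names are illustrative, and the comparison to a majority-class baseline assumes an evaluation split with a similar label balance, which the card does not state.

```python
# Training sample counts per label, taken from the table above.
train_counts = {0.0: 1454, 1.0: 527}

total = sum(train_counts.values())
shares = {label: count / total for label, count in train_counts.items()}
imbalance_ratio = max(train_counts.values()) / min(train_counts.values())
majority_share = max(shares.values())

print(total)                      # 1981
print(round(imbalance_ratio, 2))  # 2.76
print(round(majority_share, 3))   # 0.734
```

Under that assumption, always predicting label 0.0 would score about 0.734 accuracy, which gives some context for the reported 0.9794.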

Training Hyperparameters

batch_size: (16, 16)
num_epochs: (1, 1)
max_steps: -1
sampling_strategy: oversampling
num_iterations: 20
body_learning_rate: (2e-05, 2e-05)
head_learning_rate: 2e-05
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
seed: 42
eval_max_steps: -1
load_best_model_at_end: False
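
The loss setting above, CosineSimilarityLoss, scores each sentence pair by the squared difference between the cosine similarity of the two embeddings and the pair's target (1.0 for same label, 0.0 for different labels). The sketch below reimplements that idea in plain Python on toy 2-dimensional vectors; the function names and vectors are illustrative, not the library's code. As far as the SetFit documentation describes, margin and distance_metric apply to triplet-style losses and are not used with this loss.

```python
import math
from typing import List, Sequence, Tuple

def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cosine_similarity_loss(
    pairs: List[Tuple[Sequence[float], Sequence[float], float]]
) -> float:
    """Mean squared error between cosine similarity and the pair target."""
    errors = [(cosine_similarity(a, b) - target) ** 2 for a, b, target in pairs]
    return sum(errors) / len(errors)

toy_pairs = [
    ([1.0, 0.0], [1.0, 0.0], 1.0),  # identical vectors, same label: error 0
    ([1.0, 0.0], [0.0, 1.0], 0.0),  # orthogonal vectors, different labels: error 0
    ([1.0, 0.0], [1.0, 0.0], 0.0),  # identical vectors, different labels: error 1
]

loss = cosine_similarity_loss(toy_pairs)
print(round(loss, 4))  # 0.3333
```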

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0002	1	0.2372	-
0.0101	50	0.251	-
0.0202	100	0.2158	-
0.0303	150	0.1107	-
0.0404	200	0.1093	-
0.0505	250	0.0177	-
0.0606	300	0.0226	-
0.0707	350	0.1052	-
0.0808	400	0.0055	-
0.0909	450	0.0057	-
0.1009	500	0.0032	-
0.1110	550	0.0021	-
0.1211	600	0.0114	-
0.1312	650	0.066	-
0.1413	700	0.0018	-
0.1514	750	0.0631	-
0.1615	800	0.0015	-
0.1716	850	0.0018	-
0.1817	900	0.0013	-
0.1918	950	0.0015	-
0.2019	1000	0.0018	-
0.2120	1050	0.0589	-
0.2221	1100	0.0011	-
0.2322	1150	0.0016	-
0.2423	1200	0.0017	-
0.2524	1250	0.0011	-
0.2625	1300	0.0012	-
0.2726	1350	0.0012	-
0.2827	1400	0.0011	-
0.2928	1450	0.0011	-
0.3028	1500	0.0652	-
0.3129	1550	0.0014	-
0.3230	1600	0.0009	-
0.3331	1650	0.0008	-
0.3432	1700	0.0008	-
0.3533	1750	0.0006	-
0.3634	1800	0.0007	-
0.3735	1850	0.0012	-
0.3836	1900	0.0007	-
0.3937	1950	0.0008	-
0.4038	2000	0.0008	-
0.4139	2050	0.0008	-
0.4240	2100	0.0008	-
0.4341	2150	0.0007	-
0.4442	2200	0.0585	-
0.4543	2250	0.001	-
0.4644	2300	0.0004	-
0.4745	2350	0.0006	-
0.4846	2400	0.0006	-
0.4946	2450	0.0008	-
0.5047	2500	0.0005	-
0.5148	2550	0.0005	-
0.5249	2600	0.0618	-
0.5350	2650	0.0007	-
0.5451	2700	0.0007	-
0.5552	2750	0.0007	-
0.5653	2800	0.0005	-
0.5754	2850	0.0006	-
0.5855	2900	0.0007	-
0.5956	2950	0.0005	-
0.6057	3000	0.0005	-
0.6158	3050	0.0006	-
0.6259	3100	0.0007	-
0.6360	3150	0.0004	-
0.6461	3200	0.0003	-
0.6562	3250	0.0005	-
0.6663	3300	0.0006	-
0.6764	3350	0.0005	-
0.6865	3400	0.0007	-
0.6965	3450	0.0007	-
0.7066	3500	0.0005	-
0.7167	3550	0.0007	-
0.7268	3600	0.0004	-
0.7369	3650	0.0004	-
0.7470	3700	0.0005	-
0.7571	3750	0.0004	-
0.7672	3800	0.0005	-
0.7773	3850	0.0004	-
0.7874	3900	0.0004	-
0.7975	3950	0.0005	-
0.8076	4000	0.0003	-
0.8177	4050	0.0005	-
0.8278	4100	0.0004	-
0.8379	4150	0.0006	-
0.8480	4200	0.0004	-
0.8581	4250	0.0004	-
0.8682	4300	0.0005	-
0.8783	4350	0.0003	-
0.8884	4400	0.0005	-
0.8984	4450	0.0003	-
0.9085	4500	0.0005	-
0.9186	4550	0.0004	-
0.9287	4600	0.0004	-
0.9388	4650	0.0008	-
0.9489	4700	0.0003	-
0.9590	4750	0.0005	-
0.9691	4800	0.0003	-
0.9792	4850	0.0004	-
0.9893	4900	0.0004	-
0.9994	4950	0.0003	-
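
The Epoch column can be reproduced from the numbers reported elsewhere in this card. The calculation below assumes that, with num_iterations set, one positive and one negative pair are generated per training sample per iteration, and that the last partial batch is kept; both are assumptions about the training setup, not statements from the card.

```python
import math

n_samples = 1454 + 527   # training sample counts for labels 0.0 and 1.0
num_iterations = 20      # from the hyperparameters
batch_size = 16          # embedding batch size from the hyperparameters

# Assumed: one positive and one negative pair per sample per iteration.
total_pairs = 2 * num_iterations * n_samples
# Assumed: the final partial batch is not dropped.
steps_per_epoch = math.ceil(total_pairs / batch_size)

def epoch_fraction(step: int) -> float:
    return round(step / steps_per_epoch, 4)

print(total_pairs, steps_per_epoch)  # 79240 4953
print(epoch_fraction(1), epoch_fraction(50), epoch_fraction(4950))  # 0.0002 0.0101 0.9994
```

These values match the first, second, and last rows of the table, so the logged run covers essentially one full epoch of about 4953 steps.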

Framework Versions

Python: 3.10.14
SetFit: 1.0.3
Sentence Transformers: 3.0.0
Transformers: 4.40.1
PyTorch: 2.2.0+cu121
Datasets: 2.19.1
Tokenizers: 0.19.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
