SetFit with sentence-transformers/paraphrase-mpnet-base-v2

This is a SetFit model for text classification. It uses sentence-transformers/paraphrase-mpnet-base-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

1. Fine-tuning a Sentence Transformer with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
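
The contrastive step can be pictured as turning a handful of labelled examples into sentence pairs: pairs that share a label get a target similarity of 1.0, and pairs with different labels get 0.0. The sketch below is a simplified illustration under that assumption, not SetFit's actual sampler. The function name `make_pairs` is hypothetical, and the example names come from the label table in this card.

```python
from itertools import combinations


def make_pairs(examples):
    """Build (text_a, text_b, target) triples from (text, label) examples.

    target is 1.0 when both texts share a label, else 0.0.
    Simplified illustration only: SetFit's own sampler additionally
    balances positive and negative pairs (see sampling_strategy).
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(examples, 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Example names taken from the label table of this card.
examples = [
    ("Kabbir Khan Jame Masjid", "Religious Place"),
    ("Jame Masjid", "Religious Place"),
    ("Bombay Biriyani Restaurant", "Food"),
]
pairs = make_pairs(examples)
for pair in pairs:
    print(pair)
```

Three examples yield three pairs: one positive (the two mosques) and two negatives. This is why a few labelled examples per class can still produce a sizeable training set for the embedding model.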

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: sentence-transformers/paraphrase-mpnet-base-v2
Classification head: a LogisticRegression instance
Maximum Sequence Length: 512 tokens
Number of Classes: 28

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
Relegious	'Badc Jame Masjid' 'Modina Masjid' 'Baitul Ehsan Jame Masjid'
Food	'Bombay Biriyani Restaurant' 'Sanim Ghorowa Reatora' 'Attel Mati Restaurant'
Religious PLAce	'Darbar Sharif(Dorbeshe Badsha)' 'Mazar'
Education	'The English Academy' 'Economics Batch' 'Al Manar Model School'
Health Care	'Hope Haspital' 'North Para Community Clinic' 'Al Sami Medical Hall'
Office	'Nari Maitri Dholpur Branch' 'Techsam IT And Computer' 'Chandpur It'
Landmark	'Godaun Moar' 'Kuril Flyover U Turn Bridge' 'Manik Miya Avenue Moar'
Fuel	'Mimi Enterprise' 'Sariful Filling Station' 'M/s Aruja Enterprise'
Religious Place	'Kabbir Khan Jame Masjid' 'Sri Sri Nayanta Babar Mandir' 'Jordan Church of Christ'
Transportation	'Lala Khal Ferry Terminal' 'Porshuram Cng Stand' 'Riad Cycle Garage'
Agricultural	'Catlle Farm' 'Pushon Narsari' 'Vegetable garden'
Residential	'Ovinondon Chattrabas' 'TH Chattrabas' 'Seven Star Chattrabas'
shop	'Mayer Doya Store'
Bank	'Dutch Bangla Bank Limited Maijde (DBBL)' 'Jamuna Bank Limited Dholaikhal Branch' 'Prime Bank Limited Elephant Branch'
Utility	'Shahi Eidgah Water Tank' 'Pole No 31' 'Kalmilata Kacha Bazar'
Healthcare	'Oloukik' 'Burhanuddin Upazila Health Complex' 'Dr Nazmin Akter Najma'
Government	'Zilla Parishad Karjaloy Bhola' "Sub Police Commissioner's Bhaban (Tejgaon Branch)" 'Family Planning Office Satkhira'
Recreation	'Shaikh Rasel Sriti Shongho' 'Beraid Camping And Kayaking Zone (BCKZ)' 'Shohag Palli Picnic Spot & Resort'
Religious	'Baitul Mamur Jame Masjid' 'Petrol Pump Jame Masjid' 'Opsonnin Pharma Ltd Jame Masjid'
Religious Place	'Jame Masjid' 'Hospital Masjid' 'Badar Mokam Jame Masjid'
Shop	'Nayeem General Store' 'Bazlu Engineering & Refrigeration' 'Mukta Dulal'
Commercial	'Mazar Kacha Bazar' 'Fall Bazar Kola Potti' 'Venus Autos'
Industry	'Rn Integrated Argo' 'Fresh Dairy Firm' 'Hemple Rhee Mfg Limited'
Hotel	'Warisan' 'Hotel New London Palace Abashik' 'Sada Vat'
construction	'Fahim Hardware Store' 'O A Frame Gallery'
Construction	'Khalil Steel' 'Sanaullah Tiles And Sanitary House' 'Mukta Glass And Thai Aluminum'
Relegious Place	'Baitul Atiq Jam-E Masjid' 'Hathazari Bus Stand Baitussalam Jame Masjid' 'Osman Bin Affan Jame Masjid'
education	'Masum Electronic'

Evaluation

Metrics

Label	Accuracy
all	0.33
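
For reference, accuracy here is read as the fraction of evaluation examples whose predicted label exactly matches the reference label. The sketch below shows that computation on hypothetical toy data; the function name `accuracy` and the sample labels are illustrative and are not the card's evaluation set.

```python
def accuracy(predictions, references):
    """Fraction of predictions that exactly match the reference labels."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty evaluation set")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


# Hypothetical toy data, not the card's evaluation set.
preds = ["Food", "Bank", "Shop"]
refs = ["Food", "Hotel", "shop"]
print(accuracy(preds, refs))
```

Because the comparison is an exact string match, "Shop" and "shop" count as different labels in this sketch.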

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("rafi138/setfit-paraphrase-mpnet-base-v2-type")
# Run inference
preds = model("Dadon Hotel")
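
To classify several place names at once, a small wrapper along the following lines may be convenient. It is a sketch that assumes the loaded model exposes `predict` and returns one label per input; the helper name `classify_places` is hypothetical, and the real-model usage is left in comments because it needs the `setfit` package and network access.

```python
def classify_places(model, names):
    """Return (name, predicted_label) pairs for a list of place names.

    `model` is expected to be a loaded SetFitModel; any object with a
    `predict(list_of_str)` method returning one label per input works.
    """
    labels = model.predict(names)
    return [(name, str(label)) for name, label in zip(names, labels)]


# Usage with the real model (requires `pip install setfit` and network access):
#   from setfit import SetFitModel
#   model = SetFitModel.from_pretrained("rafi138/setfit-paraphrase-mpnet-base-v2-type")
#   print(classify_places(model, ["Dadon Hotel", "Modina Masjid"]))
```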

Training Details

Training Set Metrics

Training set	Min	Median	Max
Word count	1	3.5	7
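
Statistics of this kind can be reproduced roughly as follows. The sketch assumes words are counted by splitting on whitespace, which may differ from how the figures above were produced; the helper name `word_count_stats` is hypothetical, and the sample uses a few names from the label table rather than the full training set.

```python
from statistics import median


def word_count_stats(texts):
    """Return (min, median, max) of whitespace-separated word counts."""
    counts = [len(text.split()) for text in texts]
    return min(counts), median(counts), max(counts)


# A few example names from the label table, not the full training set.
sample = ["Mazar", "Jame Masjid", "Hope Haspital", "Al Manar Model School"]
print(word_count_stats(sample))  # → (1, 2.0, 4)
```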

Label	Training Sample Count
Shop, Commercial, Government, Healthcare, Education, Food, Office, Religious Place, Bank, Transportation, Construction, Industry, Residential, Landmark, Recreation, Fuel, Hotel, Utility, Agricultural	0

Training Hyperparameters

batch_size: (32, 32)
num_epochs: (4, 4)
max_steps: -1
sampling_strategy: oversampling
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
seed: 42
eval_max_steps: -1
load_best_model_at_end: True
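
As a rough worked example, these hyperparameters can be combined with the step counts in the Training Results table to estimate the size of the embedding phase. The arithmetic below is a sketch: the value of 313 steps per epoch is read off the table (step 313 at epoch 1.0), and the warmup rule (a proportion of all embedding-phase steps, rounded up) is an assumption.

```python
import math

batch_size = 32            # embedding-phase value from batch_size: (32, 32)
num_epochs = 4             # embedding-phase value from num_epochs: (4, 4)
warmup_proportion = 0.1
steps_per_epoch = 313      # step count at epoch 1.0 in the Training Results table

total_steps = steps_per_epoch * num_epochs
# Upper bound: the last batch of an epoch may hold fewer than batch_size pairs.
max_pairs_per_epoch = steps_per_epoch * batch_size
# Assumption: warmup covers warmup_proportion of all embedding-phase steps, rounded up.
warmup_steps = math.ceil(total_steps * warmup_proportion)

print(total_steps, max_pairs_per_epoch, warmup_steps)  # → 1252 10016 126
```

The total of 1252 steps matches the final row of the Training Results table.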

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0006	1	0.1851	-
0.0282	50	0.1697	-
0.0564	100	0.1876	-
0.0032	1	0.169	-
0.1597	50	0.081	-
0.3195	100	0.0641	-
0.4792	150	0.033	-
0.6390	200	0.0128	-
0.7987	250	0.0089	-
0.9585	300	0.0106	-
1.0	313	-	0.3235
1.1182	350	0.0215	-
1.2780	400	0.017	-
1.4377	450	0.0057	-
1.5974	500	0.0047	-
1.7572	550	0.0064	-
1.9169	600	0.003	-
2.0	626	-	0.3481
2.0767	650	0.0043	-
2.2364	700	0.0022	-
2.3962	750	0.0014	-
2.5559	800	0.0028	-
2.7157	850	0.0018	-
2.8754	900	0.002	-
3.0	939	-	0.3393
3.0351	950	0.0294	-
3.1949	1000	0.002	-
3.3546	1050	0.0017	-
3.5144	1100	0.0017	-
3.6741	1150	0.0015	-
3.8339	1200	0.0013	-
3.9936	1250	0.0014	-
4.0	1252	-	0.348

The bold row denotes the saved checkpoint.
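
Assuming load_best_model_at_end: True keeps the evaluation step with the lowest validation loss, the four evaluated steps can be compared as follows. This is a sketch of that selection rule applied to the table's values, not a statement of which checkpoint was actually saved.

```python
# (step, validation loss) rows copied from the Training Results table.
validation_losses = [
    (313, 0.3235),
    (626, 0.3481),
    (939, 0.3393),
    (1252, 0.348),
]

# Assumption: "best" means lowest validation loss.
best_step, best_loss = min(validation_losses, key=lambda row: row[1])
print(best_step, best_loss)  # → 313 0.3235
```

Under that assumption the end of the first epoch would be selected, since validation loss does not improve afterwards even though training loss keeps falling.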

Framework Versions

Python: 3.10.12
SetFit: 1.0.3
Sentence Transformers: 2.2.2
Transformers: 4.35.2
PyTorch: 2.1.0+cu121
Datasets: 2.16.1
Tokenizers: 0.15.0

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
