---
base_model: mini1013/master_domain
library_name: setfit
metrics:
- metric
pipeline_tag: text-classification
tags:
- setfit
- sentence-transformers
- text-classification
- generated_from_setfit_trainer
widget:
- text: JCP 애플 펜슬 1세대 USB-C Apple Pencil 어뎁터 포함 (MQLY3KH/A)  주식회사 제이씨엠컴퍼니
- text: 힐링쉴드 갤럭시탭S9 울트라 ARAG 고화질 저반사 액정보호필름1매 후면1매  (주) 힐링쉴드코리아
- text: 다이아큐브 아이패드 프로 13 M4 (2024) 9H PET 슬림강화유리 깨지지않는 액정보호필름, 간편부착 2 6H 고투명 방탄 2
    뷰티코리아(Beauti korea)
- text: 뷰씨 갤럭시탭S6 라이트 10.4인치 강화유리필름(2매) 강화유리필름(2매구성) 주식회사 오토스마트
- text: 갤럭시탭A9 슈페리어 저반사 액정보호필름  (주) 폰트리
inference: true
model-index:
- name: SetFit with mini1013/master_domain
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      name: Unknown
      type: unknown
      split: test
    metrics:
    - type: metric
      value: 0.9694656488549618
      name: Metric
---

# SetFit with mini1013/master_domain

This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [mini1013/master_domain](https://huggingface.co/mini1013/master_domain) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.

The model has been trained using an efficient few-shot learning technique that involves:

1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
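As a rough illustration of step 1, the sketch below builds contrastive training pairs from a handful of labeled texts: same-label pairs get a target similarity of 1.0 and different-label pairs get 0.0. The function name `make_contrastive_pairs` and the toy texts are hypothetical, and this is a simplified stand-in for the idea, not SetFit's actual pair sampler.

```python
import random


def make_contrastive_pairs(texts, labels, num_iterations, seed=42):
    """Build (text_a, text_b, target) triples for contrastive fine-tuning.

    Hypothetical helper: for every text and every iteration, draw one
    same-label partner (target 1.0) and one different-label partner (target 0.0).
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(num_iterations):
        for i, (text, label) in enumerate(zip(texts, labels)):
            positives = [t for j, (t, l) in enumerate(zip(texts, labels))
                         if l == label and j != i]
            negatives = [t for t, l in zip(texts, labels) if l != label]
            if positives:
                pairs.append((text, rng.choice(positives), 1.0))
            if negatives:
                pairs.append((text, rng.choice(negatives), 0.0))
    return pairs


# Toy data (made up for illustration; label ids follow the card's 0-3 scheme)
texts = ["stylus pen A", "stylus pen B", "folio case A", "folio case B"]
labels = [3, 3, 1, 1]
pairs = make_contrastive_pairs(texts, labels, num_iterations=2)
```

In step 2, the fine-tuned body would embed each text and the resulting vectors would be used to fit the LogisticRegression head.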

## Model Details

### Model Description
- **Model Type:** SetFit
- **Sentence Transformer body:** [mini1013/master_domain](https://huggingface.co/mini1013/master_domain)
- **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
- **Maximum Sequence Length:** 512 tokens
- **Number of Classes:** 4 classes
<!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
- **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
- **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)

### Model Labels
| Label | Examples                                                                                                                                                                                                                                                                                                        |
|:------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 3     | <ul><li>'와콤 KP-501E 표준그립펜 인튜어스 프로 펜 와콤펜  에이엠스토어'</li><li>'Apple 애플 펜슬 2세대 미국정품 MU8F2KH/A (3-5일배송)  굿웍스코리아 유한책임회사'</li><li>'교체형 갤탭 볼펜심 펜촉 탭S7 펜슬 S펜 라미 (G428) 블랙 몽실왕자A'</li></ul>                                                                                                                                |
| 0     | <ul><li>'코끼리리빙 아이패드 갤럭시탭S 마그네틱 드로잉 필기 스탠드 거치대 P2WA-3419 12.9(2018/2020/2021/2022)_그레이 주식회사예스대현'</li><li>'뷰씨 갤럭시탭 아이패드 태블릿 거치대 침대 책상 틈새 고정 블랙 주식회사 오토스마트'</li><li>'알파플랜 휴대용 태블릿 거치대 스탠드 갤럭시탭 아이패드 ATH01 매트블랙 주식회사 로리스토어'</li></ul>                                                                               |
| 2     | <ul><li>'뷰씨 아이패드 에어 6세대 11인치 M2 종이 질감 저반사 액정 보호 필름 에어6세대 11인치 (저반사)종이질감필름 제이포레스트'</li><li>'아이패드 에어 6세대 11 종이질감 Light 액정보호필름1매 후면1매  주식회사 스마트'</li><li>'아이패드 프로 3세대 12.9인치 지문방지 종이질감 액정보호필름 아이패드 프로 3세대 12.9_종이질감 액정보호필름 1매 주식회사 제이앤에이'</li></ul>                                                                |
| 1     | <ul><li>'Apple 아이패드 에어 스마트 폴리오 (iPad Air 4,5세대용) - 다크 체리 (MNA43FE/A) 다크 체리 MNA43FE/A (주)블루박스 (Blue Box Co., Ltd)'</li><li>'[N페이적립+커피쿠폰] ESR 아이패드 프로13 폴리오 케이스 프로13_네이비 EC587 주식회사 샘빌'</li><li>'뷰씨 갤럭시탭 S8플러스 / S7플러스 / S7 FE 12.4인치 보디가드 투명범퍼 케이스 갤럭시탭S8+/S7+/S7 FE(공용)_보디가드ㅣ투명 광주스마트폰친구 아이폰 사설수리센터점'</li></ul> |

## Evaluation

### Metrics
| Label   | Metric |
|:--------|:-------|
| **all** | 0.9695 |

## Uses

### Direct Use for Inference

First install the SetFit library:

```bash
pip install setfit
```

Then you can load this model and run inference.

```python
from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("mini1013/master_cate_el22")
# Run inference
preds = model("갤럭시탭A9 슈페리어 저반사 액정보호필름  (주) 폰트리")
```
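The model returns integer label ids (0 to 3). The sketch below attaches readable names to a batch of predictions. The names in `LABEL_NAMES` are an assumption inferred from the example rows in the Model Labels table, not labels published with the model, and `describe_predictions` is a hypothetical helper.

```python
# Assumed label names, inferred from the examples in the Model Labels table
LABEL_NAMES = {
    0: "tablet stand / holder",
    1: "tablet case",
    2: "screen protector film",
    3: "stylus pen / pen accessory",
}


def describe_predictions(texts, preds):
    """Pair each input text with its predicted label id and an assumed name."""
    rows = []
    for text, pred in zip(texts, preds):
        label_id = int(pred)  # also works for tensor or NumPy scalar predictions
        rows.append((text, label_id, LABEL_NAMES.get(label_id, "unknown")))
    return rows


# Stand-in predictions for illustration; with the real model they would come
# from something like: preds = model.predict(texts)
texts = ["갤럭시탭A9 슈페리어 저반사 액정보호필름  (주) 폰트리", "example stylus listing"]
rows = describe_predictions(texts, [2, 3])
```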

<!--
### Downstream Use

*List how someone could finetune this model on their own dataset.*
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Set Metrics
| Training set | Min | Mean   | Max |
|:-------------|:----|:-------|:----|
| Word count   | 6   | 12.075 | 34  |

| Label | Training Sample Count |
|:------|:----------------------|
| 0     | 50                    |
| 1     | 50                    |
| 2     | 50                    |
| 3     | 50                    |

### Training Hyperparameters
- batch_size: (512, 512)
- num_epochs: (20, 20)
- max_steps: -1
- sampling_strategy: oversampling
- num_iterations: 40
- body_learning_rate: (2e-05, 2e-05)
- head_learning_rate: 2e-05
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False

### Training Results
| Epoch   | Step | Training Loss | Validation Loss |
|:-------:|:----:|:-------------:|:---------------:|
| 0.0312  | 1    | 0.4959        | -               |
| 1.5625  | 50   | 0.0683        | -               |
| 3.125   | 100  | 0.0002        | -               |
| 4.6875  | 150  | 0.0001        | -               |
| 6.25    | 200  | 0.0001        | -               |
| 7.8125  | 250  | 0.0           | -               |
| 9.375   | 300  | 0.0           | -               |
| 10.9375 | 350  | 0.0           | -               |
| 12.5    | 400  | 0.0           | -               |
| 14.0625 | 450  | 0.0           | -               |
| 15.625  | 500  | 0.0           | -               |
| 17.1875 | 550  | 0.0           | -               |
| 18.75   | 600  | 0.0           | -               |
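
The epoch and step columns can be cross-checked against the hyperparameters. The sketch below assumes that each of the 200 training samples yields `num_iterations` positive and `num_iterations` negative pairs, and that a partial final batch counts as a step. Under those assumptions the arithmetic reproduces the fractional epochs in the table (for example, step 50 corresponds to epoch 1.5625). The function name `contrastive_schedule` is hypothetical.

```python
import math


def contrastive_schedule(num_samples, num_iterations, batch_size, num_epochs):
    """Estimate pair count and step counts for the contrastive fine-tuning phase.

    Assumption: every sample contributes num_iterations positive pairs and
    num_iterations negative pairs.
    """
    num_pairs = 2 * num_iterations * num_samples
    steps_per_epoch = math.ceil(num_pairs / batch_size)
    total_steps = steps_per_epoch * num_epochs
    return num_pairs, steps_per_epoch, total_steps


# Values taken from this card: 4 labels x 50 samples, num_iterations=40,
# batch_size=512, num_epochs=20
num_pairs, steps_per_epoch, total_steps = contrastive_schedule(200, 40, 512, 20)
epoch_at_step_50 = 50 / steps_per_epoch
epoch_at_step_600 = 600 / steps_per_epoch
```

With these inputs the estimate is 16,000 pairs, 32 steps per epoch, and 640 steps in total, which is consistent with the last logged row at step 600.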

### Framework Versions
- Python: 3.10.12
- SetFit: 1.1.0.dev0
- Sentence Transformers: 3.1.1
- Transformers: 4.46.1
- PyTorch: 2.4.0+cu121
- Datasets: 2.20.0
- Tokenizers: 0.20.0

## Citation

### BibTeX
```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->