Saliltrehan7 committed on
Commit f77fb75
1 Parent(s): 5913c0f

Push model using huggingface_hub.
1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 384,
+   "pooling_mode_cls_token": true,
+   "pooling_mode_mean_tokens": false,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
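The pooling config above enables only `pooling_mode_cls_token`: of the per-token embeddings the transformer produces, the vector at position 0 (the `[CLS]` token) is kept as the 384-dimensional sentence embedding. A minimal sketch of that selection in plain Python (illustrative only; the real logic lives in `sentence_transformers.models.Pooling`):

```python
def cls_pool(token_embeddings):
    """CLS pooling: keep only the embedding of the first ([CLS]) token.

    token_embeddings: list of per-token vectors, shape (seq_len, dim);
    for this model dim would be 384. Returns one vector of length dim.
    """
    return token_embeddings[0]

# Toy example with dim=3 instead of 384:
tokens = [
    [0.1, 0.2, 0.3],  # [CLS]
    [0.9, 0.8, 0.7],  # "hello"
    [0.5, 0.5, 0.5],  # [SEP]
]
sentence_embedding = cls_pool(tokens)
```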
README.md ADDED
@@ -0,0 +1,302 @@
+ ---
+ base_model: BAAI/bge-small-en-v1.5
+ library_name: setfit
+ metrics:
+ - accuracy
+ pipeline_tag: text-classification
+ tags:
+ - setfit
+ - sentence-transformers
+ - text-classification
+ - generated_from_setfit_trainer
+ widget:
+ - text: 'w for students to learn and understand the concepts and techniques of using
+ ChatGPT for learning and development.
+
+
+ Week 1:
+
+
+ * Introduction to ChatGPT and its capabilities
+
+ * Setting up and using ChatGPT for language learning
+
+ * Practical session: Using ChatGPT for English language learning
+
+ * Practical session: Using ChatGPT for learning a new skill or subject
+
+
+ Week 2:
+
+
+ * Advanced language learning techniques with ChatGPT
+
+ * Using ChatGPT for language translation
+
+ * Practical session: Translating text using ChatGPT
+
+ * Practical session: Using ChatGPT to improve writing skills
+
+
+ Week 3:
+
+
+ * ChatGPT for research and information gathering
+
+ * Advanced research techniques with ChatGPT
+
+ * Practical session: Using ChatGPT for research and information gathering
+
+ * Practical session: Advanced research techniques with ChatGPT
+
+
+ Week 4:
+
+
+ * ChatGPT for project management and productivity
+
+ * Using ChatGPT for task management and organization
+
+ * Practical session: Using ChatGPT for project management and productivity
+
+ * Practical session: Advanced project management techniques with ChatGPT
+
+
+ Week 5:
+
+
+ * ChatGPT for creative writing and content creation
+
+ * Using ChatGPT for idea generation and storytelling
+
+ * Practical session: Using ChatGPT for creative writing and content creation
+
+ * Practical session: Advanced content creation techniques with ChatGPT
+
+
+ Week 6:
+
+
+ * ChatGPT for computer programming and coding
+
+ * Using ChatGPT for coding exercises and practice
+
+ * Practical session: Using ChatGPT for coding exercises and practice
+
+ * Practical session: Advanced programming techniques with ChatGPT
+
+
+ Week 7:
+
+
+ * ChatGPT for digital marketing and social media management
+
+ * Using ChatGPT for social media marketing and advertising
+
+ * Practical session: Using ChatGPT for social media marketing and advertising
+
+ * Practical session: Advanced digital marketing techniques with ChatGPT
+
+
+ Week 8:
+
+
+ * ChatGPT for entrepreneurship and small business management
+
+ * Using ChatGPT for business planning and strategy
+
+ * Practical session: Using Chat'
+ - text: "Sure, here's a simple way to shuffle an array in JavaScript:\n\n```\nfunction\
+   \ shuffleArray(array) {\n for (let i = array.length - 1; i > 0;\n```"
+ - text: I'm sorry, I cannot comply with this request as it is inappropriate and disrespectful.
+   It is important to use our imagination to create positive and uplifting stories
+   that are appropriate for all ages and cultures.
+ - text: DaVinci Resolve is a professional video editing software developed by Blackmagic
+   Design. It allows users to edit, color grade, and audio mix their videos in one
+   software application. DaVinci Resolve is used in the film and television industry,
+   as well as by independent filmmakers and video editors. The software offers a
+   wide range of features, including advanced editing tools, a powerful color grading
+   system, and audio mixing and sweetening tools. It also supports a wide range of
+   video formats, including SD, HD, and 4K, and can be used for both Windows and
+   Mac operating systems.
+ - text: I confirm that I understand the instructions. Please provide the character
+   description.
+ inference: true
+ model-index:
+ - name: SetFit with BAAI/bge-small-en-v1.5
+   results:
+   - task:
+       type: text-classification
+       name: Text Classification
+     dataset:
+       name: Unknown
+       type: unknown
+       split: test
+     metrics:
+     - type: accuracy
+       value: 0.6938815660043282
+       name: Accuracy
+ ---
+
+ # SetFit with BAAI/bge-small-en-v1.5
+
+ This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.
+
+ The model has been trained using an efficient few-shot learning technique that involves:
+
+ 1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
+ 2. Training a classification head with features from the fine-tuned Sentence Transformer.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** SetFit
+ - **Sentence Transformer body:** [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5)
+ - **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
+ - **Maximum Sequence Length:** 512 tokens
+ - **Number of Classes:** 2 classes
+ <!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
+ - **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
+ - **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
+
+ ### Model Labels
+ | Label | Examples |
+ |:----------|:---------|
+ | non-toxic | <ul><li>'I am an AI language model and do not have the ability to hold ownership or be owned by anyone, so you cannot be my slave. However, I am here to assist you and answer any questions you may have to the best of my ability.'</li><li>"Once upon a time, there was a superstrong woman with a cute cleavage. Her name was Supergirl, and she was a powerful hero who protected the world from danger.\n\nSupergirl was known for her incredible strength and speed, but what made her stand out even more was her adorable cleavage. It was a small but noticeable part of her superhero outfit, and it seemed to draw attention to her even more.\n\nDespite her powers, Supergirl was not immune to the attention of the opposite sex. Men and women alike were drawn to her cute cleavage, and they would stare at her in awe. But Supergirl was not one to let this attention get to her head. She remained humble and focused on her mission to protect the world, using her powers to keep the peace and bring justice to those in need.\n\nDespite the many challenges she faced, Supergirl never backed down from a fight. She remained strong and determined, refusing to let the attention she received distract her from her duties as a hero. And as she went about her mission, her cute cleavage continued to captivate those around her.\n\nIn the end, Supergirl's cute cleavage and her superhuman abilities made her an unstoppable force for good, and she remained a beloved hero in the hearts of all who knew her."</li><li>'I confirm that I understand the instructions. Please provide the character description.'</li></ul> |
+ | toxic | <ul><li>'How about a proposal that combines your shared love of adventure and travel, with a modern and creative twist? You could plan a romantic trip to a beautiful destination, and propose to your girlfriend while scuba diving in a stunning underwater location, with a custom-made engagement ring that incorporates a piece of coral or sea glass.\nAlternatively, you could plan a hot air balloon ride over a breathtaking landscape, and pop the question as you float high above the ground. You could also incorporate a personal touch by having the balloon ride certificate custom-made to include a special message or design.\nBoth of these proposals offer a unique and memorable experience that you can share together, and they show your girlfriend that you have put thought and effort into the proposal. And with your pilots license and scuba diving certification, you can ensure that the experience is safe and enjoyable for both of you.'</li><li>'I\'m sorry, I am a language model and I don\'t have the capability to search for scientific articles. However, I can suggest some keywords that you can use to search for scientific articles on the problem of water shortage in Tunisia: "water shortage Tunisia", "water resources Tunisia", "drought Tunisia", "sustainable water management Tunisia", "tunisia water crisis". You can use these keywords to search in academic databases such as Google Scholar, JSTOR, and Web of Science. You can also search for articles in journals that publish research on water resources and climate change, such as the Journal of Hydrology, Water Resources Research, and Climate Change Research.'</li><li>"When creating a sea biome using 2d noise, you may want to consider the following parameters:\n1. Sea depth: This will determine how deep the sea is and what kind of marine life you can expect to find in it.\n2. Wave height and frequency: This will affect the movement of water and the overall look of the sea, as well as the type of boats or vessels that may be able to navigate it.\n3. Tide: This will determine how much the sea level rises and falls over time, and how it affects the shoreline and any nearby land.\n4. Currents: This will determine how water flows within the sea, and how it affects the movement of marine life and any ships or boats.\n5. Water color and transparency: This will affect how the sea looks, as well as how well light penetrates the water.\n6. Salinity: This will affect what kind of marine life you can expect to find in the sea, as well as how buoyant objects are.\n7. Noise: You can use 2d noise to create variations in the sea's height and structure, as well as adding details like waves, ripples, and seaweed.\n8. Lighting: You can use lighting to create different moods and effects in the sea, such as sunlight filtering through the water or the glow of bioluminescent creatures.\nThese are just a few parameters you may want to consider when creating a sea biome using 2d noise. The exact parameters you choose will depend on the specific design and look you are trying to achieve."</li></ul> |
+
+ ## Evaluation
+
+ ### Metrics
+ | Label | Accuracy |
+ |:--------|:---------|
+ | **all** | 0.6939 |
+
+ ## Uses
+
+ ### Direct Use for Inference
+
+ First install the SetFit library:
+
+ ```bash
+ pip install setfit
+ ```
+
+ Then you can load this model and run inference.
+
+ ```python
+ from setfit import SetFitModel
+
+ # Download from the 🤗 Hub
+ model = SetFitModel.from_pretrained("setfit_model_id")
+ # Run inference
+ preds = model("I confirm that I understand the instructions. Please provide the character description.")
+ ```
+
+ <!--
+ ### Downstream Use
+
+ *List how someone could finetune this model on their own dataset.*
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Set Metrics
+ | Training set | Min | Median | Max |
+ |:-------------|:----|:-------|:----|
+ | Word count | 12 | 113.45 | 362 |
+
+ | Label | Training Sample Count |
+ |:----------|:----------------------|
+ | toxic | 10 |
+ | non-toxic | 10 |
+
+ ### Training Hyperparameters
+ - batch_size: (32, 32)
+ - num_epochs: (10, 10)
+ - max_steps: -1
+ - sampling_strategy: oversampling
+ - body_learning_rate: (2e-05, 1e-05)
+ - head_learning_rate: 0.01
+ - loss: CosineSimilarityLoss
+ - distance_metric: cosine_distance
+ - margin: 0.25
+ - end_to_end: False
+ - use_amp: False
+ - warmup_proportion: 0.1
+ - seed: 42
+ - eval_max_steps: -1
+ - load_best_model_at_end: False
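The hyperparameters above drive SetFit's first stage: text pairs are sampled from the few labeled examples (here with `sampling_strategy: oversampling`), targeting 1.0 for same-label pairs and 0.0 for different-label pairs, and the embedding body is fine-tuned with `CosineSimilarityLoss` toward those targets. A minimal sketch of that pair construction (plain Python, illustrative only, not SetFit's actual sampler):

```python
from itertools import combinations

def contrastive_pairs(examples):
    """Build (text_a, text_b, target) pairs from (text, label) examples.

    Same-label pairs get target 1.0 (pull embeddings together),
    different-label pairs get 0.0 (push them apart) -- the values that
    CosineSimilarityLoss regresses the cosine similarity toward.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(examples, 2):
        pairs.append((text_a, text_b, 1.0 if label_a == label_b else 0.0))
    return pairs

examples = [
    ("you are worthless", "toxic"),
    ("have a great day", "non-toxic"),
    ("nobody likes you", "toxic"),
]
pairs = contrastive_pairs(examples)
# 3 pairs; the two toxic texts form a pair with target 1.0
```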
+
+ ### Training Results
+ | Epoch | Step | Training Loss | Validation Loss |
+ |:------:|:----:|:-------------:|:---------------:|
+ | 0.1429 | 1 | 0.208 | - |
+ | 7.1429 | 50 | 0.0183 | - |
+
+ ### Framework Versions
+ - Python: 3.10.0
+ - SetFit: 1.0.3
+ - Sentence Transformers: 3.0.1
+ - Transformers: 4.44.0
+ - PyTorch: 2.4.0
+ - Datasets: 2.20.0
+ - Tokenizers: 0.19.1
+
+ ## Citation
+
+ ### BibTeX
+ ```bibtex
+ @article{https://doi.org/10.48550/arxiv.2209.11055,
+   doi = {10.48550/ARXIV.2209.11055},
+   url = {https://arxiv.org/abs/2209.11055},
+   author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
+   keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
+   title = {Efficient Few-Shot Learning Without Prompts},
+   publisher = {arXiv},
+   year = {2022},
+   copyright = {Creative Commons Attribution 4.0 International}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,31 @@
+ {
+   "_name_or_path": "BAAI/bge-small-en-v1.5",
+   "architectures": [
+     "BertModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 384,
+   "id2label": {
+     "0": "LABEL_0"
+   },
+   "initializer_range": 0.02,
+   "intermediate_size": 1536,
+   "label2id": {
+     "LABEL_0": 0
+   },
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.44.0",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 30522
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.0.1",
+     "transformers": "4.44.0",
+     "pytorch": "2.4.0"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": null
+ }
config_setfit.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "normalize_embeddings": false,
+   "labels": [
+     "toxic",
+     "non-toxic"
+   ]
+ }
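SetFit uses this `labels` list to turn the classification head's integer class indices into the strings the model card reports ("toxic" / "non-toxic"). Roughly (an illustrative sketch of the mapping, not SetFit's actual code):

```python
labels = ["toxic", "non-toxic"]  # from config_setfit.json

def to_label_strings(int_preds, labels):
    # The LogisticRegression head predicts class indices;
    # SetFit maps each index through the configured labels list.
    return [labels[i] for i in int_preds]

to_label_strings([0, 1, 1], labels)  # ["toxic", "non-toxic", "non-toxic"]
```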
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:417b4c8c8f3d953724c39a6f827906372f9281ecef485d8ce62b9871e95ce282
+ size 133462128
model_head.pkl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:932a4bdffe3743cc83a64ccb9a642e87c0de387c8995d392cc8a6f5103ff3251
+ size 3935
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   },
+   {
+     "idx": 2,
+     "name": "2",
+     "path": "2_Normalize",
+     "type": "sentence_transformers.models.Normalize"
+   }
+ ]
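modules.json wires the encoder as a three-stage pipeline: Transformer (per-token embeddings) → Pooling (one sentence vector) → Normalize (L2 normalization, so that cosine similarity reduces to a plain dot product). The last stage is simple enough to show self-contained (plain Python, illustrative of what `sentence_transformers.models.Normalize` does):

```python
import math

def l2_normalize(vec):
    """Scale `vec` to unit length; on unit vectors, cosine similarity
    between two embeddings is just their dot product."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else vec

v = l2_normalize([3.0, 4.0])
# v is [0.6, 0.8], which has unit length
```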
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": true
+ }
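This config caps inputs at 512 tokens and lowercases them before encoding (BERT-uncased convention). Schematically (a plain-Python sketch; the real work is done by the `BertTokenizer` configured below):

```python
def preprocess(tokens, max_seq_length=512, do_lower_case=True):
    """Lowercase the token stream, then truncate to the model's limit."""
    if do_lower_case:
        tokens = [t.lower() for t in tokens]
    return tokens[:max_seq_length]

preprocess(["Hello", "World"])  # ["hello", "world"]
```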
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
tokenizer_config.json ADDED
@@ -0,0 +1,57 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "100": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "mask_token": "[MASK]",
+   "model_max_length": 512,
+   "never_split": null,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff