pjbhaumik committed on
Commit 7576c5b
1 Parent(s): 228db95

pjbhaumik/km2-cross-encoder2

Files changed (4):
  1. README.md +14 -14
  2. config.json +4 -5
  3. model.safetensors +2 -2
  4. training_args.bin +1 -1
README.md CHANGED
@@ -1,6 +1,6 @@
 ---
 license: apache-2.0
-base_model: cross-encoder/ms-marco-MiniLM-L-6-v2
+base_model: cross-encoder/stsb-TinyBERT-L-4
 tags:
 - generated_from_trainer
 model-index:
@@ -13,9 +13,9 @@ should probably proofread and complete it, then remove this comment. -->
 
 # crossencoder-km1
 
-This model is a fine-tuned version of [cross-encoder/ms-marco-MiniLM-L-6-v2](https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-6-v2) on the None dataset.
+This model is a fine-tuned version of [cross-encoder/stsb-TinyBERT-L-4](https://huggingface.co/cross-encoder/stsb-TinyBERT-L-4) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.0889
+- Loss: 0.0119
 
 ## Model description
 
@@ -35,7 +35,7 @@ More information needed
 
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
-- train_batch_size: 80
+- train_batch_size: 100
 - eval_batch_size: 80
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
@@ -47,16 +47,16 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 99.5928 | 1.0 | 13 | 102.9756 |
-| 98.0949 | 2.0 | 26 | 99.8849 |
-| 95.3147 | 3.0 | 39 | 92.5186 |
-| 76.7659 | 4.0 | 52 | 69.9192 |
-| 54.0033 | 5.0 | 65 | 21.3535 |
-| 20.7192 | 6.0 | 78 | 7.2049 |
-| 3.2855 | 7.0 | 91 | 2.0372 |
-| 0.9629 | 8.0 | 104 | 0.3949 |
-| 0.4633 | 9.0 | 117 | 0.1386 |
-| 0.2156 | 10.0 | 130 | 0.0975 |
+| 7.4105 | 1.0 | 20 | 7.0307 |
+| 3.9992 | 2.0 | 40 | 3.3007 |
+| 1.1734 | 3.0 | 60 | 0.9568 |
+| 0.2736 | 4.0 | 80 | 0.2017 |
+| 0.1073 | 5.0 | 100 | 0.0679 |
+| 0.0364 | 6.0 | 120 | 0.0288 |
+| 0.0219 | 7.0 | 140 | 0.0221 |
+| 0.0129 | 8.0 | 160 | 0.0140 |
+| 0.0096 | 9.0 | 180 | 0.0118 |
+| 0.009 | 10.0 | 200 | 0.0108 |
 
 
 ### Framework versions
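The step counts in the two training-log tables line up with the stated batch sizes, and they imply the two runs saw different amounts of data: 13 steps per epoch at batch size 80 bounds the old training set at 961–1040 examples, while 20 steps per epoch at batch size 100 bounds the new one at 1901–2000 (a sketch assuming `drop_last=False` and no gradient accumulation, neither of which is shown in this diff):

```python
import math

def steps_per_epoch(num_examples: int, batch_size: int) -> int:
    """Optimizer steps per epoch, assuming drop_last=False and no gradient accumulation."""
    return math.ceil(num_examples / batch_size)

def example_range(steps: int, batch_size: int) -> tuple[int, int]:
    """Range of dataset sizes compatible with an observed steps-per-epoch count."""
    return (batch_size * (steps - 1) + 1, batch_size * steps)

# Old run: 13 steps/epoch at batch size 80  -> 961..1040 training examples
# New run: 20 steps/epoch at batch size 100 -> 1901..2000 training examples
```

Under those assumptions, the loss improvement in the new table reflects both the base-model swap and a larger (or differently sized) training set, not the architecture change alone.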
config.json CHANGED
@@ -1,5 +1,5 @@
 {
-  "_name_or_path": "cross-encoder/ms-marco-MiniLM-L-6-v2",
+  "_name_or_path": "cross-encoder/stsb-TinyBERT-L-4",
   "architectures": [
     "BertForSequenceClassification"
   ],
@@ -8,12 +8,12 @@
   "gradient_checkpointing": false,
   "hidden_act": "gelu",
   "hidden_dropout_prob": 0.1,
-  "hidden_size": 384,
+  "hidden_size": 312,
   "id2label": {
     "0": "LABEL_0"
   },
   "initializer_range": 0.02,
-  "intermediate_size": 1536,
+  "intermediate_size": 1200,
   "label2id": {
     "LABEL_0": 0
   },
@@ -21,11 +21,10 @@
   "max_position_embeddings": 512,
   "model_type": "bert",
   "num_attention_heads": 12,
-  "num_hidden_layers": 6,
+  "num_hidden_layers": 4,
   "pad_token_id": 0,
   "position_embedding_type": "absolute",
   "problem_type": "regression",
-  "sbert_ce_default_activation_function": "torch.nn.modules.linear.Identity",
   "torch_dtype": "float32",
   "transformers_version": "4.37.2",
   "type_vocab_size": 2,
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:98d4c651e4369c3d33de5e192606c39264aa45b3e9d9c2d055d3c8d3656b0ab5
-size 90866412
+oid sha256:d3a0b56782cd84c8bc3c4c977b387109502129debe8bf1921ee0982103cdb4b3
+size 57410556
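Both checkpoints are stored in float32 (per the config's `torch_dtype`), so dividing the Git LFS pointer's `size` by 4 gives a quick upper bound on parameter count; the drop from ~90.9 MB to ~57.4 MB corresponds to going from roughly 22.7M to 14.4M parameters:

```python
def approx_param_count(file_size_bytes: int, bytes_per_param: int = 4) -> int:
    """Upper bound on parameter count for a float32 safetensors file.
    Slightly overestimates, since the file also contains a small JSON header."""
    return file_size_bytes // bytes_per_param

old_params = approx_param_count(90866412)  # ~22.7M (MiniLM-L-6 checkpoint)
new_params = approx_param_count(57410556)  # ~14.4M (TinyBERT-L-4 checkpoint)
```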
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:deafebc61456310c26a138c83b25911d8768755817dc0fc13ca717640fe240f9
+oid sha256:cf8fa5b1057725a4bc8e0d36487737d0d0d0698fd98931692572c0b5b1b59b8c
 size 4155