Sandiago21 committed on
Commit f9b552c
1 Parent(s): 9195435

End of training

README.md CHANGED
@@ -17,6 +17,9 @@ should probably proofread and complete it, then remove this comment. -->
 # mistral-7b-llm-science-exam
 
 This model is a fine-tuned version of [/kaggle/input/mistral-7b/Mistral-7B-v0.1](https://huggingface.co//kaggle/input/mistral-7b/Mistral-7B-v0.1) on the llm-science-exam dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.3951
+- Map@3: 0.8976
 
 ## Model description
 
@@ -39,14 +42,30 @@ The following hyperparameters were used during training:
 - train_batch_size: 2
 - eval_batch_size: 2
 - seed: 42
+- gradient_accumulation_steps: 4
+- total_train_batch_size: 8
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
-- lr_scheduler_warmup_steps: 25
+- lr_scheduler_warmup_steps: 50
 - num_epochs: 1
 
+### Training results
+
+| Training Loss | Epoch | Step | Validation Loss | Map@3  |
+|:-------------:|:-----:|:----:|:---------------:|:------:|
+| 3.3769        | 0.11  | 50   | 1.8621          | 0.9238 |
+| 1.5772        | 0.23  | 100  | 0.5619          | 0.9119 |
+| 0.9202        | 0.34  | 150  | 0.3942          | 0.9095 |
+| 0.9485        | 0.45  | 200  | 0.4117          | 0.8976 |
+| 0.9698        | 0.56  | 250  | 0.4145          | 0.9048 |
+| 0.8731        | 0.68  | 300  | 0.4054          | 0.9048 |
+| 0.8929        | 0.79  | 350  | 0.3967          | 0.8976 |
+| 0.9737        | 0.9   | 400  | 0.3951          | 0.8976 |
+
+
 ### Framework versions
 
 - Transformers 4.34.0.dev0
 - Pytorch 2.0.0
-- Datasets 2.1.0
+- Datasets 2.14.4
 - Tokenizers 0.14.0
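
For reference, the hyperparameters listed in the updated card map onto `transformers.TrainingArguments` roughly as in the sketch below. This is an illustration, not the training script from this commit: the output path is a placeholder, and values that fall outside the rendered hunk (e.g. `learning_rate`) are left at their defaults.

```python
from transformers import TrainingArguments

# A minimal sketch of the configuration described in the card; the output
# directory is hypothetical and learning_rate is not visible in this diff.
training_args = TrainingArguments(
    output_dir="mistral-7b-llm-science-exam",  # placeholder output path
    per_device_train_batch_size=2,             # train_batch_size: 2
    per_device_eval_batch_size=2,              # eval_batch_size: 2
    gradient_accumulation_steps=4,             # 2 * 4 = total_train_batch_size of 8
    seed=42,
    lr_scheduler_type="cosine",
    warmup_steps=50,                           # lr_scheduler_warmup_steps: 50
    num_train_epochs=1,
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the optimizer default.
)
```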
adapter_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:c3d39294b67223d564271ed55404c99e10246224a561fabb38ec9b818b7ebf49
+oid sha256:2d86ce6f9f84cb26a1333f2846a479131f01f28104293d0fe5b57b944a02addb
 size 109097933
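
The ~109 MB `adapter_model.bin` points to a parameter-efficient (LoRA-style) adapter rather than full 7B weights. A minimal loading sketch, assuming this repo is a PEFT adapter; the card references a local Kaggle path for the base model, so the hub id below is an assumption:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Assumed hub equivalent of the local /kaggle/input/mistral-7b base checkpoint.
base = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")
# Assumed repo id for the adapter saved by this training run.
model = PeftModel.from_pretrained(base, "Sandiago21/mistral-7b-llm-science-exam")
```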
tokenizer.json CHANGED
@@ -1,14 +1,7 @@
 {
   "version": "1.0",
   "truncation": null,
-  "padding": {
-    "strategy": "BatchLongest",
-    "direction": "Left",
-    "pad_to_multiple_of": null,
-    "pad_id": 2,
-    "pad_type_id": 0,
-    "pad_token": "</s>"
-  },
+  "padding": null,
   "added_tokens": [
     {
       "id": 0,
train_with_llm_answers.csv CHANGED
The diff for this file is too large to render. See raw diff
 
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f8957f73c3ccd9a89bc0397df00a27e0ec2e91861139fda13bd63095b59264be
+oid sha256:8edb6de5035d85d5594bfc396384890041683eb5dd7618eacf9061692226bffb
 size 4091