---
language:
- multilingual
base_model: /kaggle/input/mistral-7b/Mistral-7B-v0.1
tags:
- generated_from_trainer
datasets:
- STEM
model-index:
- name: mistral-7b-llm-science-exam
  results: []
---

# mistral-7b-llm-science-exam

This model is a fine-tuned version of [/kaggle/input/mistral-7b/Mistral-7B-v0.1](https://huggingface.co//kaggle/input/mistral-7b/Mistral-7B-v0.1) on the llm-science-exam dataset.
It achieves the following results on the evaluation set:
- Loss: 0.3951
- Map@3: 0.8976

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 50
- num_epochs: 1

### Training results

| Training Loss | Epoch | Step | Validation Loss | Map@3  |
|:-------------:|:-----:|:----:|:---------------:|:------:|
| 3.3769        | 0.11  | 50   | 1.8621          | 0.9238 |
| 1.5772        | 0.23  | 100  | 0.5619          | 0.9119 |
| 0.9202        | 0.34  | 150  | 0.3942          | 0.9095 |
| 0.9485        | 0.45  | 200  | 0.4117          | 0.8976 |
| 0.9698        | 0.56  | 250  | 0.4145          | 0.9048 |
| 0.8731        | 0.68  | 300  | 0.4054          | 0.9048 |
| 0.8929        | 0.79  | 350  | 0.3967          | 0.8976 |
| 0.9737        | 0.9   | 400  | 0.3951          | 0.8976 |

### Framework versions

- Transformers 4.34.0.dev0
- Pytorch 2.0.0
- Datasets 2.14.4
- Tokenizers 0.14.0
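
### Training configuration (sketch)

The hyperparameters above map directly onto `transformers.TrainingArguments`. The original training script is not included in this card, so the snippet below is only an illustrative sketch of how those values could be expressed; `output_dir` is an assumed placeholder, and the resulting arguments would be passed to a `Trainer` together with the model, tokenizer, and datasets.

```python
from transformers import TrainingArguments

# Sketch only: mirrors the hyperparameters listed in this card, not the
# original (unpublished) training script.
training_args = TrainingArguments(
    output_dir="mistral-7b-llm-science-exam",  # assumed output path
    learning_rate=1e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=4,   # effective train batch size: 2 * 4 = 8
    seed=42,
    lr_scheduler_type="cosine",
    warmup_steps=50,
    num_train_epochs=1,
    adam_beta1=0.9,                  # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```

### Map@3 metric (sketch)

The exact evaluation code is likewise not included here. A common formulation of MAP@3 for single-answer multiple-choice ranking, which matches how the metric is typically computed on the llm-science-exam task, credits each example with 1/rank for the position (1 to 3) of the correct option in the model's top-3 predictions, or 0 if it does not appear. The function and example values below are illustrative, not taken from this model's evaluation set.

```python
def map_at_3(top3_predictions, labels):
    """Mean Average Precision at 3 for single-answer multiple choice.

    top3_predictions: list of lists, each holding up to 3 option indices
                      ranked most-to-least likely.
    labels:           list of correct option indices, one per example.
    """
    total = 0.0
    for preds, label in zip(top3_predictions, labels):
        for rank, pred in enumerate(preds[:3], start=1):
            if pred == label:
                total += 1.0 / rank  # 1.0, 0.5, or 1/3 depending on rank
                break
    return total / len(labels)


# Hypothetical example: correct answer ranked 1st, 2nd, and absent
# -> (1.0 + 0.5 + 0.0) / 3 = 0.5
print(map_at_3([[2, 0, 1], [3, 2, 4], [0, 1, 3]], [2, 2, 4]))
```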