Mel-Iza0 committed
Commit 25b9dfd
Parent: c1b46e1

Model save

Files changed (1)
  1. README.md +57 -45
README.md CHANGED
@@ -1,48 +1,60 @@
 ---
-language:
-- pt
-- es
-- en
-metrics:
-- accuracy
-datasets:
-- Weni/zeroshot-3.0.3
+license: apache-2.0
+base_model: mistralai/Mistral-7B-v0.1
 tags:
-- Zeroshot
+- generated_from_trainer
+model-index:
+- name: ZeroShot-3.0.3-Mistral-7b-Multilanguage-3.0.3
+  results: []
 ---
-# Model
-The model was finetuned on mistral-7b-v1
-
-
-# Training Arguments
-```
-training_arguments = {
-    'push_to_hub': True,
-    'hub_strategy': 'all_checkpoints',
-    'max_seq_length': 2048,
-    'disable_tqdm': False,
-    'num_train_epochs': 1,
-    'per_device_train_batch_size': 2,
-    'per_device_eval_batch_size': 2,
-    'gradient_accumulation_steps': 2,
-    'gradient_checkpointing': True,
-    'optim': 'adamw_torch',
-    'lr_scheduler_type': "cosine",
-    'save_strategy': "epoch",
-    'evaluation_strategy': "epoch",
-    'load_best_model_at_end': True,
-    'metric_for_best_model': 'eval_loss',
-    'greater_is_better': False,
-    'save_safetensors': True,
-    'learning_rate': 4e-4,
-    'save_total_limit': 5,
-    'fp16': True,
-    'max_grad_norm': 0.3,
-    'warmup_ratio': 0.1,
-    'weight_decay': 0.01,
-    'dataset_text_field': "prompt",
-    'prediction_loss_only': False,
-    'eval_accumulation_steps': 1,
-    'report_to': 'tensorboard'
-}
-```
+
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+
+# ZeroShot-3.0.3-Mistral-7b-Multilanguage-3.0.3
+
+This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the None dataset.
+It achieves the following results on the evaluation set:
+- Loss: nan
+
+## Model description
+
+More information needed
+
+## Intended uses & limitations
+
+More information needed
+
+## Training and evaluation data
+
+More information needed
+
+## Training procedure
+
+### Training hyperparameters
+
+The following hyperparameters were used during training:
+- learning_rate: 0.04
+- train_batch_size: 16
+- eval_batch_size: 16
+- seed: 42
+- gradient_accumulation_steps: 2
+- total_train_batch_size: 32
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: cosine
+- lr_scheduler_warmup_ratio: 0.1
+- num_epochs: 1
+
+### Training results
+
+| Training Loss | Epoch | Step | Validation Loss |
+|:-------------:|:-----:|:----:|:---------------:|
+| 248.6968      | 1.0   | 915  | nan             |
+
+
+### Framework versions
+
+- Transformers 4.34.0
+- Pytorch 2.0.1+cu117
+- Datasets 2.13.0
+- Tokenizers 0.14.1
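
For readers of the removed "Training Arguments" block: the dict mixes `transformers.TrainingArguments` fields with two `SFTTrainer`-specific parameters (`max_seq_length` and `dataset_text_field`), so it cannot be passed to either constructor whole. Below is a minimal sketch of how it could be wired up, assuming TRL's `SFTTrainer` from the 0.7.x era that matches the Transformers 4.34.0 pin above; the output directory and dataset split names are assumptions, not taken from the commit.

```python
# A sketch only (not from the commit): plausible wiring for the removed
# README block. Anything marked "assumed" is an assumption, not recorded
# in the repository.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import SFTTrainer

model_id = "mistralai/Mistral-7B-v0.1"  # base model named in the new card
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
dataset = load_dataset("Weni/zeroshot-3.0.3")  # dataset named in the old card

training_arguments = {
    'push_to_hub': True,
    'hub_strategy': 'all_checkpoints',
    'max_seq_length': 2048,           # SFTTrainer kwarg, not TrainingArguments
    'dataset_text_field': "prompt",   # SFTTrainer kwarg, not TrainingArguments
    'disable_tqdm': False,
    'num_train_epochs': 1,
    'per_device_train_batch_size': 2,
    'per_device_eval_batch_size': 2,
    'gradient_accumulation_steps': 2,
    'gradient_checkpointing': True,
    'optim': 'adamw_torch',
    'lr_scheduler_type': "cosine",
    'save_strategy': "epoch",
    'evaluation_strategy': "epoch",
    'load_best_model_at_end': True,
    'metric_for_best_model': 'eval_loss',
    'greater_is_better': False,
    'save_safetensors': True,
    'learning_rate': 4e-4,
    'save_total_limit': 5,
    'fp16': True,
    'max_grad_norm': 0.3,
    'warmup_ratio': 0.1,
    'weight_decay': 0.01,
    'prediction_loss_only': False,
    'eval_accumulation_steps': 1,
    'report_to': 'tensorboard',
}

# Route the two SFTTrainer-specific keys away from TrainingArguments.
sft_keys = {'max_seq_length', 'dataset_text_field'}
args = TrainingArguments(
    output_dir="zeroshot-mistral-7b",  # assumed: not stated in the commit
    **{k: v for k, v in training_arguments.items() if k not in sft_keys},
)
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],  # assumed split name
    **{k: v for k, v in training_arguments.items() if k in sft_keys},
)
trainer.train()
```

Splitting by key keeps the published dict usable verbatim while sending each option to the layer of the stack that actually consumes it.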
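A note on the regenerated card's batch-size figures, which are easy to misread: `train_batch_size` is the per-device value, and `total_train_batch_size` folds in gradient accumulation. The arithmetic, assuming a single device (the card does not state the device count):

```python
# Arithmetic behind the regenerated card's batch-size figures.
train_batch_size = 16            # per-device value reported by the card
gradient_accumulation_steps = 2  # as reported
num_devices = 1                  # assumed: device count is not in the card

total_train_batch_size = (
    train_batch_size * num_devices * gradient_accumulation_steps
)
assert total_train_batch_size == 32  # matches the card

# 915 optimizer steps in one epoch at this effective batch size imply
# roughly 915 * 32 = 29,280 training examples.
print(total_train_batch_size)
```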