Yi-1.5-9B-sft-241128

This model is a fine-tuned version of saves/Yi-1.5-9B-pt-241124 on the chinese-medical-dialogue, the CMB, the cMedQA2, the CMExam, the CMtMedQA, the COIG-CQIA-full, the COIG_full, the HuatuoGPT_sft_data_v, the huatuo_encyclopedia_q, the huatuo_lite, the imcs21, the Med-single-choice, the Medical_dialogue_system_en_single_turn, the qizhengpt-sft-20, the self_cognition, the sharegpt_zh_38K_format, the shennong, the shibing642-medica, the tigerbot_sft_data, the xywy-KG and the zhongyi-zhiku datasets. It achieves the following results on the evaluation set:

  • Loss: 1.4478
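
Below is a minimal inference sketch using the Transformers library. The repository id, the example prompt, and the assumption that the tokenizer ships a chat template are mine, not stated in this card; adjust them to wherever and however the weights are actually hosted.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repository id; replace with the actual hub id or local path.
model_id = "Yi-1.5-9B-sft-241128"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: load in bfloat16
    device_map="auto",
)

# Example Chinese medical question, in line with the SFT datasets listed above.
# Assumes the tokenizer includes a chat template; otherwise build the prompt manually.
messages = [{"role": "user", "content": "感冒发烧应该注意什么？"}]  # "What should I watch out for with a cold and fever?"
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```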

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 2.5e-06
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 8
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 2.0
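
The hyperparameters above map roughly onto the Hugging Face TrainingArguments sketch below. This is a reconstruction, not the original launch script: the output path is a placeholder, the bf16 flag is an assumption, and the per-device batch size of 4 across 8 GPUs is what yields the total batch size of 32 listed above.

```python
from transformers import TrainingArguments

# Approximate reconstruction of the hyperparameters above; launch on 8 GPUs
# (e.g. via torchrun or accelerate) to reproduce the total batch size of 32.
training_args = TrainingArguments(
    output_dir="saves/Yi-1.5-9B-sft-241128",  # placeholder path
    learning_rate=2.5e-6,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    num_train_epochs=2.0,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    bf16=True,  # assumption: mixed-precision training in bfloat16
)
```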

Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| 1.6544        | 0.1277 | 1000  | 1.6105          |
| 1.5595        | 0.2554 | 2000  | 1.5668          |
| 1.5297        | 0.3830 | 3000  | 1.5394          |
| 1.5637        | 0.5107 | 4000  | 1.5188          |
| 1.5051        | 0.6384 | 5000  | 1.5028          |
| 1.4765        | 0.7661 | 6000  | 1.4895          |
| 1.4504        | 0.8938 | 7000  | 1.4779          |
| 1.4084        | 1.0215 | 8000  | 1.4716          |
| 1.4292        | 1.1491 | 9000  | 1.4653          |
| 1.4349        | 1.2768 | 10000 | 1.4597          |
| 1.4442        | 1.4045 | 11000 | 1.4548          |
| 1.422         | 1.5322 | 12000 | 1.4517          |
| 1.3986        | 1.6599 | 13000 | 1.4491          |
| 1.3949        | 1.7875 | 14000 | 1.4482          |
| 1.4241        | 1.9152 | 15000 | 1.4478          |
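
As a rough point of reference (my conversion, not reported in the training logs), the cross-entropy loss corresponds to a per-token perplexity of exp(loss):

```python
import math

# Final validation loss from the table above (step 15000).
val_loss = 1.4478
print(math.exp(val_loss))  # ~4.25 per-token perplexity on the evaluation set
```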

Framework versions

  • Transformers 4.44.2
  • PyTorch 2.4.0+cu121
  • Datasets 2.21.0
  • Tokenizers 0.19.1
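
A quick sanity check of the installed versions against those listed above (nearby versions will likely load the checkpoint as well):

```python
import transformers, torch, datasets, tokenizers

# Versions this model was trained and saved with, per the list above.
print(transformers.__version__)  # 4.44.2
print(torch.__version__)         # 2.4.0+cu121
print(datasets.__version__)      # 2.21.0
print(tokenizers.__version__)    # 0.19.1
```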