aftonposten-6b-align-scan / README.md

hugodk-sch

End of training

572c514 verified 5 months ago

	---
	library_name: peft
	tags:
	- alignment-handbook
	- trl
	- dpo
	- generated_from_trainer
	base_model: NbAiLab/nb-gpt-j-6B-v2
	datasets:
	- hugodk-sch/aftonposten_title_prefs
	model-index:
	- name: aftonposten-6b-align-scan
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# aftonposten-6b-align-scan

	This model is a PEFT adapter trained with DPO, starting from [data/ap-gpt-j-6b-sft-qlora-04-08](https://huggingface.co/data/ap-gpt-j-6b-sft-qlora-04-08), on the `hugodk-sch/aftonposten_title_prefs` dataset. The card metadata lists `NbAiLab/nb-gpt-j-6B-v2` as the base model.
	It achieves the following results on the evaluation set:
	- Loss: 0.4763
	- Rewards/chosen: 0.2774
	- Rewards/rejected: 0.1670
	- Rewards/accuracies: 0.5685
	- Rewards/margins: 0.1104
	- Logps/rejected: -37.2781
	- Logps/chosen: -33.6382
	- Logits/rejected: -2.1561
	- Logits/chosen: -2.1608
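
The reward figures above are related by simple arithmetic: the margin is the chosen reward minus the rejected reward. The sketch below checks that relationship on the reported evaluation numbers. It assumes the metrics follow the usual TRL DPO logging convention (margin as the mean of chosen minus rejected); the helper name `reward_margin` is illustrative and not part of any library.

```python
import math

# Evaluation-set values copied from the list above.
eval_metrics = {
    "rewards/chosen": 0.2774,
    "rewards/rejected": 0.1670,
    "rewards/margins": 0.1104,
}


def reward_margin(chosen: float, rejected: float) -> float:
    """Hypothetical helper: margin between chosen and rejected rewards."""
    return chosen - rejected


margin = reward_margin(
    eval_metrics["rewards/chosen"], eval_metrics["rewards/rejected"]
)

# The reported margin should agree up to the four-decimal rounding of the card.
assert math.isclose(margin, eval_metrics["rewards/margins"], abs_tol=1e-4)
print(f"margin = {margin:.4f}")
```

A positive margin means the policy assigns, on average, a higher implicit reward to the preferred titles than to the rejected ones.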

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 5e-06
	- train_batch_size: 4
	- eval_batch_size: 8
	- seed: 42
	- distributed_type: multi-GPU
	- gradient_accumulation_steps: 2
	- total_train_batch_size: 8
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: cosine
	- lr_scheduler_warmup_ratio: 0.1
	- num_epochs: 4
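
To make the `cosine` scheduler with `lr_scheduler_warmup_ratio: 0.1` concrete, here is a minimal pure-Python sketch of a linear-warmup, cosine-decay schedule peaking at the listed learning rate. The total step count is an assumption (roughly 1540, extrapolated from the rounded epoch/step values logged during training), and `lr_at` is an illustrative helper, not a library function.

```python
import math

PEAK_LR = 5e-06       # learning_rate from the list above
WARMUP_RATIO = 0.1    # lr_scheduler_warmup_ratio from the list above
TOTAL_STEPS = 1540    # ASSUMPTION: not stated in the card, estimated only


def lr_at(step: int,
          total_steps: int = TOTAL_STEPS,
          peak_lr: float = PEAK_LR,
          warmup_ratio: float = WARMUP_RATIO) -> float:
    """Learning rate at `step` for linear warmup followed by cosine decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Half-cosine decay from the peak down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 100, 500, 1000, 1500):
    print(step, f"{lr_at(step):.3e}")
```

Under these assumptions the rate rises during the first tenth of training, then decays smoothly toward zero by the final step.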

	### Training results

	| Training Loss | Epoch | Step | Logits/chosen | Logits/rejected | Logps/chosen | Logps/rejected | Validation Loss | Rewards/accuracies | Rewards/chosen | Rewards/margins | Rewards/rejected |
	|:-------------:|:-----:|:----:|:-------------:|:---------------:|:------------:|:--------------:|:---------------:|:------------------:|:--------------:|:---------------:|:----------------:|
	| 0.4799 | 0.26 | 100 | -2.2381 | -2.2333 | -33.8607 | -37.3570 | 0.4978 | 0.5341 | 0.1217 | 0.0100 | 0.1117 |
	| 0.4453 | 0.52 | 200 | -2.2347 | -2.2299 | -33.7685 | -37.2937 | 0.4928 | 0.5370 | 0.1862 | 0.0302 | 0.1561 |
	| 0.3947 | 0.78 | 300 | -2.2322 | -2.2274 | -33.7551 | -37.2894 | 0.4910 | 0.5565 | 0.1956 | 0.0365 | 0.1591 |
	| 0.3136 | 1.04 | 400 | -2.2080 | -2.2032 | -33.6280 | -37.1961 | 0.4857 | 0.5797 | 0.2846 | 0.0602 | 0.2244 |
	| 0.2784 | 1.3 | 500 | -2.2098 | -2.2050 | -33.6119 | -37.1567 | 0.4891 | 0.5220 | 0.2959 | 0.0439 | 0.2519 |
	| 0.2593 | 1.56 | 600 | -2.1914 | -2.1866 | -33.5567 | -37.1682 | 0.4795 | 0.5743 | 0.3345 | 0.0906 | 0.2439 |
	| 0.2606 | 1.82 | 700 | -2.1836 | -2.1788 | -33.5791 | -37.2084 | 0.4764 | 0.6063 | 0.3188 | 0.1031 | 0.2158 |
	| 0.1758 | 2.08 | 800 | -2.1727 | -2.1680 | -33.6289 | -37.2668 | 0.4767 | 0.5860 | 0.2840 | 0.1091 | 0.1749 |
	| 0.1687 | 2.34 | 900 | -2.1674 | -2.1626 | -33.6205 | -37.2547 | 0.4770 | 0.5486 | 0.2898 | 0.1065 | 0.1833 |
	| 0.1826 | 2.6 | 1000 | -2.1625 | -2.1578 | -33.6489 | -37.2917 | 0.4764 | 0.5831 | 0.2700 | 0.1126 | 0.1574 |
	| 0.1541 | 2.86 | 1100 | -2.1608 | -2.1561 | -33.6254 | -37.2748 | 0.4751 | 0.5777 | 0.2864 | 0.1171 | 0.1692 |
	| 0.194 | 3.12 | 1200 | -2.1612 | -2.1565 | -33.6265 | -37.2803 | 0.4748 | 0.5801 | 0.2856 | 0.1202 | 0.1654 |
	| 0.1414 | 3.38 | 1300 | -2.1605 | -2.1558 | -33.6261 | -37.2751 | 0.4753 | 0.5831 | 0.2859 | 0.1169 | 0.1690 |
	| 0.1492 | 3.64 | 1400 | -2.1603 | -2.1556 | -33.6279 | -37.2842 | 0.4744 | 0.5918 | 0.2846 | 0.1220 | 0.1627 |
	| 0.1694 | 3.9 | 1500 | -2.1607 | -2.1560 | -33.6314 | -37.2860 | 0.4747 | 0.5569 | 0.2822 | 0.1208 | 0.1614 |
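
As a small worked example of reading the evaluation log, the sketch below picks the logged step with the lowest validation loss and the one with the highest reward accuracy. The tuples are copied from the validation-loss and reward-accuracy values logged every 100 steps; the variable names are illustrative, and nothing here implies which checkpoint was actually published.

```python
# (step, validation loss, rewards/accuracies), copied from the logged results.
eval_log = [
    (100, 0.4978, 0.5341),
    (200, 0.4928, 0.5370),
    (300, 0.4910, 0.5565),
    (400, 0.4857, 0.5797),
    (500, 0.4891, 0.5220),
    (600, 0.4795, 0.5743),
    (700, 0.4764, 0.6063),
    (800, 0.4767, 0.5860),
    (900, 0.4770, 0.5486),
    (1000, 0.4764, 0.5831),
    (1100, 0.4751, 0.5777),
    (1200, 0.4748, 0.5801),
    (1300, 0.4753, 0.5831),
    (1400, 0.4744, 0.5918),
    (1500, 0.4747, 0.5569),
]

# Lowest validation loss and highest preference accuracy across evaluations.
best_by_loss = min(eval_log, key=lambda row: row[1])
best_by_acc = max(eval_log, key=lambda row: row[2])

print("lowest validation loss:", best_by_loss)   # (1400, 0.4744, 0.5918)
print("highest reward accuracy:", best_by_acc)   # (700, 0.4764, 0.6063)
```

The two criteria disagree here: validation loss keeps improving slightly until step 1400, while reward accuracy peaks earlier at step 700 and fluctuates afterwards.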


	### Framework versions

	- PEFT 0.10.0
	- Transformers 4.39.0.dev0
	- PyTorch 2.1.2+cu121
	- Datasets 2.14.6
	- Tokenizers 0.15.1
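
When trying to reproduce the run, it can help to compare the local environment against the versions listed above. The following is a minimal sketch using only the standard library; the mapping of framework names to pip distribution names (for example PyTorch to `torch`) is an assumption, and `compare_versions` is an illustrative helper.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions from the list above, keyed by (assumed) pip distribution name.
EXPECTED = {
    "peft": "0.10.0",
    "transformers": "4.39.0.dev0",
    "torch": "2.1.2+cu121",
    "datasets": "2.14.6",
    "tokenizers": "0.15.1",
}


def compare_versions(expected: dict) -> dict:
    """Return {package: (wanted, installed_or_None, matches)} for each entry."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed, installed == wanted)
    return report


for name, (wanted, installed, ok) in compare_versions(EXPECTED).items():
    status = "ok" if ok else "differs"
    print(f"{name}: wanted {wanted}, installed {installed} ({status})")
```

A mismatch does not necessarily break loading the adapter, but matching versions removes one source of variation when comparing results.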