bert-large-uncased-sst-2-32-13

This model is a fine-tuned version of bert-large-uncased; the card does not record which dataset was used. After the final training epoch (150), it achieves the following results on the evaluation set:

Loss: 0.6858
Accuracy: 0.8906

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 32
eval_batch_size: 32
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 50
num_epochs: 150
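
The linear scheduler with 50 warmup steps can be sketched in plain Python to show how the learning rate evolves over the run. This is a minimal sketch of the usual "linear warmup, then linear decay to zero" shape, not the library's exact implementation. The function name `linear_schedule_lr` is hypothetical, and `total_steps=300` is an assumption read off the final row of the training results table (150 epochs at 2 steps per epoch).

```python
def linear_schedule_lr(step, peak_lr=1e-5, warmup_steps=50, total_steps=300):
    """Learning rate after `step` optimizer updates.

    Assumes linear warmup from 0 to peak_lr over warmup_steps,
    followed by linear decay from peak_lr to 0 at total_steps.
    """
    if step < warmup_steps:
        # Warmup phase: ramp up proportionally to the step count.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: scale by the fraction of post-warmup steps remaining.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 25, 50, 175, 300):
        print(step, linear_schedule_lr(step))
```

Under these assumptions the peak rate of 1e-05 is only reached at step 50, which corresponds to epoch 25 of this run, so the first sixth of training happens at a reduced learning rate.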

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
No log	1.0	2	0.7579	0.4844
No log	2.0	4	0.7556	0.4844
No log	3.0	6	0.7502	0.4844
No log	4.0	8	0.7421	0.4844
0.7491	5.0	10	0.7348	0.5
0.7491	6.0	12	0.7286	0.5
0.7491	7.0	14	0.7228	0.5
0.7491	8.0	16	0.7177	0.5
0.7491	9.0	18	0.7136	0.5
0.7136	10.0	20	0.7095	0.5
0.7136	11.0	22	0.7049	0.5
0.7136	12.0	24	0.6993	0.5
0.7136	13.0	26	0.6926	0.5
0.7136	14.0	28	0.6860	0.5
0.6781	15.0	30	0.6777	0.5
0.6781	16.0	32	0.6691	0.5
0.6781	17.0	34	0.6605	0.5312
0.6781	18.0	36	0.6524	0.5156
0.6781	19.0	38	0.6428	0.5469
0.584	20.0	40	0.6239	0.6562
0.584	21.0	42	0.6124	0.6719
0.584	22.0	44	0.6053	0.6562
0.584	23.0	46	0.5981	0.6875
0.584	24.0	48	0.5707	0.7031
0.439	25.0	50	0.5284	0.7656
0.439	26.0	52	0.5125	0.7812
0.439	27.0	54	0.5117	0.75
0.439	28.0	56	0.4922	0.7656
0.439	29.0	58	0.4698	0.7812
0.2661	30.0	60	0.4417	0.7656
0.2661	31.0	62	0.4234	0.7812
0.2661	32.0	64	0.4309	0.7656
0.2661	33.0	66	0.4503	0.7812
0.2661	34.0	68	0.4344	0.8125
0.1	35.0	70	0.3772	0.8281
0.1	36.0	72	0.3475	0.875
0.1	37.0	74	0.3404	0.875
0.1	38.0	76	0.3334	0.8906
0.1	39.0	78	0.3313	0.9062
0.033	40.0	80	0.3315	0.9062
0.033	41.0	82	0.3340	0.9062
0.033	42.0	84	0.3364	0.9062
0.033	43.0	86	0.3412	0.9062
0.033	44.0	88	0.3509	0.8906
0.0142	45.0	90	0.3588	0.875
0.0142	46.0	92	0.3675	0.875
0.0142	47.0	94	0.3788	0.875
0.0142	48.0	96	0.3957	0.875
0.0142	49.0	98	0.4137	0.875
0.0081	50.0	100	0.4338	0.875
0.0081	51.0	102	0.4507	0.875
0.0081	52.0	104	0.4645	0.8906
0.0081	53.0	106	0.4767	0.8906
0.0081	54.0	108	0.4875	0.8906
0.0048	55.0	110	0.4977	0.8906
0.0048	56.0	112	0.5052	0.8906
0.0048	57.0	114	0.5082	0.8906
0.0048	58.0	116	0.5095	0.8906
0.0048	59.0	118	0.4912	0.875
0.0032	60.0	120	0.4782	0.875
0.0032	61.0	122	0.4720	0.875
0.0032	62.0	124	0.4713	0.875
0.0032	63.0	126	0.4757	0.875
0.0032	64.0	128	0.4820	0.875
0.0021	65.0	130	0.4919	0.875
0.0021	66.0	132	0.5045	0.875
0.0021	67.0	134	0.5175	0.875
0.0021	68.0	136	0.5308	0.875
0.0021	69.0	138	0.5430	0.875
0.0014	70.0	140	0.5544	0.875
0.0014	71.0	142	0.5643	0.8906
0.0014	72.0	144	0.5735	0.8906
0.0014	73.0	146	0.5810	0.8906
0.0014	74.0	148	0.5871	0.8906
0.0011	75.0	150	0.6019	0.8906
0.0011	76.0	152	0.6149	0.8906
0.0011	77.0	154	0.6262	0.8906
0.0011	78.0	156	0.6356	0.8906
0.0011	79.0	158	0.6435	0.8906
0.0007	80.0	160	0.6504	0.8906
0.0007	81.0	162	0.6568	0.8906
0.0007	82.0	164	0.6606	0.8906
0.0007	83.0	166	0.6625	0.8906
0.0007	84.0	168	0.6645	0.8906
0.0006	85.0	170	0.6663	0.8906
0.0006	86.0	172	0.6676	0.8906
0.0006	87.0	174	0.6691	0.8906
0.0006	88.0	176	0.6705	0.8906
0.0006	89.0	178	0.6717	0.8906
0.0006	90.0	180	0.6726	0.8906
0.0006	91.0	182	0.6735	0.8906
0.0006	92.0	184	0.6745	0.8906
0.0006	93.0	186	0.6756	0.8906
0.0006	94.0	188	0.6768	0.8906
0.0005	95.0	190	0.6781	0.8906
0.0005	96.0	192	0.6788	0.8906
0.0005	97.0	194	0.6791	0.8906
0.0005	98.0	196	0.6794	0.8906
0.0005	99.0	198	0.6798	0.8906
0.0004	100.0	200	0.6801	0.8906
0.0004	101.0	202	0.6805	0.8906
0.0004	102.0	204	0.6810	0.8906
0.0004	103.0	206	0.6817	0.8906
0.0004	104.0	208	0.6826	0.8906
0.0004	105.0	210	0.6833	0.8906
0.0004	106.0	212	0.6841	0.8906
0.0004	107.0	214	0.6850	0.8906
0.0004	108.0	216	0.6857	0.8906
0.0004	109.0	218	0.6866	0.8906
0.0004	110.0	220	0.6874	0.8906
0.0004	111.0	222	0.6881	0.8906
0.0004	112.0	224	0.6886	0.8906
0.0004	113.0	226	0.6889	0.8906
0.0004	114.0	228	0.6890	0.8906
0.0003	115.0	230	0.6889	0.8906
0.0003	116.0	232	0.6888	0.8906
0.0003	117.0	234	0.6886	0.8906
0.0003	118.0	236	0.6885	0.8906
0.0003	119.0	238	0.6874	0.8906
0.0003	120.0	240	0.6866	0.8906
0.0003	121.0	242	0.6860	0.8906
0.0003	122.0	244	0.6857	0.8906
0.0003	123.0	246	0.6855	0.8906
0.0003	124.0	248	0.6852	0.8906
0.0003	125.0	250	0.6850	0.8906
0.0003	126.0	252	0.6847	0.8906
0.0003	127.0	254	0.6846	0.8906
0.0003	128.0	256	0.6846	0.8906
0.0003	129.0	258	0.6846	0.8906
0.0003	130.0	260	0.6846	0.8906
0.0003	131.0	262	0.6847	0.8906
0.0003	132.0	264	0.6847	0.8906
0.0003	133.0	266	0.6848	0.8906
0.0003	134.0	268	0.6846	0.8906
0.0003	135.0	270	0.6846	0.8906
0.0003	136.0	272	0.6846	0.8906
0.0003	137.0	274	0.6846	0.8906
0.0003	138.0	276	0.6847	0.8906
0.0003	139.0	278	0.6848	0.8906
0.0003	140.0	280	0.6849	0.8906
0.0003	141.0	282	0.6851	0.8906
0.0003	142.0	284	0.6852	0.8906
0.0003	143.0	286	0.6854	0.8906
0.0003	144.0	288	0.6855	0.8906
0.0003	145.0	290	0.6855	0.8906
0.0003	146.0	292	0.6856	0.8906
0.0003	147.0	294	0.6857	0.8906
0.0003	148.0	296	0.6857	0.8906
0.0003	149.0	298	0.6858	0.8906
0.0003	150.0	300	0.6858	0.8906
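
In the table above, validation loss reaches its minimum at epoch 39 and then climbs while training loss keeps falling toward zero, which is a pattern consistent with overfitting. The metrics reported at the top of the card come from the last epoch, not the lowest-loss one. The following sketch shows one way to pick the lowest-loss epoch from such a log. The `rows` list is a hand-copied excerpt of the table, and the helper name `best_epoch` is hypothetical.

```python
# (epoch, validation_loss, accuracy), excerpted from the training results table.
rows = [
    (1, 0.7579, 0.4844),
    (38, 0.3334, 0.8906),
    (39, 0.3313, 0.9062),
    (40, 0.3315, 0.9062),
    (44, 0.3509, 0.8906),
    (58, 0.5095, 0.8906),
    (150, 0.6858, 0.8906),
]


def best_epoch(log_rows):
    """Return the row with the lowest validation loss."""
    return min(log_rows, key=lambda row: row[1])


if __name__ == "__main__":
    epoch, val_loss, accuracy = best_epoch(rows)
    print(f"lowest validation loss {val_loss} at epoch {epoch} (accuracy {accuracy})")
```

Whether a checkpoint from that epoch was saved is not stated in this card, so this only illustrates how the log could be read.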

Framework versions

Transformers 4.32.0.dev0
Pytorch 2.0.1+cu118
Datasets 2.4.0
Tokenizers 0.13.3
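
To compare a local environment against the versions listed above, a small standard-library check might look like the following. It is a sketch under stated assumptions: the distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are assumed to be the names the packages were installed under, and `compare_versions` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the "Framework versions" list; keys are assumed
# distribution names as they would appear in the local environment.
EXPECTED = {
    "transformers": "4.32.0.dev0",
    "torch": "2.0.1+cu118",
    "datasets": "2.4.0",
    "tokenizers": "0.13.3",
}


def compare_versions(expected):
    """Map each package to (expected, installed or None, exact match?)."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed locally
        report[name] = (wanted, installed, installed == wanted)
    return report


if __name__ == "__main__":
    for name, (wanted, installed, ok) in compare_versions(EXPECTED).items():
        status = "match" if ok else "differs"
        print(f"{name}: expected {wanted}, installed {installed} ({status})")
```

A mismatch does not necessarily mean the model will fail to load; it only flags a difference from the environment recorded here.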
