# result
This model is a fine-tuned version of [google/flan-t5-large](https://huggingface.co/google/flan-t5-large) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.5606
- Squad: {'exact_match': 31.547619047619047, 'f1': 65.97520968920112}
- Bleu: {'bleu': 0.4478359370566898, 'precisions': [0.4970939125114714, 0.45436955820703, 0.43196470987444857, 0.4122681883024251], 'brevity_penalty': 1.0, 'length_ratio': 1.3752629364745477, 'translation_length': 3269, 'reference_length': 2377}
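The `Squad` and `Bleu` entries above are raw metric dictionaries in the format produced by the Hugging Face `evaluate` metrics `squad` and `bleu`; whether that exact library was used for this run is not documented. A minimal sketch of computing both metrics, using placeholder predictions and references rather than the (undocumented) evaluation set:

```python
import evaluate

# Metric implementations whose output format matches the dictionaries above.
squad_metric = evaluate.load("squad")
bleu_metric = evaluate.load("bleu")

# Placeholder data purely for illustration; the real evaluation set is not documented.
squad_results = squad_metric.compute(
    predictions=[{"id": "0", "prediction_text": "Paris"}],
    references=[{"id": "0", "answers": {"text": ["Paris"], "answer_start": [0]}}],
)
bleu_results = bleu_metric.compute(
    predictions=["the cat sat on the mat"],
    references=[["the cat sat on the mat"]],
)

print(squad_results)  # {'exact_match': ..., 'f1': ...}
print(bleu_results)   # {'bleu': ..., 'precisions': [...], 'brevity_penalty': ..., ...}
```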
## Model description
More information needed
## Intended uses & limitations
More information needed
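Until proper documentation is added, the model can at least be exercised like any other seq2seq Transformers checkpoint. A minimal inference sketch, assuming the fine-tuned weights are available under the repo id `checkiejan/result` (replace with a local checkpoint path otherwise); the prompt format used during fine-tuning is not documented, so the question/context prompt below is only a generic FLAN-T5-style example:

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "checkiejan/result"  # assumed repo id; use a local checkpoint path if needed

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
model.eval()

# Generic FLAN-T5-style prompt; the actual fine-tuning prompt format is undocumented.
prompt = "question: What is the capital of France? context: Paris is the capital of France."

inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=64)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```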
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 0.001
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
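These values map directly onto standard Transformers `Seq2SeqTrainingArguments`; a reconstruction sketch is given below. `output_dir` and the 100-step evaluation/logging cadence are assumptions (the cadence is inferred from the results table), while the remaining values come from the list above:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="result",            # assumed: matches the model name
    learning_rate=1e-3,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    num_train_epochs=5,
    lr_scheduler_type="linear",
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 matches the Trainer's default
    # AdamW settings (adam_beta1, adam_beta2, adam_epsilon), so no override is needed.
    evaluation_strategy="steps",    # assumed from the per-100-step rows in the results table
    eval_steps=100,
    logging_steps=100,
    predict_with_generate=True,     # required to compute SQuAD/BLEU from generated text
)

# The training data is undocumented ("unknown dataset"), so the Seq2SeqTrainer wiring
# (model, tokenized datasets, DataCollatorForSeq2Seq, compute_metrics) is omitted here.
```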
### Training results
Training Loss | Epoch | Step | Validation Loss | Squad | Bleu |
---|---|---|---|---|---|
0.3089 | 0.14 | 100 | 0.4816 | {'exact_match': 35.714285714285715, 'f1': 62.521389029334365} | {'bleu': 0.3454328094223805, 'precisions': [0.4022636892015907, 0.35407932924862945, 0.327451645741432, 0.30527817403708984], 'brevity_penalty': 1.0, 'length_ratio': 1.7017178552837064, 'translation_length': 3269, 'reference_length': 1921} |
0.4283 | 0.27 | 200 | 0.4634 | {'exact_match': 19.047619047619047, 'f1': 58.11077854533359} | {'bleu': 0.34061189083646615, 'precisions': [0.598654022636892, 0.5469203482747501, 0.5269765863590091, 0.510342368045649], 'brevity_penalty': 0.6252757307345338, 'length_ratio': 0.6804746044962531, 'translation_length': 3269, 'reference_length': 4804} |
0.4923 | 0.41 | 300 | 0.4645 | {'exact_match': 33.92857142857143, 'f1': 63.34981920724316} | {'bleu': 0.39658908018568106, 'precisions': [0.4472315692872438, 0.4040632054176072, 0.38004750593824227, 0.3601997146932953], 'brevity_penalty': 1.0, 'length_ratio': 1.5693710993759002, 'translation_length': 3269, 'reference_length': 2083} |
0.5249 | 0.55 | 400 | 0.5079 | {'exact_match': 33.92857142857143, 'f1': 62.939136694434445} | {'bleu': 0.31207996448397607, 'precisions': [0.365861119608443, 0.3227990970654628, 0.29555480149304375, 0.2717546362339515], 'brevity_penalty': 1.0, 'length_ratio': 2.19543317662861, 'translation_length': 3269, 'reference_length': 1489} |
0.4899 | 0.68 | 500 | 0.4844 | {'exact_match': 29.166666666666668, 'f1': 63.91710424811924} | {'bleu': 0.43831704036458624, 'precisions': [0.49036402569593146, 0.44437278297323446, 0.42144553783508654, 0.40192582025677603], 'brevity_penalty': 1.0, 'length_ratio': 1.307077169132347, 'translation_length': 3269, 'reference_length': 2501} |
0.3923 | 0.82 | 600 | 0.4786 | {'exact_match': 29.166666666666668, 'f1': 62.65868074864891} | {'bleu': 0.4580633444938437, 'precisions': [0.5062710308962985, 0.4630764269590455, 0.44248388191381066, 0.4243937232524964], 'brevity_penalty': 1.0, 'length_ratio': 1.1974358974358974, 'translation_length': 3269, 'reference_length': 2730} |
0.5221 | 0.96 | 700 | 0.4612 | {'exact_match': 30.952380952380953, 'f1': 62.84973727787755} | {'bleu': 0.5515506497257063, 'precisions': [0.6044661976139493, 0.563044179297001, 0.5446216491347132, 0.5281740370898717], 'brevity_penalty': 0.9860269604247005, 'length_ratio': 0.9861236802413273, 'translation_length': 3269, 'reference_length': 3315} |
0.2892 | 1.1 | 800 | 0.4958 | {'exact_match': 30.952380952380953, 'f1': 63.021615827809455} | {'bleu': 0.4763133340972634, 'precisions': [0.5310492505353319, 0.48081264108352145, 0.4584323040380047, 0.4397289586305278], 'brevity_penalty': 1.0, 'length_ratio': 1.1742097701149425, 'translation_length': 3269, 'reference_length': 2784} |
0.3726 | 1.23 | 900 | 0.4814 | {'exact_match': 28.571428571428573, 'f1': 65.07276724002743} | {'bleu': 0.4631031930079369, 'precisions': [0.514530437442643, 0.4669461464043857, 0.4462164913471327, 0.4290299572039943], 'brevity_penalty': 1.0, 'length_ratio': 1.3003182179793158, 'translation_length': 3269, 'reference_length': 2514} |
0.3296 | 1.37 | 1000 | 0.4888 | {'exact_match': 30.952380952380953, 'f1': 63.75823569325163} | {'bleu': 0.3620195814265076, 'precisions': [0.4138880391557051, 0.37181554337310546, 0.34543603664743805, 0.3231098430813124], 'brevity_penalty': 1.0, 'length_ratio': 1.6885330578512396, 'translation_length': 3269, 'reference_length': 1936} |
0.5811 | 1.51 | 1100 | 0.5143 | {'exact_match': 29.166666666666668, 'f1': 61.409131873413195} | {'bleu': 0.41718747361604114, 'precisions': [0.465891710003059, 0.4240567558851983, 0.4014251781472684, 0.3819543509272468], 'brevity_penalty': 1.0, 'length_ratio': 1.4541814946619218, 'translation_length': 3269, 'reference_length': 2248} |
0.3257 | 1.64 | 1200 | 0.5088 | {'exact_match': 32.142857142857146, 'f1': 62.3134623709586} | {'bleu': 0.3381083910829391, 'precisions': [0.39186295503211993, 0.3473073202192841, 0.32168306752629794, 0.2985021398002853], 'brevity_penalty': 1.0, 'length_ratio': 1.8961716937354989, 'translation_length': 3269, 'reference_length': 1724} |
0.3282 | 1.78 | 1300 | 0.4795 | {'exact_match': 35.714285714285715, 'f1': 64.30776389272819} | {'bleu': 0.37087590095858125, 'precisions': [0.4227592535943714, 0.38052241212512095, 0.3545978961655921, 0.33166904422253923], 'brevity_penalty': 1.0, 'length_ratio': 1.69730010384216, 'translation_length': 3269, 'reference_length': 1926} |
0.3582 | 1.92 | 1400 | 0.5072 | {'exact_match': 31.547619047619047, 'f1': 64.26308637164007} | {'bleu': 0.3596752944859653, 'precisions': [0.4083817681248088, 0.36988068365043536, 0.34441805225653205, 0.32168330955777463], 'brevity_penalty': 1.0, 'length_ratio': 1.7951674903898958, 'translation_length': 3269, 'reference_length': 1821} |
0.2635 | 2.05 | 1500 | 0.5226 | {'exact_match': 32.73809523809524, 'f1': 63.86332207970659} | {'bleu': 0.3937338044365284, 'precisions': [0.44447843377179563, 0.40148339245404707, 0.37801153715643027, 0.35627674750356636], 'brevity_penalty': 1.0, 'length_ratio': 1.4988537368179735, 'translation_length': 3269, 'reference_length': 2181} |
0.2464 | 2.19 | 1600 | 0.5321 | {'exact_match': 33.333333333333336, 'f1': 67.38930420708819} | {'bleu': 0.45354685012711804, 'precisions': [0.5013765677577241, 0.4601741373750403, 0.43841194435018666, 0.41833095577746077], 'brevity_penalty': 1.0, 'length_ratio': 1.4225413402959095, 'translation_length': 3269, 'reference_length': 2298} |
0.2393 | 2.33 | 1700 | 0.5099 | {'exact_match': 35.714285714285715, 'f1': 65.16842917580313} | {'bleu': 0.4114344617968027, 'precisions': [0.4588559192413582, 0.41921960657852303, 0.39667458432304037, 0.3755349500713267], 'brevity_penalty': 1.0, 'length_ratio': 1.5977517106549364, 'translation_length': 3269, 'reference_length': 2046} |
0.2676 | 2.47 | 1800 | 0.5987 | {'exact_match': 29.761904761904763, 'f1': 63.78707251910303} | {'bleu': 0.3658122932518219, 'precisions': [0.42551238910981953, 0.37697516930022573, 0.34781133355955207, 0.3209700427960057], 'brevity_penalty': 1.0, 'length_ratio': 1.7187171398527865, 'translation_length': 3269, 'reference_length': 1902} |
0.3071 | 2.6 | 1900 | 0.5240 | {'exact_match': 29.166666666666668, 'f1': 62.92129166698199} | {'bleu': 0.4832099557140429, 'precisions': [0.5377791373508718, 0.4872621734924218, 0.46521886664404477, 0.4472182596291013], 'brevity_penalty': 1.0, 'length_ratio': 1.0732107682206171, 'translation_length': 3269, 'reference_length': 3046} |
0.2839 | 2.74 | 2000 | 0.5110 | {'exact_match': 35.714285714285715, 'f1': 65.45284344390186} | {'bleu': 0.43592975914902093, 'precisions': [0.4857754665035179, 0.4414704933892293, 0.4200882253138785, 0.4008559201141227], 'brevity_penalty': 1.0, 'length_ratio': 1.3502684840974803, 'translation_length': 3269, 'reference_length': 2421} |
0.3259 | 2.88 | 2100 | 0.5020 | {'exact_match': 30.357142857142858, 'f1': 66.21243212925624} | {'bleu': 0.5131744912223959, 'precisions': [0.5646986846130315, 0.5169300225733634, 0.496776382762131, 0.4782453637660485], 'brevity_penalty': 1.0, 'length_ratio': 1.091121495327103, 'translation_length': 3269, 'reference_length': 2996} |
0.3272 | 3.01 | 2200 | 0.5239 | {'exact_match': 30.952380952380953, 'f1': 67.09846522112723} | {'bleu': 0.5405154750897113, 'precisions': [0.5830529213826858, 0.5446630119316349, 0.5262979300984052, 0.5106990014265336], 'brevity_penalty': 1.0, 'length_ratio': 1.1176068376068375, 'translation_length': 3269, 'reference_length': 2925} |
0.1856 | 3.15 | 2300 | 0.5524 | {'exact_match': 33.92857142857143, 'f1': 68.43656293435919} | {'bleu': 0.49285332133310633, 'precisions': [0.5396145610278372, 0.4982263785875524, 0.47777400746521886, 0.4593437945791726], 'brevity_penalty': 1.0, 'length_ratio': 1.2987683750496624, 'translation_length': 3269, 'reference_length': 2517} |
0.233 | 3.29 | 2400 | 0.5277 | {'exact_match': 30.952380952380953, 'f1': 66.14618600499504} | {'bleu': 0.47457637309615686, 'precisions': [0.5221780360966657, 0.47887778136085135, 0.4594502884289108, 0.44151212553495006], 'brevity_penalty': 1.0, 'length_ratio': 1.2854895792371215, 'translation_length': 3269, 'reference_length': 2543} |
0.2027 | 3.42 | 2500 | 0.5362 | {'exact_match': 30.952380952380953, 'f1': 67.73189400443187} | {'bleu': 0.4828086230052905, 'precisions': [0.5273784031814011, 0.4888745565946469, 0.46827281981676283, 0.4500713266761769], 'brevity_penalty': 1.0, 'length_ratio': 1.3672103722291928, 'translation_length': 3269, 'reference_length': 2391} |
0.1462 | 3.56 | 2600 | 0.5681 | {'exact_match': 34.523809523809526, 'f1': 66.90869030594244} | {'bleu': 0.4470358204326154, 'precisions': [0.49434077699602325, 0.4524346984843599, 0.4316253817441466, 0.4136947218259629], 'brevity_penalty': 1.0, 'length_ratio': 1.4535349044019563, 'translation_length': 3269, 'reference_length': 2249} |
0.2218 | 3.7 | 2700 | 0.5582 | {'exact_match': 30.952380952380953, 'f1': 64.88762964740765} | {'bleu': 0.3810698909088452, 'precisions': [0.4353013153869685, 0.38987423411802646, 0.36443841194435017, 0.340941512125535], 'brevity_penalty': 1.0, 'length_ratio': 1.5588936576061039, 'translation_length': 3269, 'reference_length': 2097} |
0.2644 | 3.84 | 2800 | 0.5324 | {'exact_match': 33.92857142857143, 'f1': 66.27147689733899} | {'bleu': 0.46985264711666236, 'precisions': [0.5234016518813093, 0.4746855852950661, 0.4523243976925687, 0.43366619115549215], 'brevity_penalty': 1.0, 'length_ratio': 1.2926057730328193, 'translation_length': 3269, 'reference_length': 2529} |
0.1883 | 3.97 | 2900 | 0.5218 | {'exact_match': 31.547619047619047, 'f1': 65.86132294563414} | {'bleu': 0.4691457249654658, 'precisions': [0.5151422453349648, 0.4756530151564012, 0.4540210383440787, 0.4354493580599144], 'brevity_penalty': 1.0, 'length_ratio': 1.3165525573902537, 'translation_length': 3269, 'reference_length': 2483} |
0.0983 | 4.11 | 3000 | 0.5570 | {'exact_match': 34.523809523809526, 'f1': 66.88532922899014} | {'bleu': 0.4637264731606073, 'precisions': [0.5074946466809421, 0.4708158658497259, 0.4496097726501527, 0.4304564907275321], 'brevity_penalty': 1.0, 'length_ratio': 1.3834109183241643, 'translation_length': 3269, 'reference_length': 2363} |
0.1298 | 4.25 | 3100 | 0.5555 | {'exact_match': 31.547619047619047, 'f1': 65.9162762624782} | {'bleu': 0.4642779990799932, 'precisions': [0.5126950137656776, 0.47049338922928086, 0.44859178825924667, 0.42938659058487877], 'brevity_penalty': 1.0, 'length_ratio': 1.3039489429597129, 'translation_length': 3269, 'reference_length': 2507} |
0.1631 | 4.38 | 3200 | 0.5684 | {'exact_match': 32.73809523809524, 'f1': 67.06642684079529} | {'bleu': 0.5088183152910757, 'precisions': [0.5512389109819517, 0.5146726862302483, 0.494740413980319, 0.4775320970042796], 'brevity_penalty': 1.0, 'length_ratio': 1.2799530148786218, 'translation_length': 3269, 'reference_length': 2554} |
0.1247 | 4.52 | 3300 | 0.5779 | {'exact_match': 32.142857142857146, 'f1': 66.7145922073288} | {'bleu': 0.4891310418038067, 'precisions': [0.5359437136739064, 0.4946791357626572, 0.47370206990159486, 0.4557774607703281], 'brevity_penalty': 1.0, 'length_ratio': 1.2592449922958397, 'translation_length': 3269, 'reference_length': 2596} |
0.1375 | 4.66 | 3400 | 0.5710 | {'exact_match': 32.73809523809524, 'f1': 67.12997909883624} | {'bleu': 0.4788691854085839, 'precisions': [0.5240134597736311, 0.4853273137697517, 0.46420088225313877, 0.445435092724679], 'brevity_penalty': 1.0, 'length_ratio': 1.3485973597359735, 'translation_length': 3269, 'reference_length': 2424} |
0.2406 | 4.79 | 3500 | 0.5565 | {'exact_match': 32.73809523809524, 'f1': 67.80127256596187} | {'bleu': 0.4744984435764912, 'precisions': [0.5206485163658611, 0.48113511770396644, 0.4594502884289108, 0.4404422253922967], 'brevity_penalty': 1.0, 'length_ratio': 1.4145391605365643, 'translation_length': 3269, 'reference_length': 2311} |
0.1866 | 4.93 | 3600 | 0.5606 | {'exact_match': 31.547619047619047, 'f1': 65.97520968920112} | {'bleu': 0.4478359370566898, 'precisions': [0.4970939125114714, 0.45436955820703, 0.43196470987444857, 0.4122681883024251], 'brevity_penalty': 1.0, 'length_ratio': 1.3752629364745477, 'translation_length': 3269, 'reference_length': 2377} |
### Framework versions
- Transformers 4.34.0
- Pytorch 2.0.1+cu118
- Datasets 2.14.5
- Tokenizers 0.14.1
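These pins can be sanity-checked in Python before running the model; a small sketch (a mismatch only signals environment drift, not necessarily an error):

```python
# Compare the local environment against the versions this model was trained with.
import datasets
import tokenizers
import torch
import transformers

expected = {
    "transformers": "4.34.0",
    "torch": "2.0.1+cu118",
    "datasets": "2.14.5",
    "tokenizers": "0.14.1",
}
found = {
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}

for name, version in expected.items():
    status = "OK" if found[name] == version else f"differs (found {found[name]})"
    print(f"{name}: expected {version} -> {status}")
```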