---
license: apache-2.0
tags:
- generated_from_trainer
metrics:
- accuracy
base_model: EleutherAI/gpt-neo-2.7B
model-index:
- name: output
  results: []
---

# output

## Model description

This model is a fine-tuned version of [EleutherAI/gpt-neo-2.7B](https://huggingface.co/EleutherAI/gpt-neo-2.7B) on the Lila-IID-train/dev set from the [Lila dataset](https://github.com/allenai/Lila).

## Usage

Bhaskara was trained with the following format:

~~~
Question: ...
Answer: ...
Program:
```python
...
```
~~~

It will perform best if queried in this way.

## Intended uses & limitations

If you use this model, please cite our work.

```
@INPROCEEDINGS{Mishra2022Lila,
  author = {
    Swaroop Mishra
      and Matthew Finlayson
      and Pan Lu
      and Leonard Tang
      and Sean Welleck
      and Chitta Baral
      and Tanmay Rajpurohit
      and Oyvind Tafjord
      and Ashish Sabharwal
      and Peter Clark
      and Ashwin Kalyan},
  title = {Lila: A Unified Benchmark for Mathematical Reasoning},
  booktitle = {Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  year = {2022}
}
```

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- num_devices: 2
- total_train_batch_size: 8
- total_eval_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10.0

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log | 0.06 | 100 | 0.7930 | 0.8214 |
| No log | 0.11 | 200 | 0.7544 | 0.8290 |
| No log | 0.17 | 300 | 0.7358 | 0.8328 |
| No log | 0.23 | 400 | 0.7192 | 0.8357 |
| 0.8156 | 0.28 | 500 | 0.7012 | 0.8397 |
| 0.8156 | 0.34 | 600 | 0.6904 | 0.8419 |
| 0.8156 | 0.4 | 700 | 0.6802 | 0.8440 |
| 0.8156 | 0.45 | 800 | 0.6670 | 0.8465 |
| 0.8156 | 0.51 | 900 | 0.6572 | 0.8486 |
| 0.7219 | 0.57 | 1000 | 0.6499 | 0.8500 |
| 0.7219 | 0.62 | 1100 | 0.6411 | 0.8522 |
| 0.7219 | 0.68 | 1200 | 0.6343 | 0.8537 |
| 0.7219 | 0.74 | 1300 | 0.6299 | 0.8546 |
| 0.7219 | 0.79 | 1400 | 0.6221 | 0.8561 |
| 0.662 | 0.85 | 1500 | 0.6157 | 0.8574 |
| 0.662 | 0.91 | 1600 | 0.6138 | 0.8579 |
| 0.662 | 0.96 | 1700 | 0.6055 | 0.8595 |
| 0.662 | 1.02 | 1800 | 0.6143 | 0.8598 |
| 0.662 | 1.08 | 1900 | 0.6191 | 0.8599 |
| 0.5707 | 1.14 | 2000 | 0.6118 | 0.8607 |
| 0.5707 | 1.19 | 2100 | 0.6123 | 0.8611 |
| 0.5707 | 1.25 | 2200 | 0.6089 | 0.8617 |
| 0.5707 | 1.31 | 2300 | 0.6064 | 0.8619 |
| 0.5707 | 1.36 | 2400 | 0.6079 | 0.8625 |
| 0.4923 | 1.42 | 2500 | 0.6040 | 0.8625 |
| 0.4923 | 1.48 | 2600 | 0.6030 | 0.8630 |
| 0.4923 | 1.53 | 2700 | 0.6021 | 0.8636 |
| 0.4923 | 1.59 | 2800 | 0.6001 | 0.8643 |
| 0.4923 | 1.65 | 2900 | 0.5981 | 0.8644 |
| 0.4909 | 1.7 | 3000 | 0.5942 | 0.8648 |
| 0.4909 | 1.76 | 3100 | 0.5918 | 0.8650 |
| 0.4909 | 1.82 | 3200 | 0.5923 | 0.8659 |
| 0.4909 | 1.87 | 3300 | 0.5884 | 0.8664 |
| 0.4909 | 1.93 | 3400 | 0.5884 | 0.8663 |
| 0.4964 | 1.99 | 3500 | 0.5903 | 0.8669 |
| 0.4964 | 2.04 | 3600 | 0.6421 | 0.8655 |
| 0.4964 | 2.1 | 3700 | 0.6401 | 0.8651 |
| 0.4964 | 2.16 | 3800 | 0.6411 | 0.8649 |
| 0.4964 | 2.21 | 3900 | 0.6387 | 0.8645 |
| 0.345 | 2.27 | 4000 | 0.6362 | 0.8654 |
| 0.345 | 2.33 | 4100 | 0.6362 | 0.8654 |
| 0.345 | 2.38 | 4200 | 0.6362 | 0.8654 |
| 0.345 | 2.44 | 4300 | 0.6357 | 0.8655 |
| 0.345 | 2.5 | 4400 | 0.6362 | 0.8656 |
| 0.3463 | 2.55 | 4500 | 0.6377 | 0.8658 |
| 0.3463 | 2.61 | 4600 | 0.6357 | 0.8660 |
| 0.3463 | 2.67 | 4700 | 0.6294 | 0.8665 |
| 0.3463 | 2.72 | 4800 | 0.6333 | 0.8665 |
| 0.3463 | 2.78 | 4900 | 0.6362 | 0.8662 |
| 0.3508 | 2.84 | 5000 | 0.6357 | 0.8666 |
| 0.3508 | 2.89 | 5100 | 0.6299 | 0.8673 |
| 0.3508 | 2.95 | 5200 | 0.6313 | 0.8668 |
| 0.3508 | 3.01 | 5300 | 0.7188 | 0.8646 |
| 0.3508 | 3.06 | 5400 | 0.7017 | 0.8656 |
| 0.295 | 3.12 | 5500 | 0.6982 | 0.8653 |
| 0.295 | 3.18 | 5600 | 0.7031 | 0.8655 |
| 0.295 | 3.23 | 5700 | 0.6992 | 0.8651 |
| 0.295 | 3.29 | 5800 | 0.6997 | 0.8653 |
| 0.295 | 3.35 | 5900 | 0.7041 | 0.8651 |
| 0.2348 | 3.41 | 6000 | 0.7075 | 0.8649 |
| 0.2348 | 3.46 | 6100 | 0.6992 | 0.8650 |
| 0.2348 | 3.52 | 6200 | 0.7065 | 0.8647 |
| 0.2348 | 3.58 | 6300 | 0.6997 | 0.8652 |
| 0.2348 | 3.63 | 6400 | 0.7026 | 0.8651 |
| 0.2411 | 3.69 | 6500 | 0.7046 | 0.8656 |
| 0.2411 | 3.75 | 6600 | 0.7007 | 0.8655 |
| 0.2411 | 3.8 | 6700 | 0.7026 | 0.8651 |
| 0.2411 | 3.86 | 6800 | 0.7031 | 0.8655 |
| 0.2411 | 3.92 | 6900 | 0.7012 | 0.8658 |
| 0.251 | 3.97 | 7000 | 0.7051 | 0.8656 |
| 0.251 | 4.03 | 7100 | 0.7607 | 0.8650 |
| 0.251 | 4.09 | 7200 | 0.7632 | 0.8656 |
| 0.251 | 4.14 | 7300 | 0.7588 | 0.8655 |
| 0.251 | 4.2 | 7400 | 0.7578 | 0.8651 |
| 0.1797 | 4.26 | 7500 | 0.7710 | 0.8645 |
| 0.1797 | 4.31 | 7600 | 0.7627 | 0.8648 |
| 0.1797 | 4.37 | 7700 | 0.7583 | 0.8650 |
| 0.1797 | 4.43 | 7800 | 0.7646 | 0.8649 |
| 0.1797 | 4.48 | 7900 | 0.7598 | 0.8646 |
| 0.1784 | 4.54 | 8000 | 0.7656 | 0.8650 |
| 0.1784 | 4.6 | 8100 | 0.7617 | 0.8648 |
| 0.1784 | 4.65 | 8200 | 0.7573 | 0.8651 |
| 0.1784 | 4.71 | 8300 | 0.7671 | 0.8648 |
| 0.1784 | 4.77 | 8400 | 0.7563 | 0.8651 |
| 0.1827 | 4.82 | 8500 | 0.7651 | 0.8649 |
| 0.1827 | 4.88 | 8600 | 0.7637 | 0.8650 |
| 0.1827 | 4.94 | 8700 | 0.7607 | 0.8654 |
| 0.1827 | 4.99 | 8800 | 0.7607 | 0.8650 |
| 0.1827 | 5.05 | 8900 | 0.8149 | 0.8646 |
| 0.167 | 5.11 | 9000 | 0.8081 | 0.8648 |
| 0.167 | 5.16 | 9100 | 0.8184 | 0.8644 |
| 0.167 | 5.22 | 9200 | 0.8140 | 0.8647 |
| 0.167 | 5.28 | 9300 | 0.8169 | 0.8644 |
| 0.167 | 5.33 | 9400 | 0.8120 | 0.8645 |
| 0.1371 | 5.39 | 9500 | 0.8154 | 0.8643 |
| 0.1371 | 5.45 | 9600 | 0.8179 | 0.8642 |
| 0.1371 | 5.51 | 9700 | 0.8154 | 0.8643 |
| 0.1371 | 5.56 | 9800 | 0.8120 | 0.8645 |
| 0.1371 | 5.62 | 9900 | 0.8110 | 0.8650 |
| 0.1425 | 5.68 | 10000 | 0.8159 | 0.8645 |
| 0.1425 | 5.73 | 10100 | 0.8174 | 0.8646 |
| 0.1425 | 5.79 | 10200 | 0.8159 | 0.8649 |
| 0.1425 | 5.85 | 10300 | 0.8110 | 0.8639 |
| 0.1425 | 5.9 | 10400 | 0.8135 | 0.8645 |
| 0.1505 | 5.96 | 10500 | 0.8140 | 0.8642 |
| 0.1505 | 6.02 | 10600 | 0.8628 | 0.8640 |
| 0.1505 | 6.07 | 10700 | 0.8540 | 0.8644 |
| 0.1505 | 6.13 | 10800 | 0.8530 | 0.8642 |
| 0.1505 | 6.19 | 10900 | 0.8560 | 0.8647 |
| 0.1086 | 6.24 | 11000 | 0.8555 | 0.8649 |
| 0.1086 | 6.3 | 11100 | 0.8604 | 0.8644 |
| 0.1086 | 6.36 | 11200 | 0.8569 | 0.8642 |
| 0.1086 | 6.41 | 11300 | 0.8530 | 0.8639 |
| 0.1086 | 6.47 | 11400 | 0.8589 | 0.8643 |
| 0.1076 | 6.53 | 11500 | 0.8525 | 0.8639 |
| 0.1076 | 6.58 | 11600 | 0.8579 | 0.8640 |
| 0.1076 | 6.64 | 11700 | 0.8594 | 0.8640 |
| 0.1076 | 6.7 | 11800 | 0.8599 | 0.8643 |
| 0.1076 | 6.75 | 11900 | 0.8564 | 0.8640 |
| 0.1109 | 6.81 | 12000 | 0.8633 | 0.8640 |
| 0.1109 | 6.87 | 12100 | 0.8584 | 0.8638 |
| 0.1109 | 6.92 | 12200 | 0.8647 | 0.8636 |
| 0.1109 | 6.98 | 12300 | 0.8599 | 0.8635 |
| 0.1109 | 7.04 | 12400 | 0.8979 | 0.8632 |
| 0.1028 | 7.09 | 12500 | 0.8936 | 0.8635 |
| 0.1028 | 7.15 | 12600 | 0.9043 | 0.8637 |
| 0.1028 | 7.21 | 12700 | 0.8989 | 0.8642 |
| 0.1028 | 7.26 | 12800 | 0.8936 | 0.8642 |
| 0.1028 | 7.32 | 12900 | 0.8921 | 0.8641 |
| 0.0774 | 7.38 | 13000 | 0.8955 | 0.8634 |
| 0.0774 | 7.43 | 13100 | 0.8950 | 0.8636 |
| 0.0774 | 7.49 | 13200 | 0.8994 | 0.8635 |
| 0.0774 | 7.55 | 13300 | 0.8999 | 0.8635 |
| 0.0774 | 7.6 | 13400 | 0.8936 | 0.8631 |
| 0.0852 | 7.66 | 13500 | 0.9048 | 0.8634 |
| 0.0852 | 7.72 | 13600 | 0.8960 | 0.8632 |
| 0.0852 | 7.78 | 13700 | 0.9023 | 0.8635 |
| 0.0852 | 7.83 | 13800 | 0.8984 | 0.8638 |
| 0.0852 | 7.89 | 13900 | 0.9019 | 0.8635 |
| 0.0879 | 7.95 | 14000 | 0.9014 | 0.8634 |
| 0.0879 | 8.0 | 14100 | 0.9136 | 0.8630 |
| 0.0879 | 8.06 | 14200 | 0.9312 | 0.8639 |
| 0.0879 | 8.12 | 14300 | 0.9346 | 0.8635 |
| 0.0879 | 8.17 | 14400 | 0.9307 | 0.8635 |
| 0.0611 | 8.23 | 14500 | 0.9419 | 0.8641 |
| 0.0611 | 8.29 | 14600 | 0.9331 | 0.8631 |
| 0.0611 | 8.34 | 14700 | 0.9375 | 0.8636 |
| 0.0611 | 8.4 | 14800 | 0.9292 | 0.8626 |
| 0.0611 | 8.46 | 14900 | 0.9458 | 0.8637 |
| 0.061 | 8.51 | 15000 | 0.9336 | 0.8634 |
| 0.061 | 8.57 | 15100 | 0.9409 | 0.8630 |
| 0.061 | 8.63 | 15200 | 0.9390 | 0.8632 |
| 0.061 | 8.68 | 15300 | 0.9375 | 0.8628 |
| 0.061 | 8.74 | 15400 | 0.9365 | 0.8630 |
| 0.0646 | 8.8 | 15500 | 0.9370 | 0.8628 |
| 0.0646 | 8.85 | 15600 | 0.9355 | 0.8629 |
| 0.0646 | 8.91 | 15700 | 0.9375 | 0.8632 |
| 0.0646 | 8.97 | 15800 | 0.9390 | 0.8630 |
| 0.0646 | 9.02 | 15900 | 0.9717 | 0.8630 |
| 0.0593 | 9.08 | 16000 | 0.9673 | 0.8626 |
| 0.0593 | 9.14 | 16100 | 0.9644 | 0.8630 |
| 0.0593 | 9.19 | 16200 | 0.9624 | 0.8631 |
| 0.0593 | 9.25 | 16300 | 0.9648 | 0.8633 |
| 0.0593 | 9.31 | 16400 | 0.9673 | 0.8632 |
| 0.0415 | 9.36 | 16500 | 0.9658 | 0.8633 |
| 0.0415 | 9.42 | 16600 | 0.9688 | 0.8628 |
| 0.0415 | 9.48 | 16700 | 0.9653 | 0.8632 |
| 0.0415 | 9.53 | 16800 | 0.9658 | 0.8628 |
| 0.0415 | 9.59 | 16900 | 0.9668 | 0.8629 |
| 0.0471 | 9.65 | 17000 | 0.9604 | 0.8625 |
| 0.0471 | 9.7 | 17100 | 0.9658 | 0.8621 |
| 0.0471 | 9.76 | 17200 | 0.9731 | 0.8630 |
| 0.0471 | 9.82 | 17300 | 0.9692 | 0.8626 |
| 0.0471 | 9.88 | 17400 | 0.9673 | 0.8623 |
| 0.0528 | 9.93 | 17500 | 0.9614 | 0.8620 |
| 0.0528 | 9.99 | 17600 | 0.9697 | 0.8621 |

### Framework versions

- Transformers 4.21.0.dev0
- Pytorch 1.12.1+cu113
- Datasets 2.4.0
- Tokenizers 0.12.1
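
## Example: querying with the training format

The Usage section says the model performs best when queried in its `Question` / `Answer` / `Program` training format. The following is a minimal sketch of how one might build such a prompt and pull the generated program out of the completion. Several things here are assumptions, not part of this card: the helper names `build_prompt`, `extract_program`, and `generate` are hypothetical, the prompt is assumed to stop after `Answer:` so that the model completes the answer and program, the exact whitespace between fields is a guess, and `model_id` is a placeholder the caller must supply.

```python
import re

# Three backticks, built at runtime so this example contains no literal fence.
FENCE = "`" * 3


def build_prompt(question: str) -> str:
    """Format a question in the Question/Answer/Program style from the Usage section.

    Assumption: the prompt ends at 'Answer:' and the model fills in the rest.
    """
    return f"Question: {question}\nAnswer:"


def extract_program(generated: str):
    """Return the body of the first python-fenced block after 'Program:', or None."""
    pattern = re.compile(
        r"Program:\s*" + FENCE + r"python\s*\n(.*?)" + FENCE,
        re.DOTALL,
    )
    match = pattern.search(generated)
    return match.group(1).strip() if match else None


def generate(question: str, model_id: str, max_new_tokens: int = 128) -> str:
    """Generate a completion; model_id is a caller-supplied placeholder (assumption)."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(build_prompt(question), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call such as `extract_program(generate("What is 2 + 3?", model_id))` would then return the program text if the model emitted one. Generated programs should be reviewed before being executed.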
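
## Example: reading the training log

As a rough illustration of how the hyperparameters and the results table relate, the sketch below recomputes the total batch size from the per-device batch size and device count, then picks the rows with the lowest validation loss and the highest accuracy. The `rows` list is a small hand-copied subset of the table, chosen for illustration; the variable names are hypothetical and nothing here reflects how the checkpoint was actually selected.

```python
# Values taken from the hyperparameter list.
train_batch_size = 4
num_devices = 2

# Total batch size is the per-device batch size times the number of devices.
total_train_batch_size = train_batch_size * num_devices

# (step, validation_loss, accuracy) rows copied by hand from the results table.
rows = [
    (1700, 0.6055, 0.8595),
    (3300, 0.5884, 0.8664),
    (3400, 0.5884, 0.8663),
    (3500, 0.5903, 0.8669),
    (5100, 0.6299, 0.8673),
    (17600, 0.9697, 0.8621),
]

# min/max return the first row on ties, so step 3300 wins over 3400.
best_by_loss = min(rows, key=lambda r: r[1])
best_by_accuracy = max(rows, key=lambda r: r[2])

print("total_train_batch_size:", total_train_batch_size)
print("lowest validation loss:", best_by_loss)
print("highest accuracy:", best_by_accuracy)
```

In this subset the lowest validation loss and the highest accuracy fall on different steps, so which row counts as "best" depends on the metric chosen.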