---
library_name: transformers
base_model: dbmdz/bert-base-german-uncased
license: mit
language:
- de
model-index:
- name: LernnaviBERT
  results: []
---

# LernnaviBERT Model Card

LernnaviBERT is a fine-tuning of [German BERT](https://huggingface.co/dbmdz/bert-base-german-uncased) on educational text data from the Lernnavi Intelligent Tutoring System (ITS). It is trained with a masked language modeling objective, following the BERT training scheme.

### Model Sources

- **Repository:** [https://github.com/epfl-ml4ed/answer-forecasting](https://github.com/epfl-ml4ed/answer-forecasting)
- **Paper:** [https://arxiv.org/abs/2405.20079](https://arxiv.org/abs/2405.20079)

### Direct Use

As a fine-tuning of a base BERT model, LernnaviBERT is suitable for all standard BERT uses, especially for educational text in German. A minimal usage sketch is given under *Example usage* below.

### Downstream Use

LernnaviBERT has been further fine-tuned for [MCQ answering](https://huggingface.co/epfl-ml4ed/MCQBert) and Student Answer Forecasting (e.g. [MCQStudentBertCat](https://huggingface.co/epfl-ml4ed/MCQStudentBertCat) and [MCQStudentBertSum](https://huggingface.co/epfl-ml4ed/MCQStudentBertSum)), as described in [https://arxiv.org/abs/2405.20079](https://arxiv.org/abs/2405.20079).

## Training Details

The model was trained on ~40k text pieces from Lernnavi, a real-world ITS, for 3 epochs with a batch size of 16, improving from an initial perplexity of 1.21 on Lernnavi data to a final perplexity of 1.01.

### Training hyperparameters

The following hyperparameters were used during training (see the *Training sketch* below for how they map onto a standard `transformers` setup):
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3
- mixed_precision_training: Native AMP

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.0385        | 1.0   | 2405 | 0.0137          |
| 0.0142        | 2.0   | 4810 | 0.0084          |
| 0.0096        | 3.0   | 7215 | 0.0072          |

## Citation

If you find this useful in your work, please cite our paper:

```
@misc{gado2024student,
  title={Student Answer Forecasting: Transformer-Driven Answer Choice Prediction for Language Learning},
  author={Elena Grazia Gado and Tommaso Martorella and Luca Zunino and Paola Mejia-Domenzain and Vinitra Swamy and Jibril Frej and Tanja Käser},
  year={2024},
  eprint={2405.20079},
  archivePrefix={arXiv},
}
```

```
Gado, E., Martorella, T., Zunino, L., Mejia-Domenzain, P., Swamy, V., Frej, J., & Käser, T. (2024). Student Answer Forecasting: Transformer-Driven Answer Choice Prediction for Language Learning. In: Proceedings of the Conference on Educational Data Mining (EDM 2024).
```

### Framework versions

- Transformers 4.37.1
- Pytorch 2.2.0
- Datasets 2.2.1
- Tokenizers 0.15.1
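
### Example usage

A minimal sketch of loading LernnaviBERT for masked language modeling with the `transformers` `fill-mask` pipeline. The repository id `epfl-ml4ed/LernnaviBERT` and the German example sentence are illustrative assumptions, not taken from the original card:

```python
from transformers import pipeline

# Load LernnaviBERT as a standard BERT masked-language model
# (repository id assumed to be epfl-ml4ed/LernnaviBERT).
fill_mask = pipeline("fill-mask", model="epfl-ml4ed/LernnaviBERT")

# Illustrative German sentence with a masked token.
for prediction in fill_mask("Die Schüler lösen eine [MASK] im Unterricht."):
    print(prediction["token_str"], round(prediction["score"], 3))
```

Since the base model is uncased, predicted tokens are returned in lowercase.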
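
### Training sketch

A hedged sketch of how the hyperparameters listed under *Training hyperparameters* map onto a standard `transformers` masked-LM fine-tuning setup. The Lernnavi text data are not public, so two placeholder sentences stand in for the ~40k text pieces; this is an illustration, not the authors' original training script. AdamW with the stated betas/epsilon and the linear schedule are the `Trainer` defaults, and `fp16` provides native AMP when a GPU is available.

```python
import torch
from datasets import Dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# Start from German BERT, as stated in the model card.
base = "dbmdz/bert-base-german-uncased"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForMaskedLM.from_pretrained(base)

# Placeholder corpus: two illustrative sentences stand in for the
# ~40k (non-public) Lernnavi text pieces.
texts = [
    "Die Ableitung einer Funktion beschreibt ihre Steigung.",
    "Ein Substantiv wird im Deutschen grossgeschrieben.",
]
dataset = Dataset.from_dict({"text": texts}).map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True,
    remove_columns=["text"],
)

# Hyperparameters as listed above.
args = TrainingArguments(
    output_dir="lernnavi-bert",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=3,
    lr_scheduler_type="linear",
    evaluation_strategy="epoch",
    fp16=torch.cuda.is_available(),
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset,
    eval_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15),
)
trainer.train()
```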