---
library_name: transformers
base_model: dbmdz/bert-base-german-uncased
license: mit
language:
- de
model-index:
- name: LernnaviBERT
  results: []
---

# LernnaviBERT Model Card

LernnaviBERT is a fine-tuning of [German BERT](https://huggingface.co/dbmdz/bert-base-german-uncased) on educational text data from the Lernnavi Intelligent Tutoring System (ITS). It is trained with a masked language modeling objective, following the BERT training scheme.

### Model Sources

- **Repository:** [https://github.com/epfl-ml4ed/answer-forecasting](https://github.com/epfl-ml4ed/answer-forecasting)
- **Paper:** [https://arxiv.org/abs/2405.20079](https://arxiv.org/abs/2405.20079)

### Direct Use

As a fine-tuning of a base BERT model, LernnaviBERT is suitable for all standard BERT uses, especially for educational text in German. A minimal usage sketch is given under *Example usage* below.

### Downstream Use

LernnaviBERT has been further fine-tuned for [MCQ answering](https://huggingface.co/epfl-ml4ed/MCQBert) and Student Answer Forecasting (e.g. [MCQStudentBertCat](https://huggingface.co/epfl-ml4ed/MCQStudentBertCat) and [MCQStudentBertSum](https://huggingface.co/epfl-ml4ed/MCQStudentBertSum)), as described in [https://arxiv.org/abs/2405.20079](https://arxiv.org/abs/2405.20079).

## Training Details

The model was trained on ~40k text pieces from Lernnavi, a real-world ITS, for 3 epochs with a batch size of 16, improving from an initial perplexity of 1.21 on Lernnavi data to a final perplexity of 1.01.

### Training hyperparameters

The following hyperparameters were used during training (see the *Training sketch* below for how they map onto a standard `transformers` setup):
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3
- mixed_precision_training: Native AMP

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.0385        | 1.0   | 2405 | 0.0137          |
| 0.0142        | 2.0   | 4810 | 0.0084          |
| 0.0096        | 3.0   | 7215 | 0.0072          |

## Citation

If you find this useful in your work, please cite our paper:

```
@misc{gado2024student,
  title={Student Answer Forecasting: Transformer-Driven Answer Choice Prediction for Language Learning},
  author={Elena Grazia Gado and Tommaso Martorella and Luca Zunino and Paola Mejia-Domenzain and Vinitra Swamy and Jibril Frej and Tanja Käser},
  year={2024},
  eprint={2405.20079},
  archivePrefix={arXiv},
}
```

```
Gado, E., Martorella, T., Zunino, L., Mejia-Domenzain, P., Swamy, V., Frej, J., & Käser, T. (2024). Student Answer Forecasting: Transformer-Driven Answer Choice Prediction for Language Learning. In: Proceedings of the Conference on Educational Data Mining (EDM 2024).
```

### Framework versions

- Transformers 4.37.1
- Pytorch 2.2.0
- Datasets 2.2.1
- Tokenizers 0.15.1
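
### Example usage

A minimal sketch of loading LernnaviBERT for masked language modeling with the `transformers` `fill-mask` pipeline. The repository id `epfl-ml4ed/LernnaviBERT` and the German example sentence are illustrative assumptions, not taken from the original card:

```python
from transformers import pipeline

# Load LernnaviBERT as a standard BERT masked-language model
# (repository id assumed to be epfl-ml4ed/LernnaviBERT).
fill_mask = pipeline("fill-mask", model="epfl-ml4ed/LernnaviBERT")

# Illustrative German sentence with a masked token.
for prediction in fill_mask("Die Schüler lösen eine [MASK] im Unterricht."):
    print(prediction["token_str"], round(prediction["score"], 3))
```

Since the base model is uncased, predicted tokens are returned in lowercase.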
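
### Training sketch

A hedged sketch of how the hyperparameters listed under *Training hyperparameters* map onto a standard `transformers` masked-LM fine-tuning setup. The Lernnavi text data are not public, so two placeholder sentences stand in for the ~40k text pieces; this is an illustration, not the authors' original training script. AdamW with the stated betas/epsilon and the linear schedule are the `Trainer` defaults, and `fp16` provides native AMP when a GPU is available.

```python
import torch
from datasets import Dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# Start from German BERT, as stated in the model card.
base = "dbmdz/bert-base-german-uncased"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForMaskedLM.from_pretrained(base)

# Placeholder corpus: two illustrative sentences stand in for the
# ~40k (non-public) Lernnavi text pieces.
texts = [
    "Die Ableitung einer Funktion beschreibt ihre Steigung.",
    "Ein Substantiv wird im Deutschen grossgeschrieben.",
]
dataset = Dataset.from_dict({"text": texts}).map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True,
    remove_columns=["text"],
)

# Hyperparameters as listed above.
args = TrainingArguments(
    output_dir="lernnavi-bert",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=3,
    lr_scheduler_type="linear",
    evaluation_strategy="epoch",
    fp16=torch.cuda.is_available(),
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset,
    eval_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15),
)
trainer.train()
```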