---
license: mit
language:
- en
tags:
- babylm
---

# Lil-Bevo

Lil-Bevo is UT Austin's submission to the BabyLM challenge, specifically the *strict-small* track.

[Link to GitHub Repo](https://github.com/venkatasg/Lil-Bevo)

## TL;DR:
- A Unigram tokenizer was trained on the 10M BabyLM tokens plus the MAESTRO dataset, giving a vocabulary size of 16k (see the tokenizer sketch below).
- `deberta-v3-small` was trained on a mixture of MAESTRO and the 10M BabyLM tokens for 5 epochs (see the training sketch below).
- The model then continued training for 50 epochs on the 10M tokens with a sequence length of 128.
- Finally, the model was trained for 2 epochs with targeted linguistic masking at a sequence length of 512 (see the masking sketch below).

This README will be updated with more details soon.
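## Rough pipeline sketches

The snippets below are illustrative sketches, not the project's actual training code; names, paths, and hyperparameters not stated in the TL;DR are assumptions. For the exact setup, see the linked GitHub repo.

First, the tokenizer: a minimal sketch of training a 16k-vocab Unigram tokenizer with the Hugging Face `tokenizers` library, assuming plain-text dumps of the two corpora (the filenames and special tokens here are placeholders):

```python
from tokenizers import Tokenizer, models, pre_tokenizers, trainers

# Empty Unigram model, trained from scratch on the combined corpus.
tokenizer = Tokenizer(models.Unigram())
tokenizer.pre_tokenizer = pre_tokenizers.Metaspace()

trainer = trainers.UnigramTrainer(
    vocab_size=16_000,  # 16k vocab, per the TL;DR
    special_tokens=["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]"],  # assumed
    unk_token="[UNK]",
)

# Placeholder filenames for the 10M-token BabyLM corpus and the MAESTRO data.
tokenizer.train(["babylm_10m.txt", "maestro.txt"], trainer)
tokenizer.save("lil-bevo-tokenizer.json")
```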
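For the masked-LM stages, a sketch assuming a from-scratch model initialized from the public `microsoft/deberta-v3-small` config, the 16k tokenizer above, and standard 15% random masking; the toy dataset and batch size are placeholders (the first stage would run for 5 epochs on the MAESTRO + BabyLM mixture):

```python
from datasets import Dataset
from transformers import (
    AutoConfig, AutoModelForMaskedLM, DataCollatorForLanguageModeling,
    PreTrainedTokenizerFast, Trainer, TrainingArguments,
)

# Wrap the trained Unigram tokenizer for use with transformers.
tokenizer = PreTrainedTokenizerFast(
    tokenizer_file="lil-bevo-tokenizer.json",
    pad_token="[PAD]", unk_token="[UNK]", cls_token="[CLS]",
    sep_token="[SEP]", mask_token="[MASK]",
)

# From-scratch model with the deberta-v3-small architecture and a 16k vocab.
config = AutoConfig.from_pretrained("microsoft/deberta-v3-small")
config.vocab_size = tokenizer.vocab_size
model = AutoModelForMaskedLM.from_config(config)

# Toy stand-in for the tokenized MAESTRO + BabyLM mixture.
train_dataset = Dataset.from_list(
    [dict(tokenizer(t, truncation=True, max_length=128))
     for t in ["a placeholder sentence .", "another placeholder ."]]
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="lil-bevo", num_train_epochs=5,
                           per_device_train_batch_size=64),
    train_dataset=train_dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer,
                                                  mlm_probability=0.15),
)
trainer.train()
```

The 50-epoch continuation would reuse the same loop on the BabyLM-only data at sequence length 128.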
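The README does not say how targeted linguistic masking selects its targets, so the collator below is a hypothetical illustration: it assumes each example carries a precomputed per-token `target_mask` field (1 where a token is a linguistic target) and masks a random subset of only those positions, instead of sampling positions uniformly as above. For the final stage it would replace `DataCollatorForLanguageModeling` in the trainer, with examples tokenized at sequence length 512 and `num_train_epochs=2`:

```python
import torch

class TargetedMaskingCollator:
    """Hypothetical collator that masks only linguistically flagged positions."""

    def __init__(self, tokenizer, mask_prob=0.15):
        self.tokenizer = tokenizer
        self.mask_prob = mask_prob  # fraction of flagged tokens to mask (assumed)

    def __call__(self, examples):
        # Each example: {"input_ids": [...], "target_mask": [0/1 per token]}.
        batch = self.tokenizer.pad(
            [{"input_ids": e["input_ids"]} for e in examples],
            return_tensors="pt",
        )
        labels = batch["input_ids"].clone()

        # Align the per-example target flags with the padded batch.
        flagged = torch.zeros_like(labels, dtype=torch.bool)
        for i, e in enumerate(examples):
            flags = torch.tensor(e["target_mask"], dtype=torch.bool)
            flagged[i, : flags.size(0)] = flags

        # Mask a random subset of the flagged positions only.
        to_mask = flagged & (torch.rand(labels.shape) < self.mask_prob)
        batch["input_ids"][to_mask] = self.tokenizer.mask_token_id
        labels[~to_mask] = -100  # compute loss only on masked positions
        batch["labels"] = labels
        return batch
```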